Compare commits

..

371 Commits

Author SHA1 Message Date
kshitijk4poor 0f05b18413 feat(skills): add comfyui-mcp optional skill for generative image/video workflows
New optional skill under optional-skills/creative/comfyui-mcp/ with:
- SKILL.md: Setup guide (local, cloud, remote), Python helper functions
  for execute_code, common workflow patterns (txt2img, parameterized gen),
  queue management, and MCP server integration option.
- references/api.md: ComfyUI REST API reference (endpoints, JSON formats).
- references/recipes.md: Ready-to-use workflow templates (SDXL txt2img,
  img2img, Flux).

Zero core code changes — skill-only PR. Uses the established
skill-as-prompt pattern (like blender-mcp): teaches the agent to
interact with ComfyUI's REST API via execute_code.
2026-04-18 17:30:32 +05:30
Teknium 2edebedc9e feat(steer): /steer <prompt> injects a mid-run note after the next tool call (#12116)
* feat(steer): /steer <prompt> injects a mid-run note after the next tool call

Adds a new slash command that sits between /queue (turn boundary) and
interrupt. /steer <text> stashes the message on the running agent and
the agent loop appends it to the LAST tool result's content once the
current tool batch finishes. The model sees it as part of the tool
output on its next iteration.

No interrupt is fired, no new user turn is inserted, and no prompt
cache invalidation happens beyond the normal per-turn tool-result
churn. Message-role alternation is preserved — we only modify an
existing role:"tool" message's content.

Wiring
------
- hermes_cli/commands.py: register /steer + add to ACTIVE_SESSION_BYPASS_COMMANDS.
- run_agent.py: add _pending_steer state, AIAgent.steer(), _drain_pending_steer(),
  _apply_pending_steer_to_tool_results(); drain at end of both parallel and
  sequential tool executors; clear on interrupt; return leftover as
  result['pending_steer'] if the agent exits before another tool batch.
- cli.py: /steer handler — route to agent.steer() when running, fall back to
  the regular queue otherwise; deliver result['pending_steer'] as next turn.
- gateway/run.py: running-agent intercept calls running_agent.steer(); idle-agent
  path strips the prefix and forwards as a regular user message.
- tui_gateway/server.py: new session.steer JSON-RPC method.
- ui-tui: SessionSteerResponse type + local /steer slash command that calls
  session.steer when ui.busy, otherwise enqueues for the next turn.

Fallbacks
---------
- Agent exits mid-steer → surfaces in run_conversation result as pending_steer
  so CLI/gateway deliver it as the next user turn instead of silently dropping it.
- All tools skipped after interrupt → re-stashes pending_steer for the caller.
- No active agent → /steer reduces to sending the text as a normal message.

Tests
-----
- tests/run_agent/test_steer.py — accept/reject, concatenation, drain,
  last-tool-result injection, multimodal list content, thread safety,
  cleared-on-interrupt, registry membership, bypass-set membership.
- tests/gateway/test_steer_command.py — running agent, pending sentinel,
  missing steer() method, rejected payload, empty payload.
- tests/gateway/test_command_bypass_active_session.py — /steer bypasses
  the Level-1 base adapter guard.
- tests/test_tui_gateway_server.py — session.steer RPC paths.

72/72 targeted tests pass under scripts/run_tests.sh.

* feat(steer): register /steer in Discord's native slash tree

Discord's app_commands tree is a curated subset of slash commands (not
derived from COMMAND_REGISTRY like Telegram/Slack). /steer already
works there as plain text (routes through handle_message → base
adapter bypass → runner), but registering it here adds Discord's
native autocomplete + argument hint UI so users can discover and
type it like any other first-class command.
2026-04-18 04:17:18 -07:00
Teknium f9667331e5 docs(browser): improve /browser connect setup guidance (#12123)
- Note that /browser connect is CLI-only and won't work in gateways (WebUI, Telegram, Discord).
- Update the Chrome launch command to use a dedicated --user-data-dir, so port 9222 actually comes up even when Chrome is already running with the user's regular profile.
- Add --no-first-run --no-default-browser-check to skip the fresh-profile wizard.
- Explain why the dedicated user-data-dir matters.

Community tip via Karamjit Singh.

Co-authored-by: teknium1 <teknium@noreply.github.com>
2026-04-18 04:14:05 -07:00
Teknium 9527707f80 fix(signal): back off sendTyping spam for unreachable recipients (#12118)
base.py's _keep_typing refresh loop calls send_typing every ~2s while
the agent is processing. If signal-cli returns NETWORK_FAILURE for the
recipient (offline, unroutable, group membership lost), the unmitigated
path was a WARNING log every 2 seconds for as long as the agent stayed
busy — a user report showed 1048 warnings in 41 minutes for one
offline contact, plus the matching volume of pointless RPC traffic to
signal-cli.

- _rpc() accepts log_failures=False so callers can route repeated
  expected failures (typing) to DEBUG while keeping send/receive at
  WARNING.
- send_typing() tracks consecutive failures per chat. First failure
  still logs WARNING so transport issues remain visible; subsequent
  failures log at DEBUG. After three consecutive failures we skip the
  RPC during an exponential cooldown (16s, 32s, 60s cap) so we stop
  hammering signal-cli for a recipient it can't deliver to. A
  successful sendTyping resets the counters.
- _stop_typing_indicator() clears the backoff state so the next agent
  turn starts fresh.

E2E simulation against the reported 41-minute window: RPCs drop from
1230 to 45 (-96%), log lines from 1048 WARNINGs to 1 WARNING + 44
DEBUGs.

Credits kshitijk4poor (#12056) for the _rpc log_failures kwarg idea;
the broader restructure in that PR (nested per-chat loop inside
send_typing) is avoided here in favour of stateful backoff that
preserves base.py's existing _keep_typing architecture.
2026-04-18 04:13:32 -07:00
Teknium cf012a05d8 docs(terminal): warn against stacking watch_patterns + notify_on_complete on end-of-run markers (#12113)
Stacking both features on the same event produces duplicate, delayed
notifications — delivery is async and continues firing after the process
exits, so matches on end-of-run markers (SUMMARY, DONE, PASS) arrive
after the agent has already polled/waited and moved on.

Updates both the terminal tool JSON schema description and the
terminal_tool() function docstring to make the split explicit:

- watch_patterns: mid-process signals only (errors, readiness markers,
  intermediate steps you want to react to before the process exits)
- notify_on_complete: end-of-run completion signal

No behavioural change.
2026-04-18 03:53:21 -07:00
teknium1 3b69b2fd61 test(session-search): regression coverage for CJK LIKE fallback
Twelve tests under TestCJKSearchFallback guarding:
 - CJK detection across Chinese/Japanese/Korean/Hiragana/Katakana ranges
   (including the full Hangul syllables block \uac00-\ud7af, to catch
   the shorter-range typo from one of the duplicate PRs)
 - Substring match for multi-char Chinese, Japanese, Korean queries
 - Filter preservation (source_filter, exclude_sources, role_filter)
   in the LIKE path — guards against the SQL-builder bug from another
   duplicate PR where filter clauses landed after LIMIT/OFFSET
 - Snippet centered on the matched term (instr-based substr window),
   not the leading 200 chars of content
 - English fast-path untouched
 - Empty/no-match cases
 - Mixed CJK+English queries

Also:
 - hermes_state.py: LIKE-fallback snippet is now
   `substr(content, max(1, instr(content, ?) - 40), 120)`, centered on
   the match instead of the whole-content default. Credit goes to
   @iamagenius00 for the snippet idea in PR #11517.
 - scripts/release.py: add @iamagenius00 to AUTHOR_MAP so future
   release attribution resolves cleanly.

Refs #11511, #11516, #11517, #11541.

Co-authored-by: iamagenius00 <iamagenius00@users.noreply.github.com>
2026-04-18 01:57:57 -07:00
vominh1919 8826d9c197 fix: FTS5 LIKE fallback for CJK (Chinese/Japanese/Korean) queries
FTS5 default tokenizer splits CJK text character-by-character, causing
multi-character queries like '记忆断裂' to return 0 results.

This fix adds a LIKE fallback: when FTS5 returns no results and the
query contains CJK characters, retry with WHERE content LIKE '%query%'.
Preserves FTS5 performance for English queries.

Fixes #11511
2026-04-18 01:57:57 -07:00
Teknium a2c9f5d0a7 docs(execute_code): document project/strict execution modes (#12073)
Follow-up to PR #11971. Documents the new code_execution.mode config
key and what each mode actually does.

- user-guide/configuration.md: add mode: project to the yaml example,
  explain project vs strict and call out that security invariants are
  identical across modes.
- user-guide/features/code-execution.md: new 'Execution Mode' section
  with a comparison table and usage guidance; update the 'temporary
  directory' note so it reflects that script.py runs in the session
  CWD in project mode (staging dir stays on PYTHONPATH for imports);
  drop stale 'sandboxed' framing from the intro and skill-passthrough
  paragraph.
- getting-started/learning-path.md: update the one-line Code Execution
  summary to match (no longer 'sandboxed environments' — the default
  runs in the session's real working directory).

No code changes.
2026-04-18 01:53:09 -07:00
Teknium 8322b42c6c fix(streaming): surface dropped tool-call on mid-stream stall (#12072)
When streaming died after text was already delivered to the user but
before a tool-call's arguments finished streaming, the partial-stream
stub at the end of _interruptible_streaming_api_call silently set
`tool_calls=None` on the returned message and kept `finish_reason=stop`.
The agent treated the turn as complete, the session exited cleanly with
code 0, and the attempted action was lost with zero user-facing signal.

Live-observed Apr 2026 with MiniMax M2.7 on a ~6-minute audit task:
agent streamed 'Let me write the audit:', started emitting a write_file
tool call, MiniMax stalled for 240s mid-arguments, the stale-stream
detector killed the connection, the stub fired, session ended, no file
written, no error shown.

Fix: the streaming accumulator now records each tool-call's name into
`result['partial_tool_names']` as soon as the name is known. When the
stub builder fires after a partial delivery and finds any recorded tool
names, it appends a human-visible warning to the stub's content — and
also fires it as a live stream delta so the user sees it immediately,
not only in the persisted transcript. The next turn's model also sees
the warning in conversation history and can retry on its own. Text-only
partial streams keep the original bare-recovery behaviour (no warning).

Validation:
| Scenario                                    | Before                    | After                                       |
|---------------------------------------------|---------------------------|---------------------------------------------|
| Stream dies mid tool-call, text already sent | Silent exit, no indication | User sees ⚠ warning naming the dropped tool |
| Text-only partial stream                     | Bare recovered text       | Unchanged                                   |
| tests/run_agent/test_streaming.py            | 24 passed                 | 26 passed (2 new)                           |
2026-04-18 01:52:06 -07:00
Teknium 285bb2b915 feat(execute_code): add project/strict execution modes, default to project (#11971)
Weaker models (Gemma-class) repeatedly rediscover and forget that
execute_code uses a different CWD and Python interpreter than terminal(),
causing them to flip-flop on whether user files exist and to hit import
errors on project dependencies like pandas.

Adds a new 'code_execution.mode' config key (default 'project') that
brings execute_code into line with terminal()'s filesystem/interpreter:

  project (new default):
    - cwd       = session's TERMINAL_CWD (falls back to os.getcwd())
    - python    = active VIRTUAL_ENV/bin/python or CONDA_PREFIX/bin/python
                  with a Python 3.8+ version check; falls back cleanly to
                  sys.executable if no venv or the candidate fails
    - result    : 'import pandas' works, '.env' resolves, matches terminal()

  strict (opt-in):
    - cwd       = staging tmpdir (today's behavior)
    - python    = sys.executable (today's behavior)
    - result    : maximum reproducibility and isolation; project deps
                  won't resolve

Security-critical invariants are identical across both modes and covered by
explicit regression tests:

  - env scrubbing (strips *_API_KEY, *_TOKEN, *_SECRET, *_PASSWORD,
    *_CREDENTIAL, *_PASSWD, *_AUTH substrings)
  - SANDBOX_ALLOWED_TOOLS whitelist (no execute_code recursion, no
    delegate_task, no MCP from inside scripts)
  - resource caps (5-min timeout, 50KB stdout, 50 tool calls)

Deliberately avoids 'sandbox'/'isolated'/'cloud' language in tool
descriptions (regression from commit 39b83f34 where agents on local
backends falsely believed they were sandboxed and refused networking).

Override via env var: HERMES_EXECUTE_CODE_MODE=strict|project
2026-04-18 01:46:25 -07:00
Teknium 54e0eb24c0 docs: correctness audit — fix wrong values, add missing coverage (#11972)
Comprehensive audit of every reference/messaging/feature doc page against the
live code registries (PROVIDER_REGISTRY, OPTIONAL_ENV_VARS, COMMAND_REGISTRY,
TOOLSETS, tool registry, on-disk skills). Every fix was verified against code
before writing.

### Wrong values fixed (users would paste-and-fail)

- reference/environment-variables.md:
  - DASHSCOPE_BASE_URL default was `coding-intl.dashscope.aliyuncs.com/v1` \u2192
    actual `dashscope-intl.aliyuncs.com/compatible-mode/v1`.
  - MINIMAX_BASE_URL and MINIMAX_CN_BASE_URL defaults were `/v1` \u2192 actual
    `/anthropic` (Hermes calls MiniMax via its Anthropic Messages endpoint).
- reference/toolsets-reference.md MCP example used the non-existent nested
  `mcp: servers:` key \u2192 real key is the flat `mcp_servers:`.
- reference/skills-catalog.md listed ~20 bundled skills that no longer exist
  on disk (all moved to `optional-skills/`). Regenerated the whole bundled
  section from `skills/**/SKILL.md` \u2014 79 skills, accurate paths and names.
- messaging/slack.md ":::info" callout claimed Slack has no
  `free_response_channels` equivalent; both the env var and the yaml key are
  in fact read.
- messaging/qqbot.md documented `QQ_MARKDOWN_SUPPORT` as an env var, but the
  adapter only reads `extra.markdown_support` from config.yaml. Removed the
  env var row and noted config-only nature.
- messaging/qqbot.md `hermes setup gateway` \u2192 `hermes gateway setup`.

### Missing coverage added

- Providers: AWS Bedrock and Qwen Portal (qwen-oauth) \u2014 both in
  PROVIDER_REGISTRY but undocumented everywhere. Added sections to
  integrations/providers.md, rows to quickstart.md and fallback-providers.md.
- integrations/providers.md "Fallback Model" provider list now includes
  gemini, google-gemini-cli, qwen-oauth, xai, nvidia, ollama-cloud, bedrock.
- reference/cli-commands.md `--provider` enum and HERMES_INFERENCE_PROVIDER
  enum in env-vars now include the same set.
- reference/slash-commands.md: added `/agents` (alias `/tasks`) and `/copy`.
  Removed duplicate rows for `/snapshot`, `/fast` (\u00d72), `/debug`.
- reference/tools-reference.md: fixed "47 built-in tools" \u2192 52. Added
  `feishu_doc` and `feishu_drive` toolset sections.
- reference/toolsets-reference.md: added `feishu_doc` / `feishu_drive` core
  rows + all missing `hermes-<platform>` toolsets in the platform table
  (bluebubbles, dingtalk, feishu, qqbot, wecom, wecom-callback, weixin,
  homeassistant, webhook, gateway). Fixed the `debugging` composite to
  describe the actual `includes=[...]` mechanism.
- reference/optional-skills-catalog.md: added `fitness-nutrition`.
- reference/environment-variables.md: added NOUS_BASE_URL,
  NOUS_INFERENCE_BASE_URL, NVIDIA_API_KEY/BASE_URL, OLLAMA_API_KEY/BASE_URL,
  XAI_API_KEY/BASE_URL, MISTRAL_API_KEY, AWS_REGION/AWS_PROFILE,
  BEDROCK_BASE_URL, HERMES_QWEN_BASE_URL, DISCORD_ALLOWED_CHANNELS,
  DISCORD_PROXY, TELEGRAM_REPLY_TO_MODE, MATRIX_DEVICE_ID, MATRIX_REACTIONS,
  QQBOT_HOME_CHANNEL_NAME, QQ_SANDBOX.
- messaging/discord.md: documented DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY,
  HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS and HERMES_DISCORD_TEXT_BATCH_SPLIT
  _DELAY_SECONDS (all actively read by the adapter).
- messaging/matrix.md: documented MATRIX_REACTIONS (default true).
- messaging/telegram.md: removed the redundant second Webhook Mode section
  that invented a `telegram.webhook_mode: true` yaml key the adapter does
  not read.
- user-guide/features/hooks.md: added `on_session_finalize` and
  `on_session_reset` (both emitted via invoke_hook but undocumented).
- user-guide/features/api-server.md: documented GET /health/detailed, the
  `/api/jobs/*` CRUD surface, POST /v1/runs, and GET /v1/runs/{id}/events
  (10 routes that were live but undocumented).
- user-guide/features/fallback-providers.md: added `approval` and
  `title_generation` auxiliary-task rows; added gemini, bedrock, qwen-oauth
  to the supported-providers table.
- user-guide/features/tts.md: "seven providers" \u2192 "eight" (post-xAI add
  oversight in #11942).
- user-guide/configuration.md: TTS provider enum gains `xai` and `gemini`;
  yaml example block gains `mistral:`, `gemini:`, `xai:` subsections.
  Auxiliary-provider enum now enumerates all real registry entries.
- reference/faq.md: stale AIAgent/config examples bumped from
  `nous/hermes-3-llama-3.1-70b` and `claude-sonnet-4.6` to
  `claude-opus-4.7`.

### Docs-site integrity

- guides/build-a-hermes-plugin.md referenced two nonexistent hooks
  (`pre_api_request`, `post_api_request`). Replaced with the real
  `on_session_finalize` / `on_session_reset` entries.
- messaging/open-webui.md and features/api-server.md had pre-existing
  broken links to `/docs/user-guide/features/profiles` (actual path is
  `/docs/user-guide/profiles`). Fixed.
- reference/skills-catalog.md had one `<1%` literal that MDX parsed as a
  JSX tag. Escaped to `&lt;1%`.

### False positives filtered out (not changed, verified correct)

- `/set-home` is a registered alias of `/sethome` \u2014 docs were fine.
- `hermes setup gateway` is valid syntax (`hermes setup \<section\>`);
  changed in qqbot.md for cross-doc consistency, not as a bug fix.
- Telegram reactions "disabled by default" matches code (default `"false"`).
- Matrix encryption "opt-in" matches code (empty env default \u2192 disabled).
- `pre_api_request` / `post_api_request` hooks do NOT exist in current code;
  documented instead the real `on_session_finalize` / `on_session_reset`.
- SIGNAL_IGNORE_STORIES is already in env-vars.md (subagent missed it).

Validation:
- `docusaurus build` \u2014 passes (only pre-existing nix-setup anchor warning).
- `ascii-guard lint docs` \u2014 124 files, 0 errors.
- 22 files changed, +317 / \u2212158.
2026-04-18 01:45:48 -07:00
Teknium 73bccc94c7 skills: consolidate mlops redundancies (gguf+llama-cpp, grpo+trl, guidance→optional) (#11965)
Three tightly-scoped built-in skill consolidations to reduce redundancy in
the available_skills listing injected into every system prompt:

1. gguf-quantization → llama-cpp (merged)
   GGUF is llama.cpp's format; two skills covered the same toolchain. The
   merged llama-cpp skill keeps the full K-quant table + imatrix workflow
   from gguf and the ROCm/benchmarks/supported-models sections from the
   original llama-cpp. All 5 reference files preserved.

2. grpo-rl-training → fine-tuning-with-trl (folded in)
   GRPO isn't a framework, it's a trainer inside TRL. Moved the 17KB
   deep-dive SKILL.md to references/grpo-training.md and the working
   template to templates/basic_grpo_training.py. TRL's GRPO workflow
   section now points to both. Atropos skill's related_skills updated.

3. guidance → optional-skills/mlops/
   Dropped from built-in. Outlines (still built-in) covers the same
   structured-generation ground with wider adoption. Listed in the
   optional catalog for users who specifically want Guidance.

Net: 3 fewer built-in skill lines in every system prompt, zero content
loss. Contributor authorship preserved via git rename detection.
2026-04-17 21:36:40 -07:00
Teknium 598cba62ad test: update stale tests to match current code (#11963)
Seven test files were asserting against older function signatures and
behaviors. CI has been red on main because of accumulated test debt
from other PRs; this catches the tests up.

- tests/agent/test_subagent_progress.py: _build_child_progress_callback
  now takes (task_index, goal, parent_agent, task_count=1); update all
  call sites and rewrite tests that assumed the old 'batch-only' relay
  semantics (now relays per-tool AND flushes a summary at BATCH_SIZE).
  Renamed test_thinking_not_relayed_to_gateway → test_thinking_relayed_to_gateway
  since thinking IS now relayed as subagent.thinking.
- tests/tools/test_delegate.py: _build_child_agent now requires
  task_count; add task_count=1 to all 8 call sites.
- tests/cli/test_reasoning_command.py: AIAgent gained _stream_callback;
  stub it on the two test agent helpers that use spec=AIAgent / __new__.
- tests/hermes_cli/test_cmd_update.py: cmd_update now runs npm install
  in repo root + ui-tui/ + web/ and 'npm run build' in web/; assert
  all four subprocess calls in the expected order.
- tests/hermes_cli/test_model_validation.py: dissimilar unknown models
  now return accepted=False (previously True with warning); update
  both affected tests.
- tests/tools/test_registry.py: include feishu_doc_tool and
  feishu_drive_tool in the expected builtin tool set.
- tests/gateway/test_voice_command.py: missing-voice-deps message now
  suggests 'pip install PyNaCl' not 'hermes-agent[messaging]'.

411/411 pass locally across these 7 files.
2026-04-17 21:35:30 -07:00
Teknium 5ff65dbf68 docs(execute_code): clarify that scripts run in their own temp dir, not session CWD (#11956)
Weaker models (Gemma-class) repeatedly rediscover and forget that execute_code's
working directory differs from terminal()/read_file()'s, leading to
os.path.exists('.env') returning False even though the file exists in the
session's CWD. They then bounce between 'the file exists' and 'the file is
missing' across tool calls.

Adds a 'Working directory' note to the execute_code schema description
pointing agents at absolute paths (os.path.expanduser) or terminal()/read_file()
for inspecting user files.

Carefully avoids the 'sandbox'/'isolated'/'cloud' language that commit
39b83f34 removed (it caused agents on local backends to refuse networking
tasks and save false sandbox beliefs to persistent memory). Purely factual
CWD guidance — no restriction implications.
2026-04-17 21:30:34 -07:00
Teknium c20e236b71 chore: map AviArora02-commits author email in release AUTHOR_MAP 2026-04-17 21:30:17 -07:00
AviArora02-commits 994faacce8 fix: suppress Authorization: Bearer for Gemini provider to prevent HTTP 400 (#7893) 2026-04-17 21:30:17 -07:00
Teknium 8a59f8a9ed fix(update): survive mid-update terminal disconnect (#11960)
hermes update no longer dies when the controlling terminal closes
(SSH drop, shell close) during pip install.  SIGHUP is set to SIG_IGN
for the duration of the update, and stdout/stderr are wrapped so writes
to a closed pipe are absorbed instead of cascading into process exit.
All update output is mirrored to ~/.hermes/logs/update.log so users can
see what happened after reconnecting.

SIGINT (Ctrl-C) and SIGTERM (systemd) are intentionally still honored —
those are deliberate cancellations, not accidents.  In gateway mode the
helper is a no-op since the update is already detached.

POSIX preserves SIG_IGN across exec(), so pip and git subprocesses
inherit hangup protection automatically — no changes to subprocess
spawning needed.
2026-04-17 21:29:24 -07:00
Teknium 1c352f6b1d docs(browser): expand Camofox persistence guide with troubleshooting (#11957)
The existing 'Persistent browser sessions' section had the correct config
snippet but users still hit the flag at the wrong config path, assumed
Hermes could force persistence when the server was ephemeral, and had no
way to verify the flag was actually taking effect.

Adds to that section:
- Warning admonition calling out the nested path vs top-level mistake.
- Explicit 'What Hermes does / does not do' split so users understand
  Hermes can only send a stable userId; the Camofox server must map it
  to a persistent profile.
- 5-step verification flow for confirming persistence works end-to-end.
- Reminder to restart Hermes after editing config.yaml.
- Where Hermes derives the stable userId (~/.hermes/browser_auth/camofox/)
  so users can reset or back up state.

Docs-only change.
2026-04-17 21:23:31 -07:00
Teknium 11a89cc032 docs: backfill coverage for recently-merged features (#11942)
Fills documentation gaps that accumulated as features merged ahead of their
docs updates. All additions are verified against code and the originating PRs.

Providers:
- Ollama Cloud (#10782) — new provider section, env vars, quickstart/fallback rows
- xAI Grok Responses API + TTS (#10783) — provider note, TTS table + config
- Google Gemini CLI OAuth (#11270) — quickstart/fallback/cli-commands entries
- NVIDIA NIM (#11774) — NVIDIA_API_KEY / NVIDIA_BASE_URL in env-vars reference
- HERMES_INFERENCE_PROVIDER enum updated

Messaging:
- DISCORD_ALLOWED_ROLES (#11608) — env-vars, discord.md access control section
- DingTalk QR device-flow (#11574) — wizard path in Option A + openClaw disclosure
- Feishu document comment intelligent reply (#11898) — full section + 3-tier access control + CLI

Skills / commands:
- concept-diagrams skill (#11363) — optional-skills-catalog entry
- /gquota (#11270) — slash-commands reference

Build: docusaurus build passes, ascii-guard lint 0 errors.
2026-04-17 21:22:11 -07:00
Teknium 45acd9beb5 fix(gateway): ignore redelivered /restart after PTB offset ACK fails (#11940)
When a Telegram /restart fires and PTB's graceful-shutdown `get_updates`
ACK call times out ("When polling for updates is restarted, updates may
be received twice" in gateway.log), the new gateway receives the same
/restart again and restarts a second time — a self-perpetuating loop.

Record the triggering update_id in `.restart_last_processed.json` when
handling /restart.  On the next process, reject a /restart whose
update_id <= the recorded one as a stale redelivery.  5-minute staleness
guard so an orphaned marker can't block a legitimately new /restart.

- gateway/platforms/base.py: add `platform_update_id` to MessageEvent
- gateway/platforms/telegram.py: propagate `update.update_id` through
  _build_message_event for text/command/location/media handlers
- gateway/run.py: write dedup marker in _handle_restart_command;
  _is_stale_restart_redelivery checks it before processing /restart
- tests/gateway/test_restart_redelivery_dedup.py: 9 new tests covering
  fresh restart, redelivery, staleness window, cross-platform,
  malformed-marker resilience, and no-update_id (CLI) bypass

Only active for Telegram today (the one platform with monotonic
cross-session update ordering); other platforms return False from
_is_stale_restart_redelivery and proceed normally.
2026-04-17 21:17:33 -07:00
Teknium c5c0bb9a73 fix: point optional-dep install hints at the venv's python (#11938)
Error messages that tell users to install optional extras now use
{sys.executable} -m pip install ... instead of a bare 'pip install
hermes-agent[extra]' string.  Under the curl installer, bare 'pip'
resolves to system pip, which either fails with PEP 668
externally-managed-environment or installs into the wrong Python.

Affects: hermes dashboard, hermes web server startup, mcp_serve,
hermes doctor Bedrock check, CLI voice mode, voice_mode tool runtime
error, Discord voice-channel join failure message.
2026-04-17 21:16:33 -07:00
Teknium 20f2258f34 fix(interrupt): propagate to concurrent-tool workers + opt-in debug trace (#11907)
* fix(interrupt): propagate to concurrent-tool workers + opt-in debug trace

interrupt() previously only flagged the agent's _execution_thread_id.
Tools running inside _execute_tool_calls_concurrent execute on
ThreadPoolExecutor worker threads whose tids are distinct from the
agent's, so is_interrupted() inside those tools returned False no matter
how many times the gateway called .interrupt() — hung ssh / curl / long
make-builds ran to their own timeout.

Changes:
- run_agent.py: track concurrent-tool worker tids in a per-agent set,
  fan interrupt()/clear_interrupt() out to them, and handle the
  register-after-interrupt race at _run_tool entry.  getattr fallback
  for the tracker so test stubs built via object.__new__ keep working.
- tools/environments/base.py: opt-in _wait_for_process trace (ENTER,
  per-30s HEARTBEAT with interrupt+activity-cb state, INTERRUPT
  DETECTED, TIMEOUT, EXIT) behind HERMES_DEBUG_INTERRUPT=1.
- tools/interrupt.py: opt-in set_interrupt() trace (caller tid, target
  tid, set snapshot) behind the same env flag.
- tests: new regression test runs a polling tool on a concurrent worker
  and asserts is_interrupted() flips to True within ~1s of interrupt().
  Second new test guards clear_interrupt() clearing tracked worker bits.

Validation: tests/run_agent/ all 762 pass; tests/tools/ interrupt+env
subset 216 pass.

* fix(interrupt-debug): bypass quiet_mode logger filter so trace reaches agent.log

AIAgent.__init__ sets logging.getLogger('tools').setLevel(ERROR) when
quiet_mode=True (the CLI default). This would silently swallow every
INFO-level trace line from the HERMES_DEBUG_INTERRUPT=1 instrumentation
added in the parent commit — confirmed by running hermes chat -q with
the flag and finding zero trace lines in agent.log even though
_wait_for_process was clearly executing (subprocess pid existed).

Fix: when HERMES_DEBUG_INTERRUPT=1, each traced module explicitly sets
its own logger level to INFO at import time, overriding the 'tools'
parent-level filter. Scoped to the opt-in case only, so production
(quiet_mode default) logs stay quiet as designed.

Validation: hermes chat -q with HERMES_DEBUG_INTERRUPT=1 now writes
'_wait_for_process ENTER/EXIT' lines to agent.log as expected.

* fix(cli): SIGTERM/SIGHUP no longer orphans tool subprocesses

Tool subprocesses spawned by the local environment backend use
os.setsid so they run in their own process group. Before this fix,
SIGTERM/SIGHUP to the hermes CLI killed the main thread via
KeyboardInterrupt but the worker thread running _wait_for_process
never got a chance to call _kill_process — Python exited, the child
was reparented to init (PPID=1), and the subprocess ran to its
natural end (confirmed live: sleep 300 survived 4+ min after SIGTERM
to the agent until manual cleanup).

Changes:
- cli.py _signal_handler (interactive) + _signal_handler_q (-q mode):
  route SIGTERM/SIGHUP through agent.interrupt() so the worker's poll
  loop sees the per-thread interrupt flag and calls _kill_process
  (os.killpg) on the subprocess group. HERMES_SIGTERM_GRACE (default
  1.5s) gives the worker time to complete its SIGTERM+SIGKILL
  escalation before KeyboardInterrupt unwinds main.
- tools/environments/base.py _wait_for_process: wrap the poll loop in
  try/except (KeyboardInterrupt, SystemExit) so the cleanup fires
  even on paths the signal handlers don't cover (direct sys.exit,
  unhandled KI from nested code, etc.). Emits EXCEPTION_EXIT trace
  line when HERMES_DEBUG_INTERRUPT=1.
- New regression test: injects KeyboardInterrupt into a running
  _wait_for_process via PyThreadState_SetAsyncExc, verifies the
  subprocess process group is dead within 3s of the exception and
  that KeyboardInterrupt re-raises cleanly afterward.

Validation:
| Before                                                  | After              |
|---------------------------------------------------------|--------------------|
| sleep 300 survives 4+ min as PPID=1 orphan after SIGTERM | dies within 2 s   |
| No INTERRUPT DETECTED in trace                          | INTERRUPT DETECTED fires + killing process group |
| tests/tools/test_local_interrupt_cleanup                | 1/1 pass          |
| tests/run_agent/test_concurrent_interrupt               | 4/4 pass          |
2026-04-17 20:39:25 -07:00
Teknium 607be54a24 fix(discord): forum channel media + polish
Extend forum support from PR #10145:

- REST path (_send_discord): forum thread creation now uploads media
  files as multipart attachments on the starter message in a single
  call. Previously media files were silently dropped on the forum
  path.
- Websocket media paths (_send_file_attachment, send_voice, send_image,
  send_animation — covers send_image_file, send_video, send_document
  transitively): forum channels now go through a new _forum_post_file
  helper that creates a thread with the file as starter content,
  instead of failing via channel.send(file=...) which forums reject.
- _send_to_forum chunk follow-up failures are collected into
  raw_response['warnings'] so partial-send outcomes surface.
- Process-local probe cache (_DISCORD_CHANNEL_TYPE_PROBE_CACHE) avoids
  GET /channels/{id} on every uncached send after the first.
- Dedup of TestSendDiscordMedia that the PR merge-resolution left
  behind.
- Docs: Forum Channels section under website/docs/user-guide/messaging/discord.md.

Tests: 117 passed (22 new for forum+media, probe cache, warnings).
2026-04-17 20:25:48 -07:00
ChimingLiu e5333e793c feat(discord): support forum channels 2026-04-17 20:25:48 -07:00
helix4u 148459716c fix(kimi): cover remaining fixed-temperature bypasses 2026-04-17 20:25:42 -07:00
Teknium 53e4a2f2c6 feat(update): warn about legacy hermes.service units during hermes update (#11918)
Follow-up to #11909: surface the legacy-unit warning where users are most
likely to see it. After a 'hermes update', if a pre-rename hermes.service
is still installed alongside the current hermes-gateway.service, print
the list of legacy units + the 'hermes gateway migrate-legacy' command.

Profile-safe: reuses _find_legacy_hermes_units() which is an explicit
allowlist of hermes.service only — profile units never match.
Platform-gated: only prints on systemd hosts (the rename is Linux-only).
Non-blocking: just prints, never prompts, so gateway-spawned
hermes update --gateway runs aren't affected.
2026-04-17 19:35:12 -07:00
Teknium 07db20c72d fix(gateway): detect legacy hermes.service + mark --replace SIGTERM as planned (#11909)
* fix(gateway): detect legacy hermes.service units from pre-rename installs

Older Hermes installs used a different service name (hermes.service) before
the rename to hermes-gateway.service. When both units remain installed, they
fight over the same bot token — after PR #5646's signal-recovery change,
this manifests as a 30-second SIGTERM flap loop between the two services.

Detection is an explicit allowlist (no globbing) plus an ExecStart content
check, so profile units (hermes-gateway-<profile>.service) and unrelated
third-party services named 'hermes' are never matched.

Wired into systemd_install, systemd_status, gateway_setup wizard, and the
main hermes setup flow — anywhere we already warn about scope conflicts now
also warns about legacy units.

* feat(gateway): add migrate-legacy command + install-time removal prompt

- New hermes_cli.gateway.remove_legacy_hermes_units() removes legacy
  unit files with stop → disable → unlink → daemon-reload. Handles user
  and system scopes separately; system scope returns path list when not
  running as root so the caller can tell the user to re-run with sudo.
- New 'hermes gateway migrate-legacy' subcommand (with --dry-run and -y)
  routes to remove_legacy_hermes_units via gateway_command dispatch.
- systemd_install now offers to remove legacy units BEFORE installing
  the new hermes-gateway.service, preventing the SIGTERM flap loop that
  hits users who still have pre-rename hermes.service around.

Profile units (hermes-gateway-<profile>.service) remain untouched in
all paths — the legacy allowlist is explicit (_LEGACY_SERVICE_NAMES)
and the ExecStart content check further narrows matches.

* fix(gateway): mark --replace SIGTERM as planned so target exits 0

PR #5646 made SIGTERM exit the gateway with code 1 so systemd's
Restart=on-failure revives it after unexpected kills. But when a user has
two gateway units fighting for the same bot token (e.g. legacy
hermes.service + hermes-gateway.service from a pre-rename install), the
--replace takeover itself becomes the 'unexpected' SIGTERM — the loser
exits 1, systemd revives it 30s later, and the cycle flaps indefinitely.

Before calling terminate_pid(), --replace now writes a short-lived marker
file naming the target PID + start_time. The target's shutdown_signal_handler
consumes the marker and, when it names this process, leaves
_signal_initiated_shutdown=False so the final exit code stays 0.

Staleness defences:
- PID + start_time combo prevents PID reuse matching an old marker
- Marker older than 60s is treated as stale and discarded
- Marker is unlinked on first read even if it doesn't match this process
- Replacer clears the marker post-loop + on permission-denied give-up
2026-04-17 19:27:58 -07:00
Teknium 38436eb4e3 chore(release): add pedh to AUTHOR_MAP 2026-04-17 19:26:53 -07:00
pedh 86fd0f846d docs(dingtalk): document AI Cards, emoji reactions, and display settings
- AI Cards: how to configure ``card_template_id`` for streaming rich replies
- Emoji reactions: 🤔Thinking → 🥳Done lifecycle
- Per-platform display settings (streaming, tool_progress, reasoning, etc.)
- Installation: switch to the ``hermes-agent[dingtalk]`` extra (adds
  alibabacloud-dingtalk alongside dingtalk-stream)
- Messaging capability matrix updated to reflect images, audio, video,
  and threading support
2026-04-17 19:26:53 -07:00
pedh 4459913f40 feat(dingtalk): AI Cards streaming, emoji reactions, and media handling
Cherry-picked from #10985 by pedh, adapted to current main:

* Keeps main's full group-chat gating (require_mention + allowed_users +
  free_response_chats + mention_patterns) — PR's simpler subset dropped.
* Keeps main's fire-and-forget process() dispatch + session_webhook
  fallback for SDK >= 0.24.
* Picks up PR's REQUIRES_EDIT_FINALIZE capability flag on
  BasePlatformAdapter + finalize kwarg on edit_message(), plumbed through
  stream_consumer.  Default False so Telegram/Slack/Discord/Matrix stay
  on the zero-overhead fast path.
* DingTalk AI Card lifecycle: per-chat _message_contexts, two-card flow
  (tool-progress + final response) with sibling auto-close driven by
  reply_to, idempotent 🤔Thinking → 🥳Done swap, $alibabacloud-dingtalk$
  for media URL resolution (replaces raw HTTP that was 403-ing).
* pyproject: dingtalk extra now dingtalk-stream>=0.20,<1 +
  alibabacloud-dingtalk>=2.0.0 + qrcode.

Closes #10991

Co-authored-by: pedh
2026-04-17 19:26:53 -07:00
Teknium d7ef562a05 fix(file-ops): follow terminal env's live cwd in _exec instead of init-time cached cwd (#11912)
ShellFileOperations captured the terminal env's cwd at __init__ time and
used that stale value for every subsequent _exec() call.  When the user
ran `cd` via the terminal tool, `env.cwd` updated but `ops.cwd` did not.
Relative paths passed to patch_replace / read_file / write_file / search
then targeted the ORIGINAL directory instead of the current one.

Observed symptom in agent sessions:

  terminal: cd .worktrees/my-branch
  patch hermes_cli/main.py <old> <new>
    → returns {"success": true} with a plausible unified diff
    → but `git diff` in the worktree shows nothing
    → the patch landed in the main repo's checkout of main.py instead

The diff looked legitimate because patch_replace computes it from the
IN-MEMORY content vs new_content, not by re-reading the file.  The
write itself DID succeed — it just wrote to the wrong directory's copy
of the same-named file.

Fix: _exec() now resolves cwd from live sources in this order:

  1. Explicit `cwd` arg (if provided by the caller)
  2. Live `self.env.cwd` (tracks `cd` commands run via terminal)
  3. Init-time `self.cwd` (fallback when env has no cwd attribute)

Includes a 5-test regression suite covering:
  - cd followed by relative read follows live cwd
  - the exact reported bug: patch_replace with relative path after cd
  - explicit cwd= arg still wins over env.cwd
  - env without cwd attribute falls back to init-time cwd
  - patch_replace success reflects real file state (safety rail)

Co-authored-by: teknium1 <teknium@nousresearch.com>
2026-04-17 19:26:40 -07:00
helix4u 47010e0757 fix(gateway): allow systemd-backed distrobox services 2026-04-17 19:24:30 -07:00
Teknium 213e39463b chore(release): add akhater to AUTHOR_MAP
Contributor of PR #11858 (nous OAuth providers mirror fix).  CI
blocks releases on unmapped author emails.
2026-04-17 19:13:40 -07:00
Teknium 2297c5f5ce fix(auth): restore --label for hermes auth add nous --type oauth
persist_nous_credentials() now accepts an optional label kwarg which
gets embedded in providers.nous under the 'label' key.
_seed_from_singletons() prefers the embedded label over the
auto-derived label_from_token() fingerprint when materialising the
pool entry, so re-seeding on every load_pool('nous') preserves the
user's chosen label.

auth_commands.py threads --label through to the helper, restoring
parity with how other OAuth providers (anthropic, codex, google,
qwen) honor the flag.

Tests: 4 new (embed, reseed-survives, no-label fallback, end-to-end
through auth_add_command). All 390 nous/auth/credential_pool tests
pass.
2026-04-17 19:13:40 -07:00
Antoine Khater c7fece1f9d fix: normalise Nous device-code pool source to avoid duplicates
Review feedback on the original commit: the helper wrote a pool entry
with source `manual:device_code` while `_seed_from_singletons()` upserts
with `device_code` (no `manual:` prefix), so the pool grew a duplicate
row on every `load_pool()` after login.

Normalise: the helper now writes `providers.nous` and delegates the pool
write entirely to `_seed_from_singletons()` via a follow-up
`load_pool()` call. The canonical source is `device_code`; the helper
never materialises a parallel `manual:device_code` entry.

- `persist_nous_credentials()` loses its `label` and `source` kwargs —
  both are now derived by the seed path from the singleton state.
- CLI and web dashboard call sites simplified accordingly.
- New test `test_persist_nous_credentials_idempotent_no_duplicate_pool_entries`
  asserts that two consecutive persists leave exactly one pool row and
  no stray `manual:` entries.
- Existing `test_auth_add_nous_oauth_persists_pool_entry` updated to
  assert the canonical source and single-entry invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 19:13:40 -07:00
Antoine Khater c096a6935f fix(auth): mirror Nous OAuth credentials to providers.nous on CLI login
`hermes auth add nous --type oauth` only wrote credential_pool.nous,
leaving providers.nous empty. When the Nous agent_key's 24h TTL expired,
run_agent.py's 401-recovery path called resolve_nous_runtime_credentials
(which reads providers.nous), got AuthError "Hermes is not logged into
Nous Portal", caught it as logger.debug (suppressed at INFO level), and
the agent died with "Non-retryable client error" — no signal to the
user that recovery even tried.

Introduce persist_nous_credentials() as the single source of truth for
Nous device-code login persistence. Both auth_commands (CLI) and
web_server (dashboard) now route through it, so pool and providers
stay in sync at write time.

Why: CLI-provisioned profiles couldn't recover from agent_key expiry,
producing silent daily outages 24h after first login. PR #6856/#6869
addressed adjacent issues but assumed providers.nous was populated;
this one wasn't being written.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 19:13:40 -07:00
Teknium a155b4a159 feat(auxiliary): default 'auto' routing to main model for all users (#11900)
Before: aggregator users (OpenRouter / Nous Portal) running 'auto'
routing for auxiliary tasks — compression, vision, web extraction,
session search, etc. — got routed to a cheap provider-side default
model (Gemini Flash).  Non-aggregator users already got their main
model.  Behavior was inconsistent and surprising — users picked
Claude / GPT / their preferred model, but side tasks ran on
Gemini Flash.

After: 'auto' means "use my main chat model" for every user,
regardless of provider type.  Only when the main provider has no
working client does the fallback chain run (OpenRouter → Nous →
custom → Codex → API-key providers).  Explicit per-task overrides
in config.yaml (auxiliary.<task>.provider / .model) still win —
they are a hard constraint, not subject to the auto policy.

Vision auto-detection follows the same policy: try main provider +
main model first (with _PROVIDER_VISION_MODELS overrides preserved
for providers like xiaomi and zai that ship a dedicated multimodal
model distinct from their chat model).  Aggregator strict vision
backends are fallbacks, not the primary path.

Changes:
  - agent/auxiliary_client.py: _resolve_auto() drops the
    `_AGGREGATOR_PROVIDERS` guard.  resolve_vision_provider_client()
    auto branch unifies aggregator and exotic-provider paths —
    everyone goes through resolve_provider_client() with main_model.
    Dead _AGGREGATOR_PROVIDERS constant removed (was only used by
    the guard we just removed).
  - hermes_cli/main.py: aux config menu copy updated to reflect
    the new semantics ("'auto' means 'use my main model'").
  - tests/agent/test_auxiliary_main_first.py: 12 regression tests
    covering OpenRouter/Nous/DeepSeek main paths, runtime-override
    wins, explicit-config wins, vision override preservation for
    exotic providers, and fallback-chain activation when the main
    provider has no working client.

Co-authored-by: teknium1 <teknium@nousresearch.com>
2026-04-17 19:13:23 -07:00
Teknium b449a0e049 fix(feishu-comment): use get_hermes_home(); drop dead asyncio wrapper; AUTHOR_MAP
Follow-up polish on top of the cherry-picked #11023 commit.

- feishu_comment_rules.py: replace import-time "~/.hermes" expanduser fallback
  with get_hermes_home() from hermes_constants (canonical, profile-safe).
- tools/feishu_doc_tool.py, tools/feishu_drive_tool.py: drop the
  asyncio.get_event_loop().run_until_complete(asyncio.to_thread(...)) dance.
  Tool handlers run synchronously in a worker thread with no running loop, so
  the RuntimeError branch was always the one that executed. Calls client.request
  directly now. Unused asyncio import removed.
- tests/gateway/test_feishu.py: add register_p2_customized_event to the mock
  EventDispatcher builder so the existing adapter test matches the new handler
  registration for drive.notice.comment_add_v1.
- scripts/release.py: map liujinkun@bytedance.com -> liujinkun2025 for
  contributor attribution on release notes.
2026-04-17 19:04:11 -07:00
liujinkun 85cdb04bd4 feat: add Feishu document comment intelligent reply with 3-tier access control
- Full comment handler: parse drive.notice.comment_add_v1 events, build
  timeline, run agent, deliver reply with chunking support.
- 5 tools: feishu_doc_read, feishu_drive_list_comments,
  feishu_drive_list_comment_replies, feishu_drive_reply_comment,
  feishu_drive_add_comment.
- 3-tier access control rules (exact doc > wildcard "*" > top-level >
  defaults) with per-field fallback. Config via
  ~/.hermes/feishu_comment_rules.json, mtime-cached hot-reload.
- Self-reply filter using generalized self_open_id (supports future
  user-identity subscriptions). Receiver check: only process events
  where the bot is the @mentioned target.
- Smart timeline selection, long text chunking, semantic text extraction,
  session sharing per document, wiki link resolution.

Change-Id: I31e82fd6355173dbcc400b8934b6d9799e3137b9
2026-04-17 19:04:11 -07:00
Teknium 9b14b76eb3 fix(wecom): bound req_id cache, revert undocumented is_group change, add tests
Follow-up to the cherry-picked contributor fix:

- Extract `_remember_chat_req_id()` and bound it at DEDUP_MAX_SIZE like
  `_reply_req_ids` — the unbounded dict would grow forever on a long-
  running gateway with many chats.
- Move the cache write to AFTER the group/DM policy check so we don't
  cache req_ids from blocked senders.
- Revert the undocumented `is_group` change: the contributor flipped
  `chattype == 'group'` to `bool(chatid)`, which wasn't mentioned in
  the PR description and weakens the signal (chattype is the explicit
  hint; relying on chatid presence assumes DMs never carry it). Keep
  the original check.
- Drop the defensive `getattr(self, '_last_chat_req_ids', {})` reads
  at both send sites — the attribute is initialized in __init__.
- Update `test_send_uses_passive_reply_stream_...` → `_markdown_...`
  to match the new msgtype, and add a new TestWeComZombieSessionFix
  class covering device_id presence in subscribe, per-chat req_id
  caching + bounding, blocked-sender cache exclusion, and the group
  APP_CMD_RESPONSE fallback path.
2026-04-17 19:03:29 -07:00
Devorun 2992802b35 fix(wecom): resolve WebSocket zombie sessions and group chat 600039 errors #11554 2026-04-17 19:03:29 -07:00
Teknium 04a0c3cb95 fix(config): preserve env refs when save_config rewrites config (#11892)
Co-authored-by: binhnt92 <84617813+binhnt92@users.noreply.github.com>
2026-04-17 19:03:26 -07:00
Teknium 8444f66890 feat(hermes model): add Configure auxiliary models UI to hermes model (#11891)
Previously users had to hand-edit config.yaml to route individual auxiliary
tasks (vision, compression, web_extract, etc.) to a specific provider+model.
Add a first-class picker reachable from the bottom of the existing `hermes
model` provider list.

Flow:
  hermes model
    → Configure auxiliary models...
      → <task picker: 9 tasks, shows current setting inline>
        → <provider picker: authenticated providers + auto + custom>
          → <model picker: curated list + live pricing>

The aux picker does NOT re-run credential/OAuth setup; users authenticate
providers through the normal `hermes model` flow, then route aux tasks to
them here.  `list_authenticated_providers()` gates the list to providers
the user has configured.

Also:
  - 'Cancel' entry relabeled 'Leave unchanged' (sentinel still 'cancel'
    internally, so dispatch logic is unchanged)
  - 'Reset all to auto' entry to bulk-clear aux overrides; preserves
    user-tuned timeout / download_timeout values
  - Adds `title_generation` task to DEFAULT_CONFIG.auxiliary — the task
    was called from agent/title_generator.py but was missing from defaults,
    so config-backed timeout overrides never worked for it

Co-authored-by: teknium1 <teknium@nousresearch.com>
2026-04-17 19:02:06 -07:00
Teknium bb85404b16 chore: add Sara Reynolds to AUTHOR_MAP 2026-04-17 18:58:29 -07:00
Sara Reynolds 8ab1aa2efc fix(gateway): fix discrepancies in gateway status 2026-04-17 18:58:29 -07:00
Xowiek 511ed4dacc fix(gateway): bypass active-session guard for gateway-handled slash commands 2026-04-17 18:58:03 -07:00
Michel Belleau d465fc5869 fix(skills): use frontmatter name in skills index instead of directory name
build_skills_system_prompt() was using the skill directory name (skill_name)
when appending to skills_by_category in all three code paths (snapshot cache,
cold filesystem scan, external dirs). This meant any skill whose directory name
differed from its frontmatter `name` field would appear under the wrong name in
the system prompt, causing LLM routing failures.

The snapshot entry already stores both skill_name (dir) and frontmatter_name
(declared); switch the three tuple appends to use frontmatter_name. Also fix
the external-dir dedup set (seen_skill_names) to track frontmatter names for
consistency with the local-skill tuples now stored under frontmatter_name.

Fixes #11777

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 18:56:37 -07:00
helix4u 016ae5c334 fix(kimi): force 0.6 on main chat path 2026-04-17 18:47:01 -07:00
Teknium 304fb921bf fix: two process leaks (agent-browser daemons, paste.rs sleepers) (#11843)
Both fixes close process leaks observed in production (18+ orphaned
agent-browser node daemons, 15+ orphaned paste.rs sleep interpreters
accumulated over ~3 days, ~2.7 GB RSS).

## agent-browser daemon leak

Previously the orphan reaper (_reap_orphaned_browser_sessions) only ran
from _start_browser_cleanup_thread, which is only invoked on the first
browser tool call in a process. Hermes sessions that never used the
browser never swept orphans, and the cross-process orphan detection
relied on in-process _active_sessions, which doesn't see other hermes
PIDs' sessions (race risk).

- Write <session>.owner_pid alongside the socket dir recording the
  hermes PID that owns the daemon (extracted into _write_owner_pid for
  direct testability).
- Reaper prefers owner_pid liveness over in-process _active_sessions.
  Cross-process safe: concurrent hermes instances won't reap each
  other's daemons. Legacy tracked_names fallback kept for daemons
  that predate owner_pid.
- atexit handler (_emergency_cleanup_all_sessions) now always runs
  the reaper, not just when this process had active sessions —
  every clean hermes exit sweeps accumulated orphans.

## paste.rs auto-delete leak

_schedule_auto_delete spawned a detached Python subprocess per call
that slept 6 hours then issued DELETE requests. No dedup, no tracking —
every 'hermes debug share' invocation added ~20 MB of resident Python
interpreters that stuck around until the sleep finished.

- Replaced the spawn with ~/.hermes/pastes/pending.json: records
  {url, expire_at} entries.
- _sweep_expired_pastes() synchronously DELETEs past-due entries on
  every 'hermes debug' invocation (run_debug() dispatcher).
- Network failures stay in pending.json for up to 24h, then give up
  (paste.rs's own retention handles the 'user never runs hermes again'
  edge case).
- Zero subprocesses; regression test asserts subprocess/Popen/time.sleep
  never appear in the function source (skipping docstrings via AST).

## Validation

|                              | Before        | After        |
|------------------------------|---------------|--------------|
| Orphan agent-browser daemons | 18 accumulated| 2 (live)     |
| paste.rs sleep interpreters  | 15 accumulated| 0            |
| RSS reclaimed                | -             | ~2.7 GB      |
| Targeted tests               | -             | 2253 pass    |

E2E verified: alive-owner daemons NOT reaped; dead-owner daemons
SIGTERM'd and socket dirs cleaned; pending.json sweep deletes expired
entries without spawning subprocesses.
2026-04-17 18:46:30 -07:00
helix4u 64b354719f Support browser CDP URL from config 2026-04-17 16:05:04 -07:00
brooklyn! e9b8ece103 Merge pull request #4692 from NousResearch/feat/ink-refactor
Feat/ink refactor
2026-04-17 18:02:37 -05:00
Teknium 3f43aec15d fix(tools): bound _read_tracker sub-containers + prune _completion_consumed (#11839)
Two accretion-over-time leaks that compound over long CLI / gateway
lifetimes.  Both were flagged in the memory-leak audit.

## file_tools._read_tracker

_read_tracker[task_id] holds three sub-containers that grew unbounded:

  read_history     set of (path, offset, limit) tuples — 1 per unique read
  dedup            dict of (path, offset, limit) → mtime — same growth pattern
  read_timestamps  dict of resolved_path → mtime — 1 per unique path

A CLI session uses one stable task_id for its lifetime, so these were
uncapped.  A 10k-read session accumulated ~1.5MB of tracker state that
the tool no longer needed (only the most recent reads are relevant for
dedup, consecutive-loop detection, and write/patch external-edit
warnings).

Fix: _cap_read_tracker_data() enforces hard caps on each container
after every add.  Defaults: read_history=500, dedup=1000,
read_timestamps=1000.  Eviction is insertion-order (Python 3.7+ dict
guarantee) for the dicts; arbitrary for the set (which only feeds
diagnostic summaries).

## process_registry._completion_consumed

Module-level set that recorded every session_id ever polled / waited /
logged.  No pruning.  Each entry is ~20 bytes, so the absolute leak is
small, but on a gateway processing thousands of background commands
per day the set grows until process exit.

Fix: _prune_if_needed() now discards _completion_consumed entries
alongside the session dict evictions it already performs (both the
TTL-based prune and the LRU-over-cap prune).  Adds a final
belt-and-suspenders pass that drops any dangling entries whose
session_id no longer appears in _running or _finished.

Tests: tests/tools/test_accretion_caps.py — 9 cases
  * Each container bound respected, oldest evicted
  * No-op when under cap (no unnecessary work)
  * Handles missing sub-containers without crashing
  * Live read_file_tool path enforces caps end-to-end
  * _completion_consumed pruned on TTL expiry
  * _completion_consumed pruned on LRU eviction
  * Dangling entries (no backing session) cleared

Broader suite: 3486 tests/tools + tests/cli pass.  The single flake
(test_alias_command_passes_args) reproduces on unchanged main — known
cross-test pollution under suite-order load.
2026-04-17 15:53:57 -07:00
Brooklyn Nicholson aa583cb14e Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 17:51:40 -05:00
Teknium 0a83187801 refactor(kimi): use _fixed_temperature_for_model helper in flush_memories
Replace the hardcoded 'kimi-for-coding' string check with the helper
from auxiliary_client so there is one source of truth for the list of
models with fixed-temperature contracts. Adding a new entry to
_FIXED_TEMPERATURE_MODELS now automatically covers flush_memories too.
2026-04-17 15:49:14 -07:00
helix4u 2b60478fc2 fix(kimi): force kimi-for-coding temperature to 0.6 2026-04-17 15:49:14 -07:00
Teknium c6fd2619f7 fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833)
Google-side 429 Code Assist errors now flow through Hermes' normal rate-limit
path (status_code on the exception, Retry-After preserved via error.response)
instead of being opaque RuntimeErrors. User sees a one-line capacity message
instead of a 500-char JSON dump.

Changes
- CodeAssistError grows status_code / response / retry_after / details attrs.
  _extract_status_code in error_classifier picks up status_code and classifies
  429 as FailoverReason.rate_limit, so fallback_providers triggers the same
  way it does for SDK errors. run_agent.py line ~10428 already walks
  error.response.headers for Retry-After — preserving the response means that
  path just works.
- _gemini_http_error parses the Google error envelope (error.status +
  error.details[].reason from google.rpc.ErrorInfo, retryDelay from
  google.rpc.RetryInfo). MODEL_CAPACITY_EXHAUSTED / RESOURCE_EXHAUSTED / 404
  model-not-found each produce a human-readable message; unknown shapes fall
  back to the previous raw-body format.
- Drop gemma-4-26b-it from hermes_cli/models.py, hermes_cli/setup.py, and
  agent/model_metadata.py — Google returned 404 for it today in local repro.
  Kept gemma-4-31b-it (capacity-constrained but not retired).

Validation
|                           | Before                         | After                                     |
|---------------------------|--------------------------------|-------------------------------------------|
| Error message             | 'Code Assist returned HTTP 429: {500 chars JSON}' | 'Gemini capacity exhausted for gemini-2.5-pro (Google-side throttle...)' |
| status_code on error      | None (opaque RuntimeError)     | 429                                       |
| Classifier reason         | unknown (string-match fallback) | FailoverReason.rate_limit                |
| Retry-After honored       | ignored                        | extracted from RetryInfo or header        |
| gemma-4-26b-it picker     | advertised (404s on Google)    | removed                                   |

Unit + E2E tests cover non-streaming 429, streaming 429, 404 model-not-found,
Retry-After header fallback, malformed body, and classifier integration.
Targeted suites: tests/agent/test_gemini_cloudcode.py (81 tests), full
tests/hermes_cli (2203 tests) green.

Co-authored-by: teknium1 <teknium@nousresearch.com>
2026-04-17 15:34:12 -07:00
Teknium d2206c69cc fix(qqbot): add back-compat for env var rename; drop qrcode core dep
Follow-up to WideLee's salvaged PR #11582.

Back-compat for QQ_HOME_CHANNEL → QQBOT_HOME_CHANNEL rename:
  - gateway/config.py reads QQBOT_HOME_CHANNEL, falls back to QQ_HOME_CHANNEL
    with a one-shot deprecation warning so users on the old name aren't
    silently broken.
  - cron/scheduler.py: _HOME_TARGET_ENV_VARS['qqbot'] now maps to the new
    name; _get_home_target_chat_id falls back to the legacy name via a
    _LEGACY_HOME_TARGET_ENV_VARS table.
  - hermes_cli/status.py + hermes_cli/setup.py: honor both names when
    displaying or checking for missing home channels.
  - hermes_cli/config.py: keep legacy QQ_HOME_CHANNEL[_NAME] in
    _EXTRA_ENV_KEYS so .env sanitization still recognizes them.

Scope cleanup:
  - Drop qrcode from core dependencies and requirements.txt (remains in
    messaging/dingtalk/feishu extras). _qqbot_render_qr already degrades
    gracefully when qrcode is missing, printing a 'pip install qrcode' tip
    and falling back to URL-only display.
  - Restore @staticmethod on QQAdapter._detect_message_type (it doesn't
    use self). Revert the test change that was only needed when it was
    converted to an instance method.
  - Reset uv.lock to origin/main; the PR's stale lock also included
    unrelated changes (atroposlib source URL, hermes-agent version bump,
    fastapi additions) that don't belong.

Verified E2E:
  - Existing user (QQ_HOME_CHANNEL set): gateway + cron both pick up the
    legacy name; deprecation warning logs once.
  - Fresh user (QQBOT_HOME_CHANNEL set): gateway + cron use new name,
    no warning.
  - Both set: new name wins on both surfaces.

Targeted tests: 296 passed, 4 skipped (qqbot + cron + hermes_cli).
2026-04-17 15:31:14 -07:00
WideLee 103beea7a6 fix(qqbot): fix test failures after package refactor
- Re-export _ssrf_redirect_guard from __init__.py
- Fix _parse_json @staticmethod using self._log_tag
- Update test_detect_message_type to call as instance method
- Fix mock.patch path for httpx.AsyncClient in adapter submodule
2026-04-17 15:31:14 -07:00
WideLee 287d3e12c7 chore: add author map 2026-04-17 15:31:14 -07:00
WideLee 6fd58e1e4a refactor(qqbot): replace log tags with self._log_tag 2026-04-17 15:31:14 -07:00
WideLee 235e6ecc0e refactor(qqbot): replace hardcoded log tags with self._log_tag and adjust STT log levels
- Remove @staticmethod from _detect_message_type, _convert_silk_to_wav,
  _convert_raw_to_wav, _convert_ffmpeg_to_wav so they can use self._log_tag
- Replace all remaining hardcoded "QQBot" log args with self._log_tag
- Downgrade STT routine flow logs (download, convert, success) from info to debug
- Keep warning level for actual failures (STT failed, ffmpeg error, empty transcript)
2026-04-17 15:31:14 -07:00
WideLee 1648e41c17 refactor(qqbot): change qrcode style 2026-04-17 15:31:14 -07:00
WideLee c4cdf3b861 refactor(qqbot): change setup method selection prompt_choice style 2026-04-17 15:31:14 -07:00
WideLee 02f5e3dc27 refactor(qqbot): use _log_tag with app_id in all logger calls for multi-instance disambiguation 2026-04-17 15:31:14 -07:00
WideLee b7d330211a fix(qqbot): simplify home channel prompt wording 2026-04-17 15:31:14 -07:00
WideLee a5f4d652d3 feat(qqbot): prompt to add scanned user to allow list and home channel during setup 2026-04-17 15:31:14 -07:00
WideLee 6358501915 refactor(qqbot): split qqbot.py into package & add QR scan-to-configure onboard flow
- Refactor gateway/platforms/qqbot.py into gateway/platforms/qqbot/ package:
  - adapter.py: core QQAdapter (unchanged logic, constants from shared module)
  - constants.py: shared constants (API URLs, timeouts, message types)
  - crypto.py: AES-256-GCM key generation and secret decryption
  - onboard.py: QR-code scan-to-configure API (create_bind_task, poll_bind_result)
  - utils.py: User-Agent builder, HTTP headers, config helpers
  - __init__.py: re-exports all public symbols for backward compatibility

- Add interactive QR-code setup flow in hermes_cli/gateway.py:
  - Terminal QR rendering via qrcode package (graceful fallback to URL)
  - Auto-refresh on QR expiry (up to 3 times)
  - AES-256-GCM encrypted credential exchange
  - DM security policy selection (pairing/allowlist/open)

- Update hermes_cli/setup.py to delegate to gateway's _setup_qqbot()
- Add qrcode>=7.4 dependency to pyproject.toml and requirements.txt
2026-04-17 15:31:14 -07:00
Teknium 31e7276474 fix(gateway): consolidate per-session cleanup; close SessionDB on shutdown (#11800)
Three closely-related fixes for shutdown / lifecycle hygiene.

1. _release_running_agent_state(session_key) helper
   ----------------------------------------------------
   Per-running-agent state lived in three dicts that drifted out of sync
   across cleanup sites:
     self._running_agents       — AIAgent per session_key
     self._running_agents_ts    — start timestamp per session_key
     self._busy_ack_ts          — last busy-ack timestamp per session_key

   Inventory before this PR:
     8 sites: del self._running_agents[key]
       — only 1 (stale-eviction) cleaned all three
       — 1 cleaned _running_agents + _running_agents_ts only
       — 6 cleaned _running_agents only

   Each missed entry was a (str, float) tuple per session per gateway
   lifetime — small, persistent, accumulates across thousands of
   sessions over months.  Per-platform leaks compounded.

   This change adds a single helper that pops all three dicts in
   lockstep, and replaces every bare 'del self._running_agents[key]'
   site with it.  Per-session state that PERSISTS across turns
   (_session_model_overrides, _voice_mode, _pending_approvals,
   _update_prompt_pending) is intentionally NOT touched here — those
   have their own lifecycles tied to user actions, not turn boundaries.

2. _running_agents_ts cleared in _stop_impl
   ----------------------------------------
   Was being missed alongside _running_agents.clear(); now included.

3. SessionDB close() in _stop_impl
   ---------------------------------
   The SQLite WAL write lock stayed held by the old gateway connection
   until Python actually exited — causing 'database is locked' errors
   when --replace launched a new gateway against the same file.  We
   now explicitly close both self._db and self.session_store._db
   inside _stop_impl, with try/except so a flaky close on one doesn't
   block the other.

Tests
-----
tests/gateway/test_session_state_cleanup.py — 10 cases covering:
  * helper pops all three dicts atomically
  * idempotent on missing/empty keys
  * preserves other sessions
  * tolerates older runners without _busy_ack_ts attribute
  * thread-safe under concurrent release
  * regression guard: scans gateway/run.py and fails if a future
    contributor reintroduces 'del self._running_agents[...]'
    outside docstrings
  * SessionDB close called on both holders during shutdown
  * shutdown tolerates missing session_store
  * shutdown tolerates close() raising on one db (other still closes)

Broader gateway suite: 3108 passed (vs 3100 on baseline) — failure
delta is +8 net passes; the 10 remaining failures are pre-existing
cross-test pollution / missing optional deps (matrix needs olm,
signal/telegram approval flake, dingtalk Mock wiring), all reproduce
on stashed baseline.
2026-04-17 15:18:23 -07:00
Teknium 036dacf659 feat(telegram): auto-wrap markdown tables in code blocks (#11794)
Telegram's MarkdownV2 has no table syntax — pipes get backslash-escaped
and tables render as noisy unaligned text.  format_message now detects
GFM-style pipe tables (header row + delimiter row + optional body) and
wraps them in ``` fences before the existing MarkdownV2 conversion runs.
Telegram renders fenced code blocks as monospace preformatted text with
columns intact.

Tables already inside an existing code block are left alone.  Plain
prose with pipes, lone '---' horizontal rules, and non-table content
are unaffected.

Closes the recurring community request to stop having to ask the agent
to re-render tables as code blocks manually.
2026-04-17 14:27:26 -07:00
Teknium 3207b9bda0 test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)
Cuts shard-3 local runtime in half by neutralizing real wall-clock
waits across three classes of slow test:

## 1. Retry backoff mocks

- tests/run_agent/conftest.py (NEW): autouse fixture mocks
  jittered_backoff to 0.0 so the `while time.time() < sleep_end`
  busy-loop exits immediately. No global time.sleep mock (would
  break threading tests).
- test_anthropic_error_handling, test_413_compression,
  test_run_agent_codex_responses, test_fallback_model: per-file
  fixtures mock time.sleep / asyncio.sleep for retry / compression
  paths.
- test_retaindb_plugin: cap the retaindb module's bound time.sleep
  to 0.05s via a per-test shim (background writer-thread retries
  sleep 2s after errors; tests don't care about exact duration).
  Plus replace arbitrary time.sleep(N) waits with short polling
  loops bounded by deadline.

## 2. Subprocess sleeps in production code

- test_update_gateway_restart: mock time.sleep. Production code
  does time.sleep(3) after `systemctl restart` to verify the
  service survived. Tests mock subprocess.run \u2014 nothing actually
  restarts \u2014 so the wait is dead time.

## 3. Network / IMDS timeouts (biggest single win)

- tests/conftest.py: add AWS_EC2_METADATA_DISABLED=true plus
  AWS_METADATA_SERVICE_TIMEOUT=1 and ATTEMPTS=1. boto3 falls back
  to IMDS (169.254.169.254) when no AWS creds are set. Any test
  hitting has_aws_credentials() / resolve_aws_auth_env_var() (e.g.
  test_status, test_setup_copilot_acp, anything that touches
  provider auto-detect) burned ~2-4s waiting for that to time out.
- test_exit_cleanup_interrupt: explicitly mock
  resolve_runtime_provider which was doing real network auto-detect
  (~4s). Tests don't care about provider resolution \u2014 the agent
  is already mocked.
- test_timezone: collapse the 3-test "TZ env in subprocess" suite
  into 2 tests by checking both injection AND no-leak in the same
  subprocess spawn (was 3 \u00d7 3.2s, now 2 \u00d7 4s).

## Validation

| Test | Before | After |
|---|---|---|
| test_anthropic_error_handling (8 tests) | ~80s | ~15s |
| test_413_compression (14 tests) | ~18s | 2.3s |
| test_retaindb_plugin (67 tests) | ~13s | 1.3s |
| test_status_includes_tavily_key | 4.0s | 0.05s |
| test_setup_copilot_acp_skips_same_provider_pool_step | 8.0s | 0.26s |
| test_update_gateway_restart (5 tests) | ~18s total | ~0.35s total |
| test_exit_cleanup_interrupt (2 tests) | 8s | 1.5s |
| **Matrix shard 3 local** | **108s** | **50s** |

No behavioral contract changed \u2014 tests still verify retry happens,
service restart logic runs, etc.; they just don't burn real seconds
waiting for it.

Supersedes PR #11779 (those changes are included here).
2026-04-17 14:21:22 -07:00
Teknium eb07c05646 fix(gateway): prune stale SessionStore entries to bound memory + disk (#11789)
SessionStore._entries grew unbounded.  Every unique
(platform, chat_id, thread_id, user_id) tuple ever seen was kept in
RAM and rewritten to sessions.json on every message.  A Discord bot
in 100 servers x 100 channels x ~100 rotating users accumulates on
the order of 10^5 entries after a few months; each sessions.json
write becomes an O(n) fsync.  Nothing trimmed this — there was no
TTL, no cap, no eviction path.

Changes
-------
* SessionStore.prune_old_entries(max_age_days) — drops entries whose
  updated_at is older than the cutoff.  Preserves:
    - suspended entries (user paused them via /stop for later resume)
    - entries with an active background process attached
  Pruning is functionally identical to a natural reset-policy expiry:
  SQLite transcript stays, session_key -> session_id mapping dropped,
  returning user gets a fresh session.

* GatewayConfig.session_store_max_age_days (default 90; 0 disables).
  Serialized in to_dict/from_dict, coerced from bad types / negatives
  to safe defaults.  No migration needed — missing field -> 90 days.

* _session_expiry_watcher calls prune_old_entries once per hour
  (first tick is immediate).  Uses the existing watcher loop so no
  new background task is created.

Why not more aggressive
-----------------------
90 days is long enough that legitimate long-idle users (seasonal,
vacation, etc.) aren't surprised — pruning just means they get a
fresh session on return, same outcome they'd get from any other
reset-policy trigger.  Admins can lower it via config; 0 disables.

Tests
-----
tests/gateway/test_session_store_prune.py — 17 cases covering:
  * entry age based on updated_at, not created_at
  * max_age_days=0 disables; negative coerces to 0
  * suspended + active-process entries are skipped
  * _save fires iff something was removed
  * disk JSON reflects post-prune state
  * thread safety against concurrent readers
  * config field roundtrips + graceful fallback on bad values
  * watcher gate logic (first tick prunes, subsequent within 1h don't)

119 broader session/gateway tests remain green.
2026-04-17 13:48:49 -07:00
Teknium f362083c64 fix(providers): complete NVIDIA NIM parity with other providers
Follow-up on the native NVIDIA NIM provider salvage. The original PR wired
PROVIDER_REGISTRY + HERMES_OVERLAYS correctly but missed several touchpoints
required for full parity with other OpenAI-compatible providers (xai,
huggingface, deepseek, zai).

Gaps closed:

- hermes_cli/main.py:
  - Add 'nvidia' to the _model_flow_api_key_provider dispatch tuple so
    selecting 'NVIDIA NIM' in `hermes model` actually runs the api-key
    provider flow (previously fell through silently).
  - Add 'nvidia' to `hermes chat --provider` argparse choices so the
    documented test command (`hermes chat --provider nvidia --model ...`)
    parses successfully.

- hermes_cli/config.py: Register NVIDIA_API_KEY and NVIDIA_BASE_URL in
  OPTIONAL_ENV_VARS so setup wizard can prompt for them and they're
  auto-added to the subprocess env blocklist.

- hermes_cli/doctor.py: Add NVIDIA NIM row to `_apikey_providers` so
  `hermes doctor` probes https://integrate.api.nvidia.com/v1/models.

- hermes_cli/dump.py: Add NVIDIA_API_KEY → 'nvidia' mapping for
  `hermes dump` credential masking.

- tests/tools/test_local_env_blocklist.py: Extend registry_vars fixture
  with NVIDIA_API_KEY to verify it's blocked from leaking into subprocesses.

- agent/model_metadata.py: Add 'nemotron' → 131072 context-length entry
  so all Nemotron variants get 128K context via substring match (rather
  than falling back to MINIMUM_CONTEXT_LENGTH).

- hermes_cli/models.py: Fix hallucinated model ID
  'nvidia/nemotron-3-nano-8b-a4b' → 'nvidia/nemotron-3-nano-30b-a3b'
  (verified against live integrate.api.nvidia.com/v1/models catalog).
  Expand curated list from 5 to 9 agentic models mapping to OpenRouter
  defaults per provider-guide convention: add qwen3.5-397b-a17b,
  deepseek-v3.2, llama-3.3-nemotron-super-49b-v1.5, gpt-oss-120b.

- cli-config.yaml.example: Document 'nvidia' provider option.

- scripts/release.py: Map asurla@nvidia.com → anniesurla in AUTHOR_MAP
  for CI attribution.

E2E verified: `hermes chat --provider nvidia ...` now reaches NVIDIA's
endpoint (returns 401 with bogus key instead of argparse error);
`hermes doctor` detects NVIDIA NIM when NVIDIA_API_KEY is set.
2026-04-17 13:47:46 -07:00
asurla 3b569ff576 feat(providers): add native NVIDIA NIM provider
Adds NVIDIA NIM as a first-class provider: ProviderConfig in
auth.py, HermesOverlay in providers.py, curated models
(Nemotron plus other open source models hosted on
build.nvidia.com), URL mapping in model_metadata.py, aliases
(nim, nvidia-nim, build-nvidia, nemotron), and env var tests.

Docs updated: providers page, quickstart table, fallback
providers table, and README provider list.
2026-04-17 13:47:46 -07:00
Brooklyn Nicholson bd09e42eac Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 15:44:57 -05:00
Teknium cc3aa76675 build(deps): add qrcode to dingtalk + feishu extras (parity with messaging) (#11627)
#4b1567f4 (anthhub) added qrcode to the messaging extra for Weixin's
QR login. The same package is needed by:

  * hermes_cli/dingtalk_auth.py — QR device-flow auth shipped in #11574
  * gateway/platforms/feishu.py:3962 — Feishu QR login

These extras are independent of [messaging] (users can install
hermes-agent[dingtalk] or hermes-agent[feishu] without [messaging]),
so the dep needs to be declared on each.

Pin matches anthhub's choice (>=7.0,<8) for consistency. The all
extra inherits from all three, so it picks up qrcode transitively.

Adds parallel tests to tests/test_project_metadata.py — same shape
as test_messaging_extra_includes_qrcode_for_weixin_setup.

Refs #9431.
2026-04-17 13:31:53 -07:00
Teknium 2ff1ef6ae6 fix(surrogates): sanitize reasoning/reasoning_content/reasoning_details fields (#11628)
Byte-level reasoning models (xiaomi/mimo-v2-pro, kimi, glm) can emit lone
surrogates in reasoning output. The proactive sanitizer walked content/
name/tool_calls but not extra fields like reasoning or the nested
reasoning_details array. Surrogates in those fields survived the
proactive pass, crashed json.dumps() in the OpenAI SDK, and the recovery
block's _sanitize_messages_surrogates(messages) call also didn't check
those fields — so 'found' was False, no retry happened, and after 3
attempts the user saw:

  API call failed after 3 retries. 'utf-8' codec can't encode characters
  in position N-M: surrogates not allowed

Changes:
- _sanitize_messages_surrogates: walk any extra string fields (reasoning,
  reasoning_content, etc.) and recurse into nested dict/list values
  (reasoning_details). Mirrors _sanitize_messages_non_ascii coverage
  added in PR #10537.
- _sanitize_structure_surrogates: new recursive walker, mirror of
  _sanitize_structure_non_ascii but for surrogate recovery.
- UnicodeEncodeError recovery block: also sanitize api_messages,
  api_kwargs, and prefill_messages (not just the canonical messages
  list — the API-copy carries reasoning_content transformed from
  reasoning and that's what the SDK actually serializes). Always
  retry on detected surrogate errors, not only when we found
  something to strip — gate on error type per PR #10537's pattern.

Tests: extended tests/cli/test_surrogate_sanitization.py with
coverage for reasoning, reasoning_content, reasoning_details (flat
and deeply nested), structure walker, and an integration case that
reproduces the exact api_messages shape that was crashing.
2026-04-17 13:30:47 -07:00
Teknium 1229d8855c fix: remove misleading model.max_tokens suggestion from thinking-exhausted error (#11626)
The 'Thinking Budget Exhausted' user-facing error message advised users to
'set model.max_tokens in config.yaml'. That config key is documented but
intentionally not wired through to the API call in CLI/gateway paths — we
omit max_tokens by default so the inference server uses its full output
budget (llama-server -1=infinity, vLLM max_model_len-prompt_len, etc.).

Users followed the suggestion, saw no change, and kept filing bugs (see
closed #4404, #10917, #6955 and PRs #5001/#6080/#6446/#6707/#7075/#8804/
#10924/#11173/#11268 — all reporting the same misdirection).

Replace the misleading suggestion with an actionable one: switch models
via /model. Lowering reasoning effort remains the primary remediation.
2026-04-17 13:29:54 -07:00
Henkey d49126b987 fix(release): map HenkDz contributor email 2026-04-17 13:29:26 -07:00
Henkey cb883f9e97 fix(acp): improve zed integration 2026-04-17 13:29:26 -07:00
Brooklyn Nicholson d5b9db8b4a Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 15:13:36 -05:00
Brooklyn Nicholson 6a37802476 chore: uptick 2026-04-17 15:13:33 -05:00
Teknium d0e1388ca9 fix(tests): make AIAgent constructor calls self-contained (#11755)
* fix(tests): make AIAgent constructor calls self-contained (no env leakage)

Tests in tests/run_agent/ were constructing AIAgent() without passing
both api_key and base_url, then relying on leaked state from other
tests in the same xdist worker (or process-level env vars) to keep
provider resolution happy. Under hermetic conftest + pytest-split,
that state is gone and the tests fail with 'No LLM provider configured'.

Fix: pass both api_key and base_url explicitly on 47 AIAgent()
construction sites across 13 files. AIAgent.__init__ with both set
takes the direct-construction path (line 960 in run_agent.py) and
skips the resolver entirely.

One call site (test_none_base_url_passed_as_none) left alone — that
test asserts behavior for base_url=None specifically.

This is a prerequisite for any future matrix-split or stricter
isolation work, and lands cleanly on its own.

Validation:
- tests/run_agent/ full: 760 passed, 0 failed (local)
- Previously relied on cross-test pollution; now self-contained

* fix(tests): update opencode-go model order assertion to match kimi-k2.5-first

commit 78a74bb promoted kimi-k2.5 to first position in model suggestion
lists but didn't update this test, which has been failing on main since.
Reorder expected list to match the new canonical order.
2026-04-17 12:32:03 -07:00
kshitij 78a74bb097 feat: promote kimi-k2.5 to first position in all model suggestion lists (#11745)
Move moonshotai/kimi-k2.5 to position #1 in every model picker list:
- OPENROUTER_MODELS (with 'recommended' tag)
- _PROVIDER_MODELS: nous, kimi-coding, opencode-zen, opencode-go, alibaba, huggingface
- _model_flow_kimi() Coding Plan model list in main.py

kimi-coding-cn and moonshot lists already had kimi-k2.5 first.
2026-04-17 12:05:22 -07:00
Brooklyn Nicholson bedbeebbc8 feat(tui): interleave tool rows into live assistant turns
Live turn rendering used to show the streaming assistant text as one
blob with tool calls pooled in a separate section below, so the live
view drifted from the reload view (which threads tool rows inline via
toTranscriptMessages). Model now mirrors reload:

- turnStore gains streamSegments (completed assistant chunks, each
  with any tool rows that landed between its predecessor and itself)
  and streamPendingTools (tool rows waiting for the next chunk)
- turnController.flushStreamingSegment() seals the current bufRef into
  a segment when a new tool.start fires; pending tools get attached to
  that next chunk so order matches reload hydration
- recordMessageComplete returns finalMessages instead of one payload,
  so appendMessage gets the same shape for live-ending turns as for
  reloaded ones
- appLayout renders segments before the progress/streaming area, and
  the streaming message + pending-tools fallback carry whatever tools
  arrived after the last assistant chunk
2026-04-17 11:33:29 -05:00
Brooklyn Nicholson f53250b5e1 fix(tui): tighten /resume render, follow-up to 42721dbe
- useVirtualHistory: track last-seen ScrollBox metrics in a ref inside
  the post-layout effect and bump ver when sticky/top/vp change — the
  subscribe-based rearm was sufficient for fresh clicks but not for the
  "hydrated mid-commit, measured empty, then metrics settle" path where
  nothing re-triggered the hook until the next unrelated keystroke
- useSessionLifecycle: resume scrollToBottom from queueMicrotask to
  setTimeout(..., 0) so the fresh transcript has a full task turn to
  commit + measure before we try to land at the newest content
2026-04-17 11:33:14 -05:00
Brooklyn Nicholson 00591e3801 chore: fmt 2026-04-17 11:06:25 -05:00
Brooklyn Nicholson be768db627 fix: long history session thingy 2026-04-17 11:05:23 -05:00
Brooklyn Nicholson 42721dbe1c fix(tui): big-session /resume now renders without first keystroke
useVirtualHistory set up its useSyncExternalStore subscription during
the first render, when scrollRef.current was still null (the ScrollBox
ref attaches during commit, after render). Its useCallback for
subscribe had a stable scrollRef identity as its only dep, so it never
re-subscribed once the ref actually attached — the hook stayed stuck
with vp=0, top=0, no scroll subscription. Small sessions fit entirely
in cold-start so you didn't notice; big /resume sessions got sliced to
the last 40 items with a huge topSpacer and the viewport sat on empty
space until some unrelated state change (e.g. a keystroke) re-rendered
and finally read a real vp.

- flip a hasScrollRef flag in useLayoutEffect once the ref attaches and
  add it to the subscribe useCallback deps so useSyncExternalStore
  rearms with a real subscription
- on resume, scrollToBottom() after history hydrates so the ScrollBox
  lands at the newest messages instead of scrollTop=0 (stickyScroll
  doesn't auto-engage on the initial empty→full dump)
2026-04-17 11:04:29 -05:00
Brooklyn Nicholson 8f553a55b2 chore(tui): fix eslint/prettier nits from npm run fix
- drop inline `import()` type annotation in useSessionLifecycle (import
  `PanelSection` at the top like everything else)
- include `panel` and `session.resumeById` in the useMainApp useMemo
  deps now that the event handler depends on them
- wrap the derived `selected` range in a useMemo so it has stable
  identity and stops invalidating the TextInput `rendered` memo every
  render
- prettier re-sorting of a couple of export/import lines
2026-04-17 11:00:15 -05:00
Brooklyn Nicholson a82097e7a2 feat(tui): /model and /setup slash commands with in-place CLI handoff
- hermes-ink: export `withInkSuspended()` + `useExternalProcess()` that
  pause/resume Ink around an arbitrary external process (built on the
  existing enterAlternateScreen/exitAlternateScreen plumbing)
- tui: `launchHermesCommand(args)` spawns the `hermes` binary with
  inherited stdio, with `HERMES_BIN` override for non-standard launches
- tui: `/model` and `/setup` slash commands invoke the CLI wizards
  in-place, then re-preflight `setup.status` and auto-start a session on
  success — no more exit-and-relaunch to finish first-run setup
- setup panel now advertises those slashes instead of only pointing
  users back at the shell
2026-04-17 10:58:18 -05:00
Brooklyn Nicholson 0dd5055d59 fix(tui): first-run setup preflight + actionable no-provider panel
- tui_gateway: new `setup.status` RPC that reuses CLI's
  `_has_any_provider_configured()`, so the TUI can ask the same question
  the CLI bootstrap asks before launching a session
- useSessionLifecycle: preflight `setup.status` before both `newSession`
  and `resumeById`, and render a clear "Setup Required" panel when no
  provider is configured instead of booting a session that immediately
  fails with `agent init failed`
- createGatewayEventHandler: drop duplicate startup resume logic in
  favor of the preflighted `resumeById`, and special-case the
  no-provider agent-init error as a last-mile fallback to the same
  setup panel
- add regression tests for both paths
2026-04-17 10:58:01 -05:00
Brooklyn Nicholson 5b386ced71 fix(tui): approval flow + input ergonomics + selection perf
- tui_gateway: route approvals through gateway callback (HERMES_GATEWAY_SESSION/
  HERMES_EXEC_ASK) so dangerous commands emit approval.request instead of
  silently falling through the CLI input() path and auto-denying
- approval UX: dedicated PromptZone between transcript and composer, safer
  defaults (sel=0, numeric quick-picks, no Esc=deny), activity trail line,
  outcome footer under the cost row
- text input: Ctrl+A select-all, real forward Delete, Ctrl+W always consumed
  (fixes Ctrl+Backspace at cursor 0 inserting literal w)
- hermes-ink selection: swap synchronous onRender() for throttled
  scheduleRender() on drag, and only notify React subscribers on presence
  change — no more per-cell paint/subscribe spam
- useConfigSync: silence config.get polling failures instead of surfacing
  'error: timeout: config.get' in the transcript
2026-04-17 10:37:48 -05:00
Brooklyn Nicholson 0219da9626 chore: uptick 2026-04-17 09:47:19 -05:00
Brooklyn Nicholson 1f37ef2fd1 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 08:59:33 -05:00
Teknium 6ea7386a6f chore: map memosr, anthhub, shenuu, xiayh0107 emails to AUTHOR_MAP 2026-04-17 06:50:36 -07:00
Young Sherlock 8dcd08d8bb Fix Weixin media uploads and refresh lockfile 2026-04-17 06:50:36 -07:00
shenuu 3a0ec1d935 fix(weixin): macOS SSL cert, QR data, and refresh rendering
- Use certifi CA bundle for aiohttp SSL in qr_login(), start(), and
  send_weixin_direct() to fix SSL verification failures against
  Tencent's iLink server on macOS (Homebrew OpenSSL lacks system certs)
- Fix QR code data: encode qrcode_img_content (full liteapp URL) instead
  of raw hex token — WeChat needs the full URL to resolve the scan
- Render ASCII QR on refresh so the user can re-scan without restarting
- Improve error message on QR render failure to show the actual exception

Tested on macOS (Apple Silicon, Homebrew Python 3.13)
2026-04-17 06:50:36 -07:00
jinzheng8115 e105b7ac93 fix(weixin): retry send without context_token on iLink session expiry
iLink context_token has a limited TTL. When no user message has arrived
for an extended period (e.g. overnight), cron-initiated pushes fail with
errcode -14 (session timeout).

Tested that iLink accepts sends without context_token as a degraded
fallback, so we now automatically strip the expired token and retry
once. This keeps scheduled push messages (weather, digests, etc.)
working reliably without requiring a user message to refresh the
session first.

Changes:
- _send_text_chunk() catches iLinkDeliveryError with session-expired
  errcode (-14) and retries without context_token
- Stale tokens are cleared from ContextTokenStore on session expiry
- All 34 existing weixin tests pass
2026-04-17 06:50:36 -07:00
anthhub 4b1567f425 fix(packaging): include qrcode in messaging extra 2026-04-17 06:50:36 -07:00
memosr cedc95c100 fix(security): validate WeChat media URLs against CDN allowlist to prevent SSRF 2026-04-17 06:50:36 -07:00
Teknium c7334b4a50 chore(release): map @Hypn0sis and @OwenYWT to AUTHOR_MAP 2026-04-17 06:46:52 -07:00
Teknium 3f3d8a7b24 fix(discord): strip mention syntax from auto-thread names
Previously a message like `<@&1490963422786093149> help` would spawn a
thread literally named `<@&1490963422786093149> help`, exposing raw
Discord mention markers in the thread list. Only user mentions
(`<@id>`) were being stripped upstream — role mentions (`<@&id>`) and
channel mentions (`<#id>`) leaked through.

Fix: strip all three mention patterns in `_auto_create_thread` before
building the thread name. Collapse runs of whitespace left by the
removal. If the entire content was mention-only, fall back to 'Hermes'
instead of an empty title.

Fixes #6336.

Tests: two new regression guards in test_discord_slash_commands.py
covering mixed-mention content and mention-only content.
2026-04-17 06:46:52 -07:00
sgaofen 32a694ad5f fix(discord): fall back when auto-thread creation fails 2026-04-17 06:46:52 -07:00
OwenYWT f5dc4e905d fix(discord): skip auto-threading reply messages 2026-04-17 06:46:52 -07:00
Matteo De Agazio 93fe4b357d fix(discord): free-response channels skip auto-threading
Free-response channels already bypassed the @mention gate so users could
chat inline with the bot, but auto-threading still fired on every
message — spinning off a thread per message and defeating the
lightweight-chat purpose.

Fix: fold `is_free_channel` into `skip_thread` so threading is skipped
whenever the channel is in DISCORD_FREE_RESPONSE_CHANNELS (via env or
discord.free_response_channels in config.yaml).

Net change: one line in _handle_message + one regression test.

Partially addresses #9399. Authored by @Hypn0sis (salvaged from PR #9650;
the bundled 'smart' auto-thread mode from that PR was dropped in favor
of deterministic true/false semantics).
2026-04-17 06:46:52 -07:00
Teknium 8d7b7feb0d fix(gateway): bound _agent_cache with LRU cap + idle TTL eviction (#11565)
* fix(gateway): bound _agent_cache with LRU cap + idle TTL eviction

The per-session AIAgent cache was unbounded. Each cached AIAgent holds
LLM clients, tool schemas, memory providers, and a conversation buffer.
In a long-lived gateway serving many chats/threads, cached agents
accumulated indefinitely — entries were only evicted on /new, /model,
or session reset.

Changes:
- Cache is now an OrderedDict so we can pop least-recently-used entries.
- _enforce_agent_cache_cap() pops entries beyond _AGENT_CACHE_MAX_SIZE=64
  when a new agent is inserted. LRU order is refreshed via move_to_end()
  on cache hits.
- _sweep_idle_cached_agents() evicts entries whose AIAgent has been idle
  longer than _AGENT_CACHE_IDLE_TTL_SECS=3600s. Runs from the existing
  _session_expiry_watcher so no new background task is created.
- The expiry watcher now also pops the cache entry after calling
  _cleanup_agent_resources on a flushed session — previously the agent
  was shut down but its reference stayed in the cache dict.
- Evicted agents have _cleanup_agent_resources() called on a daemon
  thread so the cache lock isn't held during slow teardown.

Both tuning constants live at module scope so tests can monkeypatch
them without touching class state.

Tests: 7 new cases in test_agent_cache.py covering LRU eviction,
move_to_end refresh, cleanup thread dispatch, idle TTL sweep,
defensive handling of agents without _last_activity_ts, and plain-dict
test fixture tolerance.

* tweak: bump _AGENT_CACHE_MAX_SIZE 64 -> 128

* fix(gateway): never evict mid-turn agents; live spillover tests

The prior commit could tear down an active agent if its session_key
happened to be LRU when the cap was exceeded.  AIAgent.close() kills
process_registry entries for the task, tears down the terminal
sandbox, closes the OpenAI client (sets self.client = None), and
cascades .close() into any active child subagents — all fatal if
the agent is still processing a turn.

Changes:
- _enforce_agent_cache_cap and _sweep_idle_cached_agents now look at
  GatewayRunner._running_agents and skip any entry whose AIAgent
  instance is present (identity via id(), so MagicMock doesn't
  confuse lookup in tests).  _AGENT_PENDING_SENTINEL is treated
  as 'not active' since no real agent exists yet.
- Eviction only considers the LRU-excess window (first size-cap
  entries).  If an excess slot is held by a mid-turn agent, we skip
  it WITHOUT compensating by evicting a newer entry.  A freshly
  inserted session (zero cache history) shouldn't be punished to
  protect a long-lived one that happens to be busy.
- Cache may therefore stay transiently over cap when load spikes;
  a WARNING is logged so operators can see it, and the next insert
  re-runs the check after some turns have finished.

New tests (TestAgentCacheActiveSafety + TestAgentCacheSpilloverLive):
- Active LRU entry is skipped; no newer entry compensated
- Mixed active/idle excess window: only idle slots go
- All-active cache: no eviction, WARNING logged, all clients intact
- _AGENT_PENDING_SENTINEL doesn't block other evictions
- Idle-TTL sweep skips active agents
- End-to-end: active agent's .client survives eviction attempt
- Live fill-to-cap with real AIAgents, then spillover
- Live: CAP=4 all active + 1 newcomer — cache grows to 5, no teardown
- Live: 8 threads racing 160 inserts into CAP=16 — settles at 16
- Live: evicted session's next turn gets a fresh agent that works

30 tests pass (13 pre-existing + 17 new).  Related gateway suites
(model switch, session reset, proxy, etc.) all green.

* fix(gateway): cache eviction preserves per-task state for session resume

The prior commits called AIAgent.close() on cache-evicted agents, which
tears down process_registry entries, terminal sandbox, and browser
daemon for that task_id — permanently. Fine for session-expiry (session
ended), wrong for cache eviction (session may resume).

Real-world scenario: a user leaves a Telegram session open for 2+ hours,
idle TTL evicts the cached AIAgent, user returns and sends a message.
Conversation history is preserved via SessionStore, but their terminal
sandbox (cwd, env vars, bg shells) and browser state were destroyed.

Fix: split the two cleanup modes.

  close()               Full teardown — session ended. Kills bg procs,
                        tears down terminal sandbox + browser daemon,
                        closes LLM client. Used by session-expiry,
                        /new, /reset (unchanged).

  release_clients()     Soft cleanup — session may resume. Closes
                        LLM client only. Leaves process_registry,
                        terminal sandbox, browser daemon intact
                        for the resuming agent to inherit via
                        shared task_id.

Gateway cache eviction (_enforce_agent_cache_cap, _sweep_idle_cached_agents)
now dispatches _release_evicted_agent_soft on the daemon thread instead
of _cleanup_agent_resources. All session-expiry call sites of
_cleanup_agent_resources are unchanged.

Tests (TestAgentCacheIdleResume, 5 new cases):
- release_clients does NOT call process_registry.kill_all
- release_clients does NOT call cleanup_vm / cleanup_browser
- release_clients DOES close the LLM client (agent.client is None after)
- close() vs release_clients() — semantic contract pinned
- Idle-evicted session's rebuild with same session_id gets same task_id

Updated test_cap_triggers_cleanup_thread to assert the soft path fires
and the hard path does NOT.

35 tests pass in test_agent_cache.py; 67 related tests green.
2026-04-17 06:36:34 -07:00
Teknium fc04f83062 chore(release): map jvcl author email for release notes 2026-04-17 06:33:21 -07:00
Jorge fe0e7edd27 fix(cli): clear input buffer after /model picker selection
The Enter handler that confirms a selection in the /model picker closed
the picker but never reset event.app.current_buffer, leaving the user's
original "/model" command lingering in the prompt. Match the ESC and
Ctrl+C handlers (which already reset the buffer) so the prompt is empty
after a successful switch.
2026-04-17 06:33:21 -07:00
Jorge 86f02d8d71 refactor(cli): align model picker viewport with PR #11260 vocabulary
Match the row-budget naming introduced in PR #11260 for the approval and
clarify panels: rename chrome_reserve=14 into reserved_below=6 (input
chrome below the panel) + panel_chrome=6 (this panel's borders, blanks,
and hint row) + min_visible=3 (floor on visible items). Same arithmetic
as before, but a reviewer reading both files now sees the same handle.

Compact-chrome mode is intentionally not adopted — that pattern fits the
"fixed mandatory content might overflow" shape of approval/clarify
(solved by truncating with a marker), whereas the picker's overflow is
already handled by the scrolling viewport.
2026-04-17 06:33:21 -07:00
Jorge 5fbe16635b fix(cli): scroll the /model picker viewport so long catalogs aren't clipped
The /model picker rendered every choice into a prompt_toolkit Window
with no max height. Providers with many models (e.g. Ollama Cloud's 36+)
overflowed the terminal, clipping the bottom border and the last items.

- Add HermesCLI._compute_model_picker_viewport() to slide a scroll
  offset that keeps the cursor on screen, sized from the live terminal
  rows minus chrome reserved for input/status/border.
- Render only the visible slice in _get_model_picker_display() and
  persist the offset on _model_picker_state across redraws.
- Bind ESC (eager) to close the picker, matching the Cancel button.
- Cover the viewport math with 8 unit tests in
  tests/hermes_cli/test_model_picker_viewport.py.
2026-04-17 06:33:21 -07:00
Teknium fdf42d62a0 chore: map briandevans and LLQWQ emails to AUTHOR_MAP 2026-04-17 06:26:43 -07:00
Teknium f64241ed90 feat(cron+tests): extend origin fallback to email/dingtalk/qqbot + fix Weixin test mocks
Cron origin fallback extension (builds on #9193's _HOME_TARGET_ENV_VARS):
adds the three remaining origin-fallback-eligible platforms that have
home channel env vars configured in gateway/config.py but use non-generic
env var names:

- email    → EMAIL_HOME_ADDRESS   (non-standard suffix)
- dingtalk → DINGTALK_HOME_CHANNEL
- qqbot    → QQ_HOME_CHANNEL      (non-standard prefix: QQ_ not QQBOT_)

Picks up the completeness intent of @Xowiek's PR #11317 using the
architecturally-correct dict-based lookup from #9193, so platforms with
non-standard env var names actually resolve instead of silently missing.
Extended the parametrized regression test to cover the new three.

Weixin test mock alignment (builds on #10091's _send_session split):
Three test sites added in Batch 1 (TestWeixinSendImageFileParameterName)
and Batch 3 (TestWeixinVoiceSending) mocked only adapter._session, but
#10091 switched the send paths to check self._send_session. Added the
companion setter so the tests stay green with the session split in place.
2026-04-17 06:26:43 -07:00
bde3249023 b46db048c3 fix(cron): align home target env lookup 2026-04-17 06:26:43 -07:00
bde3249023 f696b4745a fix(cron): restore origin fallback for feishu home channels 2026-04-17 06:26:43 -07:00
Ubuntu 5ca52bae5b fix(gateway/weixin): split poll/send sessions, reuse live adapter for cron & send_message
- gateway/platforms/weixin.py:
  - Split aiohttp.ClientSession into _poll_session and _send_session
  - Add _LIVE_ADAPTERS registry so send_weixin_direct() reuses the connected gateway adapter instead of creating a competing session
  - Fixes silent message loss when gateway is running (iLink token contention)

- cron/scheduler.py:
  - Support comma-separated deliver values (e.g. 'feishu,weixin') for multi-target delivery
  - Delay pconfig/enabled check until standalone fallback so live adapters work even when platform is not in gateway config

- tools/send_message_tool.py:
  - Synthesize PlatformConfig from WEIXIN_* env vars when gateway config lacks a weixin entry
  - Fall back to WEIXIN_HOME_CHANNEL env var for home channel resolution

- tests/gateway/test_weixin.py:
  - Update mocks to include _send_session
2026-04-17 06:26:43 -07:00
Teknium c60b6dc317 test(dingtalk): cover get_connected_platforms + null platform_toolsets
Follow-ups to the salvaged commits in this PR:

* gateway/config.py — strip trailing whitespace from youngDoo's diff
  (line 315 had ~140 trailing spaces).

* hermes_cli/tools_config.py — replace `config.get("platform_toolsets", {})`
  with `config.get("platform_toolsets") or {}`. Handles the case where the
  YAML key is present but explicitly null (parses as None, previously
  crashed with AttributeError on the next line's .get(platform)).
  Cherry-picked from yyq4193's #9003 with attribution.

* tests/gateway/test_config.py — 4 new tests for TestGetConnectedPlatforms
  covering DingTalk via extras, via env vars, disabled, and missing creds.

* tests/hermes_cli/test_tools_config.py — regression test for the null
  platform_toolsets edge case.

* scripts/release.py — add kagura-agent, youngDoo, yyq4193 to AUTHOR_MAP.

Co-authored-by: yyq4193 <39405770+yyq4193@users.noreply.github.com>
2026-04-17 06:26:18 -07:00
kagura-agent 47a0dd1024 fix(dingtalk): fire-and-forget message processing & session_webhook fallback
Fixes #11463: DingTalk channel receives messages but fails to reply
with 'No session_webhook available'.

Two changes:

1. **Fire-and-forget message processing**: process() now dispatches
   _on_message as a background task via asyncio.create_task instead of
   awaiting it. This ensures the SDK ACK is returned immediately,
   preventing heartbeat timeouts and disconnections when message
   processing takes longer than the SDK's ACK deadline.

2. **session_webhook extraction fallback**: If ChatbotMessage.from_dict()
   fails to map the sessionWebhook field (possible across SDK versions),
   the handler now falls back to extracting it directly from the raw
   callback data dict using both 'sessionWebhook' and 'session_webhook'
   key variants.

Added 3 tests covering webhook extraction, fallback behavior, and
fire-and-forget ACK timing.
2026-04-17 06:26:18 -07:00
youngDoo 91e7aff219 gateway cant add DingTalk platform
gateway cant add DingTalk platform without key and secret
2026-04-17 06:26:18 -07:00
Teknium d404849351 test: make test env hermetic; enforce CI parity via scripts/run_tests.sh (#11577)
* test: make test env hermetic; enforce CI parity via scripts/run_tests.sh

Fixes the recurring 'works locally, fails in CI' (and vice versa) class
of flakes by making tests hermetic and providing a canonical local runner
that matches CI's environment.

## Layer 1 — hermetic conftest.py (tests/conftest.py)

Autouse fixture now unsets every credential-shaped env var before every
test, so developer-local API keys can't leak into tests that assert
'auto-detect provider when key present'.

Pattern: unset any var ending in _API_KEY, _TOKEN, _SECRET, _PASSWORD,
_CREDENTIALS, _ACCESS_KEY, _PRIVATE_KEY, etc. Plus an explicit list of
credential names that don't fit the suffix pattern (AWS_ACCESS_KEY_ID,
FAL_KEY, GH_TOKEN, etc.) and all the provider BASE_URL overrides that
change auto-detect behavior.

Also unsets HERMES_* behavioral vars (HERMES_YOLO_MODE, HERMES_QUIET,
HERMES_SESSION_*, etc.) that mutate agent behavior.

Also:
  - Redirects HOME to a per-test tempdir (not just HERMES_HOME), so
    code reading ~/.hermes/* directly can't touch the real dir.
  - Pins TZ=UTC, LANG=C.UTF-8, LC_ALL=C.UTF-8, PYTHONHASHSEED=0 to
    match CI's deterministic runtime.

The old _isolate_hermes_home fixture name is preserved as an alias so
any test that yields it explicitly still works.

## Layer 2 — scripts/run_tests.sh canonical runner

'Always use scripts/run_tests.sh, never call pytest directly' is the
new rule (documented in AGENTS.md). The script:
  - Unsets all credential env vars (belt-and-suspenders for callers
    who bypass conftest — e.g. IDE integrations)
  - Pins TZ/LANG/PYTHONHASHSEED
  - Uses -n 4 xdist workers (matches GHA ubuntu-latest; -n auto on
    a 20-core workstation surfaces test-ordering flakes CI will never
    see, causing the infamous 'passes in CI, fails locally' drift)
  - Finds the venv in .venv, venv, or main checkout's venv
  - Passes through arbitrary pytest args

Installs pytest-split on demand so the script can also be used to run
matrix-split subsets locally for debugging.

## Remove 3 module-level dotenv stubs that broke test isolation

tests/hermes_cli/test_{arcee,xiaomi,api_key}_provider.py each had a
module-level:

    if 'dotenv' not in sys.modules:
        fake_dotenv = types.ModuleType('dotenv')
        fake_dotenv.load_dotenv = lambda *a, **kw: None
        sys.modules['dotenv'] = fake_dotenv

This patches sys.modules['dotenv'] to a fake at import time with no
teardown. Under pytest-xdist LoadScheduling, whichever worker collected
one of these files first poisoned its sys.modules; subsequent tests in
the same worker that imported load_dotenv transitively (e.g.
test_env_loader.py via hermes_cli.env_loader) got the no-op lambda and
saw their assertions fail.

dotenv is a required dependency (python-dotenv>=1.2.1 in pyproject.toml),
so the defensive stub was never needed. Removed.

## Validation

- tests/hermes_cli/ alone: 2178 passed, 1 skipped, 0 failed (was 4
  failures in test_env_loader.py before this fix)
- tests/test_plugin_skills.py, tests/hermes_cli/test_plugins.py,
  tests/test_hermes_logging.py combined: 123 passed (the caplog
  regression tests from PR #11453 still pass)
- Local full run shows no F/E clusters in the 0-55% range that were
  previously present before the conftest hardening

## Background

See AGENTS.md 'Testing' section for the full list of drift sources
this closes. Matrix split (closed as #11566) will be re-attempted
once this foundation lands — cross-test pollution was the root cause
of the shard-3 hang in that PR.

* fix(conftest): don't redirect HOME — it broke CI subprocesses

PR #11577's autouse fixture was setting HOME to a per-test tempdir.
CI started timing out at 97% complete with dozens of E/F markers and
orphan python processes at cleanup — tests (or transitive deps)
spawn subprocesses that expect a stable HOME, and the redirect broke
them in non-obvious ways.

Env-var unsetting and TZ/LANG/hashseed pinning (the actual CI-drift
fixes) are unchanged and still in place. HERMES_HOME redirection is
also unchanged — that's the canonical way to isolate tests from
~/.hermes/, not HOME.

Any code in the codebase reading ~/.hermes/* via `Path.home() / ".hermes"`
instead of `get_hermes_home()` is a bug to fix at the callsite, not
something to paper over in conftest.
2026-04-17 06:09:09 -07:00
Teknium ee95822e07 chore(release): map jz.pentest@gmail.com to @0xyg3n 2026-04-17 05:48:26 -07:00
Teknium e5b880264b fix(discord): harden DISCORD_ALLOWED_ROLES and cover gateway layer
Two follow-ups to the cherry-picked PR #9873 (`e3bcc819`):

1. `_is_allowed_user` now uses `getattr(self, '_allowed_*_ids', set())`
   so test fixtures that build the adapter via `object.__new__`
   (skipping __init__) don't crash with AttributeError.
   See AGENTS.md pitfall #17 — same pattern as gateway.run.

2. New 3-case regression coverage in test_discord_bot_auth_bypass.py:
   - role-only config bypasses the gateway 'no allowlists' branch
   - roles + users combined still authorizes user-allowlist matches
   - the role bypass does NOT leak to other platforms (Telegram, etc.)

3. Autouse fixture in test_discord_bot_auth_bypass.py clears all Discord
   auth env vars before each test so DISCORD_ALLOWED_ROLES leakage from
   a previous test in the session can't flip later 'should-reject' tests
   into false-pass.

Required because the bare cherry-pick of #9873 only added the adapter-
level role check — it didn't cover the gateway-level _is_user_authorized,
which still rejected role-only setups via the 'no allowlists configured'
branch.
2026-04-17 05:48:26 -07:00
0xyg3n 541a3e27d7 feat(discord): add DISCORD_ALLOWED_ROLES env var for role-based access control
Adds a new DISCORD_ALLOWED_ROLES environment variable that allows filtering
bot interactions by Discord role ID. Uses OR semantics with the existing
DISCORD_ALLOWED_USERS - if a user matches either allowlist, they're permitted.

Changes:
- Parse DISCORD_ALLOWED_ROLES comma-separated role IDs on connect
- Enable members intent when roles are configured (needed for role lookup)
- Update _is_allowed_user() to accept optional author param for direct role check
- Fallback to scanning mutual guilds when author object lacks roles (DMs, voice)
- Fully backwards compatible: no behavior change when env var is unset
2026-04-17 05:48:26 -07:00
Teknium 0741f22463 chore(release): map gnanasekaran.sekareee@gmail.com to @gnanam1990 2026-04-17 05:42:04 -07:00
Teknium 7d888ab49c test(discord): regression guard for DISCORD_ALLOW_BOTS auth bypass
Six test cases covering:
- DISCORD_ALLOW_BOTS=mentions + bot not in DISCORD_ALLOWED_USERS → authorized
- DISCORD_ALLOW_BOTS=all + bot not in DISCORD_ALLOWED_USERS → authorized
- DISCORD_ALLOW_BOTS=none → bots still rejected (preserves security)
- DISCORD_ALLOW_BOTS unset → same as 'none'
- Humans still checked against allowlist even with allow_bots=all
- Bot bypass is Discord-specific — doesn't leak to other platforms

Guards against a regression where the is_bot bypass in _is_user_authorized
gets moved, removed, or accidentally extended to other platforms.
2026-04-17 05:42:04 -07:00
gnanam1990 0f4403346d fix(discord): DISCORD_ALLOW_BOTS=mentions/all now works without DISCORD_ALLOWED_USERS
Fixes #4466.

Root cause: two sequential authorization gates both independently rejected
bot messages, making DISCORD_ALLOW_BOTS completely ineffective.

Gate 1 — `discord.py` `on_message`:
    _is_allowed_user ran BEFORE the bot filter, so bot senders were dropped
    before the DISCORD_ALLOW_BOTS policy was ever evaluated.

Gate 2 — `gateway/run.py` _is_user_authorized:
    The gateway-level allowlist check rejected bot IDs with 'Unauthorized
    user: <bot_id>' even if they passed Gate 1.

Fix:

  gateway/platforms/discord.py — reorder on_message so DISCORD_ALLOW_BOTS
  runs BEFORE _is_allowed_user. Bots permitted by the filter skip the
  user allowlist; non-bots are still checked.

  gateway/session.py — add is_bot: bool = False to SessionSource so the
  gateway layer can distinguish bot senders.

  gateway/platforms/base.py — expose is_bot parameter in build_source.

  gateway/platforms/discord.py _handle_message — set is_bot=True when
  building the SessionSource for bot authors.

  gateway/run.py _is_user_authorized — when source.is_bot is True AND
  DISCORD_ALLOW_BOTS is 'mentions' or 'all', return True early. Platform
  filter already validated the message at on_message; don't re-reject.

Behavior matrix:

  | Config                                     | Before  | After   |
  | DISCORD_ALLOW_BOTS=none (default)          | Blocked | Blocked |
  | DISCORD_ALLOW_BOTS=all                     | Blocked | Allowed |
  | DISCORD_ALLOW_BOTS=mentions + @mention     | Blocked | Allowed |
  | DISCORD_ALLOW_BOTS=mentions, no mention    | Blocked | Blocked |
  | Human in DISCORD_ALLOWED_USERS             | Allowed | Allowed |
  | Human NOT in DISCORD_ALLOWED_USERS         | Blocked | Blocked |

Co-authored-by: Hermes Maintainer <hermes@nousresearch.com>
2026-04-17 05:42:04 -07:00
Teknium d7fb435e0e fix(discord): flat /skill command with autocomplete — fits 8KB limit trivially (#11580)
Closes #11321, closes #10259.

## Problem

The nested /skill command group (category subcommand groups + skill
subcommands) serialized to ~14KB with the default 75-skill catalog,
exceeding Discord's ~8000-byte per-command registration payload. The
entire tree.sync() rejected with error 50035 — ALL slash commands
including the 27 base commands failed to register.

## Fix

Replace the nested Group layout with a single flat Command:

    /skill name:<autocomplete> args:<optional string>

Autocomplete options are fetched dynamically by Discord when the user
types — they do NOT count against the per-command registration budget.
So this single command registers at ~200 bytes regardless of how many
skills exist. Scales to thousands of skills with no size calculations,
no splitting, no hidden skills.

UX improvements:
- Discord live-filters by user's typed prefix against BOTH name and
  description, so '/skill pdf' finds 'ocr-and-documents' via its
  description. More discoverable than clicking through category menus.
- Unknown skill name → ephemeral error pointing user at autocomplete.
- Stable alphabetical ordering across restarts.

## Why not the other proposed approaches

Three prior PRs tried to fit within the 8KB limit by modifying the
nested layout:

- #10214 (njiangk): truncated all descriptions to 'Run <name>' and
  category descriptions to 'Skills'. Works but destroys slash picker UX.
- #11385 (LeonSGP43): 40-char description clamp + iterative
  trim-largest-category fallback. Works but HIDES skills the user can
  no longer invoke via slash — functional regression.
- #10261 (zeapsu): adaptive split into /skill-<cat> top-level groups.
  Preserves all skills but pollutes the slash namespace with 20
  top-level commands.

All three work around the symptom. The flat autocomplete design
dissolves the problem — there is no payload-size pressure to manage.

## Tests

tests/gateway/test_discord_slash_commands.py — 5 new test cases replace
the 3 old nested-structure tests:

- flat-not-nested structure assertion
- empty skills → no command registered
- callback dispatches the right cmd_key by name
- unknown name → ephemeral error, no dispatch
- large-catalog regression guard (500 skills) — command payload stays
  under 500 bytes regardless

E2E validated against real discord.py 2.7.1:
- Command registers as discord.app_commands.Command (not Group).
- Autocomplete filters by name AND description (verified across several
  queries including description-only matches like 'pdf' → OCR skill).
- 500-skill catalog returns max 25 results per autocomplete query
  (Discord's hard cap), filtered correctly.
- Choice labels formatted as 'name — description' clamped to 100 chars.
2026-04-17 05:19:14 -07:00
Teknium 13f2d997b0 test(dingtalk): cover QR device-flow auth + OpenClaw branding disclosure
Adds 15 regression tests for hermes_cli/dingtalk_auth.py covering:
  * _api_post — network error mapping, errcode-nonzero mapping, success path
  * begin_registration — 2-step chain, missing-nonce/device_code/uri
    error cases
  * wait_for_registration_success — success path, missing-creds guard,
    on_waiting callback invocation
  * render_qr_to_terminal — returns False when qrcode missing, prints
    when available
  * Configuration — BASE_URL default + override, SOURCE default

Also adds a one-line disclosure in dingtalk_qr_auth() telling users
the scan page will be OpenClaw-branded. Interim measure: DingTalk's
registration portal is hardcoded to route all sources to /openapp/
registration/openClaw, so users see OpenClaw branding regardless of
what 'source' value we send. We keep 'openClaw' as the source token
until DingTalk-Real-AI registers a Hermes-specific template.

Also adds meng93 to scripts/release.py AUTHOR_MAP.
2026-04-17 05:08:07 -07:00
meng93 9deeee7bb7 feat(dingtalk): add QR code auth support and fix 3 critical bugs
- feat: support one-click QR scan to create DingTalk bot and establish connection
- fix(gateway): wrap blocking DingTalkStreamClient.start() with asyncio.to_thread()
- fix(gateway): extract message fields from CallbackMessage payload instead of ChatbotMessage
- fix(gateway): add oapi.dingtalk.com to allowed webhook URL domains
2026-04-17 05:08:07 -07:00
Teknium 08930a65ea chore: map Patrick Wang, Hedgeho9, Berny Linville emails to AUTHOR_MAP 2026-04-17 05:01:29 -07:00
Berny Linville 6ee65b4d61 fix(weixin): preserve native markdown rendering
- stop rewriting markdown tables, headings, and links before delivery
- keep markdown table blocks and headings together during chunking
- update Weixin tests and docs for native markdown rendering

Closes #10308
2026-04-17 05:01:29 -07:00
Hedgeho9 498fc6780e fix(weixin): extract and deliver MEDIA: attachments in normal send() path
The Weixin adapter's send() method previously split and delivered the
raw response text without first extracting MEDIA: tags or bare local
file paths. This meant images, documents, and voice files referenced
by the agent were silently dropped in normal (non-streaming,
non-background) conversations.

Changes:
- In WeixinAdapter.send(), call extract_media() and
  extract_local_files() before formatting/splitting text.
- Deliver extracted files via send_image_file(), send_document(),
  send_voice(), or send_video() prior to sending text chunks.
- Also fix two minor typing issues in gateway/run.py where
  extract_media() tuples were not unpacked correctly in background
  and /btw task handlers.

Fixes missing media delivery on Weixin personal accounts.
2026-04-17 05:01:29 -07:00
Patrick Wang 4ed6e4c1a5 refactor(weixin): drop pilk dependency from voice fallback 2026-04-17 05:01:29 -07:00
Patrick Wang 649f38390c fix: force Weixin voice fallback to file attachments 2026-04-17 05:01:29 -07:00
Patrick Wang 678b69ec1b fix(weixin): use Tencent SILK encoding for voice replies 2026-04-17 05:01:29 -07:00
Teknium 53da34a4fc fix(discord): route attachment downloads through authenticated bot session (#11568)
Three open issues — #8242, #6587, #11345 — all trace to the same root
cause: the image / audio / document download paths in
`DiscordAdapter._handle_message` used plain, unauthenticated HTTP to
fetch `att.url`. That broke in three independent ways:

  #8242  cdn.discordapp.com attachment URLs increasingly require the
         bot session to download; unauthenticated httpx sees 403
         Forbidden, image/voice analysis fail silently.

  #6587  Some user environments (VPNs, corporate DNS, tunnels) resolve
         cdn.discordapp.com to private-looking IPs. Our is_safe_url()
         guard correctly blocks them as SSRF risks, but the user
         environment is legitimate — image analysis and voice STT die.

  #11345 The document download path skipped is_safe_url() entirely —
         raw aiohttp.ClientSession.get(att.url) with no SSRF check,
         inconsistent with the image/audio branches.

Unified fix: use `discord.Attachment.read()` as the primary download
path on all three branches. `att.read()` routes through discord.py's
own authenticated HTTPClient, so:

  - Discord CDN auth is handled (#8242 resolved).
  - Our is_safe_url() gate isn't consulted for the attachment path at
    all — the bot session handles networking internally (#6587 resolved).
  - All three branches now share the same code path, eliminating the
    document-path SSRF gap (#11345 resolved).

Falls back to the existing cache_*_from_url helpers (image/audio) or an
SSRF-gated aiohttp fetch (documents) when `att.read()` is unavailable
or fails — preserves defense-in-depth for any future payload-schema
drift that could slip a non-CDN URL into att.url.

New helpers on DiscordAdapter:
  - _read_attachment_bytes(att)  — safe att.read() wrapper
  - _cache_discord_image(att, ext)     — primary + URL fallback
  - _cache_discord_audio(att, ext)     — primary + URL fallback
  - _cache_discord_document(att, ext)  — primary + SSRF-gated aiohttp fallback

Tests:
  - tests/gateway/test_discord_attachment_download.py — 12 new cases
    covering all three helpers: primary path, fallback on missing
    .read(), fallback on validator rejection, SSRF guard on document
    fallback, aiohttp fallback happy-path, and an E2E case via
    _handle_message confirming cache_image_from_url is never invoked
    when att.read() succeeds.
  - All 11 existing document-handling tests continue to pass via the
    aiohttp fallback path (their SimpleNamespace attachments have no
    .read(), which triggers the fallback — now SSRF-gated).

Closes #8242, closes #6587, closes #11345.
2026-04-17 04:59:03 -07:00
Teknium 24342813fe fix(qqbot): correct Authorization header format in send_message REST path (#11569)
The send_message tool's direct-REST QQBot path used "QQBotAccessToken {token}"
which QQ's API rejects with 401. The correct format is "QQBot {token}" — the
gateway adapter at gateway/platforms/qqbot.py uses this format in all 5 header
sites (lines 341, 551, 579, 1068, 1467); this was the one outlier.

Credit to @Quon for surfacing this in #10257 (that PR had unrelated issues in
its media-upload logic and was closed; this salvages the genuine 1-line fix).
2026-04-17 04:25:47 -07:00
Teknium ca03e80348 chore: map LehaoLin email to AUTHOR_MAP for release script 2026-04-17 04:22:40 -07:00
LehaoLin 504e7eb9e5 fix(gateway): wait for reconnection before dropping WebSocket sends
When a WebSocket-based platform adapter (e.g. QQ Bot) temporarily
loses its connection, send() now polls is_connected for up to 15s
instead of immediately returning a non-retryable failure. If the
auto-reconnect completes within the window, the message is delivered
normally. On timeout, the SendResult is marked retryable=True so the
base class retry mechanism can attempt re-delivery.

Same treatment applied to _send_media().

Adds 4 async tests covering:
- Successful send after simulated reconnection
- Retryable failure on timeout
- Immediate success when already connected
- _send_media reconnection wait

Fixes #11163
2026-04-17 04:22:40 -07:00
dieutx b594b30de4 fix(release): map dieutx email in author map 2026-04-17 04:22:40 -07:00
dieutx 995177d542 fix(gateway): honor QQ_GROUP_ALLOWED_USERS in runner auth 2026-04-17 04:22:40 -07:00
Pedro Gonzalez 590c9964e1 Fix QQ voice attachment SSRF validation 2026-04-17 04:22:40 -07:00
yeyitech a97b08e30c fix: allow trusted QQ CDN benchmark IP resolution 2026-04-17 04:22:40 -07:00
Teknium aca81ac7bb test(dingtalk): cover require_mention + allowed_users gating
Adds 16 regression tests for the gating logic introduced in the
salvaged commit:

  * TestAllowedUsersGate — empty/wildcard/case-insensitive matching,
    staff_id vs sender_id, env var CSV population
  * TestMentionPatterns — compilation, case-insensitivity, invalid
    regex is skipped-not-raised, JSON env var, newline fallback
  * TestShouldProcessMessage — DM always accepted, group gating via
    require_mention / is_in_at_list / wake-word pattern / free_response_chats

Also adds yule975 to scripts/release.py AUTHOR_MAP (release CI blocks
unmapped emails).
2026-04-17 04:21:49 -07:00
yule975 9039273ff0 feat(platforms): add require_mention + allowed_users gating to DingTalk
DingTalk was the only messaging platform without group-mention gating or a
per-user allowlist. Slack, Telegram, Discord, WhatsApp, Matrix, and Mattermost
all support these via config.yaml + matching env vars; this change closes the
gap for DingTalk using the same surface:

Config:
  platforms.dingtalk.require_mention: bool   (env: DINGTALK_REQUIRE_MENTION)
  platforms.dingtalk.mention_patterns: list  (env: DINGTALK_MENTION_PATTERNS)
  platforms.dingtalk.free_response_chats: list  (env: DINGTALK_FREE_RESPONSE_CHATS)
  platforms.dingtalk.allowed_users: list     (env: DINGTALK_ALLOWED_USERS)

Semantics mirror Telegram's implementation:
- DMs are always accepted (subject to allowed_users).
- Group messages are accepted only when the chat is allowlisted, mention is
  not required, the bot was @mentioned (dingtalk_stream sets is_in_at_list),
  or the text matches a configured regex wake-word.
- allowed_users matches sender_id / sender_staff_id case-insensitively;
  a single "*" disables the check.

Rationale: without this, any DingTalk user in a group chat can trigger the
bot, which makes DingTalk less safe to deploy than the other platforms. A
user's config.yaml already accepts require_mention for dingtalk but the value
was silently ignored.
2026-04-17 04:21:49 -07:00
Teknium 29d5d36b14 fix(copilot): normalize vendor-prefixed and dash-notation model IDs (#6879) (#11561)
The Copilot API returns HTTP 400 "model_not_supported" when it receives a
model ID it doesn't recognize (vendor-prefixed like
`anthropic/claude-sonnet-4.6` or dash-notation like `claude-sonnet-4-6`).
Two bugs combined to leave both formats unhandled:

1. `_COPILOT_MODEL_ALIASES` in hermes_cli/models.py only covered bare
   dot-notation and vendor-prefixed dot-notation.  Hermes' default Claude
   IDs elsewhere use hyphens (anthropic native format), and users with an
   aggregator-style config who switch `model.provider` to `copilot`
   inherit `anthropic/claude-X-4.6` — neither case was in the table.

2. The Copilot branch of `normalize_model_for_provider()` only stripped
   the vendor prefix when it matched the target provider (`copilot/`) or
   was the special-cased `openai/` for openai-codex.  Every other vendor
   prefix survived to the Copilot request unchanged.

Fix:

- Add dash-notation aliases (`claude-{opus,sonnet,haiku}-4-{5,6}` and the
  `anthropic/`-prefixed variants) to the alias table.
- Rewire the Copilot / Copilot-ACP branch of
  `normalize_model_for_provider()` to delegate to the existing
  `normalize_copilot_model_id()`.  That function already does alias
  lookups, catalog-aware resolution, and vendor-prefix fallback — it was
  being bypassed for the generic normalisation entry point.

Because `switch_model()` already calls `normalize_model_for_provider()`
for every `/model` switch (line 685 in model_switch.py), this single fix
covers the CLI startup path (cli.py), the `/model` slash command path,
and the gateway load-from-config path.

Closes #6879

Credits dsr-restyn (#6743) who independently diagnosed the dash-notation
case; their aliases are folded into this consolidated fix alongside the
vendor-prefix stripping repair.
2026-04-17 04:19:36 -07:00
Teknium eabe14af1c test(discord): update reply_mode fixture for new to_reference() wrapping
Follow-up to the reply-reference fix: `_make_discord_adapter` used to return
the raw fetched `Message` as the expected reference, but the adapter now
wraps it via `ref_msg.to_reference(fail_if_not_exists=False)` so Discord
treats a deleted target as 'send without reply chip'. Update the fixture
to return the MessageReference sentinel so the 4 chunk-reference-identity
tests assert against the right object.

No production behavior change; only aligns the stale test fixture.
2026-04-17 04:17:56 -07:00
Teknium ef37aa7cce test(discord): add regression guard for non-reference send errors
Follow-up to the reply-reference fix: ensure errors unrelated to the reply
reference (e.g. 50013 Missing Permissions) do NOT trigger the no-reference
retry path and still surface as a failed SendResult. Keeps the wider retry
condition from silently swallowing unrelated API errors.

Proposed in the original issue writeup (#11342) as test case
`test_non_reference_errors_still_propagate`.
2026-04-17 04:17:56 -07:00
LeonSGP43 a448e7a04d fix(discord): drop invalid reply references 2026-04-17 04:17:56 -07:00
Teknium 0231f8882b chore(release): add Asunfly to AUTHOR_MAP for #10070 salvage 2026-04-17 04:11:30 -07:00
Asunfly 7c932c5aa4 fix(dingtalk): close websocket on disconnect 2026-04-17 04:11:30 -07:00
Teknium f268215019 fix(auth): codex auth remove no longer silently undone by auto-import (#11485)
* feat(skills): add 'hermes skills reset' to un-stick bundled skills

When a user edits a bundled skill, sync flags it as user_modified and
skips it forever. The problem: if the user later tries to undo the edit
by copying the current bundled version back into ~/.hermes/skills/, the
manifest still holds the old origin hash from the last successful
sync, so the fresh bundled hash still doesn't match and the skill stays
stuck as user_modified.

Adds an escape hatch for this case.

  hermes skills reset <name>
      Drops the skill's entry from ~/.hermes/skills/.bundled_manifest and
      re-baselines against the user's current copy. Future 'hermes update'
      runs accept upstream changes again. Non-destructive.

  hermes skills reset <name> --restore
      Also deletes the user's copy and re-copies the bundled version.
      Use when you want the pristine upstream skill back.

Also available as /skills reset in chat.

- tools/skills_sync.py: new reset_bundled_skill(name, restore=False)
- hermes_cli/skills_hub.py: do_reset() + wired into skills_command and
  handle_skills_slash; added to the slash /skills help panel
- hermes_cli/main.py: argparse entry for 'hermes skills reset'
- tests/tools/test_skills_sync.py: 5 new tests covering the stuck-flag
  repro, --restore, unknown-skill error, upstream-removed-skill, and
  no-op on already-clean state
- website/docs/user-guide/features/skills.md: new 'Bundled skill updates'
  section explaining the origin-hash mechanic + reset usage

* fix(auth): codex auth remove no longer silently undone by auto-import

'hermes auth remove openai-codex' appeared to succeed but the credential
reappeared on the next command.  Two compounding bugs:

1. _seed_from_singletons() for openai-codex unconditionally re-imports
   tokens from ~/.codex/auth.json whenever the Hermes auth store is
   empty (by design — the Codex CLI and Hermes share that file).  There
   was no suppression check, unlike the claude_code seed path.

2. auth_remove_command's cleanup branch only matched
   removed.source == 'device_code' exactly.  Entries added via
   'hermes auth add openai-codex' have source 'manual:device_code', so
   for those the Hermes auth store's providers['openai-codex'] state was
   never cleared on remove — the next load_pool() re-seeded straight
   from there.

Net effect: there was no way to make a codex removal stick short of
manually editing both ~/.hermes/auth.json and ~/.codex/auth.json before
opening Hermes again.

Fix:

- Add unsuppress_credential_source() helper (mirrors
  suppress_credential_source()).
- Gate the openai-codex branch in _seed_from_singletons() with
  is_source_suppressed(), matching the claude_code pattern.
- Broaden auth_remove_command's codex match to handle both
  'device_code' and 'manual:device_code' (via endswith check), always
  call suppress_credential_source(), and print guidance about the
  unchanged ~/.codex/auth.json file.
- Clear the suppression marker in auth_add_command's openai-codex
  branch so re-linking via 'hermes auth add openai-codex' works.

~/.codex/auth.json is left untouched — that's the Codex CLI's own
credential store, not ours to delete.

Tests cover: unsuppress helper behavior, remove of both source
variants, add clears suppression, seed respects suppression.  E2E
verified: remove → load → add → load flow now behaves correctly.
2026-04-17 04:10:17 -07:00
Teknium 8b312248dc chore: map RucchiZ email to AUTHOR_MAP for release script 2026-04-17 04:09:21 -07:00
赵晨飞 82969615bb test(weixin): add regression test for send_image_file parameter name
Add TestWeixinSendImageFileParameterName test class with two tests:
- test_send_image_file_uses_image_path_parameter: verifies the correct
  parameter name (image_path) is used when gateway calls send_image_file
- test_send_image_file_works_without_optional_params: ensures minimal
  params work correctly

This prevents the interface from drifting again as noted by Copilot.
2026-04-17 04:09:21 -07:00
赵晨飞 902d6b97d6 fix(weixin): correct send_image_file parameter name to match base class
The send_image_file method in WeixinAdapter used 'path' as parameter
name, but BasePlatformAdapter and gateway callers use 'image_path'.
This mismatch caused image sending to fail when called through the
gateway's extract_media path.

Changed parameter name from 'path' to 'image_path' to match the
interface defined in base.py and the calls in gateway/run.py.
2026-04-17 04:09:21 -07:00
Teknium 5d929caa59 chore(release): map michel.belleau@malaiwah.com to @malaiwah 2026-04-17 04:08:42 -07:00
Michel Belleau efa6c9f715 fix(discord): default allowed_mentions to block @everyone and role pings
discord.py does not apply a default AllowedMentions to the client, so any
reply whose content contains @everyone/@here or a role mention would ping
the whole server — including verbatim echoes of user input or LLM output
that happens to contain those tokens.

Set a safe default on commands.Bot: everyone=False, roles=False,
users=True, replied_user=True. Operators can opt back in via four
DISCORD_ALLOW_MENTION_* env vars or discord.allow_mentions.* in
config.yaml. No behavior change for normal user/reply pings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 04:08:42 -07:00
Teknium 2367c6ffd5 test: remove 169 change-detector tests across 21 files (#11472)
First pass of test-suite reduction to address flaky CI and bloat.

Removed tests that fall into these change-detector patterns:

1. Source-grep tests (tests/gateway/test_feishu.py, test_email.py): tests
   that call inspect.getsource() on production modules and grep for string
   literals. Break on any refactor/rename even when behavior is correct.

2. Platform enum tautologies (every gateway/test_X.py): assertions like
   `Platform.X.value == 'x'` duplicated across ~9 adapter test files.

3. Toolset/PLATFORM_HINTS/setup-wizard registry-presence checks: tests that
   only verify a key exists in a dict. Data-layout tests, not behavior.

4. Argparse wiring tests (test_argparse_flag_propagation, test_subparser_routing
   _fallback): tests that do parser.parse_args([...]) then assert args.field.
   Tests Python's argparse, not our code.

5. Pure dispatch tests (test_plugins_cmd.TestPluginsCommandDispatch): patch
   cmd_X, call plugins_command with matching action, assert mock called.
   Tests the if/elif chain, not behavior.

6. Kwarg-to-mock verification (test_auxiliary_client ~45 tests,
   test_web_tools_config, test_gemini_cloudcode, test_retaindb_plugin): tests
   that mock the external API client, call our function, and assert exact
   kwargs. Break on refactor even when behavior is preserved.

7. Schedule-internal "function-was-called" tests (acp/test_server scheduling
   tests): tests that patch own helper method, then assert it was called.

Kept behavioral tests throughout: error paths (pytest.raises), security
tests (path traversal, SSRF, redaction), message alternation invariants,
provider API format conversion, streaming logic, memory contract, real
config load/merge tests.

Net reduction: 169 tests removed. 38 empty classes cleaned up.

Collected before: 12,522 tests
Collected after:  12,353 tests
2026-04-17 01:05:09 -07:00
Teknium e33cb65a98 fix(insights): hide cache read/write and cost metrics from display (#11477)
The cache-read, cache-write, and total estimated-cost values shown in
/insights (and the per-model Cost column) were unreliable. Hide them from
both terminal and gateway renderings.

The underlying data pipeline is untouched — sessions still store
cache_read_tokens, cache_write_tokens, and estimated_cost_usd; the web
server, /usage command, and status bar are unaffected. Only the
InsightsEngine display layer is trimmed.

Changes:
- format_terminal: drop 'Cache read / Cache write' line, drop 'Est. cost'
  from the Total tokens row, drop per-model 'Cost' column, drop the
  '* Cost N/A for custom/self-hosted' footnote.
- format_gateway: drop cache breakdown from Tokens line, drop 'Est. cost'
  line, drop per-model cost suffix.
- Tests updated to assert these strings are now absent.
2026-04-17 01:02:06 -07:00
Teknium 3f74dafaee fix(nous): respect 'Skip (keep current)' after OAuth login (#11476)
* feat(skills): add 'hermes skills reset' to un-stick bundled skills

When a user edits a bundled skill, sync flags it as user_modified and
skips it forever. The problem: if the user later tries to undo the edit
by copying the current bundled version back into ~/.hermes/skills/, the
manifest still holds the old origin hash from the last successful
sync, so the fresh bundled hash still doesn't match and the skill stays
stuck as user_modified.

Adds an escape hatch for this case.

  hermes skills reset <name>
      Drops the skill's entry from ~/.hermes/skills/.bundled_manifest and
      re-baselines against the user's current copy. Future 'hermes update'
      runs accept upstream changes again. Non-destructive.

  hermes skills reset <name> --restore
      Also deletes the user's copy and re-copies the bundled version.
      Use when you want the pristine upstream skill back.

Also available as /skills reset in chat.

- tools/skills_sync.py: new reset_bundled_skill(name, restore=False)
- hermes_cli/skills_hub.py: do_reset() + wired into skills_command and
  handle_skills_slash; added to the slash /skills help panel
- hermes_cli/main.py: argparse entry for 'hermes skills reset'
- tests/tools/test_skills_sync.py: 5 new tests covering the stuck-flag
  repro, --restore, unknown-skill error, upstream-removed-skill, and
  no-op on already-clean state
- website/docs/user-guide/features/skills.md: new 'Bundled skill updates'
  section explaining the origin-hash mechanic + reset usage

* fix(nous): respect 'Skip (keep current)' after OAuth login

When a user already set up on another provider (e.g. OpenRouter) runs
`hermes model` and picks Nous Portal, OAuth succeeds and then a model
picker is shown.  If the user picks 'Skip (keep current)', the previous
provider + model should be preserved.

Previously, \_update_config_for_provider was called unconditionally after
login, which flipped config.yaml model.provider to 'nous' while keeping
the old model.default (e.g. anthropic/claude-opus-4.6 from OpenRouter),
leaving the user with a mismatched provider/model pair on the next
request.

Fix: snapshot the prior active_provider before login, and if no model is
selected (Skip, or no models available, or fetch failure), restore the
prior active_provider and leave config.yaml untouched.  The Nous OAuth
tokens stay saved so future `hermes model` -> Nous works without
re-authenticating.

Test plan:
- New tests cover Skip path (preserves provider+model, saves creds),
  pick-a-model path (switches to nous), and fresh-install Skip path
  (active_provider cleared, not stuck as 'nous').
2026-04-17 00:52:42 -07:00
Teknium 3438d274f6 fix(dingtalk): repair _extract_text for dingtalk-stream >= 0.20 SDK shape
The cherry-picked SDK compat fix (previous commit) wired process() to
parse CallbackMessage.data into a ChatbotMessage, but _extract_text()
was still written against the pre-0.20 payload shape:

  * message.text changed from dict {content: ...} → TextContent object.
    The old code's str(text) fallback produced 'TextContent(content=...)'
    as the agent's input, so every received message came in mangled.
  * rich_text moved from message.rich_text (list) to
    message.rich_text_content.rich_text_list.

This preserves legacy fallbacks (dict-shaped text, bare rich_text list)
while handling the current SDK layout via hasattr(text, 'content').

Adds regression tests covering:
  * webhook domain allowlist (api.*, oapi.*, and hostile lookalikes)
  * _IncomingHandler.process is a coroutine function
  * _extract_text against TextContent object, dict, rich_text_content,
    legacy rich_text, and empty-message cases

Also adds kevinskysunny to scripts/release.py AUTHOR_MAP (release CI
blocks unmapped emails).
2026-04-17 00:52:35 -07:00
Kevin S. Sunny c3d2895b18 fix(dingtalk): support dingtalk-stream 0.24+ and oapi webhooks 2026-04-17 00:52:35 -07:00
Teknium e5cde568b7 feat(skills): add 'hermes skills reset' to un-stick bundled skills (#11468)
When a user edits a bundled skill, sync flags it as user_modified and
skips it forever. The problem: if the user later tries to undo the edit
by copying the current bundled version back into ~/.hermes/skills/, the
manifest still holds the old origin hash from the last successful
sync, so the fresh bundled hash still doesn't match and the skill stays
stuck as user_modified.

Adds an escape hatch for this case.

  hermes skills reset <name>
      Drops the skill's entry from ~/.hermes/skills/.bundled_manifest and
      re-baselines against the user's current copy. Future 'hermes update'
      runs accept upstream changes again. Non-destructive.

  hermes skills reset <name> --restore
      Also deletes the user's copy and re-copies the bundled version.
      Use when you want the pristine upstream skill back.

Also available as /skills reset in chat.

- tools/skills_sync.py: new reset_bundled_skill(name, restore=False)
- hermes_cli/skills_hub.py: do_reset() + wired into skills_command and
  handle_skills_slash; added to the slash /skills help panel
- hermes_cli/main.py: argparse entry for 'hermes skills reset'
- tests/tools/test_skills_sync.py: 5 new tests covering the stuck-flag
  repro, --restore, unknown-skill error, upstream-removed-skill, and
  no-op on already-clean state
- website/docs/user-guide/features/skills.md: new 'Bundled skill updates'
  section explaining the origin-hash mechanic + reset usage
2026-04-17 00:41:31 -07:00
Teknium a55a133387 fix(tests): attach caplog to specific logger in 3 order-dependent tests (#11453)
Three tests in tests/test_plugin_skills.py and tests/hermes_cli/test_plugins.py
used caplog.at_level(logging.WARNING) without specifying a logger. When another
test earlier in the same xdist worker touched propagation on tools.skills_tool
or hermes_cli.plugins, caplog would miss the warning and the assertion would
fail intermittently in CI.

These three tests accounted for 15 of the last ~30 Tests workflow failures
(5 each), including the recent main failure on commit 436a7359 (PR #11398).

Fix: pass logger="tools.skills_tool" / logger="hermes_cli.plugins" to
caplog.at_level() so the handler attaches directly to the logger under test
and capture is independent of global propagation state.

Affected tests:
- tests/test_plugin_skills.py::TestSkillViewPluginGuards::test_injection_logged_but_served
- tests/hermes_cli/test_plugins.py::TestPluginCommands::test_register_command_empty_name_rejected
- tests/hermes_cli/test_plugins.py::TestPluginCommands::test_register_command_builtin_conflict_rejected

No production code change. Verified passing under xdist (-n 4) alongside
test_hermes_logging.py (the test most likely to poison the logger state).
2026-04-17 00:20:40 -07:00
Teknium 816e3e3774 test(feishu): cover new SDK event handler registrations
Extends test_build_event_handler_registers_reaction_and_card_processors
to assert that register_p2_im_chat_access_event_bot_p2p_chat_entered_v1
and register_p2_im_message_recalled_v1 are called when building the
event handler, matching the production registrations.

Also adds Fatty911 to scripts/release.py AUTHOR_MAP for credit on the
salvaged event-handler fix.
2026-04-16 22:08:11 -07:00
Fatty911 94168b7f60 fix: register missing Feishu event handlers for P2P chat entered and message recalled 2026-04-16 22:08:11 -07:00
Teknium 220fa7db90 feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406)
* feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro

Upstream asked for these two upgrades ASAP — the old entries show
stale models when newer, higher-quality versions are available on FAL.

Recraft V3 → Recraft V4 Pro
  ID:    fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image
  Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier)
  Schema: V4 dropped the required `style` enum entirely; defaults
          handle taste now. Added `colors` and `background_color`
          to supports for brand-palette control. `seed` is not
          supported by V4 per the API docs.

Nano Banana → Nano Banana Pro
  ID:    fal-ai/nano-banana → fal-ai/nano-banana-pro
  Price: $0.08/image → $0.15/image (1K); $0.30 at 4K
  Schema: Aspect ratio family unchanged. Added `resolution`
          (1K/2K/4K, default 1K for billing predictability),
          `enable_web_search` (real-time info grounding, +$0.015),
          and `limit_generations` (force exactly 1 image).
  Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality
                and reasoning depth improved; slower (~6s → ~8s).

Migration: users who had the old IDs in `image_gen.model` will
fall through the existing 'unknown model → default' warning path
in `_resolve_fal_model()` and get the Klein 9B default on the next
run. Re-run `hermes tools` → Image Generation to pick the new
version. No silent cost-upgrade aliasing — the 2-6x price jump
on these tiers warrants explicit user re-selection.

Portal note: both new model IDs need to be allowlisted on the
Nous fal-queue-gateway alongside the previous 7 additions, or
users on Nous Subscription will see the 'managed gateway rejected
model' error we added previously (which is clear and
self-remediating, just noisy).

* docs: wrap '<1s' in backticks to unblock MDX compilation

Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and
'<1s' fails because '1' isn't a valid tag-name start character. This
was broken on main since PR #11265 (never noticed because
docs-site-checks was failing on OTHER issues at the time and we
admin-merged through it).

Wrapping in backticks also gives the cell monospace styling which
reads more cleanly alongside the inline-code model ID in the same row.

The other '<1s' occurrence (line 52) is inside a fenced code block
and is already safe — code fences bypass MDX parsing.
2026-04-16 22:05:41 -07:00
Teknium 70768665a4 fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383)
* feat(mcp-oauth): scaffold MCPOAuthManager

Central manager for per-server MCP OAuth state. Provides
get_or_build_provider (cached), remove (evicts cache + deletes
disk), invalidate_if_disk_changed (mtime watch, core fix for
external-refresh workflow), and handle_401 (dedup'd recovery).

No behavior change yet — existing call sites still use
build_oauth_auth directly. Task 1 of 8 in the MCP OAuth
consolidation (fixes Cthulhu's BetterStack reliability issues).

* feat(mcp-oauth): add HermesMCPOAuthProvider with pre-flow disk watch

Subclasses the MCP SDK's OAuthClientProvider to inject a disk
mtime check before every async_auth_flow, via the central
manager. When a subclass instance is used, external token
refreshes (cron, another CLI instance) are picked up before
the next API call.

Still dead code: the manager's _build_provider still delegates
to build_oauth_auth and returns the plain OAuthClientProvider.
Task 4 wires this subclass in. Task 2 of 8.

* refactor(mcp-oauth): extract build_oauth_auth helpers

Decomposes build_oauth_auth into _configure_callback_port,
_build_client_metadata, _maybe_preregister_client, and
_parse_base_url. Public API preserved. These helpers let
MCPOAuthManager._build_provider reuse the same logic in Task 4
instead of duplicating the construction dance.

Also updates the SDK version hint in the warning from 1.10.0 to
1.26.0 (which is what we actually require for the OAuth types
used here). Task 3 of 8.

* feat(mcp-oauth): manager now builds HermesMCPOAuthProvider directly

_build_provider constructs the disk-watching subclass using the
helpers from Task 3, instead of delegating to the plain
build_oauth_auth factory. Any consumer using the manager now gets
pre-flow disk-freshness checks automatically.

build_oauth_auth is preserved as the public API for backwards
compatibility. The code path is now:

    MCPOAuthManager.get_or_build_provider  ->
      _build_provider  ->
        _configure_callback_port
        _build_client_metadata
        _maybe_preregister_client
        _parse_base_url
        HermesMCPOAuthProvider(...)

Task 4 of 8.

* feat(mcp): wire OAuth manager + add _reconnect_event

MCPServerTask gains _reconnect_event alongside _shutdown_event.
When set, _run_http / _run_stdio exit their async-with blocks
cleanly (no exception), and the outer run() loop re-enters the
transport to rebuild the MCP session with fresh credentials.
This is the recovery path for OAuth failures that the SDK's
in-place httpx.Auth cannot handle (e.g. cron externally consumed
the refresh_token, or server-side session invalidation).

_run_http now asks MCPOAuthManager for the OAuth provider
instead of calling build_oauth_auth directly. Config-time,
runtime, and reconnect paths all share one provider instance
with pre-flow disk-watch active.

shutdown() defensively sets both events so there is no race
between reconnect and shutdown signalling.

Task 5 of 8.

* feat(mcp): detect auth failures in tool handlers, trigger reconnect

All 5 MCP tool handlers (tool call, list_resources, read_resource,
list_prompts, get_prompt) now detect auth failures and route
through MCPOAuthManager.handle_401:

  1. If the manager says recovery is viable (disk has fresh tokens,
     or SDK can refresh in-place), signal MCPServerTask._reconnect_event
     to tear down and rebuild the MCP session with fresh credentials,
     then retry the tool call once.

  2. If no recovery path exists, return a structured needs_reauth
     JSON error so the model stops hallucinating manual refresh
     attempts (the 'let me curl the token endpoint' loop Cthulhu
     pasted from Discord).

_is_auth_error catches OAuthFlowError, OAuthTokenError,
OAuthNonInteractiveError, and httpx.HTTPStatusError(401). Non-auth
exceptions still surface via the generic error path unchanged.

Task 6 of 8.

* feat(mcp-cli): route add/remove through manager, add 'hermes mcp login'

cmd_mcp_add and cmd_mcp_remove now go through MCPOAuthManager
instead of calling build_oauth_auth / remove_oauth_tokens
directly. This means CLI config-time state and runtime MCP
session state are backed by the same provider cache — removing
a server evicts the live provider, adding a server populates
the same cache the MCP session will read from.

New 'hermes mcp login <name>' command:
  - Wipes both the on-disk tokens file and the in-memory
    MCPOAuthManager cache
  - Triggers a fresh OAuth browser flow via the existing probe
    path
  - Intended target for the needs_reauth error Task 6 returns
    to the model

Task 7 of 8.

* test(mcp-oauth): end-to-end integration tests

Five new tests exercising the full consolidation with real file
I/O and real imports (no transport mocks):

  1. external_refresh_picked_up_without_restart — Cthulhu's cron
     workflow. External process writes fresh tokens to disk;
     on the next auth flow the manager's mtime-watch flips
     _initialized and the SDK re-reads from storage.

  2. handle_401_deduplicates_concurrent_callers — 10 concurrent
     handlers for the same failed token fire exactly ONE recovery
     attempt (thundering-herd protection).

  3. handle_401_returns_false_when_no_provider — defensive path
     for unknown servers.

  4. invalidate_if_disk_changed_handles_missing_file — pre-auth
     state returns False cleanly.

  5. provider_is_reused_across_reconnects — cache stickiness so
     reconnects preserve the disk-watch baseline mtime.

Task 8 of 8 — consolidation complete.
2026-04-16 21:57:10 -07:00
Teknium 436a7359cd feat: add claude-opus-4.7 to Nous Portal curated model list (#11398)
Mirrors OpenRouter which already lists anthropic/claude-opus-4.7 as
recommended. Surfaces the model in the `hermes model` picker and the
gateway /model flow for Nous Portal users.

Context length (1M) is already covered by the existing claude-opus-4.7
entry in agent/model_metadata.py DEFAULT_CONTEXT_LENGTHS.
2026-04-16 21:37:06 -07:00
Teknium 24fa055763 fix(ci): resolve 4 pre-existing main failures (docs lint + 3 stale tests) (#11373)
* docs: fix ascii-guard border alignment errors

Three docs pages had ASCII diagram boxes with off-by-one column
alignment issues that failed docs-site-checks CI:

- architecture.md: outer box is 71 cols but inner-box content lines
  and border corners were offset by 1 col, making content-line right
  border at col 70/72 while top/bottom border was at col 71. Inner
  boxes also had border corners at cols 19/36/53 but content pipes
  at cols 20/37/54. Rewrote the diagram with consistent 71-col width
  throughout, aligned inner boxes at cols 4-19, 22-37, 40-55 with
  2-space gaps and 15-space trailing padding.

- gateway-internals.md: same class of issue — outer box at 51 cols,
  inner content lines varied 52-54 cols. Rewrote with consistent
  51-col width, inner boxes at cols 4-15, 18-29, 32-43. Also
  restructured the bottom-half message flow so it's bare text
  (not half-open box cells) matching the intent of the original.

- agent-loop.md line 112-114: box 2 (API thread) content lines had
  one extra space pushing the right border to col 46 while the top
  and bottom borders of that box sat at col 45. Trimmed one trailing
  space from each of the three content lines.

All 123 docs files now pass `npm run lint:diagrams`:
  ✓ Errors: 0  (warnings: 6, non-fatal)

Pre-existing failures on main — unrelated to any open PR.

* test(setup): accept description kwarg in prompt_choice mock lambdas

setup.py's `_curses_prompt_choice` gained an optional `description`
parameter (used for rendering context hints alongside the prompt).
`prompt_choice` forwards it via keyword arg. The two existing tests
mocked `_curses_prompt_choice` with lambdas that didn't accept the
new kwarg, so the forwarded call raised TypeError.

Fix: add `description=None` to both mock lambda signatures so they
absorb the new kwarg without changing behavior.

* test(matrix): update stale audio-caching assertion

test_regular_audio_has_http_url asserted that non-voice audio
messages keep their HTTP URL and are NOT downloaded/cached. That
was true when the caching code only triggered on
`is_voice_message`. Since bec02f37 (encrypted-media caching
refactor), matrix.py caches all media locally — photos, audio,
video, documents — so downstream tools can read them as real
files via media_urls. This applies to regular audio too.

Renamed the test to `test_regular_audio_is_cached_locally`,
flipped the assertions accordingly, and documented the
intentional behavior change in the docstring. Other tests in
the file (voice-specific caching, message-type detection,
reply-to threading) continue to pass.

* test(413): allow multi-pass preflight compression

run_agent.py's preflight compression runs up to 3 passes in a loop
for very large sessions (each pass summarizes the middle N turns,
then re-checks tokens). The loop breaks when a pass returns a
message list no shorter than its input (can't compress further).

test_preflight_compresses_oversized_history used a static mock
return value that returned the same 2 messages regardless of input,
so the loop ran pass 1 (41 -> 2) and pass 2 (2 -> 2 -> break),
making call_count == 2. The assert_called_once() assertion was
strictly wrong under the multi-pass design.

The invariant the test actually cares about is: preflight ran, and
its first invocation received the full oversized history. Replaced
the count assertion with those two invariants.

* docs: drop '...' from gateway diagram, merge side-by-side boxes

ascii-guard 2.3.0 flagged two remaining issues after the initial fix
pass:

1. gateway-internals.md L33: the '...' suffix after inner box 3's
   right border got parsed as 'extra characters after inner-box right
   border'. Dropped the '...' — the surrounding prose already conveys
   'and more platforms' without needing the visual hint.

2. agent-loop.md: ascii-guard can't cleanly parse two side-by-side
   boxes of different heights (main thread 7 rows, API thread 5 rows).
   Even equalizing heights didn't help — the linter treats the left
   box's right border as the end of the diagram. Merged into a single
   54-char-wide outer box with both threads labeled as regions inside,
   keeping the ▶ arrow to preserve the main→API flow direction.
2026-04-16 20:43:41 -07:00
Teknium fdefd98aa3 docs(skills): make descriptions self-contained, not cross-dependent
Previous pass assumed both skills would always be loaded together, so
each description pointed at the other ('use concept-diagrams instead').
That breaks when only one skill is active — the agent reads 'use the
other skill' and there is no other skill.

Now each skill's description and scope section is fully self-contained:

- States what it's best suited for
- Lists subjects where a more specialized skill (if available) would be
  a better fit, naming them only as 'consider X if available'
- Explicitly offers itself as a general SVG diagram fallback when no
  more specialized skill exists

An agent loading either skill alone gets unambiguous guidance; an
agent with both loaded still gets useful routing via the 'consider X
if available' hints and the related_skills metadata.
2026-04-16 20:39:55 -07:00
Teknium 7d535969ff docs(skills): make architecture-diagram vs concept-diagrams routing explicit
Both skills generate SVG system diagrams, but for very different subjects
and aesthetics. The old descriptions didn't make the split clear, so an
agent loading either one couldn't confidently pick.

Changes:

- Rewrote both frontmatter descriptions to state the scope up front plus
  an explicit 'for X, use the other skill instead' pointer.
- Added a symmetric 'When to use this skill vs <other>' decision table
  to the top of each SKILL.md body, so the guidance is visible whether
  the agent is reading frontmatter or full content.
- Added architecture-diagram <-> concept-diagrams to each other's
  related_skills metadata.

Rule of thumb baked into both skills:
  software/cloud infra -> architecture-diagram
  physical / scientific / educational -> concept-diagrams
2026-04-16 20:39:55 -07:00
Teknium 19c589a20b refactor(concept-diagrams): rename + tighten v1k22's skill for merge
Salvage of PR #11045 (original by v1k22). Changes on top of the
original commit:

- Rename 'architecture-visualization-svg-diagrams' -> 'concept-diagrams'
  to differentiate from the existing architecture-diagram skill.
  architecture-diagram stays as the dark-themed Cocoon-style option for
  software/infra; concept-diagrams covers physics, chemistry, math,
  engineering, physical objects, and educational visuals.
- Trigger description scoped to actual use cases; removed the 'always
  use this skill' language and long phrase-capture list to stop
  colliding with architecture-diagram, excalidraw, generative-widgets,
  manim-video.
- Default output is now a standalone self-contained HTML file (works
  offline, no server). The preview server is opt-in and no longer part
  of the default workflow.
- When the server IS used: bind to 127.0.0.1 instead of 0.0.0.0 (was a
  LAN exposure hazard on shared networks) and let the OS pick a free
  ephemeral port instead of hard-coding 22223 (collision prone).
- Shrink SKILL.md from 1540 to 353 lines by extracting reusable
  material into linked files:
    - templates/template.html (host page with full CSS design system)
    - references/physical-shape-cookbook.md
    - references/infrastructure-patterns.md
    - references/dashboard-patterns.md
  All 15 examples kept intact.
- Add dhandhalyabhavik@gmail.com -> v1k22 to AUTHOR_MAP.

Preserves v1k22's authorship on the underlying commit.
2026-04-16 20:39:55 -07:00
v1k22 9a4766fc18 feat: add architecture-visualization-svg-diagrams skill to creative category
- SKILL.md with full SVG design system (color palette, typography, spacing, dark mode)
- 15 example diagrams covering flowcharts, physical structures, chemistry, charts, floor plans, and more
- Supports 8 diagram types: flowchart, structural, API map, microservice, data flow, physical, infrastructure, UI mockups
- Auto-hosts diagrams on 0.0.0.0:22223 as interactive web pages
2026-04-16 20:39:55 -07:00
Teknium 7af9bf3a54 fix(feishu): queue inbound events when adapter loop not ready (#5499) (#11372)
Inbound Feishu messages arriving during brief windows when the adapter
loop is unavailable (startup/restart transitions, network-flap reconnect)
were silently dropped with a WARNING log. This matches the symptom in
issue #5499 — and users have reported seeing only a subset of their
messages reach the agent.

Fix: queue pending events in a thread-safe list and spawn a single
drainer thread that replays them once the loop becomes ready. Covers
these scenarios:

  * Queue events instead of dropping when loop is None/closed
  * Single drainer handles the full queue (not thread-per-event)
  * Thread-safe with threading.Lock on the queue and schedule flag
  * Handles mid-drain bursts (new events arrive while drainer is working)
  * Handles RuntimeError if loop closes between check and submit
  * Depth cap (1000) prevents unbounded growth during extended outages
  * Drops queue cleanly on disconnect rather than holding forever
  * Safety timeout (120s) prevents infinite retention on broken adapters

Based on the approach proposed in #4789 by milkoor, rewritten for
thread-safety and correctness.

Test plan:
  * 5 new unit tests (TestPendingInboundQueue) — all passing
  * E2E test with real asyncio loop + fake WS thread: 10-event burst
    before loop ready → all 10 delivered in order
  * E2E concurrent burst test: 20 events queued, 20 more arrive during
    drainer dispatch → all 40 delivered, no loss, no duplicates
  * All 111 existing feishu tests pass

Related: #5499, #4789

Co-authored-by: milkoor <milkoor@users.noreply.github.com>
2026-04-16 20:36:59 -07:00
Brooklyn Nicholson 5435287dec chore: uptick 2026-04-16 22:35:45 -05:00
Brooklyn Nicholson 41d3d7afb7 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 22:35:27 -05:00
Brooklyn Nicholson 39231f29c6 refactor(tui): /clean pass across ui-tui — 49 files, −217 LOC
Full codebase pass using the /clean doctrine (KISS/DRY, no one-off
helpers, no variables-used-once, pure functional where natural,
inlined obvious one-liners, killed dead exports, narrowed types,
spaced JSX). All contracts preserved — no RPC method, event name,
or exported type shape changed.

app/ — 15 files, -134 LOC
- inlined 4 one-off helpers (titleCase, isLong, statusToneFrom,
  focusOutside predicate)
- stores to arrow-const style (buildUiState, buildTurnState,
  buildOverlayState plus get/patch/reset triplets)
- functional slash/registry byName map (flatMap over for-loops)
- dropped dead param `live` in cancelOverlayFromCtrlC
- DRY'd duplicate shift() call in scrollWithSelection
- consolidated sections.push calls in /help

components/ — 12 files, -40 LOC
- extracted inline prop types to interfaces at file bottom (13×)
- inlined 6 one-off vars (pctLabel, logoW, heroW, cwd, title, hint)
- promoted HEART_COLORS + OPTS/LABELS to module scope
- JSX sibling spacing across 9 files
- un-shadowed `raw` in textInput
- components/thinking.tsx + components/markdown.tsx untouched
  (structurally load-bearing / edge-case-heavy)

config content domain protocol/ — 8 files, -77 LOC
- tightened 3 regexes (MOUSE_TRACKING, looksLikeSlashCommand,
  hasInterpolation — dropped stateful lastIndex dance)
- dead export ParsedSlashCommand removed
- MODES narrowed to `as const`, `.find(m => m === s)` replaces
  `.includes() ? (as cast) : null`
- fortunes.ts hash via reduce
- fmtDuration ternary chain
- inlined aboveViewport predicate in viewport.ts

hooks/ + lib/ — 9 files, -38 LOC
- ANSI_RE via String.fromCharCode(27) + WS_RE lifted to module
  scope (no more eslint-disable no-control-regex)
- compactPreview/edgePreview/thinkingPreview → ternary arrows
- useCompletion: hoisted pathReplace, moved stale-ref guard earlier
- useInputHistory: dropped useCallback wrapper (append is stable)
- useVirtualHistory: replaced 4× any with unknown + narrow
  MeasuredNode interface + one cast site

root TS — 3 files, -63 LOC
- banner.ts: parseRichMarkup via matchAll instead of exec/lastIndex,
  artWidth via reduce
- gatewayClient.ts: resolvePython candidate list collapse, inlined
  one-branch guards in dispatch/pushLog/drain/request
- types.ts: alpha-sorted ActiveTool / Msg / SudoReq / SecretReq
  members

eslint config
- disabled react-hooks/exhaustive-deps on packages/hermes-ink/**
  (compiled by react/compiler, deps live in $[N] memo arrays that
  eslint can't introspect) and removed the now-orphan in-file
  disable directive in ScrollBox.tsx

fixes (not from the cleaner pass)
- useComposerState: unlinkSync(file) + try/catch → rmSync(file,
  { force: true }) — kills the no-empty lint error and is more
  idiomatic
- useConfigSync: added setBellOnComplete + setVoiceEnabled to the
  two useEffect dep arrays (they're stable React setState setters;
  adding is safe and silences exhaustive-deps)

verification
- npx eslint src/ packages/ → 0 errors, 0 warnings
- npm run type-check → clean
- npm test → 50/50
- npm run build → 394.8kb ink-bundle.js, 11ms esbuild
- pytest tests/tui_gateway/ tests/test_tui_gateway_server.py
  tests/hermes_cli/test_tui_resume_flow.py
  tests/hermes_cli/test_tui_npm_install.py → 57/57
2026-04-16 22:32:53 -05:00
Teknium 01906e99dd feat(image_gen): multi-model FAL support with picker in hermes tools (#11265)
* feat(image_gen): multi-model FAL support with picker in hermes tools

Adds 8 FAL text-to-image models selectable via `hermes tools` →
Image Generation → (FAL.ai | Nous Subscription) → model picker.

Models supported:
- fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP)
- fal-ai/flux-2-pro (previous default, kept backward-compat upscaling)
- fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN)
- fal-ai/nano-banana (Gemini 2.5 Flash Image)
- fal-ai/gpt-image-1.5 (with quality tier: low/medium/high)
- fal-ai/ideogram/v3 (best typography)
- fal-ai/recraft-v3 (vector, brand styles)
- fal-ai/qwen-image (LLM-based)

Architecture:
- FAL_MODELS catalog declares per-model size family, defaults, supports
  whitelist, and upscale flag. Three size families handled uniformly:
  image_size_preset (flux family), aspect_ratio (nano-banana), and
  gpt_literal (gpt-image-1.5).
- _build_fal_payload() translates unified inputs (prompt + aspect_ratio)
  into model-specific payloads, merges defaults, applies caller overrides,
  wires GPT quality_setting, then filters to the supports whitelist — so
  models never receive rejected keys.
- IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen
  providers (Replicate, Stability, etc.); each provider entry tags itself
  with imagegen_backend: 'fal' to select the right catalog.
- Upscaler (Clarity) defaults off for new models (preserves <1s value
  prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS.

Config:
  image_gen.model           = fal-ai/flux-2/klein/9b  (new)
  image_gen.quality_setting = medium                  (new, GPT only)
  image_gen.use_gateway     = bool                    (existing)

Agent-facing schema unchanged (prompt + aspect_ratio only) — model
choice is a user-level config decision, not an agent-level arg.

Picker uses curses_radiolist (arrow keys, auto numbered-fallback on
non-TTY). Column-aligned: Model / Speed / Strengths / Price.

Docs: image-generation.md rewritten with the model table and picker
walkthrough. tools-reference, tool-gateway, overview updated to drop
the stale "FLUX 2 Pro" wording.

Tests: 42 new in tests/tools/test_image_generation.py covering catalog
integrity, all 3 size families, supports filter, default merging, GPT
quality wiring, model resolution fallback. 8 new in
tests/hermes_cli/test_tools_config.py for picker wiring (registry,
config writes, GPT quality follow-up prompt, corrupt-config repair).

* feat(image_gen): translate managed-gateway 4xx to actionable error

When the Nous Subscription managed FAL proxy rejects a model with 4xx
(likely portal-side allowlist miss or billing gate), surface a clear
message explaining:
  1. The rejected model ID + HTTP status
  2. Two remediation paths: set FAL_KEY for direct access, or
     pick a different model via `hermes tools`

5xx, connection errors, and direct-FAL errors pass through unchanged
(those have different root causes and reasonable native messages).

Motivation: new FAL models added to this release (flux-2-klein-9b,
z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3,
qwen-image) are untested against the Nous Portal proxy. If the portal
allowlists model IDs, users on Nous Subscription will hit cryptic
4xx errors without guidance on how to work around it.

Tests: 8 new cases covering status extraction across httpx/fal error
shapes and 4xx-vs-5xx-vs-ConnectionError translation policy.

Docs: brief note in image-generation.md for Nous subscribers.

Operator action (Nous Portal side): verify that fal-queue-gateway
passes through these 7 new FAL model IDs. If the proxy has an
allowlist, add them; otherwise Nous Subscription users will see the
new translated error and fall back to direct FAL.

* feat(image_gen): pin GPT-Image quality to medium (no user choice)

Previously the tools picker asked a follow-up question for GPT-Image
quality tier (low / medium / high) and persisted the answer to
`image_gen.quality_setting`. This created two problems:

1. Nous Portal billing complexity — the 22x cost spread between tiers
   ($0.009 low / $0.20 high) forces the gateway to meter per-tier per
   user, which the portal team can't easily support at launch.
2. User footgun — anyone picking `high` by mistake burns through
   credit ~6x faster than `medium`.

This commit pins quality at medium by baking it into FAL_MODELS
defaults for gpt-image-1.5 and removes all user-facing override paths:

- Removed `_resolve_gpt_quality()` runtime lookup
- Removed `honors_quality_setting` flag on the model entry
- Removed `_configure_gpt_quality_setting()` picker helper
- Removed `_GPT_QUALITY_CHOICES` constant
- Removed the follow-up prompt call in `_configure_imagegen_model()`
- Even if a user manually edits `image_gen.quality_setting` in
  config.yaml, no code path reads it — always sends medium.

Tests:
- Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium
  (5 tests) — proves medium is baked in, config is ignored, flag is
  removed, helper is removed, non-gpt models never get quality.
- Replaced test_picker_with_gpt_image_also_prompts_quality with
  test_picker_with_gpt_image_does_not_prompt_quality — proves only 1
  picker call fires when gpt-image is selected (no quality follow-up).

Docs updated: image-generation.md replaces the quality-tier table
with a short note explaining the pinning decision.

* docs(image_gen): drop stale 'wires GPT quality tier' line from internals section

Caught in a cleanup sweep after pinning quality to medium. The
"How It Works Internally" walkthrough still described the removed
quality-wiring step.
2026-04-16 20:19:53 -07:00
Teknium 0061dca950 fix(installer): make prompt_yes_no bash 3.2 compatible
The helper used ${var,,} (bash 4+ lowercase parameter expansion) and
[[ =~ ]], which fail on macOS default /bin/bash (3.2.57) with:

    bash: ${default,,}: bad substitution

With 'set -e' at the top of the script, that aborts the whole
installer for macOS users who don't have a newer bash on PATH.

Replace the lowercase expansions with POSIX-style case patterns
(`[yY]|[yY][eE][sS]|...`) that behave identically and parse cleanly
on bash 3.2. Verified with a 15-case behavior test on both bash 3.2
and bash 5.2 — all pass.
2026-04-16 20:14:02 -07:00
helix4u 5be8e95604 fix(installer): use line-based tty confirmation prompts 2026-04-16 20:14:02 -07:00
Teknium 8c478983ed fix: enable TCP keepalives to detect dead provider connections (#10324) (#11277)
Re-land of #10933, now guarded by the tests in #11266.

When a provider drops a TCP connection mid-stream, the socket can enter
CLOSE-WAIT and ''epoll_wait'' may never fire — no data or error signal
arrives, so the httpx read timeout never triggers and the agent hangs
indefinitely. The other defenses (''_force_close_tcp_sockets'', stale
stream detector) all ride on the socket layer reporting the dead
connection, which it never does without probes.

Inject ''SO_KEEPALIVE'' + ''TCP_KEEPIDLE''/''KEEPINTVL''/''KEEPCNT''
into the httpx transport. Kernel probes after 30s idle, retries every
10s, gives up after 3 → dead peer detected within ~60s instead of
hanging forever. Platform-aware: ''TCP_KEEPIDLE'' on Linux,
''TCP_KEEPALIVE'' on macOS. Silent no-op on Windows or anywhere
the socket options aren't available.

The original land (#10933) mutated ''client_kwargs'' in place when it
injected the ''httpx.Client''. Since callers pass ''self._client_kwargs''
by reference, the injected client leaked into the instance state. After
the first request, the OpenAI SDK closed its ''http_client'' — including
the injected one. The next ''_create_openai_client'' call re-read the
now-closed ''httpx.Client'' from ''self._client_kwargs'' and every
subsequent chat raised ''APIConnectionError'' with cause ''RuntimeError:
Cannot send a request, as the client has been closed'' (AlexKucera's
Discord report, 2026-04-16).

The defensive ''client_kwargs = dict(client_kwargs)'' copy already on
main (taeuk178's #10978) means this injection only lands in the
per-call local copy. Each ''_create_openai_client'' invocation gets
its OWN fresh ''httpx.Client'' whose lifetime is tied to the paired
''OpenAI'' client. When that ''OpenAI'' client is closed (rebuild,
teardown, credential rotation), its ''httpx.Client'' closes with it
and the next call constructs a fresh one — no stale closed transport
can be reused.

Full 4-test matrix all green (unit + live with real OpenRouter round
trips, HERMES_LIVE_TESTS=1):

    tests/run_agent/test_create_openai_client_kwargs_isolation.py      PASS
    tests/run_agent/test_create_openai_client_reuse.py                 PASS (2)
    tests/run_agent/test_sequential_chats_live.py                      PASS

Socket options verified on the live httpx transport:

    _socket_options: [(1, 9, 1), (6, 4, 30), (6, 5, 10), (6, 6, 3)]
    = (SO_KEEPALIVE=1, TCP_KEEPIDLE=30s, TCP_KEEPINTVL=10s, TCP_KEEPCNT=3)

Sequential-chat reproduction of the #10933 failure was explicitly
run against this patch — the defensive copy on main prevents the
closed transport from leaking back into ''self._client_kwargs'', so
every rebuild constructs a fresh transport.

Closes #10324
2026-04-16 20:04:54 -07:00
Teknium ab33ce1c86 fix(opencode): strip /v1 from base_url on mid-session /model switch to Anthropic-routed models (#11286)
PR #4918 fixed the double-/v1 bug at fresh agent init by stripping the
trailing /v1 from OpenCode base URLs when api_mode is anthropic_messages
(so the Anthropic SDK's own /v1/messages doesn't land on /v1/v1/messages).
The same logic was missing from the /model mid-session switch path.

Repro: start a session on opencode-go with GLM-5 (or any chat_completions
model), then `/model minimax-m2.7`. switch_model() correctly sets
api_mode=anthropic_messages via opencode_model_api_mode(), but base_url
passes through as https://opencode.ai/zen/go/v1. The Anthropic SDK then
POSTs to https://opencode.ai/zen/go/v1/v1/messages, which returns the
OpenCode website 404 HTML page (title 'Not Found | opencode').

Same bug affects `/model claude-sonnet-4-6` on opencode-zen.

Verified upstream: POST /v1/messages returns clean JSON 401 with x-api-key
auth (route works), while POST /v1/v1/messages returns the exact HTML 404
users reported.

Fix mirrors runtime_provider.resolve_runtime_provider:
- hermes_cli/model_switch.py::switch_model() strips /v1 after the OpenCode
  api_mode override when the resolved mode is anthropic_messages.
- run_agent.py::AIAgent.switch_model() applies the same strip as
  defense-in-depth, so any direct caller can't reintroduce the double-/v1.

Tests: 9 new regression tests in tests/hermes_cli/test_model_switch_opencode_anthropic.py
covering minimax on opencode-go, claude on opencode-zen, chat_completions
(GLM/Kimi/Gemini) keeping /v1 intact, codex_responses (GPT) keeping /v1
intact, trailing-slash handling, and the agent-level defense-in-depth.
2026-04-16 19:41:41 -07:00
Teknium 7fd508979e fix: harden sync_back — PID-suffix temp path, size cap, lifecycle guards
Follow-ups on top of kshitijk4poor's cherry-picked salvage of PR #8018:

tools/environments/daytona.py
  - PID-suffix /tmp/.hermes_sync.<pid>.tar so concurrent sync_back calls
    against the same sandbox don't collide on the remote temp path
  - Move sync_back() inside the cleanup lock and after the _sandbox-None
    guard, with its own try/except. Previously a no-op cleanup (sandbox
    already cleared) still fired sync_back → 3-attempt retry storm against
    a nil sandbox (~6s of sleep). Now short-circuits cleanly.

tools/environments/file_sync.py
  - Add _SYNC_BACK_MAX_BYTES (2 GiB) defensive cap: refuse to extract a
    tar larger than the limit. Protects against runaway sandboxes
    producing arbitrary-size archives.
  - Add 'nothing previously pushed' guard at the top of sync_back(). If
    _pushed_hashes and _synced_files are both empty, the FileSyncManager
    was never initialized from the host side — there is nothing coherent
    to sync back. Skips the retry/backoff machinery on uninitialized
    managers and eliminates test-suite slowdown from pre-existing cleanup
    tests that don't mock the sync layer.

tests/tools/test_file_sync_back.py
  - Update _make_manager helper to seed a _pushed_hashes entry by default
    so sync_back() exercises its real path. A seed_pushed_state=False
    opt-out is available for noop-path tests.
  - Add TestSyncBackSizeCap with positive and negative coverage of the
    new cap.

tests/tools/test_sync_back_backends.py
  - Update Daytona bulk download test to assert the PID-suffixed path
    pattern instead of the fixed /tmp/.hermes_sync.tar.
2026-04-16 19:39:21 -07:00
kshitijk4poor d64446e315 feat(file-sync): sync remote changes back to host on teardown
Salvage of PR #8018 by @alt-glitch onto current main.

On sandbox teardown, FileSyncManager now downloads the remote .hermes/
directory, diffs against SHA-256 hashes of what was originally pushed,
and applies only changed files back to the host.

Core (tools/environments/file_sync.py):
- sync_back(): orchestrates download -> unpack -> diff -> apply with:
  - Retry with exponential backoff (3 attempts, 2s/4s/8s)
  - SIGINT trap + defer (prevents partial writes on Ctrl-C)
  - fcntl.flock serialization (concurrent gateway sandboxes)
  - Last-write-wins conflict resolution with warning
  - New remote files pulled back via _infer_host_path prefix matching

Backends:
- SSH: _ssh_bulk_download — tar cf - piped over SSH
- Modal: _modal_bulk_download — exec tar cf - -> proc.stdout.read
- Daytona: _daytona_bulk_download — exec tar cf -> SDK download_file
- All three call sync_back() at the top of cleanup()

Fixes applied during salvage (vs original PR #8018):

| # | Issue | Fix |
|---|-------|-----|
| C1 | import fcntl unconditional — crashes Windows | try/except with fallback; _sync_back_locked skips locking when fcntl=None |
| W1 | assert for runtime guard (stripped by -O) | Replaced with proper if/raise RuntimeError |
| W2 | O(n*m) from _get_files_fn() called per file | Cache mapping once at start of _sync_back_impl, pass to resolve/infer |
| W3 | Dead BulkDownloadFn imports in 3 backends | Removed unused imports |
| W4 | Modal hardcodes root/.hermes, no explanation | Added docstring comment explaining Modal always runs as root |
| S1 | SHA-256 computed for new files where pushed_hash=None | Skip hashing when pushed_hash is None (comparison always False) |
| S2 | Daytona /tmp/.hermes_sync.tar never cleaned up | Added rm -f after download (best-effort) |

Tests: 49 passing (17 new: _infer_host_path edge cases, SIGINT
main/worker thread, Windows fcntl=None fallback, Daytona tar cleanup).

Based on #8018 by @alt-glitch.
2026-04-16 19:39:21 -07:00
Brooklyn Nicholson c730ab8ad7 chore: fmt 2026-04-16 21:09:50 -05:00
Brooklyn Nicholson c74017f405 fix(tui): sticky prompt correctness + scrollbar re-render thrash
Sticky prompt:
The loop was skipping `first` (the first row in the viewport) when
looking for a user message scrolled above the top edge. If `first`
itself was a user row that had just ticked above the viewport, we'd
fall through the early-return guard (`role === 'user' && !above`),
then walk from `first - 1` backward — never rechecking `first`, never
finding anything, returning '' and leaving the sticky empty. This is
why it felt "stuck" at the start: one-turn sessions with the user row
exactly at/near the top never surfaced the breadcrumb.

Collapsed the two branches into one loop starting at `first`: nearest
user wins — still-on-screen → empty (redundant to echo), already
above → text. Same semantics, covers the gap.

Scrollbar:
`useSyncExternalStore` snapshot was `scrollTop:vp:scrollHeight` —
scrollHeight ticks up by ~1 row on every streamed chunk, forcing a
re-render per chunk. Quantized snapshot to the displayed values
(`thumbTop:thumbSize:vp`) so we only re-render when the bar actually
changes. Drops render count per turn by ~100x during streaming and
stops the "constantly resizes" flicker.
2026-04-16 21:07:19 -05:00
Brooklyn Nicholson 40f2368875 fix(tui): ungate reasoning events so the Thinking panel shows live tokens
The gateway was gating `reasoning.delta` and `reasoning.available`
behind `_reasoning_visible(sid)` (true iff `display.show_reasoning:
true` or `tool_progress_mode: verbose`). With the default config,
neither was true — so reasoning events never reached the TUI,
`turn.reasoning` stayed empty, `reasoningTokens` stayed 0, and the
Thinking expander showed no token label for the whole turn. Tools
still reported tokens because `tool.start` had no such gate.

Then `message.complete` fired with `payload.reasoning` populated, the
TUI saved it into `msg.thinking`, and the finalized row's expander
sprouted "~36 tokens" post-hoc. That's the "tokens appear after the
turn" jank.

Remove the gate on emission. The TUI is responsible for whether to
display reasoning content (detailsMode + collapsed expander already
handle that). Token counting becomes continuous throughout the turn,
matching how tools work.

Also dropped the now-unused `_reasoning_visible` and
`_session_show_reasoning` helpers. `show_reasoning` config key stays
in place — it's still toggled via `/reasoning show|hide` and read
elsewhere for potential future TUI-side gating.
2026-04-16 20:56:47 -05:00
Brooklyn Nicholson 319aabbb80 refactor(tui): wrap progress panel + streaming body in StreamingAssistant
Two improvements:

1. The progress ToolTrail and the streaming MessageLine were two
   sibling JSX blocks in appLayout with hand-rolled margin glue
   between them. Extracted into `<StreamingAssistant>`, a single
   component that owns both the trail and the streaming body plus
   the 1-row gap between them. appLayout just hands it `progress`
   and theme; the layout logic lives in one place, matching the
   mental model that these two pieces are one live assistant turn.

2. Thinking token label was hidden when `reasoningTokens === 0` even
   if the live reasoning text was already populated (the
   scheduleReasoning timer hadn't ticked, or the model sent no
   reasoning but the text was coming in via reasoning.delta).
   Changed the tokenCount fallback from `reasoningTokens !==
   undefined ? reasoningTokens : estimate` to `reasoningTokens > 0 ?
   ... : estimate` so the label appears the moment text exists.
2026-04-16 20:49:41 -05:00
Brooklyn Nicholson 26f3a05c9c fix(tui): don't clobber busy on the progress panel during streaming
`appLayout` was passing `busy={ui.busy && !progress.streaming}` into
ToolTrail, so the moment `message.delta` fired and streaming began,
the panel internally saw `busy=false`. With the prior fix in place
(hasThinking = !!cot || reasoningActive || busy), that flipped
hasThinking to false and the Thinking expander vanished mid-turn —
reappearing only after message.complete when the finalized row
rendered with its own internal expander.

The `!progress.streaming` override was a defensive guard against the
panel implying "still thinking" once the response text was streaming.
But that's already handled inside ToolTrail — `streaming` prop on the
Thinking component uses `busy && reasoningStreaming`, and
reasoningStreaming is already falsey once recordMessageDelta calls
endReasoningPhase.

Pass plain `busy={ui.busy}`. Panel stays up start-to-finish; handoff
to the finalized-message row is continuous.
2026-04-16 20:39:02 -05:00
Brooklyn Nicholson 15096903c7 fix(tui): keep the newline above the streaming assistant text
Finalized assistant messages rendered the thinking/tools trail inside
MessageLine with marginBottom=1 before the response body — giving a
clean blank line above the text. The streaming path rendered the
progress ToolTrail and the streaming MessageLine as two separate
siblings with no margin between, so the in-progress response butted
right up against the thinking panel. That's the "newline appears
after it's done" jank.

Wrap the streaming MessageLine in a Box with marginTop=1 whenever the
progress area is visible above it. Same spacing as the finalized
version, continuous through the handoff.
2026-04-16 20:35:46 -05:00
Brooklyn Nicholson 26859e3fcb fix(tui): keep the Thinking expander visible for the whole turn
Previously `hasThinking = !!cot || reasoningActive || (busy && !hasTools)`
so the moment a tool started streaming (`hasTools` → true) the expander
vanished mid-turn. If the model also produced no `reasoning.delta`
events (reasoning-less models, or reasoning arriving after tools), the
whole turn ran with no Thinking row — then `message.complete`
populated `msg.thinking` from the payload's post-hoc reasoning trace
and the expander suddenly appeared in the transcript AFTER the turn.

Drop the `!hasTools` restriction. The Thinking row now anchors for the
entire `busy` window; tools and thinking coexist as sibling sections
(they already did — the exclusion was a UX mistake). Reasoning-less
models show a dim empty header; streaming models show live content;
tool-interleaved turns keep the anchor visible throughout.
2026-04-16 20:27:06 -05:00
Brooklyn Nicholson aedc767c66 feat(tui): put the kawaii face+verb ticker in the status bar, not the thinking panel
The status bar was showing stale lifecycle text ("running…") while the
face+verb stream flickered through the thinking panel as Python pushed
thinking.delta events. That's backwards — the face ticker is the
primary "I'm alive" signal, it belongs in the status bar; the thinking
panel is for substantive reasoning and tool activity.

Status bar now reads `ui.busy`: when true, renders a local `<FaceTicker>`
cycling FACES × VERBS on a 2.5s interval, unaffected by server events.
When false, the bar shows the actual status string (ready, starting
agent…, interrupted, etc.).

Side effect: `scheduleThinkingStatus` still patches `ui.status` with
Python's face text, but while busy the bar ignores that string and uses
the ticker instead. No server-side changes needed — Python keeps
emitting thinking.delta as a liveness heartbeat, the TUI just doesn't
let it fight the status bar.
2026-04-16 20:14:25 -05:00
Brooklyn Nicholson 23212d6b40 docs: kill "PT" shorthand — say "classic (prompt_toolkit) CLI"
"PT" was internal shorthand for prompt_toolkit that leaked into
AGENTS.md and the TUI post-mortem. Spell it out.

- AGENTS.md: "PT CLI" → "classic (prompt_toolkit) CLI"
- docs/plans/2026-04-01-ink-gateway-tui-migration-plan.md: both hits
2026-04-16 19:39:09 -05:00
Brooklyn Nicholson 7ffefc2d6c docs(tui): rename "Ink TUI" to just "TUI" throughout user-facing surfaces
"Ink" is the React reconciler — implementation detail, not branding.
Consistent naming: the classic CLI is the CLI, the new one is the TUI.

Updated docs: user-guide/tui.md, user-guide/cli.md cross-link, quickstart,
cli-commands reference, environment-variables reference.

Updated code: main.py --tui help text, server.py user-visible setup
error, AGENTS.md "TUI Architecture" section.

Kept "Ink" only where it is literally the library (hermes-ink internal
source comments, AGENTS.md tree note flagging ui-tui/ as a React/Ink dir).
2026-04-16 19:38:21 -05:00
Brooklyn Nicholson 2812bfe5b9 docs(tui): add Ink TUI user guide + cross-link from CLI docs
New primary guide at `user-guide/tui.md` covering launch, requirements,
keybindings, slash commands, status line, configuration, sessions, and
the revert path. Matches the voice of `user-guide/cli.md`.

Cross-links:
- `user-guide/cli.md`: tip callout pointing readers at the Ink TUI
- `getting-started/quickstart.md`: shows both `hermes` and `hermes --tui`
  under "Start Chatting" so first-run users know they have the choice
- `reference/environment-variables.md`: new "Interface" section with
  `HERMES_TUI` and `HERMES_TUI_DIR`
- `reference/cli-commands.md`: `--tui` and `--dev` added to global options

Sidebar: `user-guide/tui` slotted right after `user-guide/cli`.
2026-04-16 19:29:18 -05:00
Brooklyn Nicholson ca30803d89 chore(tui): strip noise comments 2026-04-16 19:14:05 -05:00
Brooklyn Nicholson 7f1204840d test(tui): fix stale mocks + xdist flakes in TUI test suite
All 61 TUI-related tests green across 3 consecutive xdist runs.

tests/tui_gateway/test_protocol.py:
- rename `get_messages` → `get_messages_as_conversation` on mock DB (method
  was renamed in the real backend, test was still stubbing the old name)
- update tool-message shape expectation: `{role, name, context}` matches
  current `_history_to_messages` output, not the legacy `{role, text}`

tests/hermes_cli/test_tui_resume_flow.py:
- `cmd_chat` grew a first-run provider-gate that bailed to "Run: hermes
  setup" before `_launch_tui` was ever reached; 3 tests stubbed
  `_resolve_last_session` + `_launch_tui` but not the gate
- factored a `main_mod` fixture that stubs `_has_any_provider_configured`,
  reused by all three tests

tests/test_tui_gateway_server.py:
- `test_config_set_personality_resets_history_and_returns_info` was flaky
  under xdist because the real `_write_config_key` touches
  `~/.hermes/config.yaml`, racing with any other worker that writes
  config. Stub it in the test.
2026-04-16 19:07:49 -05:00
Brooklyn Nicholson dd2ec6bfa0 chore: uptick 2026-04-16 18:57:56 -05:00
Brooklyn Nicholson 3746c60439 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 18:25:49 -05:00
Brooklyn Nicholson 727f0eaf74 refactor(tui): clean up touched files — DRY, KISS, functional
Python (tui_gateway/server.py):
- hoist `_wait_agent` next to `_sess` so `_sess` no longer forward-refs
- simplify `_wait_agent`: `ready.wait()` already returns True when set,
  no separate `.is_set()` check, collapse two returns into one expr
- factor `_sess_nowait` for handlers that don't need the agent (currently
  `terminal.resize` + `input.detect_drop`) — DRY up the duplicated
  `_sessions.get` + "session not found" dance
- inline `session = _sessions[sid]` in the session.create build thread so
  agent/worker writes don't re-look-up the dict each time
- rename inline `ready_event` → `ready` (it's never ambiguous)

TS:
- `useSessionLifecycle.newSession`: hoist `r.info ?? null` into `info`
  so it's one lookup, drop ceremonial `{ … }` blocks around single-line
  bodies
- `createGatewayEventHandler.session.info`: wrap the case in a block,
  hoist `ev.payload` into `info`, tighten comments
- `useMainApp` flush effect: collapse two guard returns into one
- `bootBanner.ts`: lift `TAGLINE` + `FALLBACK` to module constants, make
  `GRADIENT` readonly, one-liner return via template literal
- `theme.ts`: group `selectionBg` inside the status* block (it's a UI
  surface bg, same family), trim the comment
2026-04-16 18:07:23 -05:00
Brooklyn Nicholson 275256cdb4 feat(tui): uniform selection background instead of SGR inverse
Selection was falling back to SGR-7 inverse (fg ↔ bg per cell), which
fragments over syntax-highlighted content — each amber/gold/dim/cornsilk
fg turned into a different bg stripe, producing the staircase look.

Now `useMainApp` calls `selection.setSelectionBgColor()` with a muted
navy (`#3a3a55`) on theme change. `setSelectionBg` in screen.ts replaces
just the bg cell-by-cell while preserving fg/bold/dim/italic, so the
highlight is one solid color across the whole drag range and the text
stays readable in its original color.

Skins can override via `selection_bg` in their color map.
2026-04-16 15:50:28 -05:00
Brooklyn Nicholson 9503896aa2 perf(tui): paint banner to stdout in ~2ms, before Ink loads
Dynamic-importing @hermes/ink + App costs ~170ms on cold start — during
that window the terminal was blank. Now `entry.tsx` writes a raw-ANSI
banner to stdout immediately after the TTY check, using hardcoded
DEFAULT_THEME colors. Ink's `<AlternateScreen>` wipes the normal-screen
buffer when it mounts, so the boot banner is replaced seamlessly by the
real React render a moment later — no double-banner, no flash.

  T=2ms    banner visible (vs. ~170ms before)
  T=~170ms React + Ink mounts
  T=~200ms alt screen takes over, Banner component repaints

Palette drift between `bootBanner.ts` and the live theme is harmless —
the live render overrides after ~200ms. Narrow terminals (cols < 98)
fall back to the one-line "⚕ NOUS HERMES" marker.
2026-04-16 15:48:41 -05:00
Brooklyn Nicholson 04e36851b7 feat(tui): honest status 'starting agent…' until session.info arrives
Post-async-session.create, `session.create` returns in ~1ms with partial
info and the real agent fires `session.info` ~1s later. Previously the
status bar went straight to 'ready' right after the instant RPC return,
which was misleading — `prompt.submit` would block server-side waiting
for the agent to finish building.

Now:
- `newSession`: status = 'starting agent…' when info has no `version`,
  else 'ready' (covers the fast resume path too)
- `session.info` event: flips status to 'ready' only if it was
  'starting agent…', preserving running/interrupted/error states
2026-04-16 15:41:44 -05:00
Brooklyn Nicholson a8e0a1148f perf(tui): async session.create — sid live in ~250ms instead of ~1350ms
Previously `session.create` blocked for ~1.2s on `_make_agent` (mostly
`run_agent` transitive imports + AIAgent constructor). The UI waited
through that whole window before sid became known and the banner/panel
could render.

Now `session.create` returns immediately with `{session_id, info:
{model, cwd, tools:{}, skills:{}}}` and spawns a background thread that
does the real `_make_agent` + `_init_session`. When the agent is live,
the thread emits `session.info` with the full payload.

Python side:
- `_sessions[sid]` gets a placeholder dict with `agent=None` and a
  `threading.Event()` named `agent_ready`
- `_wait_agent(session, rid, timeout=30)` blocks until the event is set
  (no-op when already set or absent, e.g. for `session.resume`)
- `_sess()` now calls `_wait_agent` — so every handler routed through it
  (prompt.submit, session.usage, session.compress, session.branch,
  rollback.*, tools.configure, etc.) automatically holds until the agent
  is live, but only during the ~1s startup window
- `terminal.resize` and `input.detect_drop` bypass the wait via direct
  dict lookup — they don't touch the agent and would otherwise block
  the first post-startup RPCs unnecessarily

TS side:
- `session.info` event handler now patches the intro message's `info`
  in-place so the seeded banner upgrades to the full session panel when
  the agent finishes initializing
- `appLayout` gates `SessionPanel` on `info.version` being present
  (only set by `_session_info(agent)`, not by the partial payload from
  `session.create`) — so the panel only appears when real data arrives

Net effect on cold start:
  T=~400ms  banner paints (seeded intro)
  T=~245ms  ui.sid set (session.create responds in ~1ms after ready)
  T=~1400ms session panel fills in (real session.info event)

Pre-session keystrokes queue as before (already handled by the flush
effect); `prompt.submit` will wait on `agent_ready` on the Python side
when the flush tries to send before the agent is live.
2026-04-16 15:39:19 -05:00
Brooklyn Nicholson 842a122964 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 15:37:28 -05:00
Brooklyn Nicholson 2d693c865c perf(tui): spawn python gateway before loading @hermes/ink
Before: entry.tsx imports @hermes/ink (394KB bundle) + App + GatewayClient
in declaration order, then calls `gw.start()` at ~T=220ms. Python fork +
server.py import starts then.

After: only `GatewayClient` is statically imported (5ms, node builtins
only). `gw.start()` fires at ~T=5ms. @hermes/ink + App load in parallel
via `Promise.all(import(...))`. Python gets ~215ms of free runway to do
its own module import before node even finishes loading.

Net: session.info arrives ~150ms earlier in cold start. First React frame
timing is unchanged (still ~240ms — still gated by ink+app imports).

Removed a previously-tried warm-thread in server.py that pre-imported
`run_agent` in the background. Measured variance showed occasional
5-10s outliers (GIL thrashing); median gain was <100ms. Not worth the
non-determinism.
2026-04-16 15:21:49 -05:00
Brooklyn Nicholson f3920fec0b feat(tui): queue pre-session input, auto-flush when session lands
The TUI is fully interactive from the first frame but `session.create`
(agent + tools + MCP) takes ~2s. Plain-text messages typed before the
session is live used to fail with "session not ready yet"; slash and
shell commands worked but agent prompts were dropped.

Now:
- `dispatchSubmission` enqueues plain text when `sid` is null (slash/shell
  still short-circuit first)
- `useMainApp` tracks sid transitions and kicks off one `sendQueued()`
  when the session first becomes ready; subsequent queued messages drain
  on `message.complete` as before
- Fixed pre-existing double-Enter bug that dequeued without sid check

User flow: type `hello` → shows in `queuedDisplay` preview → 2s later
agent wakes → message auto-sends → reply streams. Zero wasted input.
2026-04-16 15:04:18 -05:00
Brooklyn Nicholson c6ed61430a perf(tui): paint banner on first frame, don't wait on session.create
Previously `historyItems` was seeded empty and the intro (with Banner +
SessionPanel) was only pushed after Python's `session.create` returned —
~1.8s of agent + tools + MCP init with nothing on screen. Base CLI feels
instant because it prints the banner as its first action.

Seed `historyItems` with an info-less intro on mount. `appLayout` now
renders the Banner unconditionally for `kind === 'intro'` and gates only
the SessionPanel on `info` being present. Gateway.ready swaps the skin
(~200ms) and session.info fills in the panel when the agent is ready.

Net: first usable frame drops from ~2s to ~300ms (node + module graph +
React mount). No behavior change — intro message is replaced in place
by `introMsg(info)` when `newSession()` / `resumeById()` resolve.
2026-04-16 14:58:12 -05:00
Brooklyn Nicholson cb2a737bc8 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 14:48:33 -05:00
Brooklyn Nicholson 18840bcff8 chore: uptick 2026-04-16 14:48:29 -05:00
Brooklyn Nicholson 0478266831 refactor(tui): stop shadowing python — slash fallback inherits worker output
Python's slash worker already prints every echo/panel command through Rich.
TS was reformatting the same data client-side for 23 commands. Delete those
shadows; let the `slash.exec` fallback in `createSlashHandler` route the
worker's text (via `<Ansi>`) and page-wrap long output.

TS registry now contains 23 commands (down from 45) — only those that:
  - mutate React-local state (composer, transcript, overlays, uiStore)
  - touch the terminal (OSC52 copy, `$EDITOR`, clipboard)
  - open pickers (`/model`, `/resume`)
  - trigger history surgery (`/undo`, `/retry`, `/compress`, `/personality`)
  - need TS-only composition (`/help` merges HOTKEYS + catalog)

Deleted shadows:
  session: yolo, skin, verbose, reasoning, provider, stop, reload-mcp,
           save, title, insights, debug, fast, platforms, snapshot,
           usage, history, profile
  ops:     plugins, rollback, agents, tasks, cron, config, toolsets,
           browser, skills (list/browse only; `/tools configure` kept
           for its history-reset side effect)

Side effects:
- Drops `slash/shared.ts` + `SlashShared` + `shared`/`SLASH_OUTPUT_PAGE` —
  generic slash.exec fallback handles titled paging via `createSlashHandler`.
- Prunes 17 now-unreferenced `*Response` interfaces from gatewayTypes.ts.
- `createSlashHandler` fallback now pages long output (len>180 || lines>2)
  and uses the command name as title.

session.ts: 670 -> 199  (-70%)
ops.ts:     460 ->  52  (-88%)
gatewayTypes.ts: 450 -> 302  (-33%)
2026-04-16 14:26:15 -05:00
Brooklyn Nicholson beccd1bc04 Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 12:42:44 -05:00
Brooklyn Nicholson 68ecdb6e26 refactor(tui): store-driven turn state + slash registry + module split
Hoist turn state from a 286-line hook into $turnState atom + turnController
singleton. createGatewayEventHandler becomes a typed dispatch over the
controller; its ctx shrinks from 30 fields to 5. Event-handler refs and 16
threaded actions are gone.

Fold three createSlash*Handler factories into a data-driven SlashCommand[]
registry under slash/commands/{core,session,ops}.ts. Aliases are data;
findSlashCommand does name+alias lookup. Shared guarded/guardedErr combinator
in slash/guarded.ts.

Split constants.ts + app/helpers.ts into config/ (timing/limits/env),
content/ (faces/placeholders/hotkeys/verbs/charms/fortunes), domain/ (roles/
details/messages/paths/slash/viewport/usage), protocol/ (interpolation/paste).

Type every RPC response in gatewayTypes.ts (26 new interfaces); drop all
`(r: any)` across slash + main app.

Shrink useMainApp from 1216 -> 646 lines by extracting useSessionLifecycle,
useSubmission, useConfigSync. Add <Fg> themed primitive and strip ~50
`as any` color casts.

Tests: 50 passing. Build + type-check clean.
2026-04-16 12:34:45 -05:00
Ari Lotter fc0623f0af update nix 2026-04-16 11:50:35 -04:00
Brooklyn Nicholson 9c71f3a6ea Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 10:47:41 -05:00
Brooklyn Nicholson c4b9750bc1 feat: lazy bootstrap node 2026-04-16 10:47:37 -05:00
Brooklyn Nicholson 39b1336d1f fix: ctx usage display 2026-04-16 08:27:41 -05:00
Brooklyn Nicholson f81dba0da2 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 08:23:20 -05:00
Brooklyn Nicholson 8e06db56fd chore: uptick 2026-04-16 01:04:35 -05:00
Brooklyn Nicholson cb31732c4f chore: uptick 2026-04-15 23:29:00 -05:00
Brooklyn Nicholson 097702c8a7 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 19:11:07 -05:00
Brooklyn Nicholson 72aebfbb24 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 17:43:41 -05:00
Brooklyn Nicholson c9f78d110a feat: good vibes indi 2026-04-15 17:43:38 -05:00
Brooklyn Nicholson baa0de7649 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 16:35:01 -05:00
Brooklyn Nicholson 57e4b61155 feat: change to $ when in ! mode 2026-04-15 16:34:58 -05:00
Brooklyn Nicholson 53a024a941 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 14:37:54 -05:00
Brooklyn Nicholson cb7b740e32 feat: add subagent details 2026-04-15 14:35:42 -05:00
Brooklyn Nicholson 4b4b4d47bc feat: just more cleaning 2026-04-15 14:14:01 -05:00
Brooklyn Nicholson 46cef4b7fa Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 12:48:17 -05:00
Brooklyn Nicholson 9931d1d814 chore: cleanup 2026-04-15 10:35:08 -05:00
Brooklyn Nicholson cc15b55bb9 chore: uptick 2026-04-15 10:23:15 -05:00
Brooklyn Nicholson 371166fe26 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 10:21:00 -05:00
Brooklyn Nicholson 33c615504d feat: add inline token count etc and fix venv 2026-04-15 10:20:56 -05:00
Brooklyn Nicholson 561cea0d4a Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 00:02:31 -05:00
Brooklyn Nicholson 496bfb3c59 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 22:30:22 -05:00
Brooklyn Nicholson 99d859ce4a feat: refactor by splitting up app and doing proper state 2026-04-14 22:30:18 -05:00
Brooklyn Nicholson 4cbf54fb33 chore: uptick 2026-04-14 19:38:04 -05:00
Brooklyn Nicholson 77cd5bf565 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 19:33:03 -05:00
Brooklyn Nicholson bf54f1fb2f Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 18:26:05 -05:00
Brooklyn Nicholson 3bc661ea29 fix: model et al selection on enter 2026-04-14 18:26:00 -05:00
Brooklyn Nicholson 52c11d172a feat: add scrollbar and fix selection on scroll 2026-04-14 14:34:33 -05:00
Brooklyn Nicholson 9804aa7443 fix: scrolling while selecting 2026-04-14 12:50:22 -05:00
Brooklyn Nicholson 7aed09e1ba fix: ctrlc 2026-04-14 12:07:29 -05:00
Brooklyn Nicholson dd2b0b4775 chore: uptick 2026-04-14 11:53:55 -05:00
Brooklyn Nicholson ea2d5754ab Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 11:49:40 -05:00
Brooklyn Nicholson 9a3a2925ed feat: scroll aware sticky prompt 2026-04-14 11:49:32 -05:00
Brooklyn Nicholson c189d5e98b fix: pasting 2026-04-13 22:39:03 -05:00
Brooklyn Nicholson 6bbac046a7 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 21:46:11 -05:00
Brooklyn Nicholson bbc7316007 feat: add cur cwd 2026-04-13 21:46:08 -05:00
Brooklyn Nicholson 35dbb1da3f chore: uptick 2026-04-13 21:22:44 -05:00
Brooklyn Nicholson 6d6b3b03ac feat: add clicky handles 2026-04-13 21:20:55 -05:00
Brooklyn Nicholson 1b573b7b21 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 21:17:41 -05:00
Brooklyn Nicholson 7e4dd6ea02 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 18:32:13 -05:00
Brooklyn Nicholson aeb53131f3 fix(ui-tui): harden TUI error handling, model validation, command UX parity, and gateway lifecycle 2026-04-13 18:29:24 -05:00
Brooklyn Nicholson 783c6b6ed6 chore: uptick 2026-04-13 15:08:06 -05:00
Brooklyn Nicholson 4a260b51fe fix: deep markdown parsing 2026-04-13 15:01:15 -05:00
Brooklyn Nicholson ebe3270430 fix: fake models 2026-04-13 14:57:42 -05:00
Brooklyn Nicholson 77b97b810a chore: update how txt pasting ux feels 2026-04-13 14:49:10 -05:00
Brooklyn Nicholson 9db94e8521 Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 14:17:55 -05:00
Brooklyn Nicholson cac1b1b724 fix(ui-tui): surface RPC errors and guard invalid gateway responses 2026-04-13 14:17:52 -05:00
Ari Lotter 56524bb1d9 fix: nix local dev with tui 2026-04-13 15:09:31 -04:00
Brooklyn Nicholson 0642b6cc53 fix: clean newline paste thingy 2026-04-13 12:54:48 -05:00
Brooklyn Nicholson eec1db36f7 chore: preserve commands 2026-04-13 10:43:42 -05:00
Brooklyn Nicholson 713a614ea8 chore: uptick 2026-04-13 10:22:44 -05:00
Brooklyn Nicholson a27167fb30 chore: fmt 2026-04-13 10:14:05 -05:00
Brooklyn Nicholson a2c0597ae4 feat: show thinking indicator while inferencing 2026-04-13 10:11:18 -05:00
Brooklyn Nicholson 0fd33a98cd feat: ctrl t for diff thinking rendering types 2026-04-12 20:08:12 -05:00
Brooklyn Nicholson ddb0871769 feat(tui): hierarchical tool progress with grouped parent/child rows and transient line pruning 2026-04-12 17:39:17 -05:00
Brooklyn Nicholson e03bef684e chore: fmt 2026-04-12 16:33:25 -05:00
Brooklyn Nicholson 4b026d6761 fix: little box typey thing 2026-04-12 16:31:30 -05:00
Brooklyn Nicholson 8efd3db1b4 fix: force builds 2026-04-12 16:08:03 -05:00
Brooklyn Nicholson ef51bb0091 fix: tool drafting stuff 2026-04-12 16:06:39 -05:00
Ari Lotter 3bf0f39337 wrap preformatted ansi in <Ansi> component 2026-04-12 16:53:53 -04:00
Brooklyn Nicholson 690d62a6d1 Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-12 13:19:07 -05:00
Brooklyn Nicholson 2aea75e91e Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-12 13:18:55 -05:00
Austin Pickett 5552e1ffe1 Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 22:10:11 -04:00
Austin Pickett 90890f8f04 feat: personality selector 2026-04-11 22:10:02 -04:00
Ari Lotter 8e0df1d532 launch tui later to allow setup et al 2026-04-11 20:23:30 -04:00
Ari Lotter 29721fcc58 nix fixes 2026-04-11 19:35:00 -04:00
Brooklyn Nicholson a1d2a0c0fd feat: self update npm deps on hermes update 2026-04-11 18:29:18 -05:00
Brooklyn Nicholson ec553fdb49 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 17:15:41 -05:00
Brooklyn Nicholson 24a498eb90 feat: better markdown 2026-04-11 17:15:36 -05:00
Brooklyn Nicholson 9ccb490cf3 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 15:30:23 -05:00
Brooklyn Nicholson 32302c37dd feat: fix types and add type checking plus lazybundle on launch andddd dev flag 2026-04-11 14:42:28 -05:00
Ari Lotter 5e5e65f6d5 fix nix build 2026-04-11 15:30:37 -04:00
Brooklyn Nicholson acbf1794f2 Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 14:05:17 -05:00
Brooklyn Nicholson e2ea8934d4 feat: ensure feature parity once again 2026-04-11 14:02:36 -05:00
Austin Pickett 7e7f78f86c Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 15:00:28 -04:00
Austin Pickett 5fb6a4418b feat: panels 2026-04-11 14:29:24 -04:00
Brooklyn Nicholson bf6af95ff5 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 13:14:36 -05:00
Brooklyn Nicholson 3fd5cf6e3c feat: fix img pasting in new ink plus newline after tools 2026-04-11 13:14:32 -05:00
Brooklyn Nicholson b04248f4d5 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor
# Conflicts:
#	gateway/platforms/base.py
#	gateway/run.py
#	tests/gateway/test_command_bypass_active_session.py
2026-04-11 11:39:47 -05:00
Brooklyn Nicholson 7803d21bcc Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 11:39:19 -05:00
Brooklyn Nicholson 8760faf991 feat: fork ink and make it work nicely 2026-04-11 11:29:08 -05:00
jonny cab6447d58 fix(tui): render tool trail consistently between live and resume
Resumed sessions showed raw JSON tool output in content boxes instead
of the compact trail lines seen during live use. The root cause was
two separate rendering paths with no shared code.

Extract buildToolTrailLine() into lib/text.ts as the single source
of truth for formatting tool trail lines. Both the live tool.complete
handler and toTranscriptMessages now call it.

Server-side, reconstruct tool name and args from the assistant
message's tool_calls field (tool_name column is unpopulated) and
pass them through _tool_ctx/build_tool_preview — the same path
the live tool.start callback uses.
2026-04-11 06:35:00 +00:00
jonny 57e8d44af8 fix(tui): preserve tool metadata in resumed session history
session.resume was building conversation history with only role and
content, stripping tool_call_id, tool_calls, and tool_name. The API
requires tool messages to reference their parent tool_call, so resumed
sessions with tool history would fail with HTTP 500.

Use get_messages_as_conversation() which already preserves the full
message structure including tool metadata and reasoning fields.
2026-04-11 05:23:44 +00:00
jonny cb79018977 fix(tui): improve session picker readability
- Show full session ID in a fixed-width column for easy scanning
- Pad row numbers to 2 digits to keep alignment past 9 entries
- Always show session source (tui/cli) instead of conditionally hiding it
- Use Box-based column layout so ID, metadata, and title don't run together
2026-04-10 11:16:41 +00:00
jonny 90f0aa174d fix(tui): support /resume <id> to bypass session picker
- Extract resumeById callback from inline onSelect handler
- /resume with no arg opens picker (unchanged behavior)
- /resume <id> resumes directly, skipping the picker
2026-04-10 11:00:08 +00:00
jonny 304f1463a9 fix(tui): show CLI sessions in resume picker
- session.list RPC now queries both tui and cli sources, merged by recency
- Session picker shows source label for non-tui sessions (e.g. ", cli")
- Added source field to SessionItem interface
2026-04-10 09:34:01 +00:00
jonny 294c377c0c fix(tui): use PROJECT_ROOT instead of cwd for HERMES_ROOT fallback
When HERMES_ROOT was added for Nix-bundled TUI support, the fallback
was set to os.getcwd(). This overrode the TUI's own import.meta.dirname
resolution, so launching `hermes --tui` from outside the repo caused
the gateway client to look for venv/bin/python relative to the user's
working directory instead of the repo root.

Use PROJECT_ROOT (resolved from the source file location) as the
fallback, which is stable regardless of where the command is invoked.
2026-04-10 09:18:06 +00:00
Ari Lotter 660379637a one more nix fix 2026-04-10 01:41:29 -04:00
Ari Lotter bc80848e49 update lockfile 2026-04-10 00:50:39 -04:00
Ari Lotter 658cd2dd4c nix: add tui lockfile update script 2026-04-10 00:46:37 -04:00
Brooklyn Nicholson 8c1ba639c6 Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-09 23:35:29 -05:00
Brooklyn Nicholson 17a9c47178 feat: support shift enter for ghostty etc 2026-04-09 23:35:25 -05:00
Austin Pickett e1df13cf20 fix: menus 2026-04-10 00:01:37 -04:00
Brooklyn Nicholson 4fe78d5b88 chore: fix bad merge apparently? 2026-04-09 19:17:06 -05:00
Brooklyn Nicholson aa5b697a9d Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-09 19:12:31 -05:00
Brooklyn Nicholson aca479c1ae Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-09 19:08:52 -05:00
Brooklyn Nicholson b85ff282bc feat(ui-tui): slash command history/display, CoT fade, live skin switch, fix double reasoning 2026-04-09 19:08:47 -05:00
Austin Pickett f805323517 chore: merge main 2026-04-09 20:00:34 -04:00
Austin Pickett 4406b4b100 fix: add delete support 2026-04-09 19:53:55 -04:00
Brooklyn Nicholson 17ecdce936 feat: add slash commands to the history so it doesnt get lost 2026-04-09 18:51:17 -05:00
Brooklyn Nicholson 7e813a30e0 fix: sexier cots 2026-04-09 18:33:25 -05:00
Brooklyn Nicholson 6e24b9947e feat(ui-tui): render tool calls inline in message flow instead of activity lane 2026-04-09 17:40:30 -05:00
Brooklyn Nicholson 99fd3b518d feat: add /copy and /agents 2026-04-09 17:19:36 -05:00
Brooklyn Nicholson c5511bbc5a fix: leading ./ thingy 2026-04-09 16:27:06 -05:00
Brooklyn Nicholson b7d4ea1550 feat: better hyperlink formatting 2026-04-09 15:13:43 -05:00
Ari Lotter 74241328f0 direnv: watch lockfiles/nix files; gitignore .nix-stamps 2026-04-09 15:50:24 -04:00
Ari Lotter df5874c119 nix: add bundled TUI build-time verification check 2026-04-09 15:50:24 -04:00
Ari Lotter 21afb3fa3c nix: delegate devShell setup to package passthru hooks
- use inputsFrom to inherit build inputs from packages
- concat passthru.devShellHook from each package
2026-04-09 15:50:24 -04:00
Ari Lotter 31b2c12f0f nix: bundle TUI in main package with passthru hooks
- build tui.nix, copy to $out/ui-tui/ (same layout as dev)
- set HERMES_TUI_DIR, HERMES_PYTHON in wrapper
- add passthru.devShellHook with stamp-checked venv setup
- expose tui as separate package output
2026-04-09 15:50:24 -04:00
Ari Lotter 405c1b4e84 nix: add TUI derivation with buildNpmPackage
- fetchNpmDeps for reproducibilty
- compile ts to js
- passthru.devShellHook for dev shell stamp-checked auto dep install
2026-04-09 15:50:24 -04:00
Ari Lotter 5ff96551d5 cli: support bundled TUI at HERMES_TUI_DIR (for nix)
- Fix cwd to use bundled TUI dir, not PROJECT_ROOT
- Set HERMES_ROOT from env with cwd fallback
2026-04-09 15:50:24 -04:00
Ari Lotter 2b4272ef5b ui-tui: update package-lock.json 2026-04-09 15:35:54 -04:00
Ari Lotter 670dcea8f4 ui-tui: add tsc build pipeline
- Switch tsconfig to nodenext module resolution for Node 22 (used by
installer script)
- Add shebang to entry.tsx, preserved into index.js
- Add HERMES_ROOT env var fallback for repo root resolution
2026-04-09 15:35:29 -04:00
Brooklyn Nicholson 17f13013eb chore: fmt 2026-04-09 14:17:45 -05:00
Brooklyn Nicholson 00e1d42b9e feat: image pasting 2026-04-09 13:45:23 -05:00
Brooklyn Nicholson b2ea9b4176 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-09 12:31:20 -05:00
Brooklyn Nicholson 0d7c19a42f fix(ui-tui): ref-based input buffer, gateway listener stability, usage display, and 6 correctness bugs 2026-04-09 12:21:24 -05:00
Brooklyn Nicholson 8755b9dfc0 fix: resizing etc 2026-04-09 00:46:35 -05:00
Brooklyn Nicholson 54bd25ff4a fix(tui): -c resume, ctrl z, pasting updates, exit summary, session fix 2026-04-09 00:36:53 -05:00
Brooklyn Nicholson b66550ed08 fix(tui): stabilize multiline input, persist tool traces, and port CLI-style context status bar 2026-04-08 23:59:56 -05:00
Brooklyn Nicholson c49bbbe8c2 chore: fmt 2026-04-08 22:02:38 -05:00
Brooklyn Nicholson 9d8f9765c1 feat: add tests and update mds 2026-04-08 19:31:25 -05:00
Brooklyn Nicholson f226e6be10 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-08 19:11:44 -05:00
Brooklyn Nicholson a435c7274a chore: uptick 2026-04-08 14:22:36 -05:00
Brooklyn Nicholson b597123489 feat: better bg tasks 2026-04-08 14:18:37 -05:00
Brooklyn Nicholson af0f4a52fe feat: cute spinners 2026-04-08 13:45:34 -05:00
Brooklyn Nicholson b50d81f212 fix: diff colours 2026-04-08 12:11:55 -05:00
Brooklyn Nicholson a9fa054df9 chore: uptick 2026-04-08 10:35:07 -05:00
Brooklyn Nicholson 31cb23890a Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-08 09:46:46 -05:00
Brooklyn Nicholson a3cfb1de86 feat: auto install tui deps 2026-04-08 09:46:40 -05:00
Austin Pickett 371efafc46 feat: personality 2026-04-08 00:15:15 -04:00
Austin Pickett ebd2d83ef2 feat: add skin logo support 2026-04-07 23:59:11 -04:00
Brooklyn Nicholson af077b2c0d fix: history up arrow 2026-04-07 20:47:59 -05:00
Brooklyn Nicholson 2d884ff12d chore: uptick 2026-04-07 20:46:59 -05:00
Brooklyn Nicholson b397c91d4a chore: uptick 2026-04-07 20:44:18 -05:00
Brooklyn Nicholson 9c2c9e3a3e chore: fmt 2026-04-07 20:30:22 -05:00
Brooklyn Nicholson c3eeb03e26 chore: clean exit 2026-04-07 20:29:31 -05:00
Brooklyn Nicholson d9d0ac06b9 chore: readme update 2026-04-07 20:24:46 -05:00
Brooklyn Nicholson 29f2610e4b tui updates for rendering pipeline 2026-04-07 20:11:05 -05:00
Brooklyn Nicholson dcb97f7465 chore: readme 2026-04-06 18:52:45 -05:00
Brooklyn Nicholson 86308b6de4 chore: better command support 2026-04-06 18:49:40 -05:00
Brooklyn Nicholson 2d349bbf7a chore: fmt 2026-04-06 18:43:00 -05:00
Brooklyn Nicholson 39878aff00 chore: uptick 2026-04-06 18:40:21 -05:00
Brooklyn Nicholson afd670a36f feat: small refactors 2026-04-06 18:38:13 -05:00
Brooklyn Nicholson e2b3b1c5e4 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-06 17:56:45 -05:00
Brooklyn Nicholson 4c7d5ec778 tui: add tui arg 2026-04-05 18:55:59 -05:00
Brooklyn Nicholson f116c59071 tui: inherit Python-side rendering via gateway bridge 2026-04-05 18:50:41 -05:00
Brooklyn Nicholson 0f556a17f5 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-05 18:24:10 -05:00
Brooklyn Nicholson ee92460763 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-04 16:35:13 -05:00
Brooklyn Nicholson 2893e9df71 feat: add image pasting capability 2026-04-04 13:00:55 -05:00
Brooklyn Nicholson 5a5d90c85a chore: formatting etc 2026-04-03 20:14:57 -05:00
Brooklyn Nicholson 56a69e519b chore: uptick 2026-04-03 19:55:15 -05:00
Brooklyn Nicholson fab4d8d470 chore: uptick 2026-04-03 19:52:50 -05:00
Brooklyn Nicholson 1218994992 chore: uptick 2026-04-03 14:44:50 -05:00
Brooklyn Nicholson f4bf57ff7a chore: uptick 2026-04-02 23:00:38 -05:00
Brooklyn Nicholson bbba9ed4f2 feat: split apart main.tsx 2026-04-02 20:39:52 -05:00
Brooklyn Nicholson 2818dd8611 feat: add prettier etc for ui-tui 2026-04-02 19:34:30 -05:00
Brooklyn Nicholson 2ea5345a7b feat: new tui based on ink 2026-04-02 19:07:53 -05:00
553 changed files with 87355 additions and 6550 deletions
+4
View File
@@ -1 +1,5 @@
watch_file pyproject.toml uv.lock
watch_file ui-tui/package-lock.json ui-tui/package.json
watch_file flake.nix flake.lock nix/devShell.nix nix/tui.nix nix/package.nix nix/python.nix
use flake
+1
View File
@@ -60,5 +60,6 @@ mini-swe-agent/
# Nix
.direnv/
.nix-stamps/
result
website/static/api/skills-index.json
+104 -6
View File
@@ -56,6 +56,19 @@ hermes-agent/
│ ├── run.py # Main loop, slash commands, message dispatch
│ ├── session.py # SessionStore — conversation persistence
│ └── platforms/ # Adapters: telegram, discord, slack, whatsapp, homeassistant, signal, qqbot
├── ui-tui/ # Ink (React) terminal UI — `hermes --tui`
│ ├── src/entry.tsx # TTY gate + render()
│ ├── src/app.tsx # Main state machine and UI
│ ├── src/gatewayClient.ts # Child process + JSON-RPC bridge
│ ├── src/app/ # Decomposed app logic (event handler, slash handler, stores, hooks)
│ ├── src/components/ # Ink components (branding, markdown, prompts, pickers, etc.)
│ ├── src/hooks/ # useCompletion, useInputHistory, useQueue, useVirtualHistory
│ └── src/lib/ # Pure helpers (history, osc52, text, rpc, messages)
├── tui_gateway/ # Python JSON-RPC backend for the TUI
│ ├── entry.py # stdio entrypoint
│ ├── server.py # RPC handlers and session logic
│ ├── render.py # Optional rich/ANSI bridge
│ └── slash_worker.py # Persistent HermesCLI subprocess for slash commands
├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains integration)
├── cron/ # Scheduler (jobs.py, scheduler.py)
├── environments/ # RL training environments (Atropos)
@@ -179,6 +192,59 @@ if canonical == "mycommand":
---
## TUI Architecture (ui-tui + tui_gateway)
The TUI is a full replacement for the classic (prompt_toolkit) CLI, activated via `hermes --tui` or `HERMES_TUI=1`.
### Process Model
```
hermes --tui
└─ Node (Ink) ──stdio JSON-RPC── Python (tui_gateway)
│ └─ AIAgent + tools + sessions
└─ renders transcript, composer, prompts, activity
```
TypeScript owns the screen. Python owns sessions, tools, model calls, and slash command logic.
### Transport
Newline-delimited JSON-RPC over stdio. Requests from Ink, events from Python. See `tui_gateway/server.py` for the full method/event catalog.
### Key Surfaces
| Surface | Ink component | Gateway method |
|---------|---------------|----------------|
| Chat streaming | `app.tsx` + `messageLine.tsx` | `prompt.submit``message.delta/complete` |
| Tool activity | `thinking.tsx` | `tool.start/progress/complete` |
| Approvals | `prompts.tsx` | `approval.respond``approval.request` |
| Clarify/sudo/secret | `prompts.tsx`, `maskedPrompt.tsx` | `clarify/sudo/secret.respond` |
| Session picker | `sessionPicker.tsx` | `session.list/resume` |
| Slash commands | Local handler + fallthrough | `slash.exec``_SlashWorker`, `command.dispatch` |
| Completions | `useCompletion` hook | `complete.slash`, `complete.path` |
| Theming | `theme.ts` + `branding.tsx` | `gateway.ready` with skin data |
### Slash Command Flow
1. Built-in client commands (`/help`, `/quit`, `/clear`, `/resume`, `/copy`, `/paste`, etc.) handled locally in `app.tsx`
2. Everything else → `slash.exec` (runs in persistent `_SlashWorker` subprocess) → `command.dispatch` fallback
### Dev Commands
```bash
cd ui-tui
npm install # first time
npm run dev # watch mode (rebuilds hermes-ink + tsx --watch)
npm start # production
npm run build # full build (hermes-ink + tsc)
npm run type-check # typecheck only (tsc --noEmit)
npm run lint # eslint
npm run fmt # prettier
npm test # vitest
```
---
## Adding New Tools
Requires changes in **2 files**:
@@ -458,13 +524,45 @@ def profile_env(tmp_path, monkeypatch):
## Testing
**ALWAYS use `scripts/run_tests.sh`** — do not call `pytest` directly. The script enforces
hermetic environment parity with CI (unset credential vars, TZ=UTC, LANG=C.UTF-8,
4 xdist workers matching GHA ubuntu-latest). Direct `pytest` on a 16+ core
developer machine with API keys set diverges from CI in ways that have caused
multiple "works locally, fails in CI" incidents (and the reverse).
```bash
source venv/bin/activate
python -m pytest tests/ -q # Full suite (~3000 tests, ~3 min)
python -m pytest tests/test_model_tools.py -q # Toolset resolution
python -m pytest tests/test_cli_init.py -q # CLI config loading
python -m pytest tests/gateway/ -q # Gateway tests
python -m pytest tests/tools/ -q # Tool-level tests
scripts/run_tests.sh # full suite, CI-parity
scripts/run_tests.sh tests/gateway/ # one directory
scripts/run_tests.sh tests/agent/test_foo.py::test_x # one test
scripts/run_tests.sh -v --tb=long # pass-through pytest flags
```
### Why the wrapper (and why the old "just call pytest" doesn't work)
Five real sources of local-vs-CI drift the script closes:
| | Without wrapper | With wrapper |
|---|---|---|
| Provider API keys | Whatever is in your env (auto-detects pool) | All `*_API_KEY`/`*_TOKEN`/etc. unset |
| HOME / `~/.hermes/` | Your real config+auth.json | Temp dir per test |
| Timezone | Local TZ (PDT etc.) | UTC |
| Locale | Whatever is set | C.UTF-8 |
| xdist workers | `-n auto` = all cores (20+ on a workstation) | `-n 4` matching CI |
`tests/conftest.py` also enforces points 1-4 as an autouse fixture so ANY pytest
invocation (including IDE integrations) gets hermetic behavior — but the wrapper
is belt-and-suspenders.
### Running without the wrapper (only if you must)
If you can't use the wrapper (e.g. on Windows or inside an IDE that shells
pytest directly), at minimum activate the venv and pass `-n 4`:
```bash
source venv/bin/activate
python -m pytest tests/ -q -n 4
```
Worker count above 4 will surface test-ordering flakes that CI never sees.
Always run the full suite before pushing changes.
+9 -2
View File
@@ -13,7 +13,7 @@
**The self-improving AI agent built by [Nous Research](https://nousresearch.com).** It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.
Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [Xiaomi MiMo](https://platform.xiaomimimo.com), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), [Hugging Face](https://huggingface.co), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [NVIDIA NIM](https://build.nvidia.com) (Nemotron), [Xiaomi MiMo](https://platform.xiaomimimo.com), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), [Hugging Face](https://huggingface.co), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
<table>
<tr><td><b>A real terminal interface</b></td><td>Full TUI with multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output.</td></tr>
@@ -141,11 +141,18 @@ See `hermes claw migrate --help` for all options, or use the `openclaw-migration
We welcome contributions! See the [Contributing Guide](https://hermes-agent.nousresearch.com/docs/developer-guide/contributing) for development setup, code style, and PR process.
Quick start for contributors:
Quick start for contributors — clone and go with `setup-hermes.sh`:
```bash
git clone https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
./setup-hermes.sh # installs uv, creates venv, installs .[all], symlinks ~/.local/bin/hermes
./hermes # auto-detects the venv, no need to `source` first
```
Manual path (equivalent to the above):
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv venv --python 3.11
source venv/bin/activate
+20 -1
View File
@@ -49,6 +49,7 @@ def make_tool_progress_cb(
session_id: str,
loop: asyncio.AbstractEventLoop,
tool_call_ids: Dict[str, Deque[str]],
tool_call_meta: Dict[str, Dict[str, Any]],
) -> Callable:
"""Create a ``tool_progress_callback`` for AIAgent.
@@ -84,6 +85,16 @@ def make_tool_progress_cb(
tool_call_ids[name] = queue
queue.append(tc_id)
snapshot = None
if name in {"write_file", "patch", "skill_manage"}:
try:
from agent.display import capture_local_edit_snapshot
snapshot = capture_local_edit_snapshot(name, args)
except Exception:
logger.debug("Failed to capture ACP edit snapshot for %s", name, exc_info=True)
tool_call_meta[tc_id] = {"args": args, "snapshot": snapshot}
update = build_tool_start(tc_id, name, args)
_send_update(conn, session_id, loop, update)
@@ -119,6 +130,7 @@ def make_step_cb(
session_id: str,
loop: asyncio.AbstractEventLoop,
tool_call_ids: Dict[str, Deque[str]],
tool_call_meta: Dict[str, Dict[str, Any]],
) -> Callable:
"""Create a ``step_callback`` for AIAgent.
@@ -132,10 +144,12 @@ def make_step_cb(
for tool_info in prev_tools:
tool_name = None
result = None
function_args = None
if isinstance(tool_info, dict):
tool_name = tool_info.get("name") or tool_info.get("function_name")
result = tool_info.get("result") or tool_info.get("output")
function_args = tool_info.get("arguments") or tool_info.get("args")
elif isinstance(tool_info, str):
tool_name = tool_info
@@ -145,8 +159,13 @@ def make_step_cb(
tool_call_ids[tool_name] = queue
if tool_name and queue:
tc_id = queue.popleft()
meta = tool_call_meta.pop(tc_id, {})
update = build_tool_complete(
tc_id, tool_name, result=str(result) if result is not None else None
tc_id,
tool_name,
result=str(result) if result is not None else None,
function_args=function_args or meta.get("args"),
snapshot=meta.get("snapshot"),
)
_send_update(conn, session_id, loop, update)
if not queue:
+148 -30
View File
@@ -26,6 +26,7 @@ from acp.schema import (
McpServerHttp,
McpServerSse,
McpServerStdio,
ModelInfo,
NewSessionResponse,
PromptResponse,
ResumeSessionResponse,
@@ -36,6 +37,7 @@ from acp.schema import (
SessionCapabilities,
SessionForkCapabilities,
SessionListCapabilities,
SessionModelState,
SessionResumeCapabilities,
SessionInfo,
TextContentBlock,
@@ -147,6 +149,98 @@ class HermesACPAgent(acp.Agent):
self._conn = conn
logger.info("ACP client connected")
@staticmethod
def _encode_model_choice(provider: str | None, model: str | None) -> str:
"""Encode a model selection so ACP clients can keep provider context."""
raw_model = str(model or "").strip()
if not raw_model:
return ""
raw_provider = str(provider or "").strip().lower()
if not raw_provider:
return raw_model
return f"{raw_provider}:{raw_model}"
def _build_model_state(self, state: SessionState) -> SessionModelState | None:
"""Return the ACP model selector payload for editors like Zed."""
model = str(state.model or getattr(state.agent, "model", "") or "").strip()
provider = getattr(state.agent, "provider", None) or detect_provider() or "openrouter"
try:
from hermes_cli.models import curated_models_for_provider, normalize_provider, provider_label
normalized_provider = normalize_provider(provider)
provider_name = provider_label(normalized_provider)
available_models: list[ModelInfo] = []
seen_ids: set[str] = set()
for model_id, description in curated_models_for_provider(normalized_provider):
rendered_model = str(model_id or "").strip()
if not rendered_model:
continue
choice_id = self._encode_model_choice(normalized_provider, rendered_model)
if choice_id in seen_ids:
continue
desc_parts = [f"Provider: {provider_name}"]
if description:
desc_parts.append(str(description).strip())
if rendered_model == model:
desc_parts.append("current")
available_models.append(
ModelInfo(
model_id=choice_id,
name=rendered_model,
description="".join(part for part in desc_parts if part),
)
)
seen_ids.add(choice_id)
current_model_id = self._encode_model_choice(normalized_provider, model)
if current_model_id and current_model_id not in seen_ids:
available_models.insert(
0,
ModelInfo(
model_id=current_model_id,
name=model,
description=f"Provider: {provider_name} • current",
),
)
if available_models:
return SessionModelState(
available_models=available_models,
current_model_id=current_model_id or available_models[0].model_id,
)
except Exception:
logger.debug("Could not build ACP model state", exc_info=True)
if not model:
return None
fallback_choice = self._encode_model_choice(provider, model)
return SessionModelState(
available_models=[ModelInfo(model_id=fallback_choice, name=model)],
current_model_id=fallback_choice,
)
@staticmethod
def _resolve_model_selection(raw_model: str, current_provider: str) -> tuple[str, str]:
"""Resolve ``provider:model`` input into the provider and normalized model id."""
target_provider = current_provider
new_model = raw_model.strip()
try:
from hermes_cli.models import detect_provider_for_model, parse_model_input
target_provider, new_model = parse_model_input(new_model, current_provider)
if target_provider == current_provider:
detected = detect_provider_for_model(new_model, current_provider)
if detected:
target_provider, new_model = detected
except Exception:
logger.debug("Provider detection failed, using model as-is", exc_info=True)
return target_provider, new_model
async def _register_session_mcp_servers(
self,
state: SessionState,
@@ -273,7 +367,10 @@ class HermesACPAgent(acp.Agent):
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("New session %s (cwd=%s)", state.session_id, cwd)
self._schedule_available_commands_update(state.session_id)
return NewSessionResponse(session_id=state.session_id)
return NewSessionResponse(
session_id=state.session_id,
models=self._build_model_state(state),
)
async def load_session(
self,
@@ -289,7 +386,7 @@ class HermesACPAgent(acp.Agent):
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Loaded session %s", session_id)
self._schedule_available_commands_update(session_id)
return LoadSessionResponse()
return LoadSessionResponse(models=self._build_model_state(state))
async def resume_session(
self,
@@ -305,7 +402,7 @@ class HermesACPAgent(acp.Agent):
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Resumed session %s", state.session_id)
self._schedule_available_commands_update(state.session_id)
return ResumeSessionResponse()
return ResumeSessionResponse(models=self._build_model_state(state))
async def cancel(self, session_id: str, **kwargs: Any) -> None:
state = self.session_manager.get_session(session_id)
@@ -340,11 +437,20 @@ class HermesACPAgent(acp.Agent):
cwd: str | None = None,
**kwargs: Any,
) -> ListSessionsResponse:
infos = self.session_manager.list_sessions()
sessions = [
SessionInfo(session_id=s["session_id"], cwd=s["cwd"])
for s in infos
]
infos = self.session_manager.list_sessions(cwd=cwd)
sessions = []
for s in infos:
updated_at = s.get("updated_at")
if updated_at is not None and not isinstance(updated_at, str):
updated_at = str(updated_at)
sessions.append(
SessionInfo(
session_id=s["session_id"],
cwd=s["cwd"],
title=s.get("title"),
updated_at=updated_at,
)
)
return ListSessionsResponse(sessions=sessions)
# ---- Prompt (core) ------------------------------------------------------
@@ -389,12 +495,13 @@ class HermesACPAgent(acp.Agent):
state.cancel_event.clear()
tool_call_ids: dict[str, Deque[str]] = defaultdict(deque)
tool_call_meta: dict[str, dict[str, Any]] = {}
previous_approval_cb = None
if conn:
tool_progress_cb = make_tool_progress_cb(conn, session_id, loop, tool_call_ids)
tool_progress_cb = make_tool_progress_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
thinking_cb = make_thinking_cb(conn, session_id, loop)
step_cb = make_step_cb(conn, session_id, loop, tool_call_ids)
step_cb = make_step_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
message_cb = make_message_cb(conn, session_id, loop)
approval_cb = make_approval_callback(conn.request_permission, loop, session_id)
else:
@@ -449,6 +556,19 @@ class HermesACPAgent(acp.Agent):
self.session_manager.save_session(session_id)
final_response = result.get("final_response", "")
if final_response:
try:
from agent.title_generator import maybe_auto_title
maybe_auto_title(
self.session_manager._get_db(),
session_id,
user_text,
final_response,
state.history,
)
except Exception:
logger.debug("Failed to auto-title ACP session %s", session_id, exc_info=True)
if final_response and conn:
update = acp.update_agent_message_text(final_response)
await conn.session_update(session_id, update)
@@ -556,27 +676,15 @@ class HermesACPAgent(acp.Agent):
provider = getattr(state.agent, "provider", None) or "auto"
return f"Current model: {model}\nProvider: {provider}"
new_model = args.strip()
target_provider = None
current_provider = getattr(state.agent, "provider", None) or "openrouter"
# Auto-detect provider for the requested model
try:
from hermes_cli.models import parse_model_input, detect_provider_for_model
target_provider, new_model = parse_model_input(new_model, current_provider)
if target_provider == current_provider:
detected = detect_provider_for_model(new_model, current_provider)
if detected:
target_provider, new_model = detected
except Exception:
logger.debug("Provider detection failed, using model as-is", exc_info=True)
target_provider, new_model = self._resolve_model_selection(args, current_provider)
state.model = new_model
state.agent = self.session_manager._make_agent(
session_id=state.session_id,
cwd=state.cwd,
model=new_model,
requested_provider=target_provider or current_provider,
requested_provider=target_provider,
)
self.session_manager.save_session(state.session_id)
provider_label = getattr(state.agent, "provider", None) or target_provider or current_provider
@@ -678,20 +786,30 @@ class HermesACPAgent(acp.Agent):
"""Switch the model for a session (called by ACP protocol)."""
state = self.session_manager.get_session(session_id)
if state:
state.model = model_id
current_provider = getattr(state.agent, "provider", None)
current_base_url = getattr(state.agent, "base_url", None)
current_api_mode = getattr(state.agent, "api_mode", None)
requested_provider, resolved_model = self._resolve_model_selection(
model_id,
current_provider or "openrouter",
)
state.model = resolved_model
provider_changed = bool(current_provider and requested_provider != current_provider)
current_base_url = None if provider_changed else getattr(state.agent, "base_url", None)
current_api_mode = None if provider_changed else getattr(state.agent, "api_mode", None)
state.agent = self.session_manager._make_agent(
session_id=session_id,
cwd=state.cwd,
model=model_id,
requested_provider=current_provider,
model=resolved_model,
requested_provider=requested_provider,
base_url=current_base_url,
api_mode=current_api_mode,
)
self.session_manager.save_session(session_id)
logger.info("Session %s: model switched to %s", session_id, model_id)
logger.info(
"Session %s: model switched to %s via provider %s",
session_id,
resolved_model,
requested_provider,
)
return SetSessionModelResponse()
logger.warning("Session %s: model switch requested for missing session", session_id)
return None
+127 -34
View File
@@ -13,8 +13,12 @@ from hermes_constants import get_hermes_home
import copy
import json
import logging
import os
import re
import sys
import time
import uuid
from datetime import datetime, timezone
from dataclasses import dataclass, field
from threading import Lock
from typing import Any, Dict, List, Optional
@@ -22,6 +26,64 @@ from typing import Any, Dict, List, Optional
logger = logging.getLogger(__name__)
def _normalize_cwd_for_compare(cwd: str | None) -> str:
raw = str(cwd or ".").strip()
if not raw:
raw = "."
expanded = os.path.expanduser(raw)
# Normalize Windows drive paths into the equivalent WSL mount form so
# ACP history filters match the same workspace across Windows and WSL.
match = re.match(r"^([A-Za-z]):[\\/](.*)$", expanded)
if match:
drive = match.group(1).lower()
tail = match.group(2).replace("\\", "/")
expanded = f"/mnt/{drive}/{tail}"
elif re.match(r"^/mnt/[A-Za-z]/", expanded):
expanded = f"/mnt/{expanded[5].lower()}/{expanded[7:]}"
return os.path.normpath(expanded)
def _build_session_title(title: Any, preview: Any, cwd: str | None) -> str:
explicit = str(title or "").strip()
if explicit:
return explicit
preview_text = str(preview or "").strip()
if preview_text:
return preview_text
leaf = os.path.basename(str(cwd or "").rstrip("/\\"))
return leaf or "New thread"
def _format_updated_at(value: Any) -> str | None:
if value is None:
return None
if isinstance(value, str) and value.strip():
return value
try:
return datetime.fromtimestamp(float(value), tz=timezone.utc).isoformat()
except Exception:
return None
def _updated_at_sort_key(value: Any) -> float:
if value is None:
return float("-inf")
if isinstance(value, (int, float)):
return float(value)
raw = str(value).strip()
if not raw:
return float("-inf")
try:
return datetime.fromisoformat(raw.replace("Z", "+00:00")).timestamp()
except Exception:
try:
return float(raw)
except Exception:
return float("-inf")
def _acp_stderr_print(*args, **kwargs) -> None:
"""Best-effort human-readable output sink for ACP stdio sessions.
@@ -162,47 +224,78 @@ class SessionManager:
logger.info("Forked ACP session %s -> %s", session_id, new_id)
return state
def list_sessions(self) -> List[Dict[str, Any]]:
def list_sessions(self, cwd: str | None = None) -> List[Dict[str, Any]]:
"""Return lightweight info dicts for all sessions (memory + database)."""
normalized_cwd = _normalize_cwd_for_compare(cwd) if cwd else None
db = self._get_db()
persisted_rows: dict[str, dict[str, Any]] = {}
if db is not None:
try:
for row in db.list_sessions_rich(source="acp", limit=1000):
persisted_rows[str(row["id"])] = dict(row)
except Exception:
logger.debug("Failed to load ACP sessions from DB", exc_info=True)
# Collect in-memory sessions first.
with self._lock:
seen_ids = set(self._sessions.keys())
results = [
{
"session_id": s.session_id,
"cwd": s.cwd,
"model": s.model,
"history_len": len(s.history),
}
for s in self._sessions.values()
]
results = []
for s in self._sessions.values():
history_len = len(s.history)
if history_len <= 0:
continue
if normalized_cwd and _normalize_cwd_for_compare(s.cwd) != normalized_cwd:
continue
persisted = persisted_rows.get(s.session_id, {})
preview = next(
(
str(msg.get("content") or "").strip()
for msg in s.history
if msg.get("role") == "user" and str(msg.get("content") or "").strip()
),
persisted.get("preview") or "",
)
results.append(
{
"session_id": s.session_id,
"cwd": s.cwd,
"model": s.model,
"history_len": history_len,
"title": _build_session_title(persisted.get("title"), preview, s.cwd),
"updated_at": _format_updated_at(
persisted.get("last_active") or persisted.get("started_at") or time.time()
),
}
)
# Merge any persisted sessions not currently in memory.
db = self._get_db()
if db is not None:
try:
rows = db.search_sessions(source="acp", limit=1000)
for row in rows:
sid = row["id"]
if sid in seen_ids:
continue
# Extract cwd from model_config JSON.
cwd = "."
mc = row.get("model_config")
if mc:
try:
cwd = json.loads(mc).get("cwd", ".")
except (json.JSONDecodeError, TypeError):
pass
results.append({
"session_id": sid,
"cwd": cwd,
"model": row.get("model") or "",
"history_len": row.get("message_count") or 0,
})
except Exception:
logger.debug("Failed to list ACP sessions from DB", exc_info=True)
for sid, row in persisted_rows.items():
if sid in seen_ids:
continue
message_count = int(row.get("message_count") or 0)
if message_count <= 0:
continue
# Extract cwd from model_config JSON.
session_cwd = "."
mc = row.get("model_config")
if mc:
try:
session_cwd = json.loads(mc).get("cwd", ".")
except (json.JSONDecodeError, TypeError):
pass
if normalized_cwd and _normalize_cwd_for_compare(session_cwd) != normalized_cwd:
continue
results.append({
"session_id": sid,
"cwd": session_cwd,
"model": row.get("model") or "",
"history_len": message_count,
"title": _build_session_title(row.get("title"), row.get("preview"), session_cwd),
"updated_at": _format_updated_at(row.get("last_active") or row.get("started_at")),
})
results.sort(key=lambda item: _updated_at_sort_key(item.get("updated_at")), reverse=True)
return results
def update_cwd(self, session_id: str, cwd: str) -> Optional[SessionState]:
+174 -9
View File
@@ -2,6 +2,7 @@
from __future__ import annotations
import json
import uuid
from typing import Any, Dict, List, Optional
@@ -96,6 +97,170 @@ def build_tool_title(tool_name: str, args: Dict[str, Any]) -> str:
return tool_name
def _build_patch_mode_content(patch_text: str) -> List[Any]:
"""Parse V4A patch mode input into ACP diff blocks when possible."""
if not patch_text:
return [acp.tool_content(acp.text_block(""))]
try:
from tools.patch_parser import OperationType, parse_v4a_patch
operations, error = parse_v4a_patch(patch_text)
if error or not operations:
return [acp.tool_content(acp.text_block(patch_text))]
content: List[Any] = []
for op in operations:
if op.operation == OperationType.UPDATE:
old_chunks: list[str] = []
new_chunks: list[str] = []
for hunk in op.hunks:
old_lines = [line.content for line in hunk.lines if line.prefix in (" ", "-")]
new_lines = [line.content for line in hunk.lines if line.prefix in (" ", "+")]
if old_lines or new_lines:
old_chunks.append("\n".join(old_lines))
new_chunks.append("\n".join(new_lines))
old_text = "\n...\n".join(chunk for chunk in old_chunks if chunk)
new_text = "\n...\n".join(chunk for chunk in new_chunks if chunk)
if old_text or new_text:
content.append(
acp.tool_diff_content(
path=op.file_path,
old_text=old_text or None,
new_text=new_text or "",
)
)
continue
if op.operation == OperationType.ADD:
added_lines = [line.content for hunk in op.hunks for line in hunk.lines if line.prefix == "+"]
content.append(
acp.tool_diff_content(
path=op.file_path,
new_text="\n".join(added_lines),
)
)
continue
if op.operation == OperationType.DELETE:
content.append(
acp.tool_diff_content(
path=op.file_path,
old_text=f"Delete file: {op.file_path}",
new_text="",
)
)
continue
if op.operation == OperationType.MOVE:
content.append(
acp.tool_content(acp.text_block(f"Move file: {op.file_path} -> {op.new_path}"))
)
return content or [acp.tool_content(acp.text_block(patch_text))]
except Exception:
return [acp.tool_content(acp.text_block(patch_text))]
def _strip_diff_prefix(path: str) -> str:
raw = str(path or "").strip()
if raw.startswith(("a/", "b/")):
return raw[2:]
return raw
def _parse_unified_diff_content(diff_text: str) -> List[Any]:
"""Convert unified diff text into ACP diff content blocks."""
if not diff_text:
return []
content: List[Any] = []
current_old_path: Optional[str] = None
current_new_path: Optional[str] = None
old_lines: list[str] = []
new_lines: list[str] = []
def _flush() -> None:
nonlocal current_old_path, current_new_path, old_lines, new_lines
if current_old_path is None and current_new_path is None:
return
path = current_new_path if current_new_path and current_new_path != "/dev/null" else current_old_path
if not path or path == "/dev/null":
current_old_path = None
current_new_path = None
old_lines = []
new_lines = []
return
content.append(
acp.tool_diff_content(
path=_strip_diff_prefix(path),
old_text="\n".join(old_lines) if old_lines else None,
new_text="\n".join(new_lines),
)
)
current_old_path = None
current_new_path = None
old_lines = []
new_lines = []
for line in diff_text.splitlines():
if line.startswith("--- "):
_flush()
current_old_path = line[4:].strip()
continue
if line.startswith("+++ "):
current_new_path = line[4:].strip()
continue
if line.startswith("@@"):
continue
if current_old_path is None and current_new_path is None:
continue
if line.startswith("+"):
new_lines.append(line[1:])
elif line.startswith("-"):
old_lines.append(line[1:])
elif line.startswith(" "):
shared = line[1:]
old_lines.append(shared)
new_lines.append(shared)
_flush()
return content
def _build_tool_complete_content(
tool_name: str,
result: Optional[str],
*,
function_args: Optional[Dict[str, Any]] = None,
snapshot: Any = None,
) -> List[Any]:
"""Build structured ACP completion content, falling back to plain text."""
display_result = result or ""
if len(display_result) > 5000:
display_result = display_result[:4900] + f"\n... ({len(result)} chars total, truncated)"
if tool_name in {"write_file", "patch", "skill_manage"}:
try:
from agent.display import extract_edit_diff
diff_text = extract_edit_diff(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
if isinstance(diff_text, str) and diff_text.strip():
diff_content = _parse_unified_diff_content(diff_text)
if diff_content:
return diff_content
except Exception:
pass
return [acp.tool_content(acp.text_block(display_result))]
# ---------------------------------------------------------------------------
# Build ACP content objects for tool-call events
# ---------------------------------------------------------------------------
@@ -119,9 +284,8 @@ def build_tool_start(
new = arguments.get("new_string", "")
content = [acp.tool_diff_content(path=path, new_text=new, old_text=old)]
else:
# Patch mode — show the patch content as text
patch_text = arguments.get("patch", "")
content = [acp.tool_content(acp.text_block(patch_text))]
content = _build_patch_mode_content(patch_text)
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
@@ -178,16 +342,17 @@ def build_tool_complete(
tool_call_id: str,
tool_name: str,
result: Optional[str] = None,
function_args: Optional[Dict[str, Any]] = None,
snapshot: Any = None,
) -> ToolCallProgress:
"""Create a ToolCallUpdate (progress) event for a completed tool call."""
kind = get_tool_kind(tool_name)
# Truncate very large results for the UI
display_result = result or ""
if len(display_result) > 5000:
display_result = display_result[:4900] + f"\n... ({len(result)} chars total, truncated)"
content = [acp.tool_content(acp.text_block(display_result))]
content = _build_tool_complete_content(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
return acp.update_tool_call(
tool_call_id,
kind=kind,
+78 -32
View File
@@ -94,6 +94,17 @@ def _normalize_aux_provider(provider: Optional[str]) -> str:
return "custom"
return _PROVIDER_ALIASES.get(normalized, normalized)
_FIXED_TEMPERATURE_MODELS: Dict[str, float] = {
"kimi-for-coding": 0.6,
}
def _fixed_temperature_for_model(model: Optional[str]) -> Optional[float]:
"""Return a required temperature override for models with strict contracts."""
normalized = (model or "").strip().lower()
return _FIXED_TEMPERATURE_MODELS.get(normalized)
# Default auxiliary models for direct API-key providers (cheap/fast for side tasks)
_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
"gemini": "gemini-3-flash-preview",
@@ -734,6 +745,15 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
elif "generativelanguage.googleapis.com" in base_url.lower():
# Google's OpenAI-compatible endpoint only accepts x-goog-api-key.
# Passing api_key= causes the SDK to inject Authorization: Bearer,
# which Google rejects with HTTP 400 "Multiple authentication
# credentials received". Use a placeholder for api_key and pass
# the real key via x-goog-api-key header instead.
# Fixes: https://github.com/NousResearch/hermes-agent/issues/7893
extra["default_headers"] = {"x-goog-api-key": api_key}
api_key = "not-used"
return OpenAI(api_key=api_key, base_url=base_url, **extra), model
creds = resolve_api_key_provider_credentials(provider_id)
@@ -755,6 +775,15 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
elif "generativelanguage.googleapis.com" in base_url.lower():
# Google's OpenAI-compatible endpoint only accepts x-goog-api-key.
# Passing api_key= causes the SDK to inject Authorization: Bearer,
# which Google rejects with HTTP 400 "Multiple authentication
# credentials received". Use a placeholder for api_key and pass
# the real key via x-goog-api-key header instead.
# Fixes: https://github.com/NousResearch/hermes-agent/issues/7893
extra["default_headers"] = {"x-goog-api-key": api_key}
api_key = "not-used"
return OpenAI(api_key=api_key, base_url=base_url, **extra), model
return None, None
@@ -1064,8 +1093,6 @@ _AUTO_PROVIDER_LABELS = {
"_resolve_api_key_provider": "api-key",
}
_AGGREGATOR_PROVIDERS = frozenset({"openrouter", "nous"})
_MAIN_RUNTIME_FIELDS = ("provider", "model", "base_url", "api_key", "api_mode")
@@ -1196,11 +1223,15 @@ def _resolve_auto(main_runtime: Optional[Dict[str, Any]] = None) -> Tuple[Option
"""Full auto-detection chain.
Priority:
1. If the user's main provider is NOT an aggregator (OpenRouter / Nous),
use their main provider + main model directly. This ensures users on
Alibaba, DeepSeek, ZAI, etc. get auxiliary tasks handled by the same
provider they already have credentials for — no OpenRouter key needed.
2. OpenRouter → Nous → custom → Codex → API-key providers (original chain).
1. User's main provider + main model, regardless of provider type.
This means auxiliary tasks (compression, vision, web extraction,
session search, etc.) use the same model the user configured for
chat. Users on OpenRouter/Nous get their chosen chat model; users
on DeepSeek/ZAI/Alibaba get theirs; etc. Running aux tasks on the
user's picked model keeps behavior predictable — no surprise
switches to a cheap fallback model for side tasks.
2. OpenRouter → Nous → custom → Codex → API-key providers (fallback
chain, only used when the main provider has no working client).
"""
global auxiliary_is_nous, _stale_base_url_warned
auxiliary_is_nous = False # Reset — _try_nous() will set True if it wins
@@ -1230,11 +1261,16 @@ def _resolve_auto(main_runtime: Optional[Dict[str, Any]] = None) -> Tuple[Option
)
_stale_base_url_warned = True
# ── Step 1: non-aggregator main provider → use main model directly ──
# ── Step 1: main provider + main model → use them directly ──
#
# This is the primary aux backend for every user. "auto" means
# "use my main chat model for side tasks as well" — including users
# on aggregators (OpenRouter, Nous) who previously got routed to a
# cheap provider-side default. Explicit per-task overrides set via
# config.yaml (auxiliary.<task>.provider) still win over this.
main_provider = runtime_provider or _read_main_provider()
main_model = runtime_model or _read_main_model()
if (main_provider and main_model
and main_provider not in _AGGREGATOR_PROVIDERS
and main_provider not in ("auto", "")):
resolved_provider = main_provider
explicit_base_url = None
@@ -1593,6 +1629,15 @@ def resolve_provider_client(
from hermes_cli.models import copilot_default_headers
headers.update(copilot_default_headers())
elif "generativelanguage.googleapis.com" in base_url.lower():
# Google's OpenAI-compatible endpoint only accepts x-goog-api-key.
# Passing api_key= causes the OpenAI SDK to inject Authorization: Bearer,
# which Google rejects with HTTP 400 "Multiple authentication credentials
# received". Use a placeholder for api_key and pass the real key via
# x-goog-api-key header instead.
# Fixes: https://github.com/NousResearch/hermes-agent/issues/7893
headers["x-goog-api-key"] = api_key
api_key = "not-used"
client = OpenAI(api_key=api_key, base_url=base_url,
**({"default_headers": headers} if headers else {}))
@@ -1817,34 +1862,31 @@ def resolve_vision_provider_client(
if requested == "auto":
# Vision auto-detection order:
# 1. Active provider + model (user's main chat config)
# 2. OpenRouter (known vision-capable default model)
# 3. Nous Portal (known vision-capable default model)
# 1. User's main provider + main model (including aggregators).
# _PROVIDER_VISION_MODELS provides per-provider vision model
# overrides when the provider has a dedicated multimodal model
# that differs from the chat model (e.g. xiaomi → mimo-v2-omni,
# zai → glm-5v-turbo).
# 2. OpenRouter (vision-capable aggregator fallback)
# 3. Nous Portal (vision-capable aggregator fallback)
# 4. Stop
main_provider = _read_main_provider()
main_model = _read_main_model()
if main_provider and main_provider not in ("auto", ""):
if main_provider in _VISION_AUTO_PROVIDER_ORDER:
# Known strict backend — use its defaults.
sync_client, default_model = _resolve_strict_vision_backend(main_provider)
if sync_client is not None:
return _finalize(main_provider, sync_client, default_model)
else:
# Exotic provider (DeepSeek, Alibaba, Xiaomi, named custom, etc.)
# Use provider-specific vision model if available, otherwise main model.
vision_model = _PROVIDER_VISION_MODELS.get(main_provider, main_model)
rpc_client, rpc_model = resolve_provider_client(
main_provider, vision_model,
api_mode=resolved_api_mode)
if rpc_client is not None:
logger.info(
"Vision auto-detect: using active provider %s (%s)",
main_provider, rpc_model or vision_model,
)
return _finalize(
main_provider, rpc_client, rpc_model or vision_model)
vision_model = _PROVIDER_VISION_MODELS.get(main_provider, main_model)
rpc_client, rpc_model = resolve_provider_client(
main_provider, vision_model,
api_mode=resolved_api_mode)
if rpc_client is not None:
logger.info(
"Vision auto-detect: using main provider %s (%s)",
main_provider, rpc_model or vision_model,
)
return _finalize(
main_provider, rpc_client, rpc_model or vision_model)
# Fall back through aggregators.
# Fall back through aggregators (uses their dedicated vision model,
# not the user's main model) when main provider has no client.
for candidate in _VISION_AUTO_PROVIDER_ORDER:
if candidate == main_provider:
continue # already tried above
@@ -2293,6 +2335,10 @@ def _build_call_kwargs(
"timeout": timeout,
}
fixed_temperature = _fixed_temperature_for_model(model)
if fixed_temperature is not None:
temperature = fixed_temperature
# Opus 4.7+ rejects any non-default temperature/top_p/top_k — silently
# drop here so auxiliary callers that hardcode temperature (e.g. 0.3 on
# flush_memories, 0 on structured-JSON extraction) don't 400 the moment
+22 -1
View File
@@ -1130,6 +1130,14 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
state = _load_provider_state(auth_store, "nous")
if state:
active_sources.add("device_code")
# Prefer a user-supplied label embedded in the singleton state
# (set by persist_nous_credentials(label=...) when the user ran
# `hermes auth add nous --label <name>`). Fall back to the
# auto-derived token fingerprint for logins that didn't supply one.
custom_label = str(state.get("label") or "").strip()
seeded_label = custom_label or label_from_token(
state.get("access_token", ""), "device_code"
)
changed |= _upsert_entry(
entries,
provider,
@@ -1148,7 +1156,7 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
"agent_key": state.get("agent_key"),
"agent_key_expires_at": state.get("agent_key_expires_at"),
"tls": state.get("tls") if isinstance(state.get("tls"), dict) else None,
"label": label_from_token(state.get("access_token", ""), "device_code"),
"label": seeded_label,
},
)
@@ -1208,6 +1216,19 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
logger.debug("Qwen OAuth token seed failed: %s", exc)
elif provider == "openai-codex":
# Respect user suppression — `hermes auth remove openai-codex` marks
# the device_code source as suppressed so it won't be re-seeded from
# either the Hermes auth store or ~/.codex/auth.json. Without this
# gate the removal is instantly undone on the next load_pool() call.
codex_suppressed = False
try:
from hermes_cli.auth import is_source_suppressed
codex_suppressed = is_source_suppressed(provider, "device_code")
except ImportError:
pass
if codex_suppressed:
return changed, active_sources
state = _load_provider_state(auth_store, "openai-codex")
tokens = state.get("tokens") if isinstance(state, dict) else None
# Fallback: import from Codex CLI (~/.codex/auth.json) if Hermes auth
+135 -4
View File
@@ -747,18 +747,149 @@ class GeminiCloudCodeClient:
def _gemini_http_error(response: httpx.Response) -> CodeAssistError:
"""Translate an httpx response into a CodeAssistError with rich metadata.
Parses Google's error envelope (``{"error": {"code", "message", "status",
"details": [...]}}``) so the agent's error classifier can reason about
the failure — ``status_code`` enables the rate_limit / auth classification
paths, and ``response`` lets the main loop honor ``Retry-After`` just
like it does for OpenAI SDK exceptions.
Also lifts a few recognizable Google conditions into human-readable
messages so the user sees something better than a 500-char JSON dump:
MODEL_CAPACITY_EXHAUSTED → "Gemini model capacity exhausted for
<model>. This is a Google-side throttle..."
RESOURCE_EXHAUSTED w/o reason → quota-style message
404 → "Model <name> not found at cloudcode-pa..."
"""
status = response.status_code
# Parse the body once, surviving any weird encodings.
body_text = ""
body_json: Dict[str, Any] = {}
try:
body = response.text[:500]
body_text = response.text
except Exception:
body = ""
# Let run_agent's retry logic see auth errors as rotatable via `api_key`
body_text = ""
if body_text:
try:
parsed = json.loads(body_text)
if isinstance(parsed, dict):
body_json = parsed
except (ValueError, TypeError):
body_json = {}
# Dig into Google's error envelope. Shape is:
# {"error": {"code": 429, "message": "...", "status": "RESOURCE_EXHAUSTED",
# "details": [{"@type": ".../ErrorInfo", "reason": "MODEL_CAPACITY_EXHAUSTED",
# "metadata": {...}},
# {"@type": ".../RetryInfo", "retryDelay": "30s"}]}}
err_obj = body_json.get("error") if isinstance(body_json, dict) else None
if not isinstance(err_obj, dict):
err_obj = {}
err_status = str(err_obj.get("status") or "").strip()
err_message = str(err_obj.get("message") or "").strip()
err_details_list = err_obj.get("details") if isinstance(err_obj.get("details"), list) else []
# Extract google.rpc.ErrorInfo reason + metadata. There may be more
# than one ErrorInfo (rare), so we pick the first one with a reason.
error_reason = ""
error_metadata: Dict[str, Any] = {}
retry_delay_seconds: Optional[float] = None
for detail in err_details_list:
if not isinstance(detail, dict):
continue
type_url = str(detail.get("@type") or "")
if not error_reason and type_url.endswith("/google.rpc.ErrorInfo"):
reason = detail.get("reason")
if isinstance(reason, str) and reason:
error_reason = reason
md = detail.get("metadata")
if isinstance(md, dict):
error_metadata = md
elif retry_delay_seconds is None and type_url.endswith("/google.rpc.RetryInfo"):
# retryDelay is a google.protobuf.Duration string like "30s" or "1.5s".
delay_raw = detail.get("retryDelay")
if isinstance(delay_raw, str) and delay_raw.endswith("s"):
try:
retry_delay_seconds = float(delay_raw[:-1])
except ValueError:
pass
elif isinstance(delay_raw, (int, float)):
retry_delay_seconds = float(delay_raw)
# Fall back to the Retry-After header if the body didn't include RetryInfo.
if retry_delay_seconds is None:
try:
header_val = response.headers.get("Retry-After") or response.headers.get("retry-after")
except Exception:
header_val = None
if header_val:
try:
retry_delay_seconds = float(header_val)
except (TypeError, ValueError):
retry_delay_seconds = None
# Classify the error code. ``code_assist_rate_limited`` stays the default
# for 429s; a more specific reason tag helps downstream callers (e.g. tests,
# logs) without changing the rate_limit classification path.
code = f"code_assist_http_{status}"
if status == 401:
code = "code_assist_unauthorized"
elif status == 429:
code = "code_assist_rate_limited"
if error_reason == "MODEL_CAPACITY_EXHAUSTED":
code = "code_assist_capacity_exhausted"
# Build a human-readable message. Keep the status + a raw-body tail for
# debugging, but lead with a friendlier summary when we recognize the
# Google signal.
model_hint = ""
if isinstance(error_metadata, dict):
model_hint = str(error_metadata.get("model") or error_metadata.get("modelId") or "").strip()
if status == 429 and error_reason == "MODEL_CAPACITY_EXHAUSTED":
target = model_hint or "this Gemini model"
message = (
f"Gemini capacity exhausted for {target} (Google-side throttle, "
f"not a Hermes issue). Try a different Gemini model or set a "
f"fallback_providers entry to a non-Gemini provider."
)
if retry_delay_seconds is not None:
message += f" Google suggests retrying in {retry_delay_seconds:g}s."
elif status == 429 and err_status == "RESOURCE_EXHAUSTED":
message = (
f"Gemini quota exhausted ({err_message or 'RESOURCE_EXHAUSTED'}). "
f"Check /gquota for remaining daily requests."
)
if retry_delay_seconds is not None:
message += f" Retry suggested in {retry_delay_seconds:g}s."
elif status == 404:
# Google returns 404 when a model has been retired or renamed.
target = model_hint or (err_message or "model")
message = (
f"Code Assist 404: {target} is not available at "
f"cloudcode-pa.googleapis.com. It may have been renamed or "
f"retired. Check hermes_cli/models.py for the current list."
)
elif err_message:
# Generic fallback with the parsed message.
message = f"Code Assist HTTP {status} ({err_status or 'error'}): {err_message}"
else:
# Last-ditch fallback — raw body snippet.
message = f"Code Assist returned HTTP {status}: {body_text[:500]}"
return CodeAssistError(
f"Code Assist returned HTTP {status}: {body}",
message,
code=code,
status_code=status,
response=response,
retry_after=retry_delay_seconds,
details={
"status": err_status,
"reason": error_reason,
"metadata": error_metadata,
"message": err_message,
},
)
+37 -1
View File
@@ -68,9 +68,45 @@ _ONBOARDING_POLL_INTERVAL_SECONDS = 5.0
class CodeAssistError(RuntimeError):
def __init__(self, message: str, *, code: str = "code_assist_error") -> None:
"""Exception raised by the Code Assist (``cloudcode-pa``) integration.
Carries HTTP status / response / retry-after metadata so the agent's
``error_classifier._extract_status_code`` and the main loop's Retry-After
handling (which walks ``error.response.headers``) pick up the right
signals. Without these, 429s from the OAuth path look like opaque
``RuntimeError`` and skip the rate-limit path.
"""
def __init__(
self,
message: str,
*,
code: str = "code_assist_error",
status_code: Optional[int] = None,
response: Any = None,
retry_after: Optional[float] = None,
details: Optional[Dict[str, Any]] = None,
) -> None:
super().__init__(message)
self.code = code
# ``status_code`` is picked up by ``agent.error_classifier._extract_status_code``
# so a 429 from Code Assist classifies as FailoverReason.rate_limit and
# triggers the main loop's fallback_providers chain the same way SDK
# errors do.
self.status_code = status_code
# ``response`` is the underlying ``httpx.Response`` (or a shim with a
# ``.headers`` mapping and ``.json()`` method). The main loop reads
# ``error.response.headers["Retry-After"]`` to honor Google's retry
# hints when the backend throttles us.
self.response = response
# Parsed ``Retry-After`` seconds (kept separately for convenience —
# Google returns retry hints in both the header and the error body's
# ``google.rpc.RetryInfo`` details, and we pick whichever we found).
self.retry_after = retry_after
# Parsed structured error details from the Google error envelope
# (e.g. ``{"reason": "MODEL_CAPACITY_EXHAUSTED", "status": "RESOURCE_EXHAUSTED"}``).
# Useful for logging and for tests that want to assert on specifics.
self.details = details or {}
class ProjectIdRequiredError(CodeAssistError):
+5 -26
View File
@@ -634,13 +634,7 @@ class InsightsEngine:
lines.append(f" Sessions: {o['total_sessions']:<12} Messages: {o['total_messages']:,}")
lines.append(f" Tool calls: {o['total_tool_calls']:<12,} User messages: {o['user_messages']:,}")
lines.append(f" Input tokens: {o['total_input_tokens']:<12,} Output tokens: {o['total_output_tokens']:,}")
cache_total = o.get("total_cache_read_tokens", 0) + o.get("total_cache_write_tokens", 0)
if cache_total > 0:
lines.append(f" Cache read: {o['total_cache_read_tokens']:<12,} Cache write: {o['total_cache_write_tokens']:,}")
cost_str = f"${o['estimated_cost']:.2f}"
if o.get("models_without_pricing"):
cost_str += " *"
lines.append(f" Total tokens: {o['total_tokens']:<12,} Est. cost: {cost_str}")
lines.append(f" Total tokens: {o['total_tokens']:,}")
if o["total_hours"] > 0:
lines.append(f" Active time: ~{_format_duration(o['total_hours'] * 3600):<11} Avg session: ~{_format_duration(o['avg_session_duration'])}")
lines.append(f" Avg msgs/session: {o['avg_messages_per_session']:.1f}")
@@ -650,16 +644,10 @@ class InsightsEngine:
if report["models"]:
lines.append(" 🤖 Models Used")
lines.append(" " + "" * 56)
lines.append(f" {'Model':<30} {'Sessions':>8} {'Tokens':>12} {'Cost':>8}")
lines.append(f" {'Model':<30} {'Sessions':>8} {'Tokens':>12}")
for m in report["models"]:
model_name = m["model"][:28]
if m.get("has_pricing"):
cost_cell = f"${m['cost']:>6.2f}"
else:
cost_cell = " N/A"
lines.append(f" {model_name:<30} {m['sessions']:>8} {m['total_tokens']:>12,} {cost_cell}")
if o.get("models_without_pricing"):
lines.append(" * Cost N/A for custom/self-hosted models")
lines.append(f" {model_name:<30} {m['sessions']:>8} {m['total_tokens']:>12,}")
lines.append("")
# Platform breakdown
@@ -739,15 +727,7 @@ class InsightsEngine:
# Overview
lines.append(f"**Sessions:** {o['total_sessions']} | **Messages:** {o['total_messages']:,} | **Tool calls:** {o['total_tool_calls']:,}")
cache_total = o.get("total_cache_read_tokens", 0) + o.get("total_cache_write_tokens", 0)
if cache_total > 0:
lines.append(f"**Tokens:** {o['total_tokens']:,} (in: {o['total_input_tokens']:,} / out: {o['total_output_tokens']:,} / cache: {cache_total:,})")
else:
lines.append(f"**Tokens:** {o['total_tokens']:,} (in: {o['total_input_tokens']:,} / out: {o['total_output_tokens']:,})")
cost_note = ""
if o.get("models_without_pricing"):
cost_note = " _(excludes custom/self-hosted models)_"
lines.append(f"**Est. cost:** ${o['estimated_cost']:.2f}{cost_note}")
lines.append(f"**Tokens:** {o['total_tokens']:,} (in: {o['total_input_tokens']:,} / out: {o['total_output_tokens']:,})")
if o["total_hours"] > 0:
lines.append(f"**Active time:** ~{_format_duration(o['total_hours'] * 3600)} | **Avg session:** ~{_format_duration(o['avg_session_duration'])}")
lines.append("")
@@ -756,8 +736,7 @@ class InsightsEngine:
if report["models"]:
lines.append("**🤖 Models:**")
for m in report["models"][:5]:
cost_str = f"${m['cost']:.2f}" if m.get("has_pricing") else "N/A"
lines.append(f" {m['model'][:25]}{m['sessions']} sessions, {m['total_tokens']:,} tokens, {cost_str}")
lines.append(f" {m['model'][:25]}{m['sessions']} sessions, {m['total_tokens']:,} tokens")
lines.append("")
# Platforms (if multi-platform)
+4 -1
View File
@@ -38,6 +38,7 @@ _PROVIDER_PREFIXES: frozenset[str] = frozenset({
"mimo", "xiaomi-mimo",
"arcee-ai", "arceeai",
"xai", "x-ai", "x.ai", "grok",
"nvidia", "nim", "nvidia-nim", "nemotron",
"qwen-portal",
})
@@ -124,7 +125,6 @@ DEFAULT_CONTEXT_LENGTHS = {
"gemini": 1048576,
# Gemma (open models served via AI Studio)
"gemma-4-31b": 256000,
"gemma-4-26b": 256000,
"gemma-3": 131072,
"gemma": 8192, # fallback for older gemma models
# DeepSeek
@@ -158,6 +158,8 @@ DEFAULT_CONTEXT_LENGTHS = {
"grok": 131072, # catch-all (grok-beta, unknown grok-*)
# Kimi
"kimi": 262144,
# Nemotron — NVIDIA's open-weights series (128K context across all sizes)
"nemotron": 131072,
# Arcee
"trinity": 262144,
# OpenRouter
@@ -240,6 +242,7 @@ _URL_TO_PROVIDER: Dict[str, str] = {
"api.fireworks.ai": "fireworks",
"opencode.ai": "opencode-go",
"api.x.ai": "xai",
"integrate.api.nvidia.com": "nvidia",
"api.xiaomimimo.com": "xiaomi",
"xiaomimimo.com": "xiaomi",
"ollama.com": "ollama-cloud",
+7 -6
View File
@@ -654,7 +654,7 @@ def build_skills_system_prompt(
):
continue
skills_by_category.setdefault(category, []).append(
(skill_name, entry.get("description", ""))
(frontmatter_name, entry.get("description", ""))
)
category_descriptions = {
str(k): str(v)
@@ -679,7 +679,7 @@ def build_skills_system_prompt(
):
continue
skills_by_category.setdefault(entry["category"], []).append(
(skill_name, entry["description"])
(entry["frontmatter_name"], entry["description"])
)
# Read category-level DESCRIPTION.md files
@@ -722,9 +722,10 @@ def build_skills_system_prompt(
continue
entry = _build_snapshot_entry(skill_file, ext_dir, frontmatter, desc)
skill_name = entry["skill_name"]
if skill_name in seen_skill_names:
frontmatter_name = entry["frontmatter_name"]
if frontmatter_name in seen_skill_names:
continue
if entry["frontmatter_name"] in disabled or skill_name in disabled:
if frontmatter_name in disabled or skill_name in disabled:
continue
if not _skill_should_show(
extract_skill_conditions(frontmatter),
@@ -732,9 +733,9 @@ def build_skills_system_prompt(
available_toolsets,
):
continue
seen_skill_names.add(skill_name)
seen_skill_names.add(frontmatter_name)
skills_by_category.setdefault(entry["category"], []).append(
(skill_name, entry["description"])
(frontmatter_name, entry["description"])
)
except Exception as e:
logger.debug("Error reading external skill %s: %s", skill_file, e)
+1
View File
@@ -24,6 +24,7 @@ model:
# "minimax" - MiniMax global (requires: MINIMAX_API_KEY)
# "minimax-cn" - MiniMax China (requires: MINIMAX_CN_API_KEY)
# "huggingface" - Hugging Face Inference (requires: HF_TOKEN)
# "nvidia" - NVIDIA NIM / build.nvidia.com (requires: NVIDIA_API_KEY)
# "xiaomi" - Xiaomi MiMo (requires: XIAOMI_API_KEY)
# "arcee" - Arcee AI Trinity models (requires: ARCEEAI_API_KEY)
# "ollama-cloud" - Ollama Cloud (requires: OLLAMA_API_KEY — https://ollama.com/settings)
+282 -25
View File
@@ -18,6 +18,8 @@ import os
import shutil
import sys
import json
import re
import base64
import atexit
import tempfile
import time
@@ -78,6 +80,42 @@ _project_env = Path(__file__).parent / '.env'
load_hermes_dotenv(hermes_home=_hermes_home, project_env=_project_env)
_REASONING_TAGS = (
"REASONING_SCRATCHPAD",
"think",
"reasoning",
"THINKING",
"thinking",
)
def _strip_reasoning_tags(text: str) -> str:
cleaned = text
for tag in _REASONING_TAGS:
cleaned = re.sub(rf"<{tag}>.*?</{tag}>\s*", "", cleaned, flags=re.DOTALL)
cleaned = re.sub(rf"<{tag}>.*$", "", cleaned, flags=re.DOTALL)
return cleaned.strip()
def _assistant_content_as_text(content: Any) -> str:
if content is None:
return ""
if isinstance(content, str):
return content
if isinstance(content, list):
parts = [
str(part.get("text", ""))
for part in content
if isinstance(part, dict) and part.get("type") == "text"
]
return "\n".join(p for p in parts if p)
return str(content)
def _assistant_copy_text(content: Any) -> str:
return _strip_reasoning_tags(_assistant_content_as_text(content))
# =============================================================================
# Configuration Loading
# =============================================================================
@@ -1172,6 +1210,10 @@ def _resolve_attachment_path(raw_path: str) -> Path | None:
return None
expanded = os.path.expandvars(os.path.expanduser(token))
if os.name != "nt":
normalized = expanded.replace("\\", "/")
if len(normalized) >= 3 and normalized[1] == ":" and normalized[2] == "/" and normalized[0].isalpha():
expanded = f"/mnt/{normalized[0].lower()}/{normalized[3:]}"
path = Path(expanded)
if not path.is_absolute():
base_dir = Path(os.getenv("TERMINAL_CWD", os.getcwd()))
@@ -1254,10 +1296,12 @@ def _detect_file_drop(user_input: str) -> "dict | None":
or stripped.startswith("~")
or stripped.startswith("./")
or stripped.startswith("../")
or (len(stripped) >= 3 and stripped[1] == ":" and stripped[2] in ("\\", "/") and stripped[0].isalpha())
or stripped.startswith('"/')
or stripped.startswith('"~')
or stripped.startswith("'/")
or stripped.startswith("'~")
or (len(stripped) >= 4 and stripped[0] in ("'", '"') and stripped[2] == ":" and stripped[3] in ("\\", "/") and stripped[1].isalpha())
)
if not starts_like_path:
return None
@@ -3125,21 +3169,6 @@ class HermesCLI:
MAX_ASST_LEN = 200 # truncate assistant text
MAX_ASST_LINES = 3 # max lines of assistant text
def _strip_reasoning(text: str) -> str:
"""Remove <REASONING_SCRATCHPAD>...</REASONING_SCRATCHPAD> blocks
from displayed text (reasoning model internal thoughts)."""
import re
cleaned = re.sub(
r"<REASONING_SCRATCHPAD>.*?</REASONING_SCRATCHPAD>\s*",
"", text, flags=re.DOTALL,
)
# Also strip unclosed reasoning tags at the end
cleaned = re.sub(
r"<REASONING_SCRATCHPAD>.*$",
"", cleaned, flags=re.DOTALL,
)
return cleaned.strip()
# Collect displayable entries (skip system, tool-result messages)
entries = [] # list of (role, display_text)
_last_asst_idx = None # index of last assistant entry
@@ -3171,7 +3200,7 @@ class HermesCLI:
elif role == "assistant":
text = "" if content is None else str(content)
text = _strip_reasoning(text)
text = _strip_reasoning_tags(text)
parts = []
full_parts = [] # un-truncated version
if text:
@@ -3510,6 +3539,26 @@ class HermesCLI:
killed = process_registry.kill_all()
print(f" ✅ Stopped {killed} process(es).")
def _handle_agents_command(self):
"""Handle /agents — show background processes and agent status."""
from tools.process_registry import format_uptime_short, process_registry
processes = process_registry.list_sessions()
running = [p for p in processes if p.get("status") == "running"]
finished = [p for p in processes if p.get("status") != "running"]
_cprint(f" Running processes: {len(running)}")
for p in running:
cmd = p.get("command", "")[:80]
up = format_uptime_short(p.get("uptime_seconds", 0))
_cprint(f" {p.get('session_id', '?')} · {up} · {cmd}")
if finished:
_cprint(f" Recently finished: {len(finished)}")
agent_running = getattr(self, "_agent_running", False)
_cprint(f" Agent: {'running' if agent_running else 'idle'}")
def _handle_paste_command(self):
"""Handle /paste — explicitly check clipboard for an image.
@@ -3535,6 +3584,61 @@ class HermesCLI:
else:
_cprint(f" {_DIM}(._.) No image found in clipboard{_RST}")
def _write_osc52_clipboard(self, text: str) -> None:
"""Copy *text* to terminal clipboard via OSC 52."""
payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
seq = f"\x1b]52;c;{payload}\x07"
out = getattr(self, "_app", None)
output = getattr(out, "output", None) if out else None
if output and hasattr(output, "write_raw"):
output.write_raw(seq)
output.flush()
return
if output and hasattr(output, "write"):
output.write(seq)
output.flush()
return
sys.stdout.write(seq)
sys.stdout.flush()
def _handle_copy_command(self, cmd_original: str) -> None:
"""Handle /copy [number] — copy assistant output to clipboard."""
parts = cmd_original.split(maxsplit=1)
arg = parts[1].strip() if len(parts) > 1 else ""
assistant = [m for m in self.conversation_history if m.get("role") == "assistant"]
if not assistant:
_cprint(" Nothing to copy yet.")
return
if arg:
try:
idx = int(arg) - 1
except ValueError:
_cprint(" Usage: /copy [number]")
return
if idx < 0 or idx >= len(assistant):
_cprint(f" Invalid response number. Use 1-{len(assistant)}.")
return
else:
idx = len(assistant) - 1
while idx >= 0 and not _assistant_copy_text(assistant[idx].get("content")):
idx -= 1
if idx < 0:
_cprint(" Nothing to copy in assistant responses yet.")
return
text = _assistant_copy_text(assistant[idx].get("content"))
if not text:
_cprint(" Nothing to copy in that assistant response.")
return
try:
self._write_osc52_clipboard(text)
_cprint(f" Copied assistant response #{idx + 1} to clipboard")
except Exception as e:
_cprint(f" Clipboard copy failed: {e}")
def _handle_image_command(self, cmd_original: str):
"""Handle /image <path> — attach a local image file for the next prompt."""
raw_args = (cmd_original.split(None, 1)[1].strip() if " " in cmd_original else "")
@@ -3671,7 +3775,7 @@ class HermesCLI:
skin = get_active_skin()
separator_color = skin.get_color("banner_dim", "#B8860B")
accent_color = skin.get_color("ui_accent", "#FFBF00")
label_color = skin.get_color("ui_label", "#4dd0e1")
label_color = skin.get_color("ui_label", "#DAA520")
except Exception:
separator_color, accent_color, label_color = "#B8860B", "#FFBF00", "cyan"
toolsets_info = ""
@@ -4514,6 +4618,34 @@ class HermesCLI:
self._restore_modal_input_snapshot()
self._invalidate(min_interval=0.0)
@staticmethod
def _compute_model_picker_viewport(
selected: int,
scroll_offset: int,
n: int,
term_rows: int,
reserved_below: int = 6,
panel_chrome: int = 6,
min_visible: int = 3,
) -> tuple[int, int]:
"""Resolve (scroll_offset, visible) for the /model picker viewport.
``reserved_below`` matches the approval / clarify panels input area,
status bar, and separators below the panel. ``panel_chrome`` covers
this panel's own borders + blanks + hint row. The remaining rows hold
the scrollable list, with the offset slid to keep ``selected`` on screen.
"""
max_visible = max(min_visible, term_rows - reserved_below - panel_chrome)
if n <= max_visible:
return 0, n
visible = max_visible
if selected < scroll_offset:
scroll_offset = selected
elif selected >= scroll_offset + visible:
scroll_offset = selected - visible + 1
scroll_offset = max(0, min(scroll_offset, n - visible))
return scroll_offset, visible
def _apply_model_switch_result(self, result, persist_global: bool) -> None:
if not result.success:
_cprint(f"{result.error_message}")
@@ -5525,6 +5657,8 @@ class HermesCLI:
self._show_usage()
elif canonical == "insights":
self._show_insights(cmd_original)
elif canonical == "copy":
self._handle_copy_command(cmd_original)
elif canonical == "debug":
self._handle_debug_command()
elif canonical == "paste":
@@ -5568,6 +5702,8 @@ class HermesCLI:
self._handle_snapshot_command(cmd_original)
elif canonical == "stop":
self._handle_stop_command()
elif canonical == "agents":
self._handle_agents_command()
elif canonical == "background":
self._handle_background_command(cmd_original)
elif canonical == "btw":
@@ -5584,6 +5720,30 @@ class HermesCLI:
_cprint(f" Queued for the next turn: {payload[:80]}{'...' if len(payload) > 80 else ''}")
else:
_cprint(f" Queued: {payload[:80]}{'...' if len(payload) > 80 else ''}")
elif canonical == "steer":
# Inject a message after the next tool call without interrupting.
# If the agent is actively running, push the text into the agent's
# pending_steer slot — the drain hook in _execute_tool_calls_*
# will append it to the next tool result's content. If no agent
# is running, fall back to queue semantics (same as /queue).
parts = cmd_original.split(None, 1)
payload = parts[1].strip() if len(parts) > 1 else ""
if not payload:
_cprint(" Usage: /steer <prompt>")
elif self._agent_running and self.agent is not None and hasattr(self.agent, "steer"):
try:
accepted = self.agent.steer(payload)
except Exception as exc:
_cprint(f" Steer failed: {exc}")
else:
if accepted:
_cprint(f" ⏩ Steer queued — arrives after the next tool call: {payload[:80]}{'...' if len(payload) > 80 else ''}")
else:
_cprint(" Steer rejected (empty payload).")
else:
# No active run — treat as a normal next-turn message.
self._pending_input.put(payload)
_cprint(f" No agent running; queued as next turn: {payload[:80]}{'...' if len(payload) > 80 else ''}")
elif canonical == "skin":
self._handle_skin_command(cmd_original)
elif canonical == "voice":
@@ -6881,8 +7041,7 @@ class HermesCLI:
)
raise RuntimeError(
"Voice mode requires sounddevice and numpy.\n"
"Install with: pip install sounddevice numpy\n"
"Or: pip install hermes-agent[voice]"
f"Install with: {sys.executable} -m pip install sounddevice numpy"
)
if not reqs.get("stt_available", reqs.get("stt_key_set")):
raise RuntimeError(
@@ -7158,8 +7317,7 @@ class HermesCLI:
_cprint(f" {_DIM}Then install/update the Termux:API Android app for microphone capture{_RST}")
_cprint(f" {_BOLD}Option 2: pkg install python-numpy portaudio && python -m pip install sounddevice{_RST}")
else:
_cprint(f"\n {_BOLD}Install: pip install {' '.join(reqs['missing_packages'])}{_RST}")
_cprint(f" {_DIM}Or: pip install hermes-agent[voice]{_RST}")
_cprint(f"\n {_BOLD}Install: {sys.executable} -m pip install {' '.join(reqs['missing_packages'])}{_RST}")
return
with self._voice_lock:
@@ -8110,7 +8268,15 @@ class HermesCLI:
else:
print(f"\n⚡ Sending after interrupt: '{preview}'")
self._pending_input.put(combined)
# If a /steer was left over (agent finished before another tool
# batch could absorb it), deliver it as the next user turn.
_leftover_steer = result.get("pending_steer") if result else None
if _leftover_steer and hasattr(self, '_pending_input'):
preview = _leftover_steer[:60] + ("..." if len(_leftover_steer) > 60 else "")
print(f"\n⏩ Delivering leftover /steer as next turn: '{preview}'")
self._pending_input.put(_leftover_steer)
return response
except Exception as e:
@@ -8528,6 +8694,7 @@ class HermesCLI:
# --- /model picker modal ---
if self._model_picker_state:
self._handle_model_picker_selection()
event.app.current_buffer.reset()
event.app.invalidate()
return
@@ -8693,6 +8860,13 @@ class HermesCLI:
state["selected"] = min(max_idx, state.get("selected", 0) + 1)
event.app.invalidate()
@kb.add('escape', filter=Condition(lambda: bool(self._model_picker_state)), eager=True)
def model_picker_escape(event):
"""ESC closes the /model picker."""
self._close_model_picker()
event.app.current_buffer.reset()
event.app.invalidate()
# --- History navigation: up/down browse history in normal input mode ---
# The TextArea is multiline, so by default up/down only move the cursor.
# Buffer.auto_up/auto_down handle both: cursor movement when multi-line,
@@ -9494,6 +9668,22 @@ class HermesCLI:
box_width = _panel_box_width(title, [hint] + choices, min_width=46, max_width=84)
inner_text_width = max(8, box_width - 6)
selected = state.get("selected", 0)
# Scrolling viewport: the panel renders into a Window with no max
# height, so without limiting visible items the bottom border and
# any items past the available terminal rows get clipped on long
# provider catalogs (e.g. Ollama Cloud's 36+ models).
try:
from prompt_toolkit.application import get_app
term_rows = get_app().output.get_size().rows
except Exception:
term_rows = shutil.get_terminal_size((100, 24)).lines
scroll_offset, visible = HermesCLI._compute_model_picker_viewport(
selected, state.get("_scroll_offset", 0), len(choices), term_rows,
)
state["_scroll_offset"] = scroll_offset
lines = []
lines.append(('class:clarify-border', '╭─ '))
lines.append(('class:clarify-title', title))
@@ -9501,8 +9691,8 @@ class HermesCLI:
_append_blank_panel_line(lines, 'class:clarify-border', box_width)
_append_panel_line(lines, 'class:clarify-border', 'class:clarify-hint', hint, box_width)
_append_blank_panel_line(lines, 'class:clarify-border', box_width)
selected = state.get("selected", 0)
for idx, choice in enumerate(choices):
for idx in range(scroll_offset, scroll_offset + visible):
choice = choices[idx]
style = 'class:clarify-selected' if idx == selected else 'class:clarify-choice'
prefix = ' ' if idx == selected else ' '
for wrapped in _wrap_panel_text(prefix + choice, inner_text_width, subsequent_indent=' '):
@@ -9907,8 +10097,36 @@ class HermesCLI:
# Register signal handlers for graceful shutdown on SSH disconnect / SIGTERM
def _signal_handler(signum, frame):
"""Handle SIGHUP/SIGTERM by triggering graceful cleanup."""
"""Handle SIGHUP/SIGTERM by triggering graceful cleanup.
Calls ``self.agent.interrupt()`` first so the agent daemon
thread's poll loop sees the per-thread interrupt and kills the
tool's subprocess group via ``_kill_process`` (os.killpg).
Without this, the main thread dies from KeyboardInterrupt and
the daemon thread is killed with it before it can run one
more poll iteration to clean up the subprocess, which was
spawned with ``os.setsid`` and therefore survives as an orphan
with PPID=1.
Grace window (``HERMES_SIGTERM_GRACE``, default 1.5 s) gives
the daemon time to: detect the interrupt (next 200 ms poll)
call _kill_process (SIGTERM + 1 s wait + SIGKILL if needed)
return from _wait_for_process. ``time.sleep`` releases the
GIL so the daemon actually runs during the window.
"""
logger.debug("Received signal %s, triggering graceful shutdown", signum)
try:
if getattr(self, "agent", None) and getattr(self, "_agent_running", False):
self.agent.interrupt(f"received signal {signum}")
import time as _t
try:
_grace = float(os.getenv("HERMES_SIGTERM_GRACE", "1.5"))
except (TypeError, ValueError):
_grace = 1.5
if _grace > 0:
_t.sleep(_grace)
except Exception:
pass # never block signal handling
raise KeyboardInterrupt()
try:
@@ -10211,6 +10429,45 @@ def main(
# Register cleanup for single-query mode (interactive mode registers in run())
atexit.register(_run_cleanup)
# Also install signal handlers in single-query / `-q` mode. Interactive
# mode registers its own inside HermesCLI.run(), but `-q` runs
# cli.agent.run_conversation() below and AIAgent spawns worker threads
# for tools — so when SIGTERM arrives on the main thread, raising
# KeyboardInterrupt only unwinds the main thread, not the worker
# running _wait_for_process. Python then exits, the child subprocess
# (spawned with os.setsid, its own process group) is reparented to
# init and keeps running as an orphan.
#
# Fix: route SIGTERM/SIGHUP through agent.interrupt() which sets the
# per-thread interrupt flag the worker's poll loop checks every 200 ms.
# Give the worker a grace window to call _kill_process (SIGTERM to the
# process group, then SIGKILL after 1 s), then raise KeyboardInterrupt
# so main unwinds normally. HERMES_SIGTERM_GRACE overrides the 1.5 s
# default for debugging.
def _signal_handler_q(signum, frame):
logger.debug("Received signal %s in single-query mode", signum)
try:
_agent = getattr(cli, "agent", None)
if _agent is not None:
_agent.interrupt(f"received signal {signum}")
import time as _t
try:
_grace = float(os.getenv("HERMES_SIGTERM_GRACE", "1.5"))
except (TypeError, ValueError):
_grace = 1.5
if _grace > 0:
_t.sleep(_grace)
except Exception:
pass # never block signal handling
raise KeyboardInterrupt()
try:
import signal as _signal
_signal.signal(_signal.SIGTERM, _signal_handler_q)
if hasattr(_signal, "SIGHUP"):
_signal.signal(_signal.SIGHUP, _signal_handler_q)
except Exception:
pass # signal handler may fail in restricted environments
# Handle single query mode
if query or image:
+183 -106
View File
@@ -27,7 +27,7 @@ except ImportError:
except ImportError:
msvcrt = None
from pathlib import Path
from typing import Optional
from typing import List, Optional
# Add parent directory to path for imports BEFORE repo-level imports.
# Without this, standalone invocations (e.g. after `hermes update` reloads
@@ -49,6 +49,33 @@ _KNOWN_DELIVERY_PLATFORMS = frozenset({
"qqbot",
})
# Platforms that support a configured cron/notification home target, mapped to
# the environment variable used by gateway setup/runtime config.
_HOME_TARGET_ENV_VARS = {
"matrix": "MATRIX_HOME_ROOM",
"telegram": "TELEGRAM_HOME_CHANNEL",
"discord": "DISCORD_HOME_CHANNEL",
"slack": "SLACK_HOME_CHANNEL",
"signal": "SIGNAL_HOME_CHANNEL",
"mattermost": "MATTERMOST_HOME_CHANNEL",
"sms": "SMS_HOME_CHANNEL",
"email": "EMAIL_HOME_ADDRESS",
"dingtalk": "DINGTALK_HOME_CHANNEL",
"feishu": "FEISHU_HOME_CHANNEL",
"wecom": "WECOM_HOME_CHANNEL",
"weixin": "WEIXIN_HOME_CHANNEL",
"bluebubbles": "BLUEBUBBLES_HOME_CHANNEL",
"qqbot": "QQBOT_HOME_CHANNEL",
}
# Legacy env var names kept for back-compat. Each entry is the current
# primary env var → the previous name. _get_home_target_chat_id falls
# back to the legacy name if the primary is unset, so users who set the
# old name before the rename keep working until they migrate.
_LEGACY_HOME_TARGET_ENV_VARS = {
"QQBOT_HOME_CHANNEL": "QQ_HOME_CHANNEL",
}
from cron.jobs import get_due_jobs, mark_job_run, save_job_output, advance_next_run
# Sentinel: when a cron agent has nothing new to report, it can start its
@@ -76,15 +103,28 @@ def _resolve_origin(job: dict) -> Optional[dict]:
return None
def _resolve_delivery_target(job: dict) -> Optional[dict]:
"""Resolve the concrete auto-delivery target for a cron job, if any."""
deliver = job.get("deliver", "local")
def _get_home_target_chat_id(platform_name: str) -> str:
"""Return the configured home target chat/room ID for a delivery platform."""
env_var = _HOME_TARGET_ENV_VARS.get(platform_name.lower())
if not env_var:
return ""
value = os.getenv(env_var, "")
if not value:
legacy = _LEGACY_HOME_TARGET_ENV_VARS.get(env_var)
if legacy:
value = os.getenv(legacy, "")
return value
def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[dict]:
"""Resolve one concrete auto-delivery target for a cron job."""
origin = _resolve_origin(job)
if deliver == "local":
if deliver_value == "local":
return None
if deliver == "origin":
if deliver_value == "origin":
if origin:
return {
"platform": origin["platform"],
@@ -93,8 +133,8 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
}
# Origin missing (e.g. job created via API/script) — try each
# platform's home channel as a fallback instead of silently dropping.
for platform_name in ("matrix", "telegram", "discord", "slack", "bluebubbles"):
chat_id = os.getenv(f"{platform_name.upper()}_HOME_CHANNEL", "")
for platform_name in _HOME_TARGET_ENV_VARS:
chat_id = _get_home_target_chat_id(platform_name)
if chat_id:
logger.info(
"Job '%s' has deliver=origin but no origin; falling back to %s home channel",
@@ -108,8 +148,8 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
}
return None
if ":" in deliver:
platform_name, rest = deliver.split(":", 1)
if ":" in deliver_value:
platform_name, rest = deliver_value.split(":", 1)
platform_key = platform_name.lower()
from tools.send_message_tool import _parse_target_ref
@@ -139,7 +179,7 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
"thread_id": thread_id,
}
platform_name = deliver
platform_name = deliver_value
if origin and origin.get("platform") == platform_name:
return {
"platform": platform_name,
@@ -149,7 +189,7 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
if platform_name.lower() not in _KNOWN_DELIVERY_PLATFORMS:
return None
chat_id = os.getenv(f"{platform_name.upper()}_HOME_CHANNEL", "")
chat_id = _get_home_target_chat_id(platform_name)
if not chat_id:
return None
@@ -160,6 +200,30 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
}
def _resolve_delivery_targets(job: dict) -> List[dict]:
"""Resolve all concrete auto-delivery targets for a cron job (supports comma-separated deliver)."""
deliver = job.get("deliver", "local")
if deliver == "local":
return []
parts = [p.strip() for p in str(deliver).split(",") if p.strip()]
seen = set()
targets = []
for part in parts:
target = _resolve_single_delivery_target(job, part)
if target:
key = (target["platform"].lower(), str(target["chat_id"]), target.get("thread_id"))
if key not in seen:
seen.add(key)
targets.append(target)
return targets
def _resolve_delivery_target(job: dict) -> Optional[dict]:
"""Resolve the concrete auto-delivery target for a cron job, if any."""
targets = _resolve_delivery_targets(job)
return targets[0] if targets else None
# Media extension sets — keep in sync with gateway/platforms/base.py:_process_message_background
_AUDIO_EXTS = frozenset({'.ogg', '.opus', '.mp3', '.wav', '.m4a'})
_VIDEO_EXTS = frozenset({'.mp4', '.mov', '.avi', '.mkv', '.webm', '.3gp'})
@@ -200,7 +264,7 @@ def _send_media_via_adapter(adapter, chat_id: str, media_files: list, metadata:
def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Optional[str]:
"""
Deliver job output to the configured target (origin chat, specific platform, etc.).
Deliver job output to the configured target(s) (origin chat, specific platform, etc.).
When ``adapters`` and ``loop`` are provided (gateway is running), tries to
use the live adapter first this supports E2EE rooms (e.g. Matrix) where
@@ -209,33 +273,14 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
Returns None on success, or an error string on failure.
"""
target = _resolve_delivery_target(job)
if not target:
targets = _resolve_delivery_targets(job)
if not targets:
if job.get("deliver", "local") != "local":
msg = f"no delivery target resolved for deliver={job.get('deliver', 'local')}"
logger.warning("Job '%s': %s", job["id"], msg)
return msg
return None # local-only jobs don't deliver — not a failure
platform_name = target["platform"]
chat_id = target["chat_id"]
thread_id = target.get("thread_id")
# Diagnostic: log thread_id for topic-aware delivery debugging
origin = job.get("origin") or {}
origin_thread = origin.get("thread_id")
if origin_thread and not thread_id:
logger.warning(
"Job '%s': origin has thread_id=%s but delivery target lost it "
"(deliver=%s, target=%s)",
job["id"], origin_thread, job.get("deliver", "local"), target,
)
elif thread_id:
logger.debug(
"Job '%s': delivering to %s:%s thread_id=%s",
job["id"], platform_name, chat_id, thread_id,
)
from tools.send_message_tool import _send_to_platform
from gateway.config import load_gateway_config, Platform
@@ -258,24 +303,6 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
"bluebubbles": Platform.BLUEBUBBLES,
"qqbot": Platform.QQBOT,
}
platform = platform_map.get(platform_name.lower())
if not platform:
msg = f"unknown platform '{platform_name}'"
logger.warning("Job '%s': %s", job["id"], msg)
return msg
try:
config = load_gateway_config()
except Exception as e:
msg = f"failed to load gateway config: {e}"
logger.error("Job '%s': %s", job["id"], msg)
return msg
pconfig = config.platforms.get(platform)
if not pconfig or not pconfig.enabled:
msg = f"platform '{platform_name}' not configured/enabled"
logger.warning("Job '%s': %s", job["id"], msg)
return msg
# Optionally wrap the content with a header/footer so the user knows this
# is a cron delivery. Wrapping is on by default; set cron.wrap_response: false
@@ -304,67 +331,117 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
from gateway.platforms.base import BasePlatformAdapter
media_files, cleaned_delivery_content = BasePlatformAdapter.extract_media(delivery_content)
# Prefer the live adapter when the gateway is running — this supports E2EE
# rooms (e.g. Matrix) where the standalone HTTP path cannot encrypt.
runtime_adapter = (adapters or {}).get(platform)
if runtime_adapter is not None and loop is not None and getattr(loop, "is_running", lambda: False)():
send_metadata = {"thread_id": thread_id} if thread_id else None
try:
# Send cleaned text (MEDIA tags stripped) — not the raw content
text_to_send = cleaned_delivery_content.strip()
adapter_ok = True
if text_to_send:
future = asyncio.run_coroutine_threadsafe(
runtime_adapter.send(chat_id, text_to_send, metadata=send_metadata),
loop,
)
send_result = future.result(timeout=60)
if send_result and not getattr(send_result, "success", True):
err = getattr(send_result, "error", "unknown")
logger.warning(
"Job '%s': live adapter send to %s:%s failed (%s), falling back to standalone",
job["id"], platform_name, chat_id, err,
)
adapter_ok = False # fall through to standalone path
try:
config = load_gateway_config()
except Exception as e:
msg = f"failed to load gateway config: {e}"
logger.error("Job '%s': %s", job["id"], msg)
return msg
# Send extracted media files as native attachments via the live adapter
if adapter_ok and media_files:
_send_media_via_adapter(runtime_adapter, chat_id, media_files, send_metadata, loop, job)
delivery_errors = []
if adapter_ok:
logger.info("Job '%s': delivered to %s:%s via live adapter", job["id"], platform_name, chat_id)
return None
except Exception as e:
for target in targets:
platform_name = target["platform"]
chat_id = target["chat_id"]
thread_id = target.get("thread_id")
# Diagnostic: log thread_id for topic-aware delivery debugging
origin = job.get("origin") or {}
origin_thread = origin.get("thread_id")
if origin_thread and not thread_id:
logger.warning(
"Job '%s': live adapter delivery to %s:%s failed (%s), falling back to standalone",
job["id"], platform_name, chat_id, e,
"Job '%s': origin has thread_id=%s but delivery target lost it "
"(deliver=%s, target=%s)",
job["id"], origin_thread, job.get("deliver", "local"), target,
)
elif thread_id:
logger.debug(
"Job '%s': delivering to %s:%s thread_id=%s",
job["id"], platform_name, chat_id, thread_id,
)
# Standalone path: run the async send in a fresh event loop (safe from any thread)
coro = _send_to_platform(platform, pconfig, chat_id, cleaned_delivery_content, thread_id=thread_id, media_files=media_files)
try:
result = asyncio.run(coro)
except RuntimeError:
# asyncio.run() checks for a running loop before awaiting the coroutine;
# when it raises, the original coro was never started — close it to
# prevent "coroutine was never awaited" RuntimeWarning, then retry in a
# fresh thread that has no running loop.
coro.close()
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, cleaned_delivery_content, thread_id=thread_id, media_files=media_files))
result = future.result(timeout=30)
except Exception as e:
msg = f"delivery to {platform_name}:{chat_id} failed: {e}"
logger.error("Job '%s': %s", job["id"], msg)
return msg
platform = platform_map.get(platform_name.lower())
if not platform:
msg = f"unknown platform '{platform_name}'"
logger.warning("Job '%s': %s", job["id"], msg)
delivery_errors.append(msg)
continue
if result and result.get("error"):
msg = f"delivery error: {result['error']}"
logger.error("Job '%s': %s", job["id"], msg)
return msg
# Prefer the live adapter when the gateway is running — this supports E2EE
# rooms (e.g. Matrix) where the standalone HTTP path cannot encrypt.
runtime_adapter = (adapters or {}).get(platform)
delivered = False
if runtime_adapter is not None and loop is not None and getattr(loop, "is_running", lambda: False)():
send_metadata = {"thread_id": thread_id} if thread_id else None
try:
# Send cleaned text (MEDIA tags stripped) — not the raw content
text_to_send = cleaned_delivery_content.strip()
adapter_ok = True
if text_to_send:
future = asyncio.run_coroutine_threadsafe(
runtime_adapter.send(chat_id, text_to_send, metadata=send_metadata),
loop,
)
send_result = future.result(timeout=60)
if send_result and not getattr(send_result, "success", True):
err = getattr(send_result, "error", "unknown")
logger.warning(
"Job '%s': live adapter send to %s:%s failed (%s), falling back to standalone",
job["id"], platform_name, chat_id, err,
)
adapter_ok = False # fall through to standalone path
logger.info("Job '%s': delivered to %s:%s", job["id"], platform_name, chat_id)
# Send extracted media files as native attachments via the live adapter
if adapter_ok and media_files:
_send_media_via_adapter(runtime_adapter, chat_id, media_files, send_metadata, loop, job)
if adapter_ok:
logger.info("Job '%s': delivered to %s:%s via live adapter", job["id"], platform_name, chat_id)
delivered = True
except Exception as e:
logger.warning(
"Job '%s': live adapter delivery to %s:%s failed (%s), falling back to standalone",
job["id"], platform_name, chat_id, e,
)
if not delivered:
pconfig = config.platforms.get(platform)
if not pconfig or not pconfig.enabled:
msg = f"platform '{platform_name}' not configured/enabled"
logger.warning("Job '%s': %s", job["id"], msg)
delivery_errors.append(msg)
continue
# Standalone path: run the async send in a fresh event loop (safe from any thread)
coro = _send_to_platform(platform, pconfig, chat_id, cleaned_delivery_content, thread_id=thread_id, media_files=media_files)
try:
result = asyncio.run(coro)
except RuntimeError:
# asyncio.run() checks for a running loop before awaiting the coroutine;
# when it raises, the original coro was never started — close it to
# prevent "coroutine was never awaited" RuntimeWarning, then retry in a
# fresh thread that has no running loop.
coro.close()
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, cleaned_delivery_content, thread_id=thread_id, media_files=media_files))
result = future.result(timeout=30)
except Exception as e:
msg = f"delivery to {platform_name}:{chat_id} failed: {e}"
logger.error("Job '%s': %s", job["id"], msg)
delivery_errors.append(msg)
continue
if result and result.get("error"):
msg = f"delivery error: {result['error']}"
logger.error("Job '%s': %s", job["id"], msg)
delivery_errors.append(msg)
continue
logger.info("Job '%s': delivered to %s:%s", job["id"], platform_name, chat_id)
if delivery_errors:
return "; ".join(delivery_errors)
return None
@@ -0,0 +1,108 @@
# Ink Gateway TUI Migration — Post-mortem
Planned: 2026-04-01 · Delivered: 2026-04 · Status: shipped, classic (prompt_toolkit) CLI still present
## What Shipped
Three layers, same repo, Python runtime unchanged.
```
ui-tui (Node/TS) ──stdio JSON-RPC──▶ tui_gateway (Py) ──▶ AIAgent (run_agent.py)
```
### Backend — `tui_gateway/`
```
tui_gateway/
├── entry.py # subprocess entrypoint, stdio read/write loop
├── server.py # everything: sessions dict, @method handlers, _emit
├── render.py # stream renderer, diff rendering, message rendering
├── slash_worker.py # subprocess that runs hermes_cli slash commands
└── __init__.py
```
`server.py` owns the full runtime-control surface: session store (`_sessions: dict[str, dict]`), method registry (`@method("…")` decorator), event emitter (`_emit`), agent lifecycle (`_make_agent`, `_init_session`, `_wire_callbacks`), approval/sudo/clarify round-trips, and JSON-RPC dispatch.
Protocol methods (`@method(...)` in `server.py`):
- session: `session.{create, resume, list, close, interrupt, usage, history, compress, branch, title, save, undo}`
- prompt: `prompt.{submit, background, btw}`
- tools: `tools.{list, show, configure}`
- slash: `slash.exec`, `command.{dispatch, resolve}`, `commands.catalog`, `complete.{path, slash}`
- approvals: `approval.respond`, `sudo.respond`, `clarify.respond`, `secret.respond`
- config/state: `config.{get, set, show}`, `model.options`, `reload.mcp`
- ops: `shell.exec`, `cli.exec`, `terminal.resize`, `input.detect_drop`, `clipboard.paste`, `paste.collapse`, `image.attach`, `process.stop`
- misc: `agents.list`, `skills.manage`, `plugins.list`, `cron.manage`, `insights.get`, `rollback.{list, diff, restore}`, `browser.manage`
Protocol events (`_emit(…)` → handled in `ui-tui/src/app/createGatewayEventHandler.ts`):
- lifecycle: `gateway.{ready, stderr}`, `session.info`, `skin.changed`
- stream: `message.{start, delta, complete}`, `thinking.delta`, `reasoning.{delta, available}`, `status.update`
- tools: `tool.{start, progress, complete, generating}`, `subagent.{start, thinking, tool, progress, complete}`
- interactive: `approval.request`, `sudo.request`, `clarify.request`, `secret.request`
- async: `background.complete`, `btw.complete`, `error`
### Frontend — `ui-tui/src/`
```
src/
├── entry.tsx # node bootstrap: bootBanner → spawn python → dynamic-import Ink → render(<App/>)
├── app.tsx # <GatewayProvider> wraps <AppLayout>
├── bootBanner.ts # raw-ANSI banner to stdout in ~2ms, pre-React
├── gatewayClient.ts # JSON-RPC client over child_process stdio
├── gatewayTypes.ts # typed RPC responses + GatewayEvent union
├── theme.ts # DEFAULT_THEME + fromSkin
├── app/ # hooks + stores — the orchestration layer
│ ├── uiStore.ts # nanostore: sid, info, busy, usage, theme, status…
│ ├── turnStore.ts # nanostore: per-turn activity / reasoning / tools
│ ├── turnController.ts # imperative singleton for stream-time operations
│ ├── overlayStore.ts # nanostore: modal/overlay state
│ ├── useMainApp.ts # top-level composition hook
│ ├── useSessionLifecycle.ts # session.create/resume/close/reset
│ ├── useSubmission.ts # shell/slash/prompt dispatch + interpolation
│ ├── useConfigSync.ts # config.get + mtime poll
│ ├── useComposerState.ts # input buffer, paste snippets, editor mode
│ ├── useInputHandlers.ts # key bindings
│ ├── createGatewayEventHandler.ts # event-stream dispatcher
│ ├── createSlashHandler.ts # slash command router (registry + python fallback)
│ └── slash/commands/ # core.ts, ops.ts, session.ts — TS-owned slash commands
├── components/ # AppLayout, AppChrome, AppOverlays, MessageLine, Thinking, Markdown, pickers, prompts, Banner, SessionPanel
├── config/ # env, limits, timing constants
├── content/ # charms, faces, fortunes, hotkeys, placeholders, verbs
├── domain/ # details, messages, paths, roles, slash, usage, viewport
├── protocol/ # interpolation, paste regex
├── hooks/ # useCompletion, useInputHistory, useQueue, useVirtualHistory
└── lib/ # history, messages, osc52, rpc, text
```
### CLI entry points — `hermes_cli/main.py`
- `hermes --tui``node dist/entry.js` (auto-builds when `.ts`/`.tsx` newer than `dist/entry.js`)
- `hermes --tui --dev``tsx src/entry.tsx` (skip build)
- `HERMES_TUI_DIR=…` → external prebuilt dist (nix, distro packaging)
## Diverged From Original Plan
| Plan | Reality | Why |
|---|---|---|
| `tui_gateway/{controller,session_state,events,protocol}.py` | all collapsed into `server.py` | no second consumer ever emerged, keeping one file cheaper than four |
| `ui-tui/src/main.tsx` | split into `entry.tsx` (bootstrap) + `app.tsx` (shell) | boot banner + early python spawn wanted a pre-React moment |
| `ui-tui/src/state/store.ts` | three nanostores (`uiStore`, `turnStore`, `overlayStore`) | separate lifetimes: ui persists, turn resets per reply, overlay is modal |
| `approval.requested` / `sudo.requested` / `clarify.requested` | `*.request` (no `-ed`) | cosmetic |
| `session.cancel` | dropped | `session.interrupt` covers it |
| `HERMES_EXPERIMENTAL_TUI=1`, `display.experimental_tui: true`, `/tui on/off/status` | none shipped | `--tui` went from opt-in to first-class without an experimental phase |
## Post-migration Additions (not in original plan)
- **Async `session.create`** — returns sid in ~1ms, agent builds on a background thread, `session.info` broadcasts when ready; `_wait_agent()` gates every agent-touching handler via `_sess`
- **`bootBanner`** — raw-ANSI logo painted to stdout at T≈2ms, before Ink loads; `<AlternateScreen>` wipes it seamlessly when React mounts
- **Selection uniform bg**`theme.color.selectionBg` wired via `useSelection().setSelectionBgColor`; replaces SGR-inverse per-cell swap that fragmented over amber/gold fg
- **Slash command registry** — TS-owned commands in `app/slash/commands/{core,ops,session}.ts`, everything else falls through to `slash.exec` (python worker)
- **Turn store + controller split** — imperative singleton (`turnController`) holds refs/timers, nanostore (`turnStore`) holds render-visible state
## What's Still Open
- **Classic CLI not deleted.** `cli.py` still has ~80 `prompt_toolkit` references; classic REPL is still the default when `--tui` is absent. The original plan's "Cut 4 · prompt_toolkit removal later" hasn't happened.
- **No config-file opt-in.** `HERMES_EXPERIMENTAL_TUI` and `display.experimental_tui` were never built; only the CLI flag exists. Fine for now — if we want "default to TUI", a single line in `main.py` flips it.
+36 -27
View File
@@ -6,6 +6,11 @@
# All fields are optional — missing values inherit from the default skin.
# Activate with: /skin <name> or display.skin: <name> in config.yaml
#
# Keys are marked:
# (both) — applies to both the classic CLI and the TUI
# (classic) — classic CLI only (see hermes --tui in user-guide/tui.md)
# (tui) — TUI only
#
# See hermes_cli/skin_engine.py for the full schema reference.
# ============================================================================
@@ -14,43 +19,47 @@ name: example
description: An example custom skin — copy and modify this template
# ── Colors ──────────────────────────────────────────────────────────────────
# Hex color values for Rich markup. These control the CLI's visual palette.
# Hex color values. These control the visual palette.
colors:
# Banner panel (the startup welcome box)
# Banner panel (the startup welcome box) — (both)
banner_border: "#CD7F32" # Panel border
banner_title: "#FFD700" # Panel title text
banner_accent: "#FFBF00" # Section headers (Available Tools, Skills, etc.)
banner_dim: "#B8860B" # Dim/muted text (separators, model info)
banner_text: "#FFF8DC" # Body text (tool names, skill names)
# UI elements
ui_accent: "#FFBF00" # General accent color
# UI elements — (both)
ui_accent: "#FFBF00" # General accent (falls back to banner_accent)
ui_label: "#4dd0e1" # Labels
ui_ok: "#4caf50" # Success indicators
ui_error: "#ef5350" # Error indicators
ui_warn: "#ffa726" # Warning indicators
# Input area
prompt: "#FFF8DC" # Prompt text color
input_rule: "#CD7F32" # Horizontal rule around input
prompt: "#FFF8DC" # Prompt text / `` glyph color (both)
input_rule: "#CD7F32" # Horizontal rule above input (classic)
# Response box
response_border: "#FFD700" # Response box border (ANSI color)
# Response box — (classic)
response_border: "#FFD700" # Response box border
# Session display
session_label: "#DAA520" # Session label
session_border: "#8B8682" # Session ID dim color
# Session display — (both)
session_label: "#DAA520" # "Session: " label
session_border: "#8B8682" # Session ID text
# TUI surfaces
status_bar_bg: "#1a1a2e" # Status / usage bar background
voice_status_bg: "#1a1a2e" # Voice-mode badge background
completion_menu_bg: "#1a1a2e" # Completion list background
completion_menu_current_bg: "#333355" # Active completion row background
completion_menu_meta_bg: "#1a1a2e" # Completion meta column background
completion_menu_meta_current_bg: "#333355" # Active completion meta background
# TUI / CLI surfaces — (classic: status bar, voice badge, completion meta)
status_bar_bg: "#1a1a2e" # Status / usage bar background (classic)
voice_status_bg: "#1a1a2e" # Voice-mode badge background (classic)
completion_menu_bg: "#1a1a2e" # Completion list background (both)
completion_menu_current_bg: "#333355" # Active completion row background (both)
completion_menu_meta_bg: "#1a1a2e" # Completion meta column bg (classic)
completion_menu_meta_current_bg: "#333355" # Active meta bg (classic)
# Drag-to-select background — (tui)
selection_bg: "#3a3a55" # Uniform selection highlight in the TUI
# ── Spinner ─────────────────────────────────────────────────────────────────
# Customize the animated spinner shown during API calls and tool execution.
# (classic) — the TUI uses its own animated indicators; spinner config here
# is only read by the classic prompt_toolkit CLI.
spinner:
# Faces shown while waiting for the API response
waiting_faces:
@@ -78,17 +87,17 @@ spinner:
# - ["⟪▲", "▲⟫"]
# ── Branding ────────────────────────────────────────────────────────────────
# Text strings used throughout the CLI interface.
# Text strings used throughout the interface.
branding:
agent_name: "Hermes Agent" # Banner title, about display
welcome: "Welcome! Type your message or /help for commands."
goodbye: "Goodbye! ⚕" # Exit message
response_label: " ⚕ Hermes " # Response box header label
prompt_symbol: " " # Input prompt symbol
help_header: "(^_^)? Available Commands" # /help header text
agent_name: "Hermes Agent" # (both) Banner title, about display
welcome: "Welcome! Type your message or /help for commands." # (both)
goodbye: "Goodbye! ⚕" # (both) Exit message
response_label: " ⚕ Hermes " # (classic) Response box header label
prompt_symbol: " " # (both) Input prompt glyph
help_header: "(^_^)? Available Commands" # (both) /help overlay title
# ── Tool Output ─────────────────────────────────────────────────────────────
# Character used as the prefix for tool output lines.
# Character used as the prefix for tool output lines. (both)
# Default is "┊" (thin dotted vertical line). Some alternatives:
# "╎" (light triple dash vertical)
# "▏" (left one-eighth block)
Generated
+21
View File
@@ -36,6 +36,26 @@
"type": "github"
}
},
"npm-lockfile-fix": {
"inputs": {
"nixpkgs": [
"nixpkgs"
]
},
"locked": {
"lastModified": 1775903712,
"narHash": "sha256-2GV79U6iVH4gKAPWYrxUReB0S41ty/Y3dBLquU8AlaA=",
"owner": "jeslie0",
"repo": "npm-lockfile-fix",
"rev": "c6093acb0c0548e0f9b8b3d82918823721930fe8",
"type": "github"
},
"original": {
"owner": "jeslie0",
"repo": "npm-lockfile-fix",
"type": "github"
}
},
"pyproject-build-systems": {
"inputs": {
"nixpkgs": [
@@ -124,6 +144,7 @@
"inputs": {
"flake-parts": "flake-parts",
"nixpkgs": "nixpkgs",
"npm-lockfile-fix": "npm-lockfile-fix",
"pyproject-build-systems": "pyproject-build-systems",
"pyproject-nix": "pyproject-nix_2",
"uv2nix": "uv2nix_2"
+11 -2
View File
@@ -19,11 +19,20 @@
url = "github:pyproject-nix/build-system-pkgs";
inputs.nixpkgs.follows = "nixpkgs";
};
npm-lockfile-fix = {
url = "github:jeslie0/npm-lockfile-fix";
inputs.nixpkgs.follows = "nixpkgs";
};
};
outputs = inputs:
outputs =
inputs:
inputs.flake-parts.lib.mkFlake { inherit inputs; } {
systems = [ "x86_64-linux" "aarch64-linux" "aarch64-darwin" ];
systems = [
"x86_64-linux"
"aarch64-linux"
"aarch64-darwin"
];
imports = [
./nix/packages.nix
+19 -1
View File
@@ -100,7 +100,7 @@ def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
def _build_discord(adapter) -> List[Dict[str, str]]:
"""Enumerate all text channels the Discord bot can see."""
"""Enumerate all text channels and forum channels the Discord bot can see."""
channels = []
client = getattr(adapter, "_client", None)
if not client:
@@ -119,6 +119,15 @@ def _build_discord(adapter) -> List[Dict[str, str]]:
"guild": guild.name,
"type": "channel",
})
# Forum channels (type 15) — creating a message auto-spawns a thread post.
forums = getattr(guild, "forum_channels", None) or []
for ch in forums:
channels.append({
"id": str(ch.id),
"name": ch.name,
"guild": guild.name,
"type": "forum",
})
# Also include DM-capable users we've interacted with is not
# feasible via guild enumeration; those come from sessions.
@@ -191,6 +200,15 @@ def load_directory() -> Dict[str, Any]:
return {"updated_at": None, "platforms": {}}
def lookup_channel_type(platform_name: str, chat_id: str) -> Optional[str]:
"""Return the channel ``type`` string (e.g. ``"channel"``, ``"forum"``) for *chat_id*, or *None* if unknown."""
directory = load_directory()
for ch in directory.get("platforms", {}).get(platform_name, []):
if ch.get("id") == chat_id:
return ch.get("type")
return None
def resolve_channel_name(platform_name: str, name: str) -> Optional[str]:
"""
Resolve a human-friendly channel name to a numeric ID.
+89 -2
View File
@@ -258,6 +258,13 @@ class GatewayConfig:
# Streaming configuration
streaming: StreamingConfig = field(default_factory=StreamingConfig)
# Session store pruning: drop SessionEntry records older than this many
# days from the in-memory dict and sessions.json. Keeps the store from
# growing unbounded in gateways serving many chats/threads/users over
# months. Pruning is invisible to users — if they resume, they get a
# fresh session exactly as if the reset policy had fired. 0 = disabled.
session_store_max_age_days: int = 90
def get_connected_platforms(self) -> List[Platform]:
"""Return list of platforms that are enabled and configured."""
connected = []
@@ -307,6 +314,14 @@ class GatewayConfig:
# QQBot uses extra dict for app credentials
elif platform == Platform.QQBOT and config.extra.get("app_id") and config.extra.get("client_secret"):
connected.append(platform)
# DingTalk uses client_id/client_secret from config.extra or env vars
elif platform == Platform.DINGTALK and (
config.extra.get("client_id") or os.getenv("DINGTALK_CLIENT_ID")
) and (
config.extra.get("client_secret") or os.getenv("DINGTALK_CLIENT_SECRET")
):
connected.append(platform)
return connected
def get_home_channel(self, platform: Platform) -> Optional[HomeChannel]:
@@ -357,6 +372,7 @@ class GatewayConfig:
"thread_sessions_per_user": self.thread_sessions_per_user,
"unauthorized_dm_behavior": self.unauthorized_dm_behavior,
"streaming": self.streaming.to_dict(),
"session_store_max_age_days": self.session_store_max_age_days,
}
@classmethod
@@ -404,6 +420,13 @@ class GatewayConfig:
"pair",
)
try:
session_store_max_age_days = int(data.get("session_store_max_age_days", 90))
if session_store_max_age_days < 0:
session_store_max_age_days = 0
except (TypeError, ValueError):
session_store_max_age_days = 90
return cls(
platforms=platforms,
default_reset_policy=default_policy,
@@ -418,6 +441,7 @@ class GatewayConfig:
thread_sessions_per_user=_coerce_bool(thread_sessions_per_user, False),
unauthorized_dm_behavior=unauthorized_dm_behavior,
streaming=StreamingConfig.from_dict(data.get("streaming", {})),
session_store_max_age_days=session_store_max_age_days,
)
def get_unauthorized_dm_behavior(self, platform: Optional[Platform] = None) -> str:
@@ -617,6 +641,20 @@ def load_gateway_config() -> GatewayConfig:
if isinstance(ntc, list):
ntc = ",".join(str(v) for v in ntc)
os.environ["DISCORD_NO_THREAD_CHANNELS"] = str(ntc)
# allow_mentions: granular control over what the bot can ping.
# Safe defaults (no @everyone/roles) are applied in the adapter;
# these YAML keys only override when set and let users opt back
# into unsafe modes (e.g. roles=true) if they actually want it.
allow_mentions_cfg = discord_cfg.get("allow_mentions")
if isinstance(allow_mentions_cfg, dict):
for yaml_key, env_key in (
("everyone", "DISCORD_ALLOW_MENTION_EVERYONE"),
("roles", "DISCORD_ALLOW_MENTION_ROLES"),
("users", "DISCORD_ALLOW_MENTION_USERS"),
("replied_user", "DISCORD_ALLOW_MENTION_REPLIED_USER"),
):
if yaml_key in allow_mentions_cfg and not os.getenv(env_key):
os.environ[env_key] = str(allow_mentions_cfg[yaml_key]).lower()
# Telegram settings → env vars (env vars take precedence)
telegram_cfg = yaml_cfg.get("telegram", {})
@@ -663,6 +701,24 @@ def load_gateway_config() -> GatewayConfig:
frc = ",".join(str(v) for v in frc)
os.environ["WHATSAPP_FREE_RESPONSE_CHATS"] = str(frc)
# DingTalk settings → env vars (env vars take precedence)
dingtalk_cfg = yaml_cfg.get("dingtalk", {})
if isinstance(dingtalk_cfg, dict):
if "require_mention" in dingtalk_cfg and not os.getenv("DINGTALK_REQUIRE_MENTION"):
os.environ["DINGTALK_REQUIRE_MENTION"] = str(dingtalk_cfg["require_mention"]).lower()
if "mention_patterns" in dingtalk_cfg and not os.getenv("DINGTALK_MENTION_PATTERNS"):
os.environ["DINGTALK_MENTION_PATTERNS"] = json.dumps(dingtalk_cfg["mention_patterns"])
frc = dingtalk_cfg.get("free_response_chats")
if frc is not None and not os.getenv("DINGTALK_FREE_RESPONSE_CHATS"):
if isinstance(frc, list):
frc = ",".join(str(v) for v in frc)
os.environ["DINGTALK_FREE_RESPONSE_CHATS"] = str(frc)
allowed = dingtalk_cfg.get("allowed_users")
if allowed is not None and not os.getenv("DINGTALK_ALLOWED_USERS"):
if isinstance(allowed, list):
allowed = ",".join(str(v) for v in allowed)
os.environ["DINGTALK_ALLOWED_USERS"] = str(allowed)
# Matrix settings → env vars (env vars take precedence)
matrix_cfg = yaml_cfg.get("matrix", {})
if isinstance(matrix_cfg, dict):
@@ -1006,6 +1062,25 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
if webhook_secret:
config.platforms[Platform.WEBHOOK].extra["secret"] = webhook_secret
# DingTalk
dingtalk_client_id = os.getenv("DINGTALK_CLIENT_ID")
dingtalk_client_secret = os.getenv("DINGTALK_CLIENT_SECRET")
if dingtalk_client_id and dingtalk_client_secret:
if Platform.DINGTALK not in config.platforms:
config.platforms[Platform.DINGTALK] = PlatformConfig()
config.platforms[Platform.DINGTALK].enabled = True
config.platforms[Platform.DINGTALK].extra.update({
"client_id": dingtalk_client_id,
"client_secret": dingtalk_client_secret,
})
dingtalk_home = os.getenv("DINGTALK_HOME_CHANNEL")
if dingtalk_home:
config.platforms[Platform.DINGTALK].home_channel = HomeChannel(
platform=Platform.DINGTALK,
chat_id=dingtalk_home,
name=os.getenv("DINGTALK_HOME_CHANNEL_NAME", "Home"),
)
# Feishu / Lark
feishu_app_id = os.getenv("FEISHU_APP_ID")
feishu_app_secret = os.getenv("FEISHU_APP_SECRET")
@@ -1154,12 +1229,24 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
qq_group_allowed = os.getenv("QQ_GROUP_ALLOWED_USERS", "").strip()
if qq_group_allowed:
extra["group_allow_from"] = qq_group_allowed
qq_home = os.getenv("QQ_HOME_CHANNEL", "").strip()
qq_home = os.getenv("QQBOT_HOME_CHANNEL", "").strip()
qq_home_name_env = "QQBOT_HOME_CHANNEL_NAME"
if not qq_home:
# Back-compat: accept the pre-rename name and log a one-time warning.
legacy_home = os.getenv("QQ_HOME_CHANNEL", "").strip()
if legacy_home:
qq_home = legacy_home
qq_home_name_env = "QQ_HOME_CHANNEL_NAME"
import logging
logging.getLogger(__name__).warning(
"QQ_HOME_CHANNEL is deprecated; rename to QQBOT_HOME_CHANNEL "
"in your .env for consistency with the platform key."
)
if qq_home:
config.platforms[Platform.QQBOT].home_channel = HomeChannel(
platform=Platform.QQBOT,
chat_id=qq_home,
name=os.getenv("QQ_HOME_CHANNEL_NAME", "Home"),
name=os.getenv("QQBOT_HOME_CHANNEL_NAME") or os.getenv(qq_home_name_env, "Home"),
)
# Session settings
+38 -1
View File
@@ -669,6 +669,15 @@ class MessageEvent:
# Original platform data
raw_message: Any = None
message_id: Optional[str] = None
# Platform-specific update identifier. For Telegram this is the
# ``update_id`` from the PTB Update wrapper; other platforms currently
# ignore it. Used by ``/restart`` to record the triggering update so the
# new gateway can advance the Telegram offset past it and avoid processing
# the same ``/restart`` twice if PTB's graceful-shutdown ACK times out
# ("Error while calling `get_updates` one more time to mark all fetched
# updates" in gateway.log).
platform_update_id: Optional[int] = None
# Media attachments
# media_urls: local file paths (for vision tool access)
@@ -1045,16 +1054,40 @@ class BasePlatformAdapter(ABC):
"""
pass
# Default: the adapter treats ``finalize=True`` on edit_message as a
# no-op and is happy to have the stream consumer skip redundant final
# edits. Subclasses that *require* an explicit finalize call to close
# out the message lifecycle (e.g. rich card / AI assistant surfaces
# such as DingTalk AI Cards) override this to True (class attribute or
# property) so the stream consumer knows not to short-circuit.
REQUIRES_EDIT_FINALIZE: bool = False
async def edit_message(
self,
chat_id: str,
message_id: str,
content: str,
*,
finalize: bool = False,
) -> SendResult:
"""
Edit a previously sent message. Optional platforms that don't
support editing return success=False and callers fall back to
sending a new message.
``finalize`` signals that this is the last edit in a streaming
sequence. Most platforms (Telegram, Slack, Discord, Matrix,
etc.) treat it as a no-op because their edit APIs have no notion
of message lifecycle state an edit is an edit. Platforms that
render streaming updates with a distinct "in progress" state and
require explicit closure (e.g. rich card / AI assistant surfaces
such as DingTalk AI Cards) use it to finalize the message and
transition the UI out of the streaming indicator those should
also set ``REQUIRES_EDIT_FINALIZE = True`` so callers route a
final edit through even when content is unchanged. Callers
should set ``finalize=True`` on the final edit of a streamed
response (typically when ``got_done`` fires in the stream
consumer) and leave it ``False`` on intermediate edits.
"""
return SendResult(success=False, error="Not supported")
@@ -1579,7 +1612,9 @@ class BasePlatformAdapter(ABC):
# session lifecycle and its cleanup races with the running task
# (see PR #4926).
cmd = event.get_command()
if cmd in ("approve", "deny", "status", "stop", "new", "reset", "background", "restart", "queue", "q"):
from hermes_cli.commands import should_bypass_active_session
if should_bypass_active_session(cmd):
logger.debug(
"[%s] Command '/%s' bypassing active-session guard for %s",
self.name, cmd, session_key,
@@ -1991,6 +2026,7 @@ class BasePlatformAdapter(ABC):
chat_topic: Optional[str] = None,
user_id_alt: Optional[str] = None,
chat_id_alt: Optional[str] = None,
is_bot: bool = False,
) -> SessionSource:
"""Helper to build a SessionSource for this platform."""
# Normalize empty topic to None
@@ -2007,6 +2043,7 @@ class BasePlatformAdapter(ABC):
chat_topic=chat_topic.strip() if chat_topic else None,
user_id_alt=user_id_alt,
chat_id_alt=chat_id_alt,
is_bot=is_bot,
)
@abstractmethod
File diff suppressed because it is too large Load Diff
+548 -91
View File
@@ -51,7 +51,9 @@ from gateway.platforms.base import (
ProcessingOutcome,
SendResult,
cache_image_from_url,
cache_image_from_bytes,
cache_audio_from_url,
cache_audio_from_bytes,
cache_document_from_bytes,
SUPPORTED_DOCUMENT_TYPES,
)
@@ -80,6 +82,41 @@ def check_discord_requirements() -> bool:
return DISCORD_AVAILABLE
def _build_allowed_mentions():
"""Build Discord ``AllowedMentions`` with safe defaults, overridable via env.
Discord bots default to parsing ``@everyone``, ``@here``, role pings, and
user pings when ``allowed_mentions`` is unset on the client any LLM
output or echoed user content that contains ``@everyone`` would therefore
ping the whole server. We explicitly deny ``@everyone`` and role pings
by default and keep user / replied-user pings enabled so normal
conversation still works.
Override via environment variables (or ``discord.allow_mentions.*`` in
config.yaml):
DISCORD_ALLOW_MENTION_EVERYONE default false @everyone + @here
DISCORD_ALLOW_MENTION_ROLES default false @role pings
DISCORD_ALLOW_MENTION_USERS default true @user pings
DISCORD_ALLOW_MENTION_REPLIED_USER default true reply-ping author
"""
if not DISCORD_AVAILABLE:
return None
def _b(name: str, default: bool) -> bool:
raw = os.getenv(name, "").strip().lower()
if not raw:
return default
return raw in ("true", "1", "yes", "on")
return discord.AllowedMentions(
everyone=_b("DISCORD_ALLOW_MENTION_EVERYONE", False),
roles=_b("DISCORD_ALLOW_MENTION_ROLES", False),
users=_b("DISCORD_ALLOW_MENTION_USERS", True),
replied_user=_b("DISCORD_ALLOW_MENTION_REPLIED_USER", True),
)
class VoiceReceiver:
"""Captures and decodes voice audio from a Discord voice channel.
@@ -458,6 +495,7 @@ class DiscordAdapter(BasePlatformAdapter):
self._client: Optional[commands.Bot] = None
self._ready_event = asyncio.Event()
self._allowed_user_ids: set = set() # For button approval authorization
self._allowed_role_ids: set = set() # For DISCORD_ALLOWED_ROLES filtering
# Voice channel state (per-guild)
self._voice_clients: Dict[int, Any] = {} # guild_id -> VoiceClient
# Text batching: merge rapid successive messages (Telegram-style)
@@ -536,6 +574,15 @@ class DiscordAdapter(BasePlatformAdapter):
if uid.strip()
}
# Parse DISCORD_ALLOWED_ROLES — comma-separated role IDs.
# Users with ANY of these roles can interact with the bot.
roles_env = os.getenv("DISCORD_ALLOWED_ROLES", "")
if roles_env:
self._allowed_role_ids = {
int(rid.strip()) for rid in roles_env.split(",")
if rid.strip().isdigit()
}
# Set up intents.
# Message Content is required for normal text replies.
# Server Members is only needed when the allowlist contains usernames
@@ -547,7 +594,10 @@ class DiscordAdapter(BasePlatformAdapter):
intents.message_content = True
intents.dm_messages = True
intents.guild_messages = True
intents.members = any(not entry.isdigit() for entry in self._allowed_user_ids)
intents.members = (
any(not entry.isdigit() for entry in self._allowed_user_ids)
or bool(self._allowed_role_ids) # Need members intent for role lookup
)
intents.voice_states = True
# Resolve proxy (DISCORD_PROXY > generic env vars > macOS system proxy)
@@ -556,10 +606,15 @@ class DiscordAdapter(BasePlatformAdapter):
if proxy_url:
logger.info("[%s] Using proxy for Discord: %s", self.name, proxy_url)
# Create bot — proxy= for HTTP, connector= for SOCKS
# Create bot — proxy= for HTTP, connector= for SOCKS.
# allowed_mentions is set with safe defaults (no @everyone/roles)
# so LLM output or echoed user content can't ping the whole
# server; override per DISCORD_ALLOW_MENTION_* env vars or the
# discord.allow_mentions.* block in config.yaml.
self._client = commands.Bot(
command_prefix="!", # Not really used, we handle raw messages
intents=intents,
allowed_mentions=_build_allowed_mentions(),
**proxy_kwargs_for_bot(proxy_url),
)
adapter_self = self # capture for closure
@@ -594,14 +649,13 @@ class DiscordAdapter(BasePlatformAdapter):
if message.type not in (discord.MessageType.default, discord.MessageType.reply):
return
# Check if the message author is in the allowed user list
if not self._is_allowed_user(str(message.author.id)):
return
# Bot message filtering (DISCORD_ALLOW_BOTS):
# "none" — ignore all other bots (default)
# "mentions" — accept bot messages only when they @mention us
# "all" — accept all bot messages
# Must run BEFORE the user allowlist check so that bots
# permitted by DISCORD_ALLOW_BOTS are not rejected for
# not being in DISCORD_ALLOWED_USERS (fixes #4466).
if getattr(message.author, "bot", False):
allow_bots = os.getenv("DISCORD_ALLOW_BOTS", "none").lower().strip()
if allow_bots == "none":
@@ -609,7 +663,12 @@ class DiscordAdapter(BasePlatformAdapter):
elif allow_bots == "mentions":
if not self._client.user or self._client.user not in message.mentions:
return
# "all" falls through to handle_message
# "all" falls through; bot is permitted — skip the
# human-user allowlist below (bots aren't in it).
else:
# Non-bot: enforce the configured user/role allowlists.
if not self._is_allowed_user(str(message.author.id), message.author):
return
# Multi-agent filtering: if the message mentions specific bots
# but NOT this bot, the sender is talking to another agent —
@@ -798,6 +857,9 @@ class DiscordAdapter(BasePlatformAdapter):
When metadata contains a thread_id, the message is sent to that
thread instead of the parent channel identified by chat_id.
Forum channels (type 15) reject direct messages a thread post is
created automatically.
"""
if not self._client:
return SendResult(success=False, error="Not connected")
@@ -823,6 +885,10 @@ class DiscordAdapter(BasePlatformAdapter):
if not channel:
return SendResult(success=False, error=f"Channel {chat_id} not found")
# Forum channels reject channel.send() — create a thread post instead.
if self._is_forum_parent(channel):
return await self._send_to_forum(channel, content)
# Format and split message if needed
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
@@ -833,7 +899,10 @@ class DiscordAdapter(BasePlatformAdapter):
if reply_to and self._reply_to_mode != "off":
try:
ref_msg = await channel.fetch_message(int(reply_to))
reference = ref_msg
if hasattr(ref_msg, "to_reference"):
reference = ref_msg.to_reference(fail_if_not_exists=False)
else:
reference = ref_msg
except Exception as e:
logger.debug("Could not fetch reply-to message: %s", e)
@@ -851,14 +920,20 @@ class DiscordAdapter(BasePlatformAdapter):
err_text = str(e)
if (
chunk_reference is not None
and "error code: 50035" in err_text
and "Cannot reply to a system message" in err_text
and (
(
"error code: 50035" in err_text
and "Cannot reply to a system message" in err_text
)
or "error code: 10008" in err_text
)
):
logger.warning(
"[%s] Reply target %s is a Discord system message; retrying send without reply reference",
"[%s] Reply target %s rejected the reply reference; retrying send without reply reference",
self.name,
reply_to,
)
reference = None
msg = await channel.send(
content=chunk,
reference=None,
@@ -877,6 +952,120 @@ class DiscordAdapter(BasePlatformAdapter):
logger.error("[%s] Failed to send Discord message: %s", self.name, e, exc_info=True)
return SendResult(success=False, error=str(e))
async def _send_to_forum(self, forum_channel: Any, content: str) -> SendResult:
"""Create a thread post in a forum channel with the message as starter content.
Forum channels (type 15) don't support direct messages. Instead we
POST to /channels/{forum_id}/threads with a thread name derived from
the first line of the message. Any follow-up chunk failures are
reported in ``raw_response['warnings']`` so the caller can surface
partial-send issues.
"""
from tools.send_message_tool import _derive_forum_thread_name
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
thread_name = _derive_forum_thread_name(content)
starter_content = chunks[0] if chunks else thread_name
try:
thread = await forum_channel.create_thread(
name=thread_name,
content=starter_content,
)
except Exception as e:
logger.error("[%s] Failed to create forum thread in %s: %s", self.name, forum_channel.id, e)
return SendResult(success=False, error=f"Forum thread creation failed: {e}")
thread_channel = thread if hasattr(thread, "send") else getattr(thread, "thread", None)
thread_id = str(getattr(thread_channel, "id", getattr(thread, "id", "")))
starter_msg = getattr(thread, "message", None)
message_id = str(getattr(starter_msg, "id", thread_id)) if starter_msg else thread_id
# Send remaining chunks into the newly created thread. Track any
# per-chunk failures so the caller sees partial-send outcomes.
message_ids = [message_id]
warnings: list[str] = []
for chunk in chunks[1:]:
try:
msg = await thread_channel.send(content=chunk)
message_ids.append(str(msg.id))
except Exception as e:
warning = f"Failed to send follow-up chunk to forum thread {thread_id}: {e}"
logger.warning("[%s] %s", self.name, warning)
warnings.append(warning)
raw_response: Dict[str, Any] = {"message_ids": message_ids, "thread_id": thread_id}
if warnings:
raw_response["warnings"] = warnings
return SendResult(
success=True,
message_id=message_ids[0],
raw_response=raw_response,
)
async def _forum_post_file(
self,
forum_channel: Any,
*,
thread_name: Optional[str] = None,
content: str = "",
file: Any = None,
files: Optional[list] = None,
) -> SendResult:
"""Create a forum thread whose starter message carries file attachments.
Used by the send_voice / send_image_file / send_document paths when
the target channel is a forum (type 15). ``create_thread`` on a
ForumChannel accepts the same file/files/content kwargs as
``channel.send``, creating the thread and starter message atomically.
"""
from tools.send_message_tool import _derive_forum_thread_name
if not thread_name:
# Prefer the text content, fall back to the first attached
# filename, fall back to the generic default.
hint = content or ""
if not hint.strip():
if file is not None:
hint = getattr(file, "filename", "") or ""
elif files:
hint = getattr(files[0], "filename", "") or ""
thread_name = _derive_forum_thread_name(hint) if hint.strip() else "New Post"
kwargs: Dict[str, Any] = {"name": thread_name}
if content:
kwargs["content"] = content
if file is not None:
kwargs["file"] = file
if files:
kwargs["files"] = files
try:
thread = await forum_channel.create_thread(**kwargs)
except Exception as e:
logger.error(
"[%s] Failed to create forum thread with file in %s: %s",
self.name,
getattr(forum_channel, "id", "?"),
e,
)
return SendResult(success=False, error=f"Forum thread creation failed: {e}")
thread_channel = thread if hasattr(thread, "send") else getattr(thread, "thread", None)
thread_id = str(getattr(thread_channel, "id", getattr(thread, "id", "")))
starter_msg = getattr(thread, "message", None)
message_id = str(getattr(starter_msg, "id", thread_id)) if starter_msg else thread_id
return SendResult(
success=True,
message_id=message_id,
raw_response={"thread_id": thread_id},
)
async def edit_message(
self,
chat_id: str,
@@ -907,7 +1096,11 @@ class DiscordAdapter(BasePlatformAdapter):
caption: Optional[str] = None,
file_name: Optional[str] = None,
) -> SendResult:
"""Send a local file as a Discord attachment."""
"""Send a local file as a Discord attachment.
Forum channels (type 15) get a new thread whose starter message
carries the file they reject direct POST /messages.
"""
if not self._client:
return SendResult(success=False, error="Not connected")
@@ -920,6 +1113,12 @@ class DiscordAdapter(BasePlatformAdapter):
filename = file_name or os.path.basename(file_path)
with open(file_path, "rb") as fh:
file = discord.File(fh, filename=filename)
if self._is_forum_parent(channel):
return await self._forum_post_file(
channel,
content=(caption or "").strip(),
file=file,
)
msg = await channel.send(content=caption if caption else None, file=file)
return SendResult(success=True, message_id=str(msg.id))
@@ -968,6 +1167,18 @@ class DiscordAdapter(BasePlatformAdapter):
with open(audio_path, "rb") as f:
file_data = f.read()
# Forum channels (type 15) reject direct POST /messages — the
# native voice flag path also targets /messages so it would fail
# too. Create a thread post with the audio as the starter
# attachment instead.
if self._is_forum_parent(channel):
forum_file = discord.File(io.BytesIO(file_data), filename=filename)
return await self._forum_post_file(
channel,
content=(caption or "").strip(),
file=forum_file,
)
# Try sending as a native voice message via raw API (flags=8192).
try:
import base64
@@ -1310,11 +1521,48 @@ class DiscordAdapter(BasePlatformAdapter):
except OSError:
pass
def _is_allowed_user(self, user_id: str) -> bool:
"""Check if user is in DISCORD_ALLOWED_USERS."""
if not self._allowed_user_ids:
def _is_allowed_user(self, user_id: str, author=None) -> bool:
"""Check if user is allowed via DISCORD_ALLOWED_USERS or DISCORD_ALLOWED_ROLES.
Uses OR semantics: if the user matches EITHER allowlist, they're allowed.
If both allowlists are empty, everyone is allowed (backwards compatible).
When author is a Member, checks .roles directly; otherwise falls back
to scanning the bot's mutual guilds for a Member record.
"""
# ``getattr`` fallbacks here guard against test fixtures that build
# an adapter via ``object.__new__(DiscordAdapter)`` and skip __init__
# (see AGENTS.md pitfall #17 — same pattern as gateway.run).
allowed_users = getattr(self, "_allowed_user_ids", set())
allowed_roles = getattr(self, "_allowed_role_ids", set())
has_users = bool(allowed_users)
has_roles = bool(allowed_roles)
if not has_users and not has_roles:
return True
return user_id in self._allowed_user_ids
# Check user ID allowlist
if has_users and user_id in allowed_users:
return True
# Check role allowlist
if has_roles:
# Try direct role check from Member object
direct_roles = getattr(author, "roles", None) if author is not None else None
if direct_roles:
if any(getattr(r, "id", None) in allowed_roles for r in direct_roles):
return True
# Fallback: scan mutual guilds for member's roles
if self._client is not None:
try:
uid_int = int(user_id)
except (TypeError, ValueError):
uid_int = None
if uid_int is not None:
for guild in self._client.guilds:
m = guild.get_member(uid_int)
if m is None:
continue
m_roles = getattr(m, "roles", None) or []
if any(getattr(r, "id", None) in allowed_roles for r in m_roles):
return True
return False
async def send_image_file(
self,
@@ -1383,6 +1631,13 @@ class DiscordAdapter(BasePlatformAdapter):
import io
file = discord.File(io.BytesIO(image_data), filename=f"image.{ext}")
if self._is_forum_parent(channel):
return await self._forum_post_file(
channel,
content=(caption or "").strip(),
file=file,
)
msg = await channel.send(
content=caption if caption else None,
file=file,
@@ -1445,6 +1700,13 @@ class DiscordAdapter(BasePlatformAdapter):
import io
file = discord.File(io.BytesIO(animation_data), filename="animation.gif")
if self._is_forum_parent(channel):
return await self._forum_post_file(
channel,
content=(caption or "").strip(),
file=file,
)
msg = await channel.send(
content=caption if caption else None,
file=file,
@@ -1732,6 +1994,11 @@ class DiscordAdapter(BasePlatformAdapter):
async def slash_stop(interaction: discord.Interaction):
await self._run_simple_slash(interaction, "/stop", "Stop requested~")
@tree.command(name="steer", description="Inject a message after the next tool call (no interrupt)")
@discord.app_commands.describe(prompt="Text to inject into the agent's next tool result")
async def slash_steer(interaction: discord.Interaction, prompt: str):
await self._run_simple_slash(interaction, f"/steer {prompt}".strip())
@tree.command(name="compress", description="Compress conversation context")
async def slash_compress(interaction: discord.Interaction):
await self._run_simple_slash(interaction, "/compress")
@@ -1904,12 +2171,23 @@ class DiscordAdapter(BasePlatformAdapter):
self._register_skill_group(tree)
def _register_skill_group(self, tree) -> None:
"""Register a ``/skill`` command group with category subcommand groups.
"""Register a single ``/skill`` command with autocomplete on the name.
Skills are organized by their directory category under ``SKILLS_DIR``.
Each category becomes a subcommand group; root-level skills become
direct subcommands. Discord supports 25 subcommand groups × 25
subcommands each = 625 skills well beyond the old 100-command cap.
Discord enforces an ~8000-byte per-command payload limit. The older
nested layout (``/skill <category> <name>``) registered one giant
command whose serialized payload grew linearly with the skill
catalog with the default ~75 skills the payload was ~14 KB and
``tree.sync()`` rejected the entire slash-command batch (issues
#11321, #10259, #11385, #10261, #10214).
Autocomplete options are fetched dynamically by Discord when the
user types they do NOT count against the per-command registration
budget. So we register ONE flat ``/skill`` command with
``name: str`` (autocompleted) and ``args: str = ""``. This scales
to thousands of skills with no size math, no splitting, and no
hidden skills. The slash picker also becomes more discoverable
Discord live-filters by the user's typed prefix against both the
skill name and its description.
"""
try:
from hermes_cli.commands import discord_skill_commands_by_category
@@ -1920,68 +2198,97 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception:
pass
# Reuse the existing collector for consistent filtering
# (per-platform disabled, hub-excluded, name clamping), then
# flatten — the category grouping was only useful for the
# nested layout.
categories, uncategorized, hidden = discord_skill_commands_by_category(
reserved_names=existing_names,
)
entries: list[tuple[str, str, str]] = list(uncategorized)
for cat_skills in categories.values():
entries.extend(cat_skills)
if not categories and not uncategorized:
if not entries:
return
skill_group = discord.app_commands.Group(
# Stable alphabetical order so the autocomplete suggestion
# list is predictable across restarts.
entries.sort(key=lambda t: t[0])
# name -> (description, cmd_key) — used by both the autocomplete
# callback and the handler for O(1) dispatch.
skill_lookup: dict[str, tuple[str, str]] = {
n: (d, k) for n, d, k in entries
}
async def _autocomplete_name(
interaction: "discord.Interaction", current: str,
) -> list:
"""Filter skills by the user's typed prefix.
Matches both the skill name and its description so
"/skill pdf" surfaces skills whose description mentions
PDFs even if the name doesn't. Discord caps this list at
25 entries per query.
"""
q = (current or "").strip().lower()
choices: list = []
for name, desc, _key in entries:
if not q or q in name.lower() or (desc and q in desc.lower()):
if desc:
label = f"{name}{desc}"
else:
label = name
# Discord's Choice.name is capped at 100 chars.
if len(label) > 100:
label = label[:97] + "..."
choices.append(
discord.app_commands.Choice(name=label, value=name)
)
if len(choices) >= 25:
break
return choices
@discord.app_commands.describe(
name="Which skill to run",
args="Optional arguments for the skill",
)
@discord.app_commands.autocomplete(name=_autocomplete_name)
async def _skill_handler(
interaction: "discord.Interaction", name: str, args: str = "",
):
entry = skill_lookup.get(name)
if not entry:
await interaction.response.send_message(
f"Unknown skill: `{name}`. Start typing for "
f"autocomplete suggestions.",
ephemeral=True,
)
return
_desc, cmd_key = entry
await self._run_simple_slash(
interaction, f"{cmd_key} {args}".strip()
)
cmd = discord.app_commands.Command(
name="skill",
description="Run a Hermes skill",
callback=_skill_handler,
)
tree.add_command(cmd)
# ── Helper: build a callback for a skill command key ──
def _make_handler(_key: str):
@discord.app_commands.describe(args="Optional arguments for the skill")
async def _handler(interaction: discord.Interaction, args: str = ""):
await self._run_simple_slash(interaction, f"{_key} {args}".strip())
_handler.__name__ = f"skill_{_key.lstrip('/').replace('-', '_')}"
return _handler
# ── Uncategorized (root-level) skills → direct subcommands ──
for discord_name, description, cmd_key in uncategorized:
cmd = discord.app_commands.Command(
name=discord_name,
description=description or f"Run the {discord_name} skill",
callback=_make_handler(cmd_key),
)
skill_group.add_command(cmd)
# ── Category subcommand groups ──
for cat_name in sorted(categories):
cat_desc = f"{cat_name.replace('-', ' ').title()} skills"
if len(cat_desc) > 100:
cat_desc = cat_desc[:97] + "..."
cat_group = discord.app_commands.Group(
name=cat_name,
description=cat_desc,
parent=skill_group,
)
for discord_name, description, cmd_key in categories[cat_name]:
cmd = discord.app_commands.Command(
name=discord_name,
description=description or f"Run the {discord_name} skill",
callback=_make_handler(cmd_key),
)
cat_group.add_command(cmd)
tree.add_command(skill_group)
total = sum(len(v) for v in categories.values()) + len(uncategorized)
logger.info(
"[%s] Registered /skill group: %d skill(s) across %d categories"
" + %d uncategorized",
self.name, total, len(categories), len(uncategorized),
"[%s] Registered /skill command with %d skill(s) via autocomplete",
self.name, len(entries),
)
if hidden:
logger.warning(
"[%s] %d skill(s) not registered (Discord subcommand limits)",
logger.info(
"[%s] %d skill(s) filtered out of /skill (name clamp / reserved)",
self.name, hidden,
)
except Exception as exc:
logger.warning("[%s] Failed to register /skill group: %s", self.name, exc)
logger.warning("[%s] Failed to register /skill command: %s", self.name, exc)
def _build_slash_event(self, interaction: discord.Interaction, text: str) -> MessageEvent:
"""Build a MessageEvent from a Discord slash command interaction."""
@@ -2140,6 +2447,26 @@ class DiscordAdapter(BasePlatformAdapter):
from gateway.platforms.base import resolve_channel_prompt
return resolve_channel_prompt(self.config.extra, channel_id, parent_id)
def _discord_require_mention(self) -> bool:
"""Return whether Discord channel messages require a bot mention."""
configured = self.config.extra.get("require_mention")
if configured is not None:
if isinstance(configured, str):
return configured.lower() not in ("false", "0", "no", "off")
return bool(configured)
return os.getenv("DISCORD_REQUIRE_MENTION", "true").lower() not in ("false", "0", "no", "off")
def _discord_free_response_channels(self) -> set:
"""Return Discord channel IDs where no bot mention is required."""
raw = self.config.extra.get("free_response_channels")
if raw is None:
raw = os.getenv("DISCORD_FREE_RESPONSE_CHANNELS", "")
if isinstance(raw, list):
return {str(part).strip() for part in raw if str(part).strip()}
if isinstance(raw, str) and raw.strip():
return {part.strip() for part in raw.split(",") if part.strip()}
return set()
def _thread_parent_channel(self, channel: Any) -> Any:
"""Return the parent text channel when invoked from a thread."""
return getattr(channel, "parent", None) or channel
@@ -2242,8 +2569,15 @@ class DiscordAdapter(BasePlatformAdapter):
Returns the created thread object, or ``None`` on failure.
"""
# Build a short thread name from the message
# Build a short thread name from the message. Strip Discord mention
# syntax (users / roles / channels) so thread titles don't end up
# showing raw <@id>, <@&id>, or <#id> markers — the ID isn't
# meaningful to humans glancing at the thread list (#6336).
content = (message.content or "").strip()
# <@123>, <@!123>, <@&123>, <#123> — collapse to empty; normalize spaces.
content = re.sub(r"<@[!&]?\d+>", "", content)
content = re.sub(r"<#\d+>", "", content)
content = re.sub(r"\s+", " ", content).strip()
thread_name = content[:80] if content else "Hermes"
if len(content) > 80:
thread_name = thread_name[:77] + "..."
@@ -2251,9 +2585,25 @@ class DiscordAdapter(BasePlatformAdapter):
try:
thread = await message.create_thread(name=thread_name, auto_archive_duration=1440)
return thread
except Exception as e:
logger.warning("[%s] Auto-thread creation failed: %s", self.name, e)
return None
except Exception as direct_error:
display_name = getattr(getattr(message, "author", None), "display_name", None) or "unknown user"
reason = f"Auto-threaded from mention by {display_name}"
try:
seed_msg = await message.channel.send(f"\U0001f9f5 Thread created by Hermes: **{thread_name}**")
thread = await seed_msg.create_thread(
name=thread_name,
auto_archive_duration=1440,
reason=reason,
)
return thread
except Exception as fallback_error:
logger.warning(
"[%s] Auto-thread creation failed. Direct error: %s. Fallback error: %s",
self.name,
direct_error,
fallback_error,
)
return None
async def send_exec_approval(
self, chat_id: str, command: str, session_key: str,
@@ -2440,6 +2790,124 @@ class DiscordAdapter(BasePlatformAdapter):
return f"{parent_name} / {thread_name}"
return thread_name
# ------------------------------------------------------------------
# Attachment download helpers
#
# Discord attachments (images / audio / documents) are fetched via the
# authenticated bot session whenever the Attachment object exposes
# ``read()``. That sidesteps two classes of bug that hit the older
# plain-HTTP path:
#
# 1. ``cdn.discordapp.com`` URLs increasingly require bot auth on
# download — unauthenticated httpx sees 403 Forbidden.
# (issue #8242)
# 2. Some user environments (VPNs, corporate DNS, tunnels) resolve
# ``cdn.discordapp.com`` to private-looking IPs that our
# ``is_safe_url`` guard classifies as SSRF risks. Routing the
# fetch through discord.py's own HTTP client handles DNS
# internally so our guard isn't consulted for the attachment
# path. (issue #6587)
#
# If ``att.read()`` is unavailable (unexpected object shape / test
# stub) or the bot session fetch fails, we fall back to the existing
# SSRF-gated URL downloaders. The fallback keeps defense-in-depth
# against any future Discord payload-schema drift that could slip a
# non-CDN URL into the ``att.url`` field. (issue #11345)
# ------------------------------------------------------------------
async def _read_attachment_bytes(self, att) -> Optional[bytes]:
"""Read an attachment via discord.py's authenticated bot session.
Returns the raw bytes on success, or ``None`` if ``att`` doesn't
expose a callable ``read()`` or the read itself fails. Callers
should treat ``None`` as a signal to fall back to the URL-based
downloaders.
"""
reader = getattr(att, "read", None)
if reader is None or not callable(reader):
return None
try:
return await reader()
except Exception as e:
logger.warning(
"[Discord] Authenticated attachment read failed for %s: %s",
getattr(att, "filename", None) or getattr(att, "url", "<unknown>"),
e,
)
return None
async def _cache_discord_image(self, att, ext: str) -> str:
"""Cache a Discord image attachment to local disk.
Primary path: ``att.read()`` + ``cache_image_from_bytes``
(authenticated, no SSRF gate).
Fallback: ``cache_image_from_url`` (plain httpx, SSRF-gated).
"""
raw_bytes = await self._read_attachment_bytes(att)
if raw_bytes is not None:
try:
return cache_image_from_bytes(raw_bytes, ext=ext)
except Exception as e:
logger.debug(
"[Discord] cache_image_from_bytes rejected att.read() data; falling back to URL: %s",
e,
)
return await cache_image_from_url(att.url, ext=ext)
async def _cache_discord_audio(self, att, ext: str) -> str:
"""Cache a Discord audio attachment to local disk.
Primary path: ``att.read()`` + ``cache_audio_from_bytes``
(authenticated, no SSRF gate).
Fallback: ``cache_audio_from_url`` (plain httpx, SSRF-gated).
"""
raw_bytes = await self._read_attachment_bytes(att)
if raw_bytes is not None:
try:
return cache_audio_from_bytes(raw_bytes, ext=ext)
except Exception as e:
logger.debug(
"[Discord] cache_audio_from_bytes failed; falling back to URL: %s",
e,
)
return await cache_audio_from_url(att.url, ext=ext)
async def _cache_discord_document(self, att, ext: str) -> bytes:
"""Download a Discord document attachment and return the raw bytes.
Primary path: ``att.read()`` (authenticated, no SSRF gate).
Fallback: SSRF-gated ``aiohttp`` download. This closes the gap
where the old document path made raw ``aiohttp.ClientSession``
requests with no safety check (#11345). The caller is responsible
for passing the returned bytes to ``cache_document_from_bytes``
(and, where applicable, for injecting text content).
"""
raw_bytes = await self._read_attachment_bytes(att)
if raw_bytes is not None:
return raw_bytes
# Fallback: SSRF-gated URL download.
if not is_safe_url(att.url):
raise ValueError(
f"Blocked unsafe attachment URL (SSRF protection): {att.url}"
)
import aiohttp
from gateway.platforms.base import resolve_proxy_url, proxy_kwargs_for_aiohttp
_proxy = resolve_proxy_url(platform_env_var="DISCORD_PROXY")
_sess_kw, _req_kw = proxy_kwargs_for_aiohttp(_proxy)
async with aiohttp.ClientSession(**_sess_kw) as session:
async with session.get(
att.url,
timeout=aiohttp.ClientTimeout(total=30),
**_req_kw,
) as resp:
if resp.status != 200:
raise Exception(f"HTTP {resp.status}")
return await resp.read()
async def _handle_message(self, message: DiscordMessage) -> None:
"""Handle incoming Discord messages."""
# In server channels (not DMs), require the bot to be @mentioned
@@ -2482,12 +2950,11 @@ class DiscordAdapter(BasePlatformAdapter):
logger.debug("[%s] Ignoring message in ignored channel: %s", self.name, channel_ids)
return
free_channels_raw = os.getenv("DISCORD_FREE_RESPONSE_CHANNELS", "")
free_channels = {ch.strip() for ch in free_channels_raw.split(",") if ch.strip()}
free_channels = self._discord_free_response_channels()
if parent_channel_id:
channel_ids.add(parent_channel_id)
require_mention = os.getenv("DISCORD_REQUIRE_MENTION", "true").lower() not in ("false", "0", "no")
require_mention = self._discord_require_mention()
# Voice-linked text channels act as free-response while voice is active.
# Only the exact bound channel gets the exemption, not sibling threads.
voice_linked_ids = {str(ch_id) for ch_id in self._voice_text_channels.values()}
@@ -2515,9 +2982,10 @@ class DiscordAdapter(BasePlatformAdapter):
if not is_thread and not isinstance(message.channel, discord.DMChannel):
no_thread_channels_raw = os.getenv("DISCORD_NO_THREAD_CHANNELS", "")
no_thread_channels = {ch.strip() for ch in no_thread_channels_raw.split(",") if ch.strip()}
skip_thread = bool(channel_ids & no_thread_channels)
skip_thread = bool(channel_ids & no_thread_channels) or is_free_channel
auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in ("true", "1", "yes")
if auto_thread and not skip_thread and not is_voice_linked_channel:
is_reply_message = getattr(message, "type", None) == discord.MessageType.reply
if auto_thread and not skip_thread and not is_voice_linked_channel and not is_reply_message:
thread = await self._auto_create_thread(message)
if thread:
is_thread = True
@@ -2578,6 +3046,7 @@ class DiscordAdapter(BasePlatformAdapter):
user_name=message.author.display_name,
thread_id=thread_id,
chat_topic=chat_topic,
is_bot=getattr(message.author, "bot", False),
)
# Build media URLs -- download image attachments to local cache so the
@@ -2593,7 +3062,7 @@ class DiscordAdapter(BasePlatformAdapter):
ext = "." + content_type.split("/")[-1].split(";")[0]
if ext not in (".jpg", ".jpeg", ".png", ".gif", ".webp"):
ext = ".jpg"
cached_path = await cache_image_from_url(att.url, ext=ext)
cached_path = await self._cache_discord_image(att, ext)
media_urls.append(cached_path)
media_types.append(content_type)
print(f"[Discord] Cached user image: {cached_path}", flush=True)
@@ -2607,7 +3076,7 @@ class DiscordAdapter(BasePlatformAdapter):
ext = "." + content_type.split("/")[-1].split(";")[0]
if ext not in (".ogg", ".mp3", ".wav", ".webm", ".m4a"):
ext = ".ogg"
cached_path = await cache_audio_from_url(att.url, ext=ext)
cached_path = await self._cache_discord_audio(att, ext)
media_urls.append(cached_path)
media_types.append(content_type)
print(f"[Discord] Cached user audio: {cached_path}", flush=True)
@@ -2638,19 +3107,7 @@ class DiscordAdapter(BasePlatformAdapter):
)
else:
try:
import aiohttp
from gateway.platforms.base import resolve_proxy_url, proxy_kwargs_for_aiohttp
_proxy = resolve_proxy_url(platform_env_var="DISCORD_PROXY")
_sess_kw, _req_kw = proxy_kwargs_for_aiohttp(_proxy)
async with aiohttp.ClientSession(**_sess_kw) as session:
async with session.get(
att.url,
timeout=aiohttp.ClientTimeout(total=30),
**_req_kw,
) as resp:
if resp.status != 200:
raise Exception(f"HTTP {resp.status}")
raw_bytes = await resp.read()
raw_bytes = await self._cache_discord_document(att, ext)
cached_path = cache_document_from_bytes(
raw_bytes, att.filename or f"document{ext}"
)
+173 -3
View File
@@ -1073,6 +1073,13 @@ class FeishuAdapter(BasePlatformAdapter):
self._webhook_rate_counts: Dict[str, tuple[int, float]] = {} # rate_key → (count, window_start)
self._webhook_anomaly_counts: Dict[str, tuple[int, str, float]] = {} # ip → (count, last_status, first_seen)
self._card_action_tokens: Dict[str, float] = {} # token → first_seen_time
# Inbound events that arrived before the adapter loop was ready
# (e.g. during startup/restart or network-flap reconnect). A single
# drainer thread replays them as soon as the loop becomes available.
self._pending_inbound_events: List[Any] = []
self._pending_inbound_lock = threading.Lock()
self._pending_drain_scheduled = False
self._pending_inbound_max_depth = 1000 # cap queue; drop oldest beyond
self._chat_locks: Dict[str, asyncio.Lock] = {} # chat_id → lock (per-chat serial processing)
self._sent_message_ids_to_chat: Dict[str, str] = {} # message_id → chat_id (for reaction routing)
self._sent_message_id_order: List[str] = [] # LRU order for _sent_message_ids_to_chat
@@ -1219,6 +1226,12 @@ class FeishuAdapter(BasePlatformAdapter):
.register_p2_card_action_trigger(self._on_card_action_trigger)
.register_p2_im_chat_member_bot_added_v1(self._on_bot_added_to_chat)
.register_p2_im_chat_member_bot_deleted_v1(self._on_bot_removed_from_chat)
.register_p2_im_chat_access_event_bot_p2p_chat_entered_v1(self._on_p2p_chat_entered)
.register_p2_im_message_recalled_v1(self._on_message_recalled)
.register_p2_customized_event(
"drive.notice.comment_add_v1",
self._on_drive_comment_event,
)
.build()
)
@@ -1757,10 +1770,22 @@ class FeishuAdapter(BasePlatformAdapter):
# =========================================================================
def _on_message_event(self, data: Any) -> None:
"""Normalize Feishu inbound events into MessageEvent."""
"""Normalize Feishu inbound events into MessageEvent.
Called by the lark_oapi SDK's event dispatcher on a background thread.
If the adapter loop is not currently accepting callbacks (brief window
during startup/restart or network-flap reconnect), the event is queued
for replay instead of dropped.
"""
loop = self._loop
if loop is None or bool(getattr(loop, "is_closed", lambda: False)()):
logger.warning("[Feishu] Dropping inbound message before adapter loop is ready")
if not self._loop_accepts_callbacks(loop):
start_drainer = self._enqueue_pending_inbound_event(data)
if start_drainer:
threading.Thread(
target=self._drain_pending_inbound_events,
name="feishu-pending-inbound-drainer",
daemon=True,
).start()
return
future = asyncio.run_coroutine_threadsafe(
self._handle_message_event_data(data),
@@ -1768,6 +1793,124 @@ class FeishuAdapter(BasePlatformAdapter):
)
future.add_done_callback(self._log_background_failure)
def _enqueue_pending_inbound_event(self, data: Any) -> bool:
"""Append an event to the pending-inbound queue.
Returns True if the caller should spawn a drainer thread (no drainer
currently scheduled), False if a drainer is already running and will
pick up the new event on its next pass.
"""
with self._pending_inbound_lock:
if len(self._pending_inbound_events) >= self._pending_inbound_max_depth:
# Queue full — drop the oldest to make room. This happens only
# if the loop stays unavailable for an extended period AND the
# WS keeps firing callbacks. Still better than silent drops.
dropped = self._pending_inbound_events.pop(0)
try:
event = getattr(dropped, "event", None)
message = getattr(event, "message", None)
message_id = str(getattr(message, "message_id", "") or "unknown")
except Exception:
message_id = "unknown"
logger.error(
"[Feishu] Pending-inbound queue full (%d); dropped oldest event %s",
self._pending_inbound_max_depth,
message_id,
)
self._pending_inbound_events.append(data)
depth = len(self._pending_inbound_events)
should_start = not self._pending_drain_scheduled
if should_start:
self._pending_drain_scheduled = True
logger.warning(
"[Feishu] Queued inbound event for replay (loop not ready, queue depth=%d)",
depth,
)
return should_start
def _drain_pending_inbound_events(self) -> None:
"""Replay queued inbound events once the adapter loop is ready.
Runs in a dedicated daemon thread. Polls ``_running`` and
``_loop_accepts_callbacks`` until events can be dispatched or the
adapter shuts down. A single drainer handles the entire queue;
concurrent ``_on_message_event`` calls just append.
"""
poll_interval = 0.25
max_wait_seconds = 120.0 # safety cap: drop queue after 2 minutes
waited = 0.0
try:
while True:
if not getattr(self, "_running", True):
# Adapter shutting down — drop queued events rather than
# holding them against a closed loop.
with self._pending_inbound_lock:
dropped = len(self._pending_inbound_events)
self._pending_inbound_events.clear()
if dropped:
logger.warning(
"[Feishu] Dropped %d queued inbound event(s) during shutdown",
dropped,
)
return
loop = self._loop
if self._loop_accepts_callbacks(loop):
with self._pending_inbound_lock:
batch = self._pending_inbound_events[:]
self._pending_inbound_events.clear()
if not batch:
# Queue emptied between check and grab; done.
with self._pending_inbound_lock:
if not self._pending_inbound_events:
return
continue
dispatched = 0
requeue: List[Any] = []
for event in batch:
try:
fut = asyncio.run_coroutine_threadsafe(
self._handle_message_event_data(event),
loop,
)
fut.add_done_callback(self._log_background_failure)
dispatched += 1
except RuntimeError:
# Loop closed between check and submit — requeue
# and poll again.
requeue.append(event)
if requeue:
with self._pending_inbound_lock:
self._pending_inbound_events[:0] = requeue
if dispatched:
logger.info(
"[Feishu] Replayed %d queued inbound event(s)",
dispatched,
)
if not requeue:
# Successfully drained; check if more arrived while
# we were dispatching and exit if not.
with self._pending_inbound_lock:
if not self._pending_inbound_events:
return
# More events queued or requeue pending — loop again.
continue
if waited >= max_wait_seconds:
with self._pending_inbound_lock:
dropped = len(self._pending_inbound_events)
self._pending_inbound_events.clear()
logger.error(
"[Feishu] Adapter loop unavailable for %.0fs; "
"dropped %d queued inbound event(s)",
max_wait_seconds,
dropped,
)
return
time.sleep(poll_interval)
waited += poll_interval
finally:
with self._pending_inbound_lock:
self._pending_drain_scheduled = False
async def _handle_message_event_data(self, data: Any) -> None:
"""Shared inbound message handling for websocket and webhook transports."""
event = getattr(data, "event", None)
@@ -1820,6 +1963,31 @@ class FeishuAdapter(BasePlatformAdapter):
logger.info("[Feishu] Bot removed from chat: %s", chat_id)
self._chat_info_cache.pop(chat_id, None)
def _on_p2p_chat_entered(self, data: Any) -> None:
logger.debug("[Feishu] User entered P2P chat with bot")
def _on_message_recalled(self, data: Any) -> None:
logger.debug("[Feishu] Message recalled by user")
def _on_drive_comment_event(self, data: Any) -> None:
"""Handle drive document comment notification (drive.notice.comment_add_v1).
Delegates to :mod:`gateway.platforms.feishu_comment` for parsing,
logging, and reaction. Scheduling follows the same
``run_coroutine_threadsafe`` pattern used by ``_on_message_event``.
"""
from gateway.platforms.feishu_comment import handle_drive_comment_event
loop = self._loop
if not self._loop_accepts_callbacks(loop):
logger.warning("[Feishu] Dropping drive comment event before adapter loop is ready")
return
future = asyncio.run_coroutine_threadsafe(
handle_drive_comment_event(self._client, data, self_open_id=self._bot_open_id),
loop,
)
future.add_done_callback(self._log_background_failure)
def _on_reaction_event(self, event_type: str, data: Any) -> None:
"""Route user reactions on bot messages as synthetic text events."""
event = getattr(data, "event", None)
@@ -2445,6 +2613,8 @@ class FeishuAdapter(BasePlatformAdapter):
self._on_reaction_event(event_type, data)
elif event_type == "card.action.trigger":
self._on_card_action_trigger(data)
elif event_type == "drive.notice.comment_add_v1":
self._on_drive_comment_event(data)
else:
logger.debug("[Feishu] Ignoring webhook event type: %s", event_type or "unknown")
return web.json_response({"code": 0, "msg": "ok"})
File diff suppressed because it is too large Load Diff
+429
View File
@@ -0,0 +1,429 @@
"""
Feishu document comment access-control rules.
3-tier rule resolution: exact doc > wildcard "*" > top-level > code defaults.
Each field (enabled/policy/allow_from) falls back independently.
Config: ~/.hermes/feishu_comment_rules.json (mtime-cached, hot-reload).
Pairing store: ~/.hermes/feishu_comment_pairing.json.
"""
from __future__ import annotations
import json
import logging
import time
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, Optional
from hermes_constants import get_hermes_home
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Paths
# ---------------------------------------------------------------------------
#
# Uses the canonical ``get_hermes_home()`` helper (HERMES_HOME-aware and
# profile-safe). Resolved at import time; this module is lazy-imported by
# the Feishu comment event handler, which runs long after profile overrides
# have been applied, so freezing paths here is safe.
RULES_FILE = get_hermes_home() / "feishu_comment_rules.json"
PAIRING_FILE = get_hermes_home() / "feishu_comment_pairing.json"
# ---------------------------------------------------------------------------
# Data models
# ---------------------------------------------------------------------------
_VALID_POLICIES = ("allowlist", "pairing")
@dataclass(frozen=True)
class CommentDocumentRule:
"""Per-document rule. ``None`` means 'inherit from lower tier'."""
enabled: Optional[bool] = None
policy: Optional[str] = None
allow_from: Optional[frozenset] = None
@dataclass(frozen=True)
class CommentsConfig:
"""Top-level comment access config."""
enabled: bool = True
policy: str = "pairing"
allow_from: frozenset = field(default_factory=frozenset)
documents: Dict[str, CommentDocumentRule] = field(default_factory=dict)
@dataclass(frozen=True)
class ResolvedCommentRule:
"""Fully resolved rule after field-by-field fallback."""
enabled: bool
policy: str
allow_from: frozenset
match_source: str # e.g. "exact:docx:xxx" | "wildcard" | "top" | "default"
# ---------------------------------------------------------------------------
# Mtime-cached file loading
# ---------------------------------------------------------------------------
class _MtimeCache:
"""Generic mtime-based file cache. ``stat()`` per access, re-read only on change."""
def __init__(self, path: Path):
self._path = path
self._mtime: float = 0.0
self._data: Optional[dict] = None
def load(self) -> dict:
try:
st = self._path.stat()
mtime = st.st_mtime
except FileNotFoundError:
self._mtime = 0.0
self._data = {}
return {}
if mtime == self._mtime and self._data is not None:
return self._data
try:
with open(self._path, "r", encoding="utf-8") as f:
data = json.load(f)
if not isinstance(data, dict):
data = {}
except (json.JSONDecodeError, OSError):
logger.warning("[Feishu-Rules] Failed to read %s, using empty config", self._path)
data = {}
self._mtime = mtime
self._data = data
return data
_rules_cache = _MtimeCache(RULES_FILE)
_pairing_cache = _MtimeCache(PAIRING_FILE)
# ---------------------------------------------------------------------------
# Config parsing
# ---------------------------------------------------------------------------
def _parse_frozenset(raw: Any) -> Optional[frozenset]:
"""Parse a list of strings into a frozenset; return None if key absent."""
if raw is None:
return None
if isinstance(raw, (list, tuple)):
return frozenset(str(u).strip() for u in raw if str(u).strip())
return None
def _parse_document_rule(raw: dict) -> CommentDocumentRule:
enabled = raw.get("enabled")
if enabled is not None:
enabled = bool(enabled)
policy = raw.get("policy")
if policy is not None:
policy = str(policy).strip().lower()
if policy not in _VALID_POLICIES:
policy = None
allow_from = _parse_frozenset(raw.get("allow_from"))
return CommentDocumentRule(enabled=enabled, policy=policy, allow_from=allow_from)
def load_config() -> CommentsConfig:
"""Load comment rules from disk (mtime-cached)."""
raw = _rules_cache.load()
if not raw:
return CommentsConfig()
documents: Dict[str, CommentDocumentRule] = {}
raw_docs = raw.get("documents", {})
if isinstance(raw_docs, dict):
for key, rule_raw in raw_docs.items():
if isinstance(rule_raw, dict):
documents[str(key)] = _parse_document_rule(rule_raw)
policy = str(raw.get("policy", "pairing")).strip().lower()
if policy not in _VALID_POLICIES:
policy = "pairing"
return CommentsConfig(
enabled=raw.get("enabled", True),
policy=policy,
allow_from=_parse_frozenset(raw.get("allow_from")) or frozenset(),
documents=documents,
)
# ---------------------------------------------------------------------------
# Rule resolution (§8.4 field-by-field fallback)
# ---------------------------------------------------------------------------
def has_wiki_keys(cfg: CommentsConfig) -> bool:
"""Check if any document rule key starts with 'wiki:'."""
return any(k.startswith("wiki:") for k in cfg.documents)
def resolve_rule(
cfg: CommentsConfig,
file_type: str,
file_token: str,
wiki_token: str = "",
) -> ResolvedCommentRule:
"""Resolve effective rule: exact doc → wiki key → wildcard → top-level → defaults."""
exact_key = f"{file_type}:{file_token}"
exact = cfg.documents.get(exact_key)
exact_src = f"exact:{exact_key}"
if exact is None and wiki_token:
wiki_key = f"wiki:{wiki_token}"
exact = cfg.documents.get(wiki_key)
exact_src = f"exact:{wiki_key}"
wildcard = cfg.documents.get("*")
layers = []
if exact is not None:
layers.append((exact, exact_src))
if wildcard is not None:
layers.append((wildcard, "wildcard"))
def _pick(field_name: str):
for layer, source in layers:
val = getattr(layer, field_name)
if val is not None:
return val, source
return getattr(cfg, field_name), "top"
enabled, en_src = _pick("enabled")
policy, pol_src = _pick("policy")
allow_from, _ = _pick("allow_from")
# match_source = highest-priority tier that contributed any field
priority_order = {"exact": 0, "wildcard": 1, "top": 2}
best_src = min(
[en_src, pol_src],
key=lambda s: priority_order.get(s.split(":")[0], 3),
)
return ResolvedCommentRule(
enabled=enabled,
policy=policy,
allow_from=allow_from,
match_source=best_src,
)
# ---------------------------------------------------------------------------
# Pairing store
# ---------------------------------------------------------------------------
def _load_pairing_approved() -> set:
"""Return set of approved user open_ids (mtime-cached)."""
data = _pairing_cache.load()
approved = data.get("approved", {})
if isinstance(approved, dict):
return set(approved.keys())
if isinstance(approved, list):
return set(str(u) for u in approved if u)
return set()
def _save_pairing(data: dict) -> None:
PAIRING_FILE.parent.mkdir(parents=True, exist_ok=True)
tmp = PAIRING_FILE.with_suffix(".tmp")
with open(tmp, "w", encoding="utf-8") as f:
json.dump(data, f, indent=2, ensure_ascii=False)
tmp.replace(PAIRING_FILE)
# Invalidate cache so next load picks up change
_pairing_cache._mtime = 0.0
_pairing_cache._data = None
def pairing_add(user_open_id: str) -> bool:
"""Add a user to the pairing-approved list. Returns True if newly added."""
data = _pairing_cache.load()
approved = data.get("approved", {})
if not isinstance(approved, dict):
approved = {}
if user_open_id in approved:
return False
approved[user_open_id] = {"approved_at": time.time()}
data["approved"] = approved
_save_pairing(data)
return True
def pairing_remove(user_open_id: str) -> bool:
"""Remove a user from the pairing-approved list. Returns True if removed."""
data = _pairing_cache.load()
approved = data.get("approved", {})
if not isinstance(approved, dict):
return False
if user_open_id not in approved:
return False
del approved[user_open_id]
data["approved"] = approved
_save_pairing(data)
return True
def pairing_list() -> Dict[str, Any]:
"""Return the approved dict {user_open_id: {approved_at: ...}}."""
data = _pairing_cache.load()
approved = data.get("approved", {})
return dict(approved) if isinstance(approved, dict) else {}
# ---------------------------------------------------------------------------
# Access check (public API for feishu_comment.py)
# ---------------------------------------------------------------------------
def is_user_allowed(rule: ResolvedCommentRule, user_open_id: str) -> bool:
"""Check if user passes the resolved rule's policy gate."""
if user_open_id in rule.allow_from:
return True
if rule.policy == "pairing":
return user_open_id in _load_pairing_approved()
return False
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def _print_status() -> None:
cfg = load_config()
print(f"Rules file: {RULES_FILE}")
print(f" exists: {RULES_FILE.exists()}")
print(f"Pairing file: {PAIRING_FILE}")
print(f" exists: {PAIRING_FILE.exists()}")
print()
print(f"Top-level:")
print(f" enabled: {cfg.enabled}")
print(f" policy: {cfg.policy}")
print(f" allow_from: {sorted(cfg.allow_from) if cfg.allow_from else '[]'}")
print()
if cfg.documents:
print(f"Document rules ({len(cfg.documents)}):")
for key, rule in sorted(cfg.documents.items()):
parts = []
if rule.enabled is not None:
parts.append(f"enabled={rule.enabled}")
if rule.policy is not None:
parts.append(f"policy={rule.policy}")
if rule.allow_from is not None:
parts.append(f"allow_from={sorted(rule.allow_from)}")
print(f" [{key}] {', '.join(parts) if parts else '(empty — inherits all)'}")
else:
print("Document rules: (none)")
print()
approved = pairing_list()
print(f"Pairing approved ({len(approved)}):")
for uid, meta in sorted(approved.items()):
ts = meta.get("approved_at", 0)
print(f" {uid} (approved_at={ts})")
def _do_check(doc_key: str, user_open_id: str) -> None:
cfg = load_config()
parts = doc_key.split(":", 1)
if len(parts) != 2:
print(f"Error: doc_key must be 'fileType:fileToken', got '{doc_key}'")
return
file_type, file_token = parts
rule = resolve_rule(cfg, file_type, file_token)
allowed = is_user_allowed(rule, user_open_id)
print(f"Document: {doc_key}")
print(f"User: {user_open_id}")
print(f"Resolved rule:")
print(f" enabled: {rule.enabled}")
print(f" policy: {rule.policy}")
print(f" allow_from: {sorted(rule.allow_from) if rule.allow_from else '[]'}")
print(f" match_source: {rule.match_source}")
print(f"Result: {'ALLOWED' if allowed else 'DENIED'}")
def _main() -> int:
import sys
try:
from hermes_cli.env_loader import load_hermes_dotenv
load_hermes_dotenv()
except Exception:
pass
usage = (
"Usage: python -m gateway.platforms.feishu_comment_rules <command> [args]\n"
"\n"
"Commands:\n"
" status Show rules config and pairing state\n"
" check <fileType:token> <user> Simulate access check\n"
" pairing add <user_open_id> Add user to pairing-approved list\n"
" pairing remove <user_open_id> Remove user from pairing-approved list\n"
" pairing list List pairing-approved users\n"
"\n"
f"Rules config file: {RULES_FILE}\n"
" Edit this JSON file directly to configure policies and document rules.\n"
" Changes take effect on the next comment event (no restart needed).\n"
)
args = sys.argv[1:]
if not args:
print(usage)
return 1
cmd = args[0]
if cmd == "status":
_print_status()
elif cmd == "check":
if len(args) < 3:
print("Usage: check <fileType:fileToken> <user_open_id>")
return 1
_do_check(args[1], args[2])
elif cmd == "pairing":
if len(args) < 2:
print("Usage: pairing <add|remove|list> [args]")
return 1
sub = args[1]
if sub == "add":
if len(args) < 3:
print("Usage: pairing add <user_open_id>")
return 1
if pairing_add(args[2]):
print(f"Added: {args[2]}")
else:
print(f"Already approved: {args[2]}")
elif sub == "remove":
if len(args) < 3:
print("Usage: pairing remove <user_open_id>")
return 1
if pairing_remove(args[2]):
print(f"Removed: {args[2]}")
else:
print(f"Not in approved list: {args[2]}")
elif sub == "list":
approved = pairing_list()
if not approved:
print("(no approved users)")
for uid, meta in sorted(approved.items()):
print(f" {uid} approved_at={meta.get('approved_at', '?')}")
else:
print(f"Unknown pairing subcommand: {sub}")
return 1
else:
print(f"Unknown command: {cmd}\n")
print(usage)
return 1
return 0
if __name__ == "__main__":
import sys
sys.exit(_main())
+57
View File
@@ -0,0 +1,57 @@
"""
QQBot platform package.
Re-exports the main adapter symbols from ``adapter.py`` (the original
``qqbot.py``) so that **all existing import paths remain unchanged**::
from gateway.platforms.qqbot import QQAdapter # works
from gateway.platforms.qqbot import check_qq_requirements # works
New modules:
- ``constants`` shared constants (API URLs, timeouts, message types)
- ``utils`` User-Agent builder, config helpers
- ``crypto`` AES-256-GCM key generation and decryption
- ``onboard`` QR-code scan-to-configure flow
"""
# -- Adapter (original qqbot.py) ------------------------------------------
from .adapter import ( # noqa: F401
QQAdapter,
QQCloseError,
check_qq_requirements,
_coerce_list,
_ssrf_redirect_guard,
)
# -- Onboard (QR-code scan-to-configure) -----------------------------------
from .onboard import ( # noqa: F401
BindStatus,
create_bind_task,
poll_bind_result,
build_connect_url,
)
from .crypto import decrypt_secret, generate_bind_key # noqa: F401
# -- Utils -----------------------------------------------------------------
from .utils import build_user_agent, get_api_headers, coerce_list # noqa: F401
__all__ = [
# adapter
"QQAdapter",
"QQCloseError",
"check_qq_requirements",
"_coerce_list",
"_ssrf_redirect_guard",
# onboard
"BindStatus",
"create_bind_task",
"poll_bind_result",
"build_connect_url",
# crypto
"decrypt_secret",
"generate_bind_key",
# utils
"build_user_agent",
"get_api_headers",
"coerce_list",
]
File diff suppressed because it is too large Load Diff
+74
View File
@@ -0,0 +1,74 @@
"""QQBot package-level constants shared across adapter, onboard, and other modules."""
from __future__ import annotations
import os
# ---------------------------------------------------------------------------
# QQBot adapter version — bump on functional changes to the adapter package.
# ---------------------------------------------------------------------------
QQBOT_VERSION = "1.1.0"
# ---------------------------------------------------------------------------
# API endpoints
# ---------------------------------------------------------------------------
# The portal domain is configurable via QQ_API_HOST for corporate proxies
# or test environments. Default: q.qq.com (production).
PORTAL_HOST = os.getenv("QQ_PORTAL_HOST", "q.qq.com")
API_BASE = "https://api.sgroup.qq.com"
TOKEN_URL = "https://bots.qq.com/app/getAppAccessToken"
GATEWAY_URL_PATH = "/gateway"
# QR-code onboard endpoints (on the portal host)
ONBOARD_CREATE_PATH = "/lite/create_bind_task"
ONBOARD_POLL_PATH = "/lite/poll_bind_result"
QR_URL_TEMPLATE = (
"https://q.qq.com/qqbot/openclaw/connect.html"
"?task_id={task_id}&_wv=2&source=hermes"
)
# ---------------------------------------------------------------------------
# Timeouts & retry
# ---------------------------------------------------------------------------
DEFAULT_API_TIMEOUT = 30.0
FILE_UPLOAD_TIMEOUT = 120.0
CONNECT_TIMEOUT_SECONDS = 20.0
RECONNECT_BACKOFF = [2, 5, 10, 30, 60]
MAX_RECONNECT_ATTEMPTS = 100
RATE_LIMIT_DELAY = 60 # seconds
QUICK_DISCONNECT_THRESHOLD = 5.0 # seconds
MAX_QUICK_DISCONNECT_COUNT = 3
ONBOARD_POLL_INTERVAL = 2.0 # seconds between poll_bind_result calls
ONBOARD_API_TIMEOUT = 10.0
# ---------------------------------------------------------------------------
# Message limits
# ---------------------------------------------------------------------------
MAX_MESSAGE_LENGTH = 4000
DEDUP_WINDOW_SECONDS = 300
DEDUP_MAX_SIZE = 1000
# ---------------------------------------------------------------------------
# QQ Bot message types
# ---------------------------------------------------------------------------
MSG_TYPE_TEXT = 0
MSG_TYPE_MARKDOWN = 2
MSG_TYPE_MEDIA = 7
MSG_TYPE_INPUT_NOTIFY = 6
# ---------------------------------------------------------------------------
# QQ Bot file media types
# ---------------------------------------------------------------------------
MEDIA_TYPE_IMAGE = 1
MEDIA_TYPE_VIDEO = 2
MEDIA_TYPE_VOICE = 3
MEDIA_TYPE_FILE = 4
+45
View File
@@ -0,0 +1,45 @@
"""AES-256-GCM utilities for QQBot scan-to-configure credential decryption."""
from __future__ import annotations
import base64
import os
def generate_bind_key() -> str:
"""Generate a 256-bit random AES key and return it as base64.
The key is passed to ``create_bind_task`` so the server can encrypt
the bot's *client_secret* before returning it. Only this CLI holds
the key, ensuring the secret never travels in plaintext.
"""
return base64.b64encode(os.urandom(32)).decode()
def decrypt_secret(encrypted_base64: str, key_base64: str) -> str:
"""Decrypt a base64-encoded AES-256-GCM ciphertext.
Ciphertext layout (after base64-decoding)::
IV (12 bytes) ciphertext (N bytes) AuthTag (16 bytes)
Args:
encrypted_base64: The ``bot_encrypt_secret`` value from
``poll_bind_result``.
key_base64: The base64 AES key generated by
:func:`generate_bind_key`.
Returns:
The decrypted *client_secret* as a UTF-8 string.
"""
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
key = base64.b64decode(key_base64)
raw = base64.b64decode(encrypted_base64)
iv = raw[:12]
ciphertext_with_tag = raw[12:] # AESGCM expects ciphertext + tag concatenated
aesgcm = AESGCM(key)
plaintext = aesgcm.decrypt(iv, ciphertext_with_tag, None)
return plaintext.decode("utf-8")
+124
View File
@@ -0,0 +1,124 @@
"""
QQBot scan-to-configure (QR code onboard) module.
Calls the ``q.qq.com`` ``create_bind_task`` / ``poll_bind_result`` APIs to
generate a QR-code URL and poll for scan completion. On success the caller
receives the bot's *app_id*, *client_secret* (decrypted locally), and the
scanner's *user_openid* — enough to fully configure the QQBot gateway.
Reference: https://bot.q.qq.com/wiki/develop/api-v2/
"""
from __future__ import annotations
import logging
from enum import IntEnum
from typing import Tuple
from urllib.parse import quote
from .constants import (
ONBOARD_API_TIMEOUT,
ONBOARD_CREATE_PATH,
ONBOARD_POLL_PATH,
PORTAL_HOST,
QR_URL_TEMPLATE,
)
from .crypto import generate_bind_key
from .utils import get_api_headers
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Bind status
# ---------------------------------------------------------------------------
class BindStatus(IntEnum):
"""Status codes returned by ``poll_bind_result``."""
NONE = 0
PENDING = 1
COMPLETED = 2
EXPIRED = 3
# ---------------------------------------------------------------------------
# Public API
# ---------------------------------------------------------------------------
async def create_bind_task(
timeout: float = ONBOARD_API_TIMEOUT,
) -> Tuple[str, str]:
"""Create a bind task and return *(task_id, aes_key_base64)*.
The AES key is generated locally and sent to the server so it can
encrypt the bot credentials before returning them.
Raises:
RuntimeError: If the API returns a non-zero ``retcode``.
"""
import httpx
url = f"https://{PORTAL_HOST}{ONBOARD_CREATE_PATH}"
key = generate_bind_key()
async with httpx.AsyncClient(timeout=timeout, follow_redirects=True) as client:
resp = await client.post(url, json={"key": key}, headers=get_api_headers())
resp.raise_for_status()
data = resp.json()
if data.get("retcode") != 0:
raise RuntimeError(data.get("msg", "create_bind_task failed"))
task_id = data.get("data", {}).get("task_id")
if not task_id:
raise RuntimeError("create_bind_task: missing task_id in response")
logger.debug("create_bind_task ok: task_id=%s", task_id)
return task_id, key
async def poll_bind_result(
task_id: str,
timeout: float = ONBOARD_API_TIMEOUT,
) -> Tuple[BindStatus, str, str, str]:
"""Poll the bind result for *task_id*.
Returns:
A 4-tuple of ``(status, bot_appid, bot_encrypt_secret, user_openid)``.
* ``bot_encrypt_secret`` is AES-256-GCM encrypted decrypt it with
:func:`~gateway.platforms.qqbot.crypto.decrypt_secret` using the
key from :func:`create_bind_task`.
* ``user_openid`` is the OpenID of the person who scanned the code
(available when ``status == COMPLETED``).
Raises:
RuntimeError: If the API returns a non-zero ``retcode``.
"""
import httpx
url = f"https://{PORTAL_HOST}{ONBOARD_POLL_PATH}"
async with httpx.AsyncClient(timeout=timeout, follow_redirects=True) as client:
resp = await client.post(url, json={"task_id": task_id}, headers=get_api_headers())
resp.raise_for_status()
data = resp.json()
if data.get("retcode") != 0:
raise RuntimeError(data.get("msg", "poll_bind_result failed"))
d = data.get("data", {})
return (
BindStatus(d.get("status", 0)),
str(d.get("bot_appid", "")),
d.get("bot_encrypt_secret", ""),
d.get("user_openid", ""),
)
def build_connect_url(task_id: str) -> str:
"""Build the QR-code target URL for a given *task_id*."""
return QR_URL_TEMPLATE.format(task_id=quote(task_id))
+71
View File
@@ -0,0 +1,71 @@
"""QQBot shared utilities — User-Agent, HTTP helpers, config coercion."""
from __future__ import annotations
import platform
import sys
from typing import Any, Dict, List
from .constants import QQBOT_VERSION
# ---------------------------------------------------------------------------
# User-Agent
# ---------------------------------------------------------------------------
def _get_hermes_version() -> str:
"""Return the hermes-agent package version, or 'dev' if unavailable."""
try:
from importlib.metadata import version
return version("hermes-agent")
except Exception:
return "dev"
def build_user_agent() -> str:
"""Build a descriptive User-Agent string.
Format::
QQBotAdapter/<qqbot_version> (Python/<py_version>; <os>; Hermes/<hermes_version>)
Example::
QQBotAdapter/1.0.0 (Python/3.11.15; darwin; Hermes/0.9.0)
"""
py_version = f"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}"
os_name = platform.system().lower()
hermes_version = _get_hermes_version()
return f"QQBotAdapter/{QQBOT_VERSION} (Python/{py_version}; {os_name}; Hermes/{hermes_version})"
def get_api_headers() -> Dict[str, str]:
"""Return standard HTTP headers for QQBot API requests.
Includes ``Content-Type``, ``Accept``, and a dynamic ``User-Agent``.
``q.qq.com`` requires ``Accept: application/json`` without it,
the server returns a JavaScript anti-bot challenge page.
"""
return {
"Content-Type": "application/json",
"Accept": "application/json",
"User-Agent": build_user_agent(),
}
# ---------------------------------------------------------------------------
# Config helpers
# ---------------------------------------------------------------------------
def coerce_list(value: Any) -> List[str]:
"""Coerce config values into a trimmed string list.
Accepts comma-separated strings, lists, tuples, sets, or single values.
"""
if value is None:
return []
if isinstance(value, str):
return [item.strip() for item in value.split(",") if item.strip()]
if isinstance(value, (list, tuple, set)):
return [str(item).strip() for item in value if str(item).strip()]
return [str(value).strip()] if str(value).strip() else []
+78 -6
View File
@@ -160,6 +160,14 @@ class SignalAdapter(BasePlatformAdapter):
self._sse_task: Optional[asyncio.Task] = None
self._health_monitor_task: Optional[asyncio.Task] = None
self._typing_tasks: Dict[str, asyncio.Task] = {}
# Per-chat typing-indicator backoff. When signal-cli reports
# NETWORK_FAILURE (recipient offline / unroutable), base.py's
# _keep_typing refresh loop would otherwise hammer sendTyping every
# ~2s indefinitely, producing WARNING-level log spam and pointless
# RPC traffic. We track consecutive failures per chat and skip the
# RPC during a cooldown window instead.
self._typing_failures: Dict[str, int] = {}
self._typing_skip_until: Dict[str, float] = {}
self._running = False
self._last_sse_activity = 0.0
self._sse_response: Optional[httpx.Response] = None
@@ -548,8 +556,22 @@ class SignalAdapter(BasePlatformAdapter):
# JSON-RPC Communication
# ------------------------------------------------------------------
async def _rpc(self, method: str, params: dict, rpc_id: str = None) -> Any:
"""Send a JSON-RPC 2.0 request to signal-cli daemon."""
async def _rpc(
self,
method: str,
params: dict,
rpc_id: str = None,
*,
log_failures: bool = True,
) -> Any:
"""Send a JSON-RPC 2.0 request to signal-cli daemon.
When ``log_failures=False``, error and exception paths log at DEBUG
instead of WARNING used by the typing-indicator path to silence
repeated NETWORK_FAILURE spam for unreachable recipients while
still preserving visibility for the first occurrence and for
unrelated RPCs.
"""
if not self.client:
logger.warning("Signal: RPC called but client not connected")
return None
@@ -574,13 +596,19 @@ class SignalAdapter(BasePlatformAdapter):
data = resp.json()
if "error" in data:
logger.warning("Signal RPC error (%s): %s", method, data["error"])
if log_failures:
logger.warning("Signal RPC error (%s): %s", method, data["error"])
else:
logger.debug("Signal RPC error (%s): %s", method, data["error"])
return None
return data.get("result")
except Exception as e:
logger.warning("Signal RPC %s failed: %s", method, e)
if log_failures:
logger.warning("Signal RPC %s failed: %s", method, e)
else:
logger.debug("Signal RPC %s failed: %s", method, e)
return None
# ------------------------------------------------------------------
@@ -627,7 +655,28 @@ class SignalAdapter(BasePlatformAdapter):
self._recent_sent_timestamps.pop()
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Send a typing indicator."""
"""Send a typing indicator.
base.py's ``_keep_typing`` refresh loop calls this every ~2s while
the agent is processing. If signal-cli returns NETWORK_FAILURE for
this recipient (offline, unroutable, group membership lost, etc.)
the unmitigated behaviour is: a WARNING log every 2 seconds for as
long as the agent keeps running. Instead we:
- silence the WARNING after the first consecutive failure (subsequent
attempts log at DEBUG) so transport issues are still visible once
but don't flood the log,
- skip the RPC entirely during an exponential cooldown window once
three consecutive failures have happened, so we stop hammering
signal-cli with requests it can't deliver.
A successful sendTyping clears the counters.
"""
now = time.monotonic()
skip_until = self._typing_skip_until.get(chat_id, 0.0)
if now < skip_until:
return
params: Dict[str, Any] = {
"account": self.account,
}
@@ -637,7 +686,26 @@ class SignalAdapter(BasePlatformAdapter):
else:
params["recipient"] = [chat_id]
await self._rpc("sendTyping", params, rpc_id="typing")
fails = self._typing_failures.get(chat_id, 0)
result = await self._rpc(
"sendTyping",
params,
rpc_id="typing",
log_failures=(fails == 0),
)
if result is None:
fails += 1
self._typing_failures[chat_id] = fails
# After 3 consecutive failures, back off exponentially (16s,
# 32s, 60s cap) to stop spamming signal-cli for a recipient
# that clearly isn't reachable right now.
if fails >= 3:
backoff = min(60.0, 16.0 * (2 ** (fails - 3)))
self._typing_skip_until[chat_id] = now + backoff
else:
self._typing_failures.pop(chat_id, None)
self._typing_skip_until.pop(chat_id, None)
async def send_image(
self,
@@ -789,6 +857,10 @@ class SignalAdapter(BasePlatformAdapter):
await task
except asyncio.CancelledError:
pass
# Reset per-chat typing backoff state so the next agent turn starts
# fresh rather than inheriting a cooldown from a prior conversation.
self._typing_failures.pop(chat_id, None)
self._typing_skip_until.pop(chat_id, None)
async def stop_typing(self, chat_id: str) -> None:
"""Public interface for stopping typing — called by base adapter's
+102 -6
View File
@@ -118,6 +118,84 @@ def _strip_mdv2(text: str) -> str:
return cleaned
# ---------------------------------------------------------------------------
# Markdown table → code block conversion
# ---------------------------------------------------------------------------
# Telegram's MarkdownV2 has no table syntax — '|' is just an escaped literal,
# so pipe tables render as noisy backslash-pipe text with no alignment.
# Wrapping the table in a fenced code block makes Telegram render it as
# monospace preformatted text with columns intact.
# Matches a GFM table delimiter row: optional outer pipes, cells containing
# only dashes (with optional leading/trailing colons for alignment) separated
# by '|'. Requires at least one internal '|' so lone '---' horizontal rules
# are NOT matched.
_TABLE_SEPARATOR_RE = re.compile(
r'^\s*\|?\s*:?-+:?\s*(?:\|\s*:?-+:?\s*){1,}\|?\s*$'
)
def _is_table_row(line: str) -> bool:
"""Return True if *line* could plausibly be a table data row."""
stripped = line.strip()
return bool(stripped) and '|' in stripped
def _wrap_markdown_tables(text: str) -> str:
"""Wrap GFM-style pipe tables in ``` fences so Telegram renders them.
Detected by a row containing '|' immediately followed by a delimiter
row matching :data:`_TABLE_SEPARATOR_RE`. Subsequent pipe-containing
non-blank lines are consumed as the table body and included in the
wrapped block. Tables inside existing fenced code blocks are left
alone.
"""
if '|' not in text or '-' not in text:
return text
lines = text.split('\n')
out: list[str] = []
in_fence = False
i = 0
while i < len(lines):
line = lines[i]
stripped = line.lstrip()
# Track existing fenced code blocks — never touch content inside.
if stripped.startswith('```'):
in_fence = not in_fence
out.append(line)
i += 1
continue
if in_fence:
out.append(line)
i += 1
continue
# Look for a header row (contains '|') immediately followed by a
# delimiter row.
if (
'|' in line
and i + 1 < len(lines)
and _TABLE_SEPARATOR_RE.match(lines[i + 1])
):
table_block = [line, lines[i + 1]]
j = i + 2
while j < len(lines) and _is_table_row(lines[j]):
table_block.append(lines[j])
j += 1
out.append('```')
out.extend(table_block)
out.append('```')
i = j
continue
out.append(line)
i += 1
return '\n'.join(out)
class TelegramAdapter(BasePlatformAdapter):
"""
Telegram bot adapter.
@@ -1916,6 +1994,12 @@ class TelegramAdapter(BasePlatformAdapter):
text = content
# 0) Pre-wrap GFM-style pipe tables in ``` fences. Telegram can't
# render tables natively, but fenced code blocks render as
# monospace preformatted text with columns intact. The wrapped
# tables then flow through step (1) below as protected regions.
text = _wrap_markdown_tables(text)
# 1) Protect fenced code blocks (``` ... ```)
# Per MarkdownV2 spec, \ and ` inside pre/code must be escaped.
def _protect_fenced(m):
@@ -2242,7 +2326,7 @@ class TelegramAdapter(BasePlatformAdapter):
if not self._should_process_message(update.message):
return
event = self._build_message_event(update.message, MessageType.TEXT)
event = self._build_message_event(update.message, MessageType.TEXT, update_id=update.update_id)
event.text = self._clean_bot_trigger_text(event.text)
self._enqueue_text_event(event)
@@ -2253,7 +2337,7 @@ class TelegramAdapter(BasePlatformAdapter):
if not self._should_process_message(update.message, is_command=True):
return
event = self._build_message_event(update.message, MessageType.COMMAND)
event = self._build_message_event(update.message, MessageType.COMMAND, update_id=update.update_id)
await self.handle_message(event)
async def _handle_location_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
@@ -2289,7 +2373,7 @@ class TelegramAdapter(BasePlatformAdapter):
parts.append(f"Map: https://www.google.com/maps/search/?api=1&query={lat},{lon}")
parts.append("Ask what they'd like to find nearby (restaurants, cafes, etc.) and any preferences.")
event = self._build_message_event(msg, MessageType.LOCATION)
event = self._build_message_event(msg, MessageType.LOCATION, update_id=update.update_id)
event.text = "\n".join(parts)
await self.handle_message(event)
@@ -2440,7 +2524,7 @@ class TelegramAdapter(BasePlatformAdapter):
else:
msg_type = MessageType.DOCUMENT
event = self._build_message_event(msg, msg_type)
event = self._build_message_event(msg, msg_type, update_id=update.update_id)
# Add caption as text
if msg.caption:
@@ -2779,8 +2863,19 @@ class TelegramAdapter(BasePlatformAdapter):
self.name, cache_key, thread_id,
)
def _build_message_event(self, message: Message, msg_type: MessageType) -> MessageEvent:
"""Build a MessageEvent from a Telegram message."""
def _build_message_event(
self,
message: Message,
msg_type: MessageType,
update_id: Optional[int] = None,
) -> MessageEvent:
"""Build a MessageEvent from a Telegram message.
``update_id`` is the ``Update.update_id`` from PTB; passing it through
lets ``/restart`` record the triggering offset so the new gateway
process can advance past it (prevents ``/restart`` being re-delivered
when PTB's graceful-shutdown ACK fails).
"""
chat = message.chat
user = message.from_user
@@ -2859,6 +2954,7 @@ class TelegramAdapter(BasePlatformAdapter):
source=source,
raw_message=message,
message_id=str(message.message_id),
platform_update_id=update_id,
reply_to_message_id=reply_to_id,
reply_to_text=reply_to_text,
auto_skill=topic_skill,
+41 -10
View File
@@ -180,6 +180,8 @@ class WeComAdapter(BasePlatformAdapter):
self._text_batch_split_delay_seconds = float(os.getenv("HERMES_WECOM_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0"))
self._pending_text_batches: Dict[str, MessageEvent] = {}
self._pending_text_batch_tasks: Dict[str, asyncio.Task] = {}
self._device_id = uuid.uuid4().hex
self._last_chat_req_ids: Dict[str, str] = {}
# ------------------------------------------------------------------
# Connection lifecycle
@@ -277,7 +279,11 @@ class WeComAdapter(BasePlatformAdapter):
{
"cmd": APP_CMD_SUBSCRIBE,
"headers": {"req_id": req_id},
"body": {"bot_id": self._bot_id, "secret": self._secret},
"body": {
"bot_id": self._bot_id,
"secret": self._secret,
"device_id": self._device_id,
},
}
)
@@ -496,6 +502,11 @@ class WeComAdapter(BasePlatformAdapter):
logger.debug("[%s] DM sender %s blocked by policy", self.name, sender_id)
return
# Cache the inbound req_id after policy checks so proactive sends to
# this chat can fall back to APP_CMD_RESPONSE (required for groups —
# WeCom AI Bots cannot initiate APP_CMD_SEND in group chats).
self._remember_chat_req_id(chat_id, self._payload_req_id(payload))
text, reply_text = self._extract_text(body)
media_urls, media_types = await self._extract_media(body)
message_type = self._derive_message_type(body, text, media_types)
@@ -847,6 +858,23 @@ class WeComAdapter(BasePlatformAdapter):
while len(self._reply_req_ids) > DEDUP_MAX_SIZE:
self._reply_req_ids.pop(next(iter(self._reply_req_ids)))
def _remember_chat_req_id(self, chat_id: str, req_id: str) -> None:
"""Cache the most recent inbound req_id per chat.
Used as a fallback reply target when we need to send into a group
without an explicit ``reply_to`` WeCom AI Bots are blocked from
APP_CMD_SEND in groups and must use APP_CMD_RESPONSE bound to some
prior req_id. Bounded like _reply_req_ids so long-running gateways
don't leak memory across many chats.
"""
normalized_chat_id = str(chat_id or "").strip()
normalized_req_id = str(req_id or "").strip()
if not normalized_chat_id or not normalized_req_id:
return
self._last_chat_req_ids[normalized_chat_id] = normalized_req_id
while len(self._last_chat_req_ids) > DEDUP_MAX_SIZE:
self._last_chat_req_ids.pop(next(iter(self._last_chat_req_ids)))
def _reply_req_id_for_message(self, reply_to: Optional[str]) -> Optional[str]:
normalized = str(reply_to or "").strip()
if not normalized or normalized.startswith("quote:"):
@@ -1163,19 +1191,15 @@ class WeComAdapter(BasePlatformAdapter):
self._raise_for_wecom_error(response, "send media message")
return response
async def _send_reply_stream(self, reply_req_id: str, content: str) -> Dict[str, Any]:
async def _send_reply_markdown(self, reply_req_id: str, content: str) -> Dict[str, Any]:
response = await self._send_reply_request(
reply_req_id,
{
"msgtype": "stream",
"stream": {
"id": self._new_req_id("stream"),
"finish": True,
"content": content[:self.MAX_MESSAGE_LENGTH],
},
"msgtype": "markdown",
"markdown": {"content": content[:self.MAX_MESSAGE_LENGTH]},
},
)
self._raise_for_wecom_error(response, "send reply stream")
self._raise_for_wecom_error(response, "send reply markdown")
return response
async def _send_reply_media_message(
@@ -1235,6 +1259,9 @@ class WeComAdapter(BasePlatformAdapter):
return SendResult(success=False, error=prepared["reject_reason"])
reply_req_id = self._reply_req_id_for_message(reply_to)
if not reply_req_id and chat_id in self._last_chat_req_ids:
reply_req_id = self._last_chat_req_ids[chat_id]
try:
upload_result = await self._upload_media_bytes(
prepared["data"],
@@ -1302,8 +1329,12 @@ class WeComAdapter(BasePlatformAdapter):
try:
reply_req_id = self._reply_req_id_for_message(reply_to)
if not reply_req_id and chat_id in self._last_chat_req_ids:
reply_req_id = self._last_chat_req_ids[chat_id]
if reply_req_id:
response = await self._send_reply_stream(reply_req_id, content)
response = await self._send_reply_markdown(reply_req_id, content)
else:
response = await self._send_request(
APP_CMD_SEND,
+310 -86
View File
@@ -28,7 +28,7 @@ import uuid
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from urllib.parse import quote
from urllib.parse import quote, urlparse
logger = logging.getLogger(__name__)
@@ -96,6 +96,28 @@ MEDIA_VIDEO = 2
MEDIA_FILE = 3
MEDIA_VOICE = 4
_LIVE_ADAPTERS: Dict[str, Any] = {}
def _make_ssl_connector() -> Optional["aiohttp.TCPConnector"]:
"""Return a TCPConnector with a certifi CA bundle, or None if certifi is unavailable.
Tencent's iLink server (``ilinkai.weixin.qq.com``) is not verifiable against
some system CA stores (notably Homebrew's OpenSSL on macOS Apple Silicon).
When ``certifi`` is installed, use its Mozilla CA bundle to guarantee
verification. Otherwise fall back to aiohttp's default (which honors
``SSL_CERT_FILE`` env var via ``trust_env=True``).
"""
try:
import ssl
import certifi
except ImportError:
return None
if not AIOHTTP_AVAILABLE:
return None
ssl_ctx = ssl.create_default_context(cafile=certifi.where())
return aiohttp.TCPConnector(ssl=ssl_ctx)
ITEM_TEXT = 1
ITEM_IMAGE = 2
ITEM_VOICE = 3
@@ -398,7 +420,12 @@ async def _send_message(
text: str,
context_token: Optional[str],
client_id: str,
) -> None:
) -> Dict[str, Any]:
"""Send a text message via iLink sendmessage API.
Returns the raw API response dict (may contain error codes like
``errcode: -14`` for session expiry that the caller can inspect).
"""
if not text or not text.strip():
raise ValueError("_send_message: text must not be empty")
message: Dict[str, Any] = {
@@ -411,7 +438,7 @@ async def _send_message(
}
if context_token:
message["context_token"] = context_token
await _api_post(
return await _api_post(
session,
base_url=base_url,
endpoint=EP_SEND_MESSAGE,
@@ -533,6 +560,39 @@ async def _download_bytes(
return await response.read()
_WEIXIN_CDN_ALLOWLIST: frozenset[str] = frozenset(
{
"novac2c.cdn.weixin.qq.com",
"ilinkai.weixin.qq.com",
"wx.qlogo.cn",
"thirdwx.qlogo.cn",
"res.wx.qq.com",
"mmbiz.qpic.cn",
"mmbiz.qlogo.cn",
}
)
def _assert_weixin_cdn_url(url: str) -> None:
"""Raise ValueError if *url* does not point at a known WeChat CDN host."""
try:
parsed = urlparse(url)
scheme = parsed.scheme.lower()
host = parsed.hostname or ""
except Exception as exc: # noqa: BLE001
raise ValueError(f"Unparseable media URL: {url!r}") from exc
if scheme not in ("http", "https"):
raise ValueError(
f"Media URL has disallowed scheme {scheme!r}; only http/https are permitted."
)
if host not in _WEIXIN_CDN_ALLOWLIST:
raise ValueError(
f"Media URL host {host!r} is not in the WeChat CDN allowlist. "
"Refusing to fetch to prevent SSRF."
)
def _media_reference(item: Dict[str, Any], key: str) -> Dict[str, Any]:
return (item.get(key) or {}).get("media") or {}
@@ -553,6 +613,7 @@ async def _download_and_decrypt_media(
timeout_seconds=timeout_seconds,
)
elif full_url:
_assert_weixin_cdn_url(full_url)
raw = await _download_bytes(session, url=full_url, timeout_seconds=timeout_seconds)
else:
raise RuntimeError("media item had neither encrypt_query_param nor full_url")
@@ -623,42 +684,31 @@ def _rewrite_table_block_for_weixin(lines: List[str]) -> str:
def _normalize_markdown_blocks(content: str) -> str:
lines = content.splitlines()
result: List[str] = []
i = 0
in_code_block = False
blank_run = 0
while i < len(lines):
line = lines[i].rstrip()
fence_match = _FENCE_RE.match(line.strip())
if fence_match:
for raw_line in lines:
line = raw_line.rstrip()
if _FENCE_RE.match(line.strip()):
in_code_block = not in_code_block
result.append(line)
i += 1
blank_run = 0
continue
if in_code_block:
result.append(line)
i += 1
continue
if (
i + 1 < len(lines)
and "|" in lines[i]
and _TABLE_RULE_RE.match(lines[i + 1].rstrip())
):
table_lines = [lines[i].rstrip(), lines[i + 1].rstrip()]
i += 2
while i < len(lines) and "|" in lines[i]:
table_lines.append(lines[i].rstrip())
i += 1
result.append(_rewrite_table_block_for_weixin(table_lines))
if not line.strip():
blank_run += 1
if blank_run <= 1:
result.append("")
continue
result.append(_MARKDOWN_LINK_RE.sub(r"\1 (\2)", _rewrite_headers_for_weixin(line)))
i += 1
blank_run = 0
result.append(line)
normalized = "\n".join(item.rstrip() for item in result)
normalized = re.sub(r"\n{3,}", "\n\n", normalized)
return normalized.strip()
return "\n".join(result).strip()
def _split_markdown_blocks(content: str) -> List[str]:
@@ -704,8 +754,8 @@ def _split_delivery_units_for_weixin(content: str) -> List[str]:
Weixin can render Markdown, but chat readability is better when top-level
line breaks become separate messages. Keep fenced code blocks intact and
attach indented continuation lines to the previous top-level line so
transformed tables/lists do not get torn apart.
attach indented continuation lines to the previous top-level line so nested
list items do not get torn apart.
"""
units: List[str] = []
@@ -747,7 +797,9 @@ def _looks_like_chatty_line_for_weixin(line: str) -> bool:
return False
if line.startswith((" ", "\t")):
return False
if stripped.startswith((">", "-", "*", "")):
if stripped.startswith((">", "-", "*", "", "#", "|")):
return False
if _TABLE_RULE_RE.match(stripped):
return False
if re.match(r"^\*\*[^*]+\*\*$", stripped):
return False
@@ -757,10 +809,12 @@ def _looks_like_chatty_line_for_weixin(line: str) -> bool:
def _looks_like_heading_line_for_weixin(line: str) -> bool:
"""Return True when a short line behaves like a plain-text heading."""
"""Return True when a short line behaves like a heading."""
stripped = line.strip()
if not stripped:
return False
if _HEADER_RE.match(stripped):
return True
return len(stripped) <= 24 and stripped.endswith((":", ""))
@@ -935,7 +989,7 @@ async def qr_login(
if not AIOHTTP_AVAILABLE:
raise RuntimeError("aiohttp is required for Weixin QR login")
async with aiohttp.ClientSession(trust_env=True) as session:
async with aiohttp.ClientSession(trust_env=True, connector=_make_ssl_connector()) as session:
try:
qr_resp = await _api_get(
session,
@@ -953,6 +1007,10 @@ async def qr_login(
logger.error("weixin: QR response missing qrcode")
return None
# qrcode_url is the full scannable liteapp URL; qrcode_value is just the hex token
# WeChat needs to scan the full URL, not the raw hex string
qr_scan_data = qrcode_url if qrcode_url else qrcode_value
print("\n请使用微信扫描以下二维码:")
if qrcode_url:
print(qrcode_url)
@@ -960,11 +1018,11 @@ async def qr_login(
import qrcode
qr = qrcode.QRCode()
qr.add_data(qrcode_url or qrcode_value)
qr.add_data(qr_scan_data)
qr.make(fit=True)
qr.print_ascii(invert=True)
except Exception:
print("(终端二维码渲染失败,请直接打开上面的二维码链接)")
except Exception as _qr_exc:
print(f"(终端二维码渲染失败: {_qr_exc},请直接打开上面的二维码链接)")
deadline = time.time() + timeout_seconds
current_base_url = ILINK_BASE_URL
@@ -1010,8 +1068,17 @@ async def qr_login(
)
qrcode_value = str(qr_resp.get("qrcode") or "")
qrcode_url = str(qr_resp.get("qrcode_img_content") or "")
qr_scan_data = qrcode_url if qrcode_url else qrcode_value
if qrcode_url:
print(qrcode_url)
try:
import qrcode as _qrcode
qr = _qrcode.QRCode()
qr.add_data(qr_scan_data)
qr.make(fit=True)
qr.print_ascii(invert=True)
except Exception:
pass
except Exception as exc:
logger.error("weixin: QR refresh failed: %s", exc)
return None
@@ -1059,7 +1126,8 @@ class WeixinAdapter(BasePlatformAdapter):
self._hermes_home = hermes_home
self._token_store = ContextTokenStore(hermes_home)
self._typing_cache = TypingTicketCache()
self._session: Optional[aiohttp.ClientSession] = None
self._poll_session: Optional[aiohttp.ClientSession] = None
self._send_session: Optional[aiohttp.ClientSession] = None
self._poll_task: Optional[asyncio.Task] = None
self._dedup = MessageDeduplicator(ttl_seconds=MESSAGE_DEDUP_TTL_SECONDS)
@@ -1134,14 +1202,17 @@ class WeixinAdapter(BasePlatformAdapter):
except Exception as exc:
logger.debug("[%s] Token lock unavailable (non-fatal): %s", self.name, exc)
self._session = aiohttp.ClientSession(trust_env=True)
self._poll_session = aiohttp.ClientSession(trust_env=True, connector=_make_ssl_connector())
self._send_session = aiohttp.ClientSession(trust_env=True, connector=_make_ssl_connector())
self._token_store.restore(self._account_id)
self._poll_task = asyncio.create_task(self._poll_loop(), name="weixin-poll")
self._mark_connected()
_LIVE_ADAPTERS[self._token] = self
logger.info("[%s] Connected account=%s base=%s", self.name, _safe_id(self._account_id), self._base_url)
return True
async def disconnect(self) -> None:
_LIVE_ADAPTERS.pop(self._token, None)
self._running = False
if self._poll_task and not self._poll_task.done():
self._poll_task.cancel()
@@ -1150,15 +1221,18 @@ class WeixinAdapter(BasePlatformAdapter):
except asyncio.CancelledError:
pass
self._poll_task = None
if self._session and not self._session.closed:
await self._session.close()
self._session = None
if self._poll_session and not self._poll_session.closed:
await self._poll_session.close()
self._poll_session = None
if self._send_session and not self._send_session.closed:
await self._send_session.close()
self._send_session = None
self._release_platform_lock()
self._mark_disconnected()
logger.info("[%s] Disconnected", self.name)
async def _poll_loop(self) -> None:
assert self._session is not None
assert self._poll_session is not None
sync_buf = _load_sync_buf(self._hermes_home, self._account_id)
timeout_ms = LONG_POLL_TIMEOUT_MS
consecutive_failures = 0
@@ -1166,7 +1240,7 @@ class WeixinAdapter(BasePlatformAdapter):
while self._running:
try:
response = await _get_updates(
self._session,
self._poll_session,
base_url=self._base_url,
token=self._token,
sync_buf=sync_buf,
@@ -1223,7 +1297,7 @@ class WeixinAdapter(BasePlatformAdapter):
logger.error("[%s] unhandled inbound error from=%s: %s", self.name, _safe_id(message.get("from_user_id")), exc, exc_info=True)
async def _process_message(self, message: Dict[str, Any]) -> None:
assert self._session is not None
assert self._poll_session is not None
sender_id = str(message.get("from_user_id") or "").strip()
if not sender_id:
return
@@ -1316,7 +1390,7 @@ class WeixinAdapter(BasePlatformAdapter):
media = _media_reference(item, "image_item")
try:
data = await _download_and_decrypt_media(
self._session,
self._poll_session,
cdn_base_url=self._cdn_base_url,
encrypted_query_param=media.get("encrypt_query_param"),
aes_key_b64=(item.get("image_item") or {}).get("aeskey")
@@ -1334,7 +1408,7 @@ class WeixinAdapter(BasePlatformAdapter):
media = _media_reference(item, "video_item")
try:
data = await _download_and_decrypt_media(
self._session,
self._poll_session,
cdn_base_url=self._cdn_base_url,
encrypted_query_param=media.get("encrypt_query_param"),
aes_key_b64=media.get("aes_key"),
@@ -1353,7 +1427,7 @@ class WeixinAdapter(BasePlatformAdapter):
mime = _mime_from_filename(filename)
try:
data = await _download_and_decrypt_media(
self._session,
self._poll_session,
cdn_base_url=self._cdn_base_url,
encrypted_query_param=media.get("encrypt_query_param"),
aes_key_b64=media.get("aes_key"),
@@ -1372,7 +1446,7 @@ class WeixinAdapter(BasePlatformAdapter):
return None
try:
data = await _download_and_decrypt_media(
self._session,
self._poll_session,
cdn_base_url=self._cdn_base_url,
encrypted_query_param=media.get("encrypt_query_param"),
aes_key_b64=media.get("aes_key"),
@@ -1385,13 +1459,13 @@ class WeixinAdapter(BasePlatformAdapter):
return None
async def _maybe_fetch_typing_ticket(self, user_id: str, context_token: Optional[str]) -> None:
if not self._session or not self._token:
if not self._poll_session or not self._token:
return
if self._typing_cache.get(user_id):
return
try:
response = await _get_config(
self._session,
self._poll_session,
base_url=self._base_url,
token=self._token,
user_id=user_id,
@@ -1416,12 +1490,19 @@ class WeixinAdapter(BasePlatformAdapter):
context_token: Optional[str],
client_id: str,
) -> None:
"""Send a single text chunk with per-chunk retry and backoff."""
"""Send a single text chunk with per-chunk retry and backoff.
On session-expired errors (errcode -14), automatically retries
*without* ``context_token`` iLink accepts tokenless sends as a
degraded fallback, which keeps cron-initiated push messages working
even when no user message has refreshed the session recently.
"""
last_error: Optional[Exception] = None
retried_without_token = False
for attempt in range(self._send_chunk_retries + 1):
try:
await _send_message(
self._session,
resp = await _send_message(
self._send_session,
base_url=self._base_url,
token=self._token,
to=chat_id,
@@ -1429,6 +1510,31 @@ class WeixinAdapter(BasePlatformAdapter):
context_token=context_token,
client_id=client_id,
)
# Check iLink response for session-expired error
if resp and isinstance(resp, dict):
ret = resp.get("ret")
errcode = resp.get("errcode")
if (ret is not None and ret not in (0,)) or (errcode is not None and errcode not in (0,)):
is_session_expired = (
ret == SESSION_EXPIRED_ERRCODE
or errcode == SESSION_EXPIRED_ERRCODE
)
# Session expired — strip token and retry once
if is_session_expired and not retried_without_token and context_token:
retried_without_token = True
context_token = None
self._token_store._cache.pop(
self._token_store._key(self._account_id, chat_id), None
)
logger.warning(
"[%s] session expired for %s; retrying without context_token",
self.name, _safe_id(chat_id),
)
continue
errmsg = resp.get("errmsg") or resp.get("msg") or "unknown error"
raise RuntimeError(
f"iLink sendmessage error: ret={ret} errcode={errcode} errmsg={errmsg}"
)
return
except Exception as exc:
last_error = exc
@@ -1456,12 +1562,48 @@ class WeixinAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
if not self._session or not self._token:
if not self._send_session or not self._token:
return SendResult(success=False, error="Not connected")
context_token = self._token_store.get(self._account_id, chat_id)
last_message_id: Optional[str] = None
# Extract MEDIA: tags and bare local file paths before text delivery.
media_files, cleaned_content = self.extract_media(content)
_, image_cleaned = self.extract_images(cleaned_content)
local_files, final_content = self.extract_local_files(image_cleaned)
_AUDIO_EXTS = {".ogg", ".opus", ".mp3", ".wav", ".m4a"}
_VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".webm", ".3gp"}
_IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}
async def _deliver_media(path: str, is_voice: bool = False) -> None:
ext = Path(path).suffix.lower()
if is_voice or ext in _AUDIO_EXTS:
await self.send_voice(chat_id=chat_id, audio_path=path, metadata=metadata)
elif ext in _VIDEO_EXTS:
await self.send_video(chat_id=chat_id, video_path=path, metadata=metadata)
elif ext in _IMAGE_EXTS:
await self.send_image_file(chat_id=chat_id, image_path=path, metadata=metadata)
else:
await self.send_document(chat_id=chat_id, file_path=path, metadata=metadata)
try:
chunks = [c for c in self._split_text(self.format_message(content)) if c and c.strip()]
# Deliver extracted MEDIA: attachments first.
for media_path, is_voice in media_files:
try:
await _deliver_media(media_path, is_voice)
except Exception as exc:
logger.warning("[%s] media delivery failed for %s: %s", self.name, media_path, exc)
# Deliver bare local file paths.
for file_path in local_files:
try:
await _deliver_media(file_path, is_voice=False)
except Exception as exc:
logger.warning("[%s] local file delivery failed for %s: %s", self.name, file_path, exc)
# Deliver text content.
chunks = [c for c in self._split_text(self.format_message(final_content)) if c and c.strip()]
for idx, chunk in enumerate(chunks):
client_id = f"hermes-weixin-{uuid.uuid4().hex}"
await self._send_text_chunk(
@@ -1479,14 +1621,14 @@ class WeixinAdapter(BasePlatformAdapter):
return SendResult(success=False, error=str(exc))
async def send_typing(self, chat_id: str, metadata: Optional[Dict[str, Any]] = None) -> None:
if not self._session or not self._token:
if not self._send_session or not self._token:
return
typing_ticket = self._typing_cache.get(chat_id)
if not typing_ticket:
return
try:
await _send_typing(
self._session,
self._send_session,
base_url=self._base_url,
token=self._token,
to_user_id=chat_id,
@@ -1497,14 +1639,14 @@ class WeixinAdapter(BasePlatformAdapter):
logger.debug("[%s] typing start failed for %s: %s", self.name, _safe_id(chat_id), exc)
async def stop_typing(self, chat_id: str) -> None:
if not self._session or not self._token:
if not self._send_session or not self._token:
return
typing_ticket = self._typing_cache.get(chat_id)
if not typing_ticket:
return
try:
await _send_typing(
self._session,
self._send_session,
base_url=self._base_url,
token=self._token,
to_user_id=chat_id,
@@ -1542,24 +1684,35 @@ class WeixinAdapter(BasePlatformAdapter):
async def send_image_file(
self,
chat_id: str,
path: str,
caption: str = "",
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs,
) -> SendResult:
return await self.send_document(chat_id, file_path=path, caption=caption, metadata=metadata)
del reply_to, kwargs
return await self.send_document(
chat_id=chat_id,
file_path=image_path,
caption=caption,
metadata=metadata,
)
async def send_document(
self,
chat_id: str,
file_path: str,
caption: str = "",
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs,
) -> SendResult:
if not self._session or not self._token:
del file_name, reply_to, metadata, kwargs
if not self._send_session or not self._token:
return SendResult(success=False, error="Not connected")
try:
message_id = await self._send_file(chat_id, file_path, caption)
message_id = await self._send_file(chat_id, file_path, caption or "")
return SendResult(success=True, message_id=message_id)
except Exception as exc:
logger.error("[%s] send_document failed to=%s: %s", self.name, _safe_id(chat_id), exc)
@@ -1573,7 +1726,7 @@ class WeixinAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
if not self._session or not self._token:
if not self._send_session or not self._token:
return SendResult(success=False, error="Not connected")
try:
message_id = await self._send_file(chat_id, video_path, caption or "")
@@ -1590,7 +1743,24 @@ class WeixinAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
return await self.send_document(chat_id, audio_path, caption=caption or "", metadata=metadata)
if not self._send_session or not self._token:
return SendResult(success=False, error="Not connected")
# Native outbound Weixin voice bubbles are not proven-working in the
# upstream reference implementation. Prefer a reliable file attachment
# fallback so users at least receive playable audio, even for .silk.
fallback_caption = caption or "[voice message as attachment]"
try:
message_id = await self._send_file(
chat_id,
audio_path,
fallback_caption,
force_file_attachment=True,
)
return SendResult(success=True, message_id=message_id)
except Exception as exc:
logger.error("[%s] send_voice failed to=%s: %s", self.name, _safe_id(chat_id), exc)
return SendResult(success=False, error=str(exc))
async def _download_remote_media(self, url: str) -> str:
from tools.url_safety import is_safe_url
@@ -1598,8 +1768,8 @@ class WeixinAdapter(BasePlatformAdapter):
if not is_safe_url(url):
raise ValueError(f"Blocked unsafe URL (SSRF protection): {url}")
assert self._session is not None
async with self._session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as response:
assert self._send_session is not None
async with self._send_session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as response:
response.raise_for_status()
data = await response.read()
suffix = Path(url.split("?", 1)[0]).suffix or ".bin"
@@ -1607,16 +1777,22 @@ class WeixinAdapter(BasePlatformAdapter):
handle.write(data)
return handle.name
async def _send_file(self, chat_id: str, path: str, caption: str) -> str:
assert self._session is not None and self._token is not None
async def _send_file(
self,
chat_id: str,
path: str,
caption: str,
force_file_attachment: bool = False,
) -> str:
assert self._send_session is not None and self._token is not None
plaintext = Path(path).read_bytes()
media_type, item_builder = self._outbound_media_builder(path)
media_type, item_builder = self._outbound_media_builder(path, force_file_attachment=force_file_attachment)
filekey = secrets.token_hex(16)
aes_key = secrets.token_bytes(16)
rawsize = len(plaintext)
rawfilemd5 = hashlib.md5(plaintext).hexdigest()
upload_response = await _get_upload_url(
self._session,
self._send_session,
base_url=self._base_url,
token=self._token,
to_user_id=chat_id,
@@ -1642,30 +1818,34 @@ class WeixinAdapter(BasePlatformAdapter):
raise RuntimeError(f"getUploadUrl returned neither upload_param nor upload_full_url: {upload_response}")
encrypted_query_param = await _upload_ciphertext(
self._session,
self._send_session,
ciphertext=ciphertext,
upload_url=upload_url,
)
context_token = self._token_store.get(self._account_id, chat_id)
# The iLink API expects aes_key as base64(hex_string), not base64(raw_bytes).
# Sending base64(raw_bytes) causes images to show as grey boxes on the
# receiver side because the decryption key doesn't match.
aes_key_for_api = base64.b64encode(aes_key.hex().encode("ascii")).decode("ascii")
media_item = item_builder(
encrypt_query_param=encrypted_query_param,
aes_key_for_api=aes_key_for_api,
ciphertext_size=len(ciphertext),
plaintext_size=rawsize,
filename=Path(path).name,
rawfilemd5=rawfilemd5,
)
item_kwargs = {
"encrypt_query_param": encrypted_query_param,
"aes_key_for_api": aes_key_for_api,
"ciphertext_size": len(ciphertext),
"plaintext_size": rawsize,
"filename": Path(path).name,
"rawfilemd5": rawfilemd5,
}
if media_type == MEDIA_VOICE and path.endswith(".silk"):
item_kwargs["encode_type"] = 6
item_kwargs["sample_rate"] = 24000
item_kwargs["bits_per_sample"] = 16
media_item = item_builder(**item_kwargs)
last_message_id = None
if caption:
last_message_id = f"hermes-weixin-{uuid.uuid4().hex}"
await _send_message(
self._session,
self._send_session,
base_url=self._base_url,
token=self._token,
to=chat_id,
@@ -1676,7 +1856,7 @@ class WeixinAdapter(BasePlatformAdapter):
last_message_id = f"hermes-weixin-{uuid.uuid4().hex}"
await _api_post(
self._session,
self._send_session,
base_url=self._base_url,
endpoint=EP_SEND_MESSAGE,
payload={
@@ -1695,7 +1875,7 @@ class WeixinAdapter(BasePlatformAdapter):
)
return last_message_id
def _outbound_media_builder(self, path: str):
def _outbound_media_builder(self, path: str, force_file_attachment: bool = False):
mime = mimetypes.guess_type(path)[0] or "application/octet-stream"
if mime.startswith("image/"):
return MEDIA_IMAGE, lambda **kw: {
@@ -1723,7 +1903,7 @@ class WeixinAdapter(BasePlatformAdapter):
"video_md5": kw.get("rawfilemd5", ""),
},
}
if mime.startswith("audio/") or path.endswith(".silk"):
if path.endswith(".silk") and not force_file_attachment:
return MEDIA_VOICE, lambda **kw: {
"type": ITEM_VOICE,
"voice_item": {
@@ -1732,9 +1912,25 @@ class WeixinAdapter(BasePlatformAdapter):
"aes_key": kw["aes_key_for_api"],
"encrypt_type": 1,
},
"encode_type": kw.get("encode_type"),
"bits_per_sample": kw.get("bits_per_sample"),
"sample_rate": kw.get("sample_rate"),
"playtime": kw.get("playtime", 0),
},
}
if mime.startswith("audio/"):
return MEDIA_FILE, lambda **kw: {
"type": ITEM_FILE,
"file_item": {
"media": {
"encrypt_query_param": kw["encrypt_query_param"],
"aes_key": kw["aes_key_for_api"],
"encrypt_type": 1,
},
"file_name": kw["filename"],
"len": str(kw["plaintext_size"]),
},
}
return MEDIA_FILE, lambda **kw: {
"type": ITEM_FILE,
"file_item": {
@@ -1784,7 +1980,34 @@ async def send_weixin_direct(
token_store.restore(account_id)
context_token = token_store.get(account_id, chat_id)
async with aiohttp.ClientSession(trust_env=True) as session:
live_adapter = _LIVE_ADAPTERS.get(resolved_token)
send_session = getattr(live_adapter, '_send_session', None)
if live_adapter is not None and send_session is not None and not send_session.closed:
last_result: Optional[SendResult] = None
cleaned = live_adapter.format_message(message)
if cleaned:
last_result = await live_adapter.send(chat_id, cleaned)
if not last_result.success:
return {"error": f"Weixin send failed: {last_result.error}"}
for media_path, _is_voice in media_files or []:
ext = Path(media_path).suffix.lower()
if ext in {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}:
last_result = await live_adapter.send_image_file(chat_id, media_path)
else:
last_result = await live_adapter.send_document(chat_id, media_path)
if not last_result.success:
return {"error": f"Weixin media send failed: {last_result.error}"}
return {
"success": True,
"platform": "weixin",
"chat_id": chat_id,
"message_id": last_result.message_id if last_result else None,
"context_token_used": bool(context_token),
}
async with aiohttp.ClientSession(trust_env=True, connector=_make_ssl_connector()) as session:
adapter = WeixinAdapter(
PlatformConfig(
enabled=True,
@@ -1797,6 +2020,7 @@ async def send_weixin_direct(
},
)
)
adapter._send_session = session
adapter._session = session
adapter._token = resolved_token
adapter._account_id = account_id
+622 -29
View File
@@ -24,11 +24,20 @@ import signal
import tempfile
import threading
import time
from collections import OrderedDict
from contextvars import copy_context
from pathlib import Path
from datetime import datetime
from typing import Dict, Optional, Any, List
# --- Agent cache tuning ---------------------------------------------------
# Bounds the per-session AIAgent cache to prevent unbounded growth in
# long-lived gateways (each AIAgent holds LLM clients, tool schemas,
# memory providers, etc.). LRU order + idle TTL eviction are enforced
# from _enforce_agent_cache_cap() and _session_expiry_watcher() below.
_AGENT_CACHE_MAX_SIZE = 128
_AGENT_CACHE_IDLE_TTL_SECS = 3600.0 # evict agents idle for >1h
# ---------------------------------------------------------------------------
# SSL certificate auto-detection for NixOS and other non-standard systems.
# Must run BEFORE any HTTP library (discord, aiohttp, etc.) is imported.
@@ -622,8 +631,13 @@ class GatewayRunner:
# system prompt (including memory) every turn — breaking prefix cache
# and costing ~10x more on providers with prompt caching (Anthropic).
# Key: session_key, Value: (AIAgent, config_signature_str)
#
# OrderedDict so _enforce_agent_cache_cap() can pop the least-recently-
# used entry (move_to_end() on cache hits, popitem(last=False) for
# eviction). Hard cap via _AGENT_CACHE_MAX_SIZE, idle TTL enforced
# from _session_expiry_watcher().
import threading as _threading
self._agent_cache: Dict[str, tuple] = {}
self._agent_cache: "OrderedDict[str, tuple]" = OrderedDict()
self._agent_cache_lock = _threading.Lock()
# Per-session model overrides from /model command.
@@ -2102,6 +2116,11 @@ class GatewayRunner:
_cached_agent = self._running_agents.get(key)
if _cached_agent and _cached_agent is not _AGENT_PENDING_SENTINEL:
self._cleanup_agent_resources(_cached_agent)
# Drop the cache entry so the AIAgent (and its LLM
# clients, tool schemas, memory provider refs) can
# be garbage-collected. Otherwise the cache grows
# unbounded across the gateway's lifetime.
self._evict_cached_agent(key)
# Mark as flushed and persist to disk so the flag
# survives gateway restarts.
with self.session_store._lock:
@@ -2145,6 +2164,44 @@ class GatewayRunner:
logger.info(
"Session expiry done: %d flushed", _flushed,
)
# Sweep agents that have been idle beyond the TTL regardless
# of session reset policy. This catches sessions with very
# long / "never" reset windows, whose cached AIAgents would
# otherwise pin memory for the gateway's entire lifetime.
try:
_idle_evicted = self._sweep_idle_cached_agents()
if _idle_evicted:
logger.info(
"Agent cache idle sweep: evicted %d agent(s)",
_idle_evicted,
)
except Exception as _e:
logger.debug("Idle agent sweep failed: %s", _e)
# Periodically prune stale SessionStore entries. The
# in-memory dict (and sessions.json) would otherwise grow
# unbounded in gateways serving many rotating chats /
# threads / users over long time windows. Pruning is
# invisible to users — a resumed session just gets a
# fresh session_id, exactly as if the reset policy fired.
_last_prune_ts = getattr(self, "_last_session_store_prune_ts", 0.0)
_prune_interval = 3600.0 # once per hour
if time.time() - _last_prune_ts > _prune_interval:
try:
_max_age = int(
getattr(self.config, "session_store_max_age_days", 0) or 0
)
if _max_age > 0:
_pruned = self.session_store.prune_old_entries(_max_age)
if _pruned:
logger.info(
"SessionStore prune: dropped %d stale entries",
_pruned,
)
except Exception as _e:
logger.debug("SessionStore prune failed: %s", _e)
self._last_session_store_prune_ts = time.time()
except Exception as e:
logger.debug("Session expiry watcher error: %s", e)
# Sleep in small increments so we can stop quickly
@@ -2351,6 +2408,7 @@ class GatewayRunner:
self.adapters.clear()
self._running_agents.clear()
self._running_agents_ts.clear()
self._pending_messages.clear()
self._pending_approvals.clear()
if hasattr(self, '_busy_ack_ts'):
@@ -2375,6 +2433,20 @@ class GatewayRunner:
except Exception:
pass
# Close SQLite session DBs so the WAL write lock is released.
# Without this, --replace and similar restart flows leave the
# old gateway's connection holding the WAL lock until Python
# actually exits — causing 'database is locked' errors when
# the new gateway tries to open the same file.
for _db_holder in (self, getattr(self, "session_store", None)):
_db = getattr(_db_holder, "_db", None) if _db_holder else None
if _db is None or not hasattr(_db, "close"):
continue
try:
_db.close()
except Exception as _e:
logger.debug("SessionDB close error: %s", _e)
from gateway.status import remove_pid_file
remove_pid_file()
@@ -2618,6 +2690,9 @@ class GatewayRunner:
Platform.BLUEBUBBLES: "BLUEBUBBLES_ALLOWED_USERS",
Platform.QQBOT: "QQ_ALLOWED_USERS",
}
platform_group_env_map = {
Platform.QQBOT: "QQ_GROUP_ALLOWED_USERS",
}
platform_allow_all_map = {
Platform.TELEGRAM: "TELEGRAM_ALLOW_ALL_USERS",
Platform.DISCORD: "DISCORD_ALLOW_ALL_USERS",
@@ -2642,6 +2717,28 @@ class GatewayRunner:
if platform_allow_all_var and os.getenv(platform_allow_all_var, "").lower() in ("true", "1", "yes"):
return True
# Discord bot senders that passed the DISCORD_ALLOW_BOTS platform
# filter are already authorized at the platform level — skip the
# user allowlist. Without this, bot messages allowed by
# DISCORD_ALLOW_BOTS=mentions/all would be rejected here with
# "Unauthorized user" (fixes #4466).
if source.platform == Platform.DISCORD and getattr(source, "is_bot", False):
allow_bots = os.getenv("DISCORD_ALLOW_BOTS", "none").lower().strip()
if allow_bots in ("mentions", "all"):
return True
# Discord role-based access (DISCORD_ALLOWED_ROLES): the adapter's
# on_message pre-filter already verified role membership — if the
# message reached here, the user passed that check. Authorize
# directly to avoid the "no allowlists configured" branch below
# rejecting role-only setups where DISCORD_ALLOWED_USERS is empty
# (issue #7871).
if (
source.platform == Platform.DISCORD
and os.getenv("DISCORD_ALLOWED_ROLES", "").strip()
):
return True
# Check pairing store (always checked, regardless of allowlists)
platform_name = source.platform.value if source.platform else ""
if self.pairing_store.is_approved(platform_name, user_id):
@@ -2649,12 +2746,23 @@ class GatewayRunner:
# Check platform-specific and global allowlists
platform_allowlist = os.getenv(platform_env_map.get(source.platform, ""), "").strip()
group_allowlist = ""
if source.chat_type == "group":
group_allowlist = os.getenv(platform_group_env_map.get(source.platform, ""), "").strip()
global_allowlist = os.getenv("GATEWAY_ALLOWED_USERS", "").strip()
if not platform_allowlist and not global_allowlist:
if not platform_allowlist and not group_allowlist and not global_allowlist:
# No allowlists configured -- check global allow-all flag
return os.getenv("GATEWAY_ALLOW_ALL_USERS", "").lower() in ("true", "1", "yes")
# Some platforms authorize group traffic by chat ID rather than sender ID.
if group_allowlist and source.chat_type == "group" and source.chat_id:
allowed_group_ids = {
chat_id.strip() for chat_id in group_allowlist.split(",") if chat_id.strip()
}
if "*" in allowed_group_ids or source.chat_id in allowed_group_ids:
return True
# Check if user is in any allowlist
allowed_ids = set()
if platform_allowlist:
@@ -2837,16 +2945,17 @@ class GatewayRunner:
_quick_key[:30], _stale_age, _stale_idle,
_raw_stale_timeout, _stale_detail,
)
del self._running_agents[_quick_key]
self._running_agents_ts.pop(_quick_key, None)
self._busy_ack_ts.pop(_quick_key, None)
self._release_running_agent_state(_quick_key)
if _quick_key in self._running_agents:
if event.get_command() == "status":
return await self._handle_status_command(event)
# Resolve the command once for all early-intercept checks below.
from hermes_cli.commands import resolve_command as _resolve_cmd_inner
from hermes_cli.commands import (
resolve_command as _resolve_cmd_inner,
should_bypass_active_session as _should_bypass_active_inner,
)
_evt_cmd = event.get_command()
_cmd_def_inner = _resolve_cmd_inner(_evt_cmd) if _evt_cmd else None
@@ -2867,8 +2976,7 @@ class GatewayRunner:
if adapter and hasattr(adapter, 'get_pending_message'):
adapter.get_pending_message(_quick_key) # consume and discard
self._pending_messages.pop(_quick_key, None)
if _quick_key in self._running_agents:
del self._running_agents[_quick_key]
self._release_running_agent_state(_quick_key)
logger.info("STOP for session %s — agent interrupted, session lock released", _quick_key[:20])
return "⚡ Stopped. You can continue this session."
@@ -2890,8 +2998,7 @@ class GatewayRunner:
self._pending_messages.pop(_quick_key, None)
# Clean up the running agent entry so the reset handler
# doesn't think an agent is still active.
if _quick_key in self._running_agents:
del self._running_agents[_quick_key]
self._release_running_agent_state(_quick_key)
return await self._handle_reset_command(event)
# /queue <prompt> — queue without interrupting
@@ -2912,6 +3019,54 @@ class GatewayRunner:
adapter._pending_messages[_quick_key] = queued_event
return "Queued for the next turn."
# /steer <prompt> — inject mid-run after the next tool call.
# Unlike /queue (turn boundary), /steer lands BETWEEN tool-call
# iterations inside the same agent run, by appending to the
# last tool result's content. No interrupt, no new user turn,
# no role-alternation violation.
if _cmd_def_inner and _cmd_def_inner.name == "steer":
steer_text = event.get_command_args().strip()
if not steer_text:
return "Usage: /steer <prompt>"
running_agent = self._running_agents.get(_quick_key)
if running_agent is _AGENT_PENDING_SENTINEL:
# Agent hasn't started yet — queue as turn-boundary fallback.
adapter = self.adapters.get(source.platform)
if adapter:
from gateway.platforms.base import MessageEvent as _ME, MessageType as _MT
queued_event = _ME(
text=steer_text,
message_type=_MT.TEXT,
source=event.source,
message_id=event.message_id,
channel_prompt=event.channel_prompt,
)
adapter._pending_messages[_quick_key] = queued_event
return "Agent still starting — /steer queued for the next turn."
if running_agent and hasattr(running_agent, "steer"):
try:
accepted = running_agent.steer(steer_text)
except Exception as exc:
logger.warning("Steer failed for session %s: %s", _quick_key[:20], exc)
return f"⚠️ Steer failed: {exc}"
if accepted:
preview = steer_text[:60] + ("..." if len(steer_text) > 60 else "")
return f"⏩ Steer queued — arrives after the next tool call: '{preview}'"
return "Steer rejected (empty payload)."
# Running agent is missing or lacks steer() — fall back to queue.
adapter = self.adapters.get(source.platform)
if adapter:
from gateway.platforms.base import MessageEvent as _ME, MessageType as _MT
queued_event = _ME(
text=steer_text,
message_type=_MT.TEXT,
source=event.source,
message_id=event.message_id,
channel_prompt=event.channel_prompt,
)
adapter._pending_messages[_quick_key] = queued_event
return "No active agent — /steer queued for the next turn."
# /model must not be used while the agent is running.
if _cmd_def_inner and _cmd_def_inner.name == "model":
return "Agent is running — wait or /stop first, then switch models."
@@ -2925,11 +3080,29 @@ class GatewayRunner:
return await self._handle_approve_command(event)
return await self._handle_deny_command(event)
# /agents (/tasks alias) should be query-only and never interrupt.
if _cmd_def_inner and _cmd_def_inner.name == "agents":
return await self._handle_agents_command(event)
# /background must bypass the running-agent guard — it starts a
# parallel task and must never interrupt the active conversation.
if _cmd_def_inner and _cmd_def_inner.name == "background":
return await self._handle_background_command(event)
# Gateway-handled info/control commands must never fall through to
# the interrupt path. If they are queued as pending text, the
# slash-command safety net discards them before the user sees any
# response.
if _cmd_def_inner and _should_bypass_active_inner(_cmd_def_inner.name):
if _cmd_def_inner.name == "help":
return await self._handle_help_command(event)
if _cmd_def_inner.name == "commands":
return await self._handle_commands_command(event)
if _cmd_def_inner.name == "profile":
return await self._handle_profile_command(event)
if _cmd_def_inner.name == "update":
return await self._handle_update_command(event)
if event.message_type == MessageType.PHOTO:
logger.debug("PRIORITY photo follow-up for session %s — queueing without interrupt", _quick_key[:20])
adapter = self.adapters.get(source.platform)
@@ -2968,8 +3141,7 @@ class GatewayRunner:
# Agent is being set up but not ready yet.
if event.get_command() == "stop":
# Force-clean the sentinel so the session is unlocked.
if _quick_key in self._running_agents:
del self._running_agents[_quick_key]
self._release_running_agent_state(_quick_key)
logger.info("HARD STOP (pending) for session %s — sentinel cleared", _quick_key[:20])
return "⚡ Force-stopped. The agent was still starting — session unlocked."
# Queue the message so it will be picked up after the
@@ -3033,6 +3205,9 @@ class GatewayRunner:
if canonical == "status":
return await self._handle_status_command(event)
if canonical == "agents":
return await self._handle_agents_command(event)
if canonical == "restart":
return await self._handle_restart_command(event)
@@ -3133,6 +3308,21 @@ class GatewayRunner:
if canonical == "btw":
return await self._handle_btw_command(event)
if canonical == "steer":
# No active agent — /steer has no tool call to inject into.
# Strip the prefix so downstream treats it as a normal user
# message. If the payload is empty, surface the usage hint.
steer_payload = event.get_command_args().strip()
if not steer_payload:
return "Usage: /steer <prompt> (no agent is running; sending as a normal message)"
try:
event.text = steer_payload
except Exception:
pass
# Do NOT return — fall through to _handle_message_with_agent
# at the end of this function so the rewritten text is sent
# to the agent as a regular user turn.
if canonical == "voice":
return await self._handle_voice_command(event)
@@ -3285,8 +3475,13 @@ class GatewayRunner:
# (exception, command fallthrough, etc.) the sentinel must
# not linger or the session would be permanently locked out.
if self._running_agents.get(_quick_key) is _AGENT_PENDING_SENTINEL:
del self._running_agents[_quick_key]
self._running_agents_ts.pop(_quick_key, None)
self._release_running_agent_state(_quick_key)
else:
# Agent path already cleaned _running_agents; make sure
# the paired metadata dicts are gone too.
self._running_agents_ts.pop(_quick_key, None)
if hasattr(self, "_busy_ack_ts"):
self._busy_ack_ts.pop(_quick_key, None)
async def _prepare_inbound_message_text(
self,
@@ -4483,6 +4678,96 @@ class GatewayRunner:
])
return "\n".join(lines)
async def _handle_agents_command(self, event: MessageEvent) -> str:
"""Handle /agents command - list active agents and running tasks."""
from tools.process_registry import format_uptime_short, process_registry
now = time.time()
current_session_key = self._session_key_for_source(event.source)
running_agents: dict = getattr(self, "_running_agents", {}) or {}
running_started: dict = getattr(self, "_running_agents_ts", {}) or {}
agent_rows: list[dict] = []
for session_key, agent in running_agents.items():
started = float(running_started.get(session_key, now))
elapsed = max(0, int(now - started))
is_pending = agent is _AGENT_PENDING_SENTINEL
agent_rows.append(
{
"session_key": session_key,
"elapsed": elapsed,
"state": "starting" if is_pending else "running",
"session_id": "" if is_pending else str(getattr(agent, "session_id", "") or ""),
"model": "" if is_pending else str(getattr(agent, "model", "") or ""),
}
)
agent_rows.sort(key=lambda row: row["elapsed"], reverse=True)
running_processes: list[dict] = []
try:
running_processes = [
p for p in process_registry.list_sessions()
if p.get("status") == "running"
]
except Exception:
running_processes = []
background_tasks = [
t for t in (getattr(self, "_background_tasks", set()) or set())
if hasattr(t, "done") and not t.done()
]
lines = [
"🤖 **Active Agents & Tasks**",
"",
f"**Active agents:** {len(agent_rows)}",
]
if agent_rows:
for idx, row in enumerate(agent_rows[:12], 1):
current = " · this chat" if row["session_key"] == current_session_key else ""
sid = f" · `{row['session_id']}`" if row["session_id"] else ""
model = f" · `{row['model']}`" if row["model"] else ""
lines.append(
f"{idx}. `{row['session_key']}` · {row['state']} · "
f"{format_uptime_short(row['elapsed'])}{sid}{model}{current}"
)
if len(agent_rows) > 12:
lines.append(f"... and {len(agent_rows) - 12} more")
lines.extend(
[
"",
f"**Running background processes:** {len(running_processes)}",
]
)
if running_processes:
for proc in running_processes[:12]:
cmd = " ".join(str(proc.get("command", "")).split())
if len(cmd) > 90:
cmd = cmd[:87] + "..."
lines.append(
f"- `{proc.get('session_id', '?')}` · "
f"{format_uptime_short(int(proc.get('uptime_seconds', 0)))} · `{cmd}`"
)
if len(running_processes) > 12:
lines.append(f"... and {len(running_processes) - 12} more")
lines.extend(
[
"",
f"**Gateway async jobs:** {len(background_tasks)}",
]
)
if not agent_rows and not running_processes and not background_tasks:
lines.append("")
lines.append("No active agents or running tasks.")
return "\n".join(lines)
async def _handle_stop_command(self, event: MessageEvent) -> str:
"""Handle /stop command - interrupt a running agent.
@@ -4502,22 +4787,40 @@ class GatewayRunner:
agent = self._running_agents.get(session_key)
if agent is _AGENT_PENDING_SENTINEL:
# Force-clean the sentinel so the session is unlocked.
if session_key in self._running_agents:
del self._running_agents[session_key]
self._release_running_agent_state(session_key)
logger.info("STOP (pending) for session %s — sentinel cleared", session_key[:20])
return "⚡ Stopped. The agent hadn't started yet — you can continue this session."
if agent:
agent.interrupt("Stop requested")
# Force-clean the session lock so a truly hung agent doesn't
# keep it locked forever.
if session_key in self._running_agents:
del self._running_agents[session_key]
self._release_running_agent_state(session_key)
return "⚡ Stopped. You can continue this session."
else:
return "No active task to stop."
async def _handle_restart_command(self, event: MessageEvent) -> str:
"""Handle /restart command - drain active work, then restart the gateway."""
# Defensive idempotency check: if the previous gateway process
# recorded this same /restart (same platform + update_id) and the new
# process is seeing it *again*, this is a re-delivery caused by PTB's
# graceful-shutdown `get_updates` ACK failing on the way out ("Error
# while calling `get_updates` one more time to mark all fetched
# updates. Suppressing error to ensure graceful shutdown. When
# polling for updates is restarted, updates may be received twice."
# in gateway.log). Ignoring the stale redelivery prevents a
# self-perpetuating restart loop where every fresh gateway
# re-processes the same /restart command and immediately restarts
# again.
if self._is_stale_restart_redelivery(event):
logger.info(
"Ignoring redelivered /restart (platform=%s, update_id=%s) — "
"already processed by a previous gateway instance.",
event.source.platform.value if event.source and event.source.platform else "?",
event.platform_update_id,
)
return ""
if self._restart_requested or self._draining:
count = self._running_agent_count()
if count:
@@ -4540,6 +4843,26 @@ class GatewayRunner:
except Exception as e:
logger.debug("Failed to write restart notify file: %s", e)
# Record the triggering platform + update_id in a dedicated dedup
# marker. Unlike .restart_notify.json (which gets unlinked once the
# new gateway sends the "gateway restarted" notification), this
# marker persists so the new gateway can still detect a delayed
# /restart redelivery from Telegram. Overwritten on every /restart.
try:
import json as _json
import time as _time
dedup_data = {
"platform": event.source.platform.value if event.source.platform else None,
"requested_at": _time.time(),
}
if event.platform_update_id is not None:
dedup_data["update_id"] = event.platform_update_id
(_hermes_home / ".restart_last_processed.json").write_text(
_json.dumps(dedup_data)
)
except Exception as e:
logger.debug("Failed to write restart dedup marker: %s", e)
active_agents = self._running_agent_count()
# When running under a service manager (systemd/launchd), use the
# service restart path: exit with code 75 so the service manager
@@ -4555,6 +4878,58 @@ class GatewayRunner:
return f"⏳ Draining {active_agents} active agent(s) before restart..."
return "♻ Restarting gateway. If you aren't notified within 60 seconds, restart from the console with `hermes gateway restart`."
def _is_stale_restart_redelivery(self, event: MessageEvent) -> bool:
"""Return True if this /restart is a Telegram re-delivery we already handled.
The previous gateway wrote ``.restart_last_processed.json`` with the
triggering platform + update_id when it processed the /restart. If
we now see a /restart on the same platform with an update_id <= that
recorded value AND the marker is recent (< 5 minutes), it's a
redelivery and should be ignored.
Only applies to Telegram today (the only platform that exposes a
numeric cross-session update ordering); other platforms return False.
"""
if event is None or event.source is None:
return False
if event.platform_update_id is None:
return False
if event.source.platform is None:
return False
# Only Telegram populates platform_update_id currently; be explicit
# so future platforms aren't accidentally gated by this check.
try:
platform_value = event.source.platform.value
except Exception:
return False
if platform_value != "telegram":
return False
try:
import json as _json
import time as _time
marker_path = _hermes_home / ".restart_last_processed.json"
if not marker_path.exists():
return False
data = _json.loads(marker_path.read_text())
except Exception:
return False
if data.get("platform") != platform_value:
return False
recorded_uid = data.get("update_id")
if not isinstance(recorded_uid, int):
return False
# Staleness guard: ignore markers older than 5 minutes. A legitimately
# old marker (e.g. crash recovery where notify never fired) should not
# swallow a fresh /restart from the user.
requested_at = data.get("requested_at")
if isinstance(requested_at, (int, float)):
if _time.time() - requested_at > 300:
return False
return event.platform_update_id <= recorded_uid
async def _handle_help_command(self, event: MessageEvent) -> str:
"""Handle /help command - list available commands."""
from hermes_cli.commands import gateway_help_lines
@@ -5300,8 +5675,7 @@ class GatewayRunner:
if "pynacl" in err_lower or "nacl" in err_lower or "davey" in err_lower:
return (
"Voice dependencies are missing (PyNaCl / davey). "
"Install or reinstall Hermes with the messaging extra, e.g. "
"`pip install hermes-agent[messaging]`."
f"Install with: `{sys.executable} -m pip install PyNaCl`"
)
return f"Failed to join voice channel: {e}"
@@ -5797,7 +6171,7 @@ class GatewayRunner:
pass
# Send media files
for media_path in (media_files or []):
for media_path, _is_voice in (media_files or []):
try:
await adapter.send_document(
chat_id=source.chat_id,
@@ -5975,7 +6349,7 @@ class GatewayRunner:
except Exception:
pass
for media_path in (media_files or []):
for media_path, _is_voice in (media_files or []):
try:
await adapter.send_file(chat_id=source.chat_id, file_path=media_path)
except Exception:
@@ -6427,8 +6801,7 @@ class GatewayRunner:
logger.debug("Memory flush on resume failed: %s", e)
# Clear any running agent for this session key
if session_key in self._running_agents:
del self._running_agents[session_key]
self._release_running_agent_state(session_key)
# Switch the session entry to point at the old session
new_entry = self.session_store.switch_session(session_key, target_id)
@@ -7844,6 +8217,30 @@ class GatewayRunner:
override = self._session_model_overrides.get(session_key)
return override is not None and override.get("model") == agent_model
def _release_running_agent_state(self, session_key: str) -> None:
"""Pop ALL per-running-agent state entries for ``session_key``.
Replaces ad-hoc ``del self._running_agents[key]`` calls scattered
across the gateway. Those sites had drifted: some popped only
``_running_agents``; some also ``_running_agents_ts``; only one
path also cleared ``_busy_ack_ts``. Each missed entry was a
small, persistent leak a (str_key float) tuple per session
per gateway lifetime.
Use this at every site that ends a running turn, regardless of
cause (normal completion, /stop, /reset, /resume, sentinel
cleanup, stale-eviction). Per-session state that PERSISTS
across turns (``_session_model_overrides``, ``_voice_mode``,
``_pending_approvals``, ``_update_prompt_pending``) is NOT
touched here those have their own lifecycles.
"""
if not session_key:
return
self._running_agents.pop(session_key, None)
self._running_agents_ts.pop(session_key, None)
if hasattr(self, "_busy_ack_ts"):
self._busy_ack_ts.pop(session_key, None)
def _evict_cached_agent(self, session_key: str) -> None:
"""Remove a cached agent for a session (called on /new, /model, etc)."""
_lock = getattr(self, "_agent_cache_lock", None)
@@ -7851,6 +8248,153 @@ class GatewayRunner:
with _lock:
self._agent_cache.pop(session_key, None)
def _release_evicted_agent_soft(self, agent: Any) -> None:
"""Soft cleanup for cache-evicted agents — preserves session tool state.
Called from _enforce_agent_cache_cap and _sweep_idle_cached_agents.
Distinct from _cleanup_agent_resources (full teardown) because a
cache-evicted session may resume at any time its terminal
sandbox, browser daemon, and tracked bg processes must outlive
the Python AIAgent instance so the next agent built for the
same task_id inherits them.
"""
if agent is None:
return
try:
if hasattr(agent, "release_clients"):
agent.release_clients()
else:
# Older agent instance (shouldn't happen in practice) —
# fall back to the legacy full-close path.
self._cleanup_agent_resources(agent)
except Exception:
pass
def _enforce_agent_cache_cap(self) -> None:
"""Evict oldest cached agents when cache exceeds _AGENT_CACHE_MAX_SIZE.
Must be called with _agent_cache_lock held. Resource cleanup
(memory provider shutdown, tool resource close) is scheduled
on a daemon thread so the caller doesn't block on slow teardown
while holding the cache lock.
Agents currently in _running_agents are SKIPPED their clients,
terminal sandboxes, background processes, and child subagents
are all in active use by the running turn. Evicting them would
tear down those resources mid-turn and crash the request. If
every candidate in the LRU order is active, we simply leave the
cache over the cap; it will be re-checked on the next insert.
"""
_cache = getattr(self, "_agent_cache", None)
if _cache is None:
return
# OrderedDict.popitem(last=False) pops oldest; plain dict lacks the
# arg so skip enforcement if a test fixture swapped the cache type.
if not hasattr(_cache, "move_to_end"):
return
# Snapshot of agent instances that are actively mid-turn. Use id()
# so the lookup is O(1) and doesn't depend on AIAgent.__eq__ (which
# MagicMock overrides in tests).
running_ids = {
id(a)
for a in getattr(self, "_running_agents", {}).values()
if a is not None and a is not _AGENT_PENDING_SENTINEL
}
# Walk LRU → MRU and evict excess-LRU entries that aren't mid-turn.
# We only consider entries in the first (size - cap) LRU positions
# as eviction candidates. If one of those slots is held by an
# active agent, we SKIP it without compensating by evicting a
# newer entry — that would penalise a freshly-inserted session
# (which has no cache history to retain) while protecting an
# already-cached long-running one. The cache may therefore stay
# temporarily over cap; it will re-check on the next insert,
# after active turns have finished.
excess = max(0, len(_cache) - _AGENT_CACHE_MAX_SIZE)
evict_plan: List[tuple] = [] # [(key, agent), ...]
if excess > 0:
ordered_keys = list(_cache.keys())
for key in ordered_keys[:excess]:
entry = _cache.get(key)
agent = entry[0] if isinstance(entry, tuple) and entry else None
if agent is not None and id(agent) in running_ids:
continue # active mid-turn; don't evict, don't substitute
evict_plan.append((key, agent))
for key, _ in evict_plan:
_cache.pop(key, None)
remaining_over_cap = len(_cache) - _AGENT_CACHE_MAX_SIZE
if remaining_over_cap > 0:
logger.warning(
"Agent cache over cap (%d > %d); %d excess slot(s) held by "
"mid-turn agents — will re-check on next insert.",
len(_cache), _AGENT_CACHE_MAX_SIZE, remaining_over_cap,
)
for key, agent in evict_plan:
logger.info(
"Agent cache at cap; evicting LRU session=%s (cache_size=%d)",
key, len(_cache),
)
if agent is not None:
threading.Thread(
target=self._release_evicted_agent_soft,
args=(agent,),
daemon=True,
name=f"agent-cache-evict-{key[:24]}",
).start()
def _sweep_idle_cached_agents(self) -> int:
"""Evict cached agents whose AIAgent has been idle > _AGENT_CACHE_IDLE_TTL_SECS.
Safe to call from the session expiry watcher without holding the
cache lock acquires it internally. Returns the number of entries
evicted. Resource cleanup is scheduled on daemon threads.
Agents currently in _running_agents are SKIPPED for the same reason
as _enforce_agent_cache_cap: tearing down an active turn's clients
mid-flight would crash the request.
"""
_cache = getattr(self, "_agent_cache", None)
_lock = getattr(self, "_agent_cache_lock", None)
if _cache is None or _lock is None:
return 0
now = time.time()
to_evict: List[tuple] = []
running_ids = {
id(a)
for a in getattr(self, "_running_agents", {}).values()
if a is not None and a is not _AGENT_PENDING_SENTINEL
}
with _lock:
for key, entry in list(_cache.items()):
agent = entry[0] if isinstance(entry, tuple) and entry else None
if agent is None:
continue
if id(agent) in running_ids:
continue # mid-turn — don't tear it down
last_activity = getattr(agent, "_last_activity_ts", None)
if last_activity is None:
continue
if (now - last_activity) > _AGENT_CACHE_IDLE_TTL_SECS:
to_evict.append((key, agent))
for key, _ in to_evict:
_cache.pop(key, None)
for key, agent in to_evict:
logger.info(
"Agent cache idle-TTL evict: session=%s (idle=%.0fs)",
key, now - getattr(agent, "_last_activity_ts", now),
)
threading.Thread(
target=self._release_evicted_agent_soft,
args=(agent,),
daemon=True,
name=f"agent-cache-idle-{key[:24]}",
).start()
return len(to_evict)
# ------------------------------------------------------------------
# Proxy mode: forward messages to a remote Hermes API server
# ------------------------------------------------------------------
@@ -8618,6 +9162,13 @@ class GatewayRunner:
cached = _cache.get(session_key)
if cached and cached[1] == _sig:
agent = cached[0]
# Refresh LRU order so the cap enforcement evicts
# truly-oldest entries, not the one we just used.
if hasattr(_cache, "move_to_end"):
try:
_cache.move_to_end(session_key)
except KeyError:
pass
# Reset activity timestamp so the inactivity timeout
# handler doesn't see stale idle time from the previous
# turn and immediately kill this agent. (#9051)
@@ -8656,6 +9207,7 @@ class GatewayRunner:
if _cache_lock and _cache is not None:
with _cache_lock:
_cache[session_key] = (agent, _sig)
self._enforce_agent_cache_cap()
logger.debug("Created new agent for session %s (sig=%s)", session_key, _sig)
# Per-message state — callbacks and reasoning config change every
@@ -9524,10 +10076,8 @@ class GatewayRunner:
# Clean up tracking
tracking_task.cancel()
if session_key and session_key in self._running_agents:
del self._running_agents[session_key]
if session_key:
self._running_agents_ts.pop(session_key, None)
self._release_running_agent_state(session_key)
if self._draining:
self._update_runtime_status("draining")
@@ -9656,6 +10206,16 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
"Replacing existing gateway instance (PID %d) with --replace.",
existing_pid,
)
# Record a takeover marker so the target's shutdown handler
# recognises its SIGTERM as a planned takeover and exits 0
# (rather than exit 1, which would trigger systemd's
# Restart=on-failure and start a flap loop against us).
# Best-effort — proceed even if the write fails.
try:
from gateway.status import write_takeover_marker
write_takeover_marker(existing_pid)
except Exception as e:
logger.debug("Could not write takeover marker: %s", e)
try:
terminate_pid(existing_pid, force=False)
except ProcessLookupError:
@@ -9665,6 +10225,13 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
"Permission denied killing PID %d. Cannot replace.",
existing_pid,
)
# Marker is scoped to a specific target; clean it up on
# give-up so it doesn't grief an unrelated future shutdown.
try:
from gateway.status import clear_takeover_marker
clear_takeover_marker()
except Exception:
pass
return False
# Wait up to 10 seconds for the old process to exit
for _ in range(20):
@@ -9685,6 +10252,13 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
except (ProcessLookupError, PermissionError, OSError):
pass
remove_pid_file()
# Clean up any takeover marker the old process didn't consume
# (e.g. SIGKILL'd before its shutdown handler could read it).
try:
from gateway.status import clear_takeover_marker
clear_takeover_marker()
except Exception:
pass
# Also release all scoped locks left by the old process.
# Stopped (Ctrl+Z) processes don't release locks on exit,
# leaving stale lock files that block the new gateway from starting.
@@ -9752,8 +10326,27 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
# Set up signal handlers
def shutdown_signal_handler():
nonlocal _signal_initiated_shutdown
_signal_initiated_shutdown = True
logger.info("Received SIGTERM/SIGINT — initiating shutdown")
# Planned --replace takeover check: when a sibling gateway is
# taking over via --replace, it wrote a marker naming this PID
# before sending SIGTERM. If present, treat the signal as a
# planned shutdown and exit 0 so systemd's Restart=on-failure
# doesn't revive us (which would flap-fight the replacer when
# both services are enabled, e.g. hermes.service + hermes-
# gateway.service from pre-rename installs).
planned_takeover = False
try:
from gateway.status import consume_takeover_marker_for_self
planned_takeover = consume_takeover_marker_for_self()
except Exception as e:
logger.debug("Takeover marker check failed: %s", e)
if planned_takeover:
logger.info(
"Received SIGTERM as a planned --replace takeover — exiting cleanly"
)
else:
_signal_initiated_shutdown = True
logger.info("Received SIGTERM/SIGINT — initiating shutdown")
# Diagnostic: log all hermes-related processes so we can identify
# what triggered the signal (hermes update, hermes gateway restart,
# a stale detached subprocess, etc.).
+52
View File
@@ -82,6 +82,7 @@ class SessionSource:
chat_topic: Optional[str] = None # Channel topic/description (Discord, Slack)
user_id_alt: Optional[str] = None # Signal UUID (alternative to phone number)
chat_id_alt: Optional[str] = None # Signal group internal ID
is_bot: bool = False # True when the message author is a bot/webhook (Discord)
@property
def description(self) -> str:
@@ -801,6 +802,57 @@ class SessionStore:
return True
return False
def prune_old_entries(self, max_age_days: int) -> int:
"""Drop SessionEntry records older than max_age_days.
Pruning is based on ``updated_at`` (last activity), not ``created_at``.
A session that's been active within the window is kept regardless of
how old it is. Entries marked ``suspended`` are kept the user
explicitly paused them for later resume. Entries held by an active
process (via has_active_processes_fn) are also kept so long-running
background work isn't orphaned.
Pruning is functionally identical to a natural reset-policy expiry:
the transcript in SQLite stays, but the session_key session_id
mapping is dropped and the user starts a fresh session on return.
``max_age_days <= 0`` disables pruning; returns 0 immediately.
Returns the number of entries removed.
"""
if max_age_days is None or max_age_days <= 0:
return 0
from datetime import timedelta
cutoff = _now() - timedelta(days=max_age_days)
removed_keys: list[str] = []
with self._lock:
self._ensure_loaded_locked()
for key, entry in list(self._entries.items()):
if entry.suspended:
continue
# Never prune sessions with an active background process
# attached — the user may still be waiting on output.
if self._has_active_processes_fn is not None:
try:
if self._has_active_processes_fn(entry.session_id):
continue
except Exception:
pass
if entry.updated_at < cutoff:
removed_keys.append(key)
for key in removed_keys:
self._entries.pop(key, None)
if removed_keys:
self._save()
if removed_keys:
logger.info(
"SessionStore pruned %d entries older than %d days",
len(removed_keys), max_age_days,
)
return len(removed_keys)
def suspend_recently_active(self, max_age_seconds: int = 120) -> int:
"""Mark recently-active sessions as suspended.
+159 -11
View File
@@ -188,8 +188,8 @@ def _write_json_file(path: Path, payload: dict[str, Any]) -> None:
path.write_text(json.dumps(payload))
def _read_pid_record() -> Optional[dict]:
pid_path = _get_pid_path()
def _read_pid_record(pid_path: Optional[Path] = None) -> Optional[dict]:
pid_path = pid_path or _get_pid_path()
if not pid_path.exists():
return None
@@ -212,6 +212,18 @@ def _read_pid_record() -> Optional[dict]:
return None
def _cleanup_invalid_pid_path(pid_path: Path, *, cleanup_stale: bool) -> None:
if not cleanup_stale:
return
try:
if pid_path == _get_pid_path():
remove_pid_file()
else:
pid_path.unlink(missing_ok=True)
except Exception:
pass
def write_pid_file() -> None:
"""Write the current process PID and metadata to the gateway PID file."""
_write_json_file(_get_pid_path(), _build_pid_record())
@@ -413,43 +425,179 @@ def release_all_scoped_locks() -> int:
return removed
def get_running_pid() -> Optional[int]:
# ── --replace takeover marker ─────────────────────────────────────────
#
# When a new gateway starts with ``--replace``, it SIGTERMs the existing
# gateway so it can take over the bot token. PR #5646 made SIGTERM exit
# the gateway with code 1 so ``Restart=on-failure`` can revive it after
# unexpected kills — but that also means a --replace takeover target
# exits 1, which tricks systemd into reviving it 30 seconds later,
# starting a flap loop against the replacer when both services are
# enabled in the user's systemd (e.g. ``hermes.service`` + ``hermes-
# gateway.service``).
#
# The takeover marker breaks the loop: the replacer writes a short-lived
# file naming the target PID + start_time BEFORE sending SIGTERM.
# The target's shutdown handler reads the marker and, if it names
# this process, treats the SIGTERM as a planned takeover and exits 0.
# The marker is unlinked after the target has consumed it, so a stale
# marker left by a crashed replacer can grief at most one future
# shutdown on the same PID — and only within _TAKEOVER_MARKER_TTL_S.
_TAKEOVER_MARKER_FILENAME = ".gateway-takeover.json"
_TAKEOVER_MARKER_TTL_S = 60 # Marker older than this is treated as stale
def _get_takeover_marker_path() -> Path:
"""Return the path to the --replace takeover marker file."""
home = get_hermes_home()
return home / _TAKEOVER_MARKER_FILENAME
def write_takeover_marker(target_pid: int) -> bool:
"""Record that ``target_pid`` is being replaced by the current process.
Captures the target's ``start_time`` so that PID reuse after the
target exits cannot later match the marker. Also records the
replacer's PID and a UTC timestamp for TTL-based staleness checks.
Returns True on successful write, False on any failure. The caller
should proceed with the SIGTERM even if the write fails (the marker
is a best-effort signal, not a correctness requirement).
"""
try:
target_start_time = _get_process_start_time(target_pid)
record = {
"target_pid": target_pid,
"target_start_time": target_start_time,
"replacer_pid": os.getpid(),
"written_at": _utc_now_iso(),
}
_write_json_file(_get_takeover_marker_path(), record)
return True
except (OSError, PermissionError):
return False
def consume_takeover_marker_for_self() -> bool:
"""Check & unlink the takeover marker if it names the current process.
Returns True only when a valid (non-stale) marker names this PID +
start_time. A returning True indicates the current SIGTERM is a
planned --replace takeover; the caller should exit 0 instead of
signalling ``_signal_initiated_shutdown``.
Always unlinks the marker on match (and on detected staleness) so
subsequent unrelated signals don't re-trigger.
"""
path = _get_takeover_marker_path()
record = _read_json_file(path)
if not record:
return False
# Any malformed or stale marker → drop it and return False
try:
target_pid = int(record["target_pid"])
target_start_time = record.get("target_start_time")
written_at = record.get("written_at") or ""
except (KeyError, TypeError, ValueError):
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
# TTL guard: a stale marker older than _TAKEOVER_MARKER_TTL_S is ignored.
stale = False
try:
written_dt = datetime.fromisoformat(written_at)
age = (datetime.now(timezone.utc) - written_dt).total_seconds()
if age > _TAKEOVER_MARKER_TTL_S:
stale = True
except (TypeError, ValueError):
stale = True # Unparseable timestamp — treat as stale
if stale:
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
# Does the marker name THIS process?
our_pid = os.getpid()
our_start_time = _get_process_start_time(our_pid)
matches = (
target_pid == our_pid
and target_start_time is not None
and our_start_time is not None
and target_start_time == our_start_time
)
# Consume the marker whether it matched or not — a marker that doesn't
# match our identity is stale-for-us anyway.
try:
path.unlink(missing_ok=True)
except OSError:
pass
return matches
def clear_takeover_marker() -> None:
"""Remove the takeover marker unconditionally. Safe to call repeatedly."""
try:
_get_takeover_marker_path().unlink(missing_ok=True)
except OSError:
pass
def get_running_pid(
pid_path: Optional[Path] = None,
*,
cleanup_stale: bool = True,
) -> Optional[int]:
"""Return the PID of a running gateway instance, or ``None``.
Checks the PID file and verifies the process is actually alive.
Cleans up stale PID files automatically.
"""
record = _read_pid_record()
resolved_pid_path = pid_path or _get_pid_path()
record = _read_pid_record(resolved_pid_path)
if not record:
remove_pid_file()
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
try:
pid = int(record["pid"])
except (KeyError, TypeError, ValueError):
remove_pid_file()
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
try:
os.kill(pid, 0) # signal 0 = existence check, no actual signal sent
except (ProcessLookupError, PermissionError):
remove_pid_file()
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
recorded_start = record.get("start_time")
current_start = _get_process_start_time(pid)
if recorded_start is not None and current_start is not None and current_start != recorded_start:
remove_pid_file()
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
if not _looks_like_gateway_process(pid):
if not _record_looks_like_gateway(record):
remove_pid_file()
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
return pid
def is_gateway_running() -> bool:
def is_gateway_running(
pid_path: Optional[Path] = None,
*,
cleanup_stale: bool = True,
) -> bool:
"""Check if the gateway daemon is currently running."""
return get_running_pid() is not None
return get_running_pid(pid_path, cleanup_stale=cleanup_stale) is not None
+46 -6
View File
@@ -100,6 +100,14 @@ class GatewayStreamConsumer:
self._flood_strikes = 0 # Consecutive flood-control edit failures
self._current_edit_interval = self.cfg.edit_interval # Adaptive backoff
self._final_response_sent = False
# Cache adapter lifecycle capability: only platforms that need an
# explicit finalize call (e.g. DingTalk AI Cards) force us to make
# a redundant final edit. Everyone else keeps the fast path.
# Use ``is True`` (not ``bool(...)``) so MagicMock attribute access
# in tests doesn't incorrectly enable this path.
self._adapter_requires_finalize: bool = (
getattr(adapter, "REQUIRES_EDIT_FINALIZE", False) is True
)
# Think-block filter state (mirrors CLI's _stream_delta tag suppression)
self._in_think_block = False
@@ -361,7 +369,16 @@ class GatewayStreamConsumer:
if not got_done and not got_segment_break and commentary_text is None:
display_text += self.cfg.cursor
current_update_visible = await self._send_or_edit(display_text)
# Segment break: finalize the current message so platforms
# that need explicit closure (e.g. DingTalk AI Cards) don't
# leave the previous segment stuck in a loading state when
# the next segment (tool progress, next chunk) creates a
# new message below it. got_done has its own finalize
# path below so we don't finalize here for it.
current_update_visible = await self._send_or_edit(
display_text,
finalize=got_segment_break,
)
self._last_edit_time = time.monotonic()
if got_done:
@@ -372,10 +389,22 @@ class GatewayStreamConsumer:
if self._accumulated:
if self._fallback_final_send:
await self._send_fallback_final(self._accumulated)
elif current_update_visible:
elif (
current_update_visible
and not self._adapter_requires_finalize
):
# Mid-stream edit above already delivered the
# final accumulated content. Skip the redundant
# final edit — but only for adapters that don't
# need an explicit finalize signal.
self._final_response_sent = True
elif self._message_id:
self._final_response_sent = await self._send_or_edit(self._accumulated)
# Either the mid-stream edit didn't run (no
# visible update this tick) OR the adapter needs
# explicit finalize=True to close the stream.
self._final_response_sent = await self._send_or_edit(
self._accumulated, finalize=True,
)
elif not self._already_sent:
self._final_response_sent = await self._send_or_edit(self._accumulated)
return
@@ -633,12 +662,15 @@ class GatewayStreamConsumer:
logger.error("Commentary send error: %s", e)
return False
async def _send_or_edit(self, text: str) -> bool:
async def _send_or_edit(self, text: str, *, finalize: bool = False) -> bool:
"""Send or edit the streaming message.
Returns True if the text was successfully delivered (sent or edited),
False otherwise. Callers like the overflow split loop use this to
decide whether to advance past the delivered chunk.
``finalize`` is True when this is the last edit in a streaming
sequence.
"""
# Strip MEDIA: directives so they don't appear as visible text.
# Media files are delivered as native attachments after the stream
@@ -672,14 +704,22 @@ class GatewayStreamConsumer:
try:
if self._message_id is not None:
if self._edit_supported:
# Skip if text is identical to what we last sent
if text == self._last_sent_text:
# Skip if text is identical to what we last sent.
# Exception: adapters that require an explicit finalize
# call (REQUIRES_EDIT_FINALIZE) must still receive the
# finalize=True edit even when content is unchanged, so
# their streaming UI can transition out of the in-
# progress state. Everyone else short-circuits.
if text == self._last_sent_text and not (
finalize and self._adapter_requires_finalize
):
return True
# Edit existing message
result = await self.adapter.edit_message(
chat_id=self.chat_id,
message_id=self._message_id,
content=text,
finalize=finalize,
)
if result.success:
self._already_sent = True
+115
View File
@@ -233,6 +233,14 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
api_key_env_vars=("XAI_API_KEY",),
base_url_env_var="XAI_BASE_URL",
),
"nvidia": ProviderConfig(
id="nvidia",
name="NVIDIA NIM",
auth_type="api_key",
inference_base_url="https://integrate.api.nvidia.com/v1",
api_key_env_vars=("NVIDIA_API_KEY",),
base_url_env_var="NVIDIA_BASE_URL",
),
"ai-gateway": ProviderConfig(
id="ai-gateway",
name="Vercel AI Gateway",
@@ -773,6 +781,28 @@ def is_source_suppressed(provider_id: str, source: str) -> bool:
return False
def unsuppress_credential_source(provider_id: str, source: str) -> bool:
"""Clear a suppression marker so the source will be re-seeded on the next load.
Returns True if a marker was cleared, False if no marker existed.
"""
with _auth_store_lock():
auth_store = _load_auth_store()
suppressed = auth_store.get("suppressed_sources")
if not isinstance(suppressed, dict):
return False
provider_list = suppressed.get(provider_id)
if not isinstance(provider_list, list) or source not in provider_list:
return False
provider_list.remove(source)
if not provider_list:
suppressed.pop(provider_id, None)
if not suppressed:
auth_store.pop("suppressed_sources", None)
_save_auth_store(auth_store)
return True
def get_provider_auth_state(provider_id: str) -> Optional[Dict[str, Any]]:
"""Return persisted auth state for a provider, or None."""
auth_store = _load_auth_store()
@@ -2129,6 +2159,62 @@ def refresh_nous_oauth_from_state(
)
NOUS_DEVICE_CODE_SOURCE = "device_code"
def persist_nous_credentials(
creds: Dict[str, Any],
*,
label: Optional[str] = None,
):
"""Persist minted Nous OAuth credentials as the singleton provider state
and ensure the credential pool is in sync.
Nous credentials are read at runtime from two independent locations:
- ``providers.nous``: singleton state read by
``resolve_nous_runtime_credentials()`` during 401 recovery and by
``_seed_from_singletons()`` during pool load.
- ``credential_pool.nous``: used by the runtime ``pool.select()`` path.
Historically ``hermes auth add nous`` wrote a ``manual:device_code`` pool
entry only, skipping ``providers.nous``. When the 24h agent_key TTL
expired, the recovery path read the empty singleton state and raised
``AuthError`` silently (``logger.debug`` at INFO level).
This helper writes ``providers.nous`` then calls ``load_pool("nous")`` so
``_seed_from_singletons`` materialises the canonical ``device_code`` pool
entry from the singleton. Re-running login upserts the same entry in
place; the pool never accumulates duplicate device_code rows.
``label`` is an optional user-chosen display name (from
``hermes auth add nous --label <name>``). It gets embedded in the
singleton state so that ``_seed_from_singletons`` uses it as the pool
entry's label on every subsequent ``load_pool("nous")`` instead of the
auto-derived token fingerprint. When ``None``, the auto-derived label
via ``label_from_token`` is used (unchanged default behaviour).
Returns the upserted :class:`PooledCredential` entry (or ``None`` if
seeding somehow produced no match shouldn't happen).
"""
from agent.credential_pool import load_pool
state = dict(creds)
if label and str(label).strip():
state["label"] = str(label).strip()
with _auth_store_lock():
auth_store = _load_auth_store()
_save_provider_state(auth_store, "nous", state)
_save_auth_store(auth_store)
pool = load_pool("nous")
return next(
(e for e in pool.entries() if e.source == NOUS_DEVICE_CODE_SOURCE),
None,
)
def resolve_nous_runtime_credentials(
*,
min_key_ttl_seconds: int = DEFAULT_AGENT_KEY_MIN_TTL_SECONDS,
@@ -3297,6 +3383,14 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
inference_base_url = auth_state["inference_base_url"]
# Snapshot the prior active_provider BEFORE _save_provider_state
# overwrites it to "nous". If the user picks "Skip (keep current)"
# during model selection below, we restore this so the user's previous
# provider (e.g. openrouter) is preserved.
with _auth_store_lock():
_prior_store = _load_auth_store()
prior_active_provider = _prior_store.get("active_provider")
with _auth_store_lock():
auth_store = _load_auth_store()
_save_provider_state(auth_store, "nous", auth_state)
@@ -3356,6 +3450,27 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
print(f"Login succeeded, but could not fetch available models. Reason: {message}")
# Write provider + model atomically so config is never mismatched.
# If no model was selected (user picked "Skip (keep current)",
# model list fetch failed, or no curated models were available),
# preserve the user's previous provider — don't silently switch
# them to Nous with a mismatched model. The Nous OAuth tokens
# stay saved for future use.
if not selected_model:
# Restore the prior active_provider that _save_provider_state
# overwrote to "nous". config.yaml model.provider is left
# untouched, so the user's previous provider is fully preserved.
with _auth_store_lock():
auth_store = _load_auth_store()
if prior_active_provider:
auth_store["active_provider"] = prior_active_provider
else:
auth_store.pop("active_provider", None)
_save_auth_store(auth_store)
print()
print("No provider change. Nous credentials saved for future use.")
print(" Run `hermes model` again to switch to Nous Portal.")
return
config_path = _update_config_for_provider(
"nous", inference_base_url, default_model=selected_model,
)
+39 -13
View File
@@ -217,22 +217,21 @@ def auth_add_command(args) -> None:
ca_bundle=getattr(args, "ca_bundle", None),
min_key_ttl_seconds=max(60, int(getattr(args, "min_key_ttl_seconds", 5 * 60))),
)
label = (getattr(args, "label", None) or "").strip() or label_from_token(
creds.get("access_token", ""),
_oauth_default_label(provider, len(pool.entries()) + 1),
# Honor `--label <name>` so nous matches other providers' UX. The
# helper embeds this into providers.nous so that label_from_token
# doesn't overwrite it on every subsequent load_pool("nous").
custom_label = (getattr(args, "label", None) or "").strip() or None
entry = auth_mod.persist_nous_credentials(creds, label=custom_label)
shown_label = entry.label if entry is not None else label_from_token(
creds.get("access_token", ""), _oauth_default_label(provider, 1),
)
entry = PooledCredential.from_dict(provider, {
**creds,
"label": label,
"auth_type": AUTH_TYPE_OAUTH,
"source": f"{SOURCE_MANUAL}:device_code",
"base_url": creds.get("inference_base_url"),
})
pool.add_entry(entry)
print(f'Added {provider} OAuth credential #{len(pool.entries())}: "{entry.label}"')
print(f'Saved {provider} OAuth device-code credentials: "{shown_label}"')
return
if provider == "openai-codex":
# Clear any existing suppression marker so a re-link after `hermes auth
# remove openai-codex` works without the new tokens being skipped.
auth_mod.unsuppress_credential_source(provider, "device_code")
creds = auth_mod._codex_device_code_login()
label = (getattr(args, "label", None) or "").strip() or label_from_token(
creds["tokens"]["access_token"],
@@ -352,7 +351,34 @@ def auth_remove_command(args) -> None:
# If this was a singleton-seeded credential (OAuth device_code, hermes_pkce),
# clear the underlying auth store / credential file so it doesn't get
# re-seeded on the next load_pool() call.
elif removed.source == "device_code" and provider in ("openai-codex", "nous"):
elif provider == "openai-codex" and (
removed.source == "device_code" or removed.source.endswith(":device_code")
):
# Codex tokens live in TWO places: the Hermes auth store and
# ~/.codex/auth.json (the Codex CLI shared file). On every refresh,
# refresh_codex_oauth_pure() writes to both. So clearing only the
# Hermes auth store is not enough — _seed_from_singletons() will
# auto-import from ~/.codex/auth.json on the next load_pool() and
# the removal is instantly undone. Mark the source as suppressed
# so auto-import is skipped; leave ~/.codex/auth.json untouched so
# the Codex CLI itself keeps working.
from hermes_cli.auth import (
_load_auth_store, _save_auth_store, _auth_store_lock,
suppress_credential_source,
)
with _auth_store_lock():
auth_store = _load_auth_store()
providers_dict = auth_store.get("providers")
if isinstance(providers_dict, dict) and provider in providers_dict:
del providers_dict[provider]
_save_auth_store(auth_store)
print(f"Cleared {provider} OAuth tokens from auth store")
suppress_credential_source(provider, "device_code")
print("Suppressed openai-codex device_code source — it will not be re-seeded.")
print("Note: Codex CLI credentials still live in ~/.codex/auth.json")
print("Run `hermes auth add openai-codex` to re-enable if needed.")
elif removed.source == "device_code" and provider == "nous":
from hermes_cli.auth import (
_load_auth_store, _save_auth_store, _auth_store_lock,
)
+119 -70
View File
@@ -7,8 +7,8 @@ CLI tools that ship with the platform (or are commonly installed).
Platform support:
macOS osascript (always available), pngpaste (if installed)
Windows PowerShell via .NET System.Windows.Forms.Clipboard
WSL2 powershell.exe via .NET System.Windows.Forms.Clipboard
Windows PowerShell via WinForms, Get-Clipboard, file-drop fallback
WSL2 powershell.exe via WinForms, Get-Clipboard, file-drop fallback
Linux wl-paste (Wayland), xclip (X11)
"""
@@ -46,10 +46,11 @@ def has_clipboard_image() -> bool:
return _macos_has_image()
if sys.platform == "win32":
return _windows_has_image()
if _is_wsl():
return _wsl_has_image()
if os.environ.get("WAYLAND_DISPLAY"):
return _wayland_has_image()
# Match _linux_save fallthrough order: WSL → Wayland → X11
if _is_wsl() and _wsl_has_image():
return True
if os.environ.get("WAYLAND_DISPLAY") and _wayland_has_image():
return True
return _xclip_has_image()
@@ -135,6 +136,114 @@ _PS_EXTRACT_IMAGE = (
"[System.Convert]::ToBase64String($ms.ToArray())"
)
_PS_CHECK_IMAGE_GET_CLIPBOARD = (
"try { "
"$img = Get-Clipboard -Format Image -ErrorAction Stop;"
"if ($null -ne $img) { 'True' } else { 'False' }"
"} catch { 'False' }"
)
_PS_EXTRACT_IMAGE_GET_CLIPBOARD = (
"try { "
"Add-Type -AssemblyName System.Drawing;"
"Add-Type -AssemblyName PresentationCore;"
"Add-Type -AssemblyName WindowsBase;"
"$img = Get-Clipboard -Format Image -ErrorAction Stop;"
"if ($null -eq $img) { exit 1 }"
"$ms = New-Object System.IO.MemoryStream;"
"if ($img -is [System.Drawing.Image]) {"
"$img.Save($ms, [System.Drawing.Imaging.ImageFormat]::Png)"
"} elseif ($img -is [System.Windows.Media.Imaging.BitmapSource]) {"
"$enc = New-Object System.Windows.Media.Imaging.PngBitmapEncoder;"
"$enc.Frames.Add([System.Windows.Media.Imaging.BitmapFrame]::Create($img));"
"$enc.Save($ms)"
"} else { exit 2 }"
"[System.Convert]::ToBase64String($ms.ToArray())"
"} catch { exit 1 }"
)
_FILEDROP_IMAGE_EXTS = "'.png','.jpg','.jpeg','.gif','.webp','.bmp','.tiff','.tif'"
_PS_CHECK_FILEDROP_IMAGE = (
"try { "
"$files = Get-Clipboard -Format FileDropList -ErrorAction Stop;"
f"$exts = @({_FILEDROP_IMAGE_EXTS});"
"$hit = $files | Where-Object { $exts -contains ([System.IO.Path]::GetExtension($_).ToLowerInvariant()) } | Select-Object -First 1;"
"if ($null -ne $hit) { 'True' } else { 'False' }"
"} catch { 'False' }"
)
_PS_EXTRACT_FILEDROP_IMAGE = (
"try { "
"$files = Get-Clipboard -Format FileDropList -ErrorAction Stop;"
f"$exts = @({_FILEDROP_IMAGE_EXTS});"
"$hit = $files | Where-Object { $exts -contains ([System.IO.Path]::GetExtension($_).ToLowerInvariant()) } | Select-Object -First 1;"
"if ($null -eq $hit) { exit 1 }"
"[System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes($hit))"
"} catch { exit 1 }"
)
_POWERSHELL_HAS_IMAGE_SCRIPTS = (
_PS_CHECK_IMAGE,
_PS_CHECK_IMAGE_GET_CLIPBOARD,
_PS_CHECK_FILEDROP_IMAGE,
)
_POWERSHELL_EXTRACT_IMAGE_SCRIPTS = (
_PS_EXTRACT_IMAGE,
_PS_EXTRACT_IMAGE_GET_CLIPBOARD,
_PS_EXTRACT_FILEDROP_IMAGE,
)
def _run_powershell(exe: str, script: str, timeout: int) -> subprocess.CompletedProcess:
return subprocess.run(
[exe, "-NoProfile", "-NonInteractive", "-Command", script],
capture_output=True, text=True, timeout=timeout,
)
def _write_base64_image(dest: Path, b64_data: str) -> bool:
image_bytes = base64.b64decode(b64_data, validate=True)
dest.write_bytes(image_bytes)
return dest.exists() and dest.stat().st_size > 0
def _powershell_has_image(exe: str, *, timeout: int, label: str) -> bool:
for script in _POWERSHELL_HAS_IMAGE_SCRIPTS:
try:
r = _run_powershell(exe, script, timeout=timeout)
if r.returncode == 0 and "True" in r.stdout:
return True
except FileNotFoundError:
logger.debug("%s not found — clipboard unavailable", exe)
return False
except Exception as e:
logger.debug("%s clipboard image check failed: %s", label, e)
return False
def _powershell_save_image(exe: str, dest: Path, *, timeout: int, label: str) -> bool:
for script in _POWERSHELL_EXTRACT_IMAGE_SCRIPTS:
try:
r = _run_powershell(exe, script, timeout=timeout)
if r.returncode != 0:
continue
b64_data = r.stdout.strip()
if not b64_data:
continue
if _write_base64_image(dest, b64_data):
return True
except FileNotFoundError:
logger.debug("%s not found — clipboard unavailable", exe)
return False
except Exception as e:
logger.debug("%s clipboard image extraction failed: %s", label, e)
dest.unlink(missing_ok=True)
return False
# ── Native Windows ────────────────────────────────────────────────────────
@@ -175,15 +284,7 @@ def _windows_has_image() -> bool:
ps = _get_ps_exe()
if ps is None:
return False
try:
r = subprocess.run(
[ps, "-NoProfile", "-NonInteractive", "-Command", _PS_CHECK_IMAGE],
capture_output=True, text=True, timeout=5,
)
return r.returncode == 0 and "True" in r.stdout
except Exception as e:
logger.debug("Windows clipboard image check failed: %s", e)
return False
return _powershell_has_image(ps, timeout=5, label="Windows")
def _windows_save(dest: Path) -> bool:
@@ -192,26 +293,7 @@ def _windows_save(dest: Path) -> bool:
if ps is None:
logger.debug("No PowerShell found — Windows clipboard image paste unavailable")
return False
try:
r = subprocess.run(
[ps, "-NoProfile", "-NonInteractive", "-Command", _PS_EXTRACT_IMAGE],
capture_output=True, text=True, timeout=15,
)
if r.returncode != 0:
return False
b64_data = r.stdout.strip()
if not b64_data:
return False
png_bytes = base64.b64decode(b64_data)
dest.write_bytes(png_bytes)
return dest.exists() and dest.stat().st_size > 0
except Exception as e:
logger.debug("Windows clipboard image extraction failed: %s", e)
dest.unlink(missing_ok=True)
return False
return _powershell_save_image(ps, dest, timeout=15, label="Windows")
# ── Linux ────────────────────────────────────────────────────────────────
@@ -235,45 +317,12 @@ def _linux_save(dest: Path) -> bool:
def _wsl_has_image() -> bool:
"""Check if Windows clipboard has an image (via powershell.exe)."""
try:
r = subprocess.run(
["powershell.exe", "-NoProfile", "-NonInteractive", "-Command",
_PS_CHECK_IMAGE],
capture_output=True, text=True, timeout=8,
)
return r.returncode == 0 and "True" in r.stdout
except FileNotFoundError:
logger.debug("powershell.exe not found — WSL clipboard unavailable")
except Exception as e:
logger.debug("WSL clipboard check failed: %s", e)
return False
return _powershell_has_image("powershell.exe", timeout=8, label="WSL")
def _wsl_save(dest: Path) -> bool:
"""Extract clipboard image via powershell.exe → base64 → decode to PNG."""
try:
r = subprocess.run(
["powershell.exe", "-NoProfile", "-NonInteractive", "-Command",
_PS_EXTRACT_IMAGE],
capture_output=True, text=True, timeout=15,
)
if r.returncode != 0:
return False
b64_data = r.stdout.strip()
if not b64_data:
return False
png_bytes = base64.b64decode(b64_data)
dest.write_bytes(png_bytes)
return dest.exists() and dest.stat().st_size > 0
except FileNotFoundError:
logger.debug("powershell.exe not found — WSL clipboard unavailable")
except Exception as e:
logger.debug("WSL clipboard extraction failed: %s", e)
dest.unlink(missing_ok=True)
return False
return _powershell_save_image("powershell.exe", dest, timeout=15, label="WSL")
# ── Wayland (wl-paste) ──────────────────────────────────────────────────
+95 -7
View File
@@ -87,8 +87,12 @@ COMMAND_REGISTRY: list[CommandDef] = [
aliases=("bg",), args_hint="<prompt>"),
CommandDef("btw", "Ephemeral side question using session context (no tools, not persisted)", "Session",
args_hint="<question>"),
CommandDef("agents", "Show active agents and running tasks", "Session",
aliases=("tasks",)),
CommandDef("queue", "Queue a prompt for the next turn (doesn't interrupt)", "Session",
aliases=("q",), args_hint="<prompt>"),
CommandDef("steer", "Inject a message after the next tool call without interrupting", "Session",
args_hint="<prompt>"),
CommandDef("status", "Show session info", "Session"),
CommandDef("profile", "Show active profile name and home directory", "Info"),
CommandDef("sethome", "Set this chat as the home channel", "Session",
@@ -99,7 +103,7 @@ COMMAND_REGISTRY: list[CommandDef] = [
# Configuration
CommandDef("config", "Show current configuration", "Configuration",
cli_only=True),
CommandDef("model", "Switch model for this session", "Configuration", args_hint="[model] [--global]"),
CommandDef("model", "Switch model for this session", "Configuration", args_hint="[model] [--provider name] [--global]"),
CommandDef("provider", "Show available providers and current provider",
"Configuration"),
CommandDef("gquota", "Show Google Gemini Code Assist quota usage", "Info"),
@@ -120,7 +124,7 @@ COMMAND_REGISTRY: list[CommandDef] = [
args_hint="[normal|fast|status]",
subcommands=("normal", "fast", "status", "on", "off")),
CommandDef("skin", "Show or change the display skin/theme", "Configuration",
cli_only=True, args_hint="[name]"),
args_hint="[name]"),
CommandDef("voice", "Toggle voice mode", "Configuration",
args_hint="[on|off|tts|status]", subcommands=("on", "off", "tts", "status")),
@@ -155,7 +159,9 @@ COMMAND_REGISTRY: list[CommandDef] = [
args_hint="[days]"),
CommandDef("platforms", "Show gateway/messaging platform status", "Info",
cli_only=True, aliases=("gateway",)),
CommandDef("paste", "Check clipboard for an image and attach it", "Info",
CommandDef("copy", "Copy the last assistant response to clipboard", "Info",
cli_only=True, args_hint="[number]"),
CommandDef("paste", "Attach clipboard image from your clipboard", "Info",
cli_only=True),
CommandDef("image", "Attach a local image file for your next prompt", "Info",
cli_only=True, args_hint="<path>"),
@@ -254,6 +260,36 @@ GATEWAY_KNOWN_COMMANDS: frozenset[str] = frozenset(
)
# Commands that must never be queued behind an active gateway session.
# These are explicit control/info commands handled by the gateway itself;
# if they get queued as pending text, the safety net in gateway.run will
# discard them before they ever reach the user.
ACTIVE_SESSION_BYPASS_COMMANDS: frozenset[str] = frozenset(
{
"agents",
"approve",
"background",
"commands",
"deny",
"help",
"new",
"profile",
"queue",
"restart",
"status",
"steer",
"stop",
"update",
}
)
def should_bypass_active_session(command_name: str | None) -> bool:
"""Return True when a slash command must bypass active-session queuing."""
cmd = resolve_command(command_name) if command_name else None
return bool(cmd and cmd.name in ACTIVE_SESSION_BYPASS_COMMANDS)
def _resolve_config_gates() -> set[str]:
"""Return canonical names of commands whose ``gateway_config_gate`` is truthy.
@@ -1044,6 +1080,51 @@ class SlashCommandCompleter(Completer):
display_meta=f"{fp} {meta}" if meta else fp,
)
@staticmethod
def _skin_completions(sub_text: str, sub_lower: str):
"""Yield completions for /skin from available skins."""
try:
from hermes_cli.skin_engine import list_skins
for s in list_skins():
name = s["name"]
if name.startswith(sub_lower) and name != sub_lower:
yield Completion(
name,
start_position=-len(sub_text),
display=name,
display_meta=s.get("description", "") or s.get("source", ""),
)
except Exception:
pass
@staticmethod
def _personality_completions(sub_text: str, sub_lower: str):
"""Yield completions for /personality from configured personalities."""
try:
from hermes_cli.config import load_config
personalities = load_config().get("agent", {}).get("personalities", {})
if "none".startswith(sub_lower) and "none" != sub_lower:
yield Completion(
"none",
start_position=-len(sub_text),
display="none",
display_meta="clear personality overlay",
)
for name, prompt in personalities.items():
if name.startswith(sub_lower) and name != sub_lower:
if isinstance(prompt, dict):
meta = prompt.get("description") or prompt.get("system_prompt", "")[:50]
else:
meta = str(prompt)[:50]
yield Completion(
name,
start_position=-len(sub_text),
display=name,
display_meta=meta,
)
except Exception:
pass
def _model_completions(self, sub_text: str, sub_lower: str):
"""Yield completions for /model from config aliases + built-in aliases."""
seen = set()
@@ -1098,10 +1179,17 @@ class SlashCommandCompleter(Completer):
sub_text = parts[1] if len(parts) > 1 else ""
sub_lower = sub_text.lower()
# Dynamic model alias completions for /model
if " " not in sub_text and base_cmd == "/model":
yield from self._model_completions(sub_text, sub_lower)
return
# Dynamic completions for commands with runtime lists
if " " not in sub_text:
if base_cmd == "/model":
yield from self._model_completions(sub_text, sub_lower)
return
if base_cmd == "/skin":
yield from self._skin_completions(sub_text, sub_lower)
return
if base_cmd == "/personality":
yield from self._personality_completions(sub_text, sub_lower)
return
# Static subcommand completions
if " " not in sub_text and base_cmd in SUBCOMMANDS and self._command_allowed(base_cmd):
+139 -8
View File
@@ -12,6 +12,7 @@ This module provides:
- hermes config wizard - Re-run setup wizard
"""
import copy
import os
import platform
import re
@@ -26,6 +27,7 @@ from typing import Dict, Any, Optional, List, Tuple
_IS_WINDOWS = platform.system() == "Windows"
_ENV_VAR_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_LAST_EXPANDED_CONFIG_BY_PATH: Dict[str, Any] = {}
# Env var names written to .env that aren't in OPTIONAL_ENV_VARS
# (managed by setup/provider flows directly).
_EXTRA_ENV_KEYS = frozenset({
@@ -44,7 +46,8 @@ _EXTRA_ENV_KEYS = frozenset({
"WEIXIN_HOME_CHANNEL", "WEIXIN_HOME_CHANNEL_NAME", "WEIXIN_DM_POLICY", "WEIXIN_GROUP_POLICY",
"WEIXIN_ALLOWED_USERS", "WEIXIN_GROUP_ALLOWED_USERS", "WEIXIN_ALLOW_ALL_USERS",
"BLUEBUBBLES_SERVER_URL", "BLUEBUBBLES_PASSWORD",
"QQ_APP_ID", "QQ_CLIENT_SECRET", "QQ_HOME_CHANNEL", "QQ_HOME_CHANNEL_NAME",
"QQ_APP_ID", "QQ_CLIENT_SECRET", "QQBOT_HOME_CHANNEL", "QQBOT_HOME_CHANNEL_NAME",
"QQ_HOME_CHANNEL", "QQ_HOME_CHANNEL_NAME", # legacy aliases (pre-rename, still read for back-compat)
"QQ_ALLOWED_USERS", "QQ_GROUP_ALLOWED_USERS", "QQ_ALLOW_ALL_USERS", "QQ_MARKDOWN_SUPPORT",
"QQ_STT_API_KEY", "QQ_STT_BASE_URL", "QQ_STT_MODEL",
"TERMINAL_ENV", "TERMINAL_SSH_KEY", "TERMINAL_SSH_PORT",
@@ -417,6 +420,7 @@ DEFAULT_CONFIG = {
"command_timeout": 30, # Timeout for browser commands in seconds (screenshot, navigate, etc.)
"record_sessions": False, # Auto-record browser sessions as WebM videos
"allow_private_urls": False, # Allow navigating to private/internal IPs (localhost, 192.168.x.x, etc.)
"cdp_url": "", # Optional persistent CDP endpoint for attaching to an existing Chromium/Chrome
"camofox": {
# When true, Hermes sends a stable profile-scoped userId to Camofox
# so the server maps it to a persistent Firefox profile automatically.
@@ -537,6 +541,13 @@ DEFAULT_CONFIG = {
"api_key": "",
"timeout": 30,
},
"title_generation": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
},
"display": {
@@ -760,6 +771,20 @@ DEFAULT_CONFIG = {
"wrap_response": True,
},
# execute_code settings — controls the tool used for programmatic tool calls.
"code_execution": {
# Execution mode:
# project (default) — scripts run in the session's working directory
# with the active virtualenv/conda env's python, so project deps
# (pandas, torch, project packages) and relative paths resolve.
# strict — scripts run in an isolated temp directory with
# hermes-agent's own python (sys.executable). Maximum isolation
# and reproducibility; project deps and relative paths won't work.
# Env scrubbing (strips *_API_KEY, *_TOKEN, *_SECRET, ...) and the
# tool whitelist apply identically in both modes.
"mode": "project",
},
# Logging — controls file logging to ~/.hermes/logs/.
# agent.log captures INFO+ (all agent activity); errors.log captures WARNING+.
"logging": {
@@ -777,7 +802,7 @@ DEFAULT_CONFIG = {
},
# Config schema version - bump this when adding new required fields
"_config_version": 18,
"_config_version": 19,
}
# =============================================================================
@@ -861,6 +886,22 @@ OPTIONAL_ENV_VARS = {
"category": "provider",
"advanced": True,
},
"NVIDIA_API_KEY": {
"description": "NVIDIA NIM API key (build.nvidia.com or local NIM endpoint)",
"prompt": "NVIDIA NIM API key",
"url": "https://build.nvidia.com/",
"password": True,
"category": "provider",
"advanced": True,
},
"NVIDIA_BASE_URL": {
"description": "NVIDIA NIM base URL override (e.g. http://localhost:8000/v1 for local NIM)",
"prompt": "NVIDIA NIM base URL (leave empty for default)",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
"GLM_API_KEY": {
"description": "Z.AI / GLM API key (also recognized as ZAI_API_KEY / Z_AI_API_KEY)",
"prompt": "Z.AI / GLM API key",
@@ -1518,12 +1559,12 @@ OPTIONAL_ENV_VARS = {
"prompt": "Allow All QQ Users",
"category": "messaging",
},
"QQ_HOME_CHANNEL": {
"QQBOT_HOME_CHANNEL": {
"description": "Default QQ channel/group for cron delivery and notifications",
"prompt": "QQ Home Channel",
"category": "messaging",
},
"QQ_HOME_CHANNEL_NAME": {
"QQBOT_HOME_CHANNEL_NAME": {
"description": "Display name for the QQ home channel",
"prompt": "QQ Home Channel Name",
"category": "messaging",
@@ -2610,6 +2651,85 @@ def _expand_env_vars(obj):
return obj
def _items_by_unique_name(items):
"""Return a name-indexed dict only when all items have unique string names."""
if not isinstance(items, list):
return None
indexed = {}
for item in items:
if not isinstance(item, dict) or not isinstance(item.get("name"), str):
return None
name = item["name"]
if name in indexed:
return None
indexed[name] = item
return indexed
def _preserve_env_ref_templates(current, raw, loaded_expanded=None):
"""Restore raw ``${VAR}`` templates when a value is otherwise unchanged.
``load_config()`` expands env refs for runtime use. When a caller later
persists that config after modifying some unrelated setting, keep the
original on-disk template instead of writing the expanded plaintext
secret back to ``config.yaml``.
Prefer preserving the raw template when ``current`` still matches either
the value previously returned by ``load_config()`` for this config path or
the current environment expansion of ``raw``. This handles env-var
rotation between load and save while still treating mixed literal/template
string edits as caller-owned once their rendered value diverges.
"""
if isinstance(current, str) and isinstance(raw, str) and re.search(r"\${[^}]+}", raw):
if current == raw:
return raw
if isinstance(loaded_expanded, str) and current == loaded_expanded:
return raw
if _expand_env_vars(raw) == current:
return raw
return current
if isinstance(current, dict) and isinstance(raw, dict):
return {
key: _preserve_env_ref_templates(
value,
raw.get(key),
loaded_expanded.get(key) if isinstance(loaded_expanded, dict) else None,
)
for key, value in current.items()
}
if isinstance(current, list) and isinstance(raw, list):
# Prefer matching named config objects (e.g. custom_providers) by name
# so harmless reordering doesn't drop the original template. If names
# are duplicated, fall back to positional matching instead of silently
# shadowing one entry.
current_by_name = _items_by_unique_name(current)
raw_by_name = _items_by_unique_name(raw)
loaded_by_name = _items_by_unique_name(loaded_expanded)
if current_by_name is not None and raw_by_name is not None:
return [
_preserve_env_ref_templates(
item,
raw_by_name.get(item.get("name")),
loaded_by_name.get(item.get("name")) if loaded_by_name is not None else None,
)
for item in current
]
return [
_preserve_env_ref_templates(
item,
raw[index] if index < len(raw) else None,
loaded_expanded[index]
if isinstance(loaded_expanded, list) and index < len(loaded_expanded)
else None,
)
for index, item in enumerate(current)
]
return current
def _normalize_root_model_keys(config: Dict[str, Any]) -> Dict[str, Any]:
"""Move stale root-level provider/base_url into model section.
@@ -2677,7 +2797,6 @@ def read_raw_config() -> Dict[str, Any]:
def load_config() -> Dict[str, Any]:
"""Load configuration from ~/.hermes/config.yaml."""
import copy
ensure_hermes_home()
config_path = get_config_path()
@@ -2698,8 +2817,11 @@ def load_config() -> Dict[str, Any]:
config = _deep_merge(config, user_config)
except Exception as e:
print(f"Warning: Failed to load config: {e}")
return _expand_env_vars(_normalize_root_model_keys(_normalize_max_turns_config(config)))
normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
expanded = _expand_env_vars(normalized)
_LAST_EXPANDED_CONFIG_BY_PATH[str(config_path)] = copy.deepcopy(expanded)
return expanded
_SECURITY_COMMENT = """
@@ -2808,7 +2930,15 @@ def save_config(config: Dict[str, Any]):
ensure_hermes_home()
config_path = get_config_path()
normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
current_normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
normalized = current_normalized
raw_existing = _normalize_root_model_keys(_normalize_max_turns_config(read_raw_config()))
if raw_existing:
normalized = _preserve_env_ref_templates(
normalized,
raw_existing,
_LAST_EXPANDED_CONFIG_BY_PATH.get(str(config_path)),
)
# Build optional commented-out sections for features that are off by
# default or only relevant when explicitly configured.
@@ -2826,6 +2956,7 @@ def save_config(config: Dict[str, Any]):
extra_content="".join(parts) if parts else None,
)
_secure_file(config_path)
_LAST_EXPANDED_CONFIG_BY_PATH[str(config_path)] = copy.deepcopy(current_normalized)
def load_env() -> Dict[str, str]:
+137 -29
View File
@@ -6,7 +6,10 @@ Currently supports:
"""
import io
import json
import os
import sys
import time
import urllib.error
import urllib.parse
import urllib.request
@@ -31,6 +34,119 @@ _MAX_LOG_BYTES = 512_000
_AUTO_DELETE_SECONDS = 21600
# ---------------------------------------------------------------------------
# Pending-deletion tracking (replaces the old fork-and-sleep subprocess).
# ---------------------------------------------------------------------------
def _pending_file() -> Path:
"""Path to ``~/.hermes/pastes/pending.json``.
Each entry: ``{"url": "...", "expire_at": <unix_ts>}``. Scheduled
DELETEs used to be handled by spawning a detached Python process per
paste that slept for 6 hours; those accumulated forever if the user
ran ``hermes debug share`` repeatedly. We now persist the schedule
to disk and sweep expired entries on the next debug invocation.
"""
return get_hermes_home() / "pastes" / "pending.json"
def _load_pending() -> list[dict]:
path = _pending_file()
if not path.exists():
return []
try:
data = json.loads(path.read_text(encoding="utf-8"))
if isinstance(data, list):
# Filter to well-formed entries only
return [
e for e in data
if isinstance(e, dict) and "url" in e and "expire_at" in e
]
except (OSError, ValueError, json.JSONDecodeError):
pass
return []
def _save_pending(entries: list[dict]) -> None:
path = _pending_file()
try:
path.parent.mkdir(parents=True, exist_ok=True)
tmp = path.with_suffix(".json.tmp")
tmp.write_text(json.dumps(entries, indent=2), encoding="utf-8")
os.replace(tmp, path)
except OSError:
# Non-fatal — worst case the user has to run ``hermes debug delete``
# manually.
pass
def _record_pending(urls: list[str], delay_seconds: int = _AUTO_DELETE_SECONDS) -> None:
"""Record *urls* for deletion at ``now + delay_seconds``.
Only paste.rs URLs are recorded (dpaste.com auto-expires). Entries
are merged into any existing pending.json.
"""
paste_rs_urls = [u for u in urls if _extract_paste_id(u)]
if not paste_rs_urls:
return
entries = _load_pending()
# Dedupe by URL: keep the later expire_at if same URL appears twice
by_url: dict[str, float] = {e["url"]: float(e["expire_at"]) for e in entries}
expire_at = time.time() + delay_seconds
for u in paste_rs_urls:
by_url[u] = max(expire_at, by_url.get(u, 0.0))
merged = [{"url": u, "expire_at": ts} for u, ts in by_url.items()]
_save_pending(merged)
def _sweep_expired_pastes(now: Optional[float] = None) -> tuple[int, int]:
"""Synchronously DELETE any pending pastes whose ``expire_at`` has passed.
Returns ``(deleted, remaining)``. Best-effort: failed deletes stay in
the pending file and will be retried on the next sweep. Silent
intended to be called from every ``hermes debug`` invocation with
minimal noise.
"""
entries = _load_pending()
if not entries:
return (0, 0)
current = time.time() if now is None else now
deleted = 0
remaining: list[dict] = []
for entry in entries:
try:
expire_at = float(entry.get("expire_at", 0))
except (TypeError, ValueError):
continue # drop malformed entries
if expire_at > current:
remaining.append(entry)
continue
url = entry.get("url", "")
try:
if delete_paste(url):
deleted += 1
continue
except Exception:
# Network hiccup, 404 (already gone), etc. — drop the entry
# after a grace period; don't retry forever.
pass
# Retain failed deletes for up to 24h past expiration, then give up.
if expire_at + 86400 > current:
remaining.append(entry)
else:
deleted += 1 # count as reaped (paste.rs will GC eventually)
if deleted:
_save_pending(remaining)
return (deleted, len(remaining))
# ---------------------------------------------------------------------------
# Privacy / delete helpers
# ---------------------------------------------------------------------------
@@ -90,37 +206,19 @@ def delete_paste(url: str) -> bool:
def _schedule_auto_delete(urls: list[str], delay_seconds: int = _AUTO_DELETE_SECONDS):
"""Spawn a detached process to delete paste.rs pastes after *delay_seconds*.
"""Record *urls* for deletion ``delay_seconds`` from now.
The child process is fully detached (``start_new_session=True``) so it
survives the parent exiting (important for CLI mode). Only paste.rs
URLs are attempted dpaste.com pastes auto-expire on their own.
Previously this spawned a detached Python subprocess per call that slept
for 6 hours and then issued DELETE requests. Those subprocesses leaked
every ``hermes debug share`` invocation added ~20 MB of resident Python
interpreters that never exited until the sleep completed.
The replacement is stateless: we append to ``~/.hermes/pastes/pending.json``
and rely on opportunistic sweeps (``_sweep_expired_pastes``) called from
every ``hermes debug`` invocation. If the user never runs ``hermes debug``
again, paste.rs's own retention policy handles cleanup.
"""
import subprocess
paste_rs_urls = [u for u in urls if _extract_paste_id(u)]
if not paste_rs_urls:
return
# Build a tiny inline Python script. No imports beyond stdlib.
url_list = ", ".join(f'"{u}"' for u in paste_rs_urls)
script = (
"import time, urllib.request; "
f"time.sleep({delay_seconds}); "
f"[urllib.request.urlopen(urllib.request.Request(u, method='DELETE', "
f"headers={{'User-Agent': 'hermes-agent/auto-delete'}}), timeout=15) "
f"for u in [{url_list}]]"
)
try:
subprocess.Popen(
[sys.executable, "-c", script],
start_new_session=True,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
)
except Exception:
pass # Best-effort; manual delete still available.
_record_pending(urls, delay_seconds=delay_seconds)
def _delete_hint(url: str) -> str:
@@ -455,6 +553,16 @@ def run_debug_delete(args):
def run_debug(args):
"""Route debug subcommands."""
# Opportunistic sweep of expired pastes on every ``hermes debug`` call.
# Replaces the old per-paste sleeping subprocess that used to leak as
# one orphaned Python interpreter per scheduled deletion. Silent and
# best-effort — any failure is swallowed so ``hermes debug`` stays
# reliable even when offline.
try:
_sweep_expired_pastes()
except Exception:
pass
subcmd = getattr(args, "debug_command", None)
if subcmd == "share":
run_debug_share(args)
+294
View File
@@ -0,0 +1,294 @@
"""
DingTalk Device Flow authorization.
Implements the same 3-step registration flow as dingtalk-openclaw-connector:
1. POST /app/registration/init get nonce
2. POST /app/registration/begin get device_code + verification_uri_complete
3. POST /app/registration/poll poll until SUCCESS get client_id + client_secret
The verification_uri_complete is rendered as a QR code in the terminal so the
user can scan it with DingTalk to authorize, yielding AppKey + AppSecret
automatically.
"""
from __future__ import annotations
import io
import os
import sys
import time
import logging
from typing import Optional, Tuple
import requests
logger = logging.getLogger(__name__)
# ── Configuration ──────────────────────────────────────────────────────────
REGISTRATION_BASE_URL = os.environ.get(
"DINGTALK_REGISTRATION_BASE_URL", "https://oapi.dingtalk.com"
).rstrip("/")
REGISTRATION_SOURCE = os.environ.get("DINGTALK_REGISTRATION_SOURCE", "openClaw")
# ── API helpers ────────────────────────────────────────────────────────────
class RegistrationError(Exception):
"""Raised when a DingTalk registration API call fails."""
def _api_post(path: str, payload: dict) -> dict:
"""POST to the registration API and return the parsed JSON body."""
url = f"{REGISTRATION_BASE_URL}{path}"
try:
resp = requests.post(url, json=payload, timeout=15)
resp.raise_for_status()
data = resp.json()
except requests.RequestException as exc:
raise RegistrationError(f"Network error calling {url}: {exc}") from exc
errcode = data.get("errcode", -1)
if errcode != 0:
errmsg = data.get("errmsg", "unknown error")
raise RegistrationError(f"API error [{path}]: {errmsg} (errcode={errcode})")
return data
# ── Core flow ──────────────────────────────────────────────────────────────
def begin_registration() -> dict:
"""Start a device-flow registration.
Returns a dict with keys:
device_code, verification_uri_complete, expires_in, interval
"""
# Step 1: init → nonce
init_data = _api_post("/app/registration/init", {"source": REGISTRATION_SOURCE})
nonce = str(init_data.get("nonce", "")).strip()
if not nonce:
raise RegistrationError("init response missing nonce")
# Step 2: begin → device_code, verification_uri_complete
begin_data = _api_post("/app/registration/begin", {"nonce": nonce})
device_code = str(begin_data.get("device_code", "")).strip()
verification_uri_complete = str(begin_data.get("verification_uri_complete", "")).strip()
if not device_code:
raise RegistrationError("begin response missing device_code")
if not verification_uri_complete:
raise RegistrationError("begin response missing verification_uri_complete")
return {
"device_code": device_code,
"verification_uri_complete": verification_uri_complete,
"expires_in": int(begin_data.get("expires_in", 7200)),
"interval": max(int(begin_data.get("interval", 3)), 2),
}
def poll_registration(device_code: str) -> dict:
"""Poll the registration status once.
Returns a dict with keys: status, client_id?, client_secret?, fail_reason?
"""
data = _api_post("/app/registration/poll", {"device_code": device_code})
status_raw = str(data.get("status", "")).strip().upper()
if status_raw not in ("WAITING", "SUCCESS", "FAIL", "EXPIRED"):
status_raw = "UNKNOWN"
return {
"status": status_raw,
"client_id": str(data.get("client_id", "")).strip() or None,
"client_secret": str(data.get("client_secret", "")).strip() or None,
"fail_reason": str(data.get("fail_reason", "")).strip() or None,
}
def wait_for_registration_success(
device_code: str,
interval: int = 3,
expires_in: int = 7200,
on_waiting: Optional[callable] = None,
) -> Tuple[str, str]:
"""Block until the registration succeeds or times out.
Returns (client_id, client_secret).
"""
deadline = time.monotonic() + expires_in
retry_window = 120 # 2 minutes for transient errors
retry_start = 0.0
while time.monotonic() < deadline:
time.sleep(interval)
try:
result = poll_registration(device_code)
except RegistrationError:
if retry_start == 0:
retry_start = time.monotonic()
if time.monotonic() - retry_start < retry_window:
continue
raise
status = result["status"]
if status == "WAITING":
retry_start = 0
if on_waiting:
on_waiting()
continue
if status == "SUCCESS":
cid = result["client_id"]
csecret = result["client_secret"]
if not cid or not csecret:
raise RegistrationError("authorization succeeded but credentials are missing")
return cid, csecret
# FAIL / EXPIRED / UNKNOWN
if retry_start == 0:
retry_start = time.monotonic()
if time.monotonic() - retry_start < retry_window:
continue
reason = result.get("fail_reason") or status
raise RegistrationError(f"authorization failed: {reason}")
raise RegistrationError("authorization timed out, please retry")
# ── QR code rendering ─────────────────────────────────────────────────────
def _ensure_qrcode_installed() -> bool:
"""Try to import qrcode; if missing, auto-install it via pip/uv."""
try:
import qrcode # noqa: F401
return True
except ImportError:
pass
import subprocess
# Try uv first (Hermes convention), then pip
for cmd in (
[sys.executable, "-m", "uv", "pip", "install", "qrcode"],
[sys.executable, "-m", "pip", "install", "-q", "qrcode"],
):
try:
subprocess.check_call(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
import qrcode # noqa: F401,F811
return True
except (subprocess.CalledProcessError, ImportError, FileNotFoundError):
continue
return False
def render_qr_to_terminal(url: str) -> bool:
"""Render *url* as a compact QR code in the terminal.
Returns True if the QR code was printed, False if the library is missing.
"""
try:
import qrcode
except ImportError:
return False
qr = qrcode.QRCode(
version=1,
error_correction=qrcode.constants.ERROR_CORRECT_L,
box_size=1,
border=1,
)
qr.add_data(url)
qr.make(fit=True)
# Use half-block characters for compact rendering (2 rows per character)
matrix = qr.get_matrix()
rows = len(matrix)
lines: list[str] = []
TOP_HALF = "\u2580" # ▀
BOTTOM_HALF = "\u2584" # ▄
FULL_BLOCK = "\u2588" # █
EMPTY = " "
for r in range(0, rows, 2):
line_chars: list[str] = []
for c in range(len(matrix[r])):
top = matrix[r][c]
bottom = matrix[r + 1][c] if r + 1 < rows else False
if top and bottom:
line_chars.append(FULL_BLOCK)
elif top:
line_chars.append(TOP_HALF)
elif bottom:
line_chars.append(BOTTOM_HALF)
else:
line_chars.append(EMPTY)
lines.append(" " + "".join(line_chars))
print("\n".join(lines))
return True
# ── High-level entry point for the setup wizard ───────────────────────────
def dingtalk_qr_auth() -> Optional[Tuple[str, str]]:
"""Run the interactive QR-code device-flow authorization.
Returns (client_id, client_secret) on success, or None if the user
cancelled or the flow failed.
"""
from hermes_cli.setup import print_info, print_success, print_warning, print_error
print()
print_info(" Initializing DingTalk device authorization...")
print_info(" Note: the scan page is branded 'OpenClaw' — DingTalk's")
print_info(" ecosystem onboarding bridge. Safe to use.")
try:
reg = begin_registration()
except RegistrationError as exc:
print_error(f" Authorization init failed: {exc}")
return None
url = reg["verification_uri_complete"]
# Ensure qrcode library is available (auto-install if missing)
if not _ensure_qrcode_installed():
print_warning(" qrcode library install failed, will show link only.")
print()
print_info(" Please scan the QR code below with DingTalk to authorize:")
print()
if not render_qr_to_terminal(url):
print_warning(f" QR code render failed, please open the link below to authorize:")
print()
print_info(f" Or open this link manually: {url}")
print()
print_info(" Waiting for QR scan authorization... (timeout: 2 hours)")
dot_count = 0
def _on_waiting():
nonlocal dot_count
dot_count += 1
if dot_count % 10 == 0:
sys.stdout.write(".")
sys.stdout.flush()
try:
client_id, client_secret = wait_for_registration_success(
device_code=reg["device_code"],
interval=reg["interval"],
expires_in=reg["expires_in"],
on_waiting=_on_waiting,
)
except RegistrationError as exc:
print()
print_error(f" Authorization failed: {exc}")
return None
print()
print_success(" QR scan authorization successful!")
print_success(f" Client ID: {client_id}")
print_success(f" Client Secret: {client_secret[:8]}{'*' * (len(client_secret) - 8)}")
return client_id, client_secret
+3 -2
View File
@@ -825,6 +825,7 @@ def run_doctor(args):
("Arcee AI", ("ARCEEAI_API_KEY",), "https://api.arcee.ai/api/v1/models", "ARCEE_BASE_URL", True),
("DeepSeek", ("DEEPSEEK_API_KEY",), "https://api.deepseek.com/v1/models", "DEEPSEEK_BASE_URL", True),
("Hugging Face", ("HF_TOKEN",), "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
("NVIDIA NIM", ("NVIDIA_API_KEY",), "https://integrate.api.nvidia.com/v1/models", "NVIDIA_BASE_URL", True),
("Alibaba/DashScope", ("DASHSCOPE_API_KEY",), "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
# MiniMax: the /anthropic endpoint doesn't support /models, but the /v1 endpoint does.
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL", True),
@@ -894,8 +895,8 @@ def run_doctor(args):
_model_count = len(_br_resp.get("modelSummaries", []))
print(f"\r {color('', Colors.GREEN)} {_label} {color(f'({_auth_var}, {_region}, {_model_count} models)', Colors.DIM)} ")
except ImportError:
print(f"\r {color('', Colors.YELLOW)} {_label} {color('(boto3 not installed — pip install hermes-agent[bedrock])', Colors.DIM)} ")
issues.append("Install boto3 for Bedrock: pip install hermes-agent[bedrock]")
print(f"\r {color('', Colors.YELLOW)} {_label} {color(f'(boto3 not installed — {sys.executable} -m pip install boto3)', Colors.DIM)} ")
issues.append(f"Install boto3 for Bedrock: {sys.executable} -m pip install boto3")
except Exception as _e:
_err_name = type(_e).__name__
print(f"\r {color('', Colors.YELLOW)} {_label} {color(f'({_err_name}: {_e})', Colors.DIM)} ")
+15 -35
View File
@@ -43,41 +43,20 @@ def _redact(value: str) -> str:
def _gateway_status() -> str:
"""Return a short gateway status string."""
if sys.platform.startswith("linux"):
from hermes_constants import is_container
if is_container():
try:
from hermes_cli.gateway import find_gateway_pids
pids = find_gateway_pids()
if pids:
return f"running (docker, pid {pids[0]})"
return "stopped (docker)"
except Exception:
return "stopped (docker)"
try:
from hermes_cli.gateway import get_service_name
svc = get_service_name()
except Exception:
svc = "hermes-gateway"
try:
r = subprocess.run(
["systemctl", "--user", "is-active", svc],
capture_output=True, text=True, timeout=5,
)
return "running (systemd)" if r.stdout.strip() == "active" else "stopped"
except Exception:
return "unknown"
elif sys.platform == "darwin":
try:
from hermes_cli.gateway import get_launchd_label
r = subprocess.run(
["launchctl", "list", get_launchd_label()],
capture_output=True, text=True, timeout=5,
)
return "loaded (launchd)" if r.returncode == 0 else "not loaded"
except Exception:
return "unknown"
return "N/A"
try:
from hermes_cli.gateway import get_gateway_runtime_snapshot
snapshot = get_gateway_runtime_snapshot()
if snapshot.running:
mode = snapshot.manager
if snapshot.has_process_service_mismatch:
mode = "manual"
return f"running ({mode}, pid {snapshot.gateway_pids[0]})"
if snapshot.service_installed and not snapshot.service_running:
return f"stopped ({snapshot.manager})"
return f"stopped ({snapshot.manager})"
except Exception:
return "unknown" if sys.platform.startswith(("linux", "darwin")) else "N/A"
def _count_skills(hermes_home: Path) -> int:
@@ -296,6 +275,7 @@ def run_dump(args):
("DEEPSEEK_API_KEY", "deepseek"),
("DASHSCOPE_API_KEY", "dashscope"),
("HF_TOKEN", "huggingface"),
("NVIDIA_API_KEY", "nvidia"),
("AI_GATEWAY_API_KEY", "ai_gateway"),
("OPENCODE_ZEN_API_KEY", "opencode_zen"),
("OPENCODE_GO_API_KEY", "opencode_go"),
+691 -34
View File
@@ -10,6 +10,7 @@ import shutil
import signal
import subprocess
import sys
from dataclasses import dataclass
from pathlib import Path
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
@@ -41,6 +42,23 @@ from hermes_cli.colors import Colors, color
# Process Management (for manual gateway runs)
# =============================================================================
@dataclass(frozen=True)
class GatewayRuntimeSnapshot:
manager: str
service_installed: bool = False
service_running: bool = False
gateway_pids: tuple[int, ...] = ()
service_scope: str | None = None
@property
def running(self) -> bool:
return self.service_running or bool(self.gateway_pids)
@property
def has_process_service_mismatch(self) -> bool:
return self.service_installed and self.running and not self.service_running
def _get_service_pids() -> set:
"""Return PIDs currently managed by systemd or launchd gateway services.
@@ -157,20 +175,22 @@ def _request_gateway_self_restart(pid: int) -> bool:
return True
def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = False) -> list:
"""Find PIDs of running gateway processes.
def _append_unique_pid(pids: list[int], pid: int | None, exclude_pids: set[int]) -> None:
if pid is None or pid <= 0:
return
if pid == os.getpid() or pid in exclude_pids or pid in pids:
return
pids.append(pid)
Args:
exclude_pids: PIDs to exclude from the result (e.g. service-managed
PIDs that should not be killed during a stale-process sweep).
all_profiles: When ``True``, return gateway PIDs across **all**
profiles (the pre-7923 global behaviour). ``hermes update``
needs this because a code update affects every profile.
When ``False`` (default), only PIDs belonging to the current
Hermes profile are returned.
def _scan_gateway_pids(exclude_pids: set[int], all_profiles: bool = False) -> list[int]:
"""Best-effort process-table scan for gateway PIDs.
This supplements the profile-scoped PID file so status views can still spot
a live gateway when the PID file is stale/missing, and ``--all`` sweeps can
discover gateways outside the current profile.
"""
_exclude = exclude_pids or set()
pids = [pid for pid in _get_service_pids() if pid not in _exclude]
pids: list[int] = []
patterns = [
"hermes_cli.main gateway",
"hermes_cli.main --profile",
@@ -203,20 +223,24 @@ def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = Fals
if is_windows():
result = subprocess.run(
["wmic", "process", "get", "ProcessId,CommandLine", "/FORMAT:LIST"],
capture_output=True, text=True, timeout=10
capture_output=True,
text=True,
timeout=10,
)
if result.returncode != 0:
return []
current_cmd = ""
for line in result.stdout.split('\n'):
for line in result.stdout.split("\n"):
line = line.strip()
if line.startswith("CommandLine="):
current_cmd = line[len("CommandLine="):]
elif line.startswith("ProcessId="):
pid_str = line[len("ProcessId="):]
if any(p in current_cmd for p in patterns) and (all_profiles or _matches_current_profile(current_cmd)):
if any(p in current_cmd for p in patterns) and (
all_profiles or _matches_current_profile(current_cmd)
):
try:
pid = int(pid_str)
if pid != os.getpid() and pid not in pids and pid not in _exclude:
pids.append(pid)
_append_unique_pid(pids, int(pid_str), exclude_pids)
except ValueError:
pass
current_cmd = ""
@@ -227,9 +251,11 @@ def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = Fals
text=True,
timeout=10,
)
for line in result.stdout.split('\n'):
if result.returncode != 0:
return []
for line in result.stdout.split("\n"):
stripped = line.strip()
if not stripped or 'grep' in stripped:
if not stripped or "grep" in stripped:
continue
pid = None
@@ -251,16 +277,137 @@ def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = Fals
if pid is None:
continue
if pid == os.getpid() or pid in pids or pid in _exclude:
continue
if any(pattern in command for pattern in patterns) and (all_profiles or _matches_current_profile(command)):
pids.append(pid)
if any(pattern in command for pattern in patterns) and (
all_profiles or _matches_current_profile(command)
):
_append_unique_pid(pids, pid, exclude_pids)
except (OSError, subprocess.TimeoutExpired):
pass
return []
return pids
def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = False) -> list:
"""Find PIDs of running gateway processes.
Args:
exclude_pids: PIDs to exclude from the result (e.g. service-managed
PIDs that should not be killed during a stale-process sweep).
all_profiles: When ``True``, return gateway PIDs across **all**
profiles (the pre-7923 global behaviour). ``hermes update``
needs this because a code update affects every profile.
When ``False`` (default), only PIDs belonging to the current
Hermes profile are returned.
"""
_exclude = set(exclude_pids or set())
pids: list[int] = []
if not all_profiles:
try:
from gateway.status import get_running_pid
_append_unique_pid(pids, get_running_pid(), _exclude)
except Exception:
pass
for pid in _get_service_pids():
_append_unique_pid(pids, pid, _exclude)
for pid in _scan_gateway_pids(_exclude, all_profiles=all_profiles):
_append_unique_pid(pids, pid, _exclude)
return pids
def _probe_systemd_service_running(system: bool = False) -> tuple[bool, bool]:
selected_system = _select_systemd_scope(system)
unit_exists = get_systemd_unit_path(system=selected_system).exists()
if not unit_exists:
return selected_system, False
try:
result = _run_systemctl(
["is-active", get_service_name()],
system=selected_system,
capture_output=True,
text=True,
timeout=10,
)
except (RuntimeError, subprocess.TimeoutExpired):
return selected_system, False
return selected_system, result.stdout.strip() == "active"
def _probe_launchd_service_running() -> bool:
if not get_launchd_plist_path().exists():
return False
try:
result = subprocess.run(
["launchctl", "list", get_launchd_label()],
capture_output=True,
text=True,
timeout=10,
)
except subprocess.TimeoutExpired:
return False
return result.returncode == 0
def get_gateway_runtime_snapshot(system: bool = False) -> GatewayRuntimeSnapshot:
"""Return a unified view of gateway liveness for the current profile."""
gateway_pids = tuple(find_gateway_pids())
if is_termux():
return GatewayRuntimeSnapshot(
manager="Termux / manual process",
gateway_pids=gateway_pids,
)
from hermes_constants import is_container
if is_linux() and is_container():
return GatewayRuntimeSnapshot(
manager="docker (foreground)",
gateway_pids=gateway_pids,
)
if supports_systemd_services():
selected_system, service_running = _probe_systemd_service_running(system=system)
scope_label = _service_scope_label(selected_system)
return GatewayRuntimeSnapshot(
manager=f"systemd ({scope_label})",
service_installed=get_systemd_unit_path(system=selected_system).exists(),
service_running=service_running,
gateway_pids=gateway_pids,
service_scope=scope_label,
)
if is_macos():
return GatewayRuntimeSnapshot(
manager="launchd",
service_installed=get_launchd_plist_path().exists(),
service_running=_probe_launchd_service_running(),
gateway_pids=gateway_pids,
service_scope="launchd",
)
return GatewayRuntimeSnapshot(
manager="manual process",
gateway_pids=gateway_pids,
)
def _format_gateway_pids(pids: tuple[int, ...] | list[int], *, limit: int | None = 3) -> str:
rendered = [str(pid) for pid in pids[:limit] if pid > 0] if limit is not None else [str(pid) for pid in pids if pid > 0]
if limit is not None and len(pids) > limit:
rendered.append("...")
return ", ".join(rendered)
def _print_gateway_process_mismatch(snapshot: GatewayRuntimeSnapshot) -> None:
if not snapshot.has_process_service_mismatch:
return
print()
print("⚠ Gateway process is running for this profile, but the service is not active")
print(f" PID(s): {_format_gateway_pids(snapshot.gateway_pids, limit=None)}")
print(" This is usually a manual foreground/tmux/nohup run, so `hermes gateway`")
print(" can refuse to start another copy until this process stops.")
def kill_gateway_processes(force: bool = False, exclude_pids: set | None = None,
all_profiles: bool = False) -> int:
"""Kill any running gateway processes. Returns count killed.
@@ -340,25 +487,44 @@ def _wsl_systemd_operational() -> bool:
WSL2 with ``systemd=true`` in wsl.conf has working systemd.
WSL2 without it (or WSL1) does not systemctl commands fail.
"""
return _systemd_operational(system=True)
def _systemd_operational(system: bool = False) -> bool:
"""Return True when the requested systemd scope is usable."""
try:
result = subprocess.run(
["systemctl", "is-system-running"],
capture_output=True, text=True, timeout=5,
result = _run_systemctl(
["is-system-running"],
system=system,
capture_output=True,
text=True,
timeout=5,
)
# "running", "degraded", "starting" all mean systemd is PID 1
status = result.stdout.strip().lower()
return status in ("running", "degraded", "starting", "initializing")
except (FileNotFoundError, subprocess.TimeoutExpired, OSError):
except (RuntimeError, subprocess.TimeoutExpired, OSError):
return False
def _container_systemd_operational() -> bool:
"""Return True when a container exposes working user or system systemd."""
if _systemd_operational(system=False):
return True
if _systemd_operational(system=True):
return True
return False
def supports_systemd_services() -> bool:
if not is_linux() or is_termux() or is_container():
if not is_linux() or is_termux():
return False
if shutil.which("systemctl") is None:
return False
if is_wsl():
return _wsl_systemd_operational()
if is_container():
return _container_systemd_operational()
return True
@@ -521,6 +687,195 @@ def has_conflicting_systemd_units() -> bool:
return len(get_installed_systemd_scopes()) > 1
# Legacy service names from older Hermes installs that predate the
# hermes-gateway rename. Kept as an explicit allowlist (NOT a glob) so
# profile units (hermes-gateway-*.service) and unrelated third-party
# "hermes" units are never matched.
_LEGACY_SERVICE_NAMES: tuple[str, ...] = ("hermes.service",)
# ExecStart content markers that identify a unit as running our gateway.
# A legacy unit is only flagged when its file contains one of these.
_LEGACY_UNIT_EXECSTART_MARKERS: tuple[str, ...] = (
"hermes_cli.main gateway",
"hermes_cli/main.py gateway",
"gateway/run.py",
" hermes gateway ",
"/hermes gateway ",
)
def _legacy_unit_search_paths() -> list[tuple[bool, Path]]:
"""Return ``[(is_system, base_dir), ...]`` — directories to scan for legacy units.
Factored out so tests can monkeypatch the search roots without touching
real filesystem paths.
"""
return [
(False, Path.home() / ".config" / "systemd" / "user"),
(True, Path("/etc/systemd/system")),
]
def _find_legacy_hermes_units() -> list[tuple[str, Path, bool]]:
"""Return ``[(unit_name, unit_path, is_system)]`` for legacy Hermes gateway units.
Detects unit files installed by older Hermes versions that used a
different service name (e.g. ``hermes.service`` before the rename to
``hermes-gateway.service``). When both a legacy unit and the current
``hermes-gateway.service`` are active, they fight over the same bot
token the PR #5646 signal-recovery change turns this into a 30-second
SIGTERM flap loop.
Safety guards:
* Explicit allowlist of legacy names (no globbing). Profile units such
as ``hermes-gateway-coder.service`` and unrelated third-party
``hermes-*`` services are never matched.
* ExecStart content check only flag units that invoke our gateway
entrypoint. A user-created ``hermes.service`` running an unrelated
binary is left untouched.
* Results are returned purely for caller inspection; this function
never mutates or removes anything.
"""
results: list[tuple[str, Path, bool]] = []
for is_system, base in _legacy_unit_search_paths():
for name in _LEGACY_SERVICE_NAMES:
unit_path = base / name
try:
if not unit_path.exists():
continue
text = unit_path.read_text(encoding="utf-8", errors="ignore")
except (OSError, PermissionError):
continue
if not any(marker in text for marker in _LEGACY_UNIT_EXECSTART_MARKERS):
# Not our gateway — leave alone
continue
results.append((name, unit_path, is_system))
return results
def has_legacy_hermes_units() -> bool:
"""Return True when any legacy Hermes gateway unit files exist."""
return bool(_find_legacy_hermes_units())
def print_legacy_unit_warning() -> None:
"""Warn about legacy Hermes gateway unit files if any are installed.
Idempotent: prints nothing when no legacy units are detected. Safe to
call from any status/install/setup path.
"""
legacy = _find_legacy_hermes_units()
if not legacy:
return
print_warning("Legacy Hermes gateway unit(s) detected from an older install:")
for name, path, is_system in legacy:
scope = "system" if is_system else "user"
print_info(f" {path} ({scope} scope)")
print_info(" These run alongside the current hermes-gateway service and")
print_info(" cause SIGTERM flap loops — both try to use the same bot token.")
print_info(" Remove them with:")
print_info(" hermes gateway migrate-legacy")
def remove_legacy_hermes_units(
interactive: bool = True,
dry_run: bool = False,
) -> tuple[int, list[Path]]:
"""Stop, disable, and remove legacy Hermes gateway unit files.
Iterates over whatever ``_find_legacy_hermes_units()`` returns which is
an explicit allowlist of legacy names (not a glob). Profile units and
unrelated third-party services are never touched.
Args:
interactive: When True, prompt before removing. When False, remove
without asking (used when another prompt has already confirmed,
e.g. from the install flow).
dry_run: When True, list what would be removed and return.
Returns:
``(removed_count, remaining_paths)`` remaining includes units we
couldn't remove (typically system-scope when not running as root).
"""
legacy = _find_legacy_hermes_units()
if not legacy:
print("No legacy Hermes gateway units found.")
return 0, []
user_units = [(n, p) for n, p, is_sys in legacy if not is_sys]
system_units = [(n, p) for n, p, is_sys in legacy if is_sys]
print()
print("Legacy Hermes gateway unit(s) found:")
for name, path, is_system in legacy:
scope = "system" if is_system else "user"
print(f" {path} ({scope} scope)")
print()
if dry_run:
print("(dry-run — nothing removed)")
return 0, [p for _, p, _ in legacy]
if interactive and not prompt_yes_no("Remove these legacy units?", True):
print("Skipped. Run again with: hermes gateway migrate-legacy")
return 0, [p for _, p, _ in legacy]
removed = 0
remaining: list[Path] = []
# User-scope removal
for name, path in user_units:
try:
_run_systemctl(["stop", name], system=False, check=False, timeout=90)
_run_systemctl(["disable", name], system=False, check=False, timeout=30)
path.unlink(missing_ok=True)
print(f" ✓ Removed {path}")
removed += 1
except (OSError, RuntimeError) as e:
print(f" ⚠ Could not remove {path}: {e}")
remaining.append(path)
if user_units:
try:
_run_systemctl(["daemon-reload"], system=False, check=False, timeout=30)
except RuntimeError:
pass
# System-scope removal (needs root)
if system_units:
if os.geteuid() != 0:
print()
print_warning("System-scope legacy units require root to remove.")
print_info(" Re-run with: sudo hermes gateway migrate-legacy")
for _, path in system_units:
remaining.append(path)
else:
for name, path in system_units:
try:
_run_systemctl(["stop", name], system=True, check=False, timeout=90)
_run_systemctl(["disable", name], system=True, check=False, timeout=30)
path.unlink(missing_ok=True)
print(f" ✓ Removed {path}")
removed += 1
except (OSError, RuntimeError) as e:
print(f" ⚠ Could not remove {path}: {e}")
remaining.append(path)
try:
_run_systemctl(["daemon-reload"], system=True, check=False, timeout=30)
except RuntimeError:
pass
print()
if remaining:
print_warning(f"{len(remaining)} legacy unit(s) still present — see messages above.")
else:
print_success(f"Removed {removed} legacy unit(s).")
return removed, remaining
def print_systemd_scope_conflict_warning() -> None:
scopes = get_installed_systemd_scopes()
if len(scopes) < 2:
@@ -1054,6 +1409,19 @@ def systemd_install(force: bool = False, system: bool = False, run_as_user: str
if system:
_require_root_for_system_service("install")
# Offer to remove legacy units (hermes.service from pre-rename installs)
# before installing the new hermes-gateway.service. If both remain, they
# flap-fight for the Telegram bot token on every gateway startup.
# Only removes units matching _LEGACY_SERVICE_NAMES + our ExecStart
# signature — profile units are never touched.
if has_legacy_hermes_units():
print()
print_legacy_unit_warning()
print()
if prompt_yes_no("Remove the legacy unit(s) before installing?", True):
remove_legacy_hermes_units(interactive=False)
print()
unit_path = get_systemd_unit_path(system=system)
scope_flag = " --system" if system else ""
@@ -1092,6 +1460,7 @@ def systemd_install(force: bool = False, system: bool = False, run_as_user: str
_ensure_linger_enabled()
print_systemd_scope_conflict_warning()
print_legacy_unit_warning()
def systemd_uninstall(system: bool = False):
@@ -1215,6 +1584,10 @@ def systemd_status(deep: bool = False, system: bool = False):
print_systemd_scope_conflict_warning()
print()
if has_legacy_hermes_units():
print_legacy_unit_warning()
print()
if not systemd_unit_is_current(system=system):
print("⚠ Installed gateway service definition is outdated")
print(f" Run: {'sudo ' if system else ''}hermes gateway restart{scope_flag} # auto-refreshes the unit")
@@ -1998,7 +2371,7 @@ _PLATFORMS = [
{"name": "QQ_ALLOWED_USERS", "prompt": "Allowed user OpenIDs (comma-separated, leave empty for open access)", "password": False,
"is_allowlist": True,
"help": "Optional — restrict DM access to specific user OpenIDs."},
{"name": "QQ_HOME_CHANNEL", "prompt": "Home channel (user/group OpenID for cron delivery, or empty)", "password": False,
{"name": "QQBOT_HOME_CHANNEL", "prompt": "Home channel (user/group OpenID for cron delivery, or empty)", "password": False,
"help": "OpenID to deliver cron results and notifications to."},
],
},
@@ -2211,9 +2584,62 @@ def _setup_sms():
def _setup_dingtalk():
"""Configure DingTalk via the standard platform setup."""
"""Configure DingTalk — QR scan (recommended) or manual credential entry."""
from hermes_cli.setup import (
prompt_choice, prompt_yes_no, print_info, print_success, print_warning,
)
dingtalk_platform = next(p for p in _PLATFORMS if p["key"] == "dingtalk")
_setup_standard_platform(dingtalk_platform)
emoji = dingtalk_platform["emoji"]
label = dingtalk_platform["label"]
print()
print(color(f" ─── {emoji} {label} Setup ───", Colors.CYAN))
existing = get_env_value("DINGTALK_CLIENT_ID")
if existing:
print()
print_success(f"{label} is already configured (Client ID: {existing}).")
if not prompt_yes_no(f" Reconfigure {label}?", False):
return
print()
method = prompt_choice(
" Choose setup method",
[
"QR Code Scan (Recommended, auto-obtain Client ID and Client Secret)",
"Manual Input (Client ID and Client Secret)",
],
default=0,
)
if method == 0:
# ── QR-code device-flow authorization ──
try:
from hermes_cli.dingtalk_auth import dingtalk_qr_auth
except ImportError as exc:
print_warning(f" QR auth module failed to load ({exc}), falling back to manual input.")
_setup_standard_platform(dingtalk_platform)
return
result = dingtalk_qr_auth()
if result is None:
print_warning(" QR auth incomplete, falling back to manual input.")
_setup_standard_platform(dingtalk_platform)
return
client_id, client_secret = result
save_env_value("DINGTALK_CLIENT_ID", client_id)
save_env_value("DINGTALK_CLIENT_SECRET", client_secret)
save_env_value("DINGTALK_ALLOW_ALL_USERS", "true")
print()
print_success(f"{emoji} {label} configured via QR scan!")
else:
# ── Manual entry ──
_setup_standard_platform(dingtalk_platform)
# Also enable allow-all by default for convenience
if get_env_value("DINGTALK_CLIENT_ID"):
save_env_value("DINGTALK_ALLOW_ALL_USERS", "true")
def _setup_wecom():
@@ -2572,6 +2998,215 @@ def _setup_feishu():
print_info(f" Bot: {bot_name}")
def _setup_qqbot():
"""Interactive setup for QQ Bot — scan-to-configure or manual credentials."""
print()
print(color(" ─── 🐧 QQ Bot Setup ───", Colors.CYAN))
existing_app_id = get_env_value("QQ_APP_ID")
existing_secret = get_env_value("QQ_CLIENT_SECRET")
if existing_app_id and existing_secret:
print()
print_success("QQ Bot is already configured.")
if not prompt_yes_no(" Reconfigure QQ Bot?", False):
return
# ── Choose setup method ──
print()
method_choices = [
"Scan QR code to add bot automatically (recommended)",
"Enter existing App ID and App Secret manually",
]
method_idx = prompt_choice(" How would you like to set up QQ Bot?", method_choices, 0)
credentials = None
used_qr = False
if method_idx == 0:
# ── QR scan-to-configure ──
try:
credentials = _qqbot_qr_flow()
except KeyboardInterrupt:
print()
print_warning(" QQ Bot setup cancelled.")
return
if credentials:
used_qr = True
if not credentials:
print_info(" QR setup did not complete. Continuing with manual input.")
# ── Manual credential input ──
if not credentials:
print()
print_info(" Go to https://q.qq.com to register a QQ Bot application.")
print_info(" Note your App ID and App Secret from the application page.")
print()
app_id = prompt(" App ID", password=False)
if not app_id:
print_warning(" Skipped — QQ Bot won't work without an App ID.")
return
app_secret = prompt(" App Secret", password=True)
if not app_secret:
print_warning(" Skipped — QQ Bot won't work without an App Secret.")
return
credentials = {"app_id": app_id.strip(), "client_secret": app_secret.strip(), "user_openid": ""}
# ── Save core credentials ──
save_env_value("QQ_APP_ID", credentials["app_id"])
save_env_value("QQ_CLIENT_SECRET", credentials["client_secret"])
user_openid = credentials.get("user_openid", "")
# ── DM security policy ──
print()
access_choices = [
"Use DM pairing approval (recommended)",
"Allow all direct messages",
"Only allow listed user OpenIDs",
]
access_idx = prompt_choice(" How should direct messages be authorized?", access_choices, 0)
if access_idx == 0:
save_env_value("QQ_ALLOW_ALL_USERS", "false")
if user_openid:
print()
if prompt_yes_no(f" Add yourself ({user_openid}) to the allow list?", True):
save_env_value("QQ_ALLOWED_USERS", user_openid)
print_success(f" Allow list set to {user_openid}")
else:
save_env_value("QQ_ALLOWED_USERS", "")
else:
save_env_value("QQ_ALLOWED_USERS", "")
print_success(" DM pairing enabled.")
print_info(" Unknown users can request access; approve with `hermes pairing approve`.")
elif access_idx == 1:
save_env_value("QQ_ALLOW_ALL_USERS", "true")
save_env_value("QQ_ALLOWED_USERS", "")
print_warning(" Open DM access enabled for QQ Bot.")
else:
default_allow = user_openid or ""
allowlist = prompt(" Allowed user OpenIDs (comma-separated)", default_allow, password=False).replace(" ", "")
save_env_value("QQ_ALLOW_ALL_USERS", "false")
save_env_value("QQ_ALLOWED_USERS", allowlist)
print_success(" Allowlist saved.")
# ── Home channel ──
if user_openid:
print()
if prompt_yes_no(f" Use your QQ user ID ({user_openid}) as the home channel?", True):
save_env_value("QQBOT_HOME_CHANNEL", user_openid)
print_success(f" Home channel set to {user_openid}")
else:
print()
home_channel = prompt(" Home channel OpenID (for cron/notifications, or empty)", password=False)
if home_channel:
save_env_value("QQBOT_HOME_CHANNEL", home_channel.strip())
print_success(f" Home channel set to {home_channel.strip()}")
print()
print_success("🐧 QQ Bot configured!")
print_info(f" App ID: {credentials['app_id']}")
def _qqbot_render_qr(url: str) -> bool:
"""Try to render a QR code in the terminal. Returns True if successful."""
try:
import qrcode as _qr
qr = _qr.QRCode(border=1,error_correction=_qr.constants.ERROR_CORRECT_L)
qr.add_data(url)
qr.make(fit=True)
qr.print_ascii(invert=True)
return True
except Exception:
return False
def _qqbot_qr_flow():
"""Run the QR-code scan-to-configure flow.
Returns a dict with app_id, client_secret, user_openid on success,
or None on failure/cancel.
"""
try:
from gateway.platforms.qqbot import (
create_bind_task, poll_bind_result, build_connect_url,
decrypt_secret, BindStatus,
)
from gateway.platforms.qqbot.constants import ONBOARD_POLL_INTERVAL
except Exception as exc:
print_error(f" QQBot onboard import failed: {exc}")
return None
import asyncio
import time
MAX_REFRESHES = 3
refresh_count = 0
while refresh_count <= MAX_REFRESHES:
loop = asyncio.new_event_loop()
# ── Create bind task ──
try:
task_id, aes_key = loop.run_until_complete(create_bind_task())
except Exception as e:
print_warning(f" Failed to create bind task: {e}")
loop.close()
return None
url = build_connect_url(task_id)
# ── Display QR code + URL ──
print()
if _qqbot_render_qr(url):
print(f" Scan the QR code above, or open this URL directly:\n {url}")
else:
print(f" Open this URL in QQ on your phone:\n {url}")
print_info(" Tip: pip install qrcode to show a scannable QR code here")
# ── Poll loop (silent — keep QR visible at bottom) ──
try:
while True:
try:
status, app_id, encrypted_secret, user_openid = loop.run_until_complete(
poll_bind_result(task_id)
)
except Exception:
time.sleep(ONBOARD_POLL_INTERVAL)
continue
if status == BindStatus.COMPLETED:
client_secret = decrypt_secret(encrypted_secret, aes_key)
print()
print_success(f" QR scan complete! (App ID: {app_id})")
if user_openid:
print_info(f" Scanner's OpenID: {user_openid}")
return {
"app_id": app_id,
"client_secret": client_secret,
"user_openid": user_openid,
}
if status == BindStatus.EXPIRED:
refresh_count += 1
if refresh_count > MAX_REFRESHES:
print()
print_warning(f" QR code expired {MAX_REFRESHES} times — giving up.")
return None
print()
print_warning(f" QR code expired, refreshing... ({refresh_count}/{MAX_REFRESHES})")
loop.close()
break # outer while creates a new task
time.sleep(ONBOARD_POLL_INTERVAL)
except KeyboardInterrupt:
loop.close()
raise
finally:
loop.close()
return None
def _setup_signal():
"""Interactive setup for Signal messenger."""
import shutil
@@ -2709,6 +3344,10 @@ def gateway_setup():
print_systemd_scope_conflict_warning()
print()
if supports_systemd_services() and has_legacy_hermes_units():
print_legacy_unit_warning()
print()
if service_installed and service_running:
print_success("Gateway service is installed and running.")
elif service_installed:
@@ -2749,8 +3388,12 @@ def gateway_setup():
_setup_signal()
elif platform["key"] == "weixin":
_setup_weixin()
elif platform["key"] == "dingtalk":
_setup_dingtalk()
elif platform["key"] == "feishu":
_setup_feishu()
elif platform["key"] == "qqbot":
_setup_qqbot()
else:
_setup_standard_platform(platform)
@@ -3110,15 +3753,18 @@ def gateway_command(args):
elif subcmd == "status":
deep = getattr(args, 'deep', False)
system = getattr(args, 'system', False)
snapshot = get_gateway_runtime_snapshot(system=system)
# Check for service first
if supports_systemd_services() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
systemd_status(deep, system=system)
_print_gateway_process_mismatch(snapshot)
elif is_macos() and get_launchd_plist_path().exists():
launchd_status(deep)
_print_gateway_process_mismatch(snapshot)
else:
# Check for manually running processes
pids = find_gateway_pids()
pids = list(snapshot.gateway_pids)
if pids:
print(f"✓ Gateway is running (PID: {', '.join(map(str, pids))})")
print(" (Running manually, not as a system service)")
@@ -3159,3 +3805,14 @@ def gateway_command(args):
else:
print(" hermes gateway install # Install as user service")
print(" sudo hermes gateway install --system # Install as boot-time system service")
elif subcmd == "migrate-legacy":
# Stop, disable, and remove legacy Hermes gateway unit files from
# pre-rename installs (e.g. hermes.service). Profile units and
# unrelated third-party services are never touched.
dry_run = getattr(args, 'dry_run', False)
yes = getattr(args, 'yes', False)
if not supports_systemd_services() and not is_macos():
print("Legacy unit migration only applies to systemd-based Linux hosts.")
return
remove_legacy_hermes_units(interactive=not yes, dry_run=dry_run)
+2607 -712
View File
File diff suppressed because it is too large Load Diff
+66 -5
View File
@@ -279,8 +279,8 @@ def cmd_mcp_add(args):
_info(f"Starting OAuth flow for '{name}'...")
oauth_ok = False
try:
from tools.mcp_oauth import build_oauth_auth
oauth_auth = build_oauth_auth(name, url)
from tools.mcp_oauth_manager import get_manager
oauth_auth = get_manager().get_or_build_provider(name, url, None)
if oauth_auth:
server_config["auth"] = "oauth"
_success("OAuth configured (tokens will be acquired on first connection)")
@@ -428,10 +428,12 @@ def cmd_mcp_remove(args):
_remove_mcp_server(name)
_success(f"Removed '{name}' from config")
# Clean up OAuth tokens if they exist
# Clean up OAuth tokens if they exist — route through MCPOAuthManager so
# any provider instance cached in the current process (e.g. from an
# earlier `hermes mcp test` in the same session) is evicted too.
try:
from tools.mcp_oauth import remove_oauth_tokens
remove_oauth_tokens(name)
from tools.mcp_oauth_manager import get_manager
get_manager().remove(name)
_success("Cleaned up OAuth tokens")
except Exception:
pass
@@ -577,6 +579,63 @@ def _interpolate_value(value: str) -> str:
return re.sub(r"\$\{(\w+)\}", _replace, value)
# ─── hermes mcp login ────────────────────────────────────────────────────────
def cmd_mcp_login(args):
"""Force re-authentication for an OAuth-based MCP server.
Deletes cached tokens (both on disk and in the running process's
MCPOAuthManager cache) and triggers a fresh OAuth flow via the
existing probe path.
Use this when:
- Tokens are stuck in a bad state (server revoked, refresh token
consumed by an external process, etc.)
- You want to re-authenticate to change scopes or account
- A tool call returned ``needs_reauth: true``
"""
name = args.name
servers = _get_mcp_servers()
if name not in servers:
_error(f"Server '{name}' not found in config.")
if servers:
_info(f"Available servers: {', '.join(servers)}")
return
server_config = servers[name]
url = server_config.get("url")
if not url:
_error(f"Server '{name}' has no URL — not an OAuth-capable server")
return
if server_config.get("auth") != "oauth":
_error(f"Server '{name}' is not configured for OAuth (auth={server_config.get('auth')})")
_info("Use `hermes mcp remove` + `hermes mcp add` to reconfigure auth.")
return
# Wipe both disk and in-memory cache so the next probe forces a fresh
# OAuth flow.
try:
from tools.mcp_oauth_manager import get_manager
mgr = get_manager()
mgr.remove(name)
except Exception as exc:
_warning(f"Could not clear existing OAuth state: {exc}")
print()
_info(f"Starting OAuth flow for '{name}'...")
# Probe triggers the OAuth flow (browser redirect + callback capture).
try:
tools = _probe_single_server(name, server_config)
if tools:
_success(f"Authenticated — {len(tools)} tool(s) available")
else:
_success("Authenticated (server reported no tools)")
except Exception as exc:
_error(f"Authentication failed: {exc}")
# ─── hermes mcp configure ────────────────────────────────────────────────────
def cmd_mcp_configure(args):
@@ -696,6 +755,7 @@ def mcp_command(args):
"test": cmd_mcp_test,
"configure": cmd_mcp_configure,
"config": cmd_mcp_configure,
"login": cmd_mcp_login,
}
handler = handlers.get(action)
@@ -713,4 +773,5 @@ def mcp_command(args):
_info("hermes mcp list List servers")
_info("hermes mcp test <name> Test connection")
_info("hermes mcp configure <name> Toggle tools")
_info("hermes mcp login <name> Re-authenticate OAuth")
print()
+20 -1
View File
@@ -374,7 +374,26 @@ def normalize_model_for_provider(model_input: str, target_provider: str) -> str:
return bare
return _dots_to_hyphens(bare)
# --- Copilot: strip matching provider prefix, keep dots ---
# --- Copilot / Copilot ACP: delegate to the Copilot-specific
# normalizer. It knows about the alias table (vendor-prefix
# stripping for Anthropic/OpenAI, dash-to-dot repair for Claude)
# and live-catalog lookups. Without this, vendor-prefixed or
# dash-notation Claude IDs survive to the Copilot API and hit
# HTTP 400 "model_not_supported". See issue #6879.
if provider in {"copilot", "copilot-acp"}:
try:
from hermes_cli.models import normalize_copilot_model_id
normalized = normalize_copilot_model_id(name)
if normalized:
return normalized
except Exception:
# Fall through to the generic strip-vendor behaviour below
# if the Copilot-specific path is unavailable for any reason.
pass
# --- Copilot / Copilot ACP / openai-codex fallback:
# strip matching provider prefix, keep dots ---
if provider in _STRIP_VENDOR_ONLY_PROVIDERS:
stripped = _strip_matching_provider_prefix(name, provider)
if stripped == name and name.startswith("openai/"):
+20 -4
View File
@@ -692,12 +692,12 @@ def switch_model(
api_key=api_key,
base_url=base_url,
)
except Exception:
except Exception as e:
validation = {
"accepted": True,
"persist": True,
"accepted": False,
"persist": False,
"recognized": False,
"message": None,
"message": f"Could not validate `{new_model}`: {e}",
}
if not validation.get("accepted"):
@@ -727,6 +727,22 @@ def switch_model(
if not api_mode:
api_mode = determine_api_mode(target_provider, base_url)
# OpenCode base URLs end with /v1 for OpenAI-compatible models, but the
# Anthropic SDK prepends its own /v1/messages to the base_url. Strip the
# trailing /v1 so the SDK constructs the correct path (e.g.
# https://opencode.ai/zen/go/v1/messages instead of .../v1/v1/messages).
# Mirrors the same logic in hermes_cli.runtime_provider.resolve_runtime_provider;
# without it, /model switches into an anthropic_messages-routed OpenCode
# model (e.g. `/model minimax-m2.7` on opencode-go, `/model claude-sonnet-4-6`
# on opencode-zen) hit a double /v1 and returned OpenCode's website 404 page.
if (
api_mode == "anthropic_messages"
and target_provider in {"opencode-zen", "opencode-go"}
and isinstance(base_url, str)
and base_url
):
base_url = re.sub(r"/v1/?$", "", base_url)
# --- Get capabilities (legacy) ---
capabilities = get_model_capabilities(target_provider, new_model)
+59 -29
View File
@@ -26,7 +26,8 @@ COPILOT_REASONING_EFFORTS_O_SERIES = ["low", "medium", "high"]
# Fallback OpenRouter snapshot used when the live catalog is unavailable.
# (model_id, display description shown in menus)
OPENROUTER_MODELS: list[tuple[str, str]] = [
("anthropic/claude-opus-4.7", "recommended"),
("moonshotai/kimi-k2.5", "recommended"),
("anthropic/claude-opus-4.7", ""),
("anthropic/claude-opus-4.6", ""),
("anthropic/claude-sonnet-4.6", ""),
("qwen/qwen3.6-plus", ""),
@@ -49,7 +50,6 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("z-ai/glm-5.1", ""),
("z-ai/glm-5v-turbo", ""),
("z-ai/glm-5-turbo", ""),
("moonshotai/kimi-k2.5", ""),
("x-ai/grok-4.20", ""),
("nvidia/nemotron-3-super-120b-a12b", ""),
("nvidia/nemotron-3-super-120b-a12b:free", "free"),
@@ -75,7 +75,9 @@ def _codex_curated_models() -> list[str]:
_PROVIDER_MODELS: dict[str, list[str]] = {
"nous": [
"moonshotai/kimi-k2.5",
"xiaomi/mimo-v2-pro",
"anthropic/claude-opus-4.7",
"anthropic/claude-opus-4.6",
"anthropic/claude-sonnet-4.6",
"anthropic/claude-sonnet-4.5",
@@ -95,7 +97,6 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"z-ai/glm-5.1",
"z-ai/glm-5v-turbo",
"z-ai/glm-5-turbo",
"moonshotai/kimi-k2.5",
"x-ai/grok-4.20-beta",
"nvidia/nemotron-3-super-120b-a12b",
"nvidia/nemotron-3-super-120b-a12b:free",
@@ -134,7 +135,6 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"gemini-2.5-flash-lite",
# Gemma open models (also served via AI Studio)
"gemma-4-31b-it",
"gemma-4-26b-it",
],
"google-gemini-cli": [
"gemini-2.5-pro",
@@ -154,9 +154,23 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"grok-4.20-reasoning",
"grok-4-1-fast-reasoning",
],
"nvidia": [
# NVIDIA flagship reasoning models
"nvidia/nemotron-3-super-120b-a12b",
"nvidia/nemotron-3-nano-30b-a3b",
"nvidia/llama-3.3-nemotron-super-49b-v1.5",
# Third-party agentic models hosted on build.nvidia.com
# (map to OpenRouter defaults — users get familiar picks on NIM)
"qwen/qwen3.5-397b-a17b",
"deepseek-ai/deepseek-v3.2",
"moonshotai/kimi-k2.5",
"minimaxai/minimax-m2.5",
"z-ai/glm5",
"openai/gpt-oss-120b",
],
"kimi-coding": [
"kimi-for-coding",
"kimi-k2.5",
"kimi-for-coding",
"kimi-k2-thinking",
"kimi-k2-thinking-turbo",
"kimi-k2-turbo-preview",
@@ -211,6 +225,7 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"trinity-mini",
],
"opencode-zen": [
"kimi-k2.5",
"gpt-5.4-pro",
"gpt-5.4",
"gpt-5.3-codex",
@@ -242,16 +257,15 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"glm-5",
"glm-4.7",
"glm-4.6",
"kimi-k2.5",
"kimi-k2-thinking",
"kimi-k2",
"qwen3-coder",
"big-pickle",
],
"opencode-go": [
"kimi-k2.5",
"glm-5.1",
"glm-5",
"kimi-k2.5",
"mimo-v2-pro",
"mimo-v2-omni",
"minimax-m2.7",
@@ -284,21 +298,21 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
# to https://dashscope-intl.aliyuncs.com/compatible-mode/v1 (OpenAI-compat)
# or https://dashscope-intl.aliyuncs.com/apps/anthropic (Anthropic-compat).
"alibaba": [
"kimi-k2.5",
"qwen3.5-plus",
"qwen3-coder-plus",
"qwen3-coder-next",
# Third-party models available on coding-intl
"glm-5",
"glm-4.7",
"kimi-k2.5",
"MiniMax-M2.5",
],
# Curated HF model list — only agentic models that map to OpenRouter defaults.
"huggingface": [
"moonshotai/Kimi-K2.5",
"Qwen/Qwen3.5-397B-A17B",
"Qwen/Qwen3.5-35B-A3B",
"deepseek-ai/DeepSeek-V3.2",
"moonshotai/Kimi-K2.5",
"MiniMaxAI/MiniMax-M2.5",
"zai-org/GLM-5",
"XiaomiMiMo/MiMo-V2-Flash",
@@ -535,6 +549,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("anthropic", "Anthropic", "Anthropic (Claude models — API key or Claude Code)"),
ProviderEntry("openai-codex", "OpenAI Codex", "OpenAI Codex"),
ProviderEntry("xiaomi", "Xiaomi MiMo", "Xiaomi MiMo (MiMo-V2 models — pro, omni, flash)"),
ProviderEntry("nvidia", "NVIDIA NIM", "NVIDIA NIM (Nemotron models — build.nvidia.com or local NIM)"),
ProviderEntry("qwen-oauth", "Qwen OAuth (Portal)", "Qwen OAuth (reuses local Qwen CLI login)"),
ProviderEntry("copilot", "GitHub Copilot", "GitHub Copilot (uses GITHUB_TOKEN or gh auth token)"),
ProviderEntry("copilot-acp", "GitHub Copilot ACP", "GitHub Copilot ACP (spawns `copilot --acp --stdio`)"),
@@ -617,6 +632,10 @@ _PROVIDER_ALIASES = {
"grok": "xai",
"x-ai": "xai",
"x.ai": "xai",
"nim": "nvidia",
"nvidia-nim": "nvidia",
"build-nvidia": "nvidia",
"nemotron": "nvidia",
"ollama": "custom", # bare "ollama" = local; use "ollama-cloud" for cloud
"ollama_cloud": "ollama-cloud",
}
@@ -1487,6 +1506,19 @@ _COPILOT_MODEL_ALIASES = {
"anthropic/claude-sonnet-4.6": "claude-sonnet-4.6",
"anthropic/claude-sonnet-4.5": "claude-sonnet-4.5",
"anthropic/claude-haiku-4.5": "claude-haiku-4.5",
# Dash-notation fallbacks: Hermes' default Claude IDs elsewhere use
# hyphens (anthropic native format), but Copilot's API only accepts
# dot-notation. Accept both so users who configure copilot + a
# default hyphenated Claude model don't hit HTTP 400
# "model_not_supported". See issue #6879.
"claude-opus-4-6": "claude-opus-4.6",
"claude-sonnet-4-6": "claude-sonnet-4.6",
"claude-sonnet-4-5": "claude-sonnet-4.5",
"claude-haiku-4-5": "claude-haiku-4.5",
"anthropic/claude-opus-4-6": "claude-opus-4.6",
"anthropic/claude-sonnet-4-6": "claude-sonnet-4.6",
"anthropic/claude-sonnet-4-5": "claude-sonnet-4.5",
"anthropic/claude-haiku-4-5": "claude-haiku-4.5",
}
@@ -2018,8 +2050,8 @@ def validate_requested_model(
)
return {
"accepted": True,
"persist": True,
"accepted": False,
"persist": False,
"recognized": False,
"message": message,
}
@@ -2032,8 +2064,8 @@ def validate_requested_model(
message += f"\n If this server expects `/v1`, try base URL: `{probe.get('suggested_base_url')}`"
return {
"accepted": True,
"persist": True,
"accepted": False,
"persist": False,
"recognized": False,
"message": message,
}
@@ -2067,12 +2099,11 @@ def validate_requested_model(
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
return {
"accepted": True,
"persist": True,
"accepted": False,
"persist": False,
"recognized": False,
"message": (
f"Note: `{requested}` was not found in the OpenAI Codex model listing. "
f"It may still work if your account has access to it."
f"Model `{requested}` was not found in the OpenAI Codex model listing."
f"{suggestion_text}"
),
}
@@ -2111,16 +2142,15 @@ def validate_requested_model(
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
return {
"accepted": True,
"persist": True,
"recognized": False,
"message": (
f"Note: `{requested}` was not found in this provider's model listing. "
f"It may still work if your plan supports it."
f"{suggestion_text}"
),
}
return {
"accepted": False,
"persist": False,
"recognized": False,
"message": (
f"Model `{requested}` was not found in this provider's model listing."
f"{suggestion_text}"
),
}
# api_models is None — couldn't reach API. Accept and persist,
# but warn so typos don't silently break things.
@@ -2162,8 +2192,8 @@ def validate_requested_model(
provider_label = _PROVIDER_LABELS.get(normalized, normalized)
return {
"accepted": True,
"persist": True,
"accepted": False,
"persist": False,
"recognized": False,
"message": (
f"Could not reach the {provider_label} API to validate `{requested}`. "
+3 -12
View File
@@ -300,19 +300,10 @@ def _read_config_model(profile_dir: Path) -> tuple:
def _check_gateway_running(profile_dir: Path) -> bool:
"""Check if a gateway is running for a given profile directory."""
pid_file = profile_dir / "gateway.pid"
if not pid_file.exists():
return False
try:
raw = pid_file.read_text().strip()
if not raw:
return False
data = json.loads(raw) if raw.startswith("{") else {"pid": int(raw)}
pid = int(data["pid"])
os.kill(pid, 0) # existence check
return True
except (json.JSONDecodeError, KeyError, ValueError, TypeError,
ProcessLookupError, PermissionError, OSError):
from gateway.status import get_running_pid
return get_running_pid(profile_dir / "gateway.pid", cleanup_stale=False) is not None
except Exception:
return False
+11
View File
@@ -137,6 +137,11 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
base_url_override="https://api.x.ai/v1",
base_url_env_var="XAI_BASE_URL",
),
"nvidia": HermesOverlay(
transport="openai_chat",
base_url_override="https://integrate.api.nvidia.com/v1",
base_url_env_var="NVIDIA_BASE_URL",
),
"xiaomi": HermesOverlay(
transport="openai_chat",
base_url_env_var="XIAOMI_BASE_URL",
@@ -191,6 +196,12 @@ ALIASES: Dict[str, str] = {
"x.ai": "xai",
"grok": "xai",
# nvidia
"nim": "nvidia",
"nvidia-nim": "nvidia",
"build-nvidia": "nvidia",
"nemotron": "nvidia",
# kimi-for-coding (models.dev ID)
"kimi": "kimi-for-coding",
"kimi-coding": "kimi-for-coding",
+13 -54
View File
@@ -91,7 +91,7 @@ _DEFAULT_PROVIDER_MODELS = {
"gemini": [
"gemini-3.1-pro-preview", "gemini-3-flash-preview", "gemini-3.1-flash-lite-preview",
"gemini-2.5-pro", "gemini-2.5-flash", "gemini-2.5-flash-lite",
"gemma-4-31b-it", "gemma-4-26b-it",
"gemma-4-31b-it",
],
"zai": ["glm-5.1", "glm-5", "glm-4.7", "glm-4.5", "glm-4.5-flash"],
"kimi-coding": ["kimi-k2.5", "kimi-k2-thinking", "kimi-k2-turbo-preview"],
@@ -2005,52 +2005,6 @@ def _setup_wecom_callback():
_gw_setup()
def _setup_qqbot():
"""Configure QQ Bot gateway."""
print_header("QQ Bot")
existing = get_env_value("QQ_APP_ID")
if existing:
print_info("QQ Bot: already configured")
if not prompt_yes_no("Reconfigure QQ Bot?", False):
return
print_info("Connects Hermes to QQ via the Official QQ Bot API (v2).")
print_info(" Requires a QQ Bot application at q.qq.com")
print_info(" Reference: https://bot.q.qq.com/wiki/develop/api-v2/")
print()
app_id = prompt("QQ Bot App ID")
if not app_id:
print_warning("App ID is required — skipping QQ Bot setup")
return
save_env_value("QQ_APP_ID", app_id.strip())
client_secret = prompt("QQ Bot App Secret", password=True)
if not client_secret:
print_warning("App Secret is required — skipping QQ Bot setup")
return
save_env_value("QQ_CLIENT_SECRET", client_secret)
print_success("QQ Bot credentials saved")
print()
print_info("🔒 Security: Restrict who can DM your bot")
print_info(" Use QQ user OpenIDs (found in event payloads)")
print()
allowed_users = prompt("Allowed user OpenIDs (comma-separated, leave empty for open access)")
if allowed_users:
save_env_value("QQ_ALLOWED_USERS", allowed_users.replace(" ", ""))
print_success("QQ Bot allowlist configured")
else:
print_info("⚠️ No allowlist set — anyone can DM the bot!")
print()
print_info("📬 Home Channel: OpenID for cron job delivery and notifications.")
home_channel = prompt("Home channel OpenID (leave empty to set later)")
if home_channel:
save_env_value("QQ_HOME_CHANNEL", home_channel)
print()
print_success("QQ Bot configured!")
def _setup_bluebubbles():
@@ -2119,12 +2073,9 @@ def _setup_bluebubbles():
def _setup_qqbot():
"""Configure QQ Bot (Official API v2) via standard platform setup."""
from hermes_cli.gateway import _PLATFORMS
qq_platform = next((p for p in _PLATFORMS if p["key"] == "qqbot"), None)
if qq_platform:
from hermes_cli.gateway import _setup_standard_platform
_setup_standard_platform(qq_platform)
"""Configure QQ Bot (Official API v2) via gateway setup."""
from hermes_cli.gateway import _setup_qqbot as _gateway_setup_qqbot
_gateway_setup_qqbot()
def _setup_webhooks():
@@ -2264,7 +2215,9 @@ def setup_gateway(config: dict):
missing_home.append("Slack")
if get_env_value("BLUEBUBBLES_SERVER_URL") and not get_env_value("BLUEBUBBLES_HOME_CHANNEL"):
missing_home.append("BlueBubbles")
if get_env_value("QQ_APP_ID") and not get_env_value("QQ_HOME_CHANNEL"):
if get_env_value("QQ_APP_ID") and not (
get_env_value("QQBOT_HOME_CHANNEL") or get_env_value("QQ_HOME_CHANNEL")
):
missing_home.append("QQBot")
if missing_home:
@@ -2289,8 +2242,10 @@ def setup_gateway(config: dict):
_is_service_running,
supports_systemd_services,
has_conflicting_systemd_units,
has_legacy_hermes_units,
install_linux_gateway_from_setup,
print_systemd_scope_conflict_warning,
print_legacy_unit_warning,
systemd_start,
systemd_restart,
launchd_install,
@@ -2308,6 +2263,10 @@ def setup_gateway(config: dict):
print_systemd_scope_conflict_warning()
print()
if supports_systemd and has_legacy_hermes_units():
print_legacy_unit_warning()
print()
if service_running:
if prompt_yes_no(" Restart the gateway to pick up changes?", True):
try:
+147 -1
View File
@@ -515,6 +515,90 @@ def do_inspect(identifier: str, console: Optional[Console] = None) -> None:
c.print()
def browse_skills(page: int = 1, page_size: int = 20, source: str = "all") -> dict:
"""Paginated hub browse for programmatic callers (e.g. TUI gateway).
Returns ``{"items": [...], "page": int, "total_pages": int, "total": int}``.
"""
from tools.skills_hub import GitHubAuth, create_source_router
page_size = max(1, min(page_size, 100))
_TRUST_RANK = {"builtin": 3, "trusted": 2, "community": 1}
_PER_SOURCE_LIMIT = {"official": 100, "skills-sh": 100, "well-known": 25, "github": 100, "clawhub": 50,
"claude-marketplace": 50, "lobehub": 50}
auth = GitHubAuth()
sources = create_source_router(auth)
all_results: list = []
for src in sources:
sid = src.source_id()
if source != "all" and sid != source and sid != "official":
continue
try:
limit = _PER_SOURCE_LIMIT.get(sid, 50)
all_results.extend(src.search("", limit=limit))
except Exception:
continue
if not all_results:
return {"items": [], "page": 1, "total_pages": 1, "total": 0}
seen: dict = {}
for r in all_results:
rank = _TRUST_RANK.get(r.trust_level, 0)
if r.name not in seen or rank > _TRUST_RANK.get(seen[r.name].trust_level, 0):
seen[r.name] = r
deduped = list(seen.values())
deduped.sort(key=lambda r: (-_TRUST_RANK.get(r.trust_level, 0), r.source != "official", r.name.lower()))
total = len(deduped)
total_pages = max(1, (total + page_size - 1) // page_size)
page = max(1, min(page, total_pages))
start = (page - 1) * page_size
page_items = deduped[start : min(start + page_size, total)]
return {
"items": [{"name": r.name, "description": r.description, "source": r.source,
"trust": r.trust_level} for r in page_items],
"page": page,
"total_pages": total_pages,
"total": total,
}
def inspect_skill(identifier: str) -> Optional[dict]:
"""Skill metadata (+ SKILL.md preview) for programmatic callers."""
from tools.skills_hub import GitHubAuth, create_source_router
class _Q:
def print(self, *a, **k):
pass
c = _Q()
auth = GitHubAuth()
sources = create_source_router(auth)
ident = identifier
if "/" not in ident:
ident = _resolve_short_name(ident, sources, c)
if not ident:
return None
meta, bundle, _ = _resolve_source_meta_and_bundle(ident, sources)
if not meta:
return None
out: dict = {
"name": meta.name,
"description": meta.description,
"source": meta.source,
"identifier": meta.identifier,
"tags": list(meta.tags) if meta.tags else [],
}
if bundle and "SKILL.md" in bundle.files:
content = bundle.files["SKILL.md"]
if isinstance(content, bytes):
content = content.decode("utf-8", errors="replace")
lines = content.split("\n")
preview = "\n".join(lines[:50])
if len(lines) > 50:
preview += f"\n\n... ({len(lines) - 50} more lines)"
out["skill_md_preview"] = preview
return out
def do_list(source_filter: str = "all", console: Optional[Console] = None) -> None:
"""List installed skills, distinguishing hub, builtin, and local skills."""
from tools.skills_hub import HubLockFile, ensure_hub_dirs
@@ -684,6 +768,51 @@ def do_uninstall(name: str, console: Optional[Console] = None,
c.print(f"[bold red]Error:[/] {msg}\n")
def do_reset(name: str, restore: bool = False,
console: Optional[Console] = None,
skip_confirm: bool = False,
invalidate_cache: bool = True) -> None:
"""Reset a bundled skill's manifest tracking (+ optionally restore from bundled)."""
from tools.skills_sync import reset_bundled_skill
c = console or _console
if not skip_confirm and restore:
c.print(f"\n[bold]Restore '{name}' from bundled source?[/]")
c.print("[dim]This will DELETE your current copy and re-copy the bundled version.[/]")
try:
answer = input("Confirm [y/N]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
answer = "n"
if answer not in ("y", "yes"):
c.print("[dim]Cancelled.[/]\n")
return
result = reset_bundled_skill(name, restore=restore)
if not result["ok"]:
c.print(f"[bold red]Error:[/] {result['message']}\n")
return
c.print(f"[bold green]{result['message']}[/]")
synced = result.get("synced") or {}
if synced.get("copied"):
c.print(f"[dim]Copied: {', '.join(synced['copied'])}[/]")
if synced.get("updated"):
c.print(f"[dim]Updated: {', '.join(synced['updated'])}[/]")
c.print()
if invalidate_cache:
try:
from agent.prompt_builder import clear_skills_system_prompt_cache
clear_skills_system_prompt_cache(clear_snapshot=True)
except Exception:
pass
else:
c.print("[dim]Change will take effect in your next session.[/]")
c.print("[dim]Use /reset to start a new session now, or --now to apply immediately (invalidates prompt cache).[/]\n")
def do_tap(action: str, repo: str = "", console: Optional[Console] = None) -> None:
"""Manage taps (custom GitHub repo sources)."""
from tools.skills_hub import TapsManager
@@ -1007,6 +1136,9 @@ def skills_command(args) -> None:
do_audit(name=getattr(args, "name", None))
elif action == "uninstall":
do_uninstall(args.name)
elif action == "reset":
do_reset(args.name, restore=getattr(args, "restore", False),
skip_confirm=getattr(args, "yes", False))
elif action == "publish":
do_publish(
args.skill_path,
@@ -1029,7 +1161,7 @@ def skills_command(args) -> None:
return
do_tap(tap_action, repo=repo)
else:
_console.print("Usage: hermes skills [browse|search|install|inspect|list|check|update|audit|uninstall|publish|snapshot|tap]\n")
_console.print("Usage: hermes skills [browse|search|install|inspect|list|check|update|audit|uninstall|reset|publish|snapshot|tap]\n")
_console.print("Run 'hermes skills <command> --help' for details.\n")
@@ -1175,6 +1307,19 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
do_uninstall(args[0], console=c, skip_confirm=skip_confirm,
invalidate_cache=invalidate_cache)
elif action == "reset":
if not args:
c.print("[bold red]Usage:[/] /skills reset <name> [--restore] [--now]\n")
c.print("[dim]Clears the bundled-skills manifest entry so future updates stop marking it as user-modified.[/]")
c.print("[dim]Pass --restore to also replace the current copy with the bundled version.[/]\n")
return
name = args[0]
restore = "--restore" in args
invalidate_cache = "--now" in args
# Slash commands can't prompt — --restore in slash mode is implicit consent.
do_reset(name, restore=restore, console=c, skip_confirm=True,
invalidate_cache=invalidate_cache)
elif action == "publish":
if not args:
c.print("[bold red]Usage:[/] /skills publish <skill-path> [--to github] [--repo owner/repo]\n")
@@ -1231,6 +1376,7 @@ def _print_skills_help(console: Console) -> None:
" [cyan]update[/] [name] Update hub skills with upstream changes\n"
" [cyan]audit[/] [name] Re-scan hub skills for security\n"
" [cyan]uninstall[/] <name> Remove a hub-installed skill\n"
" [cyan]reset[/] <name> [--restore] Reset bundled-skill tracking (fix 'user-modified' flag)\n"
" [cyan]publish[/] <path> --repo <r> Publish a skill to GitHub via PR\n"
" [cyan]snapshot[/] export|import Export/import skill configurations\n"
" [cyan]tap[/] list|add|remove Manage skill sources\n",
+2 -2
View File
@@ -23,7 +23,7 @@ All fields are optional. Missing values inherit from the ``default`` skin.
banner_dim: "#B8860B" # Dim/muted text (separators, labels)
banner_text: "#FFF8DC" # Body text (tool names, skill names)
ui_accent: "#FFBF00" # General UI accent
ui_label: "#4dd0e1" # UI labels
ui_label: "#DAA520" # UI labels (warm gold; teal clashed w/ default banner gold)
ui_ok: "#4caf50" # Success indicators
ui_error: "#ef5350" # Error indicators
ui_warn: "#ffa726" # Warning indicators
@@ -163,7 +163,7 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"banner_dim": "#B8860B",
"banner_text": "#FFF8DC",
"ui_accent": "#FFBF00",
"ui_label": "#4dd0e1",
"ui_label": "#DAA520",
"ui_ok": "#4caf50",
"ui_error": "#ef5350",
"ui_warn": "#ffa726",
+30 -64
View File
@@ -317,7 +317,7 @@ def show_status(args):
"WeCom Callback": ("WECOM_CALLBACK_CORP_ID", None),
"Weixin": ("WEIXIN_ACCOUNT_ID", "WEIXIN_HOME_CHANNEL"),
"BlueBubbles": ("BLUEBUBBLES_SERVER_URL", "BLUEBUBBLES_HOME_CHANNEL"),
"QQBot": ("QQ_APP_ID", "QQ_HOME_CHANNEL"),
"QQBot": ("QQ_APP_ID", "QQBOT_HOME_CHANNEL"),
}
for name, (token_var, home_var) in platforms.items():
@@ -327,6 +327,9 @@ def show_status(args):
home_channel = ""
if home_var:
home_channel = os.getenv(home_var, "")
# Back-compat: QQBot home channel was renamed from QQ_HOME_CHANNEL to QQBOT_HOME_CHANNEL
if not home_channel and home_var == "QQBOT_HOME_CHANNEL":
home_channel = os.getenv("QQ_HOME_CHANNEL", "")
status = "configured" if has_token else "not configured"
if home_channel:
@@ -339,73 +342,36 @@ def show_status(args):
# =========================================================================
print()
print(color("◆ Gateway Service", Colors.CYAN, Colors.BOLD))
if _is_termux():
try:
from hermes_cli.gateway import find_gateway_pids
gateway_pids = find_gateway_pids()
except Exception:
gateway_pids = []
is_running = bool(gateway_pids)
try:
from hermes_cli.gateway import get_gateway_runtime_snapshot, _format_gateway_pids
snapshot = get_gateway_runtime_snapshot()
is_running = snapshot.running
print(f" Status: {check_mark(is_running)} {'running' if is_running else 'stopped'}")
print(" Manager: Termux / manual process")
if gateway_pids:
rendered = ", ".join(str(pid) for pid in gateway_pids[:3])
if len(gateway_pids) > 3:
rendered += ", ..."
print(f" PID(s): {rendered}")
else:
print(f" Manager: {snapshot.manager}")
if snapshot.gateway_pids:
print(f" PID(s): {_format_gateway_pids(snapshot.gateway_pids)}")
if snapshot.has_process_service_mismatch:
print(" Service: installed but not managing the current running gateway")
elif _is_termux() and not snapshot.gateway_pids:
print(" Start with: hermes gateway")
print(" Note: Android may stop background jobs when Termux is suspended")
elif sys.platform.startswith('linux'):
from hermes_constants import is_container
if is_container():
# Docker/Podman: no systemd — check for running gateway processes
try:
from hermes_cli.gateway import find_gateway_pids
gateway_pids = find_gateway_pids()
is_active = len(gateway_pids) > 0
except Exception:
is_active = False
print(f" Status: {check_mark(is_active)} {'running' if is_active else 'stopped'}")
print(" Manager: docker (foreground)")
elif snapshot.service_installed and not snapshot.service_running:
print(" Service: installed but stopped")
except Exception:
if _is_termux():
print(f" Status: {color('unknown', Colors.DIM)}")
print(" Manager: Termux / manual process")
elif sys.platform.startswith('linux'):
print(f" Status: {color('unknown', Colors.DIM)}")
print(" Manager: systemd/manual")
elif sys.platform == 'darwin':
print(f" Status: {color('unknown', Colors.DIM)}")
print(" Manager: launchd")
else:
try:
from hermes_cli.gateway import get_service_name
_gw_svc = get_service_name()
except Exception:
_gw_svc = "hermes-gateway"
try:
result = subprocess.run(
["systemctl", "--user", "is-active", _gw_svc],
capture_output=True,
text=True,
timeout=5
)
is_active = result.stdout.strip() == "active"
except (FileNotFoundError, subprocess.TimeoutExpired):
is_active = False
print(f" Status: {check_mark(is_active)} {'running' if is_active else 'stopped'}")
print(" Manager: systemd (user)")
elif sys.platform == 'darwin':
from hermes_cli.gateway import get_launchd_label
try:
result = subprocess.run(
["launchctl", "list", get_launchd_label()],
capture_output=True,
text=True,
timeout=5
)
is_loaded = result.returncode == 0
except subprocess.TimeoutExpired:
is_loaded = False
print(f" Status: {check_mark(is_loaded)} {'loaded' if is_loaded else 'not loaded'}")
print(" Manager: launchd")
else:
print(f" Status: {color('N/A', Colors.DIM)}")
print(" Manager: (not supported on this platform)")
print(f" Status: {color('N/A', Colors.DIM)}")
print(" Manager: (not supported on this platform)")
# =========================================================================
# Cron Jobs
+121 -2
View File
@@ -258,14 +258,16 @@ TOOL_CATEGORIES = {
"requires_nous_auth": True,
"managed_nous_feature": "image_gen",
"override_env_vars": ["FAL_KEY"],
"imagegen_backend": "fal",
},
{
"name": "FAL.ai",
"badge": "paid",
"tag": "FLUX 2 Pro with auto-upscaling",
"tag": "Pick from flux-2-klein, flux-2-pro, gpt-image, nano-banana, etc.",
"env_vars": [
{"key": "FAL_KEY", "prompt": "FAL API key", "url": "https://fal.ai/dashboard/keys"},
],
"imagegen_backend": "fal",
},
],
},
@@ -510,7 +512,7 @@ def _get_platform_tools(
"""Resolve which individual toolset names are enabled for a platform."""
from toolsets import resolve_toolset
platform_toolsets = config.get("platform_toolsets", {})
platform_toolsets = config.get("platform_toolsets") or {}
toolset_names = platform_toolsets.get(platform)
if toolset_names is None or not isinstance(toolset_names, list):
@@ -950,6 +952,106 @@ def _detect_active_provider_index(providers: list, config: dict) -> int:
return 0
# ─── Image Generation Model Pickers ───────────────────────────────────────────
#
# IMAGEGEN_BACKENDS is a per-backend catalog. Each entry exposes:
# - config_key: top-level config.yaml key for this backend's settings
# - model_catalog_fn: returns an OrderedDict-like {model_id: metadata}
# - default_model: fallback when nothing is configured
#
# This prepares for future imagegen backends (Replicate, Stability, etc.):
# each new backend registers its own entry; the FAL provider entry in
# TOOL_CATEGORIES tags itself with `imagegen_backend: "fal"` to select the
# right catalog at picker time.
def _fal_model_catalog():
"""Lazy-load the FAL model catalog from the tool module."""
from tools.image_generation_tool import FAL_MODELS, DEFAULT_MODEL
return FAL_MODELS, DEFAULT_MODEL
IMAGEGEN_BACKENDS = {
"fal": {
"display": "FAL.ai",
"config_key": "image_gen",
"catalog_fn": _fal_model_catalog,
},
}
def _format_imagegen_model_row(model_id: str, meta: dict, widths: dict) -> str:
"""Format a single picker row with column-aligned speed / strengths / price."""
return (
f"{model_id:<{widths['model']}} "
f"{meta.get('speed', ''):<{widths['speed']}} "
f"{meta.get('strengths', ''):<{widths['strengths']}} "
f"{meta.get('price', '')}"
)
def _configure_imagegen_model(backend_name: str, config: dict) -> None:
"""Prompt the user to pick a model for the given imagegen backend.
Writes selection to ``config[backend_config_key]["model"]``. Safe to
call even when stdin is not a TTY curses_radiolist falls back to
keeping the current selection.
"""
backend = IMAGEGEN_BACKENDS.get(backend_name)
if not backend:
return
catalog, default_model = backend["catalog_fn"]()
if not catalog:
return
cfg_key = backend["config_key"]
cur_cfg = config.setdefault(cfg_key, {})
if not isinstance(cur_cfg, dict):
cur_cfg = {}
config[cfg_key] = cur_cfg
current_model = cur_cfg.get("model") or default_model
if current_model not in catalog:
current_model = default_model
model_ids = list(catalog.keys())
# Put current model at the top so the cursor lands on it by default.
ordered = [current_model] + [m for m in model_ids if m != current_model]
# Column widths
widths = {
"model": max(len(m) for m in model_ids),
"speed": max((len(catalog[m].get("speed", "")) for m in model_ids), default=6),
"strengths": max((len(catalog[m].get("strengths", "")) for m in model_ids), default=0),
}
print()
header = (
f" {'Model':<{widths['model']}} "
f"{'Speed':<{widths['speed']}} "
f"{'Strengths':<{widths['strengths']}} "
f"Price"
)
print(color(header, Colors.CYAN))
rows = []
for mid in ordered:
row = _format_imagegen_model_row(mid, catalog[mid], widths)
if mid == current_model:
row += " ← currently in use"
rows.append(row)
idx = _prompt_choice(
f" Choose {backend['display']} model:",
rows,
default=0,
)
chosen = ordered[idx]
cur_cfg["model"] = chosen
_print_success(f" Model set to: {chosen}")
def _configure_provider(provider: dict, config: dict):
"""Configure a single provider - prompt for API keys and set config."""
env_vars = provider.get("env_vars", [])
@@ -1006,6 +1108,10 @@ def _configure_provider(provider: dict, config: dict):
_print_success(f" {provider['name']} - no configuration needed!")
if managed_feature:
_print_info(" Requests for this tool will be billed to your Nous subscription.")
# Imagegen backends prompt for model selection after backend pick.
backend = provider.get("imagegen_backend")
if backend:
_configure_imagegen_model(backend, config)
return
# Prompt for each required env var
@@ -1040,6 +1146,10 @@ def _configure_provider(provider: dict, config: dict):
if all_configured:
_print_success(f" {provider['name']} configured!")
# Imagegen backends prompt for model selection after env vars are in.
backend = provider.get("imagegen_backend")
if backend:
_configure_imagegen_model(backend, config)
def _configure_simple_requirements(ts_key: str):
@@ -1211,6 +1321,10 @@ def _reconfigure_provider(provider: dict, config: dict):
_print_success(f" {provider['name']} - no configuration needed!")
if managed_feature:
_print_info(" Requests for this tool will be billed to your Nous subscription.")
# Imagegen backends prompt for model selection on reconfig too.
backend = provider.get("imagegen_backend")
if backend:
_configure_imagegen_model(backend, config)
return
for var in env_vars:
@@ -1228,6 +1342,11 @@ def _reconfigure_provider(provider: dict, config: dict):
else:
_print_info(" Kept current")
# Imagegen backends prompt for model selection on reconfig too.
backend = provider.get("imagegen_backend")
if backend:
_configure_imagegen_model(backend, config)
def _reconfigure_simple_requirements(ts_key: str):
"""Reconfigure simple env var requirements."""
+3 -33
View File
@@ -56,7 +56,7 @@ try:
except ImportError:
raise SystemExit(
"Web UI requires fastapi and uvicorn.\n"
"Run 'hermes web' to auto-install, or: pip install hermes-agent[web]"
f"Install with: {sys.executable} -m pip install 'fastapi' 'uvicorn[standard]'"
)
WEB_DIST = Path(__file__).parent / "web_dist"
@@ -1444,38 +1444,8 @@ def _nous_poller(session_id: str) -> None:
auth_state, min_key_ttl_seconds=300, timeout_seconds=15.0,
force_refresh=False, force_mint=True,
)
# Save into credential pool same as auth_commands.py does
from agent.credential_pool import (
PooledCredential,
load_pool,
AUTH_TYPE_OAUTH,
SOURCE_MANUAL,
)
pool = load_pool("nous")
entry = PooledCredential.from_dict("nous", {
**full_state,
"label": "dashboard device_code",
"auth_type": AUTH_TYPE_OAUTH,
"source": f"{SOURCE_MANUAL}:dashboard_device_code",
"base_url": full_state.get("inference_base_url"),
})
pool.add_entry(entry)
# Also persist to auth store so get_nous_auth_status() sees it
# (matches what _login_nous in auth.py does for the CLI flow).
try:
from hermes_cli.auth import (
_load_auth_store, _save_provider_state, _save_auth_store,
_auth_store_lock,
)
with _auth_store_lock():
auth_store = _load_auth_store()
_save_provider_state(auth_store, "nous", full_state)
_save_auth_store(auth_store)
except Exception as store_exc:
_log.warning(
"oauth/device: credential pool saved but auth store write failed "
"(session=%s): %s", session_id, store_exc,
)
from hermes_cli.auth import persist_nous_credentials
persist_nous_credentials(full_state)
with _oauth_sessions_lock:
sess["status"] = "approved"
_log.info("oauth/device: nous login completed (session=%s)", session_id)
+2 -1
View File
@@ -14,7 +14,8 @@ def get_hermes_home() -> Path:
Reads HERMES_HOME env var, falls back to ~/.hermes.
This is the single source of truth all other copies should import this.
"""
return Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
val = os.environ.get("HERMES_HOME", "").strip()
return Path(val) if val else Path.home() / ".hermes"
def get_default_hermes_root() -> Path:
+57 -152
View File
@@ -987,6 +987,22 @@ class SessionDB:
return sanitized.strip()
@staticmethod
def _contains_cjk(text: str) -> bool:
"""Check if text contains CJK (Chinese, Japanese, Korean) characters."""
for ch in text:
cp = ord(ch)
if (0x4E00 <= cp <= 0x9FFF or # CJK Unified Ideographs
0x3400 <= cp <= 0x4DBF or # CJK Extension A
0x20000 <= cp <= 0x2A6DF or # CJK Extension B
0x3000 <= cp <= 0x303F or # CJK Symbols
0x3040 <= cp <= 0x309F or # Hiragana
0x30A0 <= cp <= 0x30FF or # Katakana
0xAC00 <= cp <= 0xD7AF): # Hangul Syllables
return True
return False
def search_messages(
self,
query: str,
@@ -1062,8 +1078,47 @@ class SessionDB:
cursor = self._conn.execute(sql, params)
except sqlite3.OperationalError:
# FTS5 query syntax error despite sanitization — return empty
return []
matches = [dict(row) for row in cursor.fetchall()]
# unless query contains CJK (fall back to LIKE below)
if not self._contains_cjk(query):
return []
matches = []
else:
matches = [dict(row) for row in cursor.fetchall()]
# LIKE fallback for CJK queries: FTS5 default tokenizer splits CJK
# characters individually, causing multi-character queries to fail.
if not matches and self._contains_cjk(query):
raw_query = query.strip('"').strip()
like_where = ["m.content LIKE ?"]
like_params: list = [f"%{raw_query}%"]
if source_filter is not None:
like_where.append(f"s.source IN ({','.join('?' for _ in source_filter)})")
like_params.extend(source_filter)
if exclude_sources is not None:
like_where.append(f"s.source NOT IN ({','.join('?' for _ in exclude_sources)})")
like_params.extend(exclude_sources)
if role_filter:
like_where.append(f"m.role IN ({','.join('?' for _ in role_filter)})")
like_params.extend(role_filter)
like_sql = f"""
SELECT m.id, m.session_id, m.role,
substr(m.content,
max(1, instr(m.content, ?) - 40),
120) AS snippet,
m.content, m.timestamp, m.tool_name,
s.source, s.model, s.started_at AS session_started
FROM messages m
JOIN sessions s ON s.id = m.session_id
WHERE {' AND '.join(like_where)}
ORDER BY m.timestamp DESC
LIMIT ? OFFSET ?
"""
like_params.extend([limit, offset])
# instr() parameter goes first in the bound list
like_params = [raw_query] + like_params
with self._lock:
like_cursor = self._conn.execute(like_sql, like_params)
matches = [dict(row) for row in like_cursor.fetchall()]
# Add surrounding context (1 message before + after each match).
# Done outside the lock so we don't hold it across N sequential queries.
@@ -1160,23 +1215,6 @@ class SessionDB:
results.append({**session, "messages": messages})
return results
# ---------------------------------------------------------------
# Export sanitization
# ---------------------------------------------------------------
#
# When users share session exports for debugging or training, the
# raw JSON contains every user message, tool output, and reasoning
# trace — which often includes file contents, command output, env
# variables, paths, and other confidential information.
#
# ``sanitize_session_export`` produces a deep copy of the export
# with all content fields replaced by opaque ``[redacted:<kind>:<id>]``
# tokens. Structural metadata (IDs, roles, timestamps, token counts,
# tool names, finish reasons, model info, cost data) is preserved
# so that the shape of a conversation is still analysable.
#
# Inspired by anomalyco/opencode#22489 (opencode's ``export --sanitize``).
def clear_messages(self, session_id: str) -> None:
"""Delete all messages for a session and reset its counters."""
def _do(conn):
@@ -1253,136 +1291,3 @@ class SessionDB:
return len(session_ids)
return self._execute_write(_do)
# =========================================================================
# Session export sanitization
# =========================================================================
#
# Ported from anomalyco/opencode#22489 — users often want to share a
# session export for bug reports, feature requests, or training data
# collection, but the raw export contains every user prompt, tool
# output, file content, and reasoning trace. ``sanitize_session_export``
# replaces content fields with opaque tokens while preserving the
# conversation's structure and metrics.
# Message-level content fields that are always redacted on a message.
_REDACT_MSG_STRING_FIELDS = (
"content",
"reasoning",
)
# Session-level fields that can contain user-facing text.
_REDACT_SESSION_STRING_FIELDS = (
"system_prompt",
"title",
)
def _redact_token(kind: str, id_: Any, value: Any) -> Any:
"""Produce an opaque redaction token. Preserves empty/None values."""
if value in (None, "", b""):
return value
return f"[redacted:{kind}:{id_}]"
def _redact_tool_call(call: Any, msg_id: Any, index: int) -> Any:
"""Redact arguments inside a tool_call while preserving structure (id, name)."""
if not isinstance(call, dict):
return call
out = dict(call)
tcid = out.get("id") or f"{msg_id}-{index}"
fn = out.get("function")
if isinstance(fn, dict):
new_fn = dict(fn)
if "arguments" in new_fn and new_fn["arguments"] not in (None, "", "{}"):
new_fn["arguments"] = _redact_token("tool-input", tcid, new_fn["arguments"])
out["function"] = new_fn
# Some schemas put args at the top level rather than under ``function``.
if "arguments" in out and out["arguments"] not in (None, "", "{}"):
out["arguments"] = _redact_token("tool-input", tcid, out["arguments"])
return out
def _redact_reasoning_details(details: Any, msg_id: Any) -> Any:
"""Redact text inside OpenAI / Anthropic reasoning_details blocks.
``reasoning_details`` is a list of dicts with shapes like::
{"type": "reasoning.text", "text": "..."}
{"type": "reasoning.encrypted", "data": "..."}
{"type": "reasoning.summary", "summary": "..."}
We preserve the block type/structure and redact the inner payload.
"""
if not isinstance(details, list):
return details
out = []
for idx, block in enumerate(details):
if not isinstance(block, dict):
out.append(block)
continue
new_block = dict(block)
for key in ("text", "data", "summary", "content"):
if key in new_block and new_block[key] not in (None, ""):
new_block[key] = _redact_token(f"reasoning-{key}", f"{msg_id}-{idx}", new_block[key])
out.append(new_block)
return out
def _redact_message(msg: Dict[str, Any]) -> Dict[str, Any]:
"""Return a sanitized copy of a single message row."""
if not isinstance(msg, dict):
return msg
msg_id = msg.get("id", "msg")
out = dict(msg)
# Plain string content fields.
for field in _REDACT_MSG_STRING_FIELDS:
if field in out and out[field] not in (None, ""):
out[field] = _redact_token(field.replace("_", "-"), msg_id, out[field])
# Tool calls: keep structure (id, name) but redact arguments.
tcs = out.get("tool_calls")
if isinstance(tcs, list):
out["tool_calls"] = [_redact_tool_call(tc, msg_id, i) for i, tc in enumerate(tcs)]
# Reasoning details: preserve block structure, redact text/data.
if "reasoning_details" in out:
out["reasoning_details"] = _redact_reasoning_details(out["reasoning_details"], msg_id)
# Codex reasoning items follow the same shape as reasoning_details.
if "codex_reasoning_items" in out:
out["codex_reasoning_items"] = _redact_reasoning_details(out["codex_reasoning_items"], msg_id)
return out
def sanitize_session_export(session: Dict[str, Any]) -> Dict[str, Any]:
"""Return a deep-sanitized copy of a session export.
All user-facing content (message text, reasoning, tool arguments and
outputs, system prompt, title) is replaced by ``[redacted:<kind>:<id>]``
tokens. Structural metadata (ids, timestamps, token counts, tool names,
model/provider info, cost data, finish reasons) is preserved so the
export remains useful for debugging schema issues, analysing tool-use
patterns, or counting sessions without leaking confidential data.
The input dict is not mutated.
"""
if not isinstance(session, dict):
return session
sid = session.get("id", "session")
out = dict(session)
# Session-level text fields (title, system prompt).
for field in _REDACT_SESSION_STRING_FIELDS:
if field in out and out[field] not in (None, ""):
out[field] = _redact_token(field.replace("_", "-"), sid, out[field])
# Messages list: sanitize each row.
msgs = out.get("messages")
if isinstance(msgs, list):
out["messages"] = [_redact_message(m) for m in msgs]
return out
+2 -2
View File
@@ -433,7 +433,7 @@ def create_mcp_server(event_bridge: Optional[EventBridge] = None) -> "FastMCP":
if not _MCP_SERVER_AVAILABLE:
raise ImportError(
"MCP server requires the 'mcp' package. "
"Install with: pip install 'hermes-agent[mcp]'"
f"Install with: {sys.executable} -m pip install 'mcp'"
)
mcp = FastMCP(
@@ -838,7 +838,7 @@ def run_mcp_server(verbose: bool = False) -> None:
if not _MCP_SERVER_AVAILABLE:
print(
"Error: MCP server requires the 'mcp' package.\n"
"Install with: pip install 'hermes-agent[mcp]'",
f"Install with: {sys.executable} -m pip install 'mcp'",
file=sys.stderr,
)
sys.exit(1)
+20 -6
View File
@@ -43,6 +43,15 @@ from dotenv import load_dotenv
load_dotenv()
def _effective_temperature_for_model(model: str) -> Optional[float]:
"""Return a fixed temperature for models with strict sampling contracts."""
try:
from agent.auxiliary_client import _fixed_temperature_for_model
except Exception:
return None
return _fixed_temperature_for_model(model)
# ============================================================================
@@ -442,12 +451,17 @@ Complete the user's task step by step."""
# Make API call
try:
response = self.client.chat.completions.create(
model=self.model,
messages=api_messages,
tools=self.tools,
timeout=300.0
)
api_kwargs = {
"model": self.model,
"messages": api_messages,
"tools": self.tools,
"timeout": 300.0,
}
fixed_temperature = _effective_temperature_for_model(self.model)
if fixed_temperature is not None:
api_kwargs["temperature"] = fixed_temperature
response = self.client.chat.completions.create(**api_kwargs)
except Exception as e:
self.logger.error(f"API call failed: {e}")
break
+2 -2
View File
@@ -274,9 +274,9 @@ def get_tool_definitions(
# execute_code" even when the API key isn't configured or the toolset is
# disabled (#560-discord).
if "execute_code" in available_tool_names:
from tools.code_execution_tool import SANDBOX_ALLOWED_TOOLS, build_execute_code_schema
from tools.code_execution_tool import SANDBOX_ALLOWED_TOOLS, build_execute_code_schema, _get_execution_mode
sandbox_enabled = SANDBOX_ALLOWED_TOOLS & available_tool_names
dynamic_schema = build_execute_code_schema(sandbox_enabled)
dynamic_schema = build_execute_code_schema(sandbox_enabled, mode=_get_execution_mode())
for i, td in enumerate(filtered_tools):
if td.get("function", {}).get("name") == "execute_code":
filtered_tools[i] = {"type": "function", "function": dynamic_schema}
+22
View File
@@ -103,6 +103,28 @@ json.dump(sorted(leaf_paths(DEFAULT_CONFIG)), sys.stdout, indent=2)
echo "ok" > $out/result
'';
# Verify bundled TUI is present and compiled
bundled-tui = pkgs.runCommand "hermes-bundled-tui" { } ''
set -e
echo "=== Checking bundled TUI ==="
test -d ${hermes-agent}/ui-tui || (echo "FAIL: ui-tui directory missing"; exit 1)
echo "PASS: ui-tui directory exists"
test -f ${hermes-agent}/ui-tui/dist/entry.js || (echo "FAIL: compiled entry.js missing"; exit 1)
echo "PASS: compiled entry.js present"
test -d ${hermes-agent}/ui-tui/node_modules || (echo "FAIL: node_modules missing"; exit 1)
echo "PASS: node_modules present"
grep -q "HERMES_TUI_DIR" ${hermes-agent}/bin/hermes || \
(echo "FAIL: HERMES_TUI_DIR not in wrapper"; exit 1)
echo "PASS: HERMES_TUI_DIR set in wrapper"
echo "=== All bundled TUI checks passed ==="
mkdir -p $out
echo "ok" > $out/result
'';
# Verify HERMES_MANAGED guard works on all mutation commands
managed-guard = pkgs.runCommand "hermes-managed-guard" { } ''
set -e
+15 -38
View File
@@ -1,49 +1,26 @@
# nix/devShell.nix — Fast dev shell with stamp-file optimization
# nix/devShell.nix — Dev shell that delegates setup to each package
#
# Each package in inputsFrom exposes passthru.devShellHook — a bash snippet
# with stamp-checked setup logic. This file collects and runs them all.
{ inputs, ... }: {
perSystem = { pkgs, ... }:
perSystem = { pkgs, system, ... }:
let
python = pkgs.python311;
hermes-agent = inputs.self.packages.${system}.default;
hermes-tui = inputs.self.packages.${system}.tui;
packages = [ hermes-agent hermes-tui ];
in {
devShells.default = pkgs.mkShell {
inputsFrom = packages;
packages = with pkgs; [
python uv nodejs_20 ripgrep git openssh ffmpeg
python311 uv nodejs_22 ripgrep git openssh ffmpeg
];
shellHook = ''
shellHook = let
hooks = map (p: p.passthru.devShellHook or "") packages;
combined = pkgs.lib.concatStringsSep "\n" (builtins.filter (h: h != "") hooks);
in ''
echo "Hermes Agent dev shell"
# Composite stamp: changes when nix python or uv change
STAMP_VALUE="${python}:${pkgs.uv}"
STAMP_FILE=".venv/.nix-stamp"
# Create venv if missing
if [ ! -d .venv ]; then
echo "Creating Python 3.11 venv..."
uv venv .venv --python ${python}/bin/python3
fi
source .venv/bin/activate
# Only install if stamp is stale or missing
if [ ! -f "$STAMP_FILE" ] || [ "$(cat "$STAMP_FILE")" != "$STAMP_VALUE" ]; then
echo "Installing Python dependencies..."
uv pip install -e ".[all]"
if [ -d mini-swe-agent ]; then
uv pip install -e ./mini-swe-agent 2>/dev/null || true
fi
if [ -d tinker-atropos ]; then
uv pip install -e ./tinker-atropos 2>/dev/null || true
fi
# Install npm deps
if [ -f package.json ] && [ ! -d node_modules ]; then
echo "Installing npm dependencies..."
npm install
fi
echo "$STAMP_VALUE" > "$STAMP_FILE"
fi
${combined}
echo "Ready. Run 'hermes' to start."
'';
};
+83 -29
View File
@@ -1,54 +1,108 @@
# nix/packages.nix — Hermes Agent package built with uv2nix
{ inputs, ... }: {
perSystem = { pkgs, system, ... }:
{ inputs, ... }:
{
perSystem =
{ pkgs, inputs', ... }:
let
hermesVenv = pkgs.callPackage ./python.nix {
inherit (inputs) uv2nix pyproject-nix pyproject-build-systems;
};
hermesTui = pkgs.callPackage ./tui.nix {
npm-lockfile-fix = inputs'.npm-lockfile-fix.packages.default;
};
# Import bundled skills, excluding runtime caches
bundledSkills = pkgs.lib.cleanSourceWith {
src = ../skills;
filter = path: _type:
!(pkgs.lib.hasInfix "/index-cache/" path);
filter = path: _type: !(pkgs.lib.hasInfix "/index-cache/" path);
};
runtimeDeps = with pkgs; [
nodejs_20 ripgrep git openssh ffmpeg tirith
nodejs_22
ripgrep
git
openssh
ffmpeg
tirith
];
runtimePath = pkgs.lib.makeBinPath runtimeDeps;
in {
packages.default = pkgs.stdenv.mkDerivation {
pname = "hermes-agent";
version = (builtins.fromTOML (builtins.readFile ../pyproject.toml)).project.version;
dontUnpack = true;
dontBuild = true;
nativeBuildInputs = [ pkgs.makeWrapper ];
# Lockfile hashes for dev shell stamps
pyprojectHash = builtins.hashString "sha256" (builtins.readFile ../pyproject.toml);
uvLockHash =
if builtins.pathExists ../uv.lock then
builtins.hashString "sha256" (builtins.readFile ../uv.lock)
else
"none";
in
{
packages = {
default = pkgs.stdenv.mkDerivation {
pname = "hermes-agent";
version = (fromTOML (builtins.readFile ../pyproject.toml)).project.version;
installPhase = ''
runHook preInstall
dontUnpack = true;
dontBuild = true;
nativeBuildInputs = [ pkgs.makeWrapper ];
mkdir -p $out/share/hermes-agent $out/bin
cp -r ${bundledSkills} $out/share/hermes-agent/skills
installPhase = ''
runHook preInstall
${pkgs.lib.concatMapStringsSep "\n" (name: ''
makeWrapper ${hermesVenv}/bin/${name} $out/bin/${name} \
--suffix PATH : "${runtimePath}" \
--set HERMES_BUNDLED_SKILLS $out/share/hermes-agent/skills
'') [ "hermes" "hermes-agent" "hermes-acp" ]}
mkdir -p $out/share/hermes-agent $out/bin
cp -r ${bundledSkills} $out/share/hermes-agent/skills
runHook postInstall
'';
# copy pre-built TUI (same layout as dev: ui-tui/dist/ + node_modules/)
mkdir -p $out/ui-tui
cp -r ${hermesTui}/lib/hermes-tui/* $out/ui-tui/
meta = with pkgs.lib; {
description = "AI agent with advanced tool-calling capabilities";
homepage = "https://github.com/NousResearch/hermes-agent";
mainProgram = "hermes";
license = licenses.mit;
platforms = platforms.unix;
${pkgs.lib.concatMapStringsSep "\n"
(name: ''
makeWrapper ${hermesVenv}/bin/${name} $out/bin/${name} \
--suffix PATH : "${runtimePath}" \
--set HERMES_BUNDLED_SKILLS $out/share/hermes-agent/skills \
--set HERMES_TUI_DIR $out/ui-tui \
--set HERMES_PYTHON ${hermesVenv}/bin/python3
'')
[
"hermes"
"hermes-agent"
"hermes-acp"
]
}
runHook postInstall
'';
passthru.devShellHook = ''
STAMP=".nix-stamps/hermes-agent"
STAMP_VALUE="${pyprojectHash}:${uvLockHash}"
if [ ! -f "$STAMP" ] || [ "$(cat "$STAMP")" != "$STAMP_VALUE" ]; then
echo "hermes-agent: installing Python dependencies..."
uv venv .venv --python ${pkgs.python311}/bin/python3 2>/dev/null || true
source .venv/bin/activate
uv pip install -e ".[all]"
[ -d mini-swe-agent ] && uv pip install -e ./mini-swe-agent 2>/dev/null || true
[ -d tinker-atropos ] && uv pip install -e ./tinker-atropos 2>/dev/null || true
mkdir -p .nix-stamps
echo "$STAMP_VALUE" > "$STAMP"
else
source .venv/bin/activate
export HERMES_PYTHON=${hermesVenv}/bin/python3
fi
'';
meta = with pkgs.lib; {
description = "AI agent with advanced tool-calling capabilities";
homepage = "https://github.com/NousResearch/hermes-agent";
mainProgram = "hermes";
license = licenses.mit;
platforms = platforms.unix;
};
};
tui = hermesTui;
};
};
}
+82
View File
@@ -0,0 +1,82 @@
# nix/tui.nix — Hermes TUI (Ink/React) compiled with tsc and bundled
{ pkgs, npm-lockfile-fix, ... }:
let
src = ../ui-tui;
npmDeps = pkgs.fetchNpmDeps {
inherit src;
hash = "sha256-zsUPmbC6oMUO10EhS3ptvDjwlfpCSEmrkjyeORw7fac=";
};
packageJson = builtins.fromJSON (builtins.readFile (src + "/package.json"));
version = packageJson.version;
npmLockHash = builtins.hashString "sha256" (builtins.readFile ../ui-tui/package-lock.json);
in
pkgs.buildNpmPackage {
pname = "hermes-tui";
inherit src npmDeps version;
doCheck = false;
postPatch = ''
# fetchNpmDeps strips the trailing newline; match it so the diff passes
sed -i -z 's/\n$//' package-lock.json
'';
installPhase = ''
runHook preInstall
mkdir -p $out/lib/hermes-tui
cp -r dist $out/lib/hermes-tui/dist
# runtime node_modules
cp -r node_modules $out/lib/hermes-tui/node_modules
# @hermes/ink is a file: dependency, we need to copy it in fr
rm -f $out/lib/hermes-tui/node_modules/@hermes/ink
cp -r packages/hermes-ink $out/lib/hermes-tui/node_modules/@hermes/ink
# package.json needed for "type": "module" resolution
cp package.json $out/lib/hermes-tui/
runHook postInstall
'';
nativeBuildInputs = [
(pkgs.writeShellScriptBin "update_tui_lockfile" ''
set -euox pipefail
# get root of repo
REPO_ROOT=$(git rev-parse --show-toplevel)
# cd into ui-tui and reinstall
cd "$REPO_ROOT/ui-tui"
rm -rf node_modules/
npm cache clean --force
CI=true npm install # ci env var to suppress annoying unicode install banner lag
${pkgs.lib.getExe npm-lockfile-fix} ./package-lock.json
NIX_FILE="$REPO_ROOT/nix/tui.nix"
# compute the new hash
sed -i "s/hash = \"[^\"]*\";/hash = \"\";/" $NIX_FILE
NIX_OUTPUT=$(nix build .#tui 2>&1 || true)
NEW_HASH=$(echo "$NIX_OUTPUT" | grep 'got:' | awk '{print $2}')
echo got new hash $NEW_HASH
sed -i "s|hash = \"[^\"]*\";|hash = \"$NEW_HASH\";|" $NIX_FILE
nix build .#tui
echo "Updated npm hash in $NIX_FILE to $NEW_HASH"
'')
];
passthru.devShellHook = ''
STAMP=".nix-stamps/hermes-tui"
STAMP_VALUE="${npmLockHash}"
if [ ! -f "$STAMP" ] || [ "$(cat "$STAMP")" != "$STAMP_VALUE" ]; then
echo "hermes-tui: installing npm dependencies..."
cd ui-tui && CI=true npm install --silent --no-fund --no-audit 2>/dev/null && cd ..
mkdir -p .nix-stamps
echo "$STAMP_VALUE" > "$STAMP"
fi
'';
}
@@ -0,0 +1,346 @@
---
name: comfyui-mcp
description: Control a running ComfyUI instance from Hermes — queue workflows, generate images/video, upload inputs, manage models. Use when the user wants to create or modify anything with ComfyUI's node-based generative pipeline.
version: 1.0.0
requires: ComfyUI running locally, remotely, or via Comfy Cloud (default http://127.0.0.1:8188)
author: kshitijk4poor
license: MIT
metadata:
hermes:
tags: [comfyui, image-generation, stable-diffusion, flux, creative, generative-ai]
related_skills: [hermes-blender, stable-diffusion-image-generation, image_gen]
category: creative
---
# ComfyUI
Control a running ComfyUI instance from Hermes via its REST API. Queue workflow prompts, generate images and video, upload inputs, check progress, and retrieve outputs — all through `execute_code`.
## When to Use
- User asks to generate images with Stable Diffusion, SDXL, Flux, or other diffusion models
- User wants to run a specific ComfyUI workflow
- User wants to chain generative steps (txt2img → upscale → face restore)
- User needs ControlNet, inpainting, img2img, or other advanced pipelines
- User asks to manage ComfyUI queue or check generation progress
## Setup
ComfyUI must be running and reachable. Three options:
### Option A: Local
**Requires Python 3.10+.**
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
python3 -m venv venv && source venv/bin/activate
pip install torch torchvision torchaudio
pip install -r requirements.txt
python main.py --listen 127.0.0.1 --port 8188
GPU acceleration is auto-detected (CUDA on NVIDIA, MPS on Apple Silicon).
### Option B: Comfy Cloud
1. Sign up at https://platform.comfy.org
2. Generate an API key at https://platform.comfy.org/profile/api-keys (**requires paid plan**)
3. Set in `~/.hermes/.env`:
```
COMFYUI_URL=https://cloud.comfy.org/api
COMFYUI_API_KEY=<your-key>
```
### Option C: Remote instance
Point `COMFYUI_URL` at any reachable ComfyUI server:
```
COMFYUI_URL=http://192.168.1.100:8188
```
### Verify connection
```python
from hermes_tools import terminal
r = terminal("curl -s ${COMFYUI_URL:-http://127.0.0.1:8188}/system_stats | python3 -m json.tool | head -5")
print(r["output"])
```
## Core Pattern — ComfyUI Helper
Use this helper inside `execute_code` for all ComfyUI interactions:
```python
import json, time, urllib.request, urllib.parse, urllib.error, uuid, os
COMFY_URL = os.getenv("COMFYUI_URL", "http://127.0.0.1:8188")
COMFY_API_KEY = os.getenv("COMFYUI_API_KEY", "")
def comfy_api(method, path, data=None, timeout=30):
"""Send a request to the ComfyUI API."""
url = f"{COMFY_URL}{path}"
body = json.dumps(data).encode() if data else None
req = urllib.request.Request(url, data=body, method=method)
if body:
req.add_header("Content-Type", "application/json")
if COMFY_API_KEY:
req.add_header("X-API-Key", COMFY_API_KEY)
with urllib.request.urlopen(req, timeout=timeout) as resp:
return json.loads(resp.read())
def queue_prompt(workflow, client_id=None):
"""Queue a workflow for execution. Returns prompt_id."""
client_id = client_id or str(uuid.uuid4())
result = comfy_api("POST", "/prompt", {
"prompt": workflow,
"client_id": client_id,
})
return result["prompt_id"]
def wait_for_completion(prompt_id, timeout=300, poll_interval=2):
"""Poll /history until the prompt completes. Returns output dict."""
deadline = time.time() + timeout
while time.time() < deadline:
history = comfy_api("GET", f"/history/{prompt_id}")
if prompt_id in history:
return history[prompt_id]
time.sleep(poll_interval)
raise TimeoutError(f"Prompt {prompt_id} did not complete in {timeout}s")
def get_image(filename, subfolder="", img_type="output"):
"""Download a generated image. Returns bytes."""
params = urllib.parse.urlencode({
"filename": filename, "subfolder": subfolder, "type": img_type
})
url = f"{COMFY_URL}/view?{params}"
req = urllib.request.Request(url)
if COMFY_API_KEY:
req.add_header("X-API-Key", COMFY_API_KEY)
with urllib.request.urlopen(req) as resp:
return resp.read()
def upload_image(filepath, img_type="input", overwrite=True):
"""Upload an image to ComfyUI. Returns server-side filename."""
import mimetypes
boundary = uuid.uuid4().hex
filename = os.path.basename(filepath)
mime = mimetypes.guess_type(filepath)[0] or "image/png"
with open(filepath, "rb") as f:
file_data = f.read()
body = (
f"--{boundary}\r\n"
f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
f"Content-Type: {mime}\r\n\r\n"
).encode() + file_data + (
f"\r\n--{boundary}\r\n"
f'Content-Disposition: form-data; name="type"\r\n\r\n'
f"{img_type}\r\n"
f"--{boundary}\r\n"
f'Content-Disposition: form-data; name="overwrite"\r\n\r\n'
f"{'true' if overwrite else 'false'}\r\n"
f"--{boundary}--\r\n"
).encode()
req = urllib.request.Request(
f"{COMFY_URL}/upload/image", data=body, method="POST",
headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
)
if COMFY_API_KEY:
req.add_header("X-API-Key", COMFY_API_KEY)
with urllib.request.urlopen(req) as resp:
return json.loads(resp.read())
def list_models(folder="checkpoints"):
"""List available models in a folder (checkpoints, loras, vae, etc.)."""
return comfy_api("GET", f"/models/{folder}")
def get_queue_status():
"""Get current queue (running + pending)."""
return comfy_api("GET", "/queue")
def interrupt():
"""Interrupt the currently running generation."""
return comfy_api("POST", "/interrupt")
```
## Common Workflows
### Text-to-Image (Minimal)
Always call `list_models("checkpoints")` first to get the exact filename.
```python
# Discover which checkpoint is installed
models = list_models("checkpoints")
ckpt = models[0] # use first available
workflow = {
"3": {
"class_type": "KSampler",
"inputs": {
"seed": 42,
"steps": 20,
"cfg": 7.0,
"sampler_name": "euler",
"scheduler": "normal",
"denoise": 1.0,
"model": ["4", 0],
"positive": ["6", 0],
"negative": ["7", 0],
"latent_image": ["5", 0],
},
},
"4": {
"class_type": "CheckpointLoaderSimple",
"inputs": {"ckpt_name": ckpt},
},
"5": {
"class_type": "EmptyLatentImage",
"inputs": {"width": 512, "height": 512, "batch_size": 1},
},
"6": {
"class_type": "CLIPTextEncode",
"inputs": {
"text": "a beautiful sunset over mountains, photorealistic",
"clip": ["4", 1],
},
},
"7": {
"class_type": "CLIPTextEncode",
"inputs": {
"text": "ugly, blurry, low quality",
"clip": ["4", 1],
},
},
"8": {
"class_type": "VAEDecode",
"inputs": {"samples": ["3", 0], "vae": ["4", 2]},
},
"9": {
"class_type": "SaveImage",
"inputs": {"filename_prefix": "hermes", "images": ["8", 0]},
},
}
pid = queue_prompt(workflow)
result = wait_for_completion(pid)
# Extract output image filename
for node_id, node_output in result["outputs"].items():
if "images" in node_output:
for img in node_output["images"]:
img_data = get_image(img["filename"], img["subfolder"], img["type"])
with open(f"/tmp/{img['filename']}", "wb") as f:
f.write(img_data)
print(f"Saved: /tmp/{img['filename']}")
```
### Parameterized Generation
When the user asks to generate an image, build the workflow by modifying the template:
- **Prompt**: Set node "6" inputs.text to the user's positive prompt
- **Negative**: Set node "7" inputs.text (default: "ugly, blurry, low quality")
- **Model**: Set node "4" inputs.ckpt_name (use `list_models()` to find available ones)
- **Size**: Set node "5" inputs.width/height (SD 1.5: 512, SDXL: 1024, Flux: 1024)
- **Steps/CFG**: Set node "3" inputs.steps and inputs.cfg
- **Seed**: Set node "3" inputs.seed (random for variation, fixed for reproducibility)
### Loading User Workflows
Users often have saved workflow JSON files. Two formats exist:
1. **API format** — flat node dict, directly usable with `queue_prompt()`:
```python
with open("workflow_api.json") as f:
workflow = json.load(f)
pid = queue_prompt(workflow)
```
2. **UI format** — includes visual layout, NOT directly usable. Look for the
`"prompt"` key inside the exported data, or ask the user to export as API format
from ComfyUI's menu: Save (API Format).
### Checking Available Nodes
```python
# List all available node types
info = comfy_api("GET", "/object_info")
print(f"Total node types: {len(info)}")
# Get info for a specific node
ksampler_info = comfy_api("GET", "/object_info/KSampler")
print(json.dumps(ksampler_info, indent=2)[:500])
```
## Queue Management
```python
# Check what's running/pending
status = get_queue_status()
running = status.get("queue_running", [])
pending = status.get("queue_pending", [])
print(f"Running: {len(running)}, Pending: {len(pending)}")
# Cancel everything
if pending:
comfy_api("POST", "/queue", {"clear": True})
# Interrupt current generation
interrupt()
```
## Advanced: Native MCP Server Integration
For deeper integration with dedicated MCP tools, configure an external ComfyUI
MCP server in `~/.hermes/config.yaml`:
```yaml
mcp_servers:
comfyui:
command: "npx"
args: ["-y", "comfyui-mcp-server"]
env:
COMFYUI_URL: "http://127.0.0.1:8188"
```
This registers ComfyUI operations as native Hermes tools (prefixed `mcp_comfyui_*`).
See the `native-mcp` skill for MCP server configuration details.
## Pitfalls
1. **Python 3.10+ required**: ComfyUI's dependencies require Python 3.10+.
2. **API format vs UI format**: ComfyUI Save produces UI format (with layout info).
Only API format works with POST /prompt. Use "Save (API Format)" or extract
the `"prompt"` key from the UI format JSON.
3. **Node IDs are strings**: Always use `"3"` not `3` in workflow dicts. Links
between nodes use `["source_node_id", output_index]` arrays.
4. **Model names must be exact**: Use `list_models("checkpoints")` to get the
exact filename including extension. Names are case-sensitive.
5. **Long generations**: Complex workflows (high steps, large images, video) can
take minutes. Set `wait_for_completion(timeout=600)` for heavy workloads.
6. **VRAM/memory exhaustion**: Large models + high resolution can OOM. Use
`comfy_api("POST", "/free", {"unload_models": True})` to free memory between
generations, or start ComfyUI with `--lowvram` / `--cpu` flags.
7. **Custom nodes**: Many workflows require custom nodes (ControlNet, IPAdapter,
AnimateDiff, etc.). If a workflow fails with "class_type not found", the user
needs to install the missing node pack via ComfyUI Manager or manually.
8. **Output path**: Generated images are saved in ComfyUI's `output/` directory.
Use `get_image()` to download them to a local path the user can access.
9. **Concurrent generations**: ComfyUI queues prompts sequentially by default.
Multiple `queue_prompt()` calls will queue, not parallelize.
10. **Sampler/scheduler compatibility**: Not all combinations work with all models.
Safe defaults — SD 1.5/SDXL: `euler` + `normal`, CFG 7.0.
Flux: `euler` + `simple`, CFG 1.0. SD3: `euler` + `sgm_uniform`, CFG 4.5.
@@ -0,0 +1,97 @@
# ComfyUI REST API Reference
Default: `http://127.0.0.1:8188`. Cloud: `https://cloud.comfy.org/api`.
## Workflow Execution
### POST /prompt — Queue a workflow
```json
{
"prompt": { "<workflow nodes dict>" },
"client_id": "optional-uuid"
}
```
Response: `{"prompt_id": "uuid", "number": 1, "node_errors": {}}`
### GET /history/{prompt_id} — Single prompt history
Returns: `{ "prompt_id": { "prompt": [...], "outputs": {...}, "status": {...} } }`
Empty dict `{}` if not yet complete.
### GET /history — All execution history
Query params: `?max_items=200&offset=0`
### POST /interrupt — Stop current generation
### GET /queue — Queue status
Returns: `{"queue_running": [...], "queue_pending": [...]}`
### POST /queue — Manage queue
Body: `{"clear": true}` to clear all, or `{"delete": ["prompt_id1", ...]}`.
## Images
### GET /view — Download image
Query params: `filename` (required), `type` (`output`|`input`|`temp`), `subfolder`.
### POST /upload/image — Upload image
Multipart form: `image` (file), `type` (`input`), `subfolder`, `overwrite` (`true`|`false`).
Response: `{"name": "filename.png", "subfolder": "", "type": "input"}`
## Node/Model Information
### GET /object_info — All node types
Returns every registered node with inputs, outputs, types, defaults, category.
### GET /object_info/{class_type} — Single node info
### GET /models/{folder} — List models
Folders: `checkpoints`, `loras`, `vae`, `controlnet`, `clip`, `clip_vision`,
`upscale_models`, `embeddings`, `unet`, `diffusion_models`.
Returns: array of filename strings.
## System
### GET /system_stats — System information
Returns: OS, Python version, PyTorch version, VRAM per device, RAM total/free.
### POST /free — Free memory
Body: `{"unload_models": true, "free_memory": true}`
## Workflow JSON Format (API Format)
```json
{
"node_id_string": {
"class_type": "NodeClassName",
"inputs": {
"param_name": "value",
"linked_input": ["source_node_id", output_index]
}
}
}
```
- Node IDs are **strings** (`"3"`, not `3`)
- Links use `["node_id", output_index]` arrays (0-based int)
- `class_type` must match a registered node exactly (case-sensitive)
## WebSocket (real-time progress)
Connect to: `ws://host:8188/ws?clientId={uuid}`
Key events: `execution_start`, `executing` (null = done), `progress`, `execution_success`, `execution_error`.
@@ -0,0 +1,121 @@
# ComfyUI Workflow Recipes
Ready-to-use workflow templates. Always call `list_models("checkpoints")` first
to discover the exact checkpoint filename on the user's system.
## SDXL Text-to-Image
```python
workflow = {
"1": {
"class_type": "CheckpointLoaderSimple",
"inputs": {"ckpt_name": "SDXL_CHECKPOINT_HERE"},
},
"2": {
"class_type": "CLIPTextEncode",
"inputs": {"text": "POSITIVE PROMPT", "clip": ["1", 1]},
},
"3": {
"class_type": "CLIPTextEncode",
"inputs": {"text": "ugly, blurry, low quality, deformed", "clip": ["1", 1]},
},
"4": {
"class_type": "EmptyLatentImage",
"inputs": {"width": 1024, "height": 1024, "batch_size": 1},
},
"5": {
"class_type": "KSampler",
"inputs": {
"seed": 0, "steps": 25, "cfg": 7.0,
"sampler_name": "euler_ancestral", "scheduler": "normal",
"denoise": 1.0,
"model": ["1", 0], "positive": ["2", 0],
"negative": ["3", 0], "latent_image": ["4", 0],
},
},
"6": {
"class_type": "VAEDecode",
"inputs": {"samples": ["5", 0], "vae": ["1", 2]},
},
"7": {
"class_type": "SaveImage",
"inputs": {"filename_prefix": "hermes_sdxl", "images": ["6", 0]},
},
}
```
SDXL sizes: 1024×1024, 1152×896, 896×1152. Steps 20-30. CFG 5-9.
## Image-to-Image
```python
# Upload the input image first
result = upload_image("/path/to/input.png")
input_name = result["name"]
workflow = {
"1": {"class_type": "CheckpointLoaderSimple", "inputs": {"ckpt_name": "CHECKPOINT"}},
"2": {"class_type": "LoadImage", "inputs": {"image": input_name}},
"3": {"class_type": "VAEEncode", "inputs": {"pixels": ["2", 0], "vae": ["1", 2]}},
"4": {"class_type": "CLIPTextEncode", "inputs": {"text": "POSITIVE", "clip": ["1", 1]}},
"5": {"class_type": "CLIPTextEncode", "inputs": {"text": "ugly, blurry", "clip": ["1", 1]}},
"6": {
"class_type": "KSampler",
"inputs": {
"seed": 0, "steps": 20, "cfg": 7.0,
"sampler_name": "euler", "scheduler": "normal",
"denoise": 0.6,
"model": ["1", 0], "positive": ["4", 0],
"negative": ["5", 0], "latent_image": ["3", 0],
},
},
"7": {"class_type": "VAEDecode", "inputs": {"samples": ["6", 0], "vae": ["1", 2]}},
"8": {"class_type": "SaveImage", "inputs": {"filename_prefix": "hermes_img2img", "images": ["7", 0]}},
}
```
Key: **denoise** (0.3 = subtle, 0.6 = moderate, 0.9 = heavy changes).
## Flux Text-to-Image
Flux uses separate UNET/CLIP/VAE loaders (not CheckpointLoaderSimple).
```python
workflow = {
"1": {"class_type": "UNETLoader", "inputs": {"unet_name": "FLUX_UNET", "weight_dtype": "default"}},
"2": {"class_type": "DualCLIPLoader", "inputs": {"clip_name1": "T5_CLIP", "clip_name2": "CLIP_L", "type": "flux"}},
"3": {"class_type": "VAELoader", "inputs": {"vae_name": "VAE_NAME"}},
"4": {"class_type": "CLIPTextEncode", "inputs": {"text": "PROMPT", "clip": ["2", 0]}},
"5": {"class_type": "EmptySD3LatentImage", "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
"6": {
"class_type": "KSampler",
"inputs": {
"seed": 0, "steps": 20, "cfg": 1.0,
"sampler_name": "euler", "scheduler": "simple",
"denoise": 1.0,
"model": ["1", 0], "positive": ["4", 0],
"negative": ["4", 0], "latent_image": ["5", 0],
},
},
"7": {"class_type": "VAEDecode", "inputs": {"samples": ["6", 0], "vae": ["3", 0]}},
"8": {"class_type": "SaveImage", "inputs": {"filename_prefix": "hermes_flux", "images": ["7", 0]}},
}
```
Flux: CFG 1.0, `euler` + `simple`. Negative prompt has minimal effect.
## Execution Pattern (all recipes)
```python
pid = queue_prompt(workflow)
result = wait_for_completion(pid, timeout=300)
for node_id, node_output in result["outputs"].items():
if "images" in node_output:
for img_info in node_output["images"]:
data = get_image(img_info["filename"], img_info["subfolder"], img_info["type"])
local_path = f"/tmp/{img_info['filename']}"
with open(local_path, "wb") as f:
f.write(data)
print(f"Saved: {local_path} ({len(data)} bytes)")
```
@@ -0,0 +1,361 @@
---
name: concept-diagrams
description: Generate flat, minimal light/dark-aware SVG diagrams as standalone HTML files, using a unified educational visual language with 9 semantic color ramps, sentence-case typography, and automatic dark mode. Best suited for educational and non-software visuals — physics setups, chemistry mechanisms, math curves, physical objects (aircraft, turbines, smartphones, mechanical watches), anatomy, floor plans, cross-sections, narrative journeys (lifecycle of X, process of Y), hub-spoke system integrations (smart city, IoT), and exploded layer views. If a more specialized skill exists for the subject (dedicated software/cloud architecture, hand-drawn sketches, animated explainers, etc.), prefer that — otherwise this skill can also serve as a general-purpose SVG diagram fallback with a clean educational look. Ships with 15 example diagrams.
version: 0.1.0
author: v1k22 (original PR), ported into hermes-agent
license: MIT
dependencies: []
metadata:
hermes:
tags: [diagrams, svg, visualization, education, physics, chemistry, engineering]
related_skills: [architecture-diagram, excalidraw, generative-widgets]
---
# Concept Diagrams
Generate production-quality SVG diagrams with a unified flat, minimal design system. Output is a single self-contained HTML file that renders identically in any modern browser, with automatic light/dark mode.
## Scope
**Best suited for:**
- Physics setups, chemistry mechanisms, math curves, biology
- Physical objects (aircraft, turbines, smartphones, mechanical watches, cells)
- Anatomy, cross-sections, exploded layer views
- Floor plans, architectural conversions
- Narrative journeys (lifecycle of X, process of Y)
- Hub-spoke system integrations (smart city, IoT networks, electricity grids)
- Educational / textbook-style visuals in any domain
- Quantitative charts (grouped bars, energy profiles)
**Look elsewhere first for:**
- Dedicated software / cloud infrastructure architecture with a dark tech aesthetic (consider `architecture-diagram` if available)
- Hand-drawn whiteboard sketches (consider `excalidraw` if available)
- Animated explainers or video output (consider an animation skill)
If a more specialized skill is available for the subject, prefer that. If none fits, this skill can serve as a general-purpose SVG diagram fallback — the output will carry the clean educational aesthetic described below, which is a reasonable default for almost any subject.
## Workflow
1. Decide on the diagram type (see Diagram Types below).
2. Lay out components using the Design System rules.
3. Write the full HTML page using `templates/template.html` as the wrapper — paste your SVG where the template says `<!-- PASTE SVG HERE -->`.
4. Save as a standalone `.html` file (for example `~/my-diagram.html` or `./my-diagram.html`).
5. User opens it directly in a browser — no server, no dependencies.
Optional: if the user wants a browsable gallery of multiple diagrams, see "Local Preview Server" at the bottom.
Load the HTML template:
```
skill_view(name="concept-diagrams", file_path="templates/template.html")
```
The template embeds the full CSS design system (`c-*` color classes, text classes, light/dark variables, arrow marker styles). The SVG you generate relies on these classes being present on the hosting page.
---
## Design System
### Philosophy
- **Flat**: no gradients, drop shadows, blur, glow, or neon effects.
- **Minimal**: show the essential. No decorative icons inside boxes.
- **Consistent**: same colors, spacing, typography, and stroke widths across every diagram.
- **Dark-mode ready**: all colors auto-adapt via CSS classes — no per-mode SVG.
### Color Palette
9 color ramps, each with 7 stops. Put the class name on a `<g>` or shape element; the template CSS handles both modes.
| Class | 50 (lightest) | 100 | 200 | 400 | 600 | 800 | 900 (darkest) |
|------------|---------------|---------|---------|---------|---------|---------|---------------|
| `c-purple` | #EEEDFE | #CECBF6 | #AFA9EC | #7F77DD | #534AB7 | #3C3489 | #26215C |
| `c-teal` | #E1F5EE | #9FE1CB | #5DCAA5 | #1D9E75 | #0F6E56 | #085041 | #04342C |
| `c-coral` | #FAECE7 | #F5C4B3 | #F0997B | #D85A30 | #993C1D | #712B13 | #4A1B0C |
| `c-pink` | #FBEAF0 | #F4C0D1 | #ED93B1 | #D4537E | #993556 | #72243E | #4B1528 |
| `c-gray` | #F1EFE8 | #D3D1C7 | #B4B2A9 | #888780 | #5F5E5A | #444441 | #2C2C2A |
| `c-blue` | #E6F1FB | #B5D4F4 | #85B7EB | #378ADD | #185FA5 | #0C447C | #042C53 |
| `c-green` | #EAF3DE | #C0DD97 | #97C459 | #639922 | #3B6D11 | #27500A | #173404 |
| `c-amber` | #FAEEDA | #FAC775 | #EF9F27 | #BA7517 | #854F0B | #633806 | #412402 |
| `c-red` | #FCEBEB | #F7C1C1 | #F09595 | #E24B4A | #A32D2D | #791F1F | #501313 |
#### Color Assignment Rules
Color encodes **meaning**, not sequence. Never cycle through colors like a rainbow.
- Group nodes by **category** — all nodes of the same type share one color.
- Use `c-gray` for neutral/structural nodes (start, end, generic steps, users).
- Use **2-3 colors per diagram**, not 6+.
- Prefer `c-purple`, `c-teal`, `c-coral`, `c-pink` for general categories.
- Reserve `c-blue`, `c-green`, `c-amber`, `c-red` for semantic meaning (info, success, warning, error).
Light/dark stop mapping (handled by the template CSS — just use the class):
- Light mode: 50 fill + 600 stroke + 800 title / 600 subtitle
- Dark mode: 800 fill + 200 stroke + 100 title / 200 subtitle
### Typography
Only two font sizes. No exceptions.
| Class | Size | Weight | Use |
|-------|------|--------|-----|
| `th` | 14px | 500 | Node titles, region labels |
| `ts` | 12px | 400 | Subtitles, descriptions, arrow labels |
| `t` | 14px | 400 | General text |
- **Sentence case always.** Never Title Case, never ALL CAPS.
- Every `<text>` MUST carry a class (`t`, `ts`, or `th`). No unclassed text.
- `dominant-baseline="central"` on all text inside boxes.
- `text-anchor="middle"` for centered text in boxes.
**Width estimation (approx):**
- 14px weight 500: ~8px per character
- 12px weight 400: ~6.5px per character
- Always verify: `box_width >= (char_count × px_per_char) + 48` (24px padding each side)
### Spacing & Layout
- **ViewBox**: `viewBox="0 0 680 H"` where H = content height + 40px buffer.
- **Safe area**: x=40 to x=640, y=40 to y=(H-40).
- **Between boxes**: 60px minimum gap.
- **Inside boxes**: 24px horizontal padding, 12px vertical padding.
- **Arrowhead gap**: 10px between arrowhead and box edge.
- **Single-line box**: 44px height.
- **Two-line box**: 56px height, 18px between title and subtitle baselines.
- **Container padding**: 20px minimum inside every container.
- **Max nesting**: 2-3 levels deep. Deeper gets unreadable at 680px width.
### Stroke & Shape
- **Stroke width**: 0.5px on all node borders. Not 1px, not 2px.
- **Rect rounding**: `rx="8"` for nodes, `rx="12"` for inner containers, `rx="16"` to `rx="20"` for outer containers.
- **Connector paths**: MUST have `fill="none"`. SVG defaults to `fill: black` otherwise.
### Arrow Marker
Include this `<defs>` block at the start of **every** SVG:
```xml
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
</defs>
```
Use `marker-end="url(#arrow)"` on lines. The arrowhead inherits the line color via `context-stroke`.
### CSS Classes (Provided by the Template)
The template page provides:
- Text: `.t`, `.ts`, `.th`
- Neutral: `.box`, `.arr`, `.leader`, `.node`
- Color ramps: `.c-purple`, `.c-teal`, `.c-coral`, `.c-pink`, `.c-gray`, `.c-blue`, `.c-green`, `.c-amber`, `.c-red` (all with automatic light/dark mode)
You do **not** need to redefine these — just apply them in your SVG. The template file contains the full CSS definitions.
---
## SVG Boilerplate
Every SVG inside the template page starts with this exact structure:
```xml
<svg width="100%" viewBox="0 0 680 {HEIGHT}" xmlns="http://www.w3.org/2000/svg">
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
</defs>
<!-- Diagram content here -->
</svg>
```
Replace `{HEIGHT}` with the actual computed height (last element bottom + 40px).
### Node Patterns
**Single-line node (44px):**
```xml
<g class="node c-blue">
<rect x="100" y="20" width="180" height="44" rx="8" stroke-width="0.5"/>
<text class="th" x="190" y="42" text-anchor="middle" dominant-baseline="central">Service name</text>
</g>
```
**Two-line node (56px):**
```xml
<g class="node c-teal">
<rect x="100" y="20" width="200" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="200" y="38" text-anchor="middle" dominant-baseline="central">Service name</text>
<text class="ts" x="200" y="56" text-anchor="middle" dominant-baseline="central">Short description</text>
</g>
```
**Connector (no label):**
```xml
<line x1="200" y1="76" x2="200" y2="120" class="arr" marker-end="url(#arrow)"/>
```
**Container (dashed or solid):**
```xml
<g class="c-purple">
<rect x="40" y="92" width="600" height="300" rx="16" stroke-width="0.5"/>
<text class="th" x="66" y="116">Container label</text>
<text class="ts" x="66" y="134">Subtitle info</text>
</g>
```
---
## Diagram Types
Choose the layout that fits the subject:
1. **Flowchart** — CI/CD pipelines, request lifecycles, approval workflows, data processing. Single-direction flow (top-down or left-right). Max 4-5 nodes per row.
2. **Structural / Containment** — Cloud infrastructure nesting, system architecture with layers. Large outer containers with inner regions. Dashed rects for logical groupings.
3. **API / Endpoint Map** — REST routes, GraphQL schemas. Tree from root, branching to resource groups, each containing endpoint nodes.
4. **Microservice Topology** — Service mesh, event-driven systems. Services as nodes, arrows for communication patterns, message queues between.
5. **Data Flow** — ETL pipelines, streaming architectures. Left-to-right flow from sources through processing to sinks.
6. **Physical / Structural** — Vehicles, buildings, hardware, anatomy. Use shapes that match the physical form — `<path>` for curved bodies, `<polygon>` for tapered shapes, `<ellipse>`/`<circle>` for cylindrical parts, nested `<rect>` for compartments. See `references/physical-shape-cookbook.md`.
7. **Infrastructure / Systems Integration** — Smart cities, IoT networks, multi-domain systems. Hub-spoke layout with central platform connecting subsystems. Semantic line styles (`.data-line`, `.power-line`, `.water-pipe`, `.road`). See `references/infrastructure-patterns.md`.
8. **UI / Dashboard Mockups** — Admin panels, monitoring dashboards. Screen frame with nested chart/gauge/indicator elements. See `references/dashboard-patterns.md`.
For physical, infrastructure, and dashboard diagrams, load the matching reference file before generating — each one provides ready-made CSS classes and shape primitives.
---
## Validation Checklist
Before finalizing any SVG, verify ALL of the following:
1. Every `<text>` has class `t`, `ts`, or `th`.
2. Every `<text>` inside a box has `dominant-baseline="central"`.
3. Every connector `<path>` or `<line>` used as arrow has `fill="none"`.
4. No arrow line crosses through an unrelated box.
5. `box_width >= (longest_label_chars × 8) + 48` for 14px text.
6. `box_width >= (longest_label_chars × 6.5) + 48` for 12px text.
7. ViewBox height = bottom-most element + 40px.
8. All content stays within x=40 to x=640.
9. Color classes (`c-*`) are on `<g>` or shape elements, never on `<path>` connectors.
10. Arrow `<defs>` block is present.
11. No gradients, shadows, blur, or glow effects.
12. Stroke width is 0.5px on all node borders.
---
## Output & Preview
### Default: standalone HTML file
Write a single `.html` file the user can open directly. No server, no dependencies, works offline. Pattern:
```python
# 1. Load the template
template = skill_view("concept-diagrams", "templates/template.html")
# 2. Fill in title, subtitle, and paste your SVG
html = template.replace(
"<!-- DIAGRAM TITLE HERE -->", "SN2 reaction mechanism"
).replace(
"<!-- OPTIONAL SUBTITLE HERE -->", "Bimolecular nucleophilic substitution"
).replace(
"<!-- PASTE SVG HERE -->", svg_content
)
# 3. Write to a user-chosen path (or ./ by default)
write_file("./sn2-mechanism.html", html)
```
Tell the user how to open it:
```
# macOS
open ./sn2-mechanism.html
# Linux
xdg-open ./sn2-mechanism.html
```
### Optional: local preview server (multi-diagram gallery)
Only use this when the user explicitly wants a browsable gallery of multiple diagrams.
**Rules:**
- Bind to `127.0.0.1` only. Never `0.0.0.0`. Exposing diagrams on all network interfaces is a security hazard on shared networks.
- Pick a free port (do NOT hard-code one) and tell the user the chosen URL.
- The server is optional and opt-in — prefer the standalone HTML file first.
Recommended pattern (lets the OS pick a free ephemeral port):
```bash
# Put each diagram in its own folder under .diagrams/
mkdir -p .diagrams/sn2-mechanism
# ...write .diagrams/sn2-mechanism/index.html...
# Serve on loopback only, free port
cd .diagrams && python3 -c "
import http.server, socketserver
with socketserver.TCPServer(('127.0.0.1', 0), http.server.SimpleHTTPRequestHandler) as s:
print(f'Serving at http://127.0.0.1:{s.server_address[1]}/')
s.serve_forever()
" &
```
If the user insists on a fixed port, use `127.0.0.1:<port>` — still never `0.0.0.0`. Document how to stop the server (`kill %1` or `pkill -f "http.server"`).
---
## Examples Reference
The `examples/` directory ships 15 complete, tested diagrams. Browse them for working patterns before writing a new diagram of a similar type:
| File | Type | Demonstrates |
|------|------|--------------|
| `hospital-emergency-department-flow.md` | Flowchart | Priority routing with semantic colors |
| `feature-film-production-pipeline.md` | Flowchart | Phased workflow, horizontal sub-flows |
| `automated-password-reset-flow.md` | Flowchart | Auth flow with error branches |
| `autonomous-llm-research-agent-flow.md` | Flowchart | Loop-back arrows, decision branches |
| `place-order-uml-sequence.md` | Sequence | UML sequence diagram style |
| `commercial-aircraft-structure.md` | Physical | Paths, polygons, ellipses for realistic shapes |
| `wind-turbine-structure.md` | Physical cross-section | Underground/above-ground separation, color coding |
| `smartphone-layer-anatomy.md` | Exploded view | Alternating left/right labels, layered components |
| `apartment-floor-plan-conversion.md` | Floor plan | Walls, doors, proposed changes in dotted red |
| `banana-journey-tree-to-smoothie.md` | Narrative journey | Winding path, progressive state changes |
| `cpu-ooo-microarchitecture.md` | Hardware pipeline | Fan-out, memory hierarchy sidebar |
| `sn2-reaction-mechanism.md` | Chemistry | Molecules, curved arrows, energy profile |
| `smart-city-infrastructure.md` | Hub-spoke | Semantic line styles per system |
| `electricity-grid-flow.md` | Multi-stage flow | Voltage hierarchy, flow markers |
| `ml-benchmark-grouped-bar-chart.md` | Chart | Grouped bars, dual axis |
Load any example with:
```
skill_view(name="concept-diagrams", file_path="examples/<filename>")
```
---
## Quick Reference: What to Use When
| User says | Diagram type | Suggested colors |
|-----------|--------------|------------------|
| "show the pipeline" | Flowchart | gray start/end, purple steps, red errors, teal deploy |
| "draw the data flow" | Data pipeline (left-right) | gray sources, purple processing, teal sinks |
| "visualize the system" | Structural (containment) | purple container, teal services, coral data |
| "map the endpoints" | API tree | purple root, one ramp per resource group |
| "show the services" | Microservice topology | gray ingress, teal services, purple bus, coral workers |
| "draw the aircraft/vehicle" | Physical | paths, polygons, ellipses for realistic shapes |
| "smart city / IoT" | Hub-spoke integration | semantic line styles per subsystem |
| "show the dashboard" | UI mockup | dark screen, chart colors: teal, purple, coral for alerts |
| "power grid / electricity" | Multi-stage flow | voltage hierarchy (HV/MV/LV line weights) |
| "wind turbine / turbine" | Physical cross-section | foundation + tower cutaway + nacelle color-coded |
| "journey of X / lifecycle" | Narrative journey | winding path, progressive state changes |
| "layers of X / exploded" | Exploded layer view | vertical stack, alternating labels |
| "CPU / pipeline" | Hardware pipeline | vertical stages, fan-out to execution ports |
| "floor plan / apartment" | Floor plan | walls, doors, proposed changes in dotted red |
| "reaction mechanism" | Chemistry | atoms, bonds, curved arrows, transition state, energy profile |
@@ -0,0 +1,244 @@
# Apartment Floor Plan: 3 BHK to 4 BHK Conversion
An architectural floor plan showing a 1,500 sq ft apartment with proposed modifications to convert from 3 BHK to 4 BHK. Demonstrates architectural drawing conventions, room layouts, proposed changes with dotted lines, and area comparison tables.
## Key Patterns Used
- **Architectural floor plan**: Top-down view with walls, doors, windows
- **Proposed modifications**: Dotted red lines for new walls
- **Room color coding**: Light fills to distinguish room types
- **Circulation paths**: Arrows showing new access routes
- **Data table**: Before/after area comparison with highlighting
- **Architectural symbols**: North arrow, scale bar, door swings
## Diagram Type
This is an **architectural floor plan** with:
- **Plan view**: Top-down orthographic projection
- **Overlay technique**: Existing structure + proposed changes
- **Quantitative data**: Area measurements and comparison table
## Architectural Drawing Elements
### Wall Styles
```xml
<!-- Outer walls (thick) -->
<line class="wall" x1="0" y1="0" x2="560" y2="0"/>
<!-- Internal walls (thinner) -->
<line class="wall-thin" x1="180" y1="0" x2="180" y2="140"/>
<!-- Proposed new walls (dotted red) -->
<line class="proposed-wall" x1="125" y1="170" x2="125" y2="330"/>
```
```css
.wall { stroke: var(--text-primary); stroke-width: 6; fill: none; stroke-linecap: square; }
.wall-thin { stroke: var(--text-primary); stroke-width: 3; fill: none; }
.proposed-wall { stroke: #A32D2D; stroke-width: 4; fill: none; stroke-dasharray: 8 4; }
```
### Door Symbols
```xml
<!-- Door opening with swing arc -->
<rect x="150" y="137" width="25" height="6" fill="var(--bg-primary)"/>
<path class="door" d="M150,140 L150,165"/>
<path class="door-swing" d="M150,140 A25,25 0 0,0 175,140"/>
<!-- Sliding door (balcony) -->
<rect x="60" y="327" width="60" height="6" fill="var(--bg-primary)" stroke="var(--text-secondary)" stroke-width="1"/>
<line x1="60" y1="330" x2="90" y2="330" stroke="var(--text-secondary)" stroke-width="2"/>
<line x1="90" y1="330" x2="120" y2="330" stroke="var(--text-secondary)" stroke-width="2" stroke-dasharray="3 3"/>
<!-- Proposed door (dotted) -->
<rect x="143" y="292" width="22" height="6" fill="var(--bg-primary)" stroke="#A32D2D" stroke-width="1" stroke-dasharray="3 2"/>
<path d="M165,295 A22,22 0 0,0 165,273" stroke="#A32D2D" stroke-width="1" stroke-dasharray="3 2" fill="none"/>
```
```css
.door { stroke: var(--text-secondary); stroke-width: 1.5; fill: none; }
.door-swing { stroke: var(--text-tertiary); stroke-width: 1; fill: none; stroke-dasharray: 3 2; }
```
### Window Symbols
```xml
<!-- Window with glass indication -->
<rect class="window" x="-3" y="30" width="6" height="50"/>
<line class="window-glass" x1="0" y1="35" x2="0" y2="75"/>
<!-- Horizontal window (top wall) -->
<rect class="window" x="220" y="-3" width="60" height="6"/>
<line class="window-glass" x1="225" y1="0" x2="275" y2="0"/>
```
```css
.window { stroke: var(--text-primary); stroke-width: 1; fill: var(--bg-primary); }
.window-glass { stroke: #378ADD; stroke-width: 2; fill: none; }
```
### Room Fills
```xml
<!-- Different colors for room types -->
<rect class="room-master" x="3" y="3" width="174" height="134" rx="2"/>
<rect class="room-bed2" x="183" y="3" width="134" height="104" rx="2"/>
<rect class="room-living" x="3" y="173" width="554" height="154" rx="2"/>
<rect class="room-kitchen" x="443" y="3" width="114" height="104" rx="2"/>
<rect class="room-bath" x="183" y="113" width="54" height="54" rx="2"/>
<!-- Proposed new room (highlighted) -->
<rect class="room-new" x="3" y="223" width="120" height="104"/>
```
```css
.room-master { fill: rgba(206, 203, 246, 0.3); } /* purple tint */
.room-bed2 { fill: rgba(159, 225, 203, 0.3); } /* teal tint */
.room-bed3 { fill: rgba(250, 199, 117, 0.3); } /* amber tint */
.room-living { fill: rgba(245, 196, 179, 0.3); } /* coral tint */
.room-kitchen { fill: rgba(237, 147, 177, 0.3); } /* pink tint */
.room-bath { fill: rgba(133, 183, 235, 0.3); } /* blue tint */
.room-new { fill: rgba(163, 45, 45, 0.15); } /* red tint for proposed */
```
### Support Fixtures
```xml
<!-- Kitchen counter hint -->
<rect x="450" y="15" width="50" height="25" fill="none" stroke="var(--text-tertiary)" stroke-width="0.5" rx="2"/>
<text class="tx" x="475" y="30" text-anchor="middle">Counter</text>
<!-- Balcony (dashed outline) -->
<rect class="balcony-fill" x="3" y="333" width="200" height="50"/>
```
```css
.balcony { fill: none; stroke: var(--text-secondary); stroke-width: 2; stroke-dasharray: 6 3; }
.balcony-fill { fill: rgba(93, 202, 165, 0.1); }
```
### Room Labels
```xml
<!-- Room name and area -->
<text class="room-label" x="90" y="65" text-anchor="middle">MASTER</text>
<text class="room-label" x="90" y="78" text-anchor="middle">BEDROOM</text>
<text class="area-label" x="90" y="95" text-anchor="middle">195 sq ft</text>
<!-- Proposed room (in red) -->
<text class="room-label" x="63" y="268" text-anchor="middle" fill="#A32D2D">BEDROOM 4</text>
<text class="tx" x="63" y="282" text-anchor="middle" fill="#A32D2D">(NEW)</text>
```
```css
.room-label { font-family: system-ui; font-size: 11px; fill: var(--text-primary); font-weight: 500; }
.area-label { font-family: system-ui; font-size: 9px; fill: var(--text-tertiary); }
```
### Circulation Arrow
```xml
<defs>
<marker id="circ-arrow" viewBox="0 0 10 10" refX="8" refY="5" markerWidth="6" markerHeight="6" orient="auto">
<path d="M0,0 L10,5 L0,10 Z" class="circulation-fill"/>
</marker>
</defs>
<path class="circulation" d="M300,250 L200,250 L145,250 L145,280" marker-end="url(#circ-arrow)"/>
<text class="tx" x="250" y="242" fill="#3B6D11" font-weight="500">New corridor access</text>
```
```css
.circulation { stroke: #3B6D11; stroke-width: 2; fill: none; }
.circulation-fill { fill: #3B6D11; }
```
### North Arrow and Scale Bar
```xml
<!-- North arrow -->
<g transform="translate(520, 260)">
<circle cx="0" cy="0" r="20" fill="none" stroke="var(--text-tertiary)" stroke-width="0.5"/>
<polygon points="0,-18 -5,5 0,0 5,5" fill="var(--text-primary)"/>
<text class="tx" x="0" y="-22" text-anchor="middle">N</text>
</g>
<!-- Scale bar -->
<g transform="translate(420, 300)">
<line x1="0" y1="0" x2="100" y2="0" stroke="var(--text-primary)" stroke-width="2"/>
<line x1="0" y1="-5" x2="0" y2="5" stroke="var(--text-primary)" stroke-width="1"/>
<line x1="50" y1="-3" x2="50" y2="3" stroke="var(--text-primary)" stroke-width="1"/>
<line x1="100" y1="-5" x2="100" y2="5" stroke="var(--text-primary)" stroke-width="1"/>
<text class="tx" x="0" y="15" text-anchor="middle">0</text>
<text class="tx" x="50" y="15" text-anchor="middle">5'</text>
<text class="tx" x="100" y="15" text-anchor="middle">10'</text>
</g>
```
## Area Comparison Table
### Table Structure
```xml
<!-- Header row -->
<rect class="table-header" x="0" y="0" width="180" height="28" rx="4 4 0 0"/>
<text class="ts" x="90" y="18" text-anchor="middle" font-weight="500">Room</text>
<!-- Normal row -->
<rect class="table-row" x="0" y="28" width="180" height="24"/>
<text class="tx" x="10" y="44">Master Bedroom</text>
<text class="tx" x="230" y="44" text-anchor="middle">195</text>
<!-- Alternating row -->
<rect class="table-row-alt" x="0" y="52" width="180" height="24"/>
<!-- Highlighted row (for changes) -->
<rect class="table-highlight" x="0" y="100" width="180" height="24"/>
<text class="tx" x="10" y="116" fill="#A32D2D" font-weight="500">Bedroom 4 (NEW)</text>
<text class="tx" x="430" y="116" text-anchor="middle" fill="#3B6D11">+100</text>
<!-- Total row -->
<rect x="0" y="268" width="180" height="28" fill="var(--bg-secondary)" stroke="var(--border)" stroke-width="1"/>
<text class="ts" x="10" y="286" font-weight="500">TOTAL CARPET AREA</text>
```
```css
.table-header { fill: var(--bg-secondary); }
.table-row { fill: var(--bg-primary); stroke: var(--border); stroke-width: 0.5; }
.table-row-alt { fill: var(--bg-tertiary); stroke: var(--border); stroke-width: 0.5; }
.table-highlight { fill: rgba(163, 45, 45, 0.1); stroke: #A32D2D; stroke-width: 0.5; }
```
## Layout Notes
- **ViewBox**: 800×780 (portrait for floor plan + table)
- **Scale**: 10px = 1 foot (apartment ~50ft × 33ft)
- **Floor plan origin**: Offset at (50, 60) for margins
- **Wall thickness**: 6px outer, 3px inner (represents ~6" walls)
- **Room labels**: Centered in each room with area below
- **Table placement**: Below floor plan with full width
## Color Coding
| Element | Color | Usage |
|---------|-------|-------|
| Proposed walls | Red (#A32D2D) dotted | New construction |
| New room fill | Red 15% opacity | Bedroom 4 area |
| Circulation | Green (#3B6D11) | New access path |
| Window glass | Blue (#378ADD) | Glass indication |
| Bedrooms | Purple/Teal/Amber tints | Room differentiation |
| Wet areas | Blue tint | Bathrooms |
| Living | Coral tint | Common areas |
## When to Use This Pattern
Use this diagram style for:
- Apartment/house floor plans
- Office layout planning
- Renovation proposals showing before/after
- Space planning with area calculations
- Real estate marketing materials
- Interior design presentations
- Building permit documentation
@@ -0,0 +1,276 @@
# Automated Password Reset Flow
A two-section flowchart tracing the full user journey for a web application password reset: the initial request phase (forgot password → email check → token generation) and the reset-form phase (link click → new password entry → token/password validation). Demonstrates multi-exit decision diamonds, a three-column branching layout, a loop-back path, and a cross-section separator arrow.
## Key Patterns Used
- **Three-column layout**: Left column (error/terminal branches at cx=115), center column (main happy path at cx=340), right column (expired-token branch at cx=552) — allows side branches to live at the same y-level as center nodes without overlap
- **Decision diamonds with `<polygon>`**: Each decision uses a `<g class="decision">` wrapper containing a `<polygon>` and centered `<text>`; the diamond points are computed as `cx±hw, cy±hh` (hw=100, hh=28)
- **Pill-shaped terminals**: Start and end nodes use `rx=22` on their `<rect>` to signal entry/exit points; all mid-flow process nodes use `rx=8`
- **Three-branch decision paths**: Each diamond has a "Yes" branch (down, short `<line>`) and a "No" branch (`<path>` going horizontal then vertical to a side column)
- **Loop-back path**: Mismatch error node loops back to the password-entry node via a routing corridor at x=215 — a 5-px gap between the left column (right edge x=210) and center column (left edge x=220); the path exits the bottom of the error node, drops below it, travels right to x=215, then goes up to the target node's center y, then right 5 px into the node's left edge
- **Section separator**: A dashed horizontal `<line>` at y=452 splits the two phases; the connecting arrow crosses it with a faded label ("user receives email") to preserve flow continuity
- **Italic annotation**: The exact UX copy for the generic message ("If that email exists…") is shown as a faded italic `ts` text block below the left-branch terminal node
- **Legend row**: Five inline swatches (gray, purple, teal, red, amber diamond) at the bottom explain the color-to-role mapping
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 960" xmlns="http://www.w3.org/2000/svg">
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
</defs>
<!--
Column layout (680px viewBox, safe area x=40640):
Left col : x=20, w=190, cx=115 (error / terminal branches)
Center col: x=220, w=240, cx=340 (main happy path)
Right col: x=465, w=175, cx=552 (expired-token branch)
Loop corridor at x=215 (5-px gap between left and center cols)
-->
<!-- ═══ SECTION 1 — Forgot password request ═══ -->
<text class="ts" x="40" y="38" opacity=".45">Section 1 — Forgot password request</text>
<!-- START terminal (pill rx=22 signals start/end) -->
<g class="c-gray">
<rect x="220" y="46" width="240" height="44" rx="22"/>
<text class="th" x="340" y="68" text-anchor="middle" dominant-baseline="central">User: &quot;Forgot password&quot;</text>
</g>
<line x1="340" y1="90" x2="340" y2="108" class="arr" marker-end="url(#arrow)"/>
<!-- N2 · Enter email -->
<g class="c-gray">
<rect x="220" y="108" width="240" height="44" rx="8"/>
<text class="th" x="340" y="130" text-anchor="middle" dominant-baseline="central">Enter email address</text>
</g>
<line x1="340" y1="152" x2="340" y2="172" class="arr" marker-end="url(#arrow)"/>
<!-- D1 · Email in system? diamond: center=(340,200) hw=100 hh=28 -->
<g class="decision">
<polygon points="340,172 440,200 340,228 240,200"/>
<text class="th" x="340" y="200" text-anchor="middle" dominant-baseline="central">Email in system?</text>
</g>
<!-- D1 "No" → left column -->
<path d="M 240,200 L 115,200 L 115,248" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="178" y="193" text-anchor="middle" opacity=".75">No</text>
<!-- D1 "Yes" → continue down -->
<line x1="340" y1="228" x2="340" y2="248" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="348" y="242" text-anchor="start" opacity=".75">Yes</text>
<!-- ── Left branch (D1 = No): generic security message → end ── -->
<!-- L1 · Generic message (security: never confirm email existence) -->
<g class="c-gray">
<rect x="20" y="248" width="190" height="56" rx="8"/>
<text class="th" x="115" y="269" text-anchor="middle" dominant-baseline="central">Generic message shown</text>
<text class="ts" x="115" y="287" text-anchor="middle" dominant-baseline="central">Email sent if found</text>
</g>
<line x1="115" y1="304" x2="115" y2="324" class="arr" marker-end="url(#arrow)"/>
<!-- L2 · End terminal (left) -->
<g class="c-gray">
<rect x="20" y="324" width="190" height="44" rx="22"/>
<text class="th" x="115" y="346" text-anchor="middle" dominant-baseline="central">Request handled</text>
</g>
<!-- Italic annotation: actual UX copy shown below the end node -->
<text class="ts" x="20" y="384" opacity=".45" font-style="italic">&quot;If that email exists, a reset</text>
<text class="ts" x="20" y="398" opacity=".45" font-style="italic">link has been sent.&quot;</text>
<!-- ── Center Yes branch: system generates & sends token ── -->
<!-- N3 · Generate unique token -->
<g class="c-purple">
<rect x="220" y="248" width="240" height="56" rx="8"/>
<text class="th" x="340" y="269" text-anchor="middle" dominant-baseline="central">Generate unique token</text>
<text class="ts" x="340" y="287" text-anchor="middle" dominant-baseline="central">Time-limited, cryptographic</text>
</g>
<line x1="340" y1="304" x2="340" y2="324" class="arr" marker-end="url(#arrow)"/>
<!-- N4 · Store token + user ID -->
<g class="c-purple">
<rect x="220" y="324" width="240" height="44" rx="8"/>
<text class="th" x="340" y="346" text-anchor="middle" dominant-baseline="central">Store token + user ID</text>
</g>
<line x1="340" y1="368" x2="340" y2="388" class="arr" marker-end="url(#arrow)"/>
<!-- N5 · Send reset email -->
<g class="c-teal">
<rect x="220" y="388" width="240" height="44" rx="8"/>
<text class="th" x="340" y="410" text-anchor="middle" dominant-baseline="central">Send reset link via email</text>
</g>
<!-- ═══ Section separator ═══ -->
<line x1="40" y1="452" x2="640" y2="452"
stroke="var(--border)" stroke-width="1" stroke-dasharray="8 5"/>
<!-- Arrow crossing separator (with inline label) -->
<line x1="340" y1="432" x2="340" y2="472" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="348" y="448" text-anchor="start" opacity=".55">user receives email</text>
<text class="ts" x="40" y="464" opacity=".45">Section 2 — Password reset form</text>
<!-- ═══ SECTION 2 — Password reset form ═══ -->
<!-- N6 · User clicks reset link -->
<g class="c-gray">
<rect x="220" y="480" width="240" height="44" rx="8"/>
<text class="th" x="340" y="502" text-anchor="middle" dominant-baseline="central">User clicks reset link</text>
</g>
<line x1="340" y1="524" x2="340" y2="544" class="arr" marker-end="url(#arrow)"/>
<!-- N7 · Enter new password ×2 -->
<g class="c-gray">
<rect x="220" y="544" width="240" height="56" rx="8"/>
<text class="th" x="340" y="565" text-anchor="middle" dominant-baseline="central">Enter new password ×2</text>
<text class="ts" x="340" y="583" text-anchor="middle" dominant-baseline="central">Confirm both passwords match</text>
</g>
<line x1="340" y1="600" x2="340" y2="620" class="arr" marker-end="url(#arrow)"/>
<!-- D2 · Token expired? diamond: center=(340,648) hw=100 hh=28 -->
<g class="decision">
<polygon points="340,620 440,648 340,676 240,648"/>
<text class="th" x="340" y="648" text-anchor="middle" dominant-baseline="central">Token expired?</text>
</g>
<!-- D2 "Yes" → right column (expired-token branch) -->
<path d="M 440,648 L 552,648 L 552,692" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="496" y="641" text-anchor="middle" opacity=".75">Yes</text>
<!-- D2 "No" → down to password-match check -->
<line x1="340" y1="676" x2="340" y2="714" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="348" y="698" text-anchor="start" opacity=".75">No</text>
<!-- ── Right branch (D2 = Yes): token expired → dead end ── -->
<!-- R1 · Token expired error -->
<g class="c-red">
<rect x="465" y="692" width="175" height="56" rx="8"/>
<text class="th" x="552" y="713" text-anchor="middle" dominant-baseline="central">Token expired</text>
<text class="ts" x="552" y="731" text-anchor="middle" dominant-baseline="central">Show expiry error</text>
</g>
<line x1="552" y1="748" x2="552" y2="768" class="arr" marker-end="url(#arrow)"/>
<!-- R2 · End terminal (right) -->
<g class="c-gray">
<rect x="465" y="768" width="175" height="44" rx="22"/>
<text class="th" x="552" y="790" text-anchor="middle" dominant-baseline="central">End — request again</text>
</g>
<!-- D3 · Passwords match? diamond: center=(340,742) hw=100 hh=28 -->
<g class="decision">
<polygon points="340,714 440,742 340,770 240,742"/>
<text class="th" x="340" y="742" text-anchor="middle" dominant-baseline="central">Passwords match?</text>
</g>
<!-- D3 "No" → left column (mismatch branch) -->
<path d="M 240,742 L 115,742 L 115,786" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="178" y="735" text-anchor="middle" opacity=".75">No</text>
<!-- D3 "Yes" → down to reset -->
<line x1="340" y1="770" x2="340" y2="790" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="348" y="783" text-anchor="start" opacity=".75">Yes</text>
<!-- ── Left branch (D3 = No): passwords don't match → loop back ── -->
<!-- L3 · Password mismatch error -->
<g class="c-red">
<rect x="20" y="786" width="190" height="56" rx="8"/>
<text class="th" x="115" y="807" text-anchor="middle" dominant-baseline="central">Password mismatch</text>
<text class="ts" x="115" y="825" text-anchor="middle" dominant-baseline="central">Passwords do not match</text>
</g>
<!-- Loop-back arrow: exits L3 bottom → drops to y=862 →
travels right to corridor x=215 → climbs to N7 center y=572 →
enters N7 left edge at (220, 572) pointing right -->
<path d="M 115,842 L 115,862 L 215,862 L 215,572 L 220,572"
class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="224" y="538" text-anchor="start" opacity=".6">retry</text>
<!-- ── Center Yes branch (D3 = Yes): reset password & invalidate token ── -->
<!-- N8 · Reset password -->
<g class="c-teal">
<rect x="220" y="790" width="240" height="56" rx="8"/>
<text class="th" x="340" y="811" text-anchor="middle" dominant-baseline="central">Reset password</text>
<text class="ts" x="340" y="829" text-anchor="middle" dominant-baseline="central">Invalidate used token</text>
</g>
<line x1="340" y1="846" x2="340" y2="866" class="arr" marker-end="url(#arrow)"/>
<!-- N9 · Success terminal -->
<g class="c-green">
<rect x="220" y="866" width="240" height="44" rx="22"/>
<text class="th" x="340" y="888" text-anchor="middle" dominant-baseline="central">Password reset complete</text>
</g>
<!-- ═══ Legend ═══ -->
<text class="ts" x="40" y="930" opacity=".4">Legend —</text>
<rect x="108" y="920" width="13" height="13" rx="2" fill="#F1EFE8" stroke="#5F5E5A" stroke-width="0.5"/>
<text class="ts" x="126" y="930" opacity=".7">User action</text>
<rect x="210" y="920" width="13" height="13" rx="2" fill="#EEEDFE" stroke="#534AB7" stroke-width="0.5"/>
<text class="ts" x="228" y="930" opacity=".7">System process</text>
<rect x="334" y="920" width="13" height="13" rx="2" fill="#E1F5EE" stroke="#0F6E56" stroke-width="0.5"/>
<text class="ts" x="352" y="930" opacity=".7">Email / success</text>
<rect x="455" y="920" width="13" height="13" rx="2" fill="#FCEBEB" stroke="#A32D2D" stroke-width="0.5"/>
<text class="ts" x="473" y="930" opacity=".7">Error state</text>
<polygon points="556,926 566,932 556,938 546,932" fill="#FAEEDA" stroke="#854F0B" stroke-width="0.5"/>
<text class="ts" x="572" y="932" opacity=".7">Decision</text>
</svg>
```
## Custom CSS
Add these classes to the hosting page `<style>` block (in addition to the standard skill CSS):
```css
/* Decision diamond — amber fill, same palette as c-amber */
.decision > polygon { fill: #FAEEDA; stroke: #854F0B; stroke-width: 0.5; }
.decision > .th { fill: #633806; }
@media (prefers-color-scheme: dark) {
.decision > polygon { fill: #633806; stroke: #EF9F27; }
.decision > .th { fill: #FAC775; }
}
```
## Color Assignments
| Element | Color | Reason |
|---------|-------|--------|
| Start / end terminals | `c-gray` | Neutral entry and exit points |
| User actions (enter email, click link, enter password) | `c-gray` | User-facing steps with no system processing |
| Generic message + request-handled terminal | `c-gray` | Intentionally neutral — the security message must not reveal data |
| Generate & store token | `c-purple` | Backend system operations |
| Send reset email | `c-teal` | Positive external action (outbound communication) |
| Token expired error | `c-red` | Failure / blocking error state |
| Password mismatch error | `c-red` | Validation failure |
| Reset password + success | `c-teal` / `c-green` | Positive outcome: teal for the action, green pill for the terminal |
| Decision diamonds | `c-amber` (custom `.decision`) | Warning / branch point — matches amber semantic meaning |
## Layout Notes
- **ViewBox**: 680×960 — tall flowchart with two phases
- **Three-column structure**: Left (cx=115), center (cx=340), right (cx=552) — each branch stays within its column; only `<path>` arrows cross column boundaries
- **Diamond formula**: `<polygon points="cx,cy-hh cx+hw,cy cx,cy+hh cx-hw,cy"/>` with hw=100, hh=28 gives a 200×56px diamond that sits flush with the center column (x=220460)
- **Branch routing pattern**: "No" paths use `<path d="M left_point,cy L side_cx,cy L side_cx,node_top">` — one horizontal segment + one vertical segment, no curves needed
- **Loop corridor**: The 5-px gap at x=210220 between left and center columns provides a clean vertical channel for the loop-back path without any node overlap; the path exits node bottom, drops 20px, goes right to x=215, climbs to target y, enters from left
- **Section separator**: A dashed `<line>` at y=452 with `stroke-dasharray="8 5"` provides a visual phase break; the single connecting arrow crosses it at center, with a faded label on the arrow
- **Pill terminals**: `rx=22` (half the 44px node height) produces a perfect capsule/pill shape — use this consistently for all start/end terminals
- **Error annotation**: The exact UX copy is rendered as faded (`opacity=".45"`) italic `ts` text below the relevant node, keeping it informative without cluttering the flow
@@ -0,0 +1,240 @@
# Autonomous LLM Research Agent Flow
A multi-section flowchart showing Karpathy's autoresearch framework: human-agent handoff, the autonomous experiment loop with keep/discard decision branching, and the modifiable training pipeline. Demonstrates loop-back arrows, convergent decision paths, and semantic color coding for outcomes.
## Key Patterns Used
- **Three-section layout**: Setup row, main loop container, and detail container — each visually distinct
- **Neutral dashed containers**: Loop and training pipeline use `var(--bg-secondary)` fill with dashed borders to recede behind colored content nodes
- **Decision branching with convergence**: "val_bpb improved?" splits into Keep (green) and Discard (red), then both converge back to "Log to results.tsv"
- **Loop-back arrow**: Dashed path with rounded corners on the right side of the container showing infinite repetition
- **Semantic color for outcomes**: Green = improvement (keep), Red = no improvement (discard) — not arbitrary decoration
- **Highlighted key step**: "Run training" uses `c-coral` to visually distinguish the most important step from other `c-teal` actions
- **Horizontal pipeline flow**: Training details section uses left-to-right arrow-connected nodes (GPT → MuonAdamW → Evaluation)
- **Footer metadata**: Fixed constraints shown as subtle centered text below the pipeline nodes
- **Legend row**: Color key at the bottom explaining what each color means
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 920" xmlns="http://www.w3.org/2000/svg">
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
</defs>
<!-- ========================================== -->
<!-- SECTION 1: SETUP (Human → program.md → AI) -->
<!-- ========================================== -->
<text class="ts" x="40" y="30" text-anchor="start" opacity=".5">One-time setup</text>
<!-- Human -->
<g class="node c-gray">
<rect x="60" y="42" width="140" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="130" y="62" text-anchor="middle" dominant-baseline="central">Human</text>
<text class="ts" x="130" y="82" text-anchor="middle" dominant-baseline="central">Researcher</text>
</g>
<!-- Arrow: Human → program.md -->
<line x1="200" y1="70" x2="250" y2="70" class="arr" marker-end="url(#arrow)"/>
<!-- program.md -->
<g class="node c-gray">
<rect x="250" y="42" width="180" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="340" y="62" text-anchor="middle" dominant-baseline="central">program.md</text>
<text class="ts" x="340" y="82" text-anchor="middle" dominant-baseline="central">Agent instructions</text>
</g>
<!-- Arrow: program.md → AI Agent -->
<line x1="430" y1="70" x2="470" y2="70" class="arr" marker-end="url(#arrow)"/>
<!-- AI Agent -->
<g class="node c-purple">
<rect x="470" y="42" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="550" y="62" text-anchor="middle" dominant-baseline="central">AI agent</text>
<text class="ts" x="550" y="82" text-anchor="middle" dominant-baseline="central">Claude / Codex</text>
</g>
<!-- Arrow: Setup row → Loop (from program.md center down) -->
<line x1="340" y1="98" x2="340" y2="142" class="arr" marker-end="url(#arrow)"/>
<!-- ========================================== -->
<!-- SECTION 2: AUTONOMOUS EXPERIMENT LOOP -->
<!-- ========================================== -->
<!-- Loop container (neutral dashed) -->
<g>
<rect x="40" y="142" width="600" height="528" rx="16"
stroke-width="1" stroke-dasharray="6 4"
fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="170">Autonomous experiment loop</text>
<text class="ts" x="66" y="188">~12 experiments/hour — runs until manually stopped</text>
</g>
<!-- Step 1: Read code + past results -->
<g class="node c-teal">
<rect x="170" y="208" width="280" height="44" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="230" text-anchor="middle" dominant-baseline="central">Read code + past results</text>
</g>
<!-- Arrow: S1 → S2 -->
<line x1="310" y1="252" x2="310" y2="274" class="arr" marker-end="url(#arrow)"/>
<!-- Step 2: Propose + edit train.py -->
<g class="node c-teal">
<rect x="170" y="274" width="280" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="294" text-anchor="middle" dominant-baseline="central">Propose + edit train.py</text>
<text class="ts" x="310" y="314" text-anchor="middle" dominant-baseline="central">Arch, optimizer, hyperparameters</text>
</g>
<!-- Arrow: S2 → S3 -->
<line x1="310" y1="330" x2="310" y2="352" class="arr" marker-end="url(#arrow)"/>
<!-- Step 3: Run training (highlighted — key step) -->
<g class="node c-coral">
<rect x="170" y="352" width="280" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="372" text-anchor="middle" dominant-baseline="central">Run training</text>
<text class="ts" x="310" y="392" text-anchor="middle" dominant-baseline="central">uv run train.py (5 min budget)</text>
</g>
<!-- Arrow: S3 → S4 -->
<line x1="310" y1="408" x2="310" y2="430" class="arr" marker-end="url(#arrow)"/>
<!-- Step 4: Decision — val_bpb improved? -->
<g class="node c-gray">
<rect x="170" y="430" width="280" height="44" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="452" text-anchor="middle" dominant-baseline="central">val_bpb improved?</text>
</g>
<!-- Decision arrows to Keep / Discard -->
<line x1="240" y1="474" x2="175" y2="508" class="arr" marker-end="url(#arrow)"/>
<line x1="380" y1="474" x2="445" y2="508" class="arr" marker-end="url(#arrow)"/>
<!-- Decision labels -->
<text class="ts" x="195" y="496" opacity=".6">yes</text>
<text class="ts" x="416" y="496" opacity=".6">no</text>
<!-- Keep — advance branch -->
<g class="node c-green">
<rect x="70" y="508" width="210" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="175" y="528" text-anchor="middle" dominant-baseline="central">Keep</text>
<text class="ts" x="175" y="548" text-anchor="middle" dominant-baseline="central">Advance git branch</text>
</g>
<!-- Discard — git reset -->
<g class="node c-red">
<rect x="340" y="508" width="210" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="445" y="528" text-anchor="middle" dominant-baseline="central">Discard</text>
<text class="ts" x="445" y="548" text-anchor="middle" dominant-baseline="central">Git reset to previous</text>
</g>
<!-- Converge arrows: Keep → Log, Discard → Log -->
<line x1="175" y1="564" x2="250" y2="590" class="arr" marker-end="url(#arrow)"/>
<line x1="445" y1="564" x2="370" y2="590" class="arr" marker-end="url(#arrow)"/>
<!-- Step 6: Log to results.tsv -->
<g class="node c-teal">
<rect x="170" y="590" width="280" height="44" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="612" text-anchor="middle" dominant-baseline="central">Log to results.tsv</text>
</g>
<!-- Loop-back arrow (dashed, right side) -->
<path d="M 450 612 L 564 612 Q 576 612 576 600 L 576 242 Q 576 230 564 230 L 450 230"
fill="none" class="arr" stroke-dasharray="4 3" marker-end="url(#arrow)"/>
<!-- ========================================== -->
<!-- SECTION 3: TRAINING PIPELINE DETAILS -->
<!-- ========================================== -->
<!-- Connection arrow: Loop → Training details -->
<line x1="310" y1="670" x2="310" y2="710" class="arr" marker-end="url(#arrow)"/>
<!-- Training container (neutral dashed) -->
<g>
<rect x="40" y="710" width="600" height="170" rx="16"
stroke-width="1" stroke-dasharray="6 4"
fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="738">train.py — modifiable training pipeline</text>
<text class="ts" x="66" y="756">Runs during each training step — single GPU, single file</text>
</g>
<!-- GPT model -->
<g class="node c-coral">
<rect x="70" y="774" width="155" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="147" y="794" text-anchor="middle" dominant-baseline="central">GPT model</text>
<text class="ts" x="147" y="814" text-anchor="middle" dominant-baseline="central">RoPE, FlashAttn3</text>
</g>
<!-- Arrow: GPT → MuonAdamW -->
<line x1="225" y1="802" x2="260" y2="802" class="arr" marker-end="url(#arrow)"/>
<!-- MuonAdamW optimizer -->
<g class="node c-coral">
<rect x="260" y="774" width="155" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="337" y="794" text-anchor="middle" dominant-baseline="central">MuonAdamW</text>
<text class="ts" x="337" y="814" text-anchor="middle" dominant-baseline="central">Hybrid optimizer</text>
</g>
<!-- Arrow: MuonAdamW → Evaluation -->
<line x1="415" y1="802" x2="450" y2="802" class="arr" marker-end="url(#arrow)"/>
<!-- Evaluation -->
<g class="node c-amber">
<rect x="450" y="774" width="155" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="527" y="794" text-anchor="middle" dominant-baseline="central">Evaluation</text>
<text class="ts" x="527" y="814" text-anchor="middle" dominant-baseline="central">val_bpb metric</text>
</g>
<!-- Footer: fixed constraints -->
<text class="ts" x="340" y="856" text-anchor="middle" opacity=".5">climbmix-400b data · 8K BPE vocab · 300s budget · 2048 context</text>
<!-- ========================================== -->
<!-- LEGEND -->
<!-- ========================================== -->
<g class="c-teal"><rect x="40" y="890" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="62" y="902">Agent actions</text>
<g class="c-coral"><rect x="170" y="890" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="192" y="902">Training run</text>
<g class="c-green"><rect x="300" y="890" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="322" y="902">Improvement</text>
<g class="c-red"><rect x="430" y="890" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="452" y="902">No improvement</text>
</svg>
```
## Color Assignments
| Element | Color | Reason |
|---------|-------|--------|
| Human, program.md | `c-gray` | Neutral setup / input nodes |
| AI agent | `c-purple` | The active intelligent actor |
| Loop action steps | `c-teal` | Agent's analytical/editing actions |
| Run training | `c-coral` | Highlighted key step — the 5-min training run |
| Decision check | `c-gray` | Neutral evaluation checkpoint |
| Keep (improved) | `c-green` | Semantic success — val_bpb decreased |
| Discard (not improved) | `c-red` | Semantic failure — no improvement |
| Training pipeline nodes | `c-coral` | Training infrastructure components |
| Evaluation node | `c-amber` | Distinct from training — measurement/metric role |
| Containers | Neutral (dashed) | Subtle grouping that recedes behind content |
## Layout Notes
- **ViewBox**: 680×920 (standard width, tall for 3 sections)
- **Three sections**: Setup row (y=3098), loop container (y=142670), training details (y=710880)
- **Container style**: Dashed border (`stroke-dasharray="6 4"`), neutral fill (`var(--bg-secondary)`), `stroke-width="1"` — not colored, so inner nodes pop
- **Loop-back arrow**: Dashed `<path>` with quadratic curves (`Q`) at corners for smooth rounded turns, running up the right side of the loop container from "Log" back to "Read code"
- **Decision pattern**: Single question node ("val_bpb improved?") with diagonal arrows to Keep/Discard, then convergent diagonal arrows back to "Log to results.tsv"
- **Decision labels**: "yes"/"no" labels placed along the diagonal arrows with `opacity=".6"` to stay subtle
- **Key step highlight**: "Run training" uses `c-coral` while surrounding steps use `c-teal`, drawing the eye to the most important step
- **Horizontal sub-flow**: Training pipeline uses left-to-right arrow-connected nodes (GPT model → MuonAdamW → Evaluation)
- **Footer metadata**: Fixed constraints (data, vocab, budget, context) shown as a single centered `ts` text line with `opacity=".5"`
- **Legend**: Four color swatches at the bottom explaining the semantic meaning of each color used
@@ -0,0 +1,161 @@
# Journey of a Banana: From Tree to Smoothie
A narrative journey diagram following a single banana across 3,000 miles and 3 weeks, from harvest in Costa Rica to a smoothie in the consumer's kitchen. Demonstrates storytelling through visualization, winding path layout, and progressive state changes.
## Key Patterns Used
- **Winding journey path**: S-curve connecting all stages visually
- **Location markers**: Country flags and place names for geographic context
- **Progressive state changes**: Banana color changes (green → yellow → brown → frozen → smoothie)
- **Narrative details**: Fun elements like spider check, stickers, price tags
- **Timeline**: Bottom timeline showing duration of journey
- **Environmental context**: Ocean waves, gas clouds, store awning
## New Shape Techniques
### Banana (curved fruit shape)
```xml
<!-- Green banana -->
<path class="banana-green" d="M 5 0 Q 0 10 3 20 Q 6 25 10 20 Q 13 10 8 0 Z"/>
<!-- Yellow banana -->
<path class="banana-yellow" d="M 0 5 Q -6 18 0 32 Q 7 40 15 30 Q 20 15 12 5 Z"/>
<!-- Brown overripe banana with spots -->
<path class="banana-brown" d="M 0 5 Q -5 15 0 28 Q 6 35 14 26 Q 18 14 12 5 Z"/>
<circle class="banana-spots" cx="5" cy="15" r="1.5"/>
<circle class="banana-spots" cx="9" cy="20" r="1"/>
```
### Banana Tree
```xml
<!-- Trunk -->
<rect class="tree-trunk" x="55" y="50" width="15" height="60" rx="3"/>
<!-- Leaves (rotated ellipses) -->
<ellipse class="tree-leaf" cx="62" cy="45" rx="40" ry="15" transform="rotate(-20, 62, 45)"/>
<ellipse class="tree-leaf" cx="62" cy="50" rx="35" ry="12" transform="rotate(25, 62, 50)"/>
<!-- Banana bunch hanging -->
<g transform="translate(40, 55)">
<path class="banana-green" d="M 5 0 Q 0 10 3 20 Q 6 25 10 20 Q 13 10 8 0 Z"/>
<path class="banana-green" d="M 12 2 Q 8 12 11 22 Q 14 27 18 22 Q 21 12 16 2 Z"/>
<rect class="stem" x="8" y="-5" width="12" height="8" rx="2"/>
</g>
```
### Cargo Ship
```xml
<!-- Ocean waves -->
<path class="ocean" d="M 0 90 Q 30 85 60 90 Q 90 95 120 90 Q 150 85 180 90 L 180 110 L 0 110 Z" opacity="0.5"/>
<!-- Hull -->
<path class="ship-hull" d="M 20 90 L 30 60 L 160 60 L 170 90 Q 150 95 95 95 Q 40 95 20 90 Z"/>
<!-- Deck -->
<rect class="ship-deck" x="40" y="45" width="110" height="18" rx="2"/>
<!-- Reefer containers -->
<rect class="container" x="45" y="25" width="30" height="22" rx="2"/>
<!-- Refrigeration symbol -->
<text x="60" y="40" text-anchor="middle" fill="#185FA5" style="font-size:10px">❄</text>
<!-- Smoke stack -->
<rect x="145" y="35" width="8" height="15" fill="#444441"/>
```
### Inspector Figure
```xml
<!-- Body -->
<rect class="inspector" x="10" y="20" width="25" height="35" rx="3"/>
<!-- Head -->
<circle class="inspector" cx="22" cy="12" r="10"/>
<!-- Hat -->
<rect x="12" y="2" width="20" height="6" rx="2" fill="#534AB7"/>
<!-- Clipboard -->
<rect class="clipboard" x="38" y="28" width="15" height="20" rx="2"/>
<line x1="42" y1="34" x2="50" y2="34" stroke="#888780" stroke-width="1"/>
```
### Spider with "No" Symbol
```xml
<circle cx="15" cy="15" r="18" fill="none" stroke="#A32D2D" stroke-width="2"/>
<line x1="3" y1="3" x2="27" y2="27" stroke="#A32D2D" stroke-width="2"/>
<!-- Spider body -->
<ellipse class="spider" cx="15" cy="15" rx="4" ry="5"/>
<ellipse class="spider" cx="15" cy="10" rx="3" ry="3"/>
<!-- Legs -->
<line x1="12" y1="14" x2="5" y2="10" stroke="#2C2C2A" stroke-width="1"/>
<line x1="18" y1="14" x2="25" y2="10" stroke="#2C2C2A" stroke-width="1"/>
```
### Blender with Smoothie
```xml
<!-- Blender jar -->
<path class="blender" d="M 5 5 L 0 45 L 35 45 L 30 5 Z"/>
<!-- Smoothie inside (wavy top) -->
<path class="smoothie" d="M 3 20 L 0 45 L 35 45 L 32 20 Q 25 18 17 22 Q 10 18 3 20 Z"/>
<!-- Blender base -->
<rect class="blender" x="-2" y="45" width="40" height="12" rx="3"/>
<!-- Lid -->
<rect x="8" y="0" width="20" height="8" rx="2" fill="#AFA9EC" stroke="#534AB7"/>
<!-- Banana chunks floating -->
<ellipse cx="12" cy="32" rx="4" ry="2" fill="#FAC775"/>
```
### Winding Journey Path
```xml
<path class="journey-path" d="
M 80 100
L 200 100
Q 280 100 280 150
L 280 180
Q 280 220 320 220
L 520 220
Q 560 220 560 260
L 560 320
Q 560 360 520 360
L 280 360
...
"/>
```
## CSS Classes
```css
/* Journey */
.journey-path { stroke: #D3D1C7; stroke-width: 3; fill: none; stroke-linecap: round; }
/* Banana ripeness stages */
.banana-green { fill: #97C459; stroke: #3B6D11; stroke-width: 0.5; }
.banana-yellow { fill: #FAC775; stroke: #BA7517; stroke-width: 0.5; }
.banana-brown { fill: #854F0B; stroke: #633806; stroke-width: 0.5; }
.banana-spots { fill: #633806; }
/* Environment elements */
.tree-trunk { fill: #854F0B; stroke: #633806; stroke-width: 1; }
.tree-leaf { fill: #97C459; stroke: #3B6D11; stroke-width: 0.5; }
.ocean { fill: #85B7EB; }
.ship-hull { fill: #5F5E5A; stroke: #444441; stroke-width: 1; }
.container { fill: #E6F1FB; stroke: #185FA5; stroke-width: 1; }
.gas-cloud { fill: #C0DD97; stroke: #97C459; stroke-width: 0.5; opacity: 0.6; }
/* Buildings */
.packhouse { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1; }
.warehouse { fill: #FAEEDA; stroke: #854F0B; stroke-width: 1; }
.store { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 1; }
/* Kitchen */
.counter { fill: #FAECE7; stroke: #993C1D; stroke-width: 1; }
.blender { fill: #EEEDFE; stroke: #534AB7; stroke-width: 1; }
.smoothie { fill: #FAC775; }
.freezer { fill: #E6F1FB; stroke: #185FA5; stroke-width: 1; }
/* Details */
.sticker { fill: #378ADD; stroke: #185FA5; stroke-width: 0.3; }
.spider { fill: #2C2C2A; stroke: #1a1a18; stroke-width: 0.3; }
```
## Layout Notes
- **ViewBox**: 850×680 (tall for winding path)
- **Path style**: S-curve winding path connects all 7 stages
- **Location labels**: Country flags + place names anchor geographic context
- **State progression**: Same object (banana) shown in different states throughout
- **Timeline**: Horizontal timeline at bottom shows journey duration
- **Narrative elements**: Fun details (spider, stickers, price tags) add storytelling value
- **Environmental context**: Ocean waves, gas clouds, awnings create sense of place
@@ -0,0 +1,209 @@
# Commercial Aircraft Structure
A physical/structural diagram showing an aircraft side profile using appropriate SVG shapes beyond rectangles - paths, polygons, ellipses for realistic representation.
## Key Patterns Used
- **Path elements**: Curved fuselage body with nose cone using quadratic bezier curves
- **Polygon elements**: Tapered wing shape, triangular stabilizers, control surfaces
- **Ellipse elements**: Engines (cylinders), wheels (circles)
- **Line elements**: Landing gear struts, leader lines for labels
- **Dashed strokes**: Interior sections (fuel tank), movable control surfaces (rudder, elevator)
- **Layered composition**: Cabin sections drawn inside the fuselage shape
- **Leader lines with labels**: Connect labels to components they describe
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 400" xmlns="http://www.w3.org/2000/svg">
<!-- FUSELAGE - main body cylinder with nose cone -->
<path class="fuselage" d="
M 80 180
Q 40 180 40 200
Q 40 220 80 220
L 560 220
Q 580 220 580 200
Q 580 180 560 180
Z
"/>
<!-- Nose cone -->
<path class="fuselage" d="
M 80 180
Q 50 180 35 200
Q 50 220 80 220
" fill="none" stroke-width="1"/>
<!-- COCKPIT windows -->
<path class="cockpit" d="
M 45 190
L 75 185
L 75 200
L 50 200
Z
"/>
<line x1="55" y1="188" x2="55" y2="200" stroke="#534AB7" stroke-width="0.5"/>
<line x1="65" y1="186" x2="65" y2="200" stroke="#534AB7" stroke-width="0.5"/>
<!-- CABIN SECTIONS (inside fuselage) -->
<!-- First class -->
<rect class="first-class" x="85" y="183" width="50" height="34" rx="2"/>
<text class="tl" x="110" y="203" text-anchor="middle">First</text>
<!-- Business class -->
<rect class="business-class" x="140" y="183" width="80" height="34" rx="2"/>
<text class="tl" x="180" y="203" text-anchor="middle">Business</text>
<!-- Economy class -->
<rect class="economy-class" x="225" y="183" width="200" height="34" rx="2"/>
<text class="tl" x="325" y="203" text-anchor="middle">Economy</text>
<!-- CARGO HOLD (lower section indication) -->
<line x1="85" y1="217" x2="520" y2="217" class="leader"/>
<text class="tl" x="300" y="228" text-anchor="middle" opacity=".6">Cargo hold below deck</text>
<!-- WING - main wing shape -->
<polygon class="wing" points="
200,220
120,300
130,305
160,305
340,235
340,220
"/>
<!-- Wing fuel tank (dashed interior) -->
<polygon class="fuel-tank" points="
210,225
150,280
160,283
180,283
310,232
310,225
"/>
<text class="tl" x="220" y="260" opacity=".7">Fuel</text>
<!-- Flaps (trailing edge) -->
<polygon class="flap" points="
130,300
120,305
160,310
165,305
"/>
<text class="tl" x="143" y="320">Flaps</text>
<!-- ENGINE under wing -->
<ellipse class="engine" cx="175" cy="285" rx="25" ry="12"/>
<ellipse cx="155" cy="285" rx="8" ry="10" fill="none" stroke="#993C1D" stroke-width="0.5"/>
<!-- Engine pylon -->
<line x1="175" y1="273" x2="190" y2="245" stroke="#5F5E5A" stroke-width="2"/>
<text class="tl" x="175" y="308" text-anchor="middle">Engine</text>
<!-- TAIL SECTION -->
<!-- Vertical stabilizer -->
<polygon class="tail-v" points="
520,180
560,100
580,100
580,180
"/>
<text class="tl" x="565" y="150" text-anchor="middle">Vertical</text>
<text class="tl" x="565" y="162" text-anchor="middle">stabilizer</text>
<!-- Rudder -->
<polygon points="575,105 590,105 590,178 580,178" fill="none" stroke="#185FA5" stroke-width="0.5" stroke-dasharray="3 2"/>
<text class="tl" x="595" y="145" opacity=".6">Rudder</text>
<!-- Horizontal stabilizer -->
<polygon class="tail-h" points="
500,195
460,175
465,170
580,170
580,180
520,195
"/>
<text class="tl" x="510" y="166">Horizontal stabilizer</text>
<!-- Elevator -->
<polygon points="462,174 450,168 455,163 467,169" fill="none" stroke="#185FA5" stroke-width="0.5" stroke-dasharray="3 2"/>
<text class="tl" x="440" y="158" opacity=".6">Elevator</text>
<!-- LANDING GEAR -->
<!-- Nose gear -->
<line class="gear" x1="100" y1="220" x2="100" y2="260" stroke-width="3"/>
<ellipse class="wheel" cx="100" cy="268" rx="8" ry="10"/>
<text class="tl" x="100" y="290" text-anchor="middle">Nose gear</text>
<!-- Main gear (under wing/fuselage junction) -->
<line class="gear" x1="280" y1="220" x2="280" y2="270" stroke-width="4"/>
<line class="gear" x1="268" y1="265" x2="292" y2="265" stroke-width="3"/>
<ellipse class="wheel" cx="268" cy="278" rx="10" ry="12"/>
<ellipse class="wheel" cx="292" cy="278" rx="10" ry="12"/>
<text class="tl" x="280" y="302" text-anchor="middle">Main gear</text>
<!-- LABELS with leader lines -->
<!-- Cockpit label -->
<line class="leader" x1="60" y1="175" x2="60" y2="140"/>
<text class="ts" x="60" y="132" text-anchor="middle">Cockpit</text>
<!-- Wing label -->
<line class="leader" x1="250" y1="250" x2="290" y2="330"/>
<text class="ts" x="290" y="345" text-anchor="middle">Wing structure</text>
<text class="tl" x="290" y="358" text-anchor="middle">Spars, ribs, skin</text>
<!-- Fuselage label -->
<line class="leader" x1="400" y1="180" x2="400" y2="140"/>
<text class="ts" x="400" y="132" text-anchor="middle">Fuselage</text>
<text class="tl" x="400" y="145" text-anchor="middle">Pressure vessel</text>
</svg>
```
## CSS Classes for Physical Diagrams
When creating physical/structural diagrams, define semantic classes for each component type:
```css
/* Structure shapes */
.fuselage { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1; }
.wing { fill: #E6F1FB; stroke: #185FA5; stroke-width: 1; }
.tail-v { fill: #E6F1FB; stroke: #185FA5; stroke-width: 1; }
.tail-h { fill: #E6F1FB; stroke: #185FA5; stroke-width: 1; }
/* Interior sections */
.cockpit { fill: #EEEDFE; stroke: #534AB7; stroke-width: 1; }
.first-class { fill: #FBEAF0; stroke: #993556; stroke-width: 0.5; }
.business-class { fill: #FAECE7; stroke: #993C1D; stroke-width: 0.5; }
.economy-class { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 0.5; }
.cargo { fill: #D3D1C7; stroke: #5F5E5A; stroke-width: 0.5; }
/* Systems */
.engine { fill: #FAECE7; stroke: #993C1D; stroke-width: 1; }
.fuel-tank { fill: #FAEEDA; stroke: #854F0B; stroke-width: 0.5; stroke-dasharray: 3 2; }
.flap { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 0.5; }
/* Mechanical */
.gear { fill: #444441; stroke: #2C2C2A; stroke-width: 0.5; }
.wheel { fill: #2C2C2A; stroke: #1a1a18; stroke-width: 0.5; }
```
## Shape Selection Guide
| Physical form | SVG element | Example |
|---------------|-------------|---------|
| Curved body | `<path>` with Q (quadratic) or C (cubic) curves | Fuselage, nose cone |
| Tapered/angular | `<polygon>` | Wings, stabilizers |
| Cylindrical | `<ellipse>` | Engines, wheels, tanks |
| Linear structure | `<line>` | Struts, pylons, gear legs |
| Internal sections | `<rect>` inside parent shape | Cabin classes |
| Dashed boundaries | `stroke-dasharray` on any shape | Fuel tanks, control surfaces |
## Layout Notes
- **ViewBox**: 680×400 (wider aspect ratio suits side profile)
- **Layering**: Draw outer structures first, then interior details on top
- **Leader lines**: Use `.leader` class (dashed) to connect labels to components
- **Text sizes**: Use `.tl` (10px) for component labels, `.ts` (12px) for section labels
- **Semantic colors**: Group by system (structure=blue, propulsion=coral, fuel=amber, etc.)
@@ -0,0 +1,236 @@
# Out-of-Order CPU Core Microarchitecture
A structural diagram showing the internal pipeline stages of a modern superscalar out-of-order CPU core. Demonstrates multi-stage vertical flow with parallel paths, fan-out patterns for execution ports, and a separate memory hierarchy sidebar.
## Key Patterns Used
- **Multi-stage vertical flow**: Six pipeline stages (Front End → Rename → Schedule → Execute → Retire)
- **Parallel decode paths**: Main decode and µop cache bypass (dashed line for cache hit)
- **Container grouping**: Logical stages grouped in colored containers
- **Fan-out pattern**: Single scheduler dispatching to 6 execution ports
- **Sidebar layout**: Memory hierarchy placed in separate column on right
- **Stage labels**: Left-aligned labels indicating pipeline phase
- **Color-coded semantics**: Different colors for each functional unit category
## Diagram Type
This is a **hybrid structural/flow** diagram:
- **Flow aspect**: Instructions move top-to-bottom through pipeline stages
- **Structural aspect**: Components are grouped by function (rename unit, execution cluster)
- **Sidebar**: Memory hierarchy is architecturally separate but connected via data paths
## Pipeline Stage Breakdown
### Front End (Purple)
```xml
<!-- Fetch Unit -->
<g class="node c-purple">
<rect x="40" y="70" width="140" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="110" y="90" text-anchor="middle" dominant-baseline="central">Fetch unit</text>
<text class="ts" x="110" y="110" text-anchor="middle" dominant-baseline="central">6-wide, 32B/cycle</text>
</g>
<!-- Branch Predictor (subordinate) -->
<g class="node c-purple">
<rect x="40" y="140" width="140" height="44" rx="8" stroke-width="0.5"/>
<text class="th" x="110" y="162" text-anchor="middle" dominant-baseline="central">Branch predictor</text>
</g>
<!-- Decode -->
<g class="node c-purple">
<rect x="230" y="70" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="90" text-anchor="middle" dominant-baseline="central">Decode</text>
<text class="ts" x="310" y="110" text-anchor="middle" dominant-baseline="central">x86 → µops, 6-wide</text>
</g>
```
### µop Cache Bypass Path (Teal)
The µop cache (Decoded Stream Buffer) provides an alternate path that bypasses the complex decoder:
```xml
<!-- µop Cache parallel to decode -->
<g class="node c-teal">
<rect x="230" y="150" width="160" height="50" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="168" text-anchor="middle" dominant-baseline="central">µop cache (DSB)</text>
<text class="ts" x="310" y="186" text-anchor="middle" dominant-baseline="central">4K entries, 8-wide</text>
</g>
<!-- Dashed bypass path indicating cache hit -->
<path d="M180 110 L205 110 L205 175 L230 175" fill="none" class="arr"
stroke-dasharray="4 3" marker-end="url(#arrow)"/>
<text class="tx" x="164" y="148" opacity=".6">hit</text>
```
### Rename/Allocate Container (Coral)
Groups related rename components in a container:
```xml
<!-- Outer container -->
<g class="c-coral">
<rect x="40" y="250" width="530" height="130" rx="12" stroke-width="0.5"/>
<text class="th" x="60" y="274">Rename / allocate</text>
<text class="ts" x="60" y="292">Map architectural → physical registers</text>
</g>
<!-- Inner components -->
<g class="node c-coral">
<rect x="60" y="310" width="180" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="150" y="330" text-anchor="middle" dominant-baseline="central">Register alias table</text>
<text class="ts" x="150" y="350" text-anchor="middle" dominant-baseline="central">180 physical regs</text>
</g>
```
### Scheduler Fan-Out Pattern (Amber → Teal)
Single unified scheduler dispatching to multiple execution ports:
```xml
<!-- Unified Scheduler -->
<g class="node c-amber">
<rect x="140" y="420" width="330" height="50" rx="8" stroke-width="0.5"/>
<text class="th" x="305" y="438" text-anchor="middle" dominant-baseline="central">Unified scheduler</text>
<text class="ts" x="305" y="456" text-anchor="middle" dominant-baseline="central">97 entries, out-of-order dispatch</text>
</g>
<!-- Fan-out arrows to 6 ports -->
<line x1="170" y1="470" x2="90" y2="540" class="arr" marker-end="url(#arrow)"/>
<line x1="215" y1="470" x2="170" y2="540" class="arr" marker-end="url(#arrow)"/>
<line x1="265" y1="470" x2="250" y2="540" class="arr" marker-end="url(#arrow)"/>
<line x1="305" y1="470" x2="330" y2="540" class="arr" marker-end="url(#arrow)"/>
<line x1="355" y1="470" x2="410" y2="540" class="arr" marker-end="url(#arrow)"/>
<line x1="420" y1="470" x2="490" y2="540" class="arr" marker-end="url(#arrow)"/>
```
### Execution Port Box Pattern
Compact boxes showing port number and capabilities:
```xml
<!-- Execution port with multi-line capability -->
<g class="node c-teal">
<rect x="55" y="540" width="70" height="64" rx="6" stroke-width="0.5"/>
<text class="th" x="90" y="560" text-anchor="middle" dominant-baseline="central">Port 0</text>
<text class="tx" x="90" y="576" text-anchor="middle" dominant-baseline="central">ALU</text>
<text class="tx" x="90" y="590" text-anchor="middle" dominant-baseline="central">DIV</text>
</g>
```
### Reorder Buffer (Pink)
Wide horizontal bar at bottom showing retirement:
```xml
<g class="c-pink">
<rect x="40" y="670" width="530" height="40" rx="10" stroke-width="0.5"/>
<text class="th" x="305" y="694" text-anchor="middle" dominant-baseline="central">Reorder buffer (ROB) — 512 entries, 8-wide retire</text>
</g>
```
### Memory Hierarchy Sidebar (Blue)
Separate column showing cache levels:
```xml
<!-- Container -->
<g class="c-blue">
<rect x="600" y="30" width="190" height="360" rx="16" stroke-width="0.5"/>
<text class="th" x="695" y="54" text-anchor="middle">Memory hierarchy</text>
</g>
<!-- Cache levels stacked vertically -->
<g class="node c-blue">
<rect x="620" y="70" width="150" height="50" rx="8" stroke-width="0.5"/>
<text class="th" x="695" y="88" text-anchor="middle" dominant-baseline="central">L1-I cache</text>
<text class="ts" x="695" y="106" text-anchor="middle" dominant-baseline="central">32 KB, 8-way</text>
</g>
<!-- Additional levels follow same pattern -->
```
## Connection Patterns
### Instruction Fetch Path
Horizontal arrow from L1-I cache to fetch unit:
```xml
<path d="M620 95 L200 95" fill="none" class="arr" marker-end="url(#arrow)"/>
<text class="tx" x="410" y="88" text-anchor="middle" opacity=".6">instruction fetch</text>
```
### Load/Store Path
Complex path from execution ports to L1-D cache:
```xml
<path d="M250 604 L250 640 L580 640 L580 160 L620 160" fill="none" class="arr" marker-end="url(#arrow)"/>
<text class="tx" x="415" y="652" text-anchor="middle" opacity=".6">load / store</text>
```
### Commit Path (dashed)
Dashed line showing write-back from ROB to register file:
```xml
<path d="M550 690 L580 690 L580 445 L595 445" fill="none" class="arr" stroke-dasharray="4 3"/>
<text class="tx" x="590" y="578" opacity=".6" transform="rotate(-90 590 578)">commit</text>
```
### Path Merge (Decode + µop Cache)
Two paths converging before rename:
```xml
<line x1="390" y1="98" x2="430" y2="98" class="arr"/>
<line x1="390" y1="175" x2="430" y2="175" class="arr"/>
<path d="M430 98 L430 175" fill="none" stroke="var(--text-secondary)" stroke-width="1.5"/>
<line x1="430" y1="136" x2="470" y2="136" class="arr" marker-end="url(#arrow)"/>
```
## Text Classes
This diagram uses an additional text class for very small labels:
```css
.tx { font-family: system-ui, -apple-system, sans-serif; font-size: 10px; fill: var(--text-secondary); }
```
Used for:
- Execution port capability labels (ALU, Branch, Load, etc.)
- Connection labels (instruction fetch, load/store, commit)
- DRAM latency annotation
## Color Semantic Mapping
| Color | Stage | Components |
|-------|-------|------------|
| `c-purple` | Front end | Fetch, Branch predictor, Decode |
| `c-teal` | Execution | µop cache, Execution ports |
| `c-coral` | Rename | RAT, Physical RF, Free list |
| `c-amber` | Schedule | Unified scheduler |
| `c-pink` | Retire | Reorder buffer |
| `c-blue` | Memory | L1-I, L1-D, L2, DRAM |
| `c-gray` | External | Off-chip DRAM |
## Layout Notes
- **ViewBox**: 820×720 (taller than wide for vertical pipeline flow)
- **Main pipeline**: x=40 to x=570 (530px width)
- **Memory sidebar**: x=600 to x=790 (190px width)
- **Stage labels**: x=30, left-aligned, 50% opacity
- **Vertical spacing**: ~80-100px between major stages
- **Container padding**: 20px inside containers
- **Port spacing**: 80px between execution port centers
- **Legend**: Bottom-right of memory sidebar, explains color coding
## Architectural Details Shown
| Component | Specification | Notes |
|-----------|---------------|-------|
| Fetch | 6-wide, 32B/cycle | Typical modern Intel/AMD |
| Decode | 6-wide, x86→µops | Complex decoder |
| µop Cache | 4K entries, 8-wide | Bypass for hot code |
| RAT | 180 physical regs | Supports deep OoO |
| Scheduler | 97 entries | Unified RS |
| Execution | 6 ports | ALU×2, Load, Store×2, Vector |
| ROB | 512 entries, 8-wide | In-order retirement |
| L1-I | 32 KB, 8-way | Instruction cache |
| L1-D | 48 KB, 12-way | Data cache |
| L2 | 1.25 MB, 20-way | Unified |
| DRAM | DDR5-6400, ~80ns | Off-chip |
## When to Use This Pattern
Use this diagram style for:
- CPU/GPU microarchitecture visualization
- Compiler pipeline stages
- Network packet processing pipelines
- Any system with parallel execution units fed by a scheduler
- Hardware designs with multiple functional units
@@ -0,0 +1,182 @@
# Electricity Grid: Generation to Consumption
A left-to-right flow diagram showing electricity from multiple generation sources through transmission and distribution networks to end consumers. Demonstrates multi-stage flow layout, voltage level visual hierarchy, and smart grid data overlay.
## Key Patterns Used
- **Multi-stage horizontal flow**: Four distinct columns (Generation → Transmission → Distribution → Consumption)
- **Stage dividers**: Vertical dashed lines separating each phase
- **Voltage level hierarchy**: Different line weights/colors for HV, MV, LV
- **Smart grid data overlay**: Dashed data flow lines from control center
- **Capacity labels**: Power ratings on generation sources
- **Multiple source convergence**: Four generators feeding into single transmission grid
## New Shape Techniques
### Nuclear Plant (cooling tower + reactor)
```xml
<!-- Cooling tower (hyperbolic curve) -->
<path class="nuclear-tower" d="M 25 80 Q 15 60 20 40 Q 25 20 40 15 Q 55 20 60 40 Q 65 60 55 80 Z"/>
<!-- Steam clouds -->
<ellipse class="nuclear-steam" cx="40" cy="8" rx="12" ry="6"/>
<!-- Reactor dome -->
<rect class="nuclear-building" x="65" y="45" width="40" height="35" rx="3"/>
<ellipse class="nuclear-building" cx="85" cy="45" rx="20" ry="8"/>
```
### Gas Peaker Plant (with flames)
```xml
<rect class="gas-plant" x="0" y="25" width="70" height="40" rx="3"/>
<!-- Smokestacks -->
<rect class="gas-stack" x="15" y="5" width="8" height="25" rx="1"/>
<!-- Flame -->
<path class="gas-flame" d="M 19 5 Q 17 0 19 -3 Q 21 0 19 5"/>
<!-- Turbine housing -->
<ellipse class="gas-plant" cx="55" cy="45" rx="12" ry="8"/>
```
### Transmission Pylon with Insulators
```xml
<!-- Tapered tower -->
<polygon class="pylon" points="20,0 25,0 30,80 15,80"/>
<!-- Cross arms -->
<line class="pylon-arm" x1="5" y1="10" x2="40" y2="10"/>
<line class="pylon-arm" x1="8" y1="25" x2="37" y2="25"/>
<!-- Insulators (where lines attach) -->
<circle class="insulator" cx="8" cy="10" r="3"/>
<circle class="insulator" cx="37" cy="10" r="3"/>
```
### Transformer Symbol
```xml
<!-- Two coils with core -->
<circle class="transformer-coil" cx="25" cy="25" r="12"/>
<circle class="transformer-coil" cx="55" cy="25" r="12"/>
<rect class="transformer-core" x="35" y="15" width="10" height="20" rx="2"/>
<!-- Busbars -->
<line x1="0" y1="15" x2="-10" y2="15" stroke="#EF9F27" stroke-width="3"/>
```
### Pole-mounted Transformer
```xml
<rect class="pole" x="18" y="0" width="4" height="60"/>
<line x1="10" y1="8" x2="30" y2="8" stroke="#854F0B" stroke-width="2"/>
<rect class="dist-transformer" x="8" y="15" width="24" height="18" rx="2"/>
<line class="lv-line" x1="20" y1="33" x2="20" y2="60"/>
```
### House with Roof
```xml
<rect class="home" x="0" y="25" width="35" height="30" rx="2"/>
<polygon class="home-roof" points="0,25 17,8 35,25"/>
<!-- Door -->
<rect x="8" y="35" width="8" height="15" fill="#085041"/>
<!-- Window -->
<rect x="22" y="32" width="8" height="8" fill="#9FE1CB"/>
```
### Factory Building
```xml
<rect class="factory" x="0" y="15" width="90" height="50" rx="3"/>
<!-- Smokestacks -->
<rect class="factory-stack" x="15" y="0" width="10" height="20"/>
<!-- Windows row -->
<rect x="10" y="30" width="15" height="12" fill="#F5C4B3"/>
<rect x="30" y="30" width="15" height="12" fill="#F5C4B3"/>
<!-- Loading dock -->
<rect x="55" y="50" width="30" height="15" fill="#993C1D"/>
```
### EV Charger with Car
```xml
<!-- Charging station -->
<rect class="ev-charger" x="20" y="0" width="25" height="45" rx="3"/>
<rect x="24" y="5" width="17" height="12" rx="1" fill="#3C3489"/>
<!-- Cable -->
<path d="M 32 20 Q 32 35 45 40" stroke="#534AB7" stroke-width="2" fill="none"/>
<circle cx="45" cy="40" r="4" fill="#534AB7"/>
<!-- Status light -->
<circle cx="32" cy="38" r="3" fill="#97C459"/>
<!-- EV Car -->
<path class="ev-car" d="M 5 20 L 5 12 Q 5 5 15 5 L 45 5 Q 55 5 55 12 L 55 20 Z"/>
<!-- Windows -->
<rect x="10" y="8" width="15" height="8" rx="2" fill="#534AB7"/>
<!-- Wheels -->
<circle cx="15" cy="22" r="5" fill="#2C2C2A"/>
<!-- Charging bolt icon -->
<path d="M 28 12 L 32 8 L 30 11 L 34 11 L 30 16 L 32 13 Z" fill="#97C459"/>
```
## Voltage Level Line Styles
```css
/* High voltage (transmission) - thick, bright */
.hv-line { stroke: #EF9F27; stroke-width: 2.5; fill: none; }
/* Medium voltage (distribution) - medium */
.mv-line { stroke: #BA7517; stroke-width: 2; fill: none; }
/* Low voltage (consumer) - thin, darker */
.lv-line { stroke: #854F0B; stroke-width: 1.5; fill: none; }
/* Smart grid data - dashed purple */
.data-flow { stroke: #7F77DD; stroke-width: 1; fill: none; stroke-dasharray: 3 2; opacity: 0.7; }
```
## Flow Arrow Marker
```xml
<defs>
<marker id="flow-arrow" viewBox="0 0 10 10" refX="9" refY="5"
markerWidth="6" markerHeight="6" orient="auto">
<path d="M0,0 L10,5 L0,10 Z" fill="#EF9F27"/>
</marker>
</defs>
<!-- Usage -->
<line x1="140" y1="105" x2="210" y2="105" class="hv-line" marker-end="url(#flow-arrow)"/>
```
## CSS Classes
```css
/* Generation */
.nuclear-tower { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1; }
.nuclear-building { fill: #EEEDFE; stroke: #534AB7; stroke-width: 1; }
.solar-panel { fill: #3C3489; stroke: #534AB7; stroke-width: 0.5; }
.wind-tower { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1; }
.wind-blade { fill: #F1EFE8; stroke: #888780; stroke-width: 0.5; }
.gas-plant { fill: #FAECE7; stroke: #993C1D; stroke-width: 1; }
.gas-flame { fill: #EF9F27; }
/* Transmission */
.pylon { fill: #5F5E5A; stroke: #444441; stroke-width: 0.5; }
.insulator { fill: #FAEEDA; stroke: #854F0B; stroke-width: 0.5; }
.substation { fill: #E6F1FB; stroke: #185FA5; stroke-width: 1; }
.transformer-coil { fill: none; stroke: #185FA5; stroke-width: 1.5; }
/* Distribution */
.pole { fill: #854F0B; stroke: #633806; stroke-width: 0.5; }
.dist-transformer { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 1; }
/* Consumption */
.home { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 1; }
.home-roof { fill: #0F6E56; stroke: #085041; stroke-width: 0.5; }
.factory { fill: #FAECE7; stroke: #993C1D; stroke-width: 1; }
.ev-charger { fill: #EEEDFE; stroke: #534AB7; stroke-width: 1; }
.ev-car { fill: #3C3489; stroke: #534AB7; stroke-width: 0.5; }
/* Smart grid */
.smart-grid { fill: #EEEDFE; stroke: #534AB7; stroke-width: 1.5; }
```
## Layout Notes
- **ViewBox**: 820×520 (wide for 4-column layout)
- **Column widths**: ~200px per stage
- **Stage dividers**: Vertical dashed lines at x=200, 420, 620
- **Stage labels**: Top of diagram, uppercase for emphasis
- **Flow direction**: Left-to-right with arrows showing power flow
- **Data overlay**: Smart grid data lines use different style (dashed purple) to distinguish from power lines
- **Capacity labels**: Show MW ratings on generators for context
- **Voltage labels**: Show transformation ratios at substations
@@ -0,0 +1,172 @@
# Feature Film Production Pipeline
A phased workflow showing the five stages of filmmaking, using containers with inner nodes and horizontal sub-flows within a phase.
## Key Patterns Used
- **Phase containers**: Large rounded rectangles with neutral background and dashed borders
- **Inner task nodes**: Smaller colored nodes inside containers for sub-tasks
- **Horizontal flow within container**: Post-production shows sequential pipeline with arrows (Editing → Color → VFX → Sound → Score)
- **Consistent phase spacing**: ~30px gap between phase containers
- **Phase labels with subtitles**: Each container has title + description
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 780" xmlns="http://www.w3.org/2000/svg">
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
</defs>
<!-- Phase 1: Development -->
<g>
<rect x="40" y="30" width="600" height="110" rx="16" stroke-width="1" stroke-dasharray="6 4" fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="56">Development</text>
<text class="ts" x="66" y="74">Concept to greenlight</text>
</g>
<g class="node c-purple">
<rect x="70" y="90" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="150" y="108" text-anchor="middle" dominant-baseline="central">Script / screenplay</text>
</g>
<g class="node c-purple">
<rect x="260" y="90" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="340" y="108" text-anchor="middle" dominant-baseline="central">Financing / budget</text>
</g>
<g class="node c-purple">
<rect x="450" y="90" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="530" y="108" text-anchor="middle" dominant-baseline="central">Casting leads</text>
</g>
<!-- Arrow to Phase 2 -->
<line x1="340" y1="140" x2="340" y2="170" class="arr" marker-end="url(#arrow)"/>
<!-- Phase 2: Pre-production -->
<g>
<rect x="40" y="170" width="600" height="110" rx="16" stroke-width="1" stroke-dasharray="6 4" fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="196">Pre-production</text>
<text class="ts" x="66" y="214">Planning and preparation</text>
</g>
<g class="node c-teal">
<rect x="70" y="230" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="150" y="248" text-anchor="middle" dominant-baseline="central">Storyboards</text>
</g>
<g class="node c-teal">
<rect x="260" y="230" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="340" y="248" text-anchor="middle" dominant-baseline="central">Location scouting</text>
</g>
<g class="node c-teal">
<rect x="450" y="230" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="530" y="248" text-anchor="middle" dominant-baseline="central">Crew hiring</text>
</g>
<!-- Arrow to Phase 3 -->
<line x1="340" y1="280" x2="340" y2="310" class="arr" marker-end="url(#arrow)"/>
<!-- Phase 3: Production -->
<g>
<rect x="40" y="310" width="600" height="110" rx="16" stroke-width="1" stroke-dasharray="6 4" fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="336">Production</text>
<text class="ts" x="66" y="354">Principal photography</text>
</g>
<g class="node c-coral">
<rect x="70" y="370" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="150" y="388" text-anchor="middle" dominant-baseline="central">Filming / shooting</text>
</g>
<g class="node c-coral">
<rect x="260" y="370" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="340" y="388" text-anchor="middle" dominant-baseline="central">Production sound</text>
</g>
<g class="node c-coral">
<rect x="450" y="370" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="530" y="388" text-anchor="middle" dominant-baseline="central">VFX plates</text>
</g>
<!-- Arrow to Phase 4 -->
<line x1="340" y1="420" x2="340" y2="450" class="arr" marker-end="url(#arrow)"/>
<!-- Phase 4: Post-production -->
<g>
<rect x="40" y="450" width="600" height="150" rx="16" stroke-width="1" stroke-dasharray="6 4" fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="476">Post-production</text>
<text class="ts" x="66" y="494">Assembly and finishing</text>
</g>
<g class="node c-amber">
<rect x="70" y="510" width="110" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="125" y="528" text-anchor="middle" dominant-baseline="central">Editing</text>
</g>
<g class="node c-amber">
<rect x="195" y="510" width="110" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="250" y="528" text-anchor="middle" dominant-baseline="central">Color grade</text>
</g>
<g class="node c-amber">
<rect x="320" y="510" width="90" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="365" y="528" text-anchor="middle" dominant-baseline="central">VFX</text>
</g>
<g class="node c-amber">
<rect x="425" y="510" width="100" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="475" y="528" text-anchor="middle" dominant-baseline="central">Sound mix</text>
</g>
<g class="node c-amber">
<rect x="540" y="510" width="80" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="580" y="528" text-anchor="middle" dominant-baseline="central">Score</text>
</g>
<!-- Flow arrows within post -->
<line x1="180" y1="528" x2="195" y2="528" class="arr" marker-end="url(#arrow)"/>
<line x1="305" y1="528" x2="320" y2="528" class="arr" marker-end="url(#arrow)"/>
<line x1="410" y1="528" x2="425" y2="528" class="arr" marker-end="url(#arrow)"/>
<line x1="525" y1="528" x2="540" y2="528" class="arr" marker-end="url(#arrow)"/>
<!-- Final delivery label -->
<g class="node c-amber">
<rect x="240" y="556" width="200" height="32" rx="6" stroke-width="0.5"/>
<text class="ts" x="340" y="572" text-anchor="middle" dominant-baseline="central">Final master / DCP</text>
</g>
<line x1="340" y1="546" x2="340" y2="556" class="arr" marker-end="url(#arrow)"/>
<!-- Arrow to Phase 5 -->
<line x1="340" y1="600" x2="340" y2="630" class="arr" marker-end="url(#arrow)"/>
<!-- Phase 5: Distribution -->
<g>
<rect x="40" y="630" width="600" height="110" rx="16" stroke-width="1" stroke-dasharray="6 4" fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="656">Distribution</text>
<text class="ts" x="66" y="674">Release and exhibition</text>
</g>
<g class="node c-blue">
<rect x="70" y="690" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="150" y="708" text-anchor="middle" dominant-baseline="central">Film festivals</text>
</g>
<g class="node c-blue">
<rect x="260" y="690" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="340" y="708" text-anchor="middle" dominant-baseline="central">Theatrical release</text>
</g>
<g class="node c-blue">
<rect x="450" y="690" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="530" y="708" text-anchor="middle" dominant-baseline="central">Streaming / VOD</text>
</g>
</svg>
```
## Color Assignments
| Element | Color | Reason |
|---------|-------|--------|
| Phase containers | Neutral (dashed) | Subtle grouping, doesn't compete with content |
| Development tasks | `c-purple` | Creative/concept work |
| Pre-production tasks | `c-teal` | Planning and preparation |
| Production tasks | `c-coral` | Active filming (main event) |
| Post-production tasks | `c-amber` | Processing/refinement |
| Distribution tasks | `c-blue` | Outward delivery/release |
## Layout Notes
- **ViewBox**: 680×780 (standard width, tall for 5 phases)
- **Container style**: Dashed border (`stroke-dasharray="6 4"`), neutral fill (`var(--bg-secondary)`), `stroke-width="1"`
- **Container height**: 110px for 3-node phases, 150px for post-production (more complex)
- **Inner node dimensions**: 160×36px for standard tasks, variable width for post-production sequential flow
- **Phase gap**: 30px between containers
- **Horizontal sub-flow**: Post-production uses tightly packed nodes with arrows between them to show sequence
- **Convergence node**: "Final master / DCP" sits below the horizontal flow, collecting all post outputs
@@ -0,0 +1,165 @@
# Hospital Emergency Department Flow
A multi-path flowchart showing patient journey through an emergency department with priority-based routing using semantic colors (red=critical, amber=urgent, green=stable).
## Key Patterns Used
- **Semantic color coding**: Red/amber/green for priority levels (not arbitrary decoration)
- **Stage labels**: Left-aligned faded labels marking workflow phases
- **Convergent paths**: Multiple entry points merging, then branching, then converging again
- **Nested containers**: Diagnostics grouped in a container with inner nodes
- **Legend**: Color key at bottom explaining priority levels
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 620" xmlns="http://www.w3.org/2000/svg">
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
</defs>
<!-- Stage labels -->
<text class="ts" x="40" y="68" text-anchor="start" opacity=".5">Arrival</text>
<text class="ts" x="40" y="168" text-anchor="start" opacity=".5">Assessment</text>
<text class="ts" x="40" y="288" text-anchor="start" opacity=".5">Priority routing</text>
<text class="ts" x="40" y="418" text-anchor="start" opacity=".5">Diagnostics</text>
<text class="ts" x="40" y="518" text-anchor="start" opacity=".5">Outcome</text>
<!-- Arrival: Ambulance -->
<g class="node c-gray">
<rect x="140" y="40" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="220" y="60" text-anchor="middle" dominant-baseline="central">Ambulance</text>
<text class="ts" x="220" y="80" text-anchor="middle" dominant-baseline="central">Emergency transport</text>
</g>
<!-- Arrival: Walk-in -->
<g class="node c-gray">
<rect x="380" y="40" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="460" y="60" text-anchor="middle" dominant-baseline="central">Walk-in</text>
<text class="ts" x="460" y="80" text-anchor="middle" dominant-baseline="central">Self-arrival</text>
</g>
<!-- Arrows to Triage -->
<line x1="220" y1="96" x2="300" y2="140" class="arr" marker-end="url(#arrow)"/>
<line x1="460" y1="96" x2="380" y2="140" class="arr" marker-end="url(#arrow)"/>
<!-- Triage -->
<g class="node c-purple">
<rect x="240" y="140" width="200" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="340" y="160" text-anchor="middle" dominant-baseline="central">Triage</text>
<text class="ts" x="340" y="180" text-anchor="middle" dominant-baseline="central">Nurse assessment, vitals</text>
</g>
<!-- Arrows from Triage to Priority -->
<line x1="280" y1="196" x2="140" y2="260" class="arr" marker-end="url(#arrow)"/>
<line x1="340" y1="196" x2="340" y2="260" class="arr" marker-end="url(#arrow)"/>
<line x1="400" y1="196" x2="540" y2="260" class="arr" marker-end="url(#arrow)"/>
<!-- Priority: Red - Trauma -->
<g class="node c-red">
<rect x="60" y="260" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="140" y="280" text-anchor="middle" dominant-baseline="central">Trauma bay</text>
<text class="ts" x="140" y="300" text-anchor="middle" dominant-baseline="central">Priority: critical</text>
</g>
<!-- Priority: Yellow - Exam rooms -->
<g class="node c-amber">
<rect x="260" y="260" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="340" y="280" text-anchor="middle" dominant-baseline="central">Exam rooms</text>
<text class="ts" x="340" y="300" text-anchor="middle" dominant-baseline="central">Priority: urgent</text>
</g>
<!-- Priority: Green - Waiting -->
<g class="node c-green">
<rect x="460" y="260" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="540" y="280" text-anchor="middle" dominant-baseline="central">Waiting area</text>
<text class="ts" x="540" y="300" text-anchor="middle" dominant-baseline="central">Priority: stable</text>
</g>
<!-- Arrows to Diagnostics -->
<line x1="140" y1="316" x2="220" y2="390" class="arr" marker-end="url(#arrow)"/>
<line x1="340" y1="316" x2="340" y2="390" class="arr" marker-end="url(#arrow)"/>
<line x1="540" y1="316" x2="460" y2="390" class="arr" marker-end="url(#arrow)"/>
<!-- Diagnostics container -->
<g class="c-teal">
<rect x="140" y="390" width="400" height="56" rx="12" stroke-width="0.5"/>
</g>
<!-- Labs -->
<g class="node c-teal">
<rect x="160" y="400" width="110" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="215" y="418" text-anchor="middle" dominant-baseline="central">Labs</text>
</g>
<!-- Imaging -->
<g class="node c-teal">
<rect x="285" y="400" width="110" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="340" y="418" text-anchor="middle" dominant-baseline="central">Imaging</text>
</g>
<!-- Diagnosis -->
<g class="node c-teal">
<rect x="410" y="400" width="110" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="465" y="418" text-anchor="middle" dominant-baseline="central">Diagnosis</text>
</g>
<!-- Arrows to Outcomes -->
<line x1="215" y1="446" x2="160" y2="490" class="arr" marker-end="url(#arrow)"/>
<line x1="340" y1="446" x2="340" y2="490" class="arr" marker-end="url(#arrow)"/>
<line x1="465" y1="446" x2="520" y2="490" class="arr" marker-end="url(#arrow)"/>
<!-- Outcome: Admission -->
<g class="node c-coral">
<rect x="80" y="490" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="160" y="510" text-anchor="middle" dominant-baseline="central">Admission</text>
<text class="ts" x="160" y="530" text-anchor="middle" dominant-baseline="central">Inpatient ward</text>
</g>
<!-- Outcome: Surgery -->
<g class="node c-coral">
<rect x="260" y="490" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="340" y="510" text-anchor="middle" dominant-baseline="central">Surgery</text>
<text class="ts" x="340" y="530" text-anchor="middle" dominant-baseline="central">Operating room</text>
</g>
<!-- Outcome: Discharge -->
<g class="node c-coral">
<rect x="440" y="490" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="520" y="510" text-anchor="middle" dominant-baseline="central">Discharge</text>
<text class="ts" x="520" y="530" text-anchor="middle" dominant-baseline="central">Home with instructions</text>
</g>
<!-- Legend -->
<text class="ts" x="140" y="580" opacity=".5">Priority levels</text>
<g class="c-red"><rect x="140" y="592" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="162" y="604">Critical</text>
<g class="c-amber"><rect x="240" y="592" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="262" y="604">Urgent</text>
<g class="c-green"><rect x="340" y="592" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="362" y="604">Stable</text>
</svg>
```
## Color Assignments
| Element | Color | Reason |
|---------|-------|--------|
| Entry points (Ambulance, Walk-in) | `c-gray` | Neutral starting points |
| Triage | `c-purple` | Processing/assessment step |
| Trauma bay | `c-red` | Critical priority (semantic) |
| Exam rooms | `c-amber` | Urgent priority (semantic) |
| Waiting area | `c-green` | Stable priority (semantic) |
| Diagnostics | `c-teal` | Clinical services category |
| Outcomes | `c-coral` | Final disposition category |
## Layout Notes
- **ViewBox**: 680×620 (standard width, extended height for 5 stages)
- **Stage spacing**: ~110-130px between stage rows
- **Diagonal arrows**: Connect nodes across columns naturally
- **Container with inner nodes**: Diagnostics uses outer `c-teal` rect with inner node rects
@@ -0,0 +1,114 @@
# ML Benchmark Grouped Bar Chart with Dual Axis
A quantitative data visualization comparing LLM inference speed across quantization levels with dual Y-axes, threshold markers, and an inset accuracy table.
## Key Patterns Used
- **Grouped bars**: Min/max range pairs per category using semantic color pairs (lighter=min, darker=max)
- **Dual Y-axis**: Left axis for primary metric (tok/s), right axis for secondary metric (VRAM GB)
- **Overlay line graph**: `<polyline>` with labeled dots showing VRAM usage across categories
- **Threshold marker**: Dashed red horizontal line indicating hardware limit (24 GB GPU)
- **Zone annotations**: Subtle text labels above/below threshold for context
- **Inset data table**: Alternating row fills below chart with quantitative accuracy data
- **Semantic color coding**: Each quantization level gets its own color from the skill palette (red=OOM, amber=slow, teal=sweet spot, blue=fast)
## Diagram Type
This is a **quantitative data chart** with:
- **Grouped vertical bars**: Range bars showing minmax performance per category
- **Secondary axis line**: VRAM usage overlaid as a connected scatter plot
- **Threshold annotation**: Hardware constraint line
- **Inset table**: Supporting accuracy metrics
## Chart Layout Formula
```
Chart area: x=90590, y=70410 (500px wide, 340px tall)
Left Y-axis: Primary metric (tok/s)
y = 410 (val / max_val) × 340
Right Y-axis: Secondary metric (VRAM GB)
Same formula, different scale labels
Groups: Divide width by number of categories
Bars: Each group → min bar (34px) + 8px gap + max bar (34px)
Line overlay: <polyline> connecting data points across group centers
Threshold: Horizontal dashed line at critical value
Table: Below chart, alternating row fills
```
## Data Mapped
| Quantization | Model Size | Speed (tok/s) | VRAM (GB) | MMLU Pro | Status |
|-------------|-----------|---------------|-----------|----------|--------|
| FP16 | 62 GB | 0.52 | 62 | 75.2 | OOM / unusable |
| Q8_0 | 32 GB | 35 | 32 | 75.0 | Partial offload |
| Q4_K_M | 16.8 GB | 812 | 16.8 | 73.1 | Fits in VRAM ✓ |
| IQ3_M | 12 GB | 1215 | 12 | 70.5 | Full GPU speed |
## Bar CSS Classes
```css
/* Light mode */
.bar-fp16-min { fill: #FCEBEB; stroke: #A32D2D; stroke-width: 0.75; }
.bar-fp16-max { fill: #F7C1C1; stroke: #A32D2D; stroke-width: 0.75; }
.bar-q8-min { fill: #FAEEDA; stroke: #854F0B; stroke-width: 0.75; }
.bar-q8-max { fill: #FAC775; stroke: #854F0B; stroke-width: 0.75; }
.bar-q4-min { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 0.75; }
.bar-q4-max { fill: #9FE1CB; stroke: #0F6E56; stroke-width: 0.75; }
.bar-iq3-min { fill: #E6F1FB; stroke: #185FA5; stroke-width: 0.75; }
.bar-iq3-max { fill: #B5D4F4; stroke: #185FA5; stroke-width: 0.75; }
/* Dark mode */
@media (prefers-color-scheme: dark) {
.bar-fp16-min { fill: #501313; stroke: #F09595; }
.bar-fp16-max { fill: #791F1F; stroke: #F09595; }
.bar-q8-min { fill: #412402; stroke: #EF9F27; }
.bar-q8-max { fill: #633806; stroke: #EF9F27; }
.bar-q4-min { fill: #04342C; stroke: #5DCAA5; }
.bar-q4-max { fill: #085041; stroke: #5DCAA5; }
.bar-iq3-min { fill: #042C53; stroke: #85B7EB; }
.bar-iq3-max { fill: #0C447C; stroke: #85B7EB; }
}
```
## Overlay Line CSS
```css
.vram-line { stroke: #534AB7; stroke-width: 2.5; fill: none; }
.vram-dot { fill: #534AB7; stroke: var(--bg-primary); stroke-width: 2; }
.vram-label { font-family: system-ui, sans-serif; font-size: 10px; fill: #534AB7; font-weight: 500; }
```
## Threshold CSS
```css
.threshold { stroke: #A32D2D; stroke-width: 1; stroke-dasharray: 6 3; fill: none; }
.threshold-label { font-family: system-ui, sans-serif; font-size: 10px; fill: #A32D2D; font-weight: 500; }
```
## Table CSS
```css
.tbl-header { fill: var(--bg-secondary); stroke: var(--border); stroke-width: 0.5; }
.tbl-row { fill: transparent; stroke: var(--border); stroke-width: 0.25; }
.tbl-alt { fill: var(--bg-secondary); stroke: var(--border); stroke-width: 0.25; }
```
## Layout Notes
- **ViewBox**: 680×660 (portrait, chart + legend + table)
- **Chart area**: y=70410, x=90590
- **Legend row**: y=458470
- **Inset table**: y=490620
- **Bar width**: 34px each, 8px gap between min/max pair
- **Group spacing**: 125px center-to-center
- **Dot halo**: White circle (r=6) behind colored dot (r=5) for legibility over bars/grid
## When to Use This Pattern
Use this diagram style for:
- Model benchmark comparisons across quantization levels
- Performance vs. resource usage tradeoff analysis
- Any multi-metric comparison with a hardware/software constraint
- GPU/TPU/accelerator benchmarking dashboards
- Accuracy vs. speed Pareto frontiers
- Hardware requirement sizing charts
@@ -0,0 +1,325 @@
# Place Order — UML Sequence Diagram
A UML sequence diagram for the 'Place Order' use case in an e-commerce system. Six lifelines (:Customer, :ShoppingCart, :OrderController, :PaymentGateway, :InventorySystem, :EmailService) interact across 14 numbered messages. An **alt** combined fragment (amber) covers the three conditional outcomes — payment authorized, payment failed, and item unavailable. A **par** combined fragment (teal) nested inside the success branch shows concurrent email confirmation and stock-level update. Demonstrates activation bars, two distinct arrowhead types, UML pentagon fragment tags, and guard conditions.
## Key Patterns Used
- **6 lifelines at equal spacing**: Lifeline centers placed at x=90, 190, 290, 390, 490, 590 (100px apart) so the first box left-edge lands at x=40 and the last right-edge lands at x=640 — exactly filling the safe area
- **Two-row actor headers**: Each lifeline box shows `":"` (small, tertiary color) on one line and the class name (slightly larger, bold) on a second line, matching the UML anonymous-instance notation `:ClassName`
- **Two separate arrowhead markers**: `#arr-call` is a filled triangle (`<polygon>`) for synchronous calls; `#arr-ret` is an open chevron (`fill="none"`) for dashed return messages — both use `context-stroke` to inherit line color
- **Activation bars**: Narrow 8px-wide rectangles (`class="activation"`) layered on top of lifeline stems to show object execution periods; OrderController's bar spans the entire interaction; shorter bars mark PaymentGateway, InventorySystem, and EmailService during their active windows
- **Combined fragment pentagon tag**: Each `alt` / `par` frame uses a `<polygon>` dog-eared label shape in the top-left corner — points follow the pattern `(x,y) (x+w,y) (x+w+6,y+6) (x+w+6,y+18) (x,y+18)` creating the characteristic UML notch
- **Nested par inside alt**: The `par` rect (teal) sits inside branch 1 of the `alt` rect (amber); inner rect uses inset x/y (+15/+2) so both borders remain visible and distinguishable
- **Guard conditions**: Italic text in `[square brackets]` placed immediately after each alt frame divider line, or just inside the top frame for branch 1 — rendered with a dedicated `guard-lbl` class (italic, amber color)
- **Alt branch dividers**: Solid horizontal lines (`.frag-alt-div`) span the full alt rect width to separate the three branches; par branch separator uses a dashed line (`.frag-par-div`) per UML spec
- **Lifeline end caps**: Short 14px horizontal tick marks at y=590 (bottom of all lifeline stems) to formally terminate each lifeline
- **Message sequence annotation**: A faint counter row below the legend (①–③ / ④–⑩ / ⑪–⑫ / ⑬–⑭) explains the four message groups without adding noise to the diagram body
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 648" xmlns="http://www.w3.org/2000/svg">
<defs>
<!-- Open chevron arrowhead — return messages -->
<marker id="arr-ret" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
<!-- Filled triangle arrowhead — synchronous calls -->
<marker id="arr-call" viewBox="0 0 10 10" refX="9" refY="5"
markerWidth="7" markerHeight="7" orient="auto">
<polygon points="0,1 10,5 0,9" fill="context-stroke"/>
</marker>
</defs>
<!--
Lifeline centres (x):
L1 :Customer → 90
L2 :ShoppingCart → 190
L3 :OrderController → 290
L4 :PaymentGateway → 390
L5 :InventorySystem → 490
L6 :EmailService → 590
Actor boxes: x = cx50, y=20, w=100, h=56, rx=6
Lifelines: x = cx, y1=76, y2=590
-->
<!-- ── 1. LIFELINE DASHED STEMS (drawn first, behind everything) ── -->
<line x1="90" y1="76" x2="90" y2="590" class="lifeline"/>
<line x1="190" y1="76" x2="190" y2="590" class="lifeline"/>
<line x1="290" y1="76" x2="290" y2="590" class="lifeline"/>
<line x1="390" y1="76" x2="390" y2="590" class="lifeline"/>
<line x1="490" y1="76" x2="490" y2="590" class="lifeline"/>
<line x1="590" y1="76" x2="590" y2="590" class="lifeline"/>
<!-- ── 2. ACTOR HEADER BOXES ── -->
<!-- :Customer -->
<rect x="40" y="20" width="100" height="56" rx="6" class="actor"/>
<text class="actor-colon" x="90" y="40" text-anchor="middle" dominant-baseline="central">:</text>
<text class="actor-name" x="90" y="58" text-anchor="middle" dominant-baseline="central">Customer</text>
<!-- :ShoppingCart -->
<rect x="140" y="20" width="100" height="56" rx="6" class="actor"/>
<text class="actor-colon" x="190" y="37" text-anchor="middle" dominant-baseline="central">:</text>
<text class="actor-name" x="190" y="55" text-anchor="middle" dominant-baseline="central">ShoppingCart</text>
<!-- :OrderController -->
<rect x="240" y="20" width="100" height="56" rx="6" class="actor"/>
<text class="actor-colon" x="290" y="37" text-anchor="middle" dominant-baseline="central">:</text>
<text class="actor-name" x="290" y="55" text-anchor="middle" dominant-baseline="central">OrderController</text>
<!-- :PaymentGateway -->
<rect x="340" y="20" width="100" height="56" rx="6" class="actor"/>
<text class="actor-colon" x="390" y="37" text-anchor="middle" dominant-baseline="central">:</text>
<text class="actor-name" x="390" y="55" text-anchor="middle" dominant-baseline="central">PaymentGateway</text>
<!-- :InventorySystem -->
<rect x="440" y="20" width="100" height="56" rx="6" class="actor"/>
<text class="actor-colon" x="490" y="37" text-anchor="middle" dominant-baseline="central">:</text>
<text class="actor-name" x="490" y="55" text-anchor="middle" dominant-baseline="central">InventorySystem</text>
<!-- :EmailService -->
<rect x="540" y="20" width="100" height="56" rx="6" class="actor"/>
<text class="actor-colon" x="590" y="37" text-anchor="middle" dominant-baseline="central">:</text>
<text class="actor-name" x="590" y="55" text-anchor="middle" dominant-baseline="central">EmailService</text>
<!-- ── 3. ACTIVATION BARS ── -->
<!-- ShoppingCart: active while forwarding checkout → placeOrder -->
<rect x="186" y="102" width="8" height="26" rx="1" class="activation"/>
<!-- OrderController: active throughout full sequence -->
<rect x="286" y="128" width="8" height="415" rx="1" class="activation"/>
<!-- PaymentGateway: active during auth check (happy-path branch only) -->
<rect x="386" y="154" width="8" height="46" rx="1" class="activation"/>
<!-- InventorySystem: active from reserveItems → updateStockLevels end -->
<rect x="486" y="225" width="8" height="128" rx="1" class="activation"/>
<!-- EmailService: active during confirmation send -->
<rect x="586" y="290" width="8" height="25" rx="1" class="activation"/>
<!-- ── 4. PRE-ALT MESSAGES ── -->
<!-- ① checkout() :Customer → :ShoppingCart -->
<line x1="90" y1="102" x2="186" y2="102" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="140" y="97" text-anchor="middle">checkout()</text>
<!-- ② placeOrder(cartItems) :ShoppingCart → :OrderController -->
<line x1="194" y1="128" x2="286" y2="128" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="242" y="123" text-anchor="middle">placeOrder(cartItems)</text>
<!-- ③ authorizePayment(amount) :OrderController → :PaymentGateway -->
<line x1="294" y1="154" x2="386" y2="154" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="342" y="149" text-anchor="middle">authorizePayment(amount)</text>
<!-- ── 5. ALT COMBINED FRAGMENT y=166 → y=563 ── -->
<!-- Outer alt rectangle -->
<rect x="45" y="166" width="590" height="397" rx="3" class="frag-alt-bg"/>
<!-- Pentagon "alt" tag: TL corner notch shape -->
<polygon points="45,166 84,166 90,173 90,185 45,185" class="frag-alt-tag"/>
<text class="frag-alt-kw" x="67" y="178" text-anchor="middle" dominant-baseline="central">alt</text>
<!-- Guard: branch 1 -->
<text class="guard-lbl" x="96" y="179" dominant-baseline="central">[payment authorized]</text>
<!-- ─── Branch 1: payment authorized ─── -->
<!-- ④ « authorized » :PaymentGateway → :OrderController (dashed return) -->
<line x1="386" y1="200" x2="294" y2="200" class="msg-ret" marker-end="url(#arr-ret)"/>
<text class="rlbl" x="342" y="195" text-anchor="middle">« authorized »</text>
<!-- ⑤ reserveItems(cartItems) :OrderController → :InventorySystem -->
<line x1="294" y1="225" x2="486" y2="225" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="392" y="220" text-anchor="middle">reserveItems(cartItems)</text>
<!-- ⑥ « itemsReserved » :InventorySystem → :OrderController (dashed return) -->
<line x1="486" y1="250" x2="294" y2="250" class="msg-ret" marker-end="url(#arr-ret)"/>
<text class="rlbl" x="392" y="245" text-anchor="middle">« itemsReserved »</text>
<!-- ── 6. PAR COMBINED FRAGMENT (nested inside alt branch 1) y=266 → y=373 ── -->
<!-- Inner par rectangle -->
<rect x="60" y="266" width="560" height="107" rx="3" class="frag-par-bg"/>
<!-- Pentagon "par" tag -->
<polygon points="60,266 97,266 102,272 102,284 60,284" class="frag-par-tag"/>
<text class="frag-par-kw" x="81" y="275" text-anchor="middle" dominant-baseline="central">par</text>
<!-- Par branch 1: email confirmation -->
<!-- ⑦ sendConfirmationEmail() :OrderController → :EmailService -->
<line x1="294" y1="295" x2="586" y2="295" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="442" y="290" text-anchor="middle">sendConfirmationEmail()</text>
<!-- ⑧ « emailQueued » :EmailService → :OrderController (dashed return) -->
<line x1="586" y1="318" x2="294" y2="318" class="msg-ret" marker-end="url(#arr-ret)"/>
<text class="rlbl" x="442" y="313" text-anchor="middle">« emailQueued »</text>
<!-- Par branch divider (dashed, per UML spec) -->
<line x1="60" y1="336" x2="620" y2="336" class="frag-par-div"/>
<!-- Par branch 2: stock level update -->
<!-- ⑨ updateStockLevels() :OrderController → :InventorySystem -->
<line x1="294" y1="355" x2="486" y2="355" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="392" y="350" text-anchor="middle">updateStockLevels()</text>
<!-- PAR fragment ends at y=373 -->
<!-- ⑩ « orderPlaced » :OrderController → :Customer (dashed return, after par) -->
<line x1="286" y1="395" x2="90" y2="395" class="msg-ret" marker-end="url(#arr-ret)"/>
<text class="rlbl" x="190" y="390" text-anchor="middle">« orderPlaced »</text>
<!-- ─── Alt else: [payment failed] ─── -->
<!-- Alt branch divider 1 (solid line) -->
<line x1="45" y1="415" x2="635" y2="415" class="frag-alt-div"/>
<text class="guard-lbl" x="50" y="429" dominant-baseline="central">[payment failed]</text>
<!-- ⑪ « authFailed » :PaymentGateway → :OrderController (dashed return) -->
<line x1="390" y1="448" x2="294" y2="448" class="msg-ret" marker-end="url(#arr-ret)"/>
<text class="rlbl" x="344" y="443" text-anchor="middle">« authFailed »</text>
<!-- ⑫ error(PAYMENT_FAILED) :OrderController → :Customer -->
<line x1="286" y1="470" x2="90" y2="470" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="190" y="465" text-anchor="middle">error(PAYMENT_FAILED)</text>
<!-- ─── Alt else: [item unavailable] ─── -->
<!-- Alt branch divider 2 (solid line) -->
<line x1="45" y1="490" x2="635" y2="490" class="frag-alt-div"/>
<text class="guard-lbl" x="50" y="504" dominant-baseline="central">[item unavailable]</text>
<!-- ⑬ « unavailable » :InventorySystem → :OrderController (dashed return) -->
<line x1="486" y1="523" x2="294" y2="523" class="msg-ret" marker-end="url(#arr-ret)"/>
<text class="rlbl" x="392" y="518" text-anchor="middle">« unavailable »</text>
<!-- ⑭ error(ITEM_UNAVAILABLE) :OrderController → :Customer -->
<line x1="286" y1="545" x2="90" y2="545" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="190" y="540" text-anchor="middle">error(ITEM_UNAVAILABLE)</text>
<!-- ALT fragment ends at y=563 -->
<!-- ── 7. LIFELINE END CAPS (short horizontal tick at y=590) ── -->
<line x1="83" y1="590" x2="97" y2="590" stroke="var(--text-tertiary)" stroke-width="1.5"/>
<line x1="183" y1="590" x2="197" y2="590" stroke="var(--text-tertiary)" stroke-width="1.5"/>
<line x1="283" y1="590" x2="297" y2="590" stroke="var(--text-tertiary)" stroke-width="1.5"/>
<line x1="383" y1="590" x2="397" y2="590" stroke="var(--text-tertiary)" stroke-width="1.5"/>
<line x1="483" y1="590" x2="497" y2="590" stroke="var(--text-tertiary)" stroke-width="1.5"/>
<line x1="583" y1="590" x2="597" y2="590" stroke="var(--text-tertiary)" stroke-width="1.5"/>
<!-- ── 8. LEGEND ── -->
<text class="ts" x="45" y="612" opacity=".45">Legend —</text>
<line x1="110" y1="609" x2="148" y2="609"
stroke="var(--text-primary)" stroke-width="1.5" marker-end="url(#arr-call)"/>
<text class="ts" x="154" y="613" opacity=".75">Synchronous call</text>
<line x1="288" y1="609" x2="326" y2="609"
stroke="var(--text-secondary)" stroke-width="1.5"
stroke-dasharray="5 3" marker-end="url(#arr-ret)"/>
<text class="ts" x="332" y="613" opacity=".75">Return message</text>
<rect x="458" y="603" width="22" height="13" rx="2"
fill="#FAEEDA" fill-opacity="0.5" stroke="#854F0B" stroke-width="0.75"/>
<text class="ts" x="484" y="613" opacity=".75">alt fragment</text>
<rect x="558" y="603" width="22" height="13" rx="2"
fill="#E1F5EE" fill-opacity="0.6" stroke="#0F6E56" stroke-width="0.75"/>
<text class="ts" x="584" y="613" opacity=".75">par fragment</text>
<!-- Message group annotation -->
<text class="ts" x="45" y="632" opacity=".35">
①–③ pre-condition · ④–⑩ happy path · ⑪–⑫ payment failure · ⑬–⑭ item unavailable
</text>
</svg>
```
## Custom CSS
Add these classes to the hosting page `<style>` block (in addition to the standard skill CSS):
```css
/* ── Actor lifeline header boxes ── */
.actor { fill: var(--bg-secondary); stroke: var(--text-secondary); stroke-width: 0.5; }
.actor-name { font-family: system-ui, sans-serif; font-size: 11.5px; font-weight: 600;
fill: var(--text-primary); }
.actor-colon { font-family: system-ui, sans-serif; font-size: 10px; fill: var(--text-tertiary); }
/* ── Lifeline dashed stems ── */
.lifeline { stroke: var(--text-tertiary); stroke-width: 1; stroke-dasharray: 6 4; fill: none; }
/* ── Activation bars ── */
.activation { fill: var(--bg-secondary); stroke: var(--text-secondary); stroke-width: 0.75; }
/* ── Message arrows ── */
.msg-call { stroke: var(--text-primary); stroke-width: 1.5; fill: none; }
.msg-ret { stroke: var(--text-secondary); stroke-width: 1.5; fill: none; stroke-dasharray: 6 3; }
/* ── Message labels ── */
.mlbl { font-family: system-ui, sans-serif; font-size: 11px; fill: var(--text-primary); }
.rlbl { font-family: system-ui, sans-serif; font-size: 11px; fill: var(--text-secondary);
font-style: italic; }
/* ── Combined fragment: alt (amber) ── */
.frag-alt-bg { fill: #FAEEDA; fill-opacity: 0.18; stroke: #854F0B; stroke-width: 1; }
.frag-alt-tag { fill: #FAEEDA; stroke: #854F0B; stroke-width: 0.75; }
.frag-alt-kw { font-family: system-ui, sans-serif; font-size: 11px; font-weight: 700;
fill: #633806; }
.frag-alt-div { stroke: #854F0B; stroke-width: 0.75; fill: none; }
.guard-lbl { font-family: system-ui, sans-serif; font-size: 10.5px; font-style: italic;
fill: #854F0B; }
/* ── Combined fragment: par (teal) ── */
.frag-par-bg { fill: #E1F5EE; fill-opacity: 0.35; stroke: #0F6E56; stroke-width: 1; }
.frag-par-tag { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 0.75; }
.frag-par-kw { font-family: system-ui, sans-serif; font-size: 11px; font-weight: 700;
fill: #085041; }
.frag-par-div { stroke: #0F6E56; stroke-width: 0.75; stroke-dasharray: 5 3; fill: none; }
/* ── Dark mode overrides ── */
@media (prefers-color-scheme: dark) {
.actor { fill: #2c2c2a; stroke: #b4b2a9; }
.actor-name { fill: #e8e6de; }
.actor-colon { fill: #888780; }
.frag-alt-bg { fill: #633806; fill-opacity: 0.25; stroke: #EF9F27; }
.frag-alt-tag { fill: #633806; stroke: #EF9F27; }
.frag-alt-kw { fill: #FAC775; }
.frag-alt-div { stroke: #EF9F27; }
.guard-lbl { fill: #EF9F27; }
.frag-par-bg { fill: #085041; fill-opacity: 0.35; stroke: #5DCAA5; }
.frag-par-tag { fill: #085041; stroke: #5DCAA5; }
.frag-par-kw { fill: #9FE1CB; }
.frag-par-div { stroke: #5DCAA5; }
}
```
## Color Assignments
| Element | Color | Reason |
|---------|-------|--------|
| Actor header boxes | Neutral (`var(--bg-secondary)`) | Structural / non-semantic — all lifelines share one style |
| Activation bars | Neutral (`var(--bg-secondary)`) | Show execution periods without adding semantic color |
| Synchronous call arrows | `var(--text-primary)` + filled triangle | High contrast for calls — the primary interaction direction |
| Return / dashed arrows | `var(--text-secondary)` + open chevron | Lower contrast for returns — secondary flow direction |
| `alt` fragment | Amber (`#FAEEDA` / `#854F0B`) | Warning / conditional — matches `c-amber` semantic meaning |
| Guard condition text | Amber italic | Belongs visually to the alt fragment |
| `par` fragment | Teal (`#E1F5EE` / `#0F6E56`) | Concurrent success path — matches `c-teal` semantic meaning |
| Alt branch dividers | Amber solid line | Continuity with the alt frame color |
| Par branch divider | Teal dashed line | UML spec: par branches separated by dashed lines |
## Layout Notes
- **ViewBox**: 680×648 (standard width; height = lifeline bottom y=590 + legend + annotation + 16px buffer)
- **Lifeline spacing formula**: `(safe_area_width) / (n_lifelines 1) = 600 / 5 = 120px` — but use `spacing = 100px` starting at `x=90` so that first box left = 40 and last box right = 640 exactly
- **Actor box split-label trick**: Two separate `<text>` elements per box — one for `":"` (10px, tertiary color) and one for the class name (11.5px bold, primary color) — avoids the 14px font needing ~150px+ per box for long names like "OrderController"
- **Pentagon tag formula**: For a fragment starting at `(fx, fy)`, the tag polygon points are `(fx,fy) (fx+w,fy) (fx+w+6,fy+6) (fx+w+6,fy+18) (fx,fy+18)` where `w` = approximate text width of the keyword + 8px padding each side
- **Nested fragment inset**: The `par` rect uses `x = alt_x + 15` and `y = alt_y_current + 2` so both borders remain simultaneously visible — inset enough to separate visually, not so much that it wastes vertical space
- **Activation bar placement**: `x = lifeline_cx 4`, `width = 8` — centered on the lifeline and narrow enough not to obscure the dashed stem behind it
- **Message label y-offset**: All labels are placed at `y = arrow_y 5` to sit just above the arrow line; this applies to both left-going and right-going arrows since `text-anchor="middle"` handles horizontal centering automatically
- **Return arrows entering activation bars**: End `x1/x2` at lifeline center (e.g. x=294 for OrderController) rather than the bar edge (x=286) — the small overlap is intentional and clarifies the target object
- **Alt guard label placement**: Branch 1 guard goes at `y = frame_top + 13` to the right of the pentagon tag; subsequent branch guards go at `divider_y + 14` so they sit just inside the new branch
- **Lifeline end cap pattern**: `<line x1="cx7" y1="590" x2="cx+7" y2="590" stroke-width="1.5"/>` — a simple symmetric tick, no special marker needed
@@ -0,0 +1,173 @@
# Smart City Infrastructure
A multi-system integration diagram showing interconnected city infrastructure (power, water, transport) connected through a central IoT platform with a citizen dashboard on top. Demonstrates hub-spoke layout, diverse physical shapes, and UI mockups.
## Key Patterns Used
- **Hub-spoke layout**: Central IoT platform with radiating data connections to subsystems
- **Connection dots**: Visual indicators where data lines attach to the central hub
- **Dashboard/UI mockup**: Screen with mini-charts, gauges, and status indicators
- **Multi-system integration**: Three independent systems unified by central platform
- **Semantic line styles**: Different stroke styles for data (dashed), power, water, roads
- **Physical infrastructure shapes**: Solar panels, wind turbines, dams, pipes, roads, vehicles
## New Shape Techniques
### Solar Panels (angled polygons with grid lines)
```xml
<polygon class="solar-panel" points="0,25 35,8 38,12 3,29"/>
<line class="solar-frame" x1="12" y1="22" x2="24" y2="13"/>
<line x1="19" y1="29" x2="19" y2="40" stroke="#5F5E5A" stroke-width="2"/>
```
### Wind Turbine (tower + nacelle + blades)
```xml
<!-- Tapered tower -->
<polygon class="wind-tower" points="20,70 30,70 28,25 22,25"/>
<!-- Nacelle -->
<rect class="wind-hub" x="18" y="20" width="14" height="8" rx="2"/>
<!-- Hub -->
<circle class="wind-hub" cx="25" cy="18" r="5"/>
<!-- Blades (rotated ellipses) -->
<ellipse class="wind-blade" cx="25" cy="5" rx="3" ry="13"/>
<ellipse class="wind-blade" cx="14" cy="26" rx="3" ry="13" transform="rotate(-120, 25, 18)"/>
<ellipse class="wind-blade" cx="36" cy="26" rx="3" ry="13" transform="rotate(120, 25, 18)"/>
```
### Battery with Charge Level
```xml
<rect class="battery" x="0" y="0" width="45" height="65" rx="5"/>
<!-- Terminals -->
<rect x="10" y="-6" width="10" height="8" rx="2" fill="#27500A"/>
<rect x="25" y="-6" width="10" height="8" rx="2" fill="#27500A"/>
<!-- Charge level fill -->
<rect class="battery-level" x="5" y="12" width="35" height="48" rx="3"/>
<text x="22" y="42" text-anchor="middle" fill="#173404" style="font-size:10px">85%</text>
```
### Dam/Reservoir with Water Waves
```xml
<!-- Dam wall -->
<polygon class="reservoir-wall" points="0,60 10,0 70,0 80,60"/>
<!-- Water behind dam -->
<polygon class="water" points="12,10 68,10 68,55 75,55 75,58 5,58 5,55 12,55"/>
<!-- Wave effect -->
<path d="M 15 25 Q 25 22 35 25 Q 45 28 55 25" fill="none" stroke="#378ADD" stroke-width="1" opacity="0.5"/>
```
### Pipe Network with Joints and Valves
```xml
<path class="pipe" d="M 80 85 L 110 85"/>
<circle class="pipe-joint" cx="10" cy="30" r="8"/>
<circle class="valve" cx="190" cy="85" r="6"/>
<!-- Distribution branches -->
<path class="pipe-thin" d="M 18 30 L 50 30"/>
<path class="pipe-thin" d="M 10 22 L 10 5 L 50 5"/>
```
### Road Intersection with Lane Markings
```xml
<!-- Road surface -->
<line class="road" x1="0" y1="50" x2="170" y2="50"/>
<line class="road-mark" x1="10" y1="50" x2="160" y2="50"/>
<!-- Cross road -->
<line class="road" x1="85" y1="0" x2="85" y2="100"/>
<line class="road-mark" x1="85" y1="10" x2="85" y2="90"/>
<!-- Embedded sensors -->
<circle class="sensor" cx="40" cy="50" r="5"/>
```
### Traffic Light with Signal States
```xml
<rect class="traffic-light" x="0" y="0" width="14" height="32" rx="3"/>
<circle class="light-red" cx="7" cy="8" r="4"/>
<circle class="light-off" cx="7" cy="16" r="4"/>
<circle class="light-off" cx="7" cy="24" r="4"/>
```
### Bus with Windows and Wheels
```xml
<rect class="bus" x="0" y="0" width="55" height="28" rx="6"/>
<!-- Windows -->
<rect class="bus-window" x="5" y="5" width="12" height="12" rx="2"/>
<rect class="bus-window" x="20" y="5" width="12" height="12" rx="2"/>
<!-- Wheels with hubcaps -->
<circle cx="14" cy="30" r="6" fill="#2C2C2A"/>
<circle cx="14" cy="30" r="3" fill="#5F5E5A"/>
```
### Dashboard UI Mockup
```xml
<!-- Monitor frame -->
<rect class="dashboard" x="0" y="0" width="200" height="120" rx="8"/>
<!-- Screen -->
<rect class="screen" x="10" y="10" width="180" height="85" rx="4"/>
<!-- Mini bar chart -->
<rect class="screen-content" x="18" y="18" width="50" height="35" rx="2"/>
<rect class="screen-chart" x="22" y="38" width="8" height="12"/>
<rect class="screen-chart" x="33" y="32" width="8" height="18"/>
<!-- Gauge -->
<circle class="screen-bar" cx="100" cy="35" r="12"/>
<text x="100" y="39" text-anchor="middle" fill="#E8E6DE" style="font-size:8px">78%</text>
<!-- Status indicators -->
<circle cx="35" cy="74" r="6" fill="#97C459"/>
<circle cx="75" cy="74" r="6" fill="#97C459"/>
<circle cx="115" cy="74" r="6" fill="#EF9F27"/>
```
### Hexagonal IoT Hub with Connection Points
```xml
<!-- Outer hexagon -->
<polygon class="iot-hex" points="0,-45 39,-22 39,22 0,45 -39,22 -39,-22"/>
<!-- Inner hexagon -->
<polygon class="iot-inner" points="0,-20 17,-10 17,10 0,20 -17,10 -17,-10"/>
<!-- Connection dots on data lines -->
<circle cx="321" cy="248" r="4" fill="#7F77DD"/>
```
## CSS Classes for Infrastructure
```css
/* Power system */
.solar-panel { fill: #3C3489; stroke: #534AB7; stroke-width: 0.5; }
.solar-frame { fill: none; stroke: #EEEDFE; stroke-width: 0.5; }
.wind-tower { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1; }
.wind-blade { fill: #F1EFE8; stroke: #888780; stroke-width: 0.5; }
.battery { fill: #27500A; stroke: #3B6D11; stroke-width: 1.5; }
.battery-level { fill: #97C459; }
.power-line { stroke: #EF9F27; stroke-width: 2; fill: none; }
/* Water system */
.reservoir-wall { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1; }
.water { fill: #85B7EB; stroke: #378ADD; stroke-width: 0.5; }
.pipe { fill: none; stroke: #378ADD; stroke-width: 4; stroke-linecap: round; }
.pipe-joint { fill: #185FA5; stroke: #0C447C; stroke-width: 1; }
.valve { fill: #0C447C; stroke: #185FA5; stroke-width: 1; }
/* Transport */
.road { stroke: #888780; stroke-width: 8; fill: none; stroke-linecap: round; }
.road-mark { stroke: #F1EFE8; stroke-width: 1; fill: none; stroke-dasharray: 6 4; }
.traffic-light { fill: #444441; stroke: #2C2C2A; stroke-width: 0.5; }
.light-red { fill: #E24B4A; }
.light-green { fill: #97C459; }
.light-off { fill: #2C2C2A; }
.bus { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 1.5; }
/* Data/IoT */
.data-line { stroke: #7F77DD; stroke-width: 2; fill: none; stroke-dasharray: 4 3; }
.iot-hex { fill: #EEEDFE; stroke: #534AB7; stroke-width: 2; }
/* Dashboard */
.dashboard { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1.5; }
.screen { fill: #1a1a18; }
.screen-chart { fill: #5DCAA5; }
```
## Layout Notes
- **ViewBox**: 720×620 (wider for three-column system layout)
- **Hub position**: Central IoT at (360, 270) - geometric center
- **Data lines**: Use quadratic curves or L-shaped paths, add connection dots at hub attachment points
- **System spacing**: ~200px width per system section
- **Vertical layers**: Dashboard (top) → IoT Hub (middle) → Systems (bottom)
- **Component grouping**: Use `<g transform="translate(x,y)">` for each major component for easy positioning
@@ -0,0 +1,154 @@
# Smartphone Layer Anatomy
An exploded view diagram showing all internal layers of a smartphone from front glass to back, with alternating left/right labels to avoid overlap. Demonstrates layered product teardown visualization and component detail.
## Key Patterns Used
- **Exploded vertical stack**: Layers separated vertically to show internal structure
- **Alternating labels**: Left/right label placement prevents text overlap
- **Component detail**: Chips, coils, lenses rendered with realistic shapes
- **Thickness scale**: Measurement indicator on the side
- **Progressive depth**: Each layer slightly offset to create 3D stack effect
## New Shape Techniques
### Capacitive Touch Grid
```xml
<rect class="digitizer" x="0" y="0" width="140" height="90" rx="14"/>
<g transform="translate(8, 8)">
<!-- Horizontal lines -->
<line class="digitizer-grid" x1="0" y1="15" x2="124" y2="15"/>
<line class="digitizer-grid" x1="0" y1="37" x2="124" y2="37"/>
<!-- Vertical lines -->
<line class="digitizer-grid" x1="20" y1="0" x2="20" y2="74"/>
<line class="digitizer-grid" x1="50" y1="0" x2="50" y2="74"/>
</g>
<!-- Touch point indicator -->
<circle cx="70" cy="45" r="12" fill="none" stroke="#7F77DD" stroke-width="2" opacity="0.6"/>
<circle cx="70" cy="45" r="5" fill="#7F77DD" opacity="0.4"/>
```
### OLED RGB Subpixels
```xml
<rect class="oled-panel" x="0" y="0" width="140" height="90" rx="12"/>
<g transform="translate(10, 10)">
<!-- RGB pixel group -->
<rect class="oled-subpixel-r" x="0" y="0" width="2" height="6"/>
<rect class="oled-subpixel-g" x="3" y="0" width="2" height="6"/>
<rect class="oled-subpixel-b" x="6" y="0" width="2" height="6"/>
<!-- Repeat pattern -->
<rect class="oled-subpixel-r" x="11" y="0" width="2" height="6"/>
<rect class="oled-subpixel-g" x="14" y="0" width="2" height="6"/>
<rect class="oled-subpixel-b" x="17" y="0" width="2" height="6"/>
</g>
```
### Logic Board with Chips
```xml
<rect class="pcb" x="0" y="0" width="116" height="106" rx="3"/>
<!-- PCB traces -->
<path class="pcb-trace" d="M 8 50 L 30 50 L 30 35"/>
<!-- CPU chip -->
<rect class="chip-cpu" x="30" y="20" width="55" height="35" rx="3"/>
<text class="chip-label" x="57" y="35" text-anchor="middle">A17 Pro</text>
<!-- RAM chip -->
<rect class="chip-ram" x="30" y="62" width="35" height="18" rx="2"/>
<text class="chip-label" x="47" y="74" text-anchor="middle">8GB RAM</text>
<!-- Storage chip -->
<rect class="chip-storage" x="30" y="85" width="55" height="16" rx="2"/>
<text class="chip-label" x="57" y="96" text-anchor="middle">256GB NAND</text>
```
### Camera Lens Array
```xml
<!-- Main camera -->
<circle class="camera-lens" cx="20" cy="20" r="18"/>
<circle class="camera-lens-inner" cx="20" cy="20" r="13"/>
<circle class="camera-sensor" cx="20" cy="20" r="8"/>
<circle cx="20" cy="20" r="3" fill="#1a1a18"/>
<!-- Secondary camera (smaller) -->
<circle class="camera-lens" cx="15" cy="15" r="13"/>
<circle class="camera-lens-inner" cx="15" cy="15" r="9"/>
<circle class="camera-sensor" cx="15" cy="15" r="5"/>
```
### Wireless Charging Coil with Magnets
```xml
<!-- Concentric coil rings -->
<circle class="charging-coil-outer" cx="0" cy="0" r="30"/>
<circle class="charging-coil" cx="0" cy="0" r="23"/>
<circle class="charging-coil" cx="0" cy="0" r="16"/>
<circle class="charging-coil" cx="0" cy="0" r="9"/>
<!-- MagSafe magnet ring -->
<circle class="magnet" cx="0" cy="-35" r="3"/>
<circle class="magnet" cx="25" cy="-25" r="3"/>
<circle class="magnet" cx="35" cy="0" r="3"/>
<circle class="magnet" cx="25" cy="25" r="3"/>
<!-- ... continue around circle -->
```
### Battery Cell
```xml
<rect class="battery" x="0" y="0" width="140" height="90" rx="10"/>
<rect class="battery-cell" x="10" y="12" width="120" height="60" rx="6"/>
<text x="70" y="38" text-anchor="middle" fill="#27500A" style="font-size:9px">Li-Ion Polymer</text>
<text x="70" y="52" text-anchor="middle" fill="#27500A" style="font-size:12px; font-weight:bold">4422 mAh</text>
<rect class="battery-connector" x="55" y="75" width="30" height="10" rx="2"/>
```
## CSS Classes
```css
/* Glass */
.front-glass { fill: #E8E6DE; stroke: #888780; stroke-width: 1; opacity: 0.9; }
.back-glass { fill: #2C2C2A; stroke: #444441; stroke-width: 1; }
/* Touch digitizer */
.digitizer { fill: #EEEDFE; stroke: #534AB7; stroke-width: 1; }
.digitizer-grid { stroke: #AFA9EC; stroke-width: 0.3; fill: none; }
/* OLED */
.oled-panel { fill: #1a1a18; stroke: #444441; stroke-width: 1; }
.oled-subpixel-r { fill: #E24B4A; }
.oled-subpixel-g { fill: #97C459; }
.oled-subpixel-b { fill: #378ADD; }
/* Midframe */
.midframe { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1.5; }
/* Logic board */
.pcb { fill: #0F6E56; stroke: #085041; stroke-width: 1; }
.pcb-trace { stroke: #5DCAA5; stroke-width: 0.3; fill: none; }
.chip-cpu { fill: #3C3489; stroke: #534AB7; stroke-width: 0.5; }
.chip-ram { fill: #185FA5; stroke: #378ADD; stroke-width: 0.5; }
.chip-storage { fill: #27500A; stroke: #3B6D11; stroke-width: 0.5; }
/* Battery */
.battery { fill: #EAF3DE; stroke: #3B6D11; stroke-width: 1.5; }
.battery-cell { fill: #97C459; stroke: #639922; stroke-width: 0.5; }
/* Camera */
.camera-lens { fill: #0C447C; stroke: #185FA5; stroke-width: 0.5; }
.camera-lens-inner { fill: #1a1a18; stroke: #378ADD; stroke-width: 0.3; }
.camera-sensor { fill: #3C3489; stroke: #534AB7; stroke-width: 0.3; }
/* Wireless charging */
.charging-coil { fill: none; stroke: #EF9F27; stroke-width: 1.5; }
.magnet { fill: #5F5E5A; stroke: #444441; stroke-width: 0.5; }
```
## Layout Notes
- **ViewBox**: 900×780 (tall for vertical stack)
- **Layer offset**: Each layer offset 10px right and down for depth effect
- **Label alternation**: Odd layers → RIGHT labels, Even layers → LEFT labels
- **Thickness scale**: Vertical measurement bar on left side
- **Front/Back markers**: Text labels at top and bottom
- **Chip labels**: Use small white text (6px) directly on chip shapes
@@ -0,0 +1,247 @@
# SN2 Reaction Mechanism
A chemistry diagram showing the bimolecular nucleophilic substitution (SN2) mechanism between hydroxide ion and methyl bromide. Demonstrates molecular structure rendering, electron movement arrows, transition state notation, and reaction energy profiles.
## Key Patterns Used
- **Molecular structures**: Ball-and-stick style atoms with bonds
- **Electron movement**: Curved arrows showing nucleophilic attack
- **Transition state**: Bracketed pentacoordinate intermediate with partial charges
- **Stereochemistry**: Wedge/dash bonds showing 3D configuration
- **Energy profile**: Potential energy vs reaction coordinate plot
- **Annotation boxes**: Key features and mechanistic notes
## Diagram Type
This is a **chemistry mechanism diagram** with:
- **Molecular rendering**: Atoms as colored circles with element symbols
- **Bond notation**: Solid, wedge, dash, and partial (dashed) bonds
- **Reaction arrows**: Curved for electron movement, straight for reaction progress
- **Energy landscape**: Quantitative energy profile below mechanism
## Molecular Structure Elements
### Atom Rendering
```xml
<!-- Carbon atom (dark) -->
<circle cx="0" cy="0" r="14" class="carbon"/>
<text class="chem" x="0" y="5" text-anchor="middle" fill="white" font-weight="500">C</text>
<!-- Oxygen atom (red) -->
<circle cx="0" cy="0" r="14" class="oxygen"/>
<text class="chem" x="0" y="5" text-anchor="middle" fill="white" font-weight="500">O</text>
<!-- Hydrogen atom (light with border) -->
<circle cx="38" cy="0" r="8" class="hydrogen"/>
<text class="chem-sm" x="38" y="4" text-anchor="middle">H</text>
<!-- Bromine atom (brown) -->
<circle cx="52" cy="0" r="16" class="bromine"/>
<text class="chem" x="52" y="5" text-anchor="middle" fill="white" font-weight="500">Br</text>
```
```css
.carbon { fill: #2C2C2A; }
.hydrogen { fill: #F1EFE8; stroke: #888780; stroke-width: 1; }
.oxygen { fill: #E24B4A; }
.bromine { fill: #993C1D; }
.nitrogen { fill: #378ADD; } /* for other reactions */
```
### Bond Types
```xml
<!-- Single bond (solid) -->
<line x1="14" y1="0" x2="38" y2="0" class="bond"/>
<!-- Wedge bond (coming toward viewer) -->
<polygon class="bond-wedge" points="0,-14 -6,-35 6,-35"/>
<!-- Dash bond (going away from viewer) -->
<line x1="-10" y1="10" x2="-28" y2="28" class="bond-dash"/>
<!-- Partial bond (forming/breaking) -->
<line x1="-40" y1="0" x2="-14" y2="0" class="bond-partial"/>
```
```css
.bond { stroke: var(--text-primary); stroke-width: 2.5; fill: none; stroke-linecap: round; }
.bond-thin { stroke: var(--text-primary); stroke-width: 1.5; fill: none; }
.bond-partial { stroke: var(--text-primary); stroke-width: 2; fill: none; stroke-dasharray: 4 3; }
.bond-wedge { fill: var(--text-primary); stroke: none; }
.bond-dash { stroke: var(--text-primary); stroke-width: 2; fill: none; stroke-dasharray: 2 2; }
```
### Lone Pairs and Charges
```xml
<!-- Lone pair electrons (dots) -->
<circle cx="-8" cy="-18" r="2" fill="var(--text-primary)"/>
<circle cx="0" cy="-18" r="2" fill="var(--text-primary)"/>
<!-- Formal negative charge -->
<text class="charge" x="12" y="-12" fill="#A32D2D" font-weight="bold">⊖</text>
<!-- Partial charges (delta notation) -->
<text class="partial" x="0" y="-18" text-anchor="middle" fill="#A32D2D">δ⁻</text>
<text class="partial" x="0" y="-22" text-anchor="middle" fill="#3B6D11">δ⁺</text>
```
```css
.charge { font-family: "Times New Roman", Georgia, serif; font-size: 12px; }
.partial { font-family: "Times New Roman", Georgia, serif; font-size: 11px; font-style: italic; }
```
### Curved Arrow (Electron Movement)
```xml
<defs>
<marker id="curved-arrow" viewBox="0 0 10 10" refX="8" refY="5" markerWidth="6" markerHeight="6" orient="auto">
<path d="M0,0 L10,5 L0,10 L3,5 Z" class="arrow-fill"/>
</marker>
</defs>
<!-- Nucleophilic attack arrow -->
<path d="M -5,15 Q 30,60 70,25" class="arrow-curved" marker-end="url(#curved-arrow)"/>
```
```css
.arrow-curved { stroke: #534AB7; stroke-width: 2; fill: none; }
.arrow-fill { fill: #534AB7; }
```
### Transition State Brackets
```xml
<!-- Left bracket -->
<path d="M -75,-70 L -85,-70 L -85,75 L -75,75" class="ts-bracket"/>
<!-- Right bracket -->
<path d="M 95,-70 L 105,-70 L 105,75 L 95,75" class="ts-bracket"/>
<!-- Double dagger symbol -->
<text class="chem" x="115" y="-60" fill="var(--text-primary)">‡</text>
```
```css
.ts-bracket { stroke: var(--text-primary); stroke-width: 1.5; fill: none; }
```
## Energy Profile Diagram
### Axes
```xml
<!-- Y-axis (Energy) -->
<line x1="0" y1="280" x2="0" y2="0" class="axis" marker-end="url(#straight-arrow)"/>
<text class="t" x="-15" y="-10" text-anchor="middle" transform="rotate(-90 -15 140)">Potential Energy</text>
<!-- X-axis (Reaction Coordinate) -->
<line x1="0" y1="280" x2="600" y2="280" class="axis" marker-end="url(#straight-arrow)"/>
<text class="t" x="580" y="305" text-anchor="middle">Reaction Coordinate</text>
```
### Energy Curve
```xml
<!-- Filled area under curve -->
<path class="energy-fill" d="
M 40,200
Q 150,200 250,50
Q 350,200 500,220
L 500,280 L 40,280 Z
"/>
<!-- Curve line -->
<path class="energy-curve" d="
M 40,200
Q 100,200 150,150
Q 200,80 250,50
Q 300,80 350,150
Q 400,210 500,220
"/>
```
```css
.energy-curve { stroke: #534AB7; stroke-width: 2.5; fill: none; }
.energy-fill { fill: rgba(83, 74, 183, 0.1); }
```
### Energy Levels and Annotations
```xml
<!-- Reactants level -->
<line x1="20" y1="200" x2="80" y2="200" stroke="#3B6D11" stroke-width="2"/>
<text class="ts" x="50" y="218" text-anchor="middle">Reactants</text>
<!-- Transition state peak -->
<circle cx="250" cy="50" r="5" fill="#534AB7"/>
<line x1="250" y1="50" x2="250" y2="280" class="energy-level"/>
<text class="ts" x="250" y="30" text-anchor="middle" fill="#534AB7" font-weight="500">Transition State [‡]</text>
<!-- Products level (lower = exergonic) -->
<line x1="470" y1="220" x2="530" y2="220" stroke="#3B6D11" stroke-width="2"/>
<!-- Activation energy arrow -->
<line x1="100" y1="200" x2="100" y2="55" class="delta-arrow" marker-end="url(#delta-arrow)"/>
<text class="ts" x="85" y="125" text-anchor="end" fill="#3B6D11">E<tspan baseline-shift="sub" font-size="8">a</tspan></text>
```
```css
.energy-level { stroke: var(--text-secondary); stroke-width: 1; stroke-dasharray: 4 2; fill: none; }
.delta-arrow { stroke: #3B6D11; stroke-width: 1.5; fill: none; }
.delta-fill { fill: #3B6D11; }
```
## Chemistry Text Styles
```css
/* Chemistry notation (serif font for formulas) */
.chem { font-family: "Times New Roman", Georgia, serif; font-size: 16px; fill: var(--text-primary); }
.chem-sm { font-family: "Times New Roman", Georgia, serif; font-size: 12px; fill: var(--text-primary); }
.chem-lg { font-family: "Times New Roman", Georgia, serif; font-size: 18px; fill: var(--text-primary); }
```
## Subscript/Superscript in SVG
```xml
<!-- Subscript using tspan -->
<text class="ts">E<tspan baseline-shift="sub" font-size="8">a</tspan></text>
<!-- Superscript for charges -->
<text class="chem-sm">OH⁻</text> <!-- Using Unicode superscript minus -->
<text class="chem-sm">CH₃Br</text> <!-- Using Unicode subscript 3 -->
```
## Color Coding
| Element | Color | Hex |
|---------|-------|-----|
| Carbon | Dark gray | #2C2C2A |
| Hydrogen | Light cream | #F1EFE8 |
| Oxygen | Red | #E24B4A |
| Bromine | Brown | #993C1D |
| Nitrogen | Blue | #378ADD |
| Electron arrows | Purple | #534AB7 |
| Positive charge | Green | #3B6D11 |
| Negative charge | Red | #A32D2D |
## Layout Notes
- **ViewBox**: 800×680 (landscape for mechanism + energy profile)
- **Mechanism section**: y=60-300, showing reactants → TS → products
- **Energy profile**: y=320-630, with axes and curve
- **Atom sizes**: C/O/Br ~12-16px radius, H ~7-8px radius
- **Bond lengths**: ~25-40px between atom centers
- **Spacing**: ~140px between mechanism stages
## When to Use This Pattern
Use this diagram style for:
- Organic reaction mechanisms (SN1, SN2, E1, E2, additions, eliminations)
- Reaction energy profiles and kinetics
- Stereochemistry illustrations
- Enzyme mechanism diagrams
- Transition state theory visualization
- Any chemistry concept requiring molecular structures
@@ -0,0 +1,338 @@
# Modern Onshore Wind Turbine Structure
A physical/structural cross-section diagram showing all major components of a modern wind turbine from underground foundation to blade tips.
## Key Patterns Used
- **Underground section**: Soil layers, deep concrete foundation with rebar reinforcement grid, spread footing
- **Cross-section view**: Tower wall thickness shown, internal components visible
- **Tapered tower**: Path elements creating realistic tower silhouette that narrows toward top
- **Internal access**: Ladder with rungs, elevator shaft inside tower
- **Cable routing**: Power cables running from nacelle down through tower to transformer
- **Nacelle cutaway**: Gearbox, generator, brake, yaw system all visible inside housing
- **Rotor assembly**: Hub with pitch motors at blade roots, three composite blades with gradient fill
- **Ground level marker**: Clear separation between above/below ground
- **Component color coding**: Each system type has distinct color (blue=generator, gold=gearbox, red=brake, green=yaw, purple=pitch)
- **Legend bar**: Quick reference for color meanings
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 920" xmlns="http://www.w3.org/2000/svg">
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
<!-- Blade gradient for 3D effect -->
<linearGradient id="bladeGrad" x1="0%" y1="0%" x2="100%" y2="0%">
<stop offset="0%" style="stop-color:#D3D1C7"/>
<stop offset="50%" style="stop-color:#F1EFE8"/>
<stop offset="100%" style="stop-color:#B4B2A9"/>
</linearGradient>
</defs>
<!-- ===== GROUND LEVEL LINE ===== -->
<line x1="40" y1="680" x2="640" y2="680" stroke="#3B6D11" stroke-width="2"/>
<text class="tl" x="45" y="675">Ground level</text>
<!-- ===== UNDERGROUND: FOUNDATION ===== -->
<!-- Soil layers -->
<rect x="120" y="680" width="300" height="180" class="soil"/>
<rect x="120" y="780" width="300" height="80" class="soil-dark"/>
<!-- Deep concrete foundation -->
<path d="M170 680 L170 820 L200 850 L340 850 L370 820 L370 680 Z" class="concrete"/>
<!-- Foundation base spread -->
<path d="M140 820 L170 820 L200 850 L340 850 L370 820 L400 820 L400 860 L140 860 Z" class="concrete-dark"/>
<!-- Rebar reinforcement -->
<g class="rebar">
<line x1="185" y1="700" x2="185" y2="840"/>
<line x1="210" y1="700" x2="210" y2="845"/>
<line x1="235" y1="700" x2="235" y2="848"/>
<line x1="260" y1="700" x2="260" y2="848"/>
<line x1="285" y1="700" x2="285" y2="848"/>
<line x1="310" y1="700" x2="310" y2="845"/>
<line x1="335" y1="700" x2="335" y2="840"/>
<!-- Horizontal rebar -->
<line x1="175" y1="720" x2="365" y2="720"/>
<line x1="175" y1="760" x2="365" y2="760"/>
<line x1="175" y1="800" x2="365" y2="800"/>
<line x1="155" y1="835" x2="385" y2="835"/>
</g>
<!-- Foundation labels -->
<line x1="410" y1="770" x2="480" y2="770" class="leader"/>
<text class="ts" x="485" y="766">Deep concrete foundation</text>
<text class="tl" x="485" y="778">Reinforced with steel rebar</text>
<text class="tl" x="485" y="790">15-25m deep typical</text>
<line x1="400" y1="850" x2="480" y2="870" class="leader"/>
<text class="ts" x="485" y="866">Foundation spread footing</text>
<text class="tl" x="485" y="878">Distributes load to soil</text>
<!-- ===== TOWER BASE ===== -->
<!-- Tower base flange -->
<ellipse cx="270" cy="680" rx="70" ry="12" class="concrete-dark"/>
<rect x="200" y="668" width="140" height="12" class="tower"/>
<!-- Transformer at base -->
<g transform="translate(470, 640)">
<rect x="0" y="0" width="50" height="40" rx="3" class="transformer"/>
<!-- Cooling fins -->
<rect x="52" y="5" width="4" height="30" class="transformer-fin"/>
<rect x="58" y="5" width="4" height="30" class="transformer-fin"/>
<rect x="64" y="5" width="4" height="30" class="transformer-fin"/>
<!-- Connection box -->
<rect x="10" y="-8" width="30" height="10" rx="2" class="transformer-fin"/>
</g>
<line x1="470" y1="660" x2="430" y2="640" class="leader"/>
<text class="ts" x="385" y="636" text-anchor="end">Transformer</text>
<text class="tl" x="385" y="648" text-anchor="end">Steps up voltage for grid</text>
<!-- ===== TUBULAR STEEL TOWER ===== -->
<!-- Tower outer shell (tapered) -->
<path d="M200 680 L220 200 L320 200 L340 680 Z" class="tower"/>
<!-- Tower inner surface (cutaway) -->
<path d="M215 680 L232 210 L308 210 L325 680 Z" class="tower-inner"/>
<!-- Tower section joints -->
<line x1="205" y1="550" x2="335" y2="550" class="tower-section"/>
<line x1="210" y1="420" x2="330" y2="420" class="tower-section"/>
<line x1="215" y1="300" x2="325" y2="300" class="tower-section"/>
<!-- Internal ladder (left side) -->
<g transform="translate(225, 220)">
<!-- Ladder rails -->
<line x1="0" y1="0" x2="8" y2="450" class="ladder"/>
<line x1="15" y1="0" x2="23" y2="450" class="ladder"/>
<!-- Rungs -->
<g class="ladder-rung">
<line x1="1" y1="20" x2="22" y2="21"/>
<line x1="1" y1="50" x2="22" y2="52"/>
<line x1="2" y1="80" x2="22" y2="83"/>
<line x1="2" y1="110" x2="23" y2="114"/>
<line x1="2" y1="140" x2="23" y2="145"/>
<line x1="3" y1="170" x2="23" y2="176"/>
<line x1="3" y1="200" x2="24" y2="207"/>
<line x1="3" y1="230" x2="24" y2="238"/>
<line x1="4" y1="260" x2="24" y2="269"/>
<line x1="4" y1="290" x2="25" y2="300"/>
<line x1="4" y1="320" x2="25" y2="331"/>
<line x1="5" y1="350" x2="25" y2="362"/>
<line x1="5" y1="380" x2="26" y2="393"/>
<line x1="6" y1="410" x2="26" y2="424"/>
<line x1="6" y1="440" x2="27" y2="455"/>
</g>
</g>
<!-- Elevator shaft (right side) -->
<rect x="280" y="230" width="25" height="430" rx="2" class="elevator"/>
<text class="tl" x="292" y="450" text-anchor="middle" transform="rotate(-90, 292, 450)" fill="#185FA5">ELEVATOR</text>
<!-- Electrical cables running down -->
<path d="M270 220 C270 300 268 400 268 500 C268 600 268 650 310 665 L470 665" class="cable"/>
<path d="M260 225 C258 350 256 500 256 600 C256 650 256 670 256 680" class="cable-thin"/>
<!-- Tower labels -->
<line x1="340" y1="350" x2="400" y2="320" class="leader"/>
<text class="ts" x="405" y="316">Tubular steel tower</text>
<text class="tl" x="405" y="328">80-120m height typical</text>
<text class="tl" x="405" y="340">Tapered for strength</text>
<line x1="248" y1="400" x2="130" y2="380" class="leader"/>
<text class="ts" x="125" y="376" text-anchor="end">Internal ladder</text>
<text class="tl" x="125" y="388" text-anchor="end">Service access</text>
<line x1="305" y1="500" x2="400" y2="520" class="leader"/>
<text class="ts" x="405" y="516">Service elevator</text>
<line x1="268" y1="580" x2="130" y2="600" class="leader"/>
<text class="ts" x="125" y="596" text-anchor="end">Power cables</text>
<text class="tl" x="125" y="608" text-anchor="end">To transformer</text>
<!-- ===== NACELLE ===== -->
<g transform="translate(270, 160)">
<!-- Nacelle base/bedplate -->
<rect x="-60" y="30" width="120" height="15" class="nacelle"/>
<!-- Yaw bearing -->
<ellipse cx="0" cy="42" rx="35" ry="6" class="bearing"/>
<!-- Yaw motors -->
<rect x="-55" y="32" width="12" height="18" rx="2" class="yaw"/>
<rect x="43" y="32" width="12" height="18" rx="2" class="yaw"/>
<!-- Nacelle housing -->
<path d="M-65 30 L-70 -10 L-65 -35 L70 -35 L85 -10 L85 30 Z" class="nacelle-cover"/>
<!-- Main shaft -->
<rect x="-90" y="-8" width="35" height="16" rx="2" fill="#888780" stroke="#5F5E5A" stroke-width="0.5"/>
<!-- Gearbox -->
<rect x="-55" y="-25" width="40" height="45" rx="3" class="gearbox"/>
<text class="tl" x="-35" y="5" text-anchor="middle" fill="#633806">GEAR</text>
<!-- Generator -->
<rect x="-10" y="-20" width="50" height="38" rx="4" class="generator"/>
<ellipse cx="15" cy="0" rx="15" ry="15" fill="none" stroke="#0C447C" stroke-width="1"/>
<text class="tl" x="15" y="4" text-anchor="middle" fill="#E6F1FB">GEN</text>
<!-- Brake disc -->
<rect x="45" y="-12" width="8" height="24" rx="1" class="brake"/>
<!-- Electrical cabinet -->
<rect x="58" y="-25" width="20" height="35" rx="2" fill="#5F5E5A" stroke="#444441" stroke-width="0.5"/>
<!-- Anemometer on top -->
<line x1="60" y1="-35" x2="60" y2="-50" stroke="#5F5E5A" stroke-width="1"/>
<ellipse cx="60" cy="-52" rx="8" ry="3" fill="#D3D1C7" stroke="#888780" stroke-width="0.5"/>
</g>
<!-- Nacelle labels -->
<line x1="215" y1="135" x2="130" y2="115" class="leader"/>
<text class="ts" x="125" y="111" text-anchor="end">Gearbox</text>
<text class="tl" x="125" y="123" text-anchor="end">Speed multiplier</text>
<line x1="285" y1="145" x2="400" y2="125" class="leader"/>
<text class="ts" x="405" y="121">Generator</text>
<text class="tl" x="405" y="133">Converts rotation to electricity</text>
<line x1="315" y1="155" x2="400" y2="165" class="leader"/>
<text class="ts" x="405" y="161">Brake system</text>
<line x1="215" y1="200" x2="130" y2="220" class="leader"/>
<text class="ts" x="125" y="216" text-anchor="end">Yaw motors</text>
<text class="tl" x="125" y="228" text-anchor="end">Rotate nacelle to face wind</text>
<line x1="330" y1="108" x2="400" y2="90" class="leader"/>
<text class="ts" x="405" y="86">Anemometer</text>
<text class="tl" x="405" y="98">Wind speed sensor</text>
<!-- ===== ROTOR HUB & BLADES ===== -->
<!-- Hub -->
<g transform="translate(180, 152)">
<!-- Hub body -->
<ellipse cx="0" cy="0" rx="25" ry="30" class="hub"/>
<!-- Hub nose cone -->
<path d="M-25 -20 Q-50 0 -25 20 Q-30 0 -25 -20" class="hub-cap"/>
<!-- Blade roots with pitch motors -->
<!-- Blade 1 (up) -->
<g transform="translate(-10, -25) rotate(-80)">
<ellipse cx="0" cy="0" rx="12" ry="8" class="blade-root"/>
<rect x="-8" y="-5" width="10" height="10" rx="2" class="pitch-motor"/>
</g>
<!-- Blade 2 (lower left) -->
<g transform="translate(-18, 18) rotate(40)">
<ellipse cx="0" cy="0" rx="12" ry="8" class="blade-root"/>
<rect x="-8" y="-5" width="10" height="10" rx="2" class="pitch-motor"/>
</g>
<!-- Blade 3 (lower right) -->
<g transform="translate(5, 22) rotate(160)">
<ellipse cx="0" cy="0" rx="12" ry="8" class="blade-root"/>
<rect x="-8" y="-5" width="10" height="10" rx="2" class="pitch-motor"/>
</g>
</g>
<!-- Blade 1 (pointing up-left) -->
<path d="M165 125 Q140 80 130 40 Q125 20 115 15 Q110 18 112 25 Q115 50 125 90 Q140 120 158 128 Z" class="blade" fill="url(#bladeGrad)"/>
<!-- Blade 2 (pointing down-left) -->
<path d="M158 175 Q120 200 80 230 Q60 245 55 255 Q60 258 68 252 Q95 235 130 210 Q155 190 163 178 Z" class="blade" fill="url(#bladeGrad)"/>
<!-- Blade 3 (pointing down-right, partially visible) -->
<path d="M188 175 Q195 200 205 230 Q210 250 215 255 Q220 252 218 245 Q212 220 202 195 Q192 175 186 172 Z" class="blade" fill="url(#bladeGrad)"/>
<!-- Blade labels -->
<line x1="115" y1="35" x2="60" y2="35" class="leader"/>
<text class="ts" x="55" y="31" text-anchor="end">Composite blade</text>
<text class="tl" x="55" y="43" text-anchor="end">Fiberglass/carbon fiber</text>
<text class="tl" x="55" y="55" text-anchor="end">40-80m length each</text>
<line x1="170" y1="130" x2="130" y2="155" class="leader"/>
<text class="ts" x="85" y="151" text-anchor="end">Pitch motor</text>
<text class="tl" x="85" y="163" text-anchor="end">Adjusts blade angle</text>
<line x1="180" y1="152" x2="130" y2="180" class="leader"/>
<text class="ts" x="85" y="183" text-anchor="end">Rotor hub</text>
<!-- ===== LEGEND ===== -->
<g transform="translate(40, 895)">
<rect x="0" y="-15" width="600" height="30" rx="4" fill="none" stroke="#D3D1C7" stroke-width="0.5"/>
<rect x="15" y="-5" width="12" height="12" rx="2" class="generator"/>
<text class="tl" x="32" y="5">Generator</text>
<rect x="95" y="-5" width="12" height="12" rx="2" class="gearbox"/>
<text class="tl" x="112" y="5">Gearbox</text>
<rect x="170" y="-5" width="12" height="12" rx="2" class="brake"/>
<text class="tl" x="187" y="5">Brake</text>
<rect x="230" y="-5" width="12" height="12" rx="2" class="yaw"/>
<text class="tl" x="247" y="5">Yaw system</text>
<rect x="320" y="-5" width="12" height="12" rx="2" class="pitch-motor"/>
<text class="tl" x="337" y="5">Pitch motor</text>
<line x1="415" y1="1" x2="435" y2="1" class="cable" style="stroke-width:2"/>
<text class="tl" x="440" y="5">Power cable</text>
<rect x="515" y="-5" width="12" height="12" rx="2" class="transformer"/>
<text class="tl" x="532" y="5">Transformer</text>
</g>
</svg>
```
## CSS Classes
```css
/* Foundation */
.concrete { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1; }
.concrete-dark { fill: #888780; stroke: #5F5E5A; stroke-width: 1; }
.rebar { stroke: #854F0B; stroke-width: 1.5; fill: none; }
.soil { fill: #8B7355; stroke: #5F5E5A; stroke-width: 0.5; }
.soil-dark { fill: #6B5344; }
/* Tower */
.tower { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1; }
.tower-inner { fill: #D3D1C7; stroke: #888780; stroke-width: 0.5; }
.tower-section { stroke: #888780; stroke-width: 0.5; stroke-dasharray: 2 4; }
.ladder { stroke: #5F5E5A; stroke-width: 1; fill: none; }
.ladder-rung { stroke: #888780; stroke-width: 0.8; }
.elevator { fill: #E6F1FB; stroke: #185FA5; stroke-width: 0.5; }
.cable { stroke: #E24B4A; stroke-width: 2; fill: none; }
.cable-thin { stroke: #E24B4A; stroke-width: 1.5; fill: none; }
/* Nacelle */
.nacelle { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1; }
.nacelle-cover { fill: #D3D1C7; stroke: #5F5E5A; stroke-width: 1; }
.gearbox { fill: #BA7517; stroke: #633806; stroke-width: 0.5; }
.generator { fill: #378ADD; stroke: #0C447C; stroke-width: 0.5; }
.brake { fill: #E24B4A; stroke: #791F1F; stroke-width: 0.5; }
.yaw { fill: #5DCAA5; stroke: #085041; stroke-width: 0.5; }
.bearing { fill: #444441; stroke: #2C2C2A; stroke-width: 0.5; }
/* Rotor */
.hub { fill: #D3D1C7; stroke: #5F5E5A; stroke-width: 1; }
.hub-cap { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1; }
.blade { fill: #F1EFE8; stroke: #888780; stroke-width: 1; }
.blade-root { fill: #D3D1C7; stroke: #5F5E5A; stroke-width: 0.5; }
.pitch-motor { fill: #7F77DD; stroke: #3C3489; stroke-width: 0.5; }
/* Transformer */
.transformer { fill: #27500A; stroke: #173404; stroke-width: 1; }
.transformer-fin { fill: #3B6D11; stroke: #27500A; stroke-width: 0.5; }
```
@@ -0,0 +1,43 @@
# Dashboard Patterns
Building blocks for UI/dashboard mockups inside a concept diagram — admin panels, monitoring dashboards, control interfaces, status displays.
## Pattern
A "screen" is a rounded dark rect inside a lighter "frame" rect, with chart/gauge/indicator elements nested on top.
```xml
<!-- Monitor frame -->
<rect class="dashboard" x="0" y="0" width="200" height="120" rx="8"/>
<!-- Screen -->
<rect class="screen" x="10" y="10" width="180" height="85" rx="4"/>
<!-- Mini bar chart -->
<rect class="screen-content" x="18" y="18" width="50" height="35" rx="2"/>
<rect class="screen-chart" x="22" y="38" width="8" height="12"/>
<rect class="screen-chart" x="33" y="32" width="8" height="18"/>
<!-- Gauge -->
<circle class="screen-bar" cx="100" cy="35" r="12"/>
<text x="100" y="39" text-anchor="middle" fill="#E8E6DE" style="font-size:8px">78%</text>
<!-- Status indicators -->
<circle cx="35" cy="74" r="6" fill="#97C459"/> <!-- green = ok -->
<circle cx="75" cy="74" r="6" fill="#EF9F27"/> <!-- amber = warning -->
<circle cx="115" cy="74" r="6" fill="#E24B4A"/> <!-- red = alert -->
```
## CSS
```css
.dashboard { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1.5; }
.screen { fill: #1a1a18; }
.screen-content { fill: #2C2C2A; }
.screen-chart { fill: #5DCAA5; }
.screen-bar { fill: #7F77DD; }
.screen-alert { fill: #E24B4A; }
```
## Tips
- Dashboard screens stay dark in both light and dark mode — they represent actual monitor glass.
- Keep on-screen text small (`font-size:8px` or `10px`) and high-contrast (near-white fill on dark).
- Use the status triad green/amber/red consistently — OK / warning / alert.
- A single dashboard usually sits on top of an infrastructure hub diagram as a unified view (see `examples/smart-city-infrastructure.md`).
@@ -0,0 +1,144 @@
# Infrastructure Patterns
Reusable shapes and line styles for infrastructure / systems-integration diagrams (smart cities, IoT networks, industrial systems, multi-domain architectures).
## Layout pattern: hub-spoke
- **Central hub**: Hexagon or circle representing the integration platform
- **Radiating connections**: Data lines from hub to each subsystem with connection dots
- **Subsystem sections**: Each system (power, water, transport) in its own region
- **Dashboard on top**: Optional UI mockup showing a unified view (see `dashboard-patterns.md`)
```xml
<!-- Central hub (hexagon) -->
<polygon class="iot-hex" points="0,-45 39,-22 39,22 0,45 -39,22 -39,-22"/>
<!-- Data lines with connection dots -->
<path class="data-line" d="M 321 248 L 200 248 L 120 380" stroke-dasharray="4 3"/>
<circle cx="321" cy="248" r="4" fill="#7F77DD"/>
```
## Semantic line styles
Use a dedicated CSS class per subsystem so every diagram reads the same way:
```css
.data-line { stroke: #7F77DD; stroke-width: 2; fill: none; stroke-dasharray: 4 3; }
.power-line { stroke: #EF9F27; stroke-width: 2; fill: none; }
.water-pipe { stroke: #378ADD; stroke-width: 4; stroke-linecap: round; fill: none; }
.road { stroke: #888780; stroke-width: 8; stroke-linecap: round; fill: none; }
```
## Power systems
**Solar panel (angled):**
```xml
<polygon class="solar-panel" points="0,25 35,8 38,12 3,29"/>
<line class="solar-frame" x1="12" y1="22" x2="24" y2="13"/>
```
**Wind turbine:**
```xml
<polygon class="wind-tower" points="20,70 30,70 28,25 22,25"/>
<circle class="wind-hub" cx="25" cy="18" r="5"/>
<ellipse class="wind-blade" cx="25" cy="5" rx="3" ry="13"/>
<ellipse class="wind-blade" cx="14" cy="26" rx="3" ry="13" transform="rotate(-120, 25, 18)"/>
<ellipse class="wind-blade" cx="36" cy="26" rx="3" ry="13" transform="rotate(120, 25, 18)"/>
```
**Battery with charge level:**
```xml
<rect class="battery" x="0" y="0" width="45" height="65" rx="5"/>
<rect x="10" y="-6" width="10" height="8" rx="2" fill="#27500A"/> <!-- terminal -->
<rect class="battery-level" x="5" y="12" width="35" height="48" rx="3"/> <!-- fill level -->
```
**Power pylon:**
```xml
<polygon class="pylon" points="30,0 35,0 40,60 25,60"/>
<line x1="15" y1="10" x2="45" y2="10" stroke="#5F5E5A" stroke-width="3"/>
<circle cx="18" cy="10" r="3" fill="#FAEEDA" stroke="#854F0B"/> <!-- insulator -->
```
## Water systems
**Reservoir/dam:**
```xml
<polygon class="reservoir-wall" points="0,60 10,0 70,0 80,60"/>
<polygon class="water" points="12,10 68,10 68,55 75,55 75,58 5,58 5,55 12,55"/>
<!-- Wave effect -->
<path d="M 15 25 Q 25 22 35 25 Q 45 28 55 25" fill="none" stroke="#378ADD" opacity="0.5"/>
```
**Treatment tank:**
```xml
<ellipse class="treatment-tank" cx="35" cy="45" rx="30" ry="18"/>
<rect class="treatment-tank" x="5" y="20" width="60" height="25"/>
<!-- Bubbles -->
<circle cx="20" cy="32" r="2" fill="#378ADD" opacity="0.6"/>
```
**Pipe with joint and valve:**
```xml
<path class="pipe" d="M 80 85 L 110 85"/>
<circle class="pipe-joint" cx="110" cy="85" r="8"/>
<circle class="valve" cx="95" cy="85" r="6"/>
```
## Transport systems
**Road with lane markings:**
```xml
<line class="road" x1="0" y1="50" x2="170" y2="50"/>
<line class="road-mark" x1="10" y1="50" x2="160" y2="50"/>
```
**Traffic light:**
```xml
<rect class="traffic-light" x="0" y="0" width="14" height="32" rx="3"/>
<circle class="light-red" cx="7" cy="8" r="4"/>
<circle class="light-off" cx="7" cy="16" r="4"/>
<circle class="light-green" cx="7" cy="24" r="4"/>
```
**Bus:**
```xml
<rect class="bus" x="0" y="0" width="55" height="28" rx="6"/>
<rect class="bus-window" x="5" y="5" width="12" height="12" rx="2"/>
<circle cx="14" cy="30" r="6" fill="#2C2C2A"/> <!-- wheel -->
<circle cx="14" cy="30" r="3" fill="#5F5E5A"/> <!-- hubcap -->
```
## Full CSS block (add to the host page or inline <style>)
```css
/* Power */
.solar-panel { fill: #3C3489; stroke: #534AB7; stroke-width: 0.5; }
.wind-tower { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1; }
.wind-blade { fill: #F1EFE8; stroke: #888780; stroke-width: 0.5; }
.battery { fill: #27500A; stroke: #3B6D11; stroke-width: 1.5; }
.battery-level { fill: #97C459; }
.power-line { stroke: #EF9F27; stroke-width: 2; fill: none; }
/* Water */
.reservoir-wall { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1; }
.water { fill: #85B7EB; stroke: #378ADD; stroke-width: 0.5; }
.pipe { fill: none; stroke: #378ADD; stroke-width: 4; stroke-linecap: round; }
.pipe-joint { fill: #185FA5; stroke: #0C447C; stroke-width: 1; }
.valve { fill: #0C447C; stroke: #185FA5; stroke-width: 1; }
/* Transport */
.road { stroke: #888780; stroke-width: 8; fill: none; stroke-linecap: round; }
.road-mark { stroke: #F1EFE8; stroke-width: 1; stroke-dasharray: 6 4; fill: none; }
.traffic-light { fill: #444441; stroke: #2C2C2A; stroke-width: 0.5; }
.light-red { fill: #E24B4A; }
.light-green { fill: #97C459; }
.light-off { fill: #2C2C2A; }
.bus { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 1.5; }
```
## Reference examples
- `examples/smart-city-infrastructure.md` — hub-spoke with multiple subsystems
- `examples/electricity-grid-flow.md` — voltage hierarchy, flow markers
- `examples/wind-turbine-structure.md` — cross-section with legend
@@ -0,0 +1,42 @@
# Physical Shape Cookbook
Guidance for drawing physical objects (vehicles, buildings, hardware, mechanical systems, anatomy) — when rectangles aren't enough.
## Shape selection
| Physical form | SVG element | Example use |
|---------------|-------------|-------------|
| Curved bodies | `<path>` with Q/C curves | Fuselage, tanks, pipes |
| Tapered/angular shapes | `<polygon>` | Wings, fins, wedges |
| Cylindrical/round | `<ellipse>`, `<circle>` | Engines, wheels, buttons |
| Linear structures | `<line>` | Struts, beams, connections |
| Internal sections | `<rect>` inside parent | Compartments, rooms |
| Dashed boundaries | `stroke-dasharray` | Hidden parts, fuel tanks |
## Layering approach
1. Draw outer structure first (fuselage, frame, hull)
2. Add internal sections on top (cabins, compartments)
3. Add detail elements (engines, wheels, controls)
4. Add leader lines with labels
## Semantic CSS classes (instead of c-* ramps)
For physical diagrams, define component-specific classes directly rather than applying `c-*` color classes. This makes each part self-documenting and lets you keep a restrained palette:
```css
.fuselage { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1; }
.wing { fill: #E6F1FB; stroke: #185FA5; stroke-width: 1; }
.engine { fill: #FAECE7; stroke: #993C1D; stroke-width: 1; }
```
Add these to a local `<style>` inside the SVG (or extend the host page's `<style>` block). The light-mode/dark-mode pattern still works — use the CSS variables from the template (`var(--bg-secondary)`, `var(--border)`, `var(--text-primary)`) if you want dark-mode awareness.
## Reference examples
Look at these example files for working physical-diagram patterns:
- `examples/commercial-aircraft-structure.md` — fuselage curves + tapered wings + ellipse engines
- `examples/wind-turbine-structure.md` — underground foundation, tubular tower, nacelle cutaway
- `examples/smartphone-layer-anatomy.md` — exploded-view stack with alternating labels
- `examples/apartment-floor-plan-conversion.md` — walls, doors, windows, proposed changes
@@ -0,0 +1,174 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Concept Diagram</title>
<style>
:root {
--text-primary: #1a1a18;
--text-secondary: #5f5e5a;
--text-tertiary: #88877f;
--bg-primary: #ffffff;
--bg-secondary: #f6f5f0;
--bg-tertiary: #eeedeb;
--border: rgba(0,0,0,0.15);
--border-hover: rgba(0,0,0,0.3);
}
@media (prefers-color-scheme: dark) {
:root {
--text-primary: #e8e6de;
--text-secondary: #b4b2a9;
--text-tertiary: #888780;
--bg-primary: #1a1a18;
--bg-secondary: #2c2c2a;
--bg-tertiary: #3d3d3a;
--border: rgba(255,255,255,0.15);
--border-hover: rgba(255,255,255,0.3);
}
}
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
font-family: system-ui, -apple-system, sans-serif;
background: var(--bg-tertiary);
display: flex;
justify-content: center;
align-items: flex-start;
min-height: 100vh;
padding: 40px 20px;
}
.card {
background: var(--bg-primary);
border-radius: 16px;
padding: 32px;
max-width: 780px;
width: 100%;
box-shadow: 0 1px 3px rgba(0,0,0,0.08);
}
h1 {
font-size: 18px;
font-weight: 500;
color: var(--text-primary);
margin-bottom: 8px;
}
.subtitle {
font-size: 13px;
color: var(--text-tertiary);
margin-bottom: 24px;
}
svg { width: 100%; height: auto; }
/* === SVG Design System Classes === */
/* Text classes */
.t { font-family: system-ui, -apple-system, sans-serif; font-size: 14px; fill: var(--text-primary); }
.ts { font-family: system-ui, -apple-system, sans-serif; font-size: 12px; fill: var(--text-secondary); }
.th { font-family: system-ui, -apple-system, sans-serif; font-size: 14px; fill: var(--text-primary); font-weight: 500; }
/* Neutral box */
.box { fill: var(--bg-secondary); stroke: var(--border); stroke-width: 0.5px; }
/* Arrow */
.arr { stroke: var(--text-secondary); stroke-width: 1.5px; fill: none; }
/* Leader line */
.leader { stroke: var(--text-tertiary); stroke-width: 0.5px; stroke-dasharray: 4 3; fill: none; }
/* Clickable node */
.node { cursor: pointer; transition: opacity 0.15s; }
.node:hover { opacity: 0.82; }
/* === Color Ramp Classes (light mode) === */
.c-purple > rect, .c-purple > circle, .c-purple > ellipse { fill: #EEEDFE; stroke: #534AB7; }
.c-purple > .th, .c-purple > text.th { fill: #3C3489; }
.c-purple > .ts, .c-purple > text.ts { fill: #534AB7; }
.c-purple > .t, .c-purple > text.t { fill: #3C3489; }
.c-teal > rect, .c-teal > circle, .c-teal > ellipse { fill: #E1F5EE; stroke: #0F6E56; }
.c-teal > .th, .c-teal > text.th { fill: #085041; }
.c-teal > .ts, .c-teal > text.ts { fill: #0F6E56; }
.c-teal > .t, .c-teal > text.t { fill: #085041; }
.c-coral > rect, .c-coral > circle, .c-coral > ellipse { fill: #FAECE7; stroke: #993C1D; }
.c-coral > .th, .c-coral > text.th { fill: #712B13; }
.c-coral > .ts, .c-coral > text.ts { fill: #993C1D; }
.c-coral > .t, .c-coral > text.t { fill: #712B13; }
.c-pink > rect, .c-pink > circle, .c-pink > ellipse { fill: #FBEAF0; stroke: #993556; }
.c-pink > .th, .c-pink > text.th { fill: #72243E; }
.c-pink > .ts, .c-pink > text.ts { fill: #993556; }
.c-pink > .t, .c-pink > text.t { fill: #72243E; }
.c-gray > rect, .c-gray > circle, .c-gray > ellipse { fill: #F1EFE8; stroke: #5F5E5A; }
.c-gray > .th, .c-gray > text.th { fill: #444441; }
.c-gray > .ts, .c-gray > text.ts { fill: #5F5E5A; }
.c-gray > .t, .c-gray > text.t { fill: #444441; }
.c-blue > rect, .c-blue > circle, .c-blue > ellipse { fill: #E6F1FB; stroke: #185FA5; }
.c-blue > .th, .c-blue > text.th { fill: #0C447C; }
.c-blue > .ts, .c-blue > text.ts { fill: #185FA5; }
.c-blue > .t, .c-blue > text.t { fill: #0C447C; }
.c-green > rect, .c-green > circle, .c-green > ellipse { fill: #EAF3DE; stroke: #3B6D11; }
.c-green > .th, .c-green > text.th { fill: #27500A; }
.c-green > .ts, .c-green > text.ts { fill: #3B6D11; }
.c-green > .t, .c-green > text.t { fill: #27500A; }
.c-amber > rect, .c-amber > circle, .c-amber > ellipse { fill: #FAEEDA; stroke: #854F0B; }
.c-amber > .th, .c-amber > text.th { fill: #633806; }
.c-amber > .ts, .c-amber > text.ts { fill: #854F0B; }
.c-amber > .t, .c-amber > text.t { fill: #633806; }
.c-red > rect, .c-red > circle, .c-red > ellipse { fill: #FCEBEB; stroke: #A32D2D; }
.c-red > .th, .c-red > text.th { fill: #791F1F; }
.c-red > .ts, .c-red > text.ts { fill: #A32D2D; }
.c-red > .t, .c-red > text.t { fill: #791F1F; }
/* === Dark mode overrides === */
@media (prefers-color-scheme: dark) {
.c-purple > rect, .c-purple > circle, .c-purple > ellipse { fill: #3C3489; stroke: #AFA9EC; }
.c-purple > .th, .c-purple > text.th { fill: #CECBF6; }
.c-purple > .ts, .c-purple > text.ts { fill: #AFA9EC; }
.c-teal > rect, .c-teal > circle, .c-teal > ellipse { fill: #085041; stroke: #5DCAA5; }
.c-teal > .th, .c-teal > text.th { fill: #9FE1CB; }
.c-teal > .ts, .c-teal > text.ts { fill: #5DCAA5; }
.c-coral > rect, .c-coral > circle, .c-coral > ellipse { fill: #712B13; stroke: #F0997B; }
.c-coral > .th, .c-coral > text.th { fill: #F5C4B3; }
.c-coral > .ts, .c-coral > text.ts { fill: #F0997B; }
.c-pink > rect, .c-pink > circle, .c-pink > ellipse { fill: #72243E; stroke: #ED93B1; }
.c-pink > .th, .c-pink > text.th { fill: #F4C0D1; }
.c-pink > .ts, .c-pink > text.ts { fill: #ED93B1; }
.c-gray > rect, .c-gray > circle, .c-gray > ellipse { fill: #444441; stroke: #B4B2A9; }
.c-gray > .th, .c-gray > text.th { fill: #D3D1C7; }
.c-gray > .ts, .c-gray > text.ts { fill: #B4B2A9; }
.c-blue > rect, .c-blue > circle, .c-blue > ellipse { fill: #0C447C; stroke: #85B7EB; }
.c-blue > .th, .c-blue > text.th { fill: #B5D4F4; }
.c-blue > .ts, .c-blue > text.ts { fill: #85B7EB; }
.c-green > rect, .c-green > circle, .c-green > ellipse { fill: #27500A; stroke: #97C459; }
.c-green > .th, .c-green > text.th { fill: #C0DD97; }
.c-green > .ts, .c-green > text.ts { fill: #97C459; }
.c-amber > rect, .c-amber > circle, .c-amber > ellipse { fill: #633806; stroke: #EF9F27; }
.c-amber > .th, .c-amber > text.th { fill: #FAC775; }
.c-amber > .ts, .c-amber > text.ts { fill: #EF9F27; }
.c-red > rect, .c-red > circle, .c-red > ellipse { fill: #791F1F; stroke: #F09595; }
.c-red > .th, .c-red > text.th { fill: #F7C1C1; }
.c-red > .ts, .c-red > text.ts { fill: #F09595; }
}
</style>
</head>
<body>
<div class="card">
<h1><!-- DIAGRAM TITLE HERE --></h1>
<p class="subtitle"><!-- OPTIONAL SUBTITLE HERE --></p>
<!-- PASTE SVG HERE -->
</div>
</body>
</html>

Some files were not shown because too many files have changed in this diff Show More