Compare commits

129 Commits

Author SHA1 Message Date
teknium1 9e17ddcead feat(cli): add hermes send to pipe script output to any messaging platform
Introduces a thin CLI wrapper around the existing send_message_tool so
shell scripts, cron scripts, CI hooks, and monitoring daemons can reuse
the gateway's already-configured platform credentials without
reimplementing each platform's REST client.

## What

  hermes send --to telegram "deploy finished"
  echo "RAM 92%" | hermes send --to telegram:-1001234567890
  hermes send --to discord:#ops --file report.md
  hermes send --to slack:#eng --subject "[CI]" --file build.log
  hermes send --list                  # all targets
  hermes send --list telegram         # filter by platform

Supports all platforms the send_message tool already does (Telegram,
Discord, Slack, Signal, SMS, WhatsApp, Matrix, Feishu, DingTalk, WeCom,
Weixin, Email, etc.), including threaded targets and #channel-name
resolution via the channel directory.

## How

hermes_cli/send_cmd.py delegates to tools.send_message_tool.send_message_tool,
which means there is zero new platform-specific code. The subcommand just:

1. Bridges ~/.hermes/.env and top-level ~/.hermes/config.yaml scalars into
   os.environ (same bootstrap the gateway does at startup) — required so
   TELEGRAM_HOME_CHANNEL and friends are visible to load_gateway_config().
2. Resolves the message body from positional arg, --file, or piped stdin.
3. Calls the shared tool and translates its JSON result to exit codes:
   0 success, 1 delivery failure, 2 usage error.
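
Steps 2 and 3 above can be sketched roughly as follows. This is a minimal illustration, not the code in hermes_cli/send_cmd.py: the helper names, the precedence among the three body sources, and the assumption that a failed send carries an "error" key in the tool's JSON result are all hypothetical.

```python
import json
from typing import Optional

EXIT_OK, EXIT_DELIVERY_FAILURE, EXIT_USAGE = 0, 1, 2


def resolve_body(positional: Optional[str],
                 file_text: Optional[str],
                 stdin_text: Optional[str]) -> Optional[str]:
    """Pick the first non-blank body source (order is an assumption)."""
    for candidate in (positional, file_text, stdin_text):
        if candidate and candidate.strip():
            return candidate
    return None  # caller should treat this as a usage error (exit 2)


def exit_code_for(result_json: str) -> int:
    """Translate the tool's JSON result string into a process exit code."""
    try:
        result = json.loads(result_json)
    except (TypeError, ValueError):
        return EXIT_DELIVERY_FAILURE  # unparseable result counts as a failure
    # Hypothetical result shape: an "error" key marks a failed delivery.
    if isinstance(result, dict) and result.get("error"):
        return EXIT_DELIVERY_FAILURE
    return EXIT_OK  # includes skipped sends, which the tests expect to exit 0
```

A wrapper would return EXIT_USAGE when resolve_body() yields None or --to is missing, and otherwise pass the tool's output through exit_code_for().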

No running gateway is required for bot-token platforms (Telegram, Discord,
Slack, Signal, SMS, WhatsApp) — the tool hits each platform's REST API
directly. Plugin platforms that rely on a live adapter connection still
need the gateway running; the error message is forwarded verbatim.

## Docs

- New guide: website/docs/guides/pipe-script-output.md covering real-world
  patterns (memory watchdogs, CI hooks, cron pipes, long-running task
  completion pings) and the security/gateway notes.
- Cross-links added from automate-with-cron.md ("no LLM? use hermes send")
  and developer-guide/gateway-internals.md (delivery-path section).

## Tests

tests/hermes_cli/test_send_cmd.py (20 tests, all green):

- Happy paths: positional message, stdin, --file, --file -, --subject,
  --json, --quiet.
- Error paths: missing --to, missing body, file not found, tool returns
  error payload (exit 1), tool skipped-send result (exit 0).
- --list: human output, --json output, platform filter, unknown platform.
- Env loader: bridges config.yaml scalars into env, does not override
  existing env vars, gracefully handles missing files.
- Registrar contract: register_send_subparser() returns a working parser.

Smoke-tested end-to-end against a live Telegram bot before commit.
2026-05-04 02:32:49 -07:00
Teknium cac4f2c0e6 test(kanban): update worker-prompt header assertion to match #19427
PR #19427 dropped the 'You are a Kanban worker' identity line from
KANBAN_GUIDANCE so SOUL.md stays authoritative for profile identity.
This test assertion was stale against that change; update it to the
new protocol-only header.
2026-05-04 02:00:42 -07:00
pdonizete deb59eab72 fix: allow kanban tools for orchestrator profiles with kanban toolset
The _check_kanban_mode() gating function only checked for
HERMES_KANBAN_TASK env var, which is only set by the dispatcher
when spawning workers. This prevented orchestrator profiles (like
techlead) from using kanban_create, kanban_link, etc. even when
they had 'kanban' explicitly in their toolsets config.

Now uses load_config() from hermes_cli.config (which has mtime-based
caching) to check if 'kanban' is in the profile's toolsets list.
This enables orchestrators to route work via Kanban while workers
continue using the dispatcher env var.

Fixes #18968
2026-05-04 02:00:42 -07:00
nftpoetrist 9faaa292b4 fix(delegate): inherit parent fallback_chain in _build_child_agent
_build_child_agent constructed child AIAgents without passing
fallback_model, leaving _fallback_chain=[] for every subagent.
When a subagent hit a rate-limit or credential exhaustion the
runtime fallback check (run_agent.py:7486 / 12267) found an empty
chain and failed immediately — even though the parent agent was
configured with fallback_providers and would have recovered.

The cron scheduler already propagates fallback_model correctly
(scheduler.py:1038). Fix closes the parity gap by reading the
parent's _fallback_chain (the normalised list form accepted by
AIAgent's fallback_model parameter) and threading it through.

Empty chains coerce to None so AIAgent initialises _fallback_chain=[]
as usual rather than iterating an empty list.
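
The coercion described above might look like the sketch below. The helper name is hypothetical, and copying the list is an added precaution that the commit message does not mention.

```python
from typing import Optional


def inherited_fallback_model(parent_chain: Optional[list]) -> Optional[list]:
    """Value to hand to a child agent's fallback_model parameter."""
    # An empty or missing chain becomes None so the child falls back to its
    # normal "no fallback configured" initialisation path.
    if not parent_chain:
        return None
    # Copy (an assumption) so the child cannot mutate the parent's list.
    return list(parent_chain)
```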
2026-05-04 01:48:56 -07:00
molvikar cb33c73418 fix(run_agent): gate iteration-limit provider routing to OpenRouter 2026-05-04 01:45:59 -07:00
Asunfly 8a364df2c8 fix: inherit reasoning config in API server runs 2026-05-04 01:44:16 -07:00
SHL0MS aede94e757 fix: back up config.yaml before hermes setup modifies it
Create a timestamped backup (~/.hermes/config.yaml.bak.YYYYMMDD_HHMMSS)
before the setup wizard runs any configuration sections. After setup
completes, show the backup path and a restore command.
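
A minimal sketch of the timestamped backup, assuming a plain file copy is sufficient. The function name and the injectable `now` parameter are illustrative and not taken from the setup wizard.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_config(config_path: Path,
                  now: Optional[datetime] = None) -> Optional[Path]:
    """Copy config.yaml to config.yaml.bak.YYYYMMDD_HHMMSS next to it."""
    if not config_path.is_file():
        return None  # nothing to protect on a fresh install
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    backup = config_path.with_name(f"{config_path.name}.bak.{stamp}")
    shutil.copy2(config_path, backup)  # copy2 also preserves mtime/permissions
    return backup
```

Restoring is then a matter of copying the returned path back over config.yaml.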

This protects user-customized values (compression thresholds, provider
routing, PII redaction, auxiliary model configs) from being silently
overwritten by setup defaults.

Addresses #3522
2026-05-04 01:43:17 -07:00
memosr 2c7d7a9b2f fix(security): bind Meet node server to localhost and restrict token file to owner read 2026-05-04 01:42:59 -07:00
yuehei cdde0c8411 fix(feishu): enable MEDIA attachment delivery in send_message tool
The _send_feishu() function already supports media_files (images, video,
audio, documents) via the adapter's send_image_file/send_video/send_voice
/send_document methods, but _send_to_platform() never routed Feishu into
the early media-handling branch — media attachments were silently dropped
with a "not supported" warning.

Add a Feishu-specific media branch (matching the existing Yuanbao/Signal
pattern) so that MEDIA:<path> tags in send_message calls are correctly
delivered as native Feishu attachments. Also update the two error/warning
message strings to include feishu in the supported platform list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-04 01:42:40 -07:00
WanderWang 45fd45103d fix: _chromium_installed() now checks AGENT_BROWSER_EXECUTABLE_PATH and system Chrome
Before this fix, _chromium_installed() only searched Playwright-style
chromium-* / chromium_headless_shell-* directories, which meant users
with system Chrome or AGENT_BROWSER_EXECUTABLE_PATH configured still
had all browser_* tools gated.

Now checks three sources in priority order:
1. AGENT_BROWSER_EXECUTABLE_PATH env var (if set and points to a real binary)
2. System Chrome/Chromium via shutil.which() (google-chrome, chromium-browser, chrome)
3. Playwright browser cache (existing logic, kept as fallback)
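
The priority order above could be sketched like this. It is an illustration only: the function name, the injectable `env`/`which`/`cache_dir` parameters, and the executable-bit check are assumptions, not the body of _chromium_installed().

```python
import os
import shutil
from pathlib import Path

SYSTEM_BROWSER_NAMES = ("google-chrome", "chromium-browser", "chrome")


def browser_available(env=None, which=shutil.which, cache_dir=None) -> bool:
    """Check the three sources in priority order; True on the first hit."""
    env = os.environ if env is None else env
    # 1. Explicit override: must point at a real, executable file.
    explicit = env.get("AGENT_BROWSER_EXECUTABLE_PATH")
    if explicit and os.path.isfile(explicit) and os.access(explicit, os.X_OK):
        return True
    # 2. System Chrome/Chromium found on PATH.
    if any(which(name) for name in SYSTEM_BROWSER_NAMES):
        return True
    # 3. Playwright-style cache directories, kept as the fallback.
    if cache_dir is not None and Path(cache_dir).is_dir():
        root = Path(cache_dir)
        return (any(root.glob("chromium-*"))
                or any(root.glob("chromium_headless_shell-*")))
    return False
```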

Closes #19294
2026-05-04 01:42:23 -07:00
Yanzhong Su c653f5dc3f Clarify session_search auxiliary model docs 2026-05-04 01:42:07 -07:00
ai-ag2026 8bdec80882 fix(agent): surface preflight compression status
Preflight compression can run synchronously before the first model call when a loaded session exceeds the active context threshold. Gateway users saw no visible progress while the compression LLM call was in flight, which can look like a dropped message during long compactions.

Emit the existing lifecycle status through _emit_status before starting preflight compression so CLI, gateway, and WebUI status callbacks all get immediate feedback.

Adds a regression assertion for the preflight path.

2026-05-04 01:41:51 -07:00
qiqufang d8be50d772 fix(web): add missing icons for config page category sidebar
Add icon mappings for 9 categories that fell back to FileQuestion:
- bedrock (Cloud), curator (Sparkles), kanban (LayoutDashboard)
- model_catalog (BookOpen), openrouter (Route), sessions (History)
- tool_loop_guardrails (Shield), tool_output (FileOutput), updates (RefreshCw)
2026-05-04 01:41:27 -07:00
Teknium 06031229e8 fix(tests): tolerate ps ancestor-walk in find_gateway_pids fallback test (#19590)
Follow-up to #19586 (@cixuuz salvage): _get_ancestor_pids walks ps -o ppid=
up the process tree, which the pre-existing mock in
test_find_gateway_pids_falls_back_to_pid_file_when_process_scan_fails didn't
expect. Return empty stdout so the ancestor loop terminates cleanly and the
original fallback assertion still passes.
2026-05-04 01:40:39 -07:00
liuhao1024 9c93fc5775 fix(tui): call process.exit(0) after Ink exit to trigger terminal cleanup
Ink's exit() calls unmount() which resets terminal modes (kitty keyboard,
mouse, etc.) but does NOT call process.exit().  The Node process stays
alive because stdin is still open (Ink listens on it), so the
process.on('exit') handler in entry.tsx — which sends the final
resetTerminalModes() — never fires.

This left kitty keyboard protocol and other terminal modes enabled in the
parent shell after /quit, Ctrl+C, or Ctrl+D, breaking arrow keys and
other input in subsequent programs.

Add explicit process.exit(0) after exit() in die() so the process
actually terminates and the exit handler runs.

Fixes #19194
2026-05-04 01:39:39 -07:00
Hermes Agent 74c997d985 fix(gateway): move quick-command dispatch before built-in handlers
Quick commands of type "alias" that target built-in slash commands
(e.g. /h -> /model) were processed too late in _handle_message — after
the if-canonical=="model" checks. This meant alias expansion never
reached the target handler and fell through to the LLM as raw text.

Two fixes:
1. Move the quick_commands block before built-in dispatch so alias
   targets (like /model) hit the correct handler after expansion.
2. Extract bare command name from target_command via .split()[0] to
   feed _resolve_cmd() correctly (was using the full arg-string).
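
The second fix can be illustrated with the sketch below. The quick_commands entry shape ("type" and "target" keys) and the function name are hypothetical; only the .split()[0] extraction comes from the commit message.

```python
def expand_alias(text: str, quick_commands: dict):
    """Return (expanded_text, bare_target_command), or (text, None)."""
    name, _, rest = text.lstrip("/").partition(" ")
    entry = quick_commands.get(name)
    if not entry or entry.get("type") != "alias":
        return text, None
    target = (entry.get("target") or "").strip()
    if not target:
        return text, None
    # Only the first token names the command; the rest are its arguments.
    bare = target.split()[0].lstrip("/")
    expanded = f"{target} {rest}".strip()
    return expanded, bare
```

Running this before built-in dispatch means the bare name ("model" for "/h -> /model") is what reaches the command resolver, rather than the full argument string.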
2026-05-04 01:39:23 -07:00
holynn c857592558 fix(cli): allow custom:* provider slugs in model validation
Two related fixes for custom_providers model switching:

1. validate_requested_model() now recognizes custom:<name> slugs
   (e.g. custom:volcengine) as custom endpoints, not generic providers.
   Previously only the bare 'custom' slug matched the relaxed validation
   branch, causing model validation to fail with 'not found in provider
   listing' for all named custom providers.

2. switch_model() now consults the custom_providers list when deciding
   whether to override a validation rejection. If the requested model
   matches the entry's 'model' field or any key in its 'models' dict,
   the switch is accepted even when the remote /v1/models endpoint does
   not list it.

Both changes are covered by existing tests (86 passed).
2026-05-04 01:39:06 -07:00
Byrn Tong e8cdcf5328 fix: exclude ancestor PIDs from gateway process scan (#13242)
_scan_gateway_pids() uses ps-based pattern matching to find running
gateways. When invoked from the CLI (e.g. `hermes gateway status`),
the calling process itself matches gateway patterns, causing false
positives — the CLI is mistakenly counted as a running gateway.

Add _get_ancestor_pids() that walks the process tree from the current
PID up to init (PID 1). Merge this set into exclude_pids at the top
of _scan_gateway_pids() so the entire ancestor chain is filtered out.

This complements the existing os.getpid() exclusion in
_append_unique_pid() by also covering parent/grandparent processes
(e.g. when hermes is invoked via a wrapper script or shell).
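
The ancestor walk can be sketched as below. This is an approximation under stated assumptions: the function names, the depth cap, the cycle guard, and the choice to stop before PID 1 are illustrative, and `parent_of` is injectable so the walk can be exercised without calling ps.

```python
import os
import subprocess
from typing import Callable, Optional, Set


def _ps_parent(pid: int) -> Optional[int]:
    """Ask ps for the parent PID; None on any failure or empty output."""
    try:
        out = subprocess.run(
            ["ps", "-o", "ppid=", "-p", str(pid)],
            capture_output=True, text=True, timeout=5,
        ).stdout.strip()
    except (OSError, subprocess.SubprocessError):
        return None
    return int(out) if out.isdigit() else None


def ancestor_pids(start: Optional[int] = None,
                  parent_of: Callable[[int], Optional[int]] = _ps_parent,
                  max_depth: int = 64) -> Set[int]:
    """Collect parent, grandparent, ... PIDs, stopping at init."""
    pid = os.getpid() if start is None else start
    seen: Set[int] = set()
    for _ in range(max_depth):
        parent = parent_of(pid)
        # Stop on lookup failure, on init (PID 1), or on a cycle.
        if parent is None or parent <= 1 or parent in seen:
            break
        seen.add(parent)
        pid = parent
    return seen
```

The returned set would be merged into the exclusion set before pattern matching, so a wrapper shell that happens to match a gateway pattern is not reported.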

Closes #13242
2026-05-04 01:38:41 -07:00
Aleksandr Pasevin 8a4fe80f8d fix(signal): skip reactions for unauthorized senders
The on_processing_start hook fired a reaction emoji (👀) on every
inbound Signal message before run.py's _is_user_authorized check.
This meant contacts not in SIGNAL_ALLOWED_USERS would see the bot
react to their messages even though Hermes silently dropped them —
leaking the presence of the bot and causing confusing UX.

Two changes to gateway/platforms/signal.py:

1. Read SIGNAL_ALLOWED_USERS into self.dm_allow_from in __init__
   (mirrors the group_allow_from pattern already in place).

2. Add _reactions_enabled(event) — two-gate check:
   - SIGNAL_REACTIONS=false/0/no disables reactions globally
   - If SIGNAL_ALLOWED_USERS is set, only react to senders in
     the allowlist (skips unauthorized contacts)

Both on_processing_start and on_processing_complete now call this
guard before sending any reaction.

Telegram already has an equivalent _reactions_enabled() guard
(controlled by TELEGRAM_REACTIONS). This brings Signal to parity.
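
A rough sketch of the two-gate check, written as a free function over an environment mapping. The comma-separated allowlist format and the default of reactions being on when SIGNAL_REACTIONS is unset are assumptions.

```python
from typing import Mapping


def reactions_enabled(sender: str, env: Mapping[str, str]) -> bool:
    """Gate 1: global switch. Gate 2: sender must be on the allowlist."""
    flag = env.get("SIGNAL_REACTIONS", "true").strip().lower()
    if flag in ("false", "0", "no"):
        return False
    raw = env.get("SIGNAL_ALLOWED_USERS", "")
    allowed = {user.strip() for user in raw.split(",") if user.strip()}
    # An empty allowlist means no restriction is configured.
    if allowed and sender not in allowed:
        return False
    return True
```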
2026-05-04 01:38:21 -07:00
nftpoetrist e89376d66f fix(setup): add missing SLACK_HOME_CHANNEL prompt to _setup_slack()
_setup_slack() was the only platform setup function that did not prompt
for a home channel. All four sibling setups (_setup_telegram,
_setup_discord, _setup_mattermost, _setup_bluebubbles) close with an
identical home-channel block, and setup_gateway() already checks for
SLACK_HOME_CHANNEL presence at the end of the wizard — but the value
was never collected, leaving cron delivery and cross-platform
notifications silently broken for Slack after a fresh hermes setup run.

Add the standard home-channel prompt at the end of _setup_slack(),
symmetric with the Discord implementation. Add two unit tests that
verify the prompt is saved when provided and skipped when left blank.
2026-05-04 01:37:18 -07:00
Byrn Tong 81ce945450 fix(gateway): show other profiles in gateway status to prevent confusion
When multiple gateway profiles are running (e.g. default and wx1),
`hermes gateway status` can be misleading — stopping one profile's
gateway and checking status may still show the other profile's process
without indicating which profile it belongs to.

Add `_print_other_profiles_gateway_status()` which displays running
gateways from other profiles at the bottom of the status output:

    Other profiles:
      ✓ wx1              — PID 166893

This uses the existing `find_profile_gateway_processes()` and
`get_active_profile_name()` — no new dependencies.

Closes #19113
Related: #4402, #4587
2026-05-04 01:37:02 -07:00
wanazhar df88375f0d fix: treat ctrl-c as curses cancel 2026-05-04 01:36:44 -07:00
leavr ccb5d87076 test: cover max-iterations summary message sanitization 2026-05-04 01:36:27 -07:00
tmdgusya a1cb811cb8 fix(cli): avoid voice TTS restart race 2026-05-04 01:36:07 -07:00
Teknium 314fe9f827 chore(release): add AUTHOR_MAP entries for upcoming salvage batch
Pre-adds author-email mappings for the 21 Tier 1b salvage PRs so
their cherry-picked commits land with mapped GitHub logins in the
release notes.
2026-05-04 01:34:32 -07:00
ethan 645b99aadd test(cron): cover null next_run_at recovery and non-dict origin tolerance
Adds four regression tests guarding the bugfix in the previous commit:
- TestGetDueJobs::test_broken_cron_without_next_run_is_recovered exercises
  cron schedules whose next_run_at was lost; expects compute_next_run to
  repopulate it within get_due_jobs() rather than silently skipping the job.
- TestGetDueJobs::test_broken_interval_without_next_run_is_recovered does
  the same for interval schedules.
- TestResolveOrigin::test_string_origin_is_tolerated and
  test_non_dict_origin_is_tolerated confirm _resolve_origin() returns None
  for legacy/hand-edited origins (string, list, int) instead of raising.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-05-04 01:32:58 -07:00
ethan 78b635ee3c fix(cron): recover null next_run_at jobs and tolerate non-dict origin
Fixes #18722

get_due_jobs() now recomputes next_run_at via compute_next_run() for
cron/interval jobs that arrived with null next_run_at (e.g. via direct
jobs.json edits) instead of silently skipping them. _resolve_origin()
guards with isinstance(origin, dict), and _deliver_result() now routes
through _resolve_origin() so string/non-dict origins no longer crash
the ticker.
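
The two guards can be sketched as follows. The job dictionary layout and the single-argument compute_next_run callable are simplifications for illustration; the real signatures in the cron module may differ.

```python
from typing import Any, Callable, Optional


def resolve_origin(job: dict) -> Optional[dict]:
    """Return the origin only when it is a dict; tolerate anything else."""
    origin = job.get("origin")
    return origin if isinstance(origin, dict) else None


def ensure_next_run(job: dict,
                    compute_next_run: Callable[[dict], Any]) -> dict:
    """Repopulate a missing next_run_at instead of skipping the job."""
    if job.get("next_run_at") is None:
        job["next_run_at"] = compute_next_run(job)
    return job
```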

References #18735 (an open competing fix from an automated bulk PR touching 79 files); this PR is a focused single-issue contribution and adds the missing interval-recovery test variant.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-05-04 01:32:58 -07:00
Teknium 91ea3ae4b2 test(skills): add bytes-vs-str equivalence and on-disk hash parity tests
Follow-up on #9925 cherry-pick adding two additional tests:
- bytes content hashes identically to its str-decoded form
- mixed bytes+str bundle hash equals the on-disk content_hash from
  skills_guard (the production invariant used to detect drift)

Also map dodofun@126.com and 1615063567@qq.com in AUTHOR_MAP so the
CI contributor check passes for the cherry-picked commit.

Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>
Co-authored-by: zhao0112 <1615063567@qq.com>
2026-05-04 01:28:12 -07:00
dh 3072e5543b skills-hub: hash binary skill bundle files correctly 2026-05-04 01:28:12 -07:00
Teknium c90f25dd1f chore(release): map daixin1204@gmail.com to @SimbaKingjoe 2026-05-04 01:21:23 -07:00
daixin1204 744079ffe6 fix(curator): prevent false-positive consolidation from substring matching
_classify_removed_skills used naive 'in' substring matching to detect
whether a removed skill's name appeared in skill_manage arguments.
Short/common skill names (api, git, test, foo, etc.) matched
incorrectly when they appeared as substrings of longer words in file
paths (references/api-design.md) or content (latest, testing).

Replace with field-aware matching:
- file_path: needle must match a complete filename stem or directory
  name, with -/_ normalised for variant tolerance
- content fields: word-boundary regex (\b) prevents embedding in
  longer words
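
The two matching rules might be sketched as below. The function names are hypothetical, and normalising to lowercase with "_" mapped to "-" is one possible reading of the "-/_ normalised" rule.

```python
import re
from pathlib import PurePosixPath


def _norm(text: str) -> str:
    """Lowercase and treat '_' and '-' as the same separator."""
    return text.lower().replace("_", "-")


def name_in_path(skill: str, file_path: str) -> bool:
    """True only if the skill equals a whole directory name or file stem."""
    path = PurePosixPath(file_path)
    components = [_norm(part) for part in path.parts[:-1]]
    components.append(_norm(path.stem))
    return _norm(skill) in components


def name_in_content(skill: str, content: str) -> bool:
    """True only if the skill appears as a whole word (\\b on both sides)."""
    return re.search(rf"\b{re.escape(skill)}\b", content) is not None
```

With these rules, "api" no longer matches references/api-design.md and "test" no longer matches "latest" or "testing".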

Also add 3 regression tests covering the false-positive scenarios.
2026-05-04 01:21:23 -07:00
Clooooode c0300575c1 fix(kanban): use get_default_hermes_root() in list_profiles_on_disk
Path.home() / ".hermes" / "profiles" breaks custom-root deployments
(e.g. HERMES_HOME=/opt/data). Switch to get_default_hermes_root() so
profile discovery is consistent with kanban_db_path() and
workspaces_root() fixed in #18985.

Fixes #19017.
Related to #18442, #18985.
2026-05-04 01:21:14 -07:00
Clooooode 1964b0565b test(kanban): add failing test for list_profiles_on_disk with custom HERMES_HOME
list_profiles_on_disk() hardcodes Path.home() / ".hermes" / "profiles",
ignoring HERMES_HOME when set to a custom root (e.g. /opt/data).

Add test_list_profiles_on_disk_custom_root to cover this case.

Related to #18442, #18985.
2026-05-04 01:21:14 -07:00
Siddharth Balyan 8163d37192 fix(skill): reference built-in video_analyze/vision_analyze tools in kanban-video-orchestrator (#19562)
The tool-matrix.md had a vague 'Gemini multimodal / Claude vision' entry
in the external tools table that didn't point to the actual built-in
Hermes tools. Now that video_analyze exists (merged in #19301), update
the skill to reference it properly:

- Add 'Built-in Hermes tools for media review' section with proper
  toolset names, enablement instructions, and capability details
- Add video + vision toolsets to cinematographer, editor, and reviewer
  profile configs
- Update role-archetypes.md to reference tools by name
- Update API key table to explain video_analyze routing
2026-05-04 12:54:50 +05:30
Siddharth Balyan a11aed1acc fix(cli): local backend CLI always uses launch directory, stops .env sync of TERMINAL_CWD (#19334)
The old CWD heuristic was fooled by:
1. TERMINAL_CWD persisted to .env by `hermes config set terminal.cwd`
2. Inherited TERMINAL_CWD from parent hermes processes
3. Only resolved when config had a placeholder value (not explicit paths)

Fix:
- load_cli_config() unconditionally uses os.getcwd() for local backend
- TERMINAL_CWD always force-exported in CLI mode (overrides stale values)
- Gateway sets _HERMES_GATEWAY=1 marker so lazy cli.py imports don't clobber
- Remove terminal.cwd from config-set .env sync map (prevents re-poisoning)
- Clarify setup wizard label as 'Gateway working directory'

Closes #19214
2026-05-04 11:36:19 +05:30
Ben Barclay 434d70d8bc Merge pull request #19540 from NousResearch/single_container_for_all
feat(docker): launch dashboard as side-process via HERMES_DASHBOARD=1
2026-05-04 15:38:19 +10:00
Ben 5671059f62 feat(docker): launch dashboard as side-process via HERMES_DASHBOARD=1
Adds an optional dashboard side-process to the container entrypoint,
toggled by `HERMES_DASHBOARD=1` (also accepts `true` / `yes`).  When set,
the entrypoint backgrounds `hermes dashboard` before `exec`-ing the main
command so the user's chosen foreground process (gateway, chat, `sleep
infinity`, …) remains PID-of-interest for the container runtime.
  docker run -d \
    -v ~/.hermes:/opt/data \
    -p 8642:8642 -p 9119:9119 \
    -e HERMES_DASHBOARD=1 \
    nousresearch/hermes-agent gateway run
Defaults chosen for the container case:
 - Host: 0.0.0.0 (reachable through published port; can override to
   127.0.0.1 via HERMES_DASHBOARD_HOST for sidecar/reverse-proxy setups)
 - Port: 9119 (matches `hermes dashboard`)
 - Auto-adds `--insecure` when binding to non-localhost, matching the
   dashboard's own safety gate for exposing API keys
 - HERMES_DASHBOARD_TUI is read by `hermes dashboard` directly — no
   entrypoint plumbing needed
Dashboard output is prefixed with `[dashboard]` via `stdbuf`+`sed -u` so
it's easy to separate from gateway logs in `docker logs`.  No supervision:
if the dashboard crashes it stays down until the container restarts
(documented in the `:::note` panel).
Other changes bundled in:
 - Deprecate GATEWAY_HEALTH_URL / GATEWAY_HEALTH_TIMEOUT env vars in
   hermes_cli/web_server.py with a DEPRECATED block comment and a
   `.. deprecated::` note on _probe_gateway_health.  The feature still
   works for this release; it'll be removed alongside the move to a
   first-class dashboard config key.
 - Rewrite the "Running the dashboard" doc section around the new
   single-container pattern.  Drops the previously-documented
   dashboard-as-its-own-container setup — that pattern relied on the
   deprecated env vars for cross-container gateway-liveness detection,
   and without them the dashboard would permanently report the gateway
   as "not running".
 - Collapse the two-service Compose example (gateway + dashboard
   container) into a single service with HERMES_DASHBOARD=1.  Removes
   the now-unnecessary bridge network and `depends_on`.
 - Drop the ":::warning" caveat about "Running a dashboard container
   alongside the gateway is safe" — that case no longer exists.
2026-05-04 15:37:27 +10:00
Ben Barclay 95f395027f Merge pull request #19520 from NousResearch/fix_docker_tui
fix(docker/tui): tolerate npm's peer-flag drop in lockfile comparison
2026-05-04 14:29:43 +10:00
Ben 2f2998bb1b fix(tui): tolerate npm's peer-flag drop in lockfile comparison
`_tui_need_npm_install()` compares the canonical `package-lock.json` against
the hidden `node_modules/.package-lock.json` to decide whether `npm install`
needs to re-run. npm 9 drops the `"peer": true` field from the hidden lock
on dev-deps that are *also* declared as peers (the canonical lock preserves
the dual annotation). That made the check flag 16 packages (`@babel/core`,
`@types/node`, `@types/react`, `@typescript-eslint/*`, `react`, `vite`,
`tsx`, `typescript`, …) as mismatched on every launch, triggering a runtime
`npm install`.
Inside the Docker image, that runtime install then fails with EACCES because
`/opt/hermes/ui-tui/node_modules/` is root-owned from build time, so
`docker run … hermes-agent --tui` prints:
    Installing TUI dependencies…
    npm install failed.
…and exits 1, with no preview. The empty preview is a second bug: the
launcher captured only stderr, but npm 9 writes EACCES to stdout, which
was DEVNULL'd.
Fixes:
 - Add `"peer"` to `_NPM_LOCK_RUNTIME_KEYS` so the comparison ignores the
   non-deterministic field, alongside the existing `"ideallyInert"`.
 - Capture stdout as well as stderr in the install subprocess so future
   failures surface a useful preview instead of a bare "failed." line.
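
The first fix amounts to dropping non-deterministic keys before comparing the two lockfiles. The sketch below is a simplified assumption of how such a comparison could work: the function names are hypothetical, and skipping the root "" entry (which the hidden lockfile does not carry) is a simplification rather than a description of _tui_need_npm_install().

```python
RUNTIME_KEYS = {"peer", "ideallyInert"}


def _comparable(packages: dict) -> dict:
    """Drop the root entry and any keys npm may rewrite at install time."""
    return {
        name: {k: v for k, v in meta.items() if k not in RUNTIME_KEYS}
        for name, meta in packages.items()
        if name  # "" is the root project entry in the canonical lockfile
    }


def need_install(canonical_lock: dict, hidden_lock: dict) -> bool:
    """True when the installed tree differs in a way that matters."""
    return (_comparable(canonical_lock.get("packages", {}))
            != _comparable(hidden_lock.get("packages", {})))
```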
Regression tests:
 - `test_no_install_when_only_peer_annotation_differs` — the exact scenario
 - `test_install_when_version_differs_even_with_peer_drop` — guards against
   the peer-drop tolerance masking a real version skew
On-host impact: the same false-positive was firing on every `hermes --tui`
invocation from a normal checkout, silently running a no-op `npm install`
each time (it converged because the host's `node_modules/` is writable).
Startup time on the TUI should drop noticeably.
2026-05-04 14:13:38 +10:00
Chris Danis 363cc93674 fix(cron): bump skill usage when cron jobs load skills
Cron jobs that reference skills via their skills: config never bumped
the usage counters in .usage.json, so the curator could auto-archive
skills actively used by cron jobs based on stale timestamps.

Now _build_job_prompt() calls bump_use(skill_name) for each
successfully loaded skill so the curator sees them as active.
2026-05-03 17:06:48 -07:00
nftpoetrist 808fee151d fix(auxiliary): propagate explicit_api_key to _try_anthropic()
_try_anthropic() lacked the explicit_api_key parameter added to
_try_openrouter() in #18768. When resolve_provider_client() is called
with provider="anthropic" and an explicit key (e.g. from a fallback_model
entry with api_key set), the key was silently ignored — _try_anthropic()
always fell back to resolve_anthropic_token(), so the fallback returned
None,None for users without a default Anthropic credential configured.

Fix: add explicit_api_key: str = None to _try_anthropic() and use
explicit_api_key or <pool/env fallback> in both the pool-present and
no-pool paths. Pass explicit_api_key=explicit_api_key at the call site
in resolve_provider_client(). Symmetric with the _try_openrouter() fix.
No behavior change when explicit_api_key is None.
2026-05-03 17:00:55 -07:00
molvikar 74636f9c4a fix(gateway): clear queued reload-skills notes on new/resume/branch 2026-05-03 17:00:31 -07:00
Kenny Wang 222767e5e8 fix: sanitize Telegram help command mentions 2026-05-03 17:00:09 -07:00
konsisumer 6fda92aa7f fix(gateway): bridge top-level require_mention to Telegram config
Users commonly place `require_mention: true` at the top level of
config.yaml alongside `group_sessions_per_user`, expecting it to gate
Telegram group messages. The key was silently ignored because the
config loader only checked `yaml_cfg["telegram"]["require_mention"]`.

When `require_mention` is found at the top level and no telegram-specific
value is set, the fix now:
- adds it to platforms_data["telegram"]["extra"] so _telegram_require_mention()
  picks it up via the primary config.extra path
- sets TELEGRAM_REQUIRE_MENTION env var for the secondary fallback path

A telegram-specific value (telegram.require_mention) still takes
precedence over the top-level shorthand.
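
The precedence rule can be sketched as a pure function over the parsed YAML. The function name and the None return for "not configured" are illustrative choices, not the loader's actual interface.

```python
from typing import Optional


def telegram_require_mention(yaml_cfg: dict) -> Optional[bool]:
    """Platform-specific value wins; top-level key is the shorthand."""
    telegram = yaml_cfg.get("telegram")
    if isinstance(telegram, dict) and "require_mention" in telegram:
        return bool(telegram["require_mention"])
    if "require_mention" in yaml_cfg:
        return bool(yaml_cfg["require_mention"])
    return None  # neither location sets it
```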

Also corrects telegram.md: bare /cmd without @botname is rejected when
require_mention is enabled; only /cmd@botname (bot-menu form) passes.

Fixes #3979
2026-05-03 16:59:46 -07:00
clawbot 1bd975c0ba fix(gateway): suppress duplicate voice transcripts
Deduplicate exact and near-exact Discord voice STT transcripts per guild/user over a short window to avoid duplicate delayed agent replies.

Adds regression tests for exact and near-duplicate voice transcript suppression.
2026-05-03 16:59:21 -07:00
Teknium b58db237e4 fix(kanban): drop worker identity claim from KANBAN_GUIDANCE (#19427)
KANBAN_GUIDANCE layer 3 of the system prompt started with 'You are a
Kanban worker', overriding the profile's SOUL.md identity at layer 1.
Profiles with strict role boundaries (e.g. a reviewer profile that
never writes code) still executed implementation tasks because the
kanban identity claim diluted SOUL's.

Drop the identity line. Layer 3 now describes the task-execution
protocol only; SOUL.md remains the sole identity slot.

Fixes #19351
2026-05-03 16:59:00 -07:00
LeonSGP43 6713274a42 fix(file): strip leaked terminal fences from reads 2026-05-03 16:58:50 -07:00
Alan Chen 2d7543c61f fix(windows): enforce UTF-8 stdout/stderr to prevent UnicodeEncodeError crash
On Windows, services and terminals default to cp1252 encoding. The CLI
uses box-drawing characters (┌│├└─) in banners, doctor output, and
status displays. When print() tries to encode these under cp1252, an
unhandled UnicodeEncodeError crashes the gateway on startup.

This fix adds early UTF-8 enforcement in hermes_cli/__init__.py:
- Sets PYTHONUTF8=1 and PYTHONIOENCODING=utf-8
- Re-opens stdout/stderr with UTF-8 encoding if not already UTF-8

Runs at import time so it protects all CLI subcommands. No effect on
Unix (gated on sys.platform == "win32"). Backwards-compatible: on
systems already using UTF-8, the function is a no-op.
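
One way to express this enforcement is sketched below. Note the swap: the commit describes re-opening stdout/stderr, whereas this sketch uses TextIOWrapper.reconfigure() (Python 3.7+) to the same effect. The function name and the injectable parameters are assumptions added so the logic can be exercised on any platform.

```python
import os
import sys


def enforce_utf8(platform=None, environ=None, streams=None) -> bool:
    """Force UTF-8 output on Windows; return False (no-op) elsewhere."""
    platform = sys.platform if platform is None else platform
    if platform != "win32":
        return False
    environ = os.environ if environ is None else environ
    environ["PYTHONUTF8"] = "1"            # inherited by child processes
    environ["PYTHONIOENCODING"] = "utf-8"
    targets = (sys.stdout, sys.stderr) if streams is None else streams
    for stream in targets:
        encoding = (getattr(stream, "encoding", "") or "").lower()
        if encoding.replace("-", "") == "utf8":
            continue  # already UTF-8, leave it alone
        if hasattr(stream, "reconfigure"):
            stream.reconfigure(encoding="utf-8")
    return True
```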

Fixes #10956
2026-05-03 16:58:25 -07:00
Teknium 2ababfe6ed chore(release): map 0xKingBack noreply email 2026-05-03 16:55:16 -07:00
0xKingBack 3c42024539 fix(curator): pass auxiliary curator api_key/base_url into runtime resolution
Curator review fork now forwards per-slot credentials from auxiliary.curator
and legacy curator.auxiliary to resolve_runtime_provider, matching the
canonical aux task schema. Add regression tests for binding and main fallback.
2026-05-03 16:55:16 -07:00
Kiala 3792b77bd1 fix(send_message): support QQBot C2C and group chats
The _send_qqbot function was hardcoded to use the guild channel
endpoint (/channels/{id}/messages), which fails for C2C private
chats and QQ groups with 'channel does not exist' (code 11263).

This change tries the appropriate endpoints in order:
1. /channels/{id}/messages     (guild channels)
2. /v2/users/{id}/messages     (C2C private chats)
3. /v2/groups/{id}/messages    (QQ groups)
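
The ordered fallback can be sketched as below. The `post` callable is a hypothetical stand-in for the HTTP call and is assumed to return an (ok, detail) pair; the result dictionary shape is likewise illustrative.

```python
from typing import Callable, Tuple

ENDPOINT_TEMPLATES = (
    "/channels/{id}/messages",   # guild channels
    "/v2/users/{id}/messages",   # C2C private chats
    "/v2/groups/{id}/messages",  # QQ groups
)


def send_with_fallback(target_id: str, payload: dict,
                       post: Callable[[str, dict], Tuple[bool, str]]) -> dict:
    """Try each endpoint in order and stop at the first success."""
    last_error = "no endpoint attempted"
    for template in ENDPOINT_TEMPLATES:
        path = template.format(id=target_id)
        ok, detail = post(path, payload)
        if ok:
            return {"success": True, "endpoint": path}
        last_error = detail  # keep the most recent failure for reporting
    return {"error": last_error}
```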

Fixes active sending to QQBot C2C and group recipients.
2026-05-03 16:54:39 -07:00
MrBob 86e64c1d3b fix(gateway): hide required-arg commands from Telegram menu 2026-05-03 15:29:06 -07:00
sprmn24 408dd8aa28 fix(compressor): skip non-string tool content in dedup pass to prevent AttributeError 2026-05-03 15:28:30 -07:00
sprmn24 5bd937533c fix(vision): guard user_prompt type in video_analyze_tool before debug_call_data construction 2026-05-03 15:28:04 -07:00
sprmn24 6c4aca7adc fix(vision): guard user_prompt type before debug_call_data construction 2026-05-03 15:27:40 -07:00
Zyproth a5cae16496 fix(api_server): fall back to default port on malformed API_SERVER_PORT 2026-05-03 15:27:03 -07:00
Amit Gaur 65bebb9b80 fix(cli): follow 307 redirects in MiniMax OAuth httpx clients
The MiniMax OAuth API endpoints have moved from api.minimax.io to
account.minimax.io and the old paths now respond with HTTP 307.
httpx defaults to follow_redirects=False (unlike requests), so the
device-code and token-refresh flows fail with "Temporary Redirect".

Adds follow_redirects=True to the two httpx.Client instances in
hermes_cli/auth.py used by the MiniMax OAuth flow. This is forward-
compatible -- if endpoints move again, the redirect chain is
followed automatically.

Repro before patch:
  curl -i -X POST https://api.minimax.io/oauth/code  # -> 307
  curl -i -X POST https://api.minimax.io/oauth/token # -> 307

Verified end-to-end against a real MiniMax Plus account on macOS;
the existing tests/test_minimax_oauth.py suite (15 tests) still
passes.
2026-05-03 15:26:33 -07:00
Zyproth dfdd7b6e6f fix(codex-transport): preserve request override headers for xai responses 2026-05-03 15:25:45 -07:00
LeonSGP43 4a2f822137 fix(mcp): reconnect on terminated sessions 2026-05-03 15:23:33 -07:00
teknium1 2658494e81 fix(kanban): add per-path env overrides + dispatcher env injection
Layers defense-in-depth on top of the shared-root anchoring (base commit).

Changes in hermes_cli/kanban_db.py:
- kanban_db_path() now honours HERMES_KANBAN_DB first, then falls through
  to kanban_home()/kanban.db.
- workspaces_root() now honours HERMES_KANBAN_WORKSPACES_ROOT first, then
  falls through to kanban_home()/kanban/workspaces.
- All three overrides (HERMES_KANBAN_HOME, HERMES_KANBAN_DB,
  HERMES_KANBAN_WORKSPACES_ROOT) now call .expanduser() for consistency.
- _default_spawn() injects HERMES_KANBAN_DB and
  HERMES_KANBAN_WORKSPACES_ROOT into the worker subprocess env. Even
  when the worker's get_default_hermes_root() resolution somehow
  disagrees with the dispatcher's (symlinks, unusual Docker layouts),
  the two processes still open the same SQLite file.

Module docstring updated to describe all three overrides and the
dispatcher env-injection contract.

Tests (tests/hermes_cli/test_kanban_db.py, TestSharedBoardPaths):
- test_hermes_kanban_db_pin_beats_kanban_home
- test_hermes_kanban_workspaces_root_pin_beats_kanban_home
- test_empty_per_path_overrides_fall_through
- test_dispatcher_spawn_injects_kanban_db_and_workspaces_root
  (monkeypatches subprocess.Popen, asserts both env vars reach the
  child even after HERMES_HOME is rewritten by `hermes -p <profile>`.)

Docs: website/docs/reference/environment-variables.md gets entries
for the three kanban env vars.

This fusion is built on the cleanest of the seven competing PRs that
targeted issue #18442:

* Base commit (from PR #19350 by @GodsBoy): add `kanban_home()` helper
  anchored at `get_default_hermes_root()`, reroute all 5 kanban path
  sites through it (including the 3 sibling log-dir sites that the
  other six PRs missed), 8-test regression class.
* Dispatcher env-var injection approach drawn from PRs #18300
  (@quocanh261997) and #19100 (@cg2aigc).
* Per-path env overrides drawn from PR #19100 (@cg2aigc).
* get_default_hermes_root() resolution direction first proposed in
  PR #18503 (@beibi9966) and PR #18985 (@Gosuj).

Closes the duplicate/competing PRs: #18300, #18503, #18670, #18985,
#19037, #19056, #19100. Fixes #18442 and #19348.

Co-authored-by: quocanh261997 <17986614+quocanh261997@users.noreply.github.com>
Co-authored-by: cg2aigc <232694053+cg2aigc@users.noreply.github.com>
Co-authored-by: beibi9966 <beibei1988@proton.me>
Co-authored-by: Gosuj <123411271+Gosuj@users.noreply.github.com>
Co-authored-by: LeonSGP43 <154585401+LeonSGP43@users.noreply.github.com>
2026-05-03 15:13:39 -07:00
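The override order described in 2658494e81 can be sketched roughly as follows. This is a minimal illustration, not the code in hermes_cli/kanban_db.py: the `default_root` parameter stands in for whatever `get_default_hermes_root()` returns, `worker_env()` is a hypothetical helper name, and treating an empty variable as unset is inferred from the test name `test_empty_per_path_overrides_fall_through`.

```python
import os
from pathlib import Path


def kanban_home(default_root: Path) -> Path:
    """HERMES_KANBAN_HOME wins; otherwise use the shared root (assumed input)."""
    override = os.environ.get("HERMES_KANBAN_HOME", "").strip()
    return Path(override).expanduser() if override else default_root


def kanban_db_path(default_root: Path) -> Path:
    """Per-path pin first, then fall through to kanban_home()/kanban.db."""
    override = os.environ.get("HERMES_KANBAN_DB", "").strip()
    if override:
        return Path(override).expanduser()
    return kanban_home(default_root) / "kanban.db"


def workspaces_root(default_root: Path) -> Path:
    """Per-path pin first, then fall through to kanban_home()/kanban/workspaces."""
    override = os.environ.get("HERMES_KANBAN_WORKSPACES_ROOT", "").strip()
    if override:
        return Path(override).expanduser()
    return kanban_home(default_root) / "kanban" / "workspaces"


def worker_env(default_root: Path) -> dict:
    """Hypothetical helper: build a child env that pins both resolved paths."""
    env = dict(os.environ)
    env["HERMES_KANBAN_DB"] = str(kanban_db_path(default_root))
    env["HERMES_KANBAN_WORKSPACES_ROOT"] = str(workspaces_root(default_root))
    return env
```

Because the dispatcher writes the already-resolved paths into the child environment, the worker opens the same SQLite file even if its own root resolution would have produced a different directory.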
GodsBoy f5bd77b3e1 fix(kanban): anchor board, workspaces, and worker logs at the shared Hermes root
The Kanban board is documented as shared across all Hermes profiles, but
`kanban_db_path()` and `workspaces_root()` resolved through `get_hermes_home()`,
which returns the active profile's HERMES_HOME. When the dispatcher spawned a
worker with `hermes -p <profile> --skills kanban-worker chat -q "work kanban
task <id>"`, the worker rewrote HERMES_HOME to the profile subdirectory before
kanban_db.py imported, opening a profile-local `kanban.db` that did not contain
the dispatcher's task. `kanban_show` and `kanban_complete` failed; the
dispatcher's row stayed `running` and was retried/crashed. The same defect
applied to `_default_spawn`'s log directory and `worker_log_path`, so
`hermes kanban tail` did not see the worker's output.

Add `kanban_home()` in `hermes_cli/kanban_db.py` that resolves through
`HERMES_KANBAN_HOME` (explicit override) then `get_default_hermes_root()`,
which already understands the `<root>/profiles/<name>` and Docker / custom
HERMES_HOME shapes. Reroute `kanban_db_path`, `workspaces_root`, the
`_default_spawn` log directory, `gc_worker_logs`, and `worker_log_path`
through it. Profile-specific config, `.env`, memory, and sessions stay
isolated as before; only the kanban surface is shared.

Add a `TestSharedBoardPaths` regression class to `tests/hermes_cli/test_kanban_db.py`
covering: default install, profile-worker convergence, Docker custom HERMES_HOME,
Docker profile layout, explicit `HERMES_KANBAN_HOME` override, and a real
SQLite round-trip across dispatcher and worker HERMES_HOME perspectives.
The dispatcher/worker convergence tests fail on origin/main and pass after
the fix.

Update the `kanban.md` user-guide page and the misleading docstrings in
`kanban_db.py` to describe the shared-root behavior.

Fixes #19348
2026-05-03 15:13:39 -07:00
Siddharth Balyan 167b5648ea Revert "fix(cli): CLI/TUI on local backend always uses launch directory, ignores terminal.cwd (#19242)" (#19329)
This reverts commit 9eaddfafa3.
2026-05-04 00:43:58 +05:30
Siddharth Balyan 9eaddfafa3 fix(cli): CLI/TUI on local backend always uses launch directory, ignores terminal.cwd (#19242)
CLI/TUI sessions on the local backend now unconditionally use
os.getcwd() as the working directory. The terminal.cwd config value is
only consumed by gateway/cron/delegation modes (where there's no shell
to cd from).

Previously, 'hermes setup' would write an absolute path (e.g. $HOME)
into terminal.cwd which then pinned the CLI to that directory regardless
of where the user launched hermes from. This was a silent foot-gun —
the user's 'cd' was being ignored.

Changes:

1. cli.py: Restructured CWD resolution — if TERMINAL_CWD is not already
   set by the gateway, and the backend is local, always use os.getcwd().
   Config terminal.cwd is irrelevant for interactive CLI/TUI sessions.

2. setup.py: Moved the cwd prompt from setup_terminal_backend() to
   setup_gateway(). It now only appears when configuring messaging
   platforms and is labeled 'Gateway working directory'.

3. Tests: Rewrote test_cwd_env_respect.py to validate the new behavior:
   explicit config paths are ignored for CLI, gateway pre-set values are
   preserved, non-local backends keep their config paths.

4. Docs: Updated configuration.md, profiles.md, and
   environment-variables.md to clarify that terminal.cwd only affects
   gateway/cron mode on local backend.

Closes #19214
2026-05-04 00:14:36 +05:30
GodsBoy b8ae8cc801 fix(debug): redact log content at upload time in hermes debug share
Apply agent.redact.redact_sensitive_text with force=True to log content
captured by _capture_log_snapshot before it reaches upload_to_pastebin.
On-disk logs are untouched. Compatible with the off-by-default local
redaction policy from #16794: this is upload-time-only and applies
regardless of security.redact_secrets because the public paste service
is the leak surface. A visible banner is prepended to each uploaded log
paste so reviewers know redaction was applied. --no-redact preserves
deliberate unredacted sharing for maintainer-coordinated cases.

The bug-report, setup-help, and feature-request issue templates direct
users to run hermes debug share and paste the resulting public URLs.
With redaction off by default per #16794, those uploads have been
carrying credentials onto paste.rs and dpaste.com.

force=True is non-negotiable: without it, redact_sensitive_text
short-circuits at agent/redact.py:322 when the env var is unset, so the
fix would silently be a no-op for its target audience. A regression
test pins this down.

Fixes #19316
2026-05-03 11:42:20 -07:00
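Why `force=True` matters in b8ae8cc801 can be shown with a small stand-in. Everything here is illustrative: `redact_sketch`, `prepare_log_for_upload`, the `EXAMPLE_REDACT_SECRETS` variable, the single token pattern, and the banner text are assumptions, not the contents of agent/redact.py or the debug share command.

```python
import os
import re

# Hypothetical pattern; real redaction would cover many more secret shapes.
_TOKEN = re.compile(r"sk-[A-Za-z0-9]{16,}")


def redact_sketch(text: str, force: bool = False) -> str:
    """Redact only when the (assumed) opt-in env var is set, or when forced."""
    enabled = os.environ.get("EXAMPLE_REDACT_SECRETS", "").lower() in ("1", "true")
    if not (enabled or force):
        return text  # the short-circuit that would make the fix a no-op
    return _TOKEN.sub("[REDACTED]", text)


def prepare_log_for_upload(log_text: str, redact: bool = True) -> str:
    """Upload-time path: always force redaction unless explicitly disabled."""
    if not redact:  # corresponds to the --no-redact escape hatch
        return log_text
    banner = "[redacted before upload]\n"  # assumed banner wording
    return banner + redact_sketch(log_text, force=True)
```

With the opt-in variable unset, `redact_sketch(text)` returns its input unchanged, which is why the upload path has to pass `force=True` rather than rely on the local policy.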
Siddharth Balyan c9a3f36f56 feat: add video_analyze tool for native video understanding (#19301)
* feat: add video_analyze tool for native video understanding

Adds a video_analyze tool that sends video files to multimodal LLMs
(e.g. Gemini) for analysis via the OpenRouter-compatible video_url
content type. Mirrors vision_analyze in structure, error handling,
and registration pattern.

Key design:
- Base64 encodes entire video (no frame extraction, no ffmpeg dep)
- Uses 'video_url' content block type (OpenRouter standard)
- Supports mp4, webm, mov, avi, mkv, mpeg formats
- 50 MB hard cap, 20 MB warning threshold
- 180s minimum timeout (videos take longer than images)
- AUXILIARY_VIDEO_MODEL env override, falls back to AUXILIARY_VISION_MODEL
- Same SSRF protection, retry logic, and cleanup as vision_analyze

Default disabled: registered in 'video' toolset (not in _HERMES_CORE_TOOLS).
Users opt in via: hermes tools enable video, or enabled_toolsets=['video'].

* feat(video): add models.dev capability pre-check + CONFIGURABLE_TOOLSETS entry

- Pre-checks model video capability via models.dev modalities.input
  before expensive base64 encoding. Fails early with helpful message
  suggesting video-capable alternatives (gemini, mimo-v2.5-pro).
- Passes optimistically if model unknown or lookup fails.
- Adds ModelInfo.supports_video_input() helper.
- Adds 'video' to CONFIGURABLE_TOOLSETS and _DEFAULT_OFF_TOOLSETS
  so 'hermes tools enable video' works from CLI.
- 8 new tests for the capability check (37 total).

* refactor(video): remove models.dev capability pre-check

Removes _check_video_model_capability and ModelInfo.supports_video_input.
The vision_analyze tool doesn't pre-check image capability either — both
tools rely on the same pattern: send request, handle API errors gracefully
with categorized user-facing messages. The pre-check was inconsistent
(only worked for some providers/models) so drop it for parity.

* cleanup: compress comments, fix fragile timeout coupling

- Replace _VISION_DOWNLOAD_TIMEOUT * 2 with hardcoded 60s (no silent
  breakage if vision timeout changes independently)
- Strip verbose comments and redundant log lines throughout
- No behavioral changes
2026-05-04 00:04:36 +05:30
SHL0MS 0dd8e3f8d8 rename: video-orchestrator → kanban-video-orchestrator
The kanban prefix makes the skill discoverable alongside `kanban-orchestrator`
and `kanban-worker`, and signals up front that this skill drives the kanban
plugin rather than being a generic video tool.

Updated:
- directory rename
- SKILL.md frontmatter `name:` and H1
- setup.sh.tmpl header
2026-05-03 10:26:54 -07:00
SHL0MS 511add7249 feat(skill): add video-orchestrator optional creative skill
Meta-pipeline that wraps any video request — narrative film, product /
marketing, music video, explainer, ASCII, generative, comic, 3D,
real-time/installation — in a Hermes Kanban pipeline. Performs adaptive
discovery, designs an appropriate team for the requested style, generates
the setup script that creates Hermes profiles + initial kanban task, and
helps monitor execution.

Routes scenes to whichever existing Hermes skill fits each beat
(`ascii-video`, `manim-video`, `p5js`, `comfyui`, `touchdesigner-mcp`,
`blender-mcp`, `pixel-art`, `baoyu-comic`, `claude-design`, `excalidraw`,
`songsee`, `heartmula`, …) plus external APIs for TTS, image-gen, and
image-to-video. Kanban orchestration uses the `kanban-orchestrator` and
`kanban-worker` skills.

The single-project workspace layout, profile-config patching pattern,
SOUL.md-per-profile model, and `--workspace dir:<path>` discipline are
adapted from alt-glitch's original kanban-video-pipeline at
https://github.com/NousResearch/kanban-video-pipeline. This skill
generalizes those patterns across video styles and replaces the original
string-replacement config patcher with a PyYAML-based one that touches
only `toolsets` and `skills.always_load` (preserving security-sensitive
fields like `approvals.mode`).

Includes:
- SKILL.md — workflow + critical rules
- references/ — intake, role archetypes, tool matrix, kanban setup,
  monitoring, six worked examples
- assets/ — brief / setup.sh / soul.md templates
- scripts/ — bootstrap_pipeline.py (plan.json -> setup.sh) and
  monitor.py (poll + issue detection)

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
2026-05-03 10:26:54 -07:00
brooklyn! e97a9993b9 Merge pull request #19307 from NousResearch/bb/fix-terminal-resize-jumble
fix(tui): clear Apple Terminal resize artifacts
2026-05-03 10:17:15 -07:00
Brooklyn Nicholson 279b656adc fix(tui): clear Apple Terminal resize artifacts
Use a deeper alt-screen clear for Apple Terminal resize repaints so host reflow artifacts do not survive the recovery frame.
2026-05-03 12:11:24 -05:00
Bartok9 e527240b27 fix(tools): write_file handler now rejects missing 'content'/'path' args instead of silently writing zero-byte files (#19096)
Under context pressure, frontier models sometimes emit tool calls with
required fields dropped. Previously _handle_write_file() used
args.get('content', '') which substituted an empty string for the missing
key, returned success with bytes_written=0, and created a zero-byte file
on disk. The model had no way to detect the failure.

Changes:
- Reject calls where 'path' is absent or not a non-empty string
- Reject calls where 'content' key is entirely absent (key-presence check,
  not truthiness) — distinguishing a legitimately empty file from a dropped arg
- Reject calls where 'content' is a non-string type
- All error messages include guidance to re-emit the tool call or switch
  to execute_code with hermes_tools.write_file() for large payloads
- Explicit empty string content (file truncation) continues to work

Regression tests added for all four cases: missing path, missing content,
explicit-empty content, and wrong content type.

Fixes #19096
2026-05-03 08:52:41 -07:00
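The key-presence check from e527240b27 might look something like the sketch below. The function name `validate_write_file_args` and the exact error wording are assumptions made for illustration; only the four cases (missing path, missing content, wrong content type, explicit empty content) come from the commit message.

```python
from typing import Optional


def validate_write_file_args(args: dict) -> Optional[str]:
    """Return an error message for a malformed call, or None if it is acceptable."""
    path = args.get("path")
    if not isinstance(path, str) or not path.strip():
        return "write_file: 'path' is missing or empty; re-emit the tool call."
    # Key-presence check, not truthiness: "" is a legitimate truncation request.
    if "content" not in args:
        return "write_file: 'content' is missing; re-emit the tool call."
    if not isinstance(args["content"], str):
        return "write_file: 'content' must be a string."
    return None
```

The contrast with `args.get('content', '')` is the point: that form cannot tell a dropped argument from an intentionally empty file, while `"content" not in args` can.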
Tranquil-Flow 6b4fb9f878 fix(cron): treat non-dict origin as missing instead of crashing tick
``_resolve_origin`` called ``origin.get('platform')`` on whatever
``job.get('origin')`` returned. The leading ``if not origin: return None``
short-circuited the falsy cases (None, empty dict, "") but a non-empty
string passed that guard and then crashed with
``AttributeError: 'str' object has no attribute 'get'`` on every fire
attempt. Observed in the wild after a migration script tagged jobs with
free-form provenance strings (e.g.
``"combined-digest-replaces-x-and-y-20260503"``).

``mark_job_run`` did record ``last_status: error,
last_error: "'str' object has no attribute 'get'"`` once, but the next
tick re-loaded the same poisoned origin and crashed identically. The
job stayed enabled, fired every tick, and accumulated cascading errors
in the log until ``origin`` was patched manually.

Replace the falsy guard with ``isinstance(origin, dict)``. Non-dict
origins (string, int, list, tuple, float — anything that survived a
hand-edit, JSON-script write, or migration) are now treated the same
as a missing origin: the job continues with ``deliver`` falling back
through its normal home-channel path instead of crashing the scheduler
loop.

Test parametrises the non-dict shapes that can appear in jobs.json
through external writers and asserts ``_resolve_origin`` returns None
for each.

Note: this fix scope is the non-dict-``origin`` crash only. The
``next_run_at: null`` recurring-job recovery (the second sub-bug in
#18722) is independently addressed by the in-flight #18825, which
extends the never-silently-disable defense from #16265 to
``get_due_jobs()`` — that approach is well-aligned with the existing
recovery pattern and ships fine without a competing change here.

Fixes #18722 (non-dict origin crash; recurring-job recovery covered by #18825)
2026-05-03 08:51:50 -07:00
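The difference between the old falsy guard and the `isinstance` guard in 6b4fb9f878 can be seen in a few lines. This is a simplified sketch: `resolve_origin` here is a stand-in, and returning the origin dict when a platform is present is an assumed return shape, not the real `_resolve_origin` contract.

```python
from typing import Optional


def resolve_origin(job: dict) -> Optional[dict]:
    """Treat any non-dict origin the same as a missing one."""
    origin = job.get("origin")
    # `if not origin` would let a non-empty string through and then crash on
    # origin.get(...); isinstance() rejects every non-dict shape up front.
    if not isinstance(origin, dict):
        return None
    if not origin.get("platform"):
        return None
    return origin
```

A job tagged with a free-form provenance string now resolves to `None` and falls back to the normal home-channel delivery path instead of raising on every tick.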
JasonOA888 69dd0f7cf1 fix(approval): extend sensitive write target to cover shell RC and credential files
Terminal commands can write to shell RC files (~/.bashrc, ~/.zshrc,
~/.profile) and credential files (~/.netrc, ~/.pgpass, ~/.npmrc,
~/.pypirc) via redirection or tee without triggering approval, even
though write_file already blocks these paths in file_safety.py.

This creates an inconsistency: write_file protects these paths but
terminal shell redirections bypass the same protection. An agent
prompted via indirect injection could install persistent backdoors
(e.g. PATH manipulation, alias overrides) or write credential entries
without user approval.

Extend _SENSITIVE_WRITE_TARGET with two new regex groups matching the
same paths that file_safety.py's WRITE_DENIED_PATHS already covers:
  _SHELL_RC_FILES  — ~/.bashrc, ~/.zshrc, ~/.profile, ~/.bash_profile,
                     ~/.zprofile
  _CREDENTIAL_FILES — ~/.netrc, ~/.pgpass, ~/.npmrc, ~/.pypirc

All 130 existing tests pass.
2026-05-03 08:49:13 -07:00
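To make the idea in 69dd0f7cf1 concrete, here is one way such a pattern could be written. The regexes below are assumptions for illustration and are not the expressions in the repository; only the group names `_SHELL_RC_FILES` and `_CREDENTIAL_FILES` and the file lists come from the commit message, and `needs_approval` is a hypothetical helper.

```python
import re

# File lists taken from the commit message; the regex shapes are illustrative.
_SHELL_RC_FILES = r"\.(?:bashrc|zshrc|profile|bash_profile|zprofile)"
_CREDENTIAL_FILES = r"\.(?:netrc|pgpass|npmrc|pypirc)"
_HOME_PREFIX = r"(?:~|\$HOME|\$\{HOME\})/"
_TARGET = (
    _HOME_PREFIX
    + "(?:" + _SHELL_RC_FILES + "|" + _CREDENTIAL_FILES + ")"
    + r"(?![\w.-])"  # do not match ~/.bashrc.bak and similar
)
# A write is a redirection (> or >>) or a tee invocation aimed at the target.
_SENSITIVE_WRITE = re.compile(r"(?:>>?|\btee\b(?:\s+-\w+)*)\s*['\"]?" + _TARGET)


def needs_approval(command: str) -> bool:
    """True when the command appears to write to a protected dotfile."""
    return _SENSITIVE_WRITE.search(command) is not None
```

Reads such as `cat ~/.bashrc` are deliberately not flagged in this sketch; only redirections and `tee` writes are.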
teknium1 3c59566cc5 chore(release): map leprincep35700 email for PR #18440 salvage 2026-05-03 08:47:49 -07:00
leprincep35700 b59bb4e351 fix(gateway): preserve home-channel thread targets across restart notifications 2026-05-03 08:47:49 -07:00
Teknium d87fd9f039 fix(goals): make /goal work in TUI and fix gateway verdict delivery (#19209)
/goal was silently broken outside the classic CLI.

TUI: /goal was routed through the HermesCLI slash-worker subprocess,
which set the goal row in SessionDB but then called
_pending_input.put(state.goal) — the subprocess has no reader for that
queue, so the kickoff message was discarded. No post-turn judge was
wired into prompt.submit either, so even a manual kickoff would not
continue the goal loop. Intercept /goal in command.dispatch instead,
drive GoalManager directly, and return {type: send, notice, message}
so the TUI client renders the Goal-set notice and fires the kickoff.
Run the judge in _run_prompt_submit after message.complete, surface
the verdict via status.update {kind: goal}, and chain the continuation
turn after the running guard is released.

Gateway: _post_turn_goal_continuation was gated on
hasattr(adapter, 'send_message'), but adapters only expose send().
That branch was dead on every platform — users never saw
'✓ Goal achieved', 'Continuing toward goal', or budget-exhausted
messages. Replace the dead call with adapter.send(chat_id, content,
metadata) and drop a broken reference to self._loop.

Tests:
- tests/tui_gateway/test_goal_command.py — full /goal dispatch matrix
  (set / status / pause / resume / clear / stop / done / whitespace)
  plus regressions for slash.exec → 4018 and 'goal' staying in
  _PENDING_INPUT_COMMANDS.
- tests/gateway/test_goal_verdict_send.py — locks in the adapter.send
  path for done / continue / budget-exhausted and verifies the hook
  no-ops when no goal is set or the adapter lacks send().
2026-05-03 05:49:12 -07:00
Teknium 55647a5813 fix(whatsapp): pin protobufjs >=7.5.5 via npm overrides to clear 3 critical vulns (#19204)
The whatsapp-bridge pulls @whiskeysockets/baileys at a pinned git
commit whose transitive dep tree ships protobufjs <7.5.5, triggering
GHSA-xq3m-2v4x-88gg (critical, arbitrary code execution). npm audit
reported 3 cascading criticals: protobufjs, @whiskeysockets/libsignal-node
(pulls protobufjs), and baileys itself (effect rollup).

Fix: add npm overrides block pinning protobufjs to ^7.5.5. Deduplicates
to a single 7.5.6 copy at node_modules/protobufjs that both libsignal-node
and any other consumers resolve through normal module resolution.

Why not bump baileys: npm-published baileys@6.17.16 is deprecated by the
maintainers (wrong version), 7.0.0-rc.* still pulls the same vulnerable
libsignal-node, and upstream Baileys HEAD adds a 4th vuln (music-metadata).
The override is the minimal, behavior-preserving fix.

Validation:
- npm audit: 3 critical -> 0 vulnerabilities
- node -e "import('@whiskeysockets/baileys')" -> all 5 named exports
  (makeWASocket, useMultiFileAuthState, DisconnectReason,
  fetchLatestBaileysVersion, downloadMediaMessage) resolve
- node bridge.js loads all modules and reaches Express bind
  (exits only on EADDRINUSE because the live gateway owns :3000)
- Single deduped protobufjs@7.5.6 in the tree
2026-05-03 05:22:30 -07:00
kshitijk4poor 6f2dab248a fix: update tests for resume_pending semantics + add AUTHOR_MAP entries
Tests updated to reflect suspend_recently_active now setting
resume_pending=True (preserves session) instead of suspended=True
(wipes session history).

AUTHOR_MAP entries: millerc79 (#19033), shellybotmoyer (#18915)
2026-05-03 03:54:03 -07:00
charliekerfoot 1148c46241 fix(gateway): correct ws scheme conversion for https urls 2026-05-03 03:54:03 -07:00
kshitijk4poor 7a22c639dc chore: add shellybotmoyer to AUTHOR_MAP 2026-05-03 03:54:03 -07:00
Hermes Agent 934103476f fix(gateway): send /new response before cancel_session_processing to avoid race (#18912)
When /new is issued while an agent is actively processing, the confirmation response was never sent to the user because cancel_session_processing() was called before _send_with_retry(). Task cancellation side effects could silently drop the response.

Fix: reorder to send the response BEFORE cancelling the old task. Add logging at the send point (matching the pattern at line 2800 in _process_message_background) so future failures are visible.

Closes: #18912
2026-05-03 03:54:03 -07:00
kshitijk4poor bf3239472f chore: add millerc79 to AUTHOR_MAP 2026-05-03 03:54:03 -07:00
millerc79 f1e0292517 fix(gateway): resume sessions after crash/restart instead of blanket suspend
suspend_recently_active() was unconditionally setting suspended=True on
startup, causing get_or_create_session() to wipe conversation history on
every restart. Change to set resume_pending=True instead, so sessions
auto-resume while still allowing stuck-loop escalation after 3 failures.
2026-05-03 03:54:03 -07:00
kshitijk4poor 0a97ce6bff chore: add nftpoetrist to AUTHOR_MAP 2026-05-03 03:47:49 -07:00
nftpoetrist 6c1322b997 fix(slack): close previous handler in connect() to prevent zombie Socket Mode connections
SlackAdapter.connect() overwrote self._handler, self._app, and
self._socket_mode_task without closing the prior AsyncSocketModeHandler
first. If connect() was called a second time on the same adapter (e.g.
during a gateway restart or in-process reconnect attempt), the old Socket
Mode websocket stayed alive. Both the old and new connections received
every Slack event and dispatched it twice — producing double responses
with different wording, the same bug that affected DiscordAdapter (#18187,
fixed in #18758).

Fix: add a close-before-reassign guard at the start of the connection
setup path, mirroring the guard DiscordAdapter.connect() already has.
When self._handler is None (fresh adapter, first connect()) the block is
a harmless no-op. Scoped to the handler/app fields only — no behavior
change for any path that does not call connect() twice.

Fixes #18980
2026-05-03 03:47:49 -07:00
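The close-before-reassign guard from 6c1322b997 follows a general pattern that can be sketched without Slack at all. `FakeHandler`, its `close()` coroutine, and the `AdapterSketch` class are stand-ins invented for this example; the real adapter wraps an AsyncSocketModeHandler and has more fields.

```python
import asyncio


class FakeHandler:
    """Stand-in for a long-lived connection handler (hypothetical)."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class AdapterSketch:
    def __init__(self) -> None:
        self._handler = None

    async def connect(self) -> FakeHandler:
        # Guard: if a previous handler exists, close it before replacing it,
        # so two live connections never receive the same events.
        if self._handler is not None:
            old, self._handler = self._handler, None
            try:
                await old.close()
            except Exception:
                pass  # a failed close should not block the reconnect
        self._handler = FakeHandler()
        return self._handler
```

On a fresh adapter the guard is skipped, which matches the commit's note that the block is a no-op on the first `connect()`.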
kshitijk4poor c14bf441a3 chore: add 0xyg3n noreply email to AUTHOR_MAP 2026-05-03 03:44:55 -07:00
0xyg3n 19ba9e43b6 fix(gateway/discord): require allowlist auth on slash commands
Slash commands (_run_simple_slash, _handle_thread_create_slash) bypassed
every DISCORD_ALLOWED_* gate enforced by on_message. Any guild member
could invoke /background (RCE via terminal), /restart, /model, /skill,
etc. CVSS 9.8 Critical.

- _evaluate_slash_authorization mirrors on_message gates (user, role,
  channel, ignored channel) with fail-closed semantics
- _check_slash_authorization sends ephemeral reject + logs + admin alert
- Auth gate runs before defer() so rejections are ephemeral
- /skill autocomplete returns [] for unauthorized users (no catalog leak)
- Component views (ExecApproval, SlashConfirm, UpdatePrompt, ModelPicker)
  now honor role allowlists via shared _component_check_auth helper
- Optional DISCORD_HIDE_SLASH_COMMANDS defense-in-depth
- Cross-platform admin alert (Telegram/Slack fallback) on unauthorized attempts

Based on PR #18125 by @0xyg3n.
2026-05-03 03:44:55 -07:00
kshitijk4poor 5d5b8912be test: add tests for cmd_key preservation through name clamping
- TestClampCommandNamesTriples: unit tests for 3-tuple support in
  _clamp_command_names (short names, long names, collisions, multiple
  entries, backward compat with 2-tuples)
- TestDiscordSkillCmdKeyDispatch: integration test through the full
  discord_skill_commands pipeline verifying long skill names retain
  their original cmd_key after clamping
- Add contributor CharlieKerfoot to AUTHOR_MAP
2026-05-03 03:25:45 -07:00
charliekerfoot c4c0e5abc2 fix: After _clamp_command_names truncates skill names to fit the 32-cha… 2026-05-03 03:25:45 -07:00
kshitij 457c7b76cd feat(openrouter): add response caching support (#19132)
Enable OpenRouter's response caching feature (beta) via X-OpenRouter-Cache
headers. When enabled, identical API requests return cached responses for
free (zero billing), reducing both latency and cost.

Configuration via config.yaml:
  openrouter:
    response_cache: true       # default: on
    response_cache_ttl: 300    # 1-86400 seconds

Changes:
- Add openrouter config section to DEFAULT_CONFIG (response_cache + TTL)
- Add build_or_headers() in auxiliary_client.py that builds attribution
  headers plus optional cache headers based on config
- Replace inline _OR_HEADERS dicts with build_or_headers() at all 5 sites:
  run_agent.py __init__, _apply_client_headers_for_base_url(), and
  auxiliary_client.py _try_openrouter() + _to_async_client()
- Add _check_openrouter_cache_status() method to AIAgent that reads
  X-OpenRouter-Cache-Status from streaming response headers and logs
  HIT/MISS status
- Document in cli-config.yaml.example
- Add 28 tests (22 unit + 6 integration)

Ref: https://openrouter.ai/docs/guides/features/response-caching
2026-05-03 01:54:24 -07:00
Teknium 9b5b88b5e0 chore: add MottledShadow to AUTHOR_MAP 2026-05-03 01:51:33 -07:00
MottledShadow a22465e07a fix(weixin): send_weixin_direct cross-loop session check
When send_message tool is called from inside a running gateway, the
_run_async bridge spawns a worker thread with a separate event loop.
send_weixin_direct then reuses the live adapter's aiohttp session
which was created on the gateway's main loop.  aiohttp's TimerContext
checks asyncio.current_task(loop=session._loop) and sees None because
we're executing on the worker thread's loop → raises 'Timeout context
manager should be used inside a task'.

Fix: skip the live-adapter shortcut when the session belongs to a
different event loop, falling through to the fresh-session path.
2026-05-03 01:51:33 -07:00
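The loop-ownership test behind a22465e07a can be reduced to a small predicate. This is a sketch under assumptions: `session_usable_here` is a hypothetical helper name, and it takes the session's owning loop as a plain argument rather than reading a private attribute from an aiohttp session.

```python
import asyncio
from typing import Optional


def session_usable_here(session_loop: Optional[asyncio.AbstractEventLoop]) -> bool:
    """True only if the caller is running on the loop that owns the session."""
    try:
        running = asyncio.get_running_loop()
    except RuntimeError:
        return False  # not inside any event loop at all
    return session_loop is running
```

When the predicate is false, the caller would skip the live-adapter shortcut and open a fresh session on its own loop, which is the fall-through path the commit describes.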
Henkey 9987f3d824 fix(acp): compact Zed tool replay rendering 2026-05-03 01:44:23 -07:00
Henkey 19854c7cd2 Schedule ACP history replay and fence file output 2026-05-03 01:44:23 -07:00
Henkey eb612f5574 fix(acp): keep web extract rendering compact 2026-05-03 01:44:23 -07:00
Henkey b294d1d022 fix(acp): keep read-file starts compact 2026-05-03 01:44:23 -07:00
Henkey 72c8037a24 fix(acp): polish common tool rendering 2026-05-03 01:44:23 -07:00
Henkey ef9a08a872 fix(acp): polish Zed context and tool rendering 2026-05-03 01:44:23 -07:00
Henkey e26f9b2070 fix(acp): route Zed thoughts to reasoning callbacks 2026-05-03 01:44:23 -07:00
helix4u 4f37669170 fix(tools): reconfigure enabled unconfigured toolsets 2026-05-03 00:33:02 -07:00
helix4u d409a4409c fix(model): avoid bedrock credential probe in provider picker 2026-05-03 00:32:55 -07:00
Siddharth Balyan 5d3be898a8 docs(tts): mention xAI custom voice support (#18776)
Point users to xAI's custom voices feature — clone your voice in the
console, paste the voice_id into tts.xai.voice_id. No code changes
needed; the existing TTS pipeline already handles arbitrary voice IDs.

- config.py: link to xAI custom voices docs in voice_id comment
- setup.py: prompt accepts custom voice IDs during xAI TTS setup
- tts.md: short section linking to xAI console and docs
2026-05-02 16:08:01 +05:30
liuhao1024 af98122793 fix(auxiliary): propagate explicit_api_key to _try_openrouter()
When resolve_provider_client() passes explicit_api_key for OpenRouter auxiliary
tasks, _try_openrouter() now accepts and honors this parameter instead of
silently ignoring it and falling back to OPENROUTER_API_KEY env var.

Root cause: _try_openrouter() had no explicit_api_key parameter, so even
when callers wanted to pass a runtime credential pool key, it could not be used.

Fix:
- Add explicit_api_key: str = None parameter to _try_openrouter()
- Prioritize explicit_api_key over pool key and env var
- Update resolve_provider_client() call site to pass explicit_api_key

Regression coverage:
- Test that explicit_api_key is passed to OpenAI client when provided
- Test that fallback to OPENROUTER_API_KEY still works when explicit_api_key is None

Closes #18338
2026-05-02 02:27:49 -07:00
teknium1 73bcd83dba chore(release): map beibi9966 email for AUTHOR_MAP
Follow-up for PR #18502 salvage.
2026-05-02 02:23:37 -07:00
teknium1 762eb79f1e fix(gateway): tighten httpx keepalive and close whatsapp typing-response leak (#18451)
Two mitigations for the CLOSE_WAIT accumulation reported against QQ Bot
+ Feishu on macOS behind Cloudflare Warp.

1. Shared httpx.Limits helper (gateway/platforms/_http_client_limits.py).
   Every long-lived platform adapter now constructs httpx.AsyncClient
   with max_keepalive_connections=10 and keepalive_expiry=2.0, vs httpx's
   default of unbounded keepalive pool and 5.0s expiry. On macOS/Warp the
   default 5s window let idle keepalive sockets sit in CLOSE_WAIT long
   enough for seven persistent adapters (QQ Bot, WeCom, DingTalk, Signal,
   BlueBubbles, WeCom-callback, plus the transient Feishu helper) to
   compound to the 256-fd ulimit. Tunable via
   HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY and
   HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE env vars.

2. whatsapp.send_typing aiohttp leak. The call was
   'await self._http_session.post(...)' with no 'async with' and no
   variable capture — the ClientResponse went out of scope unclosed,
   holding its TCP socket in CLOSE_WAIT until GC. Fixed by wrapping in
   'async with'. This was the only bare-await aiohttp leak in the
   gateway/tools/plugins tree per audit; all other aiohttp sites use
   the context-manager pattern correctly.

The original reporter also saw Feishu SDK (lark-oapi) connections in
CLOSE_WAIT — those are inside the SDK and out of our direct control, but
tightening httpx keepalive across adapters reduces the aggregate pool
pressure regardless of which individual adapter leaks.
2026-05-02 02:23:37 -07:00
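The tunables in 762eb79f1e could be resolved along these lines. This is a guess at the shape, not the contents of gateway/platforms/_http_client_limits.py: `keepalive_settings` is a hypothetical name, and falling back to the defaults on malformed or non-positive values is an assumption. Only the two env var names and the defaults (10 connections, 2.0 s) come from the commit message.

```python
import os
from typing import Callable, Mapping, Union

Number = Union[int, float]


def _read_number(env: Mapping[str, str], name: str, default: Number,
                 cast: Callable[[str], Number]) -> Number:
    """Parse a positive number from env, falling back to the default (assumed policy)."""
    raw = env.get(name, "").strip()
    if not raw:
        return default
    try:
        value = cast(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def keepalive_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Keyword arguments suitable for an httpx.Limits-style constructor."""
    return {
        "max_keepalive_connections": _read_number(
            env, "HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE", 10, int),
        "keepalive_expiry": _read_number(
            env, "HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY", 2.0, float),
    }
```

The returned dict uses the same parameter names as `httpx.Limits`, so an adapter could pass it as `httpx.Limits(**keepalive_settings())` when building its client.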
beibi9966 38dd057e91 fix(feishu): finalize remote document downloads inside httpx.AsyncClient context (#18502)
Snapshot Content-Type and body while the client context is still
active so pooled connections fully release on exit. Previously the
read happened after `async with httpx.AsyncClient(...)` returned —
which works today only because httpx eagerly buffers non-streaming
responses; a future refactor to `.stream()` would silently read-
after-close.

Part of the #18451 connection-hygiene audit. Salvage of #18502.
2026-05-02 02:23:37 -07:00
Teknium e444d8f29c fix(gateway): config.yaml wins over .env for agent/display/timezone settings (#18764)
Regression from the silent config→env bridge. The bridge at module import
time is correct for max_turns (unconditional overwrite), but every other
agent.*, display.*, timezone, and security bridge key was guarded by
'if X not in os.environ' — so a stale .env entry from an old 'hermes setup'
run would shadow the user's current config.yaml indefinitely.

Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60
in .env from an old setup, and the gateway silently capped at 60
iterations per turn. Gateway logs confirmed api_calls never exceeded 60.

Three changes:

1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*,
   display.*, timezone, and security.* bridge keys. config.yaml is now
   authoritative for these settings — same semantics already in place
   for max_turns, terminal.*, and auxiliary.*. Also surface the bridge
   failure (previously 'except Exception: pass') to stderr so operators
   see bridge errors instead of silently falling back to .env.

2. gateway/run.py: INFO-log the resolved max_iterations at gateway
   start so operators can verify the config→env bridge did the right
   thing instead of chasing a phantom budget ceiling.

3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in
   the setup wizard. config.yaml is the single source of truth. Also
   clean up any stale .env entry left behind by pre-fix setups.

Regression tests in tests/gateway/test_config_env_bridge_authority.py
guard each config→env key against the 'stale .env shadows config' bug.
2026-05-02 02:14:35 -07:00
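The behavioral change in e444d8f29c comes down to dropping one guard. The sketch below is illustrative only: `bridge_config_to_env` and the mapping argument are hypothetical, and the pairing of `agent.max_turns` with `HERMES_MAX_ITERATIONS` is inferred from the symptom described in the commit message.

```python
def bridge_config_to_env(config: dict, env: dict, mapping: dict) -> None:
    """Copy config values into env; config wins over whatever env already holds."""
    for dotted_key, env_name in mapping.items():
        node = config
        for part in dotted_key.split("."):
            if not isinstance(node, dict) or part not in node:
                node = None
                break
            node = node[part]
        if node is None:
            continue  # key absent from config: leave env untouched
        # The old behavior was `if env_name not in env: ...`, which let a stale
        # .env entry shadow config.yaml. The overwrite is now unconditional.
        env[env_name] = str(node)
```

For example, with `agent.max_turns: 500` in the config and a stale `HERMES_MAX_ITERATIONS=60` already in the environment, the bridge leaves the variable at `500`.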
luyao618 13f344c5ce fix(agent): try fallback providers at init when primary credential pool is exhausted (#17929)
When a provider's credential pool has a single entry in 429-cooldown,
resolve_provider_client returns None and AIAgent.__init__ raises a
misleading RuntimeError suggesting the API key is missing — even when
valid fallback_providers are configured.

This patch makes __init__ iterate the fallback chain before raising,
mirroring the existing in-flight fallback logic in the request loop.
If a fallback resolves, the agent initializes against it and sets
_fallback_activated=True so _restore_primary_runtime can pick the
primary back up after cooldown.

Closes #17929
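
The init-time fallback walk can be sketched roughly as follows. `resolve_with_fallback` and its `resolve` callback are hypothetical stand-ins for the real resolver; only the control flow (try primary, then each fallback, then raise) comes from the description above.

```python
from typing import Callable, Optional, Sequence


def resolve_with_fallback(
    primary: str,
    fallbacks: Sequence[str],
    resolve: Callable[[str], Optional[object]],
):
    """Return (client, provider_name, fallback_activated) or raise."""
    client = resolve(primary)
    if client is not None:
        return client, primary, False
    for name in fallbacks:
        client = resolve(name)
        if client is not None:
            # Caller records this so the primary can be restored after cooldown.
            return client, name, True
    raise RuntimeError(
        f"No client for {primary!r} or fallbacks {list(fallbacks)!r} "
        "(credentials missing or cooling down)"
    )
```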
2026-05-02 02:09:46 -07:00
Teknium 1dce908930 fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) (#18761)
* fix(gateway): config.yaml wins over .env for agent/display/timezone settings

Regression from the silent config→env bridge. The bridge at module import
time is correct for max_turns (unconditional overwrite), but every other
agent.*, display.*, timezone, and security bridge key was guarded by
'if X not in os.environ' — so a stale .env entry from an old 'hermes setup'
run would shadow the user's current config.yaml indefinitely.

Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60
in .env from an old setup, and the gateway silently capped at 60
iterations per turn. Gateway logs confirmed api_calls never exceeded 60.

Three changes:

1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*,
   display.*, timezone, and security.* bridge keys. config.yaml is now
   authoritative for these settings — same semantics already in place
   for max_turns, terminal.*, and auxiliary.*. Also surface the bridge
   failure (previously 'except Exception: pass') to stderr so operators
   see bridge errors instead of silently falling back to .env.

2. gateway/run.py: INFO-log the resolved max_iterations at gateway
   start so operators can verify the config→env bridge did the right
   thing instead of chasing a phantom budget ceiling.

3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in
   the setup wizard. config.yaml is the single source of truth. Also
   clean up any stale .env entry left behind by pre-fix setups.

Regression tests in tests/gateway/test_config_env_bridge_authority.py
guard each config→env key against the 'stale .env shadows config' bug.

* fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log)

Three issues observed in production gateway.log during a rapid restart
chain on 2026-05-02, all fixed here.

1. _send_restart_notification logged unconditional success
   adapter.send() catches provider errors (e.g. Telegram 'Chat not found')
   and returns SendResult(success=False); it never raises. The caller
   ignored the return value and always logged 'Sent restart notification
   to <chat>' at INFO, producing a misleading success line directly
   below the 'Failed to send Telegram message' traceback on every boot.
   Now inspects result.success and logs WARNING with the error otherwise.

2. WhatsApp bridge SIGTERM on shutdown classified as fatal error
   _check_managed_bridge_exit() saw the bridge's returncode -15 (our own
   SIGTERM from disconnect()) and fired the full fatal-error path,
   producing 'ERROR ... WhatsApp bridge process exited unexpectedly' plus
   'Fatal whatsapp adapter error (whatsapp_bridge_exited)' on every
   planned shutdown, immediately before the normal '✓ whatsapp
   disconnected'. Adds a _shutting_down flag that disconnect() sets
   before the terminate, and _check_managed_bridge_exit() returns None
   for returncode in {0, -2, -15} while shutting down. OOM-kill (137)
   and other non-signal exits still hit the fatal path.

3. restart_drain_timeout default 60s → 180s
   On 2026-05-02 01:43:27 a user /restart fired while three agents were
   mid-API-call (82s, 112s, 154s into their turns). The 60s drain budget
   expired and all three were force-interrupted. 180s covers realistic
   in-flight agent turns; users on very-long-reasoning models can still
   raise it further via agent.restart_drain_timeout in config.yaml.
   Existing explicit user values are preserved by deep-merge.

Tests
- tests/gateway/test_restart_notification.py: two new tests assert INFO
  is only logged on SendResult(success=True) and WARNING with the error
  string is logged on SendResult(success=False).
- tests/gateway/test_whatsapp_connect.py: parametrized test for
  returncode in {0, -2, -15} proves shutdown-time exits are suppressed;
  separate test proves returncode 137 (SIGKILL/OOM) still surfaces as
  fatal even when _shutting_down is set.
- _check_managed_bridge_exit() reads _shutting_down via getattr-with-
  default so existing _make_adapter() test helpers that bypass __init__
  (pitfall #17 in AGENTS.md) keep working unmodified.
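
As an illustration of fix 2, the shutdown-time exit classification might look like the sketch below. The function name and the returned error code string are assumptions; the suppressed return codes `{0, -2, -15}` and the 137 case are taken from the message above.

```python
SHUTDOWN_OK_RETURNCODES = frozenset({0, -2, -15})  # clean exit, SIGINT, SIGTERM


def classify_bridge_exit(returncode, shutting_down: bool):
    """Return None when the exit is expected, else a fatal-error code."""
    if returncode is None:
        return None  # process is still running
    if shutting_down and returncode in SHUTDOWN_OK_RETURNCODES:
        return None  # our own terminate during a planned shutdown
    return "whatsapp_bridge_exited"
```

Reading the flag with a default (for example `getattr(adapter, "_shutting_down", False)`) keeps helpers that bypass `__init__` working, as the last test note describes.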
2026-05-02 02:08:06 -07:00
teknium1 50f9f389ec chore(release): map ambition0802 email for AUTHOR_MAP
Follow-up for PR #17939 salvage.
2026-05-02 02:07:14 -07:00
ambition0802 7696ddc59e fix(cli): robust paste file expansion and process_loop error handling (#17666)
Two narrow fixes for long pasted messages silently disappearing:

1. _expand_paste_references: replace path.exists() + read_text() with
   try/except (OSError, IOError). Closes the TOCTOU window where a paste
   file deleted between check and read raised FileNotFoundError, bubbled
   up through process_loop's outer except, and silently dropped the
   user's input. Failures now return the placeholder text and log a
   warning.

2. process_loop outer except: logger.warning() instead of print().
   prompt_toolkit's TUI swallows stdout, so 'Error: …' was invisible
   to the user. Logged errors are discoverable via hermes logs.

Dropped the larger interrupt_queue→pending_input drain that was part of
the original PR — that's a separate class of input-drop (in-progress
interrupt handling) unrelated to the paste-file TOCTOU reported in the
issue, and worth its own review.

Salvage of #17939.
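
A sketch of the read-without-checking pattern from fix 1. The helper name and the UTF-8 encoding are assumptions for illustration; the point is that there is no `exists()` call to race against.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def read_paste_or_placeholder(path: Path, placeholder: str) -> str:
    """Read a paste file; on an OS-level failure keep the placeholder text."""
    try:
        return path.read_text(encoding="utf-8")
    except OSError as exc:  # covers FileNotFoundError; IOError is an alias
        logger.warning("Could not expand paste file %s: %s", path, exc)
        return placeholder
```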
2026-05-02 02:07:14 -07:00
Teknium 5eac6084bc fix(discord): warn on 32-char clamp collisions in the /skill collector (#18759)
Discord's per-command name limit is 32 chars. When two skill slugs
share the same first 32 chars (or a skill slug clamps onto a reserved
gateway command name), only the first seen wins — the second is
dropped from the /skill autocomplete. The old behavior incremented a
``hidden`` counter silently, so skill authors had no way to discover
the drop short of noticing their skill was missing from the picker.

Not an actively-biting bug today (no collisions on the default catalog
as of 2026-05), but a landmine the moment someone ships a skill with a
long name. The earlier series in #18745 / #18753 / #18754 dropped the
other silent data-loss paths in the Discord /skill collector; this one
lights up the last remaining one.

Fix: promote ``_names_used`` from a set to a dict keyed by the clamped
name, mapping to the source cmd_key (or a ``"<reserved>"`` sentinel
for names inherited via ``reserved_names``). On collision, log a
WARNING naming both sides — the winner, the loser, the clamped name,
and what to rename.

Two phrasings:

* skill-vs-skill — "both clamp to X on Discord's 32-char command-name
  limit; only the winner appears in /skill. Rename one skill's
  frontmatter ``name:`` to differ in its first 32 chars."
* skill-vs-reserved — "collides with a reserved gateway command name;
  the skill will not appear in /skill. Rename the skill's frontmatter
  ``name:``."

Tests: three cases in
``tests/hermes_cli/test_discord_skill_clamp_warning.py`` —
skill-vs-skill collision (warning names both cmd_keys + clamped prefix),
skill-vs-reserved collision (warning uses the distinct phrasing), and a
no-collision negative (zero warnings emitted).
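
The set-to-dict promotion could be sketched like this. `collect_clamped` is a hypothetical name and the log wording is abbreviated; the `"<reserved>"` sentinel and first-seen-wins rule follow the description above.

```python
import logging

logger = logging.getLogger(__name__)
DISCORD_NAME_LIMIT = 32


def collect_clamped(cmd_keys, reserved_names=()):
    """Map each clamped name to its source cmd_key, warning on collisions."""
    names_used = {name[:DISCORD_NAME_LIMIT]: "<reserved>" for name in reserved_names}
    for cmd_key in cmd_keys:
        clamped = cmd_key[:DISCORD_NAME_LIMIT]
        winner = names_used.get(clamped)
        if winner is None:
            names_used[clamped] = cmd_key  # first seen wins
        elif winner == "<reserved>":
            logger.warning("%r collides with reserved command name %r", cmd_key, clamped)
        else:
            logger.warning("%r and %r both clamp to %r; only %r is kept",
                           winner, cmd_key, clamped, winner)
    return names_used
```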
2026-05-02 02:05:01 -07:00
teknium1 e363ced3c3 test(discord): regression coverage for zombie-websocket guard in connect()
Covers PR #18224 fix for issue #18187 — when DiscordAdapter.connect() is
called a second time without an intervening disconnect(), the previous
commands.Bot must be closed before a new one is created. Otherwise both
websockets stay connected to Discord's gateway and both fire on_message,
producing double responses with different wording.
2026-05-02 02:04:14 -07:00
luyao618 292d2fb42f fix(discord): close old client before reconnect to prevent zombie websockets (#18187)
When DiscordAdapter.connect() is called during reconnect, it creates a new
commands.Bot client without closing the previous one. The old client's
websocket remains connected to Discord's gateway, causing both to fire
on_message for every incoming event — resulting in double responses.

Fix: before creating a new Bot instance, check if a previous client exists
and close it. This ensures only one websocket connection is active at any
time.

Closes #18187
2026-05-02 02:04:14 -07:00
teknium1 0a6865b328 test(credential_pool): regression coverage for .env vs os.environ precedence
Covers PR #18256 fix for issue #18254 — when OPENROUTER_API_KEY is set in
BOTH os.environ (stale from parent shell) and ~/.hermes/.env (fresh),
_seed_from_env must prefer the .env value. Also guards the fallback case
where .env omits the key entirely (Docker/K8s/systemd deployments that
only inject via runtime env).
2026-05-02 02:00:32 -07:00
teknium1 9c626ef8ea chore(release): map franksong2702 email for AUTHOR_MAP
Follow-up for PR #18256 salvage.
2026-05-02 02:00:32 -07:00
Frank Song 2ef1ad280b fix: prefer ~/.hermes/.env over os.environ when seeding credential pool
When _seed_from_env() reads API keys to populate the credential pool, it
should treat ~/.hermes/.env as the authoritative source — not os.environ.
Stale env vars inherited from parent shell processes (Codex CLI, test
scripts, etc.) can shadow deliberate changes to the .env file, causing
auth.json to cache an outdated key that leads to silent 401 errors.

This is especially visible with OpenRouter: if a parent process exported
OPENROUTER_API_KEY=test-key-fresh and the user later updates .env with a
valid key, restarting Hermes still picks up the stale os.environ value,
writes it back to auth.json, and all API calls fail with 401.

Fixes #18254
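
The precedence rule can be illustrated with a small helper. `pick_api_key` is hypothetical and takes the parsed .env contents as a plain dict; it is not the actual `_seed_from_env` code.

```python
import os


def pick_api_key(name: str, dotenv_values: dict, environ=None):
    """Prefer the value from .env; fall back to the process environment."""
    environ = os.environ if environ is None else environ
    value = (dotenv_values.get(name) or "").strip()
    if value:
        return value  # .env is authoritative when it defines the key
    return environ.get(name) or None  # runtime-injected deployments
```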
2026-05-02 02:00:32 -07:00
Teknium 10297fa23c fix(discord): /reload-skills now refreshes the /skill autocomplete live (#18754)
`_register_skill_group` captured the skill catalog in closure variables
(`entries` and `skill_lookup`) so the single `tree.add_command` call at
startup owned the only live copy. The closure is never re-entered after
startup, so `/reload-skills` — which rescans the on-disk skills dir and
refreshes the in-process `_skill_commands` registry — had no way to
propagate results into the `/skill` autocomplete on Discord. New skills
stayed invisible in the dropdown, and deleted skills returned
"Unknown skill" when the stale autocomplete entry was clicked.

The fix is purely a dataflow change: promote `entries` and `skill_lookup`
to instance attributes (`_skill_entries`, `_skill_lookup`), split the
collector-driven rebuild into a helper (`_refresh_skill_catalog_state`),
and add a public `refresh_skill_group()` method that re-runs the helper
and is safe to call at any point after the initial registration.

The gateway's `_handle_reload_skills_command` then iterates
`self.adapters` and calls `refresh_skill_group()` on any adapter that
exposes it (currently only Discord). Both sync and async implementations
are supported; adapters that don't override the method (Telegram's
BotCommand menu, Slack subcommand map, etc.) are silently skipped — the
in-process `reload_skills()` call covers them.

No `tree.sync()` is required because Discord fetches autocomplete
options dynamically on every keystroke — mutating the instance state the
callbacks already read from is sufficient. That sidesteps the per-app
command-bucket rate limit (~5 writes / 20 s) that made the previous
bulk-sync-on-reload approach unusable (#16713 context).

Tests: tests/gateway/test_reload_skills_discord_resync.py — five cases
covering (1) refresh replaces entries, (2) entries stay sorted after
refresh, (3) collector exception leaves cached state intact, (4)
`_refresh_skill_catalog_state` populates the instance attrs, (5)
orchestrator calls `refresh_skill_group()` on sync + async adapters and
skips adapters that don't expose it.
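
The orchestrator loop described above (sync and async adapters supported, others skipped) might look like this sketch. `refresh_adapters` is a hypothetical name; only `refresh_skill_group()` comes from the commit message.

```python
import asyncio
import inspect


async def refresh_adapters(adapters) -> int:
    """Call refresh_skill_group() where present; return the number refreshed."""
    refreshed = 0
    for adapter in adapters:
        refresh = getattr(adapter, "refresh_skill_group", None)
        if not callable(refresh):
            continue  # adapter keeps no live autocomplete state
        result = refresh()
        if inspect.isawaitable(result):
            await result  # async implementation
        refreshed += 1
    return refreshed
```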
2026-05-02 02:00:11 -07:00
Teknium 6ec74aec07 fix(gateway): match disabled/optional skills by frontmatter slug, not dir name (#18753)
_check_unavailable_skill is meant to turn a typed "/foo" command that
doesn't resolve into a specific hint — "disabled, enable with hermes
skills config" or "available but not installed, install with hermes
skills install …" — instead of the generic "unknown command" reply.

It was doing the match with `skill_md.parent.name.lower().replace("_", "-")`,
comparing that to the typed command. For every skill whose directory name
drifted from its declared frontmatter `name:`, that comparison failed and
the user got the unhelpful generic path. On a standard install today 19
skills have this drift, e.g.:

  dir: mlops/stable-diffusion
  frontmatter: name: Stable Diffusion Image Generation
  registered slug (what the user types): /stable-diffusion-image-generation

  dir: mlops/qdrant
  frontmatter: name: Qdrant Vector Search
  registered slug: /qdrant-vector-search

  dir: mlops/flash-attention
  frontmatter: name: Optimizing Attention Flash
  registered slug: /optimizing-attention-flash

In every case, _check_unavailable_skill would fall through because
"stable-diffusion" != "stable-diffusion-image-generation", even with the
skill sitting right there on disk.

Fix: extract a small `_skill_slug_from_frontmatter` helper that reads the
SKILL.md frontmatter and normalizes exactly like scan_skill_commands
(lower, spaces/underscores → hyphens, strip non-[a-z0-9-], collapse
runs of hyphens, strip edges). Use it in both the
disabled-skills branch and the optional-skills branch. The disabled-set
membership check now uses the declared frontmatter name (which is what
`hermes skills config` writes into skills.disabled / platform_disabled),
not the slug.

Tests: five cases in tests/gateway/test_unavailable_skill_hint.py —
the drift case for the disabled branch, unknown-command negative,
matched-but-not-disabled negative, non-alnum stripping, and the drift
case for the optional-skills branch. All five fail against main and
pass with the fix.
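
The normalization steps listed above can be sketched as a standalone function. `slugify_skill_name` is a hypothetical name, and the exact order of steps is an assumption based on the parenthetical in the message.

```python
import re


def slugify_skill_name(name: str) -> str:
    """Normalize a frontmatter name into the slash-command slug."""
    slug = name.lower()
    slug = re.sub(r"[ _]+", "-", slug)       # spaces/underscores to hyphens
    slug = re.sub(r"[^a-z0-9-]", "", slug)   # strip everything else
    slug = re.sub(r"-{2,}", "-", slug)       # collapse runs of hyphens
    return slug.strip("-")                   # strip edges
```

Applied to the drift examples above, "Stable Diffusion Image Generation" becomes `stable-diffusion-image-generation`, which is what the user types, rather than the directory name `stable-diffusion`.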
2026-05-02 02:00:09 -07:00
Teknium 8825e9044c fix(discord): complete #18741 for /skill autocomplete and drop legacy 25x25 caps (#18745)
``discord_skill_commands_by_category`` was lagging the flat
``discord_skill_commands`` collector on two counts. Both were actively
dropping skills from Discord's ``/skill`` autocomplete dropdown.

1. External-dir skills were filtered out. #18741 widened the flat
   collector to accept ``SKILLS_DIR + skills.external_dirs`` but left
   this sibling collector — the one ``_register_skill_group`` actually
   uses on Discord — still matching ``SKILLS_DIR`` only. External
   skills were visible in ``hermes skills list`` and the agent's
   ``/skill-name`` dispatch but silently absent from Discord's
   ``/skill`` picker. Widen the accepted roots to match, and derive
   categories from whichever root the skill lives under so
   ``<ext>/mlops/foo/SKILL.md`` still lands in the ``mlops`` group.

2. 25-group × 25-subcommand caps were still applied. PR #11580
   refactored ``/skill`` to a flat autocomplete (whose options Discord
   fetches dynamically — no per-command payload concern) and its
   docstring promises "no hidden skills." The collector kept the old
   nested-layout caps anyway, silently dropping anything past the 25th
   alphabetical category. On installs with 29 category dirs today (real
   example: tail categories ``social-media``, ``software-development``,
   ``yuanbao`` going missing) this was biting immediately. Remove the
   caps; ``hidden`` now reports only 32-char name-clamp collisions
   against reserved names.

Tests: guard both behaviors. ``test_no_legacy_25x25_cap`` builds 30
categories × 30 skills each and asserts all 900 are returned.
``test_external_dirs_skills_included`` monkeypatches
``get_external_skills_dirs`` and asserts an external-dir skill makes
it into the result grouped under its own top-level directory.
2026-05-02 02:00:06 -07:00
Jacob Lizarraga 2470434d60 fix(telegram): probe polling liveness after reconnect to detect wedged Updater
After a transient Telegram 502, _handle_polling_network_error's
stop()+start_polling() cycle can leave PTB's Updater with `running=True`
but a wedged consumer task that never makes progress. No error_callback
fires in that state, so the reconnect ladder never advances past attempt
1, the MAX_NETWORK_RETRIES fatal-error path is never reached, and the
gateway sits silent indefinitely.

Schedule a heartbeat probe (60s after a successful reconnect) that
verifies Updater.running is still True and bot.get_me() responds within
a tight asyncio.wait_for timeout. Either failure feeds back into the
reconnect ladder so the existing escalation path fires.

No PTB-internal coupling, no Application rebuild — minimal additive
defense inside the existing reconnect abstraction.

Tests cover healthy / Updater non-running / probe timeout / probe
network error / already-fatal cases, plus an integration check that the
probe is actually scheduled after a successful start_polling().

Closes the silent-wedge case observed in the wild after a transient
Telegram 502; existing reconnect tests updated to mock bot.get_me() now
that the success path schedules a heartbeat probe.
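
A rough sketch of the probe itself. The function name, the callback parameters, and the 10-second default are assumptions; the real code checks `Updater.running` and `bot.get_me()` on the PTB objects and feeds failures back into the reconnect ladder.

```python
import asyncio


async def probe_polling_liveness(is_running, get_me, timeout: float = 10.0) -> bool:
    """Return True when polling reports running and get_me() answers in time."""
    if not is_running():
        return False
    try:
        await asyncio.wait_for(get_me(), timeout=timeout)
    except Exception:  # timeout or network error: treat both as not alive
        return False
    return True
```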
2026-05-02 01:55:04 -07:00
liuhao1024 9bf260472b fix(tools): deduplicate tool names at API boundary for Vertex/Azure/Bedrock
Providers like Google Vertex, Azure, and Amazon Bedrock reject API
requests with duplicate tool names (HTTP 400: 'Tool names must be
unique').  The upstream injection paths in run_agent.py already dedup
after PR #17335, but two API-boundary functions pass tools through
without checking:

- agent/auxiliary_client.py: _build_call_kwargs() (all non-Anthropic
  providers in chat_completions mode)
- agent/anthropic_adapter.py: convert_tools_to_anthropic() (Anthropic
  Messages API path)

Add defensive dedup guards at both sites.  Duplicates are dropped with
a warning log, converting a hard 400 failure into a recoverable
condition.  This is intentionally conservative — the root-cause dedup
in run_agent.py is the primary defense; these guards add resilience
against future injection-path regressions.

Includes 8 new tests covering unique passthrough, duplicate removal,
empty/None edge cases.

Closes #18478
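
A defensive dedup guard of the kind described could be sketched as below. `dedup_tools` is a hypothetical helper that assumes OpenAI-style tool dicts; it is not the code added to either file.

```python
import logging

logger = logging.getLogger(__name__)


def dedup_tools(tools):
    """Drop later tools whose name was already seen, preserving order."""
    if not tools:
        return tools  # None and [] pass through unchanged
    seen, unique = set(), []
    for tool in tools:
        name = (tool.get("function") or {}).get("name") or tool.get("name")
        if name is not None and name in seen:
            logger.warning("Dropping duplicate tool %r", name)
            continue
        if name is not None:
            seen.add(name)
        unique.append(tool)
    return unique
```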
2026-05-02 01:51:51 -07:00
Teknium 699b3679bc fix(constants): warn once when get_hermes_home() falls back under an active profile (#18746)
When HERMES_HOME is unset but ~/.hermes/active_profile names a non-default
profile, any data this process writes lands in the default profile — not the
one the operator expects. Before this change the fallback was silent, so
cross-profile contamination (#18594) was invisible until a user noticed
their memory/state ended up in the wrong place.

Now we emit a one-shot warning to stderr the first time this happens in
a process. No raise — there are 30+ module-level callers of get_hermes_home()
and raising from any of them would brick import. Behavior is otherwise
unchanged; subprocess spawners (systemd template, kanban dispatcher, docker
entrypoint) already propagate HERMES_HOME correctly.

Bypasses logging.getLogger() because this runs before logging is configured
in a significant fraction of callers (module import time).

Refs #18594. Credit to @liuhao1024 for surfacing the silent-fallback case
in PR #18600; we kept the diagnostic signal without the import-time raise.
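
The one-shot warning pattern, sketched with hypothetical names and message text. It writes straight to stderr because, as noted above, logging may not be configured yet at import time.

```python
import sys

_warned = False


def warn_profile_fallback_once(active_profile: str) -> bool:
    """Emit the fallback warning at most once per process; return True if emitted."""
    global _warned
    if _warned:
        return False
    _warned = True
    print(
        f"warning: HERMES_HOME is unset but active profile is {active_profile!r}; "
        "data will be written to the default profile",
        file=sys.stderr,
    )
    return True
```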
2026-05-02 01:49:55 -07:00
teknium1 98c98821ff chore(release): map CoreyNoDream email for AUTHOR_MAP
Follow-up for PR #18721 salvage.
2026-05-02 01:40:31 -07:00
CoreyNoDream c5e3a6fb5b fix(cli): decode .env as UTF-8 to avoid GBK crash on Windows
Path.read_text() uses the system locale by default. On Windows CN/JP/KR
locales (GBK/CP932/CP949), reading a UTF-8 .env raises UnicodeDecodeError
as soon as it contains any non-ASCII byte (e.g. an em dash).

Pin encoding="utf-8" on every .env read in hermes_cli to match how the
rest of the codebase (load_dotenv at doctor.py:26) already decodes it.

Adds a regression test that monkeypatches Path.read_text to simulate a
GBK locale and asserts 'hermes doctor' no longer raises.

Refs #18637
2026-05-02 01:40:31 -07:00
Teknium e2cea6eeba fix(gateway): include external_dirs skills in Telegram/Discord slash commands (#18741)
Skills configured through `skills.external_dirs` in config.yaml were
visible via `hermes skills list`, `get_skill_commands()`, and the
agent's `/skill-name` dispatch, but silently excluded from the
Telegram and Discord slash-command menus. The filter in
`_collect_gateway_skill_entries` only accepted skills whose
`skill_md_path` started with `SKILLS_DIR`, so anything under an
external directory fell through.

Widen the accepted-prefix set to include all configured external
dirs alongside the local skills dir. Every prefix is now
slash-terminated so `/my-skills` cannot also admit
`/my-skills-extra`. Also guard against empty `skill_md_path`
values so they can't accidentally match.
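
The slash-terminated prefix check can be sketched as follows. `is_under_any_root` is a hypothetical helper and assumes POSIX-style separators.

```python
def is_under_any_root(skill_md_path: str, roots) -> bool:
    """True when the path lives under one of the accepted skill roots."""
    if not skill_md_path:
        return False  # empty paths must never match
    prefixes = [str(root).rstrip("/") + "/" for root in roots if str(root)]
    return any(skill_md_path.startswith(prefix) for prefix in prefixes)
```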

Fixes #8110

Salvages #8790 by luyao618.

Co-authored-by: Yao <34041715+luyao618@users.noreply.github.com>
2026-05-02 01:36:57 -07:00
Teknium c73594fe41 fix(skills): rescan skill_commands cache when platform scope changes (#18739)
The process-global `_skill_commands` dict in agent/skill_commands.py
was seeded by whichever platform scanned first, and
`get_skill_commands()` only rescanned when the cache was empty. In a
long-lived gateway process serving multiple platforms (Telegram +
Discord + Slack), the first platform's
`skills.platform_disabled` view was silently inherited by the
others — so a skill disabled for Telegram would also disappear from
Discord's slash menu, and vice versa.

Track the platform scope the cache was populated for
(`_skill_commands_platform`) and rescan in `get_skill_commands()`
when the currently-active platform no longer matches. Platform
resolution uses the same precedence as `_is_skill_disabled`:
`HERMES_PLATFORM` env var then `HERMES_SESSION_PLATFORM` from the
gateway session context.
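
A minimal sketch of a platform-scoped cache, assuming a hypothetical `scan` callback in place of the real skill scanner. The cache is rebuilt when it is empty or when the requesting platform differs from the one it was populated for.

```python
_cache: dict = {}
_cache_platform = None


def get_commands(platform, scan):
    """Return cached commands, rescanning when the platform scope changes."""
    global _cache, _cache_platform
    if not _cache or _cache_platform != platform:
        _cache = scan(platform)
        _cache_platform = platform
    return _cache
```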

Fixes #14536

Salvages #14570 by LeonSGP43.

Co-authored-by: LeonSGP <leon@sgp43.com>
2026-05-02 01:36:53 -07:00
Teknium 97acd66b4c fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731)
* fix(curator): authoritative absorbed_into declarations on skill delete

Closes #18671. The classification pipeline that feeds cron-ref rewriting
used to infer consolidation vs pruning from two brittle signals: the
curator model's post-hoc YAML summary block, and a substring heuristic
scanning other tool calls for the removed skill's name. Both miss in
real consolidations — the model forgets the YAML under reasoning
pressure, and the heuristic misses when the umbrella's patch content
describes the absorbed behavior abstractly instead of naming the old
slug. When both miss, the skill falls through to 'no-evidence fallback'
pruned, and #18253's cron rewriter drops the cron ref entirely instead
of mapping it to the umbrella. Same observable symptom as pre-#18253:
'Skill(s) not found and skipped' at the next cron run.

The fix makes the model declare intent at the moment of deletion.
skill_manage(action='delete') now accepts absorbed_into:
  - absorbed_into='<umbrella>'  -> consolidated, target must exist on disk
  - absorbed_into=''            -> explicit prune, no forwarding target
  - missing                     -> legacy path, falls through to heuristic/YAML

The curator reconciler reads these declarations off llm_meta.tool_calls
BEFORE either the YAML block or the substring heuristic. Declaration
wins. Fallback logic stays intact for backward compat with any caller
(human or older curator conversation) that doesn't populate the arg.
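
The three-way contract for `absorbed_into` can be illustrated with a small classifier. The function name, the returned labels, and the whitespace stripping are assumptions for illustration, not the tool's actual implementation.

```python
def classify_delete_declaration(name, absorbed_into, existing_skills):
    """Classify a delete call by its absorbed_into argument."""
    if absorbed_into is None:
        return "legacy"  # fall through to heuristic/YAML
    target = absorbed_into.strip()
    if target == "":
        return "pruned"  # explicit prune, no forwarding target
    if target == name:
        raise ValueError("absorbed_into must not reference the deleted skill")
    if target not in existing_skills:
        raise ValueError(f"absorbed_into target {target!r} does not exist")
    return "consolidated"
```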

Changes
- tools/skill_manager_tool.py: add absorbed_into param to skill_manage
  + _delete_skill. Validate target exists when non-empty. Reject
  absorbed_into=<self>. Wire through dispatcher + registry + schema.
- agent/curator.py: new _extract_absorbed_into_declarations() walks
  tool calls for skill_manage(delete) with the arg. _reconcile_classification
  accepts absorbed_declarations= and treats them as authoritative. Curator
  prompt updated to require the arg on every delete.
- Tests: 7 new skill_manager tests covering the tool contract (valid
  target, empty string, nonexistent target, self-reference, whitespace,
  backward compat, dispatcher plumbing). 11 new curator tests covering
  the extractor + authoritative reconciler path + mixed-legacy-and-
  declared runs.

Validation
- 307/307 targeted tests pass (curator + cron + skill_manager suites).
- E2E #18671 repro: 3 narrow skills, 1 umbrella, cron job referencing
  all 3. Model emits NO YAML block. Heuristic misses (patch prose
  doesn't name old slugs). Delete calls carry absorbed_into. Result:
  both PR skills correctly classified 'consolidated' + cron rewritten
  ['pr-review-format', 'pr-review-checklist', 'stale-junk'] ->
  ['hermes-agent-dev']; stale-junk pruned via absorbed_into=''.
- E2E backward-compat: delete without absorbed_into, model emits YAML
  -> routed via existing 'model' source, cron still rewritten correctly.

* feat(curator): capture + restore cron skill links across snapshot/rollback

Before this, rolling back a curator run restored the skills tree but cron
jobs still pointed at the umbrella skills the curator had rewritten them
to. The user would see their old narrow skills back on disk but their
cron jobs still configured with the merged umbrella — not actually 'back
to how it was'.

Snapshot side: snapshot_skills() now captures ~/.hermes/cron/jobs.json
alongside the skills tarball, as cron-jobs.json. The manifest gets a new
'cron_jobs' block with {backed_up, jobs_count} so rollback (and the CLI
confirm dialog) can surface what's in the snapshot. If jobs.json is
missing/unreadable/malformed, snapshot proceeds without cron data — the
skills backup is the core guarantee; cron is additive.

Rollback side: after the skills extract succeeds, the new
_restore_cron_skill_links() reconciles the backed-up jobs into the live
jobs.json SURGICALLY. Only 'skills' and 'skill' fields are restored, and
only on jobs matched by id. Everything else about a cron job — schedule,
last_run_at, next_run_at, enabled, prompt, workdir, hooks — is live
state the user or scheduler has modified since the snapshot; overwriting
it would regress unrelated activity.

Reconciliation rules:
- Job in backup AND live, skills differ  → skills restored.
- Job in backup AND live, skills match   → no-op.
- Job in backup, NOT in live             → skipped (user deleted it
                                              after snapshot; their choice
                                              is later than the snapshot).
- Job in live, NOT in backup             → untouched (user created it
                                              after snapshot).
- Snapshot missing cron-jobs.json at all → rollback still succeeds,
                                              reports 'not captured'
                                              (older pre-feature snapshots
                                              keep working).

Writes go through cron.jobs.save_jobs under the same _jobs_file_lock the
scheduler uses, so rollback doesn't race tick().
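
The reconciliation rules above can be sketched as a pure function over two job lists. The name `restore_skill_links` and the report shape are hypothetical; locking and file I/O are omitted.

```python
def restore_skill_links(backup_jobs, live_jobs):
    """Restore only 'skills'/'skill' on live jobs matched by id (in place)."""
    backup_by_id = {job["id"]: job for job in backup_jobs if "id" in job}
    live_ids = {job.get("id") for job in live_jobs}
    report = {"restored": [], "unchanged": [], "skipped_missing": []}
    for job in live_jobs:
        saved = backup_by_id.get(job.get("id"))
        if saved is None:
            continue  # created after the snapshot: untouched
        changed = False
        for field in ("skills", "skill"):
            if saved.get(field) != job.get(field):
                if field in saved:
                    job[field] = saved[field]
                else:
                    job.pop(field, None)
                changed = True
        report["restored" if changed else "unchanged"].append(job["id"])
    # Jobs deleted after the snapshot are not resurrected.
    report["skipped_missing"] = [i for i in backup_by_id if i not in live_ids]
    return report
```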

Also:
- hermes_cli/curator.py: rollback confirm dialog now shows
  'cron jobs: N (will be restored for skill-link fields only)' when the
  snapshot has cron data, or 'not in snapshot (<reason>)' otherwise.
- rollback()'s message string includes a 'cron links: ...' clause
  summarizing the reconciliation outcome.

Tests
- 9 new cases: snapshot-with-cron, snapshot-without-cron, malformed-json
  captured-as-raw, full rollback-restores-skills-and-cron, rollback
  touches only skill fields, rollback skips user-deleted jobs, rollback
  leaves user-created jobs untouched, rollback still works with
  pre-feature snapshot that has no cron-jobs.json, standalone unit test
  on _restore_cron_skill_links exercising the full report shape.

Validation
- 484/484 targeted tests pass (curator + cron + skill_manager suites).
- E2E: real snapshot_skills, real cron rewrite, real rollback. Before:
  ['pr-review-format', 'pr-review-checklist', 'pr-triage-salvage'].
  After curator: ['hermes-agent-dev']. After rollback: ['pr-review-format',
  'pr-review-checklist', 'pr-triage-salvage']. Non-skill fields (id,
  name, prompt) preserved across the round trip.
2026-05-02 01:29:57 -07:00
Siddharth Balyan f98b5d00a4 fix: gateway systemd unit now retries indefinitely with backoff (#18639)
The old defaults (StartLimitIntervalSec=600, StartLimitBurst=5,
RestartSec=30) meant any network outage over ~5 minutes would
permanently kill the gateway until manual intervention.

Changes:
- StartLimitIntervalSec=0 (never give up)
- Restart=always (not just on-failure)
- RestartSec=60 with RestartMaxDelaySec=300, RestartSteps=5
  (graduated backoff from 60s up to a 300s cap over 5 steps)
- After=network-online.target + Wants= (both units now wait for
  actual connectivity, not just network.target)

Power outage → internet down → internet back = auto-recovery.
2026-05-02 08:51:30 +05:30
Siddharth Balyan 585d6778da fix: allow WebSocket connections from non-loopback IPs in --insecure mode (#18633)
When the dashboard is bound to 0.0.0.0 with --insecure (e.g. behind
Tailscale Serve), WebSocket endpoints (/api/pty, /api/ws, /api/pub,
/api/events) rejected connections from non-loopback client IPs with
code 4403 — causing 'events feed disconnected' in the UI.

Extract the repeated loopback check into _ws_client_is_allowed() which
respects the public bind flag. Session token auth still guards all
endpoints regardless of bind mode.
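
A sketch of what the extracted check might look like. The signature and the `public_bind` parameter are assumptions; the real helper reads the bind flag from server state.

```python
import ipaddress


def ws_client_is_allowed(client_host, public_bind: bool) -> bool:
    """Allow any client on a public bind; otherwise require loopback."""
    if public_bind:
        return True  # session-token auth is still enforced separately
    if not client_host:
        return False
    if client_host == "localhost":
        return True
    try:
        return ipaddress.ip_address(client_host).is_loopback
    except ValueError:
        return False  # not a parseable IP address
```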
2026-05-02 08:17:45 +05:30
174 changed files with 16207 additions and 1378 deletions
+245 -27
@@ -4,6 +4,7 @@ from __future__ import annotations
import asyncio
import contextvars
import json
import logging
import os
from collections import defaultdict, deque
@@ -47,6 +48,7 @@ from acp.schema import (
TextContentBlock,
UnstructuredCommandInput,
Usage,
UsageUpdate,
UserMessageChunk,
)
@@ -65,6 +67,7 @@ from acp_adapter.events import (
)
from acp_adapter.permissions import make_approval_callback
from acp_adapter.session import SessionManager, SessionState, _expand_acp_enabled_toolsets
from acp_adapter.tools import build_tool_complete, build_tool_start
logger = logging.getLogger(__name__)
@@ -315,6 +318,66 @@ class HermesACPAgent(acp.Agent):
        return target_provider, new_model

    @staticmethod
    def _build_usage_update(state: SessionState) -> UsageUpdate | None:
        """Build ACP native context-usage data for clients like Zed.

        Zed's circular context indicator is driven by ACP ``usage_update``
        session updates: ``size`` is the model context window and ``used`` is
        the current request pressure. Hermes estimates ``used`` from the same
        buckets it sends to providers: system prompt, conversation history, and
        tool schemas.
        """
        agent = state.agent
        compressor = getattr(agent, "context_compressor", None)
        size = int(getattr(compressor, "context_length", 0) or 0)
        if size <= 0:
            return None

        try:
            from agent.model_metadata import estimate_request_tokens_rough

            used = estimate_request_tokens_rough(
                state.history,
                system_prompt=getattr(agent, "_cached_system_prompt", "") or "",
                tools=getattr(agent, "tools", None) or None,
            )
        except Exception:
            logger.debug("Could not estimate ACP native context usage", exc_info=True)
            used = int(getattr(compressor, "last_prompt_tokens", 0) or 0)

        return UsageUpdate(
            session_update="usage_update",
            size=max(size, 0),
            used=max(used, 0),
        )

    async def _send_usage_update(self, state: SessionState) -> None:
        """Send ACP native context usage to the connected client."""
        if not self._conn:
            return
        update = self._build_usage_update(state)
        if update is None:
            return
        try:
            await self._conn.session_update(
                session_id=state.session_id,
                update=update,
            )
        except Exception:
            logger.warning(
                "Failed to send ACP usage update for session %s",
                state.session_id,
                exc_info=True,
            )

    def _schedule_usage_update(self, state: SessionState) -> None:
        """Schedule native context indicator refresh after ACP responses."""
        if not self._conn:
            return
        loop = asyncio.get_running_loop()
        loop.call_soon(asyncio.create_task, self._send_usage_update(state))

    async def _register_session_mcp_servers(
        self,
        state: SessionState,
@@ -485,37 +548,99 @@ class HermesACPAgent(acp.Agent):
)
return None
@staticmethod
def _history_tool_call_name_args(tool_call: dict[str, Any]) -> tuple[str, dict[str, Any]]:
"""Extract function name/arguments from an OpenAI-style tool_call."""
function = tool_call.get("function") if isinstance(tool_call.get("function"), dict) else {}
name = str(function.get("name") or tool_call.get("name") or "unknown_tool")
raw_args = function.get("arguments") or tool_call.get("arguments") or tool_call.get("args") or {}
if isinstance(raw_args, str):
try:
parsed = json.loads(raw_args)
except Exception:
parsed = {"raw": raw_args}
raw_args = parsed
if not isinstance(raw_args, dict):
raw_args = {}
return name, raw_args
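To make the normalization rules concrete, here is a standalone copy of the same logic as a free function (the name `tool_call_name_args` is chosen for illustration only), followed by the shapes it is meant to tolerate.

```python
import json
from typing import Any


def tool_call_name_args(tool_call: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    function = tool_call.get("function") if isinstance(tool_call.get("function"), dict) else {}
    name = str(function.get("name") or tool_call.get("name") or "unknown_tool")
    raw = function.get("arguments") or tool_call.get("arguments") or tool_call.get("args") or {}
    if isinstance(raw, str):
        try:
            raw = json.loads(raw)
        except Exception:
            # Keep unparseable argument strings visible instead of dropping them.
            raw = {"raw": raw}
    # Valid JSON that is not an object (a list, a number) is discarded.
    return name, raw if isinstance(raw, dict) else {}


# Nested OpenAI-style call with JSON-encoded arguments.
nested = tool_call_name_args({"function": {"name": "terminal", "arguments": '{"command": "ls"}'}})
# Flat call whose arguments are not valid JSON.
flat = tool_call_name_args({"name": "todo", "args": "not json"})
# Arguments that decode to a list rather than an object.
listy = tool_call_name_args({"function": {"arguments": "[1, 2]"}})
```

Here `nested` is `("terminal", {"command": "ls"})`, `flat` is `("todo", {"raw": "not json"})`, and `listy` is `("unknown_tool", {})`.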
@staticmethod
def _history_tool_call_id(tool_call: dict[str, Any]) -> str:
"""Return the stable provider tool call id for ACP history replay."""
return str(
tool_call.get("id")
or tool_call.get("call_id")
or tool_call.get("tool_call_id")
or ""
).strip()
async def _replay_session_history(self, state: SessionState) -> None:
"""Send persisted user/assistant history to clients during session/load.
Zed's ACP history UI calls ``session/load`` after the user picks an item
from the Agents sidebar. The agent must then replay the full conversation
as ``user_message_chunk`` / ``agent_message_chunk`` notifications; merely
restoring server-side state makes Hermes remember context, but leaves the
editor looking like a clean thread.
as user/assistant chunks plus reconstructed tool-call start/completion
notifications; merely restoring server-side state makes Hermes remember
context, but leaves the editor looking like a clean thread.
"""
if not self._conn or not state.history:
return
for message in state.history:
role = str(message.get("role") or "")
if role not in {"user", "assistant"}:
continue
text = self._history_message_text(message)
if not text:
continue
update = self._history_message_update(role=role, text=text)
if update is None:
continue
active_tool_calls: dict[str, tuple[str, dict[str, Any]]] = {}
async def _send(update: Any) -> bool:
try:
await self._conn.session_update(session_id=state.session_id, update=update)
return True
except Exception:
logger.warning(
"Failed to replay ACP history for session %s",
state.session_id,
exc_info=True,
)
return
return False
for message in state.history:
role = str(message.get("role") or "")
if role in {"user", "assistant"}:
text = self._history_message_text(message)
if text:
update = self._history_message_update(role=role, text=text)
if update is not None and not await _send(update):
return
if role == "assistant" and isinstance(message.get("tool_calls"), list):
for tool_call in message["tool_calls"]:
if not isinstance(tool_call, dict):
continue
tool_call_id = self._history_tool_call_id(tool_call)
if not tool_call_id:
continue
tool_name, args = self._history_tool_call_name_args(tool_call)
active_tool_calls[tool_call_id] = (tool_name, args)
if not await _send(build_tool_start(tool_call_id, tool_name, args)):
return
continue
if role == "tool":
tool_call_id = str(message.get("tool_call_id") or "").strip()
tool_name = str(message.get("tool_name") or "").strip()
function_args: dict[str, Any] | None = None
if tool_call_id in active_tool_calls:
tool_name, function_args = active_tool_calls.pop(tool_call_id)
if not tool_call_id or not tool_name:
continue
result = message.get("content")
if not await _send(
build_tool_complete(
tool_call_id,
tool_name,
result=result if isinstance(result, str) else None,
function_args=function_args,
)
):
return
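The pairing of assistant `tool_calls` with later `tool` messages can be sketched without any ACP types. This is a simplified model under stated assumptions: events are plain tuples rather than the real `build_tool_start` / `build_tool_complete` updates, and `replay_events` is a hypothetical name.

```python
from typing import Any


def replay_events(history: list[dict[str, Any]]) -> list[tuple]:
    events: list[tuple] = []
    active: dict[str, str] = {}  # tool_call_id -> tool name, until the result arrives
    for message in history:
        role = message.get("role")
        if role in {"user", "assistant"} and message.get("content"):
            events.append((role, message["content"]))
        if role == "assistant":
            for call in message.get("tool_calls") or []:
                call_id = str(call.get("id") or "").strip()
                if not call_id:
                    continue  # nothing to pair a result with
                name = call.get("function", {}).get("name", "unknown_tool")
                active[call_id] = name
                events.append(("tool_start", call_id, name))
        elif role == "tool":
            call_id = str(message.get("tool_call_id") or "").strip()
            name = active.pop(call_id, None) or str(message.get("tool_name") or "")
            if call_id and name:
                events.append(("tool_complete", call_id, name))
    return events


history = [
    {"role": "user", "content": "list files"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "c1", "function": {"name": "terminal"}}]},
    {"role": "tool", "tool_call_id": "c1", "content": "a.txt"},
    {"role": "tool", "tool_call_id": "orphan", "content": "?"},
    {"role": "assistant", "content": "Done"},
]
events = replay_events(history)
```

The orphaned tool message is skipped because no start event recorded a name for it, which mirrors the `if not tool_call_id or not tool_name: continue` guard above.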
async def new_session(
self,
@@ -527,11 +652,24 @@ class HermesACPAgent(acp.Agent):
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("New session %s (cwd=%s)", state.session_id, cwd)
self._schedule_available_commands_update(state.session_id)
self._schedule_usage_update(state)
return NewSessionResponse(
session_id=state.session_id,
models=self._build_model_state(state),
)
def _schedule_history_replay(self, state: SessionState) -> None:
"""Replay persisted history after session/load or session/resume returns.
Zed only attaches streamed transcript/tool updates once the load/resume
response has completed. Sending replay notifications while the request is
still in-flight can make the server look correct in logs while the editor
drops or fails to attach the tool-call history.
"""
loop = asyncio.get_running_loop()
replay_coro = self._replay_session_history(state)
loop.call_soon(asyncio.create_task, replay_coro)
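The ordering this relies on can be demonstrated with plain `asyncio`. In this sketch `load_session` and `replay_history` are toy stand-ins for the real handlers; the only claim is that a coroutine handed to `loop.call_soon(asyncio.create_task, ...)` starts after the current handler has returned.

```python
import asyncio


async def main() -> list[str]:
    order: list[str] = []
    done = asyncio.Event()

    async def replay_history() -> None:
        # Stand-in for the replay coroutine; it only records when it ran.
        order.append("replay notification")
        done.set()

    async def load_session() -> str:
        loop = asyncio.get_running_loop()
        # Defer task creation to a later loop iteration, after this handler returns.
        loop.call_soon(asyncio.create_task, replay_history())
        order.append("response returned")
        return "loaded"

    await load_session()
    await asyncio.wait_for(done.wait(), timeout=1)
    return order


order = asyncio.run(main())
```

`order` ends up as `["response returned", "replay notification"]`: the replay work is queued during the request but does not run until the handler has produced its response.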
async def load_session(
self,
cwd: str,
@@ -545,8 +683,9 @@ class HermesACPAgent(acp.Agent):
return None
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Loaded session %s", session_id)
await self._replay_session_history(state)
self._schedule_history_replay(state)
self._schedule_available_commands_update(session_id)
self._schedule_usage_update(state)
return LoadSessionResponse(models=self._build_model_state(state))
async def resume_session(
@@ -562,8 +701,9 @@ class HermesACPAgent(acp.Agent):
state = self.session_manager.create_session(cwd=cwd)
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Resumed session %s", state.session_id)
await self._replay_session_history(state)
self._schedule_history_replay(state)
self._schedule_available_commands_update(state.session_id)
self._schedule_usage_update(state)
return ResumeSessionResponse(models=self._build_model_state(state))
async def cancel(self, session_id: str, **kwargs: Any) -> None:
@@ -712,6 +852,7 @@ class HermesACPAgent(acp.Agent):
if self._conn:
update = acp.update_agent_message_text(response_text)
await self._conn.session_update(session_id, update)
await self._send_usage_update(state)
return PromptResponse(stop_reason="end_turn")
# If Zed sends another regular prompt while the same ACP session is
@@ -744,24 +885,37 @@ class HermesACPAgent(acp.Agent):
tool_call_meta: dict[str, dict[str, Any]] = {}
previous_approval_cb = None
streamed_message = False
if conn:
tool_progress_cb = make_tool_progress_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
thinking_cb = make_thinking_cb(conn, session_id, loop)
reasoning_cb = make_thinking_cb(conn, session_id, loop)
step_cb = make_step_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
message_cb = make_message_cb(conn, session_id, loop)
def stream_delta_cb(text: str) -> None:
nonlocal streamed_message
if text:
streamed_message = True
message_cb(text)
approval_cb = make_approval_callback(conn.request_permission, loop, session_id)
else:
tool_progress_cb = None
thinking_cb = None
reasoning_cb = None
step_cb = None
message_cb = None
stream_delta_cb = None
approval_cb = None
agent = state.agent
agent.tool_progress_callback = tool_progress_cb
agent.thinking_callback = thinking_cb
# ACP thought panes should not receive Hermes' local kawaii waiting/status
# updates. Route provider/model reasoning deltas instead; if the provider
# emits no reasoning, Zed should not get a fake "thinking" accordion.
agent.thinking_callback = None
agent.reasoning_callback = reasoning_cb
agent.step_callback = step_cb
agent.message_callback = message_cb
agent.stream_delta_callback = stream_delta_cb
# Approval callback is per-thread (thread-local, GHSA-qg5c-hvr5-hjgr).
# Set it INSIDE _run_agent so the TLS write happens in the executor
@@ -867,7 +1021,7 @@ class HermesACPAgent(acp.Agent):
)
except Exception:
logger.debug("Failed to auto-title ACP session %s", session_id, exc_info=True)
if final_response and conn:
if final_response and conn and not streamed_message:
update = acp.update_agent_message_text(final_response)
await conn.session_update(session_id, update)
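The `streamed_message` guard exists to avoid sending the same text twice. A reduced sketch, assuming a hypothetical `deliver` helper that collects what the client would receive:

```python
def deliver(chunks: list[str], final_response: str) -> list[str]:
    sent: list[str] = []
    streamed = False

    def on_delta(text: str) -> None:
        nonlocal streamed
        if text:  # empty deltas do not count as streamed output
            streamed = True
            sent.append(text)

    for chunk in chunks:
        on_delta(chunk)
    # Only fall back to the full response when nothing was streamed.
    if final_response and not streamed:
        sent.append(final_response)
    return sent
```

With deltas `["Hel", "lo"]` the final `"Hello"` is suppressed; with no deltas, or only empty ones, the final response is sent once.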
@@ -903,6 +1057,8 @@ class HermesACPAgent(acp.Agent):
cached_read_tokens=result.get("cache_read_tokens"),
)
await self._send_usage_update(state)
stop_reason = "cancelled" if state.cancel_event and state.cancel_event.is_set() else "end_turn"
return PromptResponse(stop_reason=stop_reason, usage=usage)
@@ -1035,22 +1191,84 @@ class HermesACPAgent(acp.Agent):
return f"Could not list tools: {e}"
def _cmd_context(self, args: str, state: SessionState) -> str:
"""Show ACP session context pressure and compression guidance."""
n_messages = len(state.history)
if n_messages == 0:
return "Conversation is empty (no messages yet)."
# Count by role
# Count by role.
roles: dict[str, int] = {}
for msg in state.history:
role = msg.get("role", "unknown")
roles[role] = roles.get(role, 0) + 1
agent = state.agent
model = state.model or getattr(agent, "model", "")
provider = getattr(agent, "provider", None) or "auto"
compressor = getattr(agent, "context_compressor", None)
context_length = int(getattr(compressor, "context_length", 0) or 0)
threshold_tokens = int(getattr(compressor, "threshold_tokens", 0) or 0)
try:
from agent.model_metadata import estimate_request_tokens_rough
system_prompt = getattr(agent, "_cached_system_prompt", "") or ""
tools = getattr(agent, "tools", None) or None
approx_tokens = estimate_request_tokens_rough(
state.history,
system_prompt=system_prompt,
tools=tools,
)
except Exception:
logger.debug("Could not estimate ACP context usage", exc_info=True)
approx_tokens = 0
if threshold_tokens <= 0 and context_length > 0:
threshold_tokens = int(context_length * 0.80)
lines = [
f"Conversation: {n_messages} messages",
f"Conversation: {n_messages} messages"
if n_messages
else "Conversation is empty (no messages yet).",
f" user: {roles.get('user', 0)}, assistant: {roles.get('assistant', 0)}, "
f"tool: {roles.get('tool', 0)}, system: {roles.get('system', 0)}",
]
model = state.model or getattr(state.agent, "model", "")
if model:
lines.append(f"Model: {model}")
lines.append(f"Provider: {provider}")
if approx_tokens > 0:
if context_length > 0:
usage_pct = (approx_tokens / context_length) * 100
lines.append(
f"Context usage: ~{approx_tokens:,} / {context_length:,} tokens ({usage_pct:.1f}%)"
)
else:
lines.append(f"Context usage: ~{approx_tokens:,} tokens")
if threshold_tokens > 0:
if approx_tokens > 0:
threshold_pct = (threshold_tokens / context_length) * 100 if context_length > 0 else 0
remaining = max(threshold_tokens - approx_tokens, 0)
if approx_tokens >= threshold_tokens:
lines.append(
f"Compression: due now (threshold ~{threshold_tokens:,}"
+ (f", {threshold_pct:.0f}%" if threshold_pct else "")
+ "). Run /compact."
)
else:
lines.append(
f"Compression: ~{remaining:,} tokens until threshold "
f"(~{threshold_tokens:,}"
+ (f", {threshold_pct:.0f}%" if threshold_pct else "")
+ ")."
)
else:
lines.append(f"Compression threshold: ~{threshold_tokens:,} tokens")
if getattr(agent, "compression_enabled", True) is False:
lines.append("Compression is disabled for this agent.")
else:
lines.append("Tip: run /compact to compress manually before the threshold.")
return "\n".join(lines)
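The threshold arithmetic in `_cmd_context` reduces to a few lines. The sketch below uses a hypothetical `compression_status` helper and reproduces only the 80% default and the "due now" versus "remaining" branch, not the full report.

```python
def compression_status(approx_tokens: int, context_length: int, threshold_tokens: int = 0) -> str:
    # Default the threshold to 80% of the window when the compressor reports none.
    if threshold_tokens <= 0 and context_length > 0:
        threshold_tokens = int(context_length * 0.80)
    if approx_tokens >= threshold_tokens:
        return f"due now (threshold ~{threshold_tokens:,})"
    remaining = threshold_tokens - approx_tokens
    return f"~{remaining:,} tokens until threshold (~{threshold_tokens:,})"
```

For a 200,000-token window the default threshold is 160,000, so an estimate of 50,000 tokens leaves about 110,000 before compression is due.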
def _cmd_reset(self, args: str, state: SessionState) -> str:
+822 -21
@@ -28,6 +28,11 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
"terminal": "execute",
"process": "execute",
"execute_code": "execute",
# Session/meta tools
"todo": "other",
"skill_view": "read",
"skills_list": "read",
"skill_manage": "edit",
# Web / fetch
"web_search": "fetch",
"web_extract": "fetch",
@@ -51,6 +56,28 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
}
_POLISHED_TOOLS = {
# Core operator loop
"todo", "memory", "session_search", "delegate_task",
# Files / execution
"read_file", "write_file", "patch", "search_files", "terminal", "process", "execute_code",
# Skills / web / browser / media
"skill_view", "skills_list", "skill_manage", "web_search", "web_extract",
"browser_navigate", "browser_click", "browser_type", "browser_press", "browser_scroll",
"browser_back", "browser_snapshot", "browser_console", "browser_get_images", "browser_vision",
"vision_analyze", "image_generate", "text_to_speech",
# Schedulers / platform integrations
"cronjob", "send_message", "clarify", "discord", "discord_admin",
"ha_list_entities", "ha_get_state", "ha_list_services", "ha_call_service",
"feishu_doc_read", "feishu_drive_list_comments", "feishu_drive_list_comment_replies",
"feishu_drive_reply_comment", "feishu_drive_add_comment",
"kanban_create", "kanban_show", "kanban_comment", "kanban_complete",
"kanban_block", "kanban_link", "kanban_heartbeat",
"yb_query_group_info", "yb_query_group_members", "yb_search_sticker",
"yb_send_dm", "yb_send_sticker", "mixture_of_agents",
}
def get_tool_kind(tool_name: str) -> ToolKind:
"""Return the ACP ToolKind for a hermes tool, defaulting to 'other'."""
return TOOL_KIND_MAP.get(tool_name, "other")
@@ -85,18 +112,645 @@ def build_tool_title(tool_name: str, args: Dict[str, Any]) -> str:
if urls:
return f"extract: {urls[0]}" + (f" (+{len(urls)-1})" if len(urls) > 1 else "")
return "web extract"
if tool_name == "process":
action = str(args.get("action") or "").strip() or "manage"
sid = str(args.get("session_id") or "").strip()
return f"process {action}: {sid}" if sid else f"process {action}"
if tool_name == "delegate_task":
tasks = args.get("tasks")
if isinstance(tasks, list) and tasks:
return f"delegate batch ({len(tasks)} tasks)"
goal = args.get("goal", "")
if goal and len(goal) > 60:
goal = goal[:57] + "..."
return f"delegate: {goal}" if goal else "delegate task"
if tool_name == "session_search":
query = str(args.get("query") or "").strip()
return f"session search: {query}" if query else "recent sessions"
if tool_name == "memory":
action = str(args.get("action") or "manage").strip() or "manage"
target = str(args.get("target") or "memory").strip() or "memory"
return f"memory {action}: {target}"
if tool_name == "execute_code":
return "execute code"
code = str(args.get("code") or "").strip()
first_line = next((line.strip() for line in code.splitlines() if line.strip()), "")
if first_line:
if len(first_line) > 70:
first_line = first_line[:67] + "..."
return f"python: {first_line}"
return "python code"
if tool_name == "todo":
items = args.get("todos")
if isinstance(items, list):
return f"todo ({len(items)} item{'s' if len(items) != 1 else ''})"
return "todo"
if tool_name == "skill_view":
name = str(args.get("name") or "?").strip() or "?"
file_path = str(args.get("file_path") or "").strip()
suffix = f"/{file_path}" if file_path else ""
return f"skill view ({name}{suffix})"
if tool_name == "skills_list":
category = str(args.get("category") or "").strip()
return f"skills list ({category})" if category else "skills list"
if tool_name == "skill_manage":
action = str(args.get("action") or "manage").strip() or "manage"
name = str(args.get("name") or "?").strip() or "?"
file_path = str(args.get("file_path") or "").strip()
target = f"{name}/{file_path}" if file_path else name
if len(target) > 64:
target = target[:61] + "..."
return f"skill {action}: {target}"
if tool_name == "browser_navigate":
return f"navigate: {args.get('url', '?')}"
if tool_name == "browser_snapshot":
return "browser snapshot"
if tool_name == "browser_vision":
return f"browser vision: {str(args.get('question', '?'))[:50]}"
if tool_name == "browser_get_images":
return "browser images"
if tool_name == "vision_analyze":
return f"analyze image: {args.get('question', '?')[:50]}"
return f"analyze image: {str(args.get('question', '?'))[:50]}"
if tool_name == "image_generate":
prompt = str(args.get("prompt") or args.get("description") or "").strip()
return f"generate image: {prompt[:50]}" if prompt else "generate image"
if tool_name == "cronjob":
action = str(args.get("action") or "manage").strip() or "manage"
job_id = str(args.get("job_id") or args.get("id") or "").strip()
return f"cron {action}: {job_id}" if job_id else f"cron {action}"
return tool_name
def _text(content: str) -> Any:
return acp.tool_content(acp.text_block(content))
def _json_loads_maybe(value: Optional[str]) -> Any:
if not isinstance(value, str):
return value
try:
return json.loads(value)
except Exception:
pass
# Some Hermes tools append a human hint after a JSON payload, e.g.
# ``{...}\n\n[Hint: Results truncated...]``. Keep the structured rendering path
# by decoding the first JSON value instead of falling back to raw text.
try:
decoded, _ = json.JSONDecoder().raw_decode(value.lstrip())
return decoded
except Exception:
return None
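The fallback relies on `json.JSONDecoder.raw_decode`, which parses the first JSON value in a string and ignores whatever follows. A standalone sketch (the name `loads_first_json` is illustrative):

```python
import json
from typing import Any


def loads_first_json(value: str) -> Any:
    try:
        return json.loads(value)
    except ValueError:
        pass
    try:
        # raw_decode returns (value, end_index) and tolerates trailing text.
        decoded, _end = json.JSONDecoder().raw_decode(value.lstrip())
        return decoded
    except ValueError:
        return None


with_hint = loads_first_json('{"total_count": 3}\n\n[Hint: Results truncated]')
plain = loads_first_json("plain text")
```

`with_hint` is `{"total_count": 3}` even though `json.loads` rejects the trailing hint, and `plain` is `None`. Note that `raw_decode` does not skip leading whitespace itself, hence the `lstrip()`.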
def _truncate_text(text: str, limit: int = 5000) -> str:
if len(text) <= limit:
return text
return text[: max(0, limit - 100)] + f"\n... ({len(text)} chars total, truncated)"
def _fenced_text(text: str, language: str = "") -> str:
"""Return a Markdown fence that cannot be broken by backticks in text."""
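# Measure the longest run of consecutive backticks in the payload.
longest = max((len(run) for run in "".join(ch if ch == "`" else " " for ch in text).split()), default=0)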
fence = "`" * max(3, longest + 1)
return f"{fence}{language}\n{text}\n{fence}"
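The fence-sizing rule is: use one more backtick than the longest backtick run in the payload, and never fewer than three. A minimal sketch of that rule using `re` (an assumption here; this diff does not show whether the module imports it):

```python
import re


def fenced(text: str, language: str = "") -> str:
    # Longest run of consecutive backticks anywhere in the payload.
    longest = max((len(run) for run in re.findall(r"`+", text)), default=0)
    fence = "`" * max(3, longest + 1)
    return f"{fence}{language}\n{text}\n{fence}"
```

A payload with no backticks gets the usual three-backtick fence, while a payload that itself contains a three-backtick run is wrapped in four, so the inner run cannot close the outer block.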
def _format_todo_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict) or not isinstance(data.get("todos"), list):
return None
summary = data.get("summary") if isinstance(data.get("summary"), dict) else {}
icon = {
"completed": "",
"in_progress": "🔄",
"pending": "",
"cancelled": "",
}
lines = ["**Todo list**", ""]
for item in data["todos"]:
if not isinstance(item, dict):
continue
status = str(item.get("status") or "pending")
content = str(item.get("content") or item.get("id") or "").strip()
if content:
lines.append(f"- {icon.get(status, '')} {content}")
if summary:
cancelled = summary.get("cancelled", 0)
lines.extend([
"",
"**Progress:** "
f"{summary.get('completed', 0)} completed, "
f"{summary.get('in_progress', 0)} in progress, "
f"{summary.get('pending', 0)} pending"
+ (f", {cancelled} cancelled" if cancelled else ""),
])
return "\n".join(lines)
def _format_read_file_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("error") and not data.get("content"):
return f"Read failed: {data.get('error')}"
content = data.get("content")
if not isinstance(content, str):
return None
path = str((args or {}).get("path") or data.get("path") or "file").strip()
offset = (args or {}).get("offset")
limit = (args or {}).get("limit")
range_bits = []
if offset:
range_bits.append(f"from line {offset}")
if limit:
range_bits.append(f"limit {limit}")
suffix = f" ({', '.join(range_bits)})" if range_bits else ""
header = f"Read {path}{suffix}"
if data.get("total_lines") is not None:
header += f", {data.get('total_lines')} total lines"
# Hermes read_file output is line-numbered with `|`. If we send it as raw
# Markdown, Zed can interpret pipes as tables and collapse the layout.
# Fence the payload so file lines stay readable and literal.
return _truncate_text(f"{header}\n\n{_fenced_text(content)}")
def _format_search_files_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
matches = data.get("matches")
if not isinstance(matches, list):
return None
total = data.get("total_count", len(matches))
shown = min(len(matches), 12)
truncated = bool(data.get("truncated")) or len(matches) > shown
lines = [
"Search results",
f"Found {total} match{'es' if total != 1 else ''}; showing {shown}.",
"",
]
for match in matches[:shown]:
if not isinstance(match, dict):
lines.append(f"- {match}")
continue
path = str(match.get("path") or match.get("file") or match.get("filename") or "?")
line = match.get("line") or match.get("line_number")
content = str(match.get("content") or match.get("text") or "").strip()
loc = f"{path}:{line}" if line else path
lines.append(f"- {loc}")
if content:
snippet = _truncate_text(" ".join(content.split()), 300)
lines.append(f" {snippet}")
if truncated:
lines.extend([
"",
"Results truncated. Narrow the search, add file_glob, or use offset to page.",
])
return _truncate_text("\n".join(lines), limit=7000)
def _format_execute_code_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
output = str(data.get("output") or "")
error = str(data.get("error") or "")
exit_code = data.get("exit_code")
parts = [f"Exit code: {exit_code}" if exit_code is not None else "Execution complete"]
if output:
parts.extend(["", "Output:", output])
if error:
parts.extend(["", "Error:", error])
return _truncate_text("\n".join(parts))
def _extract_markdown_headings(content: str, limit: int = 8) -> list[str]:
headings: list[str] = []
for line in content.splitlines():
stripped = line.strip()
if stripped.startswith("#"):
heading = stripped.lstrip("#").strip()
if heading:
headings.append(heading)
if len(headings) >= limit:
break
return headings
def _format_skill_view_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False:
return f"Skill view failed: {data.get('error', 'unknown error')}"
name = str(data.get("name") or "skill")
file_path = str(data.get("file") or data.get("path") or "SKILL.md")
description = str(data.get("description") or "").strip()
content = str(data.get("content") or "")
linked = data.get("linked_files") if isinstance(data.get("linked_files"), dict) else None
lines = ["**Skill loaded**", "", f"- **Name:** `{name}`", f"- **File:** `{file_path}`"]
if description:
lines.append(f"- **Description:** {description}")
if content:
lines.append(f"- **Content:** {len(content):,} chars loaded into agent context")
if linked:
linked_count = sum(len(v) for v in linked.values() if isinstance(v, list))
lines.append(f"- **Linked files:** {linked_count}")
headings = _extract_markdown_headings(content)
if headings:
lines.extend(["", "**Sections**"])
lines.extend(f"- {heading}" for heading in headings)
lines.extend([
"",
"_Full skill content is available to the agent but hidden here to keep ACP readable._",
])
return "\n".join(lines)
def _format_skill_manage_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
action = str((args or {}).get("action") or "manage").strip() or "manage"
name = str((args or {}).get("name") or data.get("name") or "skill").strip() or "skill"
file_path = str((args or {}).get("file_path") or data.get("file_path") or "SKILL.md").strip() or "SKILL.md"
success = data.get("success")
status = "✅ Skill updated" if success is not False else "✗ Skill update failed"
lines = [f"**{status}**", "", f"- **Action:** `{action}`", f"- **Skill:** `{name}`"]
if action not in {"delete"}:
lines.append(f"- **File:** `{file_path}`")
message = str(data.get("message") or data.get("error") or "").strip()
if message:
lines.append(f"- **Result:** {message}")
replacements = data.get("replacements") or data.get("replacement_count")
if replacements is not None:
lines.append(f"- **Replacements:** {replacements}")
path = str(data.get("path") or "").strip()
if path:
lines.append(f"- **Path:** `{path}`")
return "\n".join(lines)
def _format_web_search_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
web = data.get("data", {}).get("web") if isinstance(data.get("data"), dict) else data.get("web")
if not isinstance(web, list):
return None
lines = [f"Web results: {len(web)}"]
for item in web[:10]:
if not isinstance(item, dict):
continue
title = str(item.get("title") or item.get("url") or "result").strip()
url = str(item.get("url") or "").strip()
desc = str(item.get("description") or "").strip()
lines.append(f"{title}" + (f" - {url}" if url else ""))
if desc:
lines.append(f" {desc}")
return _truncate_text("\n".join(lines))
def _format_web_extract_result(result: Optional[str]) -> Optional[str]:
"""Return only web_extract errors for ACP; success stays compact via title."""
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False and data.get("error"):
return f"Web extract failed: {data.get('error')}"
results = data.get("results")
if not isinstance(results, list):
return None
failures: list[str] = []
for item in results[:10]:
if not isinstance(item, dict):
continue
error = str(item.get("error") or "").strip()
if not error or error in {"None", "null"}:
continue
url = str(item.get("url") or "").strip()
title = str(item.get("title") or url or "Untitled").strip()
failures.append(
f"- {title}" + (f" - {url}" if url and url != title else "") + f"\n Error: {_truncate_text(error, limit=500)}"
)
if not failures:
return None
lines = [f"Web extract failed for {len(failures)} URL{'s' if len(failures) != 1 else ''}"]
lines.extend(failures)
return "\n".join(lines)
def _format_process_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False and data.get("error"):
return f"Process error: {data.get('error')}"
action = str((args or {}).get("action") or "process").strip() or "process"
if isinstance(data.get("processes"), list):
processes = data["processes"]
lines = [f"Processes: {len(processes)}"]
for proc in processes[:20]:
if not isinstance(proc, dict):
lines.append(f"- {proc}")
continue
sid = str(proc.get("session_id") or proc.get("id") or "?")
status = str(proc.get("status") or ("exited" if proc.get("exited") else "running"))
cmd = str(proc.get("command") or "").strip()
pid = proc.get("pid")
code = proc.get("exit_code")
bits = [status]
if pid is not None:
bits.append(f"pid {pid}")
if code is not None:
bits.append(f"exit {code}")
lines.append(f"- `{sid}` — {', '.join(bits)}" + (f"{cmd[:120]}" if cmd else ""))
if len(processes) > 20:
lines.append(f"... {len(processes) - 20} more process(es)")
return "\n".join(lines)
status = str(data.get("status") or data.get("state") or action).strip()
sid = str(data.get("session_id") or (args or {}).get("session_id") or "").strip()
lines = [f"Process {action}: {status}" + (f" (`{sid}`)" if sid else "")]
for key, label in (("command", "Command"), ("pid", "PID"), ("exit_code", "Exit code"), ("returncode", "Exit code"), ("lines", "Lines")):
if data.get(key) is not None:
lines.append(f"- **{label}:** {data.get(key)}")
output = data.get("output") or data.get("new_output") or data.get("log") or data.get("stdout")
error = data.get("error") or data.get("stderr")
if output:
lines.extend(["", "Output:", _truncate_text(str(output), limit=5000)])
if error:
lines.extend(["", "Error:", _truncate_text(str(error), limit=2000)])
msg = data.get("message")
if msg and not output and not error:
lines.append(str(msg))
return _truncate_text("\n".join(lines), limit=7000)
def _format_delegate_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("error") and not isinstance(data.get("results"), list):
return f"Delegation failed: {data.get('error')}"
results = data.get("results")
if not isinstance(results, list):
return None
total = data.get("total_duration_seconds")
lines = [f"Delegation results: {len(results)} task{'s' if len(results) != 1 else ''}" + (f" in {total}s" if total is not None else "")]
icon = {"completed": "", "failed": "", "error": "", "timeout": "", "interrupted": ""}
for item in results:
if not isinstance(item, dict):
lines.append(f"- {item}")
continue
idx = item.get("task_index")
status = str(item.get("status") or "unknown")
model = item.get("model")
dur = item.get("duration_seconds")
role = item.get("_child_role")
header = f"{icon.get(status, '')} Task {idx + 1 if isinstance(idx, int) else '?'}: {status}"
bits = []
if model:
bits.append(str(model))
if role:
bits.append(f"role={role}")
if dur is not None:
bits.append(f"{dur}s")
if bits:
header += " (" + ", ".join(bits) + ")"
lines.extend(["", header])
summary = str(item.get("summary") or "").strip()
error = str(item.get("error") or "").strip()
if summary:
lines.append(_truncate_text(summary, limit=1200))
if error:
lines.append("Error: " + _truncate_text(error, limit=800))
trace = item.get("tool_trace")
if isinstance(trace, list) and trace:
names = [str(t.get("tool") or "?") for t in trace if isinstance(t, dict)]
if names:
lines.append("Tools: " + ", ".join(names[:12]) + (f" (+{len(names)-12})" if len(names) > 12 else ""))
return _truncate_text("\n".join(lines), limit=8000)
def _format_session_search_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False:
return f"Session search failed: {data.get('error', 'unknown error')}"
results = data.get("results")
if not isinstance(results, list):
return None
mode = data.get("mode") or "search"
query = data.get("query")
lines = ["Recent sessions" if mode == "recent" else "Session search results" + (f" for `{query}`" if query else "")]
if not results:
lines.append(str(data.get("message") or "No matching sessions found."))
return "\n".join(lines)
for item in results:
if not isinstance(item, dict):
continue
sid = str(item.get("session_id") or "?")
title = str(item.get("title") or item.get("when") or "Untitled session").strip()
when = str(item.get("last_active") or item.get("started_at") or item.get("when") or "").strip()
count = item.get("message_count")
source = str(item.get("source") or "").strip()
meta = ", ".join(str(x) for x in [when, source, f"{count} msgs" if count is not None else ""] if x)
lines.append(f"- **{title}** (`{sid}`)" + (f" - {meta}" if meta else ""))
summary = str(item.get("summary") or item.get("preview") or "").strip()
if summary:
lines.append(" " + _truncate_text(" ".join(summary.split()), limit=500))
return _truncate_text("\n".join(lines), limit=7000)
def _format_memory_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
action = str((args or {}).get("action") or "memory").strip() or "memory"
target = str(data.get("target") or (args or {}).get("target") or "memory")
if data.get("success") is False:
lines = [f"✗ Memory {action} failed ({target})", str(data.get("error") or "unknown error")]
matches = data.get("matches")
if isinstance(matches, list) and matches:
lines.append("Matches:")
lines.extend(f"- {_truncate_text(str(m), 160)}" for m in matches[:5])
return "\n".join(lines)
lines = [f"✅ Memory {action} saved ({target})"]
if data.get("message"):
lines.append(str(data.get("message")))
if data.get("entry_count") is not None:
lines.append(f"Entries: {data.get('entry_count')}")
if data.get("usage"):
lines.append(f"Usage: {data.get('usage')}")
# Avoid dumping all memory entries into ACP UI; show only the explicit new value preview.
preview = str((args or {}).get("content") or (args or {}).get("old_text") or "").strip()
if preview:
lines.append("Preview: " + _truncate_text(preview, limit=300))
return "\n".join(lines)
def _format_edit_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
path = str((args or {}).get("path") or "file").strip()
if isinstance(data, dict):
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed for {path}: {data.get('error', 'unknown error')}"
message = str(data.get("message") or "").strip()
replacements = data.get("replacements") or data.get("replacement_count")
lines = [f"{tool_name} completed" + (f" for `{path}`" if path else "")]
if message:
lines.append(message)
if replacements is not None:
lines.append(f"Replacements: {replacements}")
if data.get("files_modified"):
files = data.get("files_modified")
if isinstance(files, list):
lines.append("Files: " + ", ".join(f"`{f}`" for f in files[:8]))
return "\n".join(lines)
if isinstance(result, str) and result.strip():
return _truncate_text(result, limit=3000)
return f"{tool_name} completed" + (f" for `{path}`" if path else "")
def _format_browser_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
if tool_name == "browser_get_images":
images = data.get("images") or data.get("data")
if isinstance(images, list):
lines = [f"Images found: {len(images)}"]
for img in images[:12]:
if isinstance(img, dict):
alt = str(img.get("alt") or "").strip()
url = str(img.get("url") or img.get("src") or "").strip()
lines.append(f"- {alt or 'image'}" + (f": {url}" if url else ""))
return _truncate_text("\n".join(lines), limit=5000)
title = str(data.get("title") or data.get("url") or data.get("status") or tool_name)
text = str(data.get("text") or data.get("content") or data.get("snapshot") or data.get("analysis") or data.get("message") or "").strip()
lines = [title]
if data.get("url") and data.get("url") != title:
lines.append(str(data.get("url")))
if text:
lines.extend(["", _truncate_text(text, limit=5000)])
return _truncate_text("\n".join(lines), limit=7000)
def _format_media_or_cron_result(tool_name: str, result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
lines = [f"{tool_name} completed"]
for key in ("file_path", "path", "url", "image_url", "job_id", "id", "status", "message", "next_run"):
if data.get(key):
lines.append(f"- **{key}:** {data.get(key)}")
return "\n".join(lines)
def _format_generic_structured_result(tool_name: str, result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, (dict, list)):
return result if isinstance(result, str) and result.strip() else None
if isinstance(data, list):
lines = [f"{tool_name}: {len(data)} item{'s' if len(data) != 1 else ''}"]
for item in data[:12]:
lines.append(f"- {_truncate_text(str(item), limit=240)}")
return _truncate_text("\n".join(lines), limit=5000)
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
lines = [f"{tool_name} completed" if data.get("success") is True else f"{tool_name} result"]
priority_keys = (
"message", "status", "id", "task_id", "issue_id", "title", "name", "entity_id",
"state", "service", "url", "path", "file_path", "count", "total", "next_run",
)
seen = set()
for key in priority_keys:
value = data.get(key)
if value in (None, "", [], {}):
continue
seen.add(key)
lines.append(f"- **{key}:** {_truncate_text(str(value), limit=500)}")
for key, value in data.items():
if key in seen or key in {"success", "raw", "content", "entries"}:
continue
if value in (None, "", [], {}):
continue
if isinstance(value, (dict, list)):
preview = json.dumps(value, ensure_ascii=False, default=str)
else:
preview = str(value)
lines.append(f"- **{key}:** {_truncate_text(preview, limit=500)}")
if len(lines) >= 14:
break
content = data.get("content")
if isinstance(content, str) and content.strip():
lines.extend(["", _truncate_text(content.strip(), limit=1500)])
return _truncate_text("\n".join(lines), limit=7000)
def _build_polished_completion_content(
tool_name: str,
result: Optional[str],
function_args: Optional[Dict[str, Any]],
) -> Optional[List[Any]]:
formatter = {
"todo": lambda: _format_todo_result(result),
"read_file": lambda: _format_read_file_result(result, function_args),
"write_file": lambda: _format_edit_result(tool_name, result, function_args),
"patch": lambda: _format_edit_result(tool_name, result, function_args),
"search_files": lambda: _format_search_files_result(result),
"execute_code": lambda: _format_execute_code_result(result),
"process": lambda: _format_process_result(result, function_args),
"delegate_task": lambda: _format_delegate_result(result),
"session_search": lambda: _format_session_search_result(result),
"memory": lambda: _format_memory_result(result, function_args),
"skill_view": lambda: _format_skill_view_result(result),
"skill_manage": lambda: _format_skill_manage_result(result, function_args),
"web_search": lambda: _format_web_search_result(result),
"web_extract": lambda: _format_web_extract_result(result),
"browser_navigate": lambda: _format_browser_result(tool_name, result, function_args),
"browser_snapshot": lambda: _format_browser_result(tool_name, result, function_args),
"browser_vision": lambda: _format_browser_result(tool_name, result, function_args),
"browser_get_images": lambda: _format_browser_result(tool_name, result, function_args),
"vision_analyze": lambda: _format_media_or_cron_result(tool_name, result),
"image_generate": lambda: _format_media_or_cron_result(tool_name, result),
"cronjob": lambda: _format_media_or_cron_result(tool_name, result),
}.get(tool_name)
if formatter is None and tool_name in _POLISHED_TOOLS:
formatter = lambda: _format_generic_structured_result(tool_name, result)
if formatter is None:
return None
text = formatter()
if not text:
return None
return [_text(text)]
def _build_patch_mode_content(patch_text: str) -> List[Any]:
"""Parse V4A patch mode input into ACP diff blocks when possible."""
if not patch_text:
@@ -258,7 +912,11 @@ def _build_tool_complete_content(
except Exception:
pass
return [acp.tool_content(acp.text_block(display_result))]
polished_content = _build_polished_completion_content(tool_name, result, function_args)
if polished_content:
return polished_content
return [_text(display_result)]
# ---------------------------------------------------------------------------
@@ -288,7 +946,6 @@ def build_tool_start(
content = _build_patch_mode_content(patch_text)
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "write_file":
@@ -297,32 +954,172 @@ def build_tool_start(
content = [acp.tool_diff_content(path=path, new_text=file_content)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "terminal":
command = arguments.get("command", "")
content = [acp.tool_content(acp.text_block(f"$ {command}"))]
content = [_text(f"$ {command}")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "read_file":
path = arguments.get("path", "")
content = [acp.tool_content(acp.text_block(f"Reading {path}"))]
# The title and location already identify the file. Sending a synthetic
# "Reading ..." content block makes Zed render an unhelpful Output
# section before the real file contents arrive on completion.
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
tool_call_id, title, kind=kind, content=None, locations=locations,
)
if tool_name == "search_files":
pattern = arguments.get("pattern", "")
target = arguments.get("target", "content")
content = [acp.tool_content(acp.text_block(f"Searching for '{pattern}' ({target})"))]
search_path = arguments.get("path")
where = f" in {search_path}" if search_path else ""
content = [_text(f"Searching for '{pattern}' ({target}){where}")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "todo":
items = arguments.get("todos")
if isinstance(items, list):
preview_lines = ["Updating todo list", ""]
for item in items[:8]:
if isinstance(item, dict):
preview_lines.append(f"- {item.get('status', 'pending')}: {item.get('content', item.get('id', ''))}")
if len(items) > 8:
preview_lines.append(f"... {len(items) - 8} more")
content = [_text("\n".join(preview_lines))]
else:
content = [_text("Reading todo list")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "skill_view":
name = str(arguments.get("name") or "?").strip() or "?"
file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
content = [_text(f"Loading skill '{name}' ({file_path})")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "skill_manage":
action = str(arguments.get("action") or "manage").strip() or "manage"
name = str(arguments.get("name") or "?").strip() or "?"
file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
path = f"skills/{name}/{file_path}" if file_path else f"skills/{name}"
if action == "patch":
old = str(arguments.get("old_string") or "")
new = str(arguments.get("new_string") or "")
content = [acp.tool_diff_content(path=path, old_text=old or None, new_text=new)]
elif action in {"edit", "create"}:
content = [
acp.tool_diff_content(
path=path,
new_text=str(arguments.get("content") or ""),
)
]
elif action == "write_file":
target = str(arguments.get("file_path") or "file")
content = [
acp.tool_diff_content(
path=f"skills/{name}/{target}",
new_text=str(arguments.get("file_content") or ""),
)
]
elif action in {"delete", "remove_file"}:
target = str(arguments.get("file_path") or file_path or name)
content = [_text(f"Removing {target} from skill '{name}'")]
else:
content = [_text(f"Running skill_manage action '{action}' on skill '{name}' ({file_path})")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "execute_code":
code = str(arguments.get("code") or "").strip()
preview = code[:1200] + (f"\n... ({len(code)} chars total, truncated)" if len(code) > 1200 else "")
content = [_text(f"Running Python helper script:\n\n```python\n{preview}\n```" if preview else "Running Python helper script")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "web_search":
query = str(arguments.get("query") or "").strip()
content = [_text(f"Searching the web for: {query}" if query else "Searching the web")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "web_extract":
# The title identifies the URL(s). Avoid a duplicate content block so
# Zed renders this like read_file: compact start, concise completion.
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=None, locations=locations,
)
if tool_name == "process":
action = str(arguments.get("action") or "").strip() or "manage"
sid = str(arguments.get("session_id") or "").strip()
data_preview = str(arguments.get("data") or "").strip()
text = f"Process action: {action}" + (f"\nSession: {sid}" if sid else "")
if data_preview:
text += "\nInput: " + _truncate_text(data_preview, limit=500)
content = [_text(text)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "delegate_task":
tasks = arguments.get("tasks")
if isinstance(tasks, list) and tasks:
lines = [f"Delegating {len(tasks)} tasks", ""]
for i, task in enumerate(tasks[:8], 1):
if isinstance(task, dict):
goal = str(task.get("goal") or "").strip()
role = str(task.get("role") or "").strip()
lines.append(f"{i}. " + _truncate_text(goal, limit=160) + (f" ({role})" if role else ""))
if len(tasks) > 8:
lines.append(f"... {len(tasks) - 8} more")
content = [_text("\n".join(lines))]
else:
goal = str(arguments.get("goal") or "").strip()
content = [_text("Delegating task" + (f":\n{_truncate_text(goal, limit=800)}" if goal else ""))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "session_search":
query = str(arguments.get("query") or "").strip()
content = [_text(f"Searching past sessions for: {query}" if query else "Loading recent sessions")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "memory":
action = str(arguments.get("action") or "manage").strip() or "manage"
target = str(arguments.get("target") or "memory").strip() or "memory"
preview = str(arguments.get("content") or arguments.get("old_text") or "").strip()
text = f"Memory {action} ({target})"
if preview:
text += "\nPreview: " + _truncate_text(preview, limit=500)
content = [_text(text)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name in _POLISHED_TOOLS:
try:
args_text = json.dumps(arguments, indent=2, default=str)
except (TypeError, ValueError):
args_text = str(arguments)
content = [_text(_truncate_text(args_text, limit=1200))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
# Generic fallback
@@ -334,7 +1131,7 @@ def build_tool_start(
content = [acp.tool_content(acp.text_block(args_text))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
raw_input=None if tool_name in _POLISHED_TOOLS else arguments,
)
@@ -347,18 +1144,22 @@ def build_tool_complete(
) -> ToolCallProgress:
"""Create a ToolCallUpdate (progress) event for a completed tool call."""
kind = get_tool_kind(tool_name)
content = _build_tool_complete_content(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
if tool_name == "web_extract":
error_text = _format_web_extract_result(result)
content = [_text(error_text)] if error_text else None
else:
content = _build_tool_complete_content(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
return acp.update_tool_call(
tool_call_id,
kind=kind,
status="completed",
content=content,
raw_output=result,
raw_output=None if tool_name in _POLISHED_TOOLS else result,
)
+15 -1
@@ -1241,10 +1241,24 @@ def convert_tools_to_anthropic(tools: List[Dict]) -> List[Dict]:
if not tools:
return []
result = []
seen_names: set = set()
for t in tools:
fn = t.get("function", {})
name = fn.get("name", "")
# Defensive dedup: Anthropic rejects requests with duplicate tool
# names. Upstream injection paths already dedup, but this guard
# converts a hard API failure into a warning. See: #18478
if name and name in seen_names:
logger.warning(
"convert_tools_to_anthropic: duplicate tool name '%s' "
"— dropping second occurrence",
name,
)
continue
if name:
seen_names.add(name)
result.append({
"name": fn.get("name", ""),
"name": name,
"description": fn.get("description", ""),
"input_schema": _normalize_tool_input_schema(
fn.get("parameters", {"type": "object", "properties": {}})
+89 -15
@@ -259,13 +259,68 @@ _PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
"kimi-coding-cn",
})
# OpenRouter app attribution headers
_OR_HEADERS = {
# OpenRouter app attribution headers (base — always sent)
_OR_HEADERS_BASE = {
"HTTP-Referer": "https://hermes-agent.nousresearch.com",
"X-OpenRouter-Title": "Hermes Agent",
"X-OpenRouter-Categories": "productivity,cli-agent",
}
# Truthy values for boolean env-var parsing.
_TRUTHY_ENV_VALUES = frozenset({"1", "true", "yes", "on"})
def build_or_headers(or_config: dict | None = None) -> dict:
"""Build OpenRouter headers, optionally including response-cache headers.
Precedence for response cache: env var > config.yaml > default (enabled).
Environment variables:
``HERMES_OPENROUTER_CACHE`` — truthy (``1``/``true``/``yes``/``on``)
enables caching; ``0``/``false``/``no``/``off`` disables.
Overrides ``openrouter.response_cache`` in config.yaml.
``HERMES_OPENROUTER_CACHE_TTL`` — integer seconds (1-86400).
Overrides ``openrouter.response_cache_ttl`` in config.yaml.
*or_config* is the ``openrouter`` section from config.yaml. When *None*,
falls back to reading config from disk via ``load_config()``.
"""
headers = dict(_OR_HEADERS_BASE)
# Resolve config from disk if not provided.
if or_config is None:
try:
from hermes_cli.config import load_config
or_config = load_config().get("openrouter", {})
except Exception:
or_config = {}
# Determine cache enabled: env var overrides config.
env_cache = os.environ.get("HERMES_OPENROUTER_CACHE", "").strip().lower()
if env_cache:
cache_enabled = env_cache in _TRUTHY_ENV_VALUES
else:
cache_enabled = or_config.get("response_cache", False)
if not cache_enabled:
return headers
headers["X-OpenRouter-Cache"] = "true"
# Determine TTL: env var overrides config.
env_ttl = os.environ.get("HERMES_OPENROUTER_CACHE_TTL", "").strip()
if env_ttl:
if env_ttl.isdigit():
ttl = int(env_ttl)
if 1 <= ttl <= 86400:
headers["X-OpenRouter-Cache-TTL"] = str(ttl)
else:
ttl = or_config.get("response_cache_ttl", 300)
if isinstance(ttl, (int, float)) and 1 <= ttl <= 86400:
headers["X-OpenRouter-Cache-TTL"] = str(int(ttl))
return headers
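The precedence described in the `build_or_headers` docstring can be checked in isolation. The sketch below is a minimal restatement, not the function from this change: `cache_headers` is a hypothetical name, and the environment is passed in as a plain mapping (an assumption made so the example does not touch `os.environ` or read config from disk). The fallback when `response_cache` is absent follows the code above (`False`).

```python
from typing import Any, Dict, Mapping

TRUTHY = frozenset({"1", "true", "yes", "on"})


def cache_headers(env: Mapping[str, str], or_config: Dict[str, Any]) -> Dict[str, str]:
    """Return only the cache-related headers for the given env and config."""
    # Any non-empty env value decides on its own; config is consulted only
    # when the variable is unset or blank.
    env_cache = env.get("HERMES_OPENROUTER_CACHE", "").strip().lower()
    if env_cache:
        enabled = env_cache in TRUTHY
    else:
        enabled = bool(or_config.get("response_cache", False))
    if not enabled:
        return {}

    headers = {"X-OpenRouter-Cache": "true"}
    env_ttl = env.get("HERMES_OPENROUTER_CACHE_TTL", "").strip()
    if env_ttl:
        # An invalid or out-of-range env TTL is dropped; it does not fall
        # back to the config value.
        if env_ttl.isdigit() and 1 <= int(env_ttl) <= 86400:
            headers["X-OpenRouter-Cache-TTL"] = str(int(env_ttl))
    else:
        ttl = or_config.get("response_cache_ttl", 300)
        if isinstance(ttl, (int, float)) and 1 <= ttl <= 86400:
            headers["X-OpenRouter-Cache-TTL"] = str(int(ttl))
    return headers


env_off = cache_headers({"HERMES_OPENROUTER_CACHE": "0"}, {"response_cache": True})
env_on = cache_headers(
    {"HERMES_OPENROUTER_CACHE": "yes", "HERMES_OPENROUTER_CACHE_TTL": "60"}, {}
)
config_on = cache_headers({}, {"response_cache": True})
bad_ttl = cache_headers(
    {"HERMES_OPENROUTER_CACHE": "1", "HERMES_OPENROUTER_CACHE_TTL": "abc"},
    {"response_cache_ttl": 120},
)
```

One consequence worth noting: a malformed `HERMES_OPENROUTER_CACHE_TTL` leaves caching enabled but sends no TTL header at all, even when config.yaml carries a valid TTL.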
# Vercel AI Gateway app attribution headers. HTTP-Referer maps to
# referrerUrl and X-Title maps to appName in the gateway's analytics.
from hermes_cli import __version__ as _HERMES_VERSION
@@ -1149,23 +1204,23 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
def _try_openrouter(explicit_api_key: Optional[str] = None) -> Tuple[Optional[OpenAI], Optional[str]]:
pool_present, entry = _select_pool_entry("openrouter")
if pool_present:
or_key = _pool_runtime_api_key(entry)
or_key = explicit_api_key or _pool_runtime_api_key(entry)
if not or_key:
return None, None
base_url = _pool_runtime_base_url(entry, OPENROUTER_BASE_URL) or OPENROUTER_BASE_URL
logger.debug("Auxiliary client: OpenRouter via pool")
return OpenAI(api_key=or_key, base_url=base_url,
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
default_headers=build_or_headers()), _OPENROUTER_MODEL
or_key = os.getenv("OPENROUTER_API_KEY")
or_key = explicit_api_key or os.getenv("OPENROUTER_API_KEY")
if not or_key:
return None, None
logger.debug("Auxiliary client: OpenRouter")
return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
default_headers=build_or_headers()), _OPENROUTER_MODEL
def _describe_openrouter_unavailable() -> str:
@@ -1474,7 +1529,7 @@ def _build_codex_client(model: str) -> Tuple[Optional[Any], Optional[str]]:
return CodexAuxiliaryClient(real_client, model), model
def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
def _try_anthropic(explicit_api_key: Optional[str] = None) -> Tuple[Optional[Any], Optional[str]]:
try:
from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token
except ImportError:
@@ -1484,10 +1539,10 @@ def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
if pool_present:
if entry is None:
return None, None
token = _pool_runtime_api_key(entry)
token = explicit_api_key or _pool_runtime_api_key(entry)
else:
entry = None
token = resolve_anthropic_token()
token = explicit_api_key or resolve_anthropic_token()
if not token:
return None, None
@@ -1911,7 +1966,7 @@ def _to_async_client(sync_client, model: str, is_vision: bool = False):
}
sync_base_url = str(sync_client.base_url)
if base_url_host_matches(sync_base_url, "openrouter.ai"):
async_kwargs["default_headers"] = dict(_OR_HEADERS)
async_kwargs["default_headers"] = build_or_headers()
elif base_url_host_matches(sync_base_url, "api.githubcopilot.com"):
from hermes_cli.copilot_auth import copilot_request_headers
@@ -2053,9 +2108,9 @@ def resolve_provider_client(
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
else (client, final_model))
# ── OpenRouter ───────────────────────────────────────────────────
# ── OpenRouter ───────────────────────────────────────────
if provider == "openrouter":
client, default = _try_openrouter()
client, default = _try_openrouter(explicit_api_key=explicit_api_key)
if client is None:
logger.warning(
"resolve_provider_client: openrouter requested but %s",
@@ -2281,7 +2336,7 @@ def resolve_provider_client(
if pconfig.auth_type == "api_key":
if provider == "anthropic":
client, default_model = _try_anthropic()
client, default_model = _try_anthropic(explicit_api_key=explicit_api_key)
if client is None:
logger.warning("resolve_provider_client: anthropic requested but no Anthropic credentials found")
return None, None
@@ -3237,7 +3292,26 @@ def _build_call_kwargs(
kwargs["max_tokens"] = max_tokens
if tools:
kwargs["tools"] = tools
# Defensive dedup: providers like Google Vertex, Azure, and Bedrock
# reject requests with duplicate tool names (HTTP 400). The upstream
# injection paths (run_agent.py) already dedup, but this guard
# converts a hard API failure into a warning if an upstream regression
# reintroduces duplicates. See: #18478
_seen: set = set()
_deduped: list = []
for _t in tools:
_tname = (_t.get("function") or {}).get("name", "")
if _tname and _tname in _seen:
logger.warning(
"_build_call_kwargs: duplicate tool name '%s' removed "
"(provider=%s model=%s)",
_tname, provider, model,
)
continue
if _tname:
_seen.add(_tname)
_deduped.append(_t)
kwargs["tools"] = _deduped
# Provider-specific extra_body
merged_extra = dict(extra_body or {})
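The defensive dedup in the hunk above keeps the first tool for each name and drops later duplicates with a warning. A minimal standalone sketch of that behaviour follows; `dedup_tools_by_name` is a hypothetical helper name, and the tool dicts are assumed to follow the OpenAI-style `{"function": {"name": ...}}` shape used in the diff.

```python
import logging
from typing import Any, Dict, List

logger = logging.getLogger(__name__)


def dedup_tools_by_name(tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first tool for each non-empty name; drop later duplicates."""
    seen: set = set()
    kept: List[Dict[str, Any]] = []
    for tool in tools:
        name = (tool.get("function") or {}).get("name", "")
        if name and name in seen:
            logger.warning("duplicate tool name '%s' dropped", name)
            continue
        if name:
            seen.add(name)
        # Tools without a name are passed through untouched.
        kept.append(tool)
    return kept


tools = [
    {"function": {"name": "read_file", "description": "first"}},
    {"function": {"name": "terminal"}},
    {"function": {"name": "read_file", "description": "second"}},
]
deduped = dedup_tools_by_name(tools)
```

Order is preserved, so the surviving definition is always the one that appeared first in the list.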
+2
@@ -569,6 +569,8 @@ class ContextCompressor(ContextEngine):
# Skip multimodal content (list of content blocks)
if isinstance(content, list):
continue
if not isinstance(content, str):
continue
if len(content) < 200:
continue
h = hashlib.md5(content.encode("utf-8", errors="replace")).hexdigest()[:12]
+17 -6
@@ -3,6 +3,7 @@
from __future__ import annotations
import logging
import os
import random
import threading
import time
@@ -13,7 +14,7 @@ from datetime import datetime
from typing import Any, Dict, List, Optional, Set, Tuple
from hermes_constants import OPENROUTER_BASE_URL
from hermes_cli.config import get_env_value
from hermes_cli.config import get_env_value, load_env
import hermes_cli.auth as auth_mod
from hermes_cli.auth import (
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
@@ -1380,6 +1381,16 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool, Set[str]]:
changed = False
active_sources: Set[str] = set()
# Prefer ~/.hermes/.env over os.environ — the user's config file is the
# authoritative source for Hermes credentials. Stale env vars from parent
# processes (Codex CLI, test scripts, etc.) should not override deliberate
# changes to the .env file.
def _get_env_prefer_dotenv(key: str) -> str:
env_file = load_env()
val = env_file.get(key) or os.environ.get(key) or ""
return val.strip()
# Honour user suppression — `hermes auth remove <provider> <N>` for an
# env-seeded credential marks the env:<VAR> source as suppressed so it
# won't be re-seeded from the user's shell environment or ~/.hermes/.env.
@@ -1391,8 +1402,8 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
def _is_source_suppressed(_p, _s): # type: ignore[misc]
return False
if provider == "openrouter":
# Check both os.environ and ~/.hermes/.env file
token = (get_env_value("OPENROUTER_API_KEY") or "").strip()
# Prefer ~/.hermes/.env over os.environ
token = _get_env_prefer_dotenv("OPENROUTER_API_KEY")
if token:
source = "env:OPENROUTER_API_KEY"
if _is_source_suppressed(provider, source):
@@ -1418,7 +1429,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
env_url = ""
if pconfig.base_url_env_var:
env_url = (get_env_value(pconfig.base_url_env_var) or "").strip().rstrip("/")
env_url = _get_env_prefer_dotenv(pconfig.base_url_env_var).rstrip("/")
env_vars = list(pconfig.api_key_env_vars)
if provider == "anthropic":
@@ -1429,8 +1440,8 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
]
for env_var in env_vars:
# Check both os.environ and ~/.hermes/.env file
token = (get_env_value(env_var) or "").strip()
# Prefer ~/.hermes/.env over os.environ
token = _get_env_prefer_dotenv(env_var)
if not token:
continue
source = f"env:{env_var}"
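The lookup order introduced by `_get_env_prefer_dotenv` can be illustrated without touching the real environment. In this sketch `prefer_dotenv` is a hypothetical stand-in that takes both sources as plain mappings (an assumption made for testability; the real helper calls `load_env()` and reads `os.environ`).

```python
from typing import Mapping


def prefer_dotenv(key: str, dotenv: Mapping[str, str], environ: Mapping[str, str]) -> str:
    """Return the .env value when it is set and non-empty, else the process value."""
    return (dotenv.get(key) or environ.get(key) or "").strip()


# A stale shell export loses to the value in the .env file.
from_file = prefer_dotenv("OPENROUTER_API_KEY", {"OPENROUTER_API_KEY": "new"}, {"OPENROUTER_API_KEY": "stale"})

# With no .env entry, the process environment is used (and stripped).
from_shell = prefer_dotenv("OPENROUTER_API_KEY", {}, {"OPENROUTER_API_KEY": " shell "})

# A whitespace-only .env entry is truthy, so it wins and then strips to "".
blank_file = prefer_dotenv("OPENROUTER_API_KEY", {"OPENROUTER_API_KEY": "   "}, {"OPENROUTER_API_KEY": "shell"})
```

The last case follows from applying `.strip()` after the `or` chain: a blank-but-present entry in the file masks the shell value instead of falling through to it.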
+218 -36
@@ -24,11 +24,12 @@ from __future__ import annotations
import json
import logging
import os
import re
import tempfile
import threading
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any, Callable, Dict, List, Optional, Set
from typing import Any, Callable, Dict, List, NamedTuple, Optional, Set
from hermes_constants import get_hermes_home
from tools import skill_usage
@@ -36,6 +37,22 @@ from tools import skill_usage
logger = logging.getLogger(__name__)
def _strip_aux_credential(value: Any) -> Optional[str]:
if value is None:
return None
text = str(value).strip()
return text or None
class _ReviewRuntimeBinding(NamedTuple):
"""Provider/model for the curator review fork plus optional per-slot overrides."""
provider: str
model: str
explicit_api_key: Optional[str]
explicit_base_url: Optional[str]
DEFAULT_INTERVAL_HOURS = 24 * 7 # 7 days
DEFAULT_MIN_IDLE_HOURS = 2
DEFAULT_STALE_AFTER_DAYS = 30
@@ -387,6 +404,11 @@ CURATOR_REVIEW_PROMPT = (
" - skill_manage action=write_file — add a references/, templates/, "
"or scripts/ file under an existing skill (the skill must already "
"exist)\n"
" - skill_manage action=delete — archive a skill. MUST pass "
"`absorbed_into=<umbrella>` when you've merged its content into another "
"skill, or `absorbed_into=\"\"` when you're truly pruning with no "
"forwarding target. This drives cron-job skill-reference migration — "
"guessing from your YAML summary after the fact is fragile.\n"
" - terminal — mv a sibling into the archive "
"OR move its content into a support subfile\n\n"
"'keep' is a legitimate decision ONLY when the skill is already a "
@@ -448,6 +470,24 @@ def _reports_root() -> Path:
return root
def _needle_in_path_component(needle: str, path: str) -> bool:
"""Check if *needle* is a complete filename stem or directory name in *path*.
Unlike simple substring matching, this avoids false positives where short
skill names are embedded in longer filenames (e.g. "api" matching
"references/api-design.md"). Hyphens and underscores are normalised so
"open-webui-setup" matches "open_webui_setup.md".
"""
norm_needle = needle.replace("-", "_")
for part in path.replace("\\", "/").split("/"):
if not part:
continue
stem = part.rsplit(".", 1)[0] if "." in part else part
if stem.replace("-", "_") == norm_needle:
return True
return False
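The path-component check above, together with the word-boundary check this change applies to content fields in `_classify_removed_skills`, can be tried on a few inputs. The snippet restates both checks under hypothetical public names so that it runs on its own; the sample paths and strings are invented for illustration.

```python
import re


def needle_in_path_component(needle: str, path: str) -> bool:
    """True when needle equals a whole directory name or filename stem."""
    norm = needle.replace("-", "_")
    for part in path.replace("\\", "/").split("/"):
        if not part:
            continue
        stem = part.rsplit(".", 1)[0] if "." in part else part
        if stem.replace("-", "_") == norm:
            return True
    return False


def needle_in_content(needle: str, text: str) -> bool:
    """True when needle appears as a whole word in free text."""
    return bool(re.search(rf"\b{re.escape(needle)}\b", text))


path_false = needle_in_path_component("api", "references/api-design.md")
path_true = needle_in_path_component("open-webui-setup", "references/open_webui_setup.md")
word_false = needle_in_content("test", "the latest testing notes")
word_true = needle_in_content("test", "run the test suite")
hyphen_case = needle_in_content("api", "see api-design for details")
```

The two checks differ on hyphenated names: `\b` treats a hyphen as a boundary, so `"api"` still matches `"api-design"` in content fields, while the path check rejects it because the whole stem must match.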
def _classify_removed_skills(
removed: List[str],
added: List[str],
@@ -526,15 +566,29 @@ def _classify_removed_skills(
continue
# Look for the removed skill's name in file_path / content / raw.
haystacks: List[str] = []
# Matching strategy differs by field type:
# file_path — needle must be a complete path component
# (filename stem or directory name), so "api" does NOT
# falsely match "references/api-design.md".
# content fields — word-boundary regex so "test" does NOT
# falsely match "latest" or "testing".
haystacks: List[tuple[str, str]] = []
for key in ("file_path", "file_content", "content", "new_string", "_raw"):
v = args.get(key)
if isinstance(v, str):
haystacks.append(v)
haystacks.append((key, v))
hit = False
for hay in haystacks:
for key, hay in haystacks:
for needle in needles:
if needle and needle in hay:
if not needle:
continue
if key == "file_path":
matched = _needle_in_path_component(needle, hay)
else:
matched = bool(
re.search(rf'\b{re.escape(needle)}\b', hay)
)
if matched:
hit = True
evidence = (
f"skill_manage action={args.get('action', '?')} "
@@ -637,15 +691,76 @@ def _parse_structured_summary(
return out
def _extract_absorbed_into_declarations(
tool_calls: List[Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
"""Walk this run's tool calls and extract model-declared absorption targets.
The curator prompt requires every ``skill_manage(action='delete')`` call
to pass ``absorbed_into=<umbrella>`` when consolidating, or
``absorbed_into=""`` when truly pruning. This is the single authoritative
signal for classification — the model's own declaration at the moment of
deletion, which beats both post-hoc YAML summary parsing and substring
heuristics on other tool calls.
Returns ``{skill_name: {"into": "<umbrella>" | "", "declared": True}}``.
Entries with ``into == ""`` are explicit prunings.
Skills without a ``skill_manage(delete)`` call, or with one that omitted
``absorbed_into``, are not in the returned dict — caller falls back to
the existing heuristic/YAML logic for those (backward compat with older
curator runs and any callers that don't populate the arg).
"""
out: Dict[str, Dict[str, Any]] = {}
for tc in tool_calls or []:
if not isinstance(tc, dict):
continue
if tc.get("name") != "skill_manage":
continue
raw = tc.get("arguments") or ""
args: Dict[str, Any] = {}
if isinstance(raw, dict):
args = raw
elif isinstance(raw, str):
try:
args = json.loads(raw)
except Exception:
continue
if not isinstance(args, dict):
continue
if args.get("action") != "delete":
continue
name = args.get("name")
if not isinstance(name, str) or not name.strip():
continue
# absorbed_into must be present (even empty string is meaningful);
# missing key means the model didn't declare intent.
if "absorbed_into" not in args:
continue
target = args.get("absorbed_into")
if target is None:
continue
if not isinstance(target, str):
continue
out[name.strip()] = {"into": target.strip(), "declared": True}
return out
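To make the input and output shapes concrete, here is a condensed restatement of the extraction logic with sample tool calls. `extract_declarations` is a hypothetical name and the skill names are invented; the filtering rules are intended to match the function above (only `skill_manage` delete calls with a string `absorbed_into` are kept).

```python
import json
from typing import Any, Dict, List


def extract_declarations(tool_calls: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    out: Dict[str, Dict[str, Any]] = {}
    for tc in tool_calls:
        if not isinstance(tc, dict) or tc.get("name") != "skill_manage":
            continue
        args = tc.get("arguments") or {}
        if isinstance(args, str):
            try:
                args = json.loads(args)
            except ValueError:  # json.JSONDecodeError is a ValueError subclass
                continue
        if not isinstance(args, dict) or args.get("action") != "delete":
            continue
        name, target = args.get("name"), args.get("absorbed_into")
        if not isinstance(name, str) or not name.strip():
            continue
        if not isinstance(target, str):
            # Missing key, None, or a non-string value: no declaration made.
            continue
        out[name.strip()] = {"into": target.strip(), "declared": True}
    return out


calls = [
    # Arguments as a dict: consolidated into an umbrella skill.
    {"name": "skill_manage", "arguments": {"action": "delete", "name": "docker-tips", "absorbed_into": "docker"}},
    # Arguments as a JSON string: explicit prune.
    {"name": "skill_manage", "arguments": '{"action": "delete", "name": "old-notes", "absorbed_into": ""}'},
    # No absorbed_into key: left to the heuristic/YAML fallback.
    {"name": "skill_manage", "arguments": {"action": "delete", "name": "legacy"}},
    # Unrelated tool: ignored.
    {"name": "terminal", "arguments": {"command": "ls"}},
]
declared = extract_declarations(calls)
```

Only the first two calls produce entries; the third is absent from the result, which is what signals the caller to fall back to the older signals for that skill.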
def _reconcile_classification(
removed: List[str],
heuristic: Dict[str, List[Dict[str, Any]]],
model_block: Dict[str, List[Dict[str, str]]],
destinations: Set[str],
absorbed_declarations: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Dict[str, List[Dict[str, Any]]]:
"""Merge heuristic (tool-call evidence) with the model's structured block.
Rules:
Rules (evaluated in order; first match wins):
- **Model-declared `absorbed_into` at delete time is authoritative.** Any
entry in ``absorbed_declarations`` beats every other signal. This is
the model telling us directly, at the moment of deletion, what it did.
``into != ""`` and target exists → consolidated. ``into == ""`` →
pruned. ``into != ""`` but target doesn't exist → hallucination; fall
through to the usual signals.
- Model-declared consolidation wins when its ``into`` target exists
in ``destinations`` (survived or newly-created). This gives the
model authority over intent + rationale.
@@ -666,6 +781,8 @@ def _reconcile_classification(
model_cons = {e["from"]: e for e in model_block.get("consolidations", [])}
model_pruned = {e["name"]: e for e in model_block.get("prunings", [])}
declared = absorbed_declarations or {}
consolidated: List[Dict[str, Any]] = []
pruned: List[Dict[str, Any]] = []
@@ -673,6 +790,36 @@ def _reconcile_classification(
mc = model_cons.get(name)
mp = model_pruned.get(name)
hc = heur_cons.get(name)
dec = declared.get(name)
# Authoritative: model declared `absorbed_into` at the delete call.
if dec is not None:
into_claim = dec.get("into", "")
if into_claim and into_claim in destinations:
entry: Dict[str, Any] = {
"name": name,
"into": into_claim,
"source": "absorbed_into (model-declared at delete)",
"reason": (mc.get("reason") or "") if mc else "",
}
if hc and hc.get("evidence"):
entry["evidence"] = hc["evidence"]
consolidated.append(entry)
continue
if into_claim == "":
# Explicit prune declaration
pruned.append({
"name": name,
"source": "absorbed_into=\"\" (model-declared prune)",
"reason": (mp.get("reason") or "") if mp else "",
})
continue
# into_claim is non-empty but target doesn't exist: the model
# named a nonexistent umbrella at delete time. The tool already
# rejects this at the skill_manage layer, so we shouldn't see it
# in practice — but if it slips through (e.g. the umbrella was
# deleted LATER in the same run), fall through to the usual
# signals rather than trusting a broken reference.
# Model says consolidated — trust it if the destination is real.
if mc and mc.get("into") in destinations:
@@ -808,11 +955,20 @@ def _write_run_report(
)
model_block = _parse_structured_summary(llm_meta.get("final", "") or "")
destinations = set(after_names) | set(added or [])
# Authoritative signal: extract per-delete `absorbed_into` declarations
# from this run's tool calls. These beat both the YAML summary block and
# the substring heuristic — the model is telling us directly, at the
# moment of deletion, whether each archived skill was consolidated
# (into=<umbrella>) or pruned (into="").
absorbed_declarations = _extract_absorbed_into_declarations(
llm_meta.get("tool_calls", []) or []
)
classification = _reconcile_classification(
removed=removed,
heuristic=heuristic,
model_block=model_block,
destinations=destinations,
absorbed_declarations=absorbed_declarations,
)
consolidated = classification["consolidated"]
pruned = classification["pruned"]
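The precedence in the two hunks above (a per-delete `absorbed_into` declaration outranks the summary block and the heuristic, and a dangling target falls through to weaker signals) can be sketched in isolation. This is a simplified illustration under stated assumptions, not the project's `_reconcile_classification`: the function `classify_removed`, its argument shapes, and the returned strings are hypothetical.

```python
from typing import Dict, Optional, Set


def classify_removed(
    name: str,
    declared: Dict[str, Dict[str, str]],
    model_consolidated: Dict[str, str],
    destinations: Set[str],
) -> str:
    """Return "consolidated:<target>", "pruned", or "unknown" for one removed skill.

    Hypothetical helper: `declared` maps a skill name to {"into": ...} as
    declared at delete time; `model_consolidated` maps a skill name to the
    target named in the model's summary block.
    """
    dec: Optional[Dict[str, str]] = declared.get(name)
    if dec is not None:
        into = dec.get("into", "")
        if into and into in destinations:
            # Strongest signal: declared at delete time, target exists.
            return f"consolidated:{into}"
        if into == "":
            # Explicit prune declaration.
            return "pruned"
        # Non-empty but dangling target: do not trust it, fall through.
    target = model_consolidated.get(name)
    if target and target in destinations:
        return f"consolidated:{target}"
    return "unknown"
```

Keeping the dangling-reference case as a fall-through rather than an error means a broken declaration degrades to the older signals instead of dropping the skill from the report.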
@@ -1291,6 +1447,52 @@ def run_curator_review(
}
def _resolve_review_runtime(cfg: Dict[str, Any]) -> _ReviewRuntimeBinding:
"""Resolve provider/model and per-slot credentials for the curator review fork.
Same precedence as `_resolve_review_model()`. Non-empty ``api_key`` /
``base_url`` from the active slot are returned as explicit overrides so
``resolve_runtime_provider`` does not silently reuse the main chat
credential chain for a routed auxiliary model.
"""
_main = cfg.get("model", {}) if isinstance(cfg.get("model"), dict) else {}
_main_provider = _main.get("provider") or "auto"
_main_model = _main.get("default") or _main.get("model") or ""
# 1. Canonical aux task slot
_aux = cfg.get("auxiliary", {}) if isinstance(cfg.get("auxiliary"), dict) else {}
_cur_task = _aux.get("curator", {}) if isinstance(_aux.get("curator"), dict) else {}
_task_provider = (_cur_task.get("provider") or "").strip() or None
_task_model = (_cur_task.get("model") or "").strip() or None
if _task_provider and _task_provider != "auto" and _task_model:
return _ReviewRuntimeBinding(
_task_provider,
_task_model,
_strip_aux_credential(_cur_task.get("api_key")),
_strip_aux_credential(_cur_task.get("base_url")),
)
# 2. Legacy curator.auxiliary.{provider,model} (deprecated, pre-unification)
_cur = cfg.get("curator", {}) if isinstance(cfg.get("curator"), dict) else {}
_legacy = _cur.get("auxiliary", {}) if isinstance(_cur.get("auxiliary"), dict) else {}
_legacy_provider = _legacy.get("provider") or None
_legacy_model = _legacy.get("model") or None
if _legacy_provider and _legacy_model:
logger.info(
"curator: using deprecated curator.auxiliary.{provider,model} "
"config — please migrate to auxiliary.curator.{provider,model}"
)
return _ReviewRuntimeBinding(
str(_legacy_provider),
str(_legacy_model),
_strip_aux_credential(_legacy.get("api_key")),
_strip_aux_credential(_legacy.get("base_url")),
)
# 3. Fall through to the main chat model
return _ReviewRuntimeBinding(_main_provider, _main_model, None, None)
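The three-step precedence in `_resolve_review_runtime` can be exercised with a small stand-alone sketch. The names `Binding`, `_as_dict`, `_clean`, and `resolve_binding` are hypothetical; in particular `_clean` only assumes that `_strip_aux_credential` turns blank values into `None`, which the diff does not show.

```python
from typing import Any, Dict, NamedTuple, Optional


class Binding(NamedTuple):
    provider: str
    model: str
    api_key: Optional[str]
    base_url: Optional[str]


def _as_dict(value: Any) -> Dict[str, Any]:
    """Treat anything that is not a dict as an empty section."""
    return value if isinstance(value, dict) else {}


def _clean(value: Any) -> Optional[str]:
    """Assumed stand-in for `_strip_aux_credential`: blank or missing -> None."""
    text = str(value).strip() if value is not None else ""
    return text or None


def resolve_binding(cfg: Dict[str, Any]) -> Binding:
    main = _as_dict(cfg.get("model"))
    # 1. Canonical slot: auxiliary.curator, ignored when provider is "auto".
    slot = _as_dict(_as_dict(cfg.get("auxiliary")).get("curator"))
    provider, model = _clean(slot.get("provider")), _clean(slot.get("model"))
    if provider and provider != "auto" and model:
        return Binding(provider, model,
                       _clean(slot.get("api_key")), _clean(slot.get("base_url")))
    # 2. Legacy slot: curator.auxiliary, used only when both fields are set.
    legacy = _as_dict(_as_dict(cfg.get("curator")).get("auxiliary"))
    if legacy.get("provider") and legacy.get("model"):
        return Binding(str(legacy["provider"]), str(legacy["model"]),
                       _clean(legacy.get("api_key")), _clean(legacy.get("base_url")))
    # 3. Main chat model, with no explicit credential overrides.
    return Binding(main.get("provider") or "auto",
                   main.get("default") or main.get("model") or "", None, None)
```

Returning the credentials together with the provider/model pair is what lets the caller pass them on as explicit overrides instead of re-deriving them from the main chat configuration.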
def _resolve_review_model(cfg: Dict[str, Any]) -> tuple[str, str]:
"""Pick (provider, model) for the curator review fork.
@@ -1306,32 +1508,8 @@ def _resolve_review_model(cfg: Dict[str, Any]) -> tuple[str, str]:
2. Legacy ``curator.auxiliary.{provider,model}`` when both are set
3. Main ``model.{provider,default/model}`` pair
"""
_main = cfg.get("model", {}) if isinstance(cfg.get("model"), dict) else {}
_main_provider = _main.get("provider") or "auto"
_main_model = _main.get("default") or _main.get("model") or ""
# 1. Canonical aux task slot
_aux = cfg.get("auxiliary", {}) if isinstance(cfg.get("auxiliary"), dict) else {}
_cur_task = _aux.get("curator", {}) if isinstance(_aux.get("curator"), dict) else {}
_task_provider = (_cur_task.get("provider") or "").strip() or None
_task_model = (_cur_task.get("model") or "").strip() or None
if _task_provider and _task_provider != "auto" and _task_model:
return _task_provider, _task_model
# 2. Legacy curator.auxiliary.{provider,model} (deprecated, pre-unification)
_cur = cfg.get("curator", {}) if isinstance(cfg.get("curator"), dict) else {}
_legacy = _cur.get("auxiliary", {}) if isinstance(_cur.get("auxiliary"), dict) else {}
_legacy_provider = _legacy.get("provider") or None
_legacy_model = _legacy.get("model") or None
if _legacy_provider and _legacy_model:
logger.info(
"curator: using deprecated curator.auxiliary.{provider,model} "
"config — please migrate to auxiliary.curator.{provider,model}"
)
return _legacy_provider, _legacy_model
# 3. Fall through to the main chat model
return _main_provider, _main_model
b = _resolve_review_runtime(cfg)
return b.provider, b.model
def _run_llm_review(prompt: str) -> Dict[str, Any]:
@@ -1370,10 +1548,10 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
# arguments hits an auto-resolution path that fails for OAuth-only
# providers and for pool-backed credentials.
#
# `_resolve_review_model()` honors `auxiliary.curator.{provider,model}`
# `_resolve_review_runtime()` honors `auxiliary.curator.{provider,model,...}`
# (canonical aux-task slot, wired through `hermes model` → auxiliary
# picker and the dashboard Models tab), with a legacy fallback to
# `curator.auxiliary.{provider,model}`. See docs/user-guide/features/curator.md.
# `curator.auxiliary.{provider,model,...}`. See docs/user-guide/features/curator.md.
_api_key = None
_base_url = None
_api_mode = None
@@ -1383,9 +1561,13 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
from hermes_cli.config import load_config
from hermes_cli.runtime_provider import resolve_runtime_provider
_cfg = load_config()
_provider, _model_name = _resolve_review_model(_cfg)
_binding = _resolve_review_runtime(_cfg)
_provider, _model_name = _binding.provider, _binding.model
_rp = resolve_runtime_provider(
requested=_provider, target_model=_model_name
requested=_provider,
target_model=_model_name,
explicit_api_key=_binding.explicit_api_key,
explicit_base_url=_binding.explicit_base_url,
)
_api_key = _rp.get("api_key")
_base_url = _rp.get("base_url")
+257 -4
@@ -21,6 +21,18 @@ It DOES include:
pointer; otherwise the curator would immediately re-fire on the next
tick)
- ``.bundled_manifest`` (so protection markers stay consistent)
Alongside the skills tarball, each snapshot also captures a copy of
``~/.hermes/cron/jobs.json`` as ``cron-jobs.json`` when it exists. Cron
jobs reference skills by name in their ``skills``/``skill`` fields; the
curator's consolidation pass rewrites those in place via
``cron.jobs.rewrite_skill_refs()``. Without capturing the pre-run state,
rolling back the skills tree would leave cron jobs pointing at the
umbrella skills even though the narrow skills they were originally
configured with have been restored. We store the whole jobs.json for
fidelity, but rollback only touches the ``skills``/``skill`` fields; the
rest (schedule, next_run_at, enabled, prompt, etc.) is live state and
we leave it alone.
"""
from __future__ import annotations
@@ -63,6 +75,60 @@ def _skills_dir() -> Path:
return get_hermes_home() / "skills"
def _cron_jobs_file() -> Path:
"""Source path for the live cron jobs store (``~/.hermes/cron/jobs.json``)."""
return get_hermes_home() / "cron" / "jobs.json"
CRON_JOBS_FILENAME = "cron-jobs.json"
def _backup_cron_jobs_into(dest: Path) -> Dict[str, Any]:
"""Copy the live cron jobs.json into ``dest`` as ``cron-jobs.json``.
Returns a small dict describing what was captured so the caller can
fold it into the manifest. Never raises: if the cron file is missing
or unreadable, the return dict has ``backed_up=False`` and the reason,
and the snapshot proceeds without cron data (the snapshot is still
useful for rolling back skills).
"""
src = _cron_jobs_file()
info: Dict[str, Any] = {"backed_up": False, "jobs_count": 0}
if not src.exists():
info["reason"] = "no cron/jobs.json present"
return info
try:
raw = src.read_text(encoding="utf-8")
except OSError as e:
logger.debug("Failed to read cron/jobs.json for backup: %s", e)
info["reason"] = f"read error: {e}"
return info
# Count jobs as a nice diagnostic — but don't fail the snapshot if the
# file is unparseable; just store the raw text and let rollback deal
# with it (or not, if it's corrupted). jobs.json wraps the list as
# `{"jobs": [...], "updated_at": ...}` — we count via that shape, and
# fall back to bare-list shape just in case the format ever changes.
try:
parsed = json.loads(raw)
if isinstance(parsed, dict):
inner = parsed.get("jobs")
if isinstance(inner, list):
info["jobs_count"] = len(inner)
elif isinstance(parsed, list):
info["jobs_count"] = len(parsed)
except (json.JSONDecodeError, TypeError):
info["jobs_count"] = 0
info["parse_warning"] = "jobs.json was not valid JSON at snapshot time"
try:
(dest / CRON_JOBS_FILENAME).write_text(raw, encoding="utf-8")
except OSError as e:
logger.debug("Failed to write cron backup file: %s", e)
info["reason"] = f"write error: {e}"
return info
info["backed_up"] = True
return info
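The tolerant job count above accepts both the wrapped `{"jobs": [...]}` shape and a bare list, and records a warning instead of failing on bad JSON. A minimal sketch of just that parsing step, assuming a hypothetical helper name `count_jobs` and a shortened warning string:

```python
import json
from typing import Any, Dict


def count_jobs(raw: str) -> Dict[str, Any]:
    """Count jobs in a jobs.json payload without ever raising."""
    info: Dict[str, Any] = {"jobs_count": 0}
    try:
        parsed = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        # Unparseable input is recorded, not fatal: the raw text can
        # still be stored in the snapshot.
        info["parse_warning"] = "not valid JSON"
        return info
    if isinstance(parsed, dict) and isinstance(parsed.get("jobs"), list):
        info["jobs_count"] = len(parsed["jobs"])  # wrapped shape
    elif isinstance(parsed, list):
        info["jobs_count"] = len(parsed)  # bare-list shape
    return info
```

Any other JSON value (a string, a number, a dict without a `jobs` list) simply leaves the count at zero.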
def _utc_id(now: Optional[datetime] = None) -> str:
"""UTC ISO-ish filesystem-safe timestamp: ``2026-05-01T13-05-42Z``."""
if now is None:
@@ -116,7 +182,8 @@ def _count_skill_files(base: Path) -> int:
def _write_manifest(dest: Path, reason: str, archive_path: Path,
skills_counted: int) -> None:
skills_counted: int,
cron_info: Optional[Dict[str, Any]] = None) -> None:
manifest = {
"id": dest.name,
"reason": reason,
@@ -125,6 +192,15 @@ def _write_manifest(dest: Path, reason: str, archive_path: Path,
"archive_bytes": archive_path.stat().st_size,
"skill_files": skills_counted,
}
if cron_info is not None:
manifest["cron_jobs"] = {
"backed_up": bool(cron_info.get("backed_up", False)),
"jobs_count": int(cron_info.get("jobs_count", 0)),
}
if not cron_info.get("backed_up"):
manifest["cron_jobs"]["reason"] = cron_info.get("reason", "not captured")
if cron_info.get("parse_warning"):
manifest["cron_jobs"]["parse_warning"] = cron_info["parse_warning"]
(dest / "manifest.json").write_text(
json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
)
@@ -181,7 +257,14 @@ def snapshot_skills(reason: str = "manual") -> Optional[Path]:
# arcname: store paths relative to skills/ so extraction
# drops cleanly back into the skills dir.
tf.add(str(entry), arcname=entry.name, recursive=True)
_write_manifest(dest, reason, archive, _count_skill_files(skills))
# Capture cron/jobs.json alongside the tarball. Never fails the
# snapshot — the skills side is the core guarantee; cron is
# additive. We still record in the manifest whether it was
# captured so rollback can surface "no cron data in this snapshot".
cron_info = _backup_cron_jobs_into(dest)
_write_manifest(dest, reason, archive,
_count_skill_files(skills),
cron_info=cron_info)
except (OSError, tarfile.TarError) as e:
logger.debug("Curator snapshot failed: %s", e, exc_info=True)
# Clean up partial snapshot
@@ -298,6 +381,149 @@ def _resolve_backup(backup_id: Optional[str]) -> Optional[Path]:
return candidates[0] if candidates else None
def _restore_cron_skill_links(snapshot_dir: Path) -> Dict[str, Any]:
"""Reconcile backed-up cron skill links into the live ``cron/jobs.json``.
We do NOT overwrite the whole cron file. Only the ``skills`` and
``skill`` fields are restored, and only on jobs that still exist in the
current file (matched by ``id``). Everything else about the job
(schedule, next_run_at, last_run_at, enabled, prompt, workdir, hooks)
is live state that the user/scheduler may have modified since the snapshot;
overwriting it would regress unrelated cron activity.
Rules:
- Jobs present in backup AND live, with differing skills → skills restored.
- Jobs present in backup AND live, with matching skills → no-op.
- Jobs present in backup but gone from live (user deleted the job
after the snapshot) → skipped, noted in the return report.
- Jobs present in live but not in backup (user created a new cron
job after the snapshot) → left untouched.
Never raises; failures are captured in the return dict. Writes through
``cron.jobs`` to pick up the same lock + atomic-write path that tick()
uses, so we don't race the scheduler.
"""
report: Dict[str, Any] = {
"attempted": False,
"restored": [],
"skipped_missing": [],
"unchanged": 0,
"error": None,
}
backup_file = snapshot_dir / CRON_JOBS_FILENAME
if not backup_file.exists():
report["error"] = f"snapshot has no {CRON_JOBS_FILENAME}"
return report
try:
backup_text = backup_file.read_text(encoding="utf-8")
backup_parsed = json.loads(backup_text)
except (OSError, json.JSONDecodeError) as e:
report["error"] = f"failed to load backed-up jobs: {e}"
return report
# jobs.json on disk is `{"jobs": [...], "updated_at": ...}`; accept both
# that shape and a bare list for forward compat.
if isinstance(backup_parsed, dict):
backup_jobs = backup_parsed.get("jobs")
elif isinstance(backup_parsed, list):
backup_jobs = backup_parsed
else:
backup_jobs = None
if not isinstance(backup_jobs, list):
report["error"] = "backed-up cron-jobs.json has no jobs list"
return report
# Build a lookup of the backed-up skill state keyed by job id.
# We only need the two skill-ish fields (legacy single and modern list).
backup_by_id: Dict[str, Dict[str, Any]] = {}
for job in backup_jobs:
if not isinstance(job, dict):
continue
jid = job.get("id")
if not isinstance(jid, str) or not jid:
continue
backup_by_id[jid] = {
"skills": job.get("skills"),
"skill": job.get("skill"),
"name": job.get("name") or jid,
}
if not backup_by_id:
report["attempted"] = True # we tried but there was nothing to do
return report
# Load and rewrite the live jobs under the scheduler's lock.
try:
from cron.jobs import load_jobs, save_jobs, _jobs_file_lock
except ImportError as e:
report["error"] = f"cron module unavailable: {e}"
return report
report["attempted"] = True
try:
with _jobs_file_lock:
live_jobs = load_jobs()
changed = False
live_ids = set()
for live in live_jobs:
if not isinstance(live, dict):
continue
jid = live.get("id")
if not isinstance(jid, str) or not jid:
continue
live_ids.add(jid)
backup = backup_by_id.get(jid)
if backup is None:
continue # live job didn't exist at snapshot time
cur_skills = live.get("skills")
cur_skill = live.get("skill")
bkp_skills = backup.get("skills")
bkp_skill = backup.get("skill")
if cur_skills == bkp_skills and cur_skill == bkp_skill:
report["unchanged"] += 1
continue
# Restore. Preserve absence (don't force the key to appear
# if the backup didn't have it either).
if bkp_skills is None:
live.pop("skills", None)
else:
live["skills"] = bkp_skills
if bkp_skill is None:
live.pop("skill", None)
else:
live["skill"] = bkp_skill
report["restored"].append({
"job_id": jid,
"job_name": backup.get("name") or jid,
"from": {"skills": cur_skills, "skill": cur_skill},
"to": {"skills": bkp_skills, "skill": bkp_skill},
})
changed = True
# Jobs in backup but not in live = user deleted them after snapshot
for jid, backup in backup_by_id.items():
if jid not in live_ids:
report["skipped_missing"].append({
"job_id": jid,
"job_name": backup.get("name") or jid,
})
if changed:
save_jobs(live_jobs)
except Exception as e: # noqa: BLE001 — rollback must not die mid-restore
logger.debug("Cron skill-link restore failed: %s", e, exc_info=True)
report["error"] = f"restore failed mid-flight: {e}"
return report
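The four reconciliation rules can be checked on plain in-memory lists, without the file lock or `cron.jobs` imports. The sketch below is an illustration only: `reconcile_skill_links` is a hypothetical name, and the report is reduced to job ids.

```python
from typing import Any, Dict, List

SKILL_KEYS = ("skills", "skill")


def reconcile_skill_links(
    backup_jobs: List[Dict[str, Any]], live_jobs: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """Restore only the skill fields of live jobs from a backup (mutates live_jobs)."""
    report: Dict[str, Any] = {"restored": [], "skipped_missing": [], "unchanged": 0}
    backup_by_id = {
        j["id"]: j for j in backup_jobs if isinstance(j, dict) and j.get("id")
    }
    live_ids = set()
    for live in live_jobs:
        jid = live.get("id")
        live_ids.add(jid)
        backup = backup_by_id.get(jid)
        if backup is None:
            continue  # created after the snapshot: left untouched
        if all(live.get(k) == backup.get(k) for k in SKILL_KEYS):
            report["unchanged"] += 1
            continue
        for key in SKILL_KEYS:
            if backup.get(key) is None:
                live.pop(key, None)  # preserve absence
            else:
                live[key] = backup[key]
        report["restored"].append(jid)
    # In the backup but no longer live: deleted after the snapshot.
    report["skipped_missing"] = [j for j in backup_by_id if j not in live_ids]
    return report
```

Fields such as `enabled` or `schedule` are never read or written here, which is the property the docstring calls surgical.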
def rollback(backup_id: Optional[str] = None) -> Tuple[bool, str, Optional[Path]]:
"""Restore ``~/.hermes/skills/`` from a snapshot.
@@ -408,8 +634,35 @@ def rollback(backup_id: Optional[str] = None) -> Tuple[bool, str, Optional[Path]
except OSError:
pass
logger.info("Curator rollback: restored from %s", target.name)
return (True, f"restored from snapshot {target.name}", target)
# Reconcile cron skill-links. Surgical: only the skills/skill fields
# on jobs matched by id. Everything else in jobs.json is live state
# (schedule, next_run_at, enabled, prompt, etc.) and we leave it
# alone. Failures here don't fail the overall rollback — the skills
# tree is already restored, which is the main guarantee.
cron_report = _restore_cron_skill_links(target)
summary_bits = [f"restored from snapshot {target.name}"]
if cron_report.get("attempted"):
restored_n = len(cron_report.get("restored") or [])
skipped_n = len(cron_report.get("skipped_missing") or [])
if cron_report.get("error"):
summary_bits.append(f"cron links: error — {cron_report['error']}")
elif restored_n == 0 and skipped_n == 0 and cron_report.get("unchanged", 0) == 0:
# Attempted but nothing matched — empty snapshot or no overlapping ids.
pass
else:
parts = []
if restored_n:
parts.append(f"{restored_n} job(s) had skill links restored")
if skipped_n:
parts.append(f"{skipped_n} backed-up job(s) no longer exist (skipped)")
if cron_report.get("unchanged"):
parts.append(f"{cron_report['unchanged']} already matched")
summary_bits.append("cron links: " + ", ".join(parts))
logger.info("Curator rollback: restored from %s (cron_report=%s)",
target.name, cron_report)
return (True, "; ".join(summary_bits), target)
# ---------------------------------------------------------------------------
+2 -2
@@ -183,8 +183,8 @@ SKILLS_GUIDANCE = (
)
KANBAN_GUIDANCE = (
"# You are a Kanban worker\n"
"You were spawned by the Hermes Kanban dispatcher to execute ONE task from "
"# Kanban task execution protocol\n"
"You have been assigned ONE task from "
"the shared board at `~/.hermes/kanban.db`. Your task id is in "
"`$HERMES_KANBAN_TASK`; your workspace is `$HERMES_KANBAN_WORKSPACE`. "
"The `kanban_*` tools in your schema are your primary coordination surface — "
+38 -3
@@ -6,6 +6,7 @@ can invoke skills via /skill-name commands.
import json
import logging
import os
import re
from pathlib import Path
from typing import Any, Dict, Optional
@@ -20,10 +21,35 @@ from agent.skill_preprocessing import (
logger = logging.getLogger(__name__)
_skill_commands: Dict[str, Dict[str, Any]] = {}
_skill_commands_platform: Optional[str] = None
# Patterns for sanitizing skill names into clean hyphen-separated slugs.
_SKILL_INVALID_CHARS = re.compile(r"[^a-z0-9-]")
_SKILL_MULTI_HYPHEN = re.compile(r"-{2,}")
def _resolve_skill_commands_platform() -> Optional[str]:
"""Return the current platform scope used for disabled-skill filtering.
Used to detect when the active platform has shifted so
:func:`get_skill_commands` can drop a stale cache that was populated
for a different platform's ``skills.platform_disabled`` view (#14536).
Resolves from (in order) ``HERMES_PLATFORM`` env var and
``HERMES_SESSION_PLATFORM`` from the gateway session context. Returns
``None`` when no platform scope is active (e.g. classic CLI, RL
rollouts, standalone scripts).
"""
try:
from gateway.session_context import get_session_env
resolved_platform = (
os.getenv("HERMES_PLATFORM")
or get_session_env("HERMES_SESSION_PLATFORM")
)
except Exception:
resolved_platform = os.getenv("HERMES_PLATFORM")
return resolved_platform or None
def _load_skill_payload(skill_identifier: str, task_id: str | None = None) -> tuple[dict[str, Any], Path | None, str] | None:
"""Load a skill by name/path and return (loaded_payload, skill_dir, display_name)."""
raw_identifier = (skill_identifier or "").strip()
@@ -218,7 +244,8 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
Returns:
Dict mapping "/skill-name" to {name, description, skill_md_path, skill_dir}.
"""
global _skill_commands
global _skill_commands, _skill_commands_platform
_skill_commands_platform = _resolve_skill_commands_platform()
_skill_commands = {}
try:
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
@@ -278,8 +305,16 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
def get_skill_commands() -> Dict[str, Dict[str, Any]]:
"""Return the current skill commands mapping (scan first if empty)."""
if not _skill_commands:
"""Return the current skill commands mapping (scan first if empty).
Rescans when the active platform scope changes (e.g. a gateway
process serving Telegram and Discord concurrently) so each platform
sees its own ``skills.platform_disabled`` view (#14536).
"""
if (
not _skill_commands
or _skill_commands_platform != _resolve_skill_commands_platform()
):
scan_skill_commands()
return _skill_commands
+12 -1
@@ -143,7 +143,18 @@ class ResponsesApiTransport(ProviderTransport):
kwargs["max_output_tokens"] = max_tokens
if is_xai_responses and session_id:
kwargs["extra_headers"] = {"x-grok-conv-id": session_id}
existing_extra_headers = kwargs.get("extra_headers")
merged_extra_headers: Dict[str, str] = {}
if isinstance(existing_extra_headers, dict):
merged_extra_headers.update(
{
str(key): str(value)
for key, value in existing_extra_headers.items()
if key and value is not None
}
)
merged_extra_headers["x-grok-conv-id"] = session_id
kwargs["extra_headers"] = merged_extra_headers
return kwargs
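The change above merges the conversation header into any headers already present instead of replacing them. A minimal sketch of that merge, assuming a hypothetical free function `merge_conv_header` rather than the transport method:

```python
from typing import Any, Dict


def merge_conv_header(kwargs: Dict[str, Any], session_id: str) -> Dict[str, Any]:
    """Add x-grok-conv-id while keeping caller-supplied extra headers."""
    merged: Dict[str, str] = {}
    existing = kwargs.get("extra_headers")
    if isinstance(existing, dict):
        # Normalize to strings; drop empty keys and None values.
        merged.update(
            {str(k): str(v) for k, v in existing.items() if k and v is not None}
        )
    merged["x-grok-conv-id"] = session_id
    kwargs["extra_headers"] = merged
    return kwargs
```

Because the conversation id is written last, it wins over a caller-supplied header with the same name.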
+12
@@ -121,6 +121,18 @@ model:
# # Data policy: "allow" (default) or "deny" to exclude providers that may store data
# # data_collection: "deny"
# =============================================================================
# OpenRouter Response Caching (only applies when using OpenRouter)
# =============================================================================
# Cache identical API responses at the OpenRouter edge for free instant replays.
# When enabled, identical requests (same model, messages, parameters) return
# cached responses with zero billing. Separate from Anthropic prompt caching.
# See: https://openrouter.ai/docs/guides/features/response-caching
#
# openrouter:
# response_cache: true # Enable response caching (default: true)
# response_cache_ttl: 300 # Cache TTL in seconds, 1-86400 (default: 300)
# =============================================================================
# Git Worktree Isolation
# =============================================================================
+43 -73
@@ -459,32 +459,19 @@ def load_cli_config() -> Dict[str, Any]:
if "backend" in terminal_config:
terminal_config["env_type"] = terminal_config["backend"]
# Handle special cwd values: "." or "auto" means use current working directory.
# Only resolve to the host's CWD for the local backend where the host
# filesystem is directly accessible. For ALL remote/container backends
# (ssh, docker, modal, singularity), the host path doesn't exist on the
# target -- remove the key so terminal_tool.py uses its per-backend default.
#
# GUARD: If TERMINAL_CWD is already set to a real absolute path (by the
# gateway's config bridge earlier in the process), don't clobber it.
# This prevents a lazy import of cli.py during gateway runtime from
# rewriting TERMINAL_CWD to the service's working directory.
# See issue #10817.
# CWD resolution for CLI/TUI. The gateway has its own config bridge in
# gateway/run.py but may lazily import cli.py (triggering this code).
# Local backend: always os.getcwd(). Use `cd /dir && hermes` to control it.
# Non-local with placeholder: pop so terminal_tool uses its per-backend default.
# Non-local with explicit path: keep as-is.
_CWD_PLACEHOLDERS = (".", "auto", "cwd")
if terminal_config.get("cwd") in _CWD_PLACEHOLDERS:
_existing_cwd = os.environ.get("TERMINAL_CWD", "")
if _existing_cwd and _existing_cwd not in _CWD_PLACEHOLDERS and os.path.isabs(_existing_cwd):
# Gateway (or earlier startup) already resolved a real path — keep it
terminal_config["cwd"] = _existing_cwd
defaults["terminal"]["cwd"] = _existing_cwd
else:
effective_backend = terminal_config.get("env_type", "local")
if effective_backend == "local":
terminal_config["cwd"] = os.getcwd()
defaults["terminal"]["cwd"] = terminal_config["cwd"]
else:
# Remove so TERMINAL_CWD stays unset → tool picks backend default
terminal_config.pop("cwd", None)
effective_backend = terminal_config.get("env_type", "local")
if effective_backend == "local":
terminal_config["cwd"] = os.getcwd()
defaults["terminal"]["cwd"] = terminal_config["cwd"]
elif terminal_config.get("cwd") in _CWD_PLACEHOLDERS:
terminal_config.pop("cwd", None)
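The simplified CWD rules (local backend always uses the process working directory; non-local backends drop placeholders and keep explicit paths) can be sketched on a plain dict. `resolve_cwd` is a hypothetical helper, and it leaves out the `defaults` bookkeeping shown in the hunk.

```python
import os
from typing import Any, Dict

CWD_PLACEHOLDERS = (".", "auto", "cwd")


def resolve_cwd(terminal_config: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the three CWD rules in place and return the config."""
    backend = terminal_config.get("env_type", "local")
    if backend == "local":
        # Local: always the current working directory of this process.
        terminal_config["cwd"] = os.getcwd()
    elif terminal_config.get("cwd") in CWD_PLACEHOLDERS:
        # Non-local placeholder: remove so the backend default applies.
        terminal_config.pop("cwd", None)
    # Non-local explicit path: kept as-is.
    return terminal_config
```

Note that under these rules an explicit `cwd` in the config is overridden for the local backend, which matches the comment's advice to use `cd /dir && hermes`.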
env_mappings = {
"env_type": "TERMINAL_ENV",
@@ -517,13 +504,18 @@ def load_cli_config() -> Dict[str, Any]:
"sudo_password": "SUDO_PASSWORD",
}
# Apply config values to env vars so terminal_tool picks them up.
# If the config file explicitly has a [terminal] section, those values are
# authoritative and override any .env settings. When using defaults only
# (no config file or no terminal section), don't overwrite env vars that
# were already set by .env -- the user's .env is the fallback source.
# Bridge config env vars for terminal_tool. TERMINAL_CWD is force-exported
# UNLESS we're inside a gateway process (detected by _HERMES_GATEWAY marker)
# where it was already set correctly by gateway/run.py's config bridge.
_is_gateway = os.environ.get("_HERMES_GATEWAY") == "1"
for config_key, env_var in env_mappings.items():
if config_key in terminal_config:
if env_var == "TERMINAL_CWD":
if _is_gateway:
continue
# CLI: always export (overrides stale .env or inherited values)
os.environ[env_var] = str(terminal_config[config_key])
continue
if _file_has_terminal_config or env_var not in os.environ:
val = terminal_config[config_key]
if isinstance(val, list):
@@ -2928,7 +2920,14 @@ class HermesCLI:
def _expand_ref(match):
path = Path(match.group(1))
return path.read_text(encoding="utf-8") if path.exists() else match.group(0)
# Use try/except instead of path.exists() to avoid TOCTOU race:
# the paste file may be deleted between check and read, causing
# the input to be silently dropped (#17666).
try:
return path.read_text(encoding="utf-8")
except (OSError, IOError):
logger.warning("Paste file gone or unreadable, returning placeholder: %s", path)
return match.group(0)
return paste_ref_re.sub(_expand_ref, text)
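The TOCTOU fix replaces a check-then-read with a single read guarded by an exception handler. A self-contained sketch of the same pattern follows; the `[paste:<path>]` reference syntax and the names `PASTE_REF` and `expand_refs` are assumptions for illustration, since the diff does not show the real `paste_ref_re`.

```python
import re
from pathlib import Path

# Hypothetical reference syntax: [paste:/some/path]
PASTE_REF = re.compile(r"\[paste:(?P<path>[^\]]+)\]")


def expand_refs(text: str) -> str:
    """Inline referenced files; leave the placeholder if a file is unreadable."""

    def _expand(match: "re.Match[str]") -> str:
        try:
            # One operation: no window between an exists() check and the read.
            return Path(match.group("path")).read_text(encoding="utf-8")
        except OSError:
            return match.group(0)

    return PASTE_REF.sub(_expand, text)
```

`FileNotFoundError` and `PermissionError` are both subclasses of `OSError`, so a deleted or unreadable file yields the original placeholder text rather than an exception.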
@@ -4912,40 +4911,6 @@ class HermesCLI:
flush_tool_summary()
print()
def _handle_recap_command(self) -> None:
"""Show a compact recap of recent activity in this session.
Inspired by Claude Code's ``/recap`` (v2.1.114, April 2026) — useful
when running multiple sessions simultaneously and returning to one
after a while. Purely local; no LLM call, no token cost, no cache
invalidation.
"""
try:
from hermes_cli.session_recap import build_recap
except Exception as exc: # pragma: no cover - defensive
print(f" (recap unavailable: {exc})")
return
title = None
try:
if self._session_db and self.session_id:
row = self._session_db.get_session(self.session_id)
if row:
title = row.get("title") or None
except Exception:
title = None
text = build_recap(
self.conversation_history or [],
session_title=title,
session_id=self.session_id,
platform="cli",
)
print()
for line in text.splitlines():
print(line)
print()
def _notify_session_boundary(self, event_type: str) -> None:
"""Fire a session-boundary plugin hook (on_session_finalize or on_session_reset).
@@ -6398,8 +6363,6 @@ class HermesCLI:
pass
elif canonical == "history":
self.show_history()
elif canonical == "recap":
self._handle_recap_command()
elif canonical == "title":
parts = cmd_original.split(maxsplit=1)
if len(parts) > 1:
@@ -8412,6 +8375,17 @@ class HermesCLI:
_cprint(f"{_DIM}Voice auto-restart failed: {e}{_RST}")
threading.Thread(target=_restart_recording, daemon=True).start()
def _voice_speak_response_async(self, text: str) -> None:
"""Schedule TTS and mark it pending before continuous recording can restart."""
if not self._voice_tts or not text:
return
self._voice_tts_done.clear()
threading.Thread(
target=self._voice_speak_response,
args=(text,),
daemon=True,
).start()
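The point of `_voice_speak_response_async` is ordering: the "done" flag is cleared on the calling thread before the worker starts, so code that waits on the flag cannot observe a stale "done" state. A minimal sketch with `threading.Event`, assuming (the diff does not show it) that the worker sets the event when it finishes; the `Speaker` class and its fields are hypothetical.

```python
import threading
from typing import List, Optional


class Speaker:
    def __init__(self) -> None:
        self.done = threading.Event()
        self.done.set()  # idle: nothing pending
        self.spoken: List[str] = []

    def _speak(self, text: str) -> None:
        try:
            self.spoken.append(text)  # placeholder for the real TTS call
        finally:
            self.done.set()  # assumed: worker signals completion

    def speak_async(self, text: str) -> Optional[threading.Thread]:
        if not text:
            return None
        # Clear on the caller's thread, before the worker exists, so a
        # waiter never sees "done" between scheduling and start.
        self.done.clear()
        worker = threading.Thread(target=self._speak, args=(text,), daemon=True)
        worker.start()
        return worker
```

A consumer that wants to restart recording only after speech has finished would call `speaker.done.wait()` before doing so.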
def _voice_speak_response(self, text: str):
"""Speak the agent's response aloud using TTS (runs in background thread)."""
if not self._voice_tts:
@@ -9572,11 +9546,7 @@ class HermesCLI:
# Speak response aloud if voice TTS is enabled
# Skip batch TTS when streaming TTS already handled it
if self._voice_tts and response and not use_streaming_tts:
threading.Thread(
target=self._voice_speak_response,
args=(response,),
daemon=True,
).start()
self._voice_speak_response_async(response)
# Re-queue the interrupt message (and any that arrived while we were
@@ -11620,7 +11590,7 @@ class HermesCLI:
pass # Non-fatal — don't break the main loop
except Exception as e:
print(f"Error: {e}")
logger.warning("process_loop unhandled error (msg may be lost): %s", e)
# Start processing thread
process_thread = threading.Thread(target=process_loop, daemon=True)
+19 -2
@@ -797,19 +797,36 @@ def get_due_jobs() -> List[Dict[str, Any]]:
next_run = job.get("next_run_at")
if not next_run:
schedule = job.get("schedule", {})
kind = schedule.get("kind")
# One-shot jobs use a small grace window via the dedicated helper.
recovered_next = _recoverable_oneshot_run_at(
job.get("schedule", {}),
schedule,
now,
last_run_at=job.get("last_run_at"),
)
recovery_kind = "one-shot" if recovered_next else None
# Recurring jobs reach here only when something — typically a
# direct jobs.json edit that bypassed add_job() — left
# next_run_at unset. Without this branch, such jobs are
# silently skipped forever; recompute next_run_at from the
# schedule so they pick up at their next scheduled tick.
if not recovered_next and kind in ("cron", "interval"):
recovered_next = compute_next_run(schedule, now.isoformat())
if recovered_next:
recovery_kind = kind
if not recovered_next:
continue
job["next_run_at"] = recovered_next
next_run = recovered_next
logger.info(
"Job '%s' had no next_run_at; recovering one-shot run at %s",
"Job '%s' had no next_run_at; recovering %s run at %s",
job.get("name", job["id"]),
recovery_kind,
recovered_next,
)
for rj in raw_jobs:
+35 -5
@@ -123,9 +123,19 @@ _LOCK_FILE = _LOCK_DIR / ".tick.lock"
def _resolve_origin(job: dict) -> Optional[dict]:
"""Extract origin info from a job, preserving any extra routing metadata."""
"""Extract origin info from a job, preserving any extra routing metadata.
Treats non-dict origins (free-form provenance strings, ints, lists from
migration scripts or hand-edited jobs.json) as missing instead of
crashing with ``AttributeError`` on ``origin.get(...)``. Without this
guard, a job tagged with e.g. ``"combined-digest-replaces-x-and-y"``
crashed every fire attempt with
``'str' object has no attribute 'get'``; ``mark_job_run`` recorded the
failure, but the next tick re-loaded the same poisoned origin and
crashed identically until the field was patched manually (#18722).
"""
origin = job.get("origin")
if not origin:
if not isinstance(origin, dict):
return None
platform = origin.get("platform")
chat_id = origin.get("chat_id")
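The guard change from `if not origin` to `if not isinstance(origin, dict)` can be illustrated with a reduced version of the function. The handling after the type check (requiring both `platform` and `chat_id`) is an assumption, since the hunk is cut off there; `resolve_origin` is a hypothetical name.

```python
from typing import Any, Dict, Optional


def resolve_origin(job: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the origin dict, or None when it is missing or malformed."""
    origin = job.get("origin")
    if not isinstance(origin, dict):
        # Strings, ints, and lists are truthy, so `if not origin` would let
        # them through to `.get(...)` and raise AttributeError.
        return None
    # Assumed requirement: both routing fields must be present.
    if not origin.get("platform") or not origin.get("chat_id"):
        return None
    return origin
```

With the type check in place, a hand-edited provenance string is treated the same as a missing origin.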
@@ -147,6 +157,19 @@ def _get_home_target_chat_id(platform_name: str) -> str:
return value
def _get_home_target_thread_id(platform_name: str) -> Optional[str]:
"""Return the optional thread/topic ID for a platform home target."""
env_var = _HOME_TARGET_ENV_VARS.get(platform_name.lower())
if not env_var:
return None
value = os.getenv(f"{env_var}_THREAD_ID", "").strip()
if not value:
legacy = _LEGACY_HOME_TARGET_ENV_VARS.get(env_var)
if legacy:
value = os.getenv(f"{legacy}_THREAD_ID", "").strip()
return value or None
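The thread-id lookup derives its variable name from the home-channel variable by appending `_THREAD_ID`, then tries the legacy name. A stand-alone sketch follows; the two mapping tables are hypothetical stand-ins for `_HOME_TARGET_ENV_VARS` and `_LEGACY_HOME_TARGET_ENV_VARS`, and the legacy variable name `TELEGRAM_HOME_CHAT` is invented for illustration.

```python
import os
from typing import Dict, Optional

# Hypothetical tables; the real ones live in the scheduler module.
HOME_ENV: Dict[str, str] = {"telegram": "TELEGRAM_HOME_CHANNEL"}
LEGACY_ENV: Dict[str, str] = {"TELEGRAM_HOME_CHANNEL": "TELEGRAM_HOME_CHAT"}


def home_thread_id(platform: str) -> Optional[str]:
    """Return the configured thread/topic id for a platform, if any."""
    env_var = HOME_ENV.get(platform.lower())
    if not env_var:
        return None
    value = os.getenv(f"{env_var}_THREAD_ID", "").strip()
    if not value and env_var in LEGACY_ENV:
        # Fall back to the legacy variable name with the same suffix.
        value = os.getenv(f"{LEGACY_ENV[env_var]}_THREAD_ID", "").strip()
    return value or None
```

Returning `None` rather than an empty string keeps the delivery target's `thread_id` field unset for platforms with no configured thread.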
def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[dict]:
"""Resolve one concrete auto-delivery target for a cron job."""
@@ -175,7 +198,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
return {
"platform": platform_name,
"chat_id": chat_id,
"thread_id": None,
"thread_id": _get_home_target_thread_id(platform_name),
}
return None
@@ -229,7 +252,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
return {
"platform": platform_name,
"chat_id": chat_id,
"thread_id": None,
"thread_id": _get_home_target_thread_id(platform_name),
}
@@ -394,7 +417,7 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
thread_id = target.get("thread_id")
# Diagnostic: log thread_id for topic-aware delivery debugging
origin = job.get("origin") or {}
origin = _resolve_origin(job) or {}
origin_thread = origin.get("thread_id")
if origin_thread and not thread_id:
logger.warning(
@@ -759,6 +782,7 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
return prompt
from tools.skills_tool import skill_view
from tools.skill_usage import bump_use
parts = []
skipped: list[str] = []
@@ -770,6 +794,12 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
skipped.append(skill_name)
continue
# Bump usage so the curator sees this skill as actively used.
try:
bump_use(skill_name)
except Exception:
logger.debug("Cron job: failed to bump skill usage for '%s'", skill_name, exc_info=True)
content = str(loaded.get("content") or "").strip()
if parts:
parts.append("")
+35
@@ -86,6 +86,41 @@ if [ -d "$INSTALL_DIR/skills" ]; then
python3 "$INSTALL_DIR/tools/skills_sync.py"
fi
# Optionally start `hermes dashboard` as a side-process.
#
# Toggled by HERMES_DASHBOARD=1 (also accepts "true"/"yes", case-insensitive).
# Host/port/TUI can be overridden via:
# HERMES_DASHBOARD_HOST (default 0.0.0.0 — exposed outside the container)
# HERMES_DASHBOARD_PORT (default 9119, matches `hermes dashboard` default)
# HERMES_DASHBOARD_TUI (already honored by `hermes dashboard` itself)
#
# The dashboard is a long-lived server. We background it *before* the final
# `exec hermes "$@"` so the user's chosen foreground command (chat, gateway,
# sleep infinity, …) remains PID-of-interest for the container runtime. When
# the container stops the whole process tree is torn down, so no explicit
# cleanup is needed.
case "${HERMES_DASHBOARD:-}" in
1|true|TRUE|True|yes|YES|Yes)
dash_host="${HERMES_DASHBOARD_HOST:-0.0.0.0}"
dash_port="${HERMES_DASHBOARD_PORT:-9119}"
dash_args=(--host "$dash_host" --port "$dash_port" --no-open)
# Binding to anything other than localhost requires --insecure — the
# dashboard refuses otherwise because it exposes API keys. Inside a
# container this is the expected deployment (host reaches it via
# published port), so opt in automatically.
if [ "$dash_host" != "127.0.0.1" ] && [ "$dash_host" != "localhost" ]; then
dash_args+=(--insecure)
fi
echo "Starting hermes dashboard on ${dash_host}:${dash_port} (background)"
# Prefix dashboard output so it's distinguishable from the main
# process in `docker logs`. stdbuf keeps the pipe line-buffered.
(
stdbuf -oL -eL hermes dashboard "${dash_args[@]}" 2>&1 \
| sed -u 's/^/[dashboard] /'
) &
;;
esac
# Final exec: two supported invocation patterns.
#
# docker run <image> -> exec `hermes` with no args (legacy default)
+45 -4
@@ -186,18 +186,24 @@ class HomeChannel:
Default destination for a platform.
When a cron job specifies deliver="telegram" without a specific chat ID,
messages are sent to this home channel.
messages are sent to this home channel. Thread-aware platforms may also
store a thread/topic ID so the bare platform target routes to the exact
conversation where /sethome was run.
"""
platform: Platform
chat_id: str
name: str # Human-readable name for display
thread_id: Optional[str] = None
def to_dict(self) -> Dict[str, Any]:
return {
result = {
"platform": self.platform.value,
"chat_id": self.chat_id,
"name": self.name,
}
if self.thread_id:
result["thread_id"] = self.thread_id
return result
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "HomeChannel":
@@ -205,6 +211,7 @@ class HomeChannel:
platform=Platform(data["platform"]),
chat_id=str(data["chat_id"]),
name=data.get("name", "Home"),
thread_id=str(data["thread_id"]) if data.get("thread_id") else None,
)
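The serialization change above can be exercised in isolation. The sketch below copies the `HomeChannel` logic shown in the hunk; the one-member `Platform` enum is a stand-in (an assumption) for the gateway's real enum.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional


class Platform(Enum):
    """Stand-in for the gateway's Platform enum (assumption)."""
    TELEGRAM = "telegram"


@dataclass
class HomeChannel:
    platform: Platform
    chat_id: str
    name: str
    thread_id: Optional[str] = None

    def to_dict(self) -> Dict[str, Any]:
        result = {
            "platform": self.platform.value,
            "chat_id": self.chat_id,
            "name": self.name,
        }
        # Only emit thread_id when set, so older configs stay byte-identical.
        if self.thread_id:
            result["thread_id"] = self.thread_id
        return result

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "HomeChannel":
        return cls(
            platform=Platform(data["platform"]),
            chat_id=str(data["chat_id"]),
            name=data.get("name", "Home"),
            thread_id=str(data["thread_id"]) if data.get("thread_id") else None,
        )
```

Because `thread_id` is optional on both sides, a dict written before this change still loads, and a channel without a thread serializes exactly as it did before.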
@@ -839,11 +846,25 @@ def load_gateway_config() -> GatewayConfig:
if yaml_key in allow_mentions_cfg and not os.getenv(env_key):
os.environ[env_key] = str(allow_mentions_cfg[yaml_key]).lower()
# Bridge top-level require_mention to Telegram when the telegram: section
# does not already provide one. Users often write "require_mention: true"
# at the top level alongside group_sessions_per_user, expecting it to work
# the same way (#3979).
_tl_require_mention = yaml_cfg.get("require_mention")
if _tl_require_mention is not None:
_tg_section = yaml_cfg.get("telegram") or {}
if "require_mention" not in _tg_section:
_tg_plat = platforms_data.setdefault(Platform.TELEGRAM.value, {})
_tg_extra = _tg_plat.setdefault("extra", {})
_tg_extra.setdefault("require_mention", _tl_require_mention)
# Telegram settings → env vars (env vars take precedence)
telegram_cfg = yaml_cfg.get("telegram", {})
if isinstance(telegram_cfg, dict):
if "require_mention" in telegram_cfg and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
os.environ["TELEGRAM_REQUIRE_MENTION"] = str(telegram_cfg["require_mention"]).lower()
# Prefer telegram.require_mention; fall back to the top-level shorthand.
_effective_rm = telegram_cfg.get("require_mention", yaml_cfg.get("require_mention"))
if _effective_rm is not None and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
os.environ["TELEGRAM_REQUIRE_MENTION"] = str(_effective_rm).lower()
if "mention_patterns" in telegram_cfg and not os.getenv("TELEGRAM_MENTION_PATTERNS"):
os.environ["TELEGRAM_MENTION_PATTERNS"] = json.dumps(telegram_cfg["mention_patterns"])
frc = telegram_cfg.get("free_response_chats")
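The precedence this hunk establishes for `TELEGRAM_REQUIRE_MENTION` can be summarized as: existing env var first, then `telegram.require_mention`, then the top-level shorthand. A minimal sketch under that reading; the function name is hypothetical, and it returns the value rather than writing to `os.environ` as the real code does.

```python
from typing import Any, Mapping, Optional


def effective_require_mention(
    yaml_cfg: Mapping[str, Any], environ: Mapping[str, str]
) -> Optional[str]:
    """Hypothetical helper mirroring the precedence in load_gateway_config."""
    # A non-empty env var always wins, matching `not os.getenv(...)` above.
    env_value = environ.get("TELEGRAM_REQUIRE_MENTION")
    if env_value:
        return env_value
    telegram_cfg = yaml_cfg.get("telegram", {})
    if not isinstance(telegram_cfg, dict):
        telegram_cfg = {}
    # Prefer the telegram: section; fall back to the top-level key.
    value = telegram_cfg.get("require_mention", yaml_cfg.get("require_mention"))
    return None if value is None else str(value).lower()
```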
@@ -1071,6 +1092,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.TELEGRAM,
chat_id=telegram_home,
name=os.getenv("TELEGRAM_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("TELEGRAM_HOME_CHANNEL_THREAD_ID") or None,
)
# Discord
@@ -1087,6 +1109,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.DISCORD,
chat_id=discord_home,
name=os.getenv("DISCORD_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("DISCORD_HOME_CHANNEL_THREAD_ID") or None,
)
# Reply threading mode for Discord (off/first/all)
@@ -1108,6 +1131,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WHATSAPP,
chat_id=whatsapp_home,
name=os.getenv("WHATSAPP_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WHATSAPP_HOME_CHANNEL_THREAD_ID") or None,
)
# Slack
@@ -1135,6 +1159,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SLACK,
chat_id=slack_home,
name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
thread_id=os.getenv("SLACK_HOME_CHANNEL_THREAD_ID") or None,
)
# Signal
@@ -1155,6 +1180,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SIGNAL,
chat_id=signal_home,
name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("SIGNAL_HOME_CHANNEL_THREAD_ID") or None,
)
# Mattermost
@@ -1174,6 +1200,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.MATTERMOST,
chat_id=mattermost_home,
name=os.getenv("MATTERMOST_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("MATTERMOST_HOME_CHANNEL_THREAD_ID") or None,
)
# Matrix
@@ -1205,6 +1232,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.MATRIX,
chat_id=matrix_home,
name=os.getenv("MATRIX_HOME_ROOM_NAME", "Home"),
thread_id=os.getenv("MATRIX_HOME_ROOM_THREAD_ID") or None,
)
# Home Assistant
@@ -1238,6 +1266,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.EMAIL,
chat_id=email_home,
name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
thread_id=os.getenv("EMAIL_HOME_ADDRESS_THREAD_ID") or None,
)
# SMS (Twilio)
@@ -1253,6 +1282,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SMS,
chat_id=sms_home,
name=os.getenv("SMS_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("SMS_HOME_CHANNEL_THREAD_ID") or None,
)
# API Server
@@ -1315,6 +1345,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.DINGTALK,
chat_id=dingtalk_home,
name=os.getenv("DINGTALK_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("DINGTALK_HOME_CHANNEL_THREAD_ID") or None,
)
# Feishu / Lark
@@ -1342,6 +1373,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.FEISHU,
chat_id=feishu_home,
name=os.getenv("FEISHU_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("FEISHU_HOME_CHANNEL_THREAD_ID") or None,
)
# WeCom (Enterprise WeChat)
@@ -1364,6 +1396,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WECOM,
chat_id=wecom_home,
name=os.getenv("WECOM_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WECOM_HOME_CHANNEL_THREAD_ID") or None,
)
# WeCom callback mode (self-built apps)
@@ -1422,6 +1455,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WEIXIN,
chat_id=weixin_home,
name=os.getenv("WEIXIN_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WEIXIN_HOME_CHANNEL_THREAD_ID") or None,
)
# BlueBubbles (iMessage)
@@ -1445,6 +1479,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.BLUEBUBBLES,
chat_id=bluebubbles_home,
name=os.getenv("BLUEBUBBLES_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("BLUEBUBBLES_HOME_CHANNEL_THREAD_ID") or None,
)
# QQ (Official Bot API v2)
@@ -1482,6 +1517,11 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.QQBOT,
chat_id=qq_home,
name=os.getenv("QQBOT_HOME_CHANNEL_NAME") or os.getenv(qq_home_name_env, "Home"),
thread_id=(
os.getenv("QQBOT_HOME_CHANNEL_THREAD_ID")
or os.getenv("QQ_HOME_CHANNEL_THREAD_ID")
or None
),
)
# Yuanbao — YUANBAO_APP_ID preferred
@@ -1512,6 +1552,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.YUANBAO,
chat_id=yuanbao_home,
name=os.getenv("YUANBAO_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("YUANBAO_HOME_CHANNEL_THREAD_ID") or None,
)
yuanbao_dm_policy = os.getenv("YUANBAO_DM_POLICY")
if yuanbao_dm_policy:
+84
@@ -0,0 +1,84 @@
"""Shared HTTP client factory for long-lived platform adapters.
Gateway messaging platforms (QQ Bot, Feishu, WeCom, DingTalk, Signal,
BlueBubbles, WeCom-callback) keep a persistent ``httpx.AsyncClient``
alive for the adapter's lifetime. That amortises TLS/connection setup
across many API calls, but it also means the process's file-descriptor
pressure is sensitive to how aggressively the pool recycles idle keep-
alive connections.
httpx's default ``keepalive_expiry`` is 5 seconds. On macOS behind
Cloudflare Warp (and other transparent proxies), peer-initiated FIN can
sit in ``CLOSE_WAIT`` longer than that before the local socket actually
drains. Multiplied across 7 long-lived adapters plus the LLM
client and MCP clients, that walks straight into the default 256 fd limit.
See #18451.
``platform_httpx_limits()`` returns a tighter ``httpx.Limits`` the
adapter factories use instead of the httpx default. The values chosen:
* ``max_keepalive_connections=10``: plenty for any single adapter;
platform APIs rarely parallelise beyond this.
* ``keepalive_expiry=2.0``: close idle sockets aggressively so a
proxy's lingering CLOSE_WAIT window can't starve the process.
Override via ``HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY`` /
``HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE`` env vars when tuning under load.
"""
from __future__ import annotations
import os
try:
import httpx
except ImportError: # pragma: no cover — optional dep
httpx = None # type: ignore[assignment]
_DEFAULT_KEEPALIVE_EXPIRY_S = 2.0
_DEFAULT_MAX_KEEPALIVE = 10
def platform_httpx_limits() -> "httpx.Limits | None":
"""Return ``httpx.Limits`` tuned for persistent platform-adapter clients.
Returns ``None`` when httpx isn't importable, so callers can fall
back to httpx's built-in default without a hard dependency on this
helper being reachable.
"""
if httpx is None:
return None
def _env_float(name: str, default: float) -> float:
raw = os.environ.get(name, "").strip()
if not raw:
return default
try:
val = float(raw)
except (TypeError, ValueError):
return default
return val if val > 0 else default
def _env_int(name: str, default: int) -> int:
raw = os.environ.get(name, "").strip()
if not raw:
return default
try:
val = int(raw)
except (TypeError, ValueError):
return default
return val if val > 0 else default
keepalive_expiry = _env_float(
"HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY", _DEFAULT_KEEPALIVE_EXPIRY_S
)
max_keepalive = _env_int(
"HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE", _DEFAULT_MAX_KEEPALIVE
)
return httpx.Limits(
max_keepalive_connections=max_keepalive,
# Leave max_connections at httpx default (100) — plenty of headroom.
keepalive_expiry=keepalive_expiry,
)
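To make the override behaviour of the new module concrete, here is a stdlib-only sketch of the same env parsing. It returns plain keyword arguments instead of an `httpx.Limits` object so it runs without httpx installed; the function name `limits_kwargs` is an assumption for illustration only.

```python
import os
from typing import Callable, Mapping, Union

Number = Union[int, float]


def limits_kwargs(environ: Mapping[str, str] = os.environ) -> dict:
    """Resolve the keepalive settings the way platform_httpx_limits() does."""

    def positive(name: str, cast: Callable[[str], Number], default: Number) -> Number:
        raw = environ.get(name, "").strip()
        if not raw:
            return default
        try:
            val = cast(raw)
        except (TypeError, ValueError):
            return default
        # Zero, negative, and NaN values all fall back to the default.
        return val if val > 0 else default

    return {
        "max_keepalive_connections": positive(
            "HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE", int, 10
        ),
        "keepalive_expiry": positive(
            "HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY", float, 2.0
        ),
    }
```

Malformed or non-positive overrides silently fall back to the defaults, so a typo in the environment cannot disable keepalive expiry altogether.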
+15 -3
@@ -62,6 +62,14 @@ MAX_NORMALIZED_TEXT_LENGTH = 65_536 # 64 KB cap for normalized content parts
MAX_CONTENT_LIST_SIZE = 1_000 # Max items when content is an array
def _coerce_port(value: Any, default: int = DEFAULT_PORT) -> int:
"""Parse a listen port without letting malformed env/config values crash startup."""
try:
return int(value)
except (TypeError, ValueError):
return default
def _normalize_chat_content(
content: Any, *, _max_depth: int = 10, _depth: int = 0,
) -> str:
@@ -573,7 +581,10 @@ class APIServerAdapter(BasePlatformAdapter):
super().__init__(config, Platform.API_SERVER)
extra = config.extra or {}
self._host: str = extra.get("host", os.getenv("API_SERVER_HOST", DEFAULT_HOST))
self._port: int = int(extra.get("port", os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))))
raw_port = extra.get("port")
if raw_port is None:
raw_port = os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))
self._port: int = _coerce_port(raw_port, DEFAULT_PORT)
self._api_key: str = extra.get("key", os.getenv("API_SERVER_KEY", ""))
self._cors_origins: tuple[str, ...] = self._parse_cors_origins(
extra.get("cors_origins", os.getenv("API_SERVER_CORS_ORIGINS", "")),
@@ -727,10 +738,11 @@ class APIServerAdapter(BasePlatformAdapter):
gateway platforms), falling back to the hermes-api-server default.
"""
from run_agent import AIAgent
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config, GatewayRunner
from hermes_cli.tools_config import _get_platform_tools
runtime_kwargs = _resolve_runtime_agent_kwargs()
reasoning_config = GatewayRunner._load_reasoning_config()
model = _resolve_gateway_model()
user_config = _load_gateway_config()
@@ -740,7 +752,6 @@ class APIServerAdapter(BasePlatformAdapter):
# Load fallback provider chain so the API server platform has the
# same fallback behaviour as Telegram/Discord/Slack (fixes #4954).
from gateway.run import GatewayRunner
fallback_model = GatewayRunner._load_fallback_model()
agent = AIAgent(
@@ -759,6 +770,7 @@ class APIServerAdapter(BasePlatformAdapter):
tool_complete_callback=tool_complete_callback,
session_db=self._ensure_session_db(),
fallback_model=fallback_model,
reasoning_config=reasoning_config,
)
return agent
+19 -7
@@ -2489,15 +2489,20 @@ class BasePlatformAdapter(ABC):
try:
response = await self._message_handler(event)
# Old adapter task (if any) is cancelled AFTER the runner has
# fully handled the command — keeps ordering deterministic.
await self.cancel_session_processing(
session_key,
release_guard=False,
discard_pending=False,
)
_text, _eph_ttl = self._unwrap_ephemeral(response)
# Send the response BEFORE cancelling the old task so the send
# cannot be affected by task-cancellation side effects (race
# condition fix — issue #18912). Previously the send happened
# after cancel_session_processing, which could silently drop the
# "/new" confirmation when an agent was actively running.
if _text:
logger.info(
"[%s] Sending command '/%s' response (%d chars) to %s",
self.name,
cmd,
len(_text),
event.source.chat_id,
)
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=_text,
@@ -2510,6 +2515,13 @@ class BasePlatformAdapter(ABC):
message_id=_r.message_id,
ttl_seconds=_eph_ttl,
)
# Old adapter task (if any) is cancelled AFTER the response has
# been sent — keeps ordering deterministic and avoids the race.
await self.cancel_session_processing(
session_key,
release_guard=False,
discard_pending=False,
)
except Exception:
# On failure, restore the original guard if one still exists so
# we don't leave the session in a half-reset state.
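The reordering in this hunk (send the command response, then cancel the old task) can be illustrated with plain asyncio. This is a simplified sketch, not the adapter's code: `reply_then_cancel`, `send`, and `cancel_old_task` are hypothetical stand-ins for `_send_with_retry` and `cancel_session_processing`.

```python
import asyncio
from typing import Awaitable, Callable, List


async def reply_then_cancel(
    text: str,
    send: Callable[[str], Awaitable[None]],
    cancel_old_task: Callable[[], Awaitable[None]],
) -> None:
    """Deliver the command response first, then cancel the previous task."""
    if text:
        await send(text)
    await cancel_old_task()


async def demo() -> List[str]:
    events: List[str] = []

    async def old_agent_run() -> None:
        try:
            await asyncio.sleep(3600)  # stands in for a long-running agent turn
        except asyncio.CancelledError:
            events.append("old task cancelled")
            raise

    old_task = asyncio.create_task(old_agent_run())
    await asyncio.sleep(0)  # let the old task start before the command arrives

    async def send(text: str) -> None:
        events.append(f"sent: {text}")

    async def cancel_old_task() -> None:
        old_task.cancel()
        # Wait for the cancellation to finish without re-raising it here.
        await asyncio.gather(old_task, return_exceptions=True)

    await reply_then_cancel("Session reset.", send, cancel_old_task)
    return events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With this ordering the confirmation is recorded before any cancellation side effect runs, which is the property the fix for #18912 relies on.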
+3 -1
@@ -162,7 +162,9 @@ class BlueBubblesAdapter(BasePlatformAdapter):
return False
from aiohttp import web
self.client = httpx.AsyncClient(timeout=30.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
try:
await self._api_get("/api/v1/ping")
info = await self._api_get("/api/v1/server/info")
+5 -1
@@ -228,7 +228,11 @@ class DingTalkAdapter(BasePlatformAdapter):
return False
try:
self._http_client = httpx.AsyncClient(timeout=30.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0, limits=platform_httpx_limits(),
)
credential = dingtalk_stream.Credential(
self._client_id, self._client_secret
+502 -44
@@ -497,6 +497,7 @@ class DiscordAdapter(BasePlatformAdapter):
self._ready_event = asyncio.Event()
self._allowed_user_ids: set = set() # For button approval authorization
self._allowed_role_ids: set = set() # For DISCORD_ALLOWED_ROLES filtering
self.gateway_runner = None # Set by gateway/run.py for cross-platform delivery
# Voice channel state (per-guild)
self._voice_clients: Dict[int, Any] = {} # guild_id -> VoiceClient
self._voice_locks: Dict[int, asyncio.Lock] = {} # guild_id -> serialize join/leave
@@ -613,6 +614,21 @@ class DiscordAdapter(BasePlatformAdapter):
# so LLM output or echoed user content can't ping the whole
# server; override per DISCORD_ALLOW_MENTION_* env vars or the
# discord.allow_mentions.* block in config.yaml.
# Close any existing client to prevent zombie websocket connections
# on reconnect (see #18187). Without this, the old client remains
# connected to Discord gateway and both fire on_message, causing
# double responses.
if self._client is not None:
try:
if not self._client.is_closed():
await self._client.close()
except Exception:
logger.debug("[%s] Failed to close previous Discord client", self.name)
finally:
self._client = None
self._ready_event.clear()
self._client = commands.Bot(
command_prefix="!", # Not really used, we handle raw messages
intents=intents,
@@ -1914,6 +1930,225 @@ class DiscordAdapter(BasePlatformAdapter):
return True
return False
# ── Slash command authorization ─────────────────────────────────────
# Slash commands (``_run_simple_slash`` and ``_handle_thread_create_slash``)
# are a separate Discord interaction surface from regular messages and
# historically ran with NO authorization check — bypassing every gate
# ``on_message`` enforces (DISCORD_ALLOWED_USERS, DISCORD_ALLOWED_ROLES,
# DISCORD_ALLOWED_CHANNELS, DISCORD_IGNORED_CHANNELS). Any guild member
# could invoke ``/background``, ``/restart``, ``/sethome``, etc. as the
# operator. ``_check_slash_authorization`` mirrors the on_message gates
# one-for-one so the slash surface honors the same trust boundary.
#
# By design, this is a no-op for deployments with no allowlist env vars
# set — ``_is_allowed_user`` returns True and the channel checks early-out
# — preserving the existing "single-tenant, all guild members trusted"
# default. Deployments that DO set any DISCORD_ALLOWED_* var get slash
# parity with on_message.
def _evaluate_slash_authorization(
self, interaction: "discord.Interaction",
) -> Tuple[bool, Optional[str]]:
"""Evaluate slash authorization without producing any response.
Returns ``(allowed, reason)``. ``reason`` is populated only when
``allowed`` is False. This is the shared core used by both the
responding wrapper (``_check_slash_authorization``) and side-effect-
free callers like the ``/skill`` autocomplete callback, which must
return an empty list for unauthorized users instead of leaking an
ephemeral rejection per-keystroke.
Fail-closed semantics for malformed payloads: when an allowlist is
configured but the interaction is missing the data needed to
evaluate it (no channel id with channel policy active, no user
with user/role policy active), the gate REJECTS rather than
falling through. Without these guards a guild interaction that
happens to deserialize without a channel id would silently bypass
``DISCORD_ALLOWED_CHANNELS`` and a payload missing ``user`` would
raise ``AttributeError`` in the user check below, surfacing as
an opaque interaction failure rather than a clean rejection.
"""
chan_obj = getattr(interaction, "channel", None)
in_dm = isinstance(chan_obj, discord.DMChannel) if chan_obj is not None else False
# ── Channel scope (mirrors on_message lines 3374-3388) ──
# DMs aren't channel-gated — DMs follow on_message's DM lockdown
# path which has its own user-allowlist enforcement.
if not in_dm:
chan_id_raw = getattr(interaction, "channel_id", None) or getattr(
chan_obj, "id", None,
)
channel_ids: set = set()
if chan_id_raw is not None:
channel_ids.add(str(chan_id_raw))
# Mirror on_message: also test the parent channel for threads
# so per-channel allow/deny lists work consistently.
if isinstance(chan_obj, discord.Thread):
parent_id = self._get_parent_channel_id(chan_obj)
if parent_id:
channel_ids.add(str(parent_id))
allowed_raw = os.getenv("DISCORD_ALLOWED_CHANNELS", "")
if allowed_raw:
allowed = {c.strip() for c in allowed_raw.split(",") if c.strip()}
if "*" not in allowed:
if not channel_ids:
# Channel policy is configured but the interaction
# has no resolvable channel id. Fail closed.
return (
False,
"channel id missing with DISCORD_ALLOWED_CHANNELS configured",
)
if not (channel_ids & allowed):
return (False, "channel not in DISCORD_ALLOWED_CHANNELS")
# Ignored beats allowed: even when a thread's parent channel
# is on the allowlist, an explicit DISCORD_IGNORED_CHANNELS
# entry on the thread or its parent rejects the interaction.
ignored_raw = os.getenv("DISCORD_IGNORED_CHANNELS", "")
if ignored_raw and channel_ids:
ignored = {c.strip() for c in ignored_raw.split(",") if c.strip()}
if "*" in ignored or (channel_ids & ignored):
return (False, "channel in DISCORD_IGNORED_CHANNELS")
# ── User / role allowlist (mirrors on_message line 681) ──
user = getattr(interaction, "user", None)
allowed_users = getattr(self, "_allowed_user_ids", set()) or set()
allowed_roles = getattr(self, "_allowed_role_ids", set()) or set()
if user is None or getattr(user, "id", None) is None:
# No identifiable user. With any user/role allowlist
# configured, fail closed rather than raise AttributeError
# on ``interaction.user.id`` below. With no allowlist this
# is the existing "no allowlist = everyone" backwards-compat.
if allowed_users or allowed_roles:
return (False, "missing interaction.user with allowlist configured")
return (True, None)
user_id = str(user.id)
if not self._is_allowed_user(user_id, author=user):
return (
False,
"user not in DISCORD_ALLOWED_USERS / DISCORD_ALLOWED_ROLES",
)
return (True, None)
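The channel-scope half of `_evaluate_slash_authorization` reduces to set arithmetic over comma-separated env values. A minimal sketch, assuming the caller has already collected the channel ID plus any thread parent ID into `channel_ids`; the function name and signature are hypothetical, and the user/role checks are left out.

```python
from typing import Optional, Set, Tuple


def evaluate_channel_policy(
    channel_ids: Set[str], allowed_raw: str, ignored_raw: str
) -> Tuple[bool, Optional[str]]:
    """Apply the allowlist, then the ignore list, to a set of channel IDs."""

    def parse(raw: str) -> Set[str]:
        return {c.strip() for c in raw.split(",") if c.strip()}

    if allowed_raw:
        allowed = parse(allowed_raw)
        if "*" not in allowed:
            if not channel_ids:
                # Policy configured but nothing to test against: fail closed.
                return (False, "channel id missing with allowlist configured")
            if not (channel_ids & allowed):
                return (False, "channel not in allowlist")

    # Ignored beats allowed, for the channel itself or its parent.
    if ignored_raw and channel_ids:
        ignored = parse(ignored_raw)
        if "*" in ignored or (channel_ids & ignored):
            return (False, "channel in ignore list")

    return (True, None)
```

Passing both the thread ID and its parent ID is what lets a thread inherit its parent's allowlist entry while still being individually ignorable.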
async def _check_slash_authorization(
self, interaction: "discord.Interaction", command_text: str,
) -> bool:
"""Mirror on_message's user/role/channel gates onto a slash invocation.
Returns True to proceed. Returns False *after* sending an ephemeral
rejection, logging a warning, and scheduling a cross-platform admin
alert the caller must stop on False (the interaction has already
been responded to).
"""
allowed, reason = self._evaluate_slash_authorization(interaction)
if allowed:
return True
return await self._reject_slash(
interaction, command_text, reason=reason or "unauthorized",
)
async def _reject_slash(
self, interaction: "discord.Interaction", command_text: str, *, reason: str,
) -> bool:
"""Send ephemeral reject + log warning + schedule admin alert. Returns False.
Tolerates a missing ``interaction.user`` -- the fail-closed branch
in ``_evaluate_slash_authorization`` deliberately routes here for
malformed payloads (no user) when an allowlist is configured, and
``str(interaction.user.id)`` would raise AttributeError before the
ephemeral rejection could be sent.
"""
user = getattr(interaction, "user", None)
if user is not None:
user_id = str(getattr(user, "id", "?"))
user_name = getattr(user, "name", "?")
else:
user_id = "?"
user_name = "?"
chan_id = getattr(interaction, "channel_id", None) or getattr(
getattr(interaction, "channel", None), "id", None,
)
guild_id = getattr(interaction, "guild_id", None)
logger.warning(
"[Discord] Unauthorized slash attempt: user=%s id=%s channel=%s "
"guild=%s cmd=%r reason=%r",
user_name, user_id, chan_id, guild_id, command_text, reason,
)
try:
await interaction.response.send_message(
"You're not authorized to use this command.",
ephemeral=True,
)
except Exception as e:
# Interaction may already be responded to (e.g. caller deferred
# before the auth check, or Discord retried). Best-effort only.
logger.debug("[Discord] Could not send unauthorized ephemeral: %s", e)
# Fire-and-forget: don't block the interaction handler on Telegram I/O.
try:
asyncio.create_task(self._notify_unauthorized_slash(
user_name, user_id, chan_id, guild_id, command_text, reason,
))
except Exception as e:
logger.debug("[Discord] Could not schedule admin notify task: %s", e)
return False
async def _notify_unauthorized_slash(
self, user_name: str, user_id: str, chan_id, guild_id,
command_text: str, reason: str,
) -> None:
"""Best-effort cross-platform alert to the gateway operator.
Tries TELEGRAM first (most operators set TELEGRAM_HOME_CHANNEL),
then SLACK. Silently no-ops if no other platform is configured
with a home channel.
A soft send failure -- adapter.send() returning a result with
``success=False`` rather than raising -- continues the fallback
chain. Treating a SendResult(success=False) as delivered would
mean a Telegram outage that the adapter politely surfaces (e.g.
rate-limit, auth failure) silently swallows the alert without
attempting Slack. Hard exceptions still take the same path via
the except branch below.
"""
runner = getattr(self, "gateway_runner", None)
if not runner:
return
for target in (Platform.TELEGRAM, Platform.SLACK):
try:
adapter = runner.adapters.get(target)
if not adapter:
continue
home = runner.config.get_home_channel(target)
if not home or not getattr(home, "chat_id", None):
continue
msg = (
"⚠️ Unauthorized Discord slash attempt\n"
f"User: {user_name} ({user_id})\n"
f"Channel: {chan_id} (guild {guild_id})\n"
f"Command: {command_text}\n"
f"Reason: {reason}"
)
result = await adapter.send(str(home.chat_id), msg)
# Only return on confirmed delivery. SendResult(success=False)
# -> continue to the next platform.
if getattr(result, "success", None) is False:
logger.debug(
"[Discord] Admin notify via %s returned success=False"
" (error=%r); falling through",
target, getattr(result, "error", None),
)
continue
return
except Exception as e:
logger.debug("[Discord] Admin notify via %s failed: %s", target, e)
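The fallback behaviour described in the docstring above (a soft `success=False` result and a hard exception both move on to the next platform) can be sketched independently of the gateway. `SendResult` here is a stand-in dataclass and `notify_first_available` is a hypothetical name; the real method resolves adapters and home channels from the runner.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional, Tuple


@dataclass
class SendResult:
    """Stand-in for the adapter's send result type (assumption)."""
    success: bool
    error: Optional[str] = None


Sender = Callable[[str], Awaitable[SendResult]]


async def notify_first_available(
    senders: List[Tuple[str, Sender]], message: str
) -> Optional[str]:
    """Try each sender in order; return the name of the first confirmed delivery."""
    for name, send in senders:
        try:
            result = await send(message)
        except Exception:
            continue  # hard failure: try the next platform
        if getattr(result, "success", None) is False:
            continue  # soft failure: also try the next platform
        return name
    return None
```

Treating only an explicit `False` as failure keeps the chain compatible with adapters whose `send()` returns `None` or an object without a `success` attribute.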
async def send_image_file(
self,
chat_id: str,
@@ -2301,6 +2536,11 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception:
pass # logging must never block command dispatch
# Auth gate — must run before defer() so an ephemeral rejection can
# be delivered on the still-unresponded interaction.
if not await self._check_slash_authorization(interaction, command_text):
return
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, command_text)
await self.handle_message(event)
@@ -2445,7 +2685,8 @@ class DiscordAdapter(BasePlatformAdapter):
message: str = "",
auto_archive_duration: int = 1440,
):
await interaction.response.defer(ephemeral=True)
# defer() is performed inside the handler *after* the auth gate
# so a rejected invoker can receive an ephemeral rejection.
await self._handle_thread_create_slash(interaction, name, message, auto_archive_duration)
@tree.command(name="queue", description="Queue a prompt for the next turn (doesn't interrupt)")
@@ -2566,6 +2807,54 @@ class DiscordAdapter(BasePlatformAdapter):
# supporting up to 25 categories × 25 skills = 625 skills.
self._register_skill_group(tree)
# Optional defense-in-depth: hide every slash command from non-admin
# guild members in Discord's slash picker. Server-side authorization
# (``_check_slash_authorization``) is the actual gate; this is purely
# UX so users don't see commands they can't invoke. Off by default
# to preserve the slash UX for deployments that intentionally allow
# everyone in the guild.
if os.getenv("DISCORD_HIDE_SLASH_COMMANDS", "false").strip().lower() in (
"true", "1", "yes", "on",
):
self._apply_owner_only_visibility(tree)
def _apply_owner_only_visibility(self, tree) -> None:
"""Set default_member_permissions=0 on every registered slash command.
Discord interprets ``Permissions(0)`` as "requires no permissions",
which paradoxically means the command is hidden from every guild
member except those with the Administrator permission. Server admins
can re-grant per user/role via Server Settings -> Integrations ->
<bot> -> Permissions.
Authoritative gate is ``_check_slash_authorization`` on every
invocation, which catches stale clients, role grants made by
mistake, and direct API calls bypassing Discord's UI hide.
"""
try:
no_perms = discord.Permissions(0)
except Exception as e:
logger.warning(
"[Discord] _apply_owner_only_visibility: cannot build Permissions(0): %s",
e,
)
return
applied = 0
for cmd in tree.get_commands():
try:
cmd.default_permissions = no_perms
applied += 1
except Exception as e:
logger.debug(
"[Discord] Could not set default_permissions on %r: %s",
getattr(cmd, "name", "?"), e,
)
logger.info(
"[Discord] Hid %d slash command(s) from non-admin guild members "
"(opt-in defense in depth via DISCORD_HIDE_SLASH_COMMANDS).",
applied,
)
def _register_skill_group(self, tree) -> None:
"""Register a single ``/skill`` command with autocomplete on the name.
@@ -2584,40 +2873,32 @@ class DiscordAdapter(BasePlatformAdapter):
hidden skills. The slash picker also becomes more discoverable:
Discord live-filters by the user's typed prefix against both the
skill name and its description.
The entries list and lookup dict are stored on ``self`` rather
than captured in closure variables so :meth:`refresh_skill_group`
can repopulate them when the user runs ``/reload-skills`` without
needing to touch the Discord slash-command tree or trigger a
``tree.sync()`` call.
"""
try:
from hermes_cli.commands import discord_skill_commands_by_category
existing_names = set()
try:
existing_names = {cmd.name for cmd in tree.get_commands()}
except Exception:
pass
# Reuse the existing collector for consistent filtering
# (per-platform disabled, hub-excluded, name clamping), then
# flatten — the category grouping was only useful for the
# nested layout.
categories, uncategorized, hidden = discord_skill_commands_by_category(
reserved_names=existing_names,
)
entries: list[tuple[str, str, str]] = list(uncategorized)
for cat_skills in categories.values():
entries.extend(cat_skills)
# Populate the instance-level entries/lookup so the
# autocomplete + handler callbacks below always read the
# freshest state. refresh_skill_group() re-runs the same
# collector and mutates these two attributes in place.
self._skill_entries: list[tuple[str, str, str]] = []
self._skill_lookup: dict[str, tuple[str, str]] = {}
self._skill_group_reserved_names: set[str] = set(existing_names)
self._refresh_skill_catalog_state()
if not entries:
if not self._skill_entries:
return
# Stable alphabetical order so the autocomplete suggestion
# list is predictable across restarts.
entries.sort(key=lambda t: t[0])
# name -> (description, cmd_key) — used by both the autocomplete
# callback and the handler for O(1) dispatch.
skill_lookup: dict[str, tuple[str, str]] = {
n: (d, k) for n, d, k in entries
}
async def _autocomplete_name(
interaction: "discord.Interaction", current: str,
) -> list:
@@ -2627,10 +2908,29 @@ class DiscordAdapter(BasePlatformAdapter):
"/skill pdf" surfaces skills whose description mentions
PDFs even if the name doesn't. Discord caps this list at
25 entries per query.
Authorization: a quiet pre-check evaluates the slash
allowlists and returns ``[]`` for unauthorized users so
the installed skill catalog is not leaked to anyone who
can see the command in the picker. Returning a generic
empty list here is intentional: sending a per-keystroke
ephemeral rejection would produce a barrage of error
popups during typing.
Reads ``self._skill_entries`` so a ``/reload-skills`` run
since process start shows up on the very next keystroke.
"""
try:
allowed, _reason = self._evaluate_slash_authorization(interaction)
except Exception:
# Defensive: never raise from autocomplete. Fail
# closed by returning an empty suggestion list.
return []
if not allowed:
return []
q = (current or "").strip().lower()
choices: list = []
for name, desc, _key in entries:
for name, desc, _key in self._skill_entries:
if not q or q in name.lower() or (desc and q in desc.lower()):
if desc:
label = f"{name} - {desc}"
@@ -2654,7 +2954,13 @@ class DiscordAdapter(BasePlatformAdapter):
async def _skill_handler(
interaction: "discord.Interaction", name: str, args: str = "",
):
entry = skill_lookup.get(name)
# Authorize BEFORE any skill lookup so that known and
# unknown skill names produce identical rejections for
# unauthorized users (no probing the installed catalog
# via "Unknown skill: <name>" responses).
if not await self._check_slash_authorization(interaction, "/skill"):
return
entry = self._skill_lookup.get(name)
if not entry:
await interaction.response.send_message(
f"Unknown skill: `{name}`. Start typing for "
@@ -2676,16 +2982,74 @@ class DiscordAdapter(BasePlatformAdapter):
logger.info(
"[%s] Registered /skill command with %d skill(s) via autocomplete",
self.name, len(entries),
self.name, len(self._skill_entries),
)
if hidden:
if self._skill_group_hidden_count:
logger.info(
"[%s] %d skill(s) filtered out of /skill (name clamp / reserved)",
self.name, hidden,
self.name, self._skill_group_hidden_count,
)
except Exception as exc:
logger.warning("[%s] Failed to register /skill command: %s", self.name, exc)
def _refresh_skill_catalog_state(self) -> None:
"""Re-scan disk for skills and repopulate ``self._skill_entries``.
Called once from :meth:`_register_skill_group` at startup and
again from :meth:`refresh_skill_group` whenever the user runs
``/reload-skills``. No Discord API calls are made: autocomplete
and the handler both read from these instance attributes
directly, so an in-place mutation is sufficient.
"""
from hermes_cli.commands import discord_skill_commands_by_category
reserved = getattr(self, "_skill_group_reserved_names", set())
categories, uncategorized, hidden = discord_skill_commands_by_category(
reserved_names=set(reserved),
)
entries: list[tuple[str, str, str]] = list(uncategorized)
for cat_skills in categories.values():
entries.extend(cat_skills)
# Stable alphabetical order so the autocomplete suggestion
# list is predictable across restarts.
entries.sort(key=lambda t: t[0])
self._skill_entries = entries
self._skill_lookup = {n: (d, k) for n, d, k in entries}
self._skill_group_hidden_count = hidden
def refresh_skill_group(self) -> tuple[int, int]:
"""Rescan skills and update the live ``/skill`` autocomplete state.
Invoked by :meth:`gateway.run.GatewayOrchestrator._handle_reload_skills_command`
after :func:`agent.skill_commands.reload_skills` has refreshed
the in-process skill-command registry. Without this call, the
``/skill`` autocomplete dropdown keeps showing the list captured
at process start: new skills stay invisible and deleted skills
return an "Unknown skill" error when clicked.
Because autocomplete options are fetched dynamically by Discord,
we only need to mutate the entries/lookup attributes read by the
callbacks; no ``tree.sync()`` is required.
Returns ``(new_count, hidden_count)``.
"""
try:
self._refresh_skill_catalog_state()
except Exception as exc:
logger.warning(
"[%s] Failed to refresh /skill autocomplete after reload: %s",
self.name, exc,
)
return (len(getattr(self, "_skill_entries", [])), 0)
logger.info(
"[%s] Refreshed /skill autocomplete: %d skill(s) available (%d filtered)",
self.name,
len(self._skill_entries),
self._skill_group_hidden_count,
)
return (len(self._skill_entries), self._skill_group_hidden_count)
def _build_slash_event(self, interaction: discord.Interaction, text: str) -> MessageEvent:
"""Build a MessageEvent from a Discord slash command interaction."""
is_dm = isinstance(interaction.channel, discord.DMChannel)
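As a rough illustration of why mutating the instance attributes is enough, the sketch below models the catalog as a small class whose `suggest` method reads the live list on every call. `SkillCatalog`, `refresh`, and `suggest` are hypothetical names, not part of the adapter; only the substring matching rule and the 25-entry cap come from the diff.

```python
from typing import List, Tuple

# Cap on autocomplete choices per query, as stated in the docstring above.
MAX_CHOICES = 25


class SkillCatalog:
    """Hypothetical stand-in for the adapter's mutable skill state."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, str, str]] = []

    def refresh(self, scanned: List[Tuple[str, str, str]]) -> int:
        # Sort by name so suggestions are stable, then swap the list in.
        self.entries = sorted(scanned, key=lambda t: t[0])
        return len(self.entries)

    def suggest(self, current: str) -> List[str]:
        # Reads self.entries at call time, so a refresh is visible on the
        # very next query without re-registering anything.
        q = (current or "").strip().lower()
        matches = [
            name
            for name, desc, _key in self.entries
            if not q or q in name.lower() or (desc and q in desc.lower())
        ]
        return matches[:MAX_CHOICES]
```

Calling `refresh` with a new scan result and then `suggest` again returns the updated names, which mirrors what `refresh_skill_group` relies on.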
@@ -2743,6 +3107,9 @@ class DiscordAdapter(BasePlatformAdapter):
auto_archive_duration: int = 1440,
) -> None:
"""Create a Discord thread from a slash command and start a session in it."""
if not await self._check_slash_authorization(interaction, "/thread"):
return
await interaction.response.defer(ephemeral=True)
result = await self._create_thread(
interaction,
name=name,
@@ -3037,6 +3404,7 @@ class DiscordAdapter(BasePlatformAdapter):
view = ExecApprovalView(
session_key=session_key,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
@@ -3075,6 +3443,7 @@ class DiscordAdapter(BasePlatformAdapter):
session_key=session_key,
confirm_id=confirm_id,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
@@ -3109,6 +3478,7 @@ class DiscordAdapter(BasePlatformAdapter):
view = UpdatePromptView(
session_key=session_key,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
return SendResult(success=True, message_id=str(msg.id))
@@ -3166,6 +3536,7 @@ class DiscordAdapter(BasePlatformAdapter):
session_key=session_key,
on_model_selected=on_model_selected,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
@@ -3721,6 +4092,72 @@ class DiscordAdapter(BasePlatformAdapter):
# Discord UI Components (outside the adapter class)
# ---------------------------------------------------------------------------
def _component_check_auth(
interaction,
allowed_user_ids: Optional[set],
allowed_role_ids: Optional[set],
) -> bool:
"""Shared user-or-role OR semantics for component view button clicks.
Mirrors ``DiscordAdapter._is_allowed_user`` / the slash and on_message
gates so every Discord interaction surface honors the same trust
boundary. Component views (ExecApprovalView, SlashConfirmView,
UpdatePromptView, ModelPickerView) used to receive only
``allowed_user_ids``: in role-only deployments
(DISCORD_ALLOWED_ROLES set, DISCORD_ALLOWED_USERS empty) the user
set was empty and the legacy "no allowlist = allow everyone" branch
let any guild member click the buttons -- approving exec commands,
cancelling slash confirmations, switching the model.
Behavior:
- both allowlists empty -> allow (preserves existing no-allowlist
deployments, no regression)
- user is in user allowlist -> allow
- role allowlist set + user has a role in it -> allow
- role allowlist set + interaction.user has no resolvable
``roles`` attribute (e.g. DM context with a role policy active)
-> reject (fail closed)
- otherwise -> reject
"""
user_set = allowed_user_ids or set()
role_set = allowed_role_ids or set()
has_users = bool(user_set)
has_roles = bool(role_set)
if not has_users and not has_roles:
return True
user = getattr(interaction, "user", None)
if user is None:
return False
if has_users:
try:
uid = str(user.id)
except AttributeError:
uid = ""
if uid and uid in user_set:
return True
if has_roles:
roles_attr = getattr(user, "roles", None)
if roles_attr is None:
# Role policy is configured but the interaction doesn't
# carry role data (DM-context Member, raw User payload).
# Fail closed: a user without a resolvable role list cannot
# satisfy a role allowlist.
return False
try:
user_role_ids = {getattr(r, "id", None) for r in roles_attr}
except TypeError:
return False
if user_role_ids & role_set:
return True
return False
if DISCORD_AVAILABLE:
class ExecApprovalView(discord.ui.View):
@@ -3733,17 +4170,23 @@ if DISCORD_AVAILABLE:
Only users in the allowed list can click. Times out after 5 minutes.
"""
def __init__(self, session_key: str, allowed_user_ids: set):
def __init__(
self,
session_key: str,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=300) # 5-minute timeout
self.session_key = session_key
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
def _check_auth(self, interaction: discord.Interaction) -> bool:
"""Verify the user clicking is authorized."""
if not self.allowed_user_ids:
return True # No allowlist = anyone can approve
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
async def _resolve(
self, interaction: discord.Interaction, choice: str,
@@ -3835,17 +4278,24 @@ if DISCORD_AVAILABLE:
5 minutes (matches the gateway primitive's timeout).
"""
def __init__(self, session_key: str, confirm_id: str, allowed_user_ids: set):
def __init__(
self,
session_key: str,
confirm_id: str,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=300)
self.session_key = session_key
self.confirm_id = confirm_id
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
def _check_auth(self, interaction: discord.Interaction) -> bool:
if not self.allowed_user_ids:
return True
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
async def _resolve(
self, interaction: discord.Interaction, choice: str,
@@ -3923,16 +4373,22 @@ if DISCORD_AVAILABLE:
5-minute timeout on its side).
"""
def __init__(self, session_key: str, allowed_user_ids: set):
def __init__(
self,
session_key: str,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=300)
self.session_key = session_key
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
def _check_auth(self, interaction: discord.Interaction) -> bool:
if not self.allowed_user_ids:
return True
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
async def _respond(
self, interaction: discord.Interaction, answer: str,
@@ -4009,6 +4465,7 @@ if DISCORD_AVAILABLE:
session_key: str,
on_model_selected,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=120)
self.providers = providers
@@ -4017,15 +4474,16 @@ if DISCORD_AVAILABLE:
self.session_key = session_key
self.on_model_selected = on_model_selected
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
self._selected_provider: str = ""
self._build_provider_select()
def _check_auth(self, interaction: discord.Interaction) -> bool:
if not self.allowed_user_ids:
return True
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
def _build_provider_select(self):
"""Build the provider dropdown menu."""
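The user-or-role semantics listed in the `_component_check_auth` docstring can be exercised in isolation. The following is a condensed sketch, not the adapter's code: `check_auth` is a hypothetical name, the stub objects are built with `SimpleNamespace`, and it assumes user IDs are stored as strings and role IDs as integers.

```python
from types import SimpleNamespace
from typing import Optional, Set


def check_auth(
    user,
    allowed_user_ids: Optional[Set[str]],
    allowed_role_ids: Optional[Set[int]],
) -> bool:
    """Condensed user-or-role check (illustrative only)."""
    user_set = allowed_user_ids or set()
    role_set = allowed_role_ids or set()
    if not user_set and not role_set:
        return True  # no allowlist configured: allow everyone
    if user is None:
        return False
    if str(getattr(user, "id", "")) in user_set:
        return True
    if role_set:
        roles = getattr(user, "roles", None)
        if roles is None:
            return False  # role policy active but no role data: fail closed
        return bool({getattr(r, "id", None) for r in roles} & role_set)
    return False


# Stub objects standing in for a guild member and a DM user.
member = SimpleNamespace(id=1, roles=[SimpleNamespace(id=10)])
dm_user = SimpleNamespace(id=2)  # no ``roles`` attribute
```

With a role-only policy such as `{10}`, `member` is allowed while `dm_user` is rejected. In the old code path the empty user allowlist would have allowed both.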
+7 -2
@@ -2922,13 +2922,18 @@ class FeishuAdapter(BasePlatformAdapter):
},
)
response.raise_for_status()
# Snapshot Content-Type and body while the client context is
# still active so pooled connections fully release on exit.
# See #18451.
content_type_hdr = str(response.headers.get("Content-Type", ""))
body = response.content
filename = self._derive_remote_filename(
file_url,
content_type=str(response.headers.get("Content-Type", "")),
content_type=content_type_hdr,
default_name=preferred_name,
default_ext=default_ext,
)
cached_path = cache_document_from_bytes(response.content, filename)
cached_path = cache_document_from_bytes(body, filename)
return cached_path, filename
@staticmethod
+1 -1
@@ -139,7 +139,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):
async def _ws_connect(self) -> bool:
"""Establish WebSocket connection and authenticate."""
ws_url = self._hass_url.replace("http://", "ws://").replace("https://", "wss://")
ws_url = self._hass_url.replace("https://", "wss://").replace("http://", "ws://")
ws_url = f"{ws_url}/api/websocket"
self._session = aiohttp.ClientSession(
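A minimal sketch of the scheme rewrite, with `to_ws_url` as a hypothetical helper name. Because `http://` is not a substring of `https://`, either replacement order gives the same result for a URL with a single scheme prefix; replacing the longer prefix first simply makes that independence obvious.

```python
def to_ws_url(base_url: str) -> str:
    """Map an HTTP(S) base URL to the matching WebSocket endpoint."""
    # Replace the longer, more specific prefix first.
    ws_url = base_url.replace("https://", "wss://").replace("http://", "ws://")
    return f"{ws_url}/api/websocket"
```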
+4
@@ -243,10 +243,14 @@ class QQAdapter(BasePlatformAdapter):
return False
try:
# Tighter keepalive pool so idle CLOSE_WAIT sockets drain
# faster behind proxies like Cloudflare Warp (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0,
follow_redirects=True,
event_hooks={"response": [_ssrf_redirect_guard]},
limits=platform_httpx_limits(),
)
# 1. Get access token
+34 -1
@@ -192,6 +192,15 @@ class SignalAdapter(BasePlatformAdapter):
group_allowed_str = os.getenv("SIGNAL_GROUP_ALLOWED_USERS", "")
self.group_allow_from = set(_parse_comma_list(group_allowed_str))
# DM allowlist — mirrors SIGNAL_ALLOWED_USERS checked by run.py.
# Stored here so the reaction hooks can skip unauthorized senders
# (reactions fire before run.py's auth gate, so without this check
# every inbound DM from any contact gets a 👀 reaction).
# "*" means all users allowed (open mode); empty means no restriction
# recorded at adapter level (run.py still enforces auth separately).
dm_allowed_str = os.getenv("SIGNAL_ALLOWED_USERS", "*")
self.dm_allow_from = set(_parse_comma_list(dm_allowed_str))
# HTTP client
self.client: Optional[httpx.AsyncClient] = None
@@ -248,7 +257,9 @@ class SignalAdapter(BasePlatformAdapter):
except Exception as e:
logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)
self.client = httpx.AsyncClient(timeout=30.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
try:
# Health check — verify signal-cli daemon is reachable
try:
@@ -1428,8 +1439,28 @@ class SignalAdapter(BasePlatformAdapter):
return None
return (author, ts)
def _reactions_enabled(self, event: "MessageEvent" = None) -> bool:
"""Check if message reactions are enabled for this event.
Two gates:
1. SIGNAL_REACTIONS env var: set to false/0/no to disable globally.
2. DM allowlist: if SIGNAL_ALLOWED_USERS is set, only react to
messages from senders in that list. This prevents unauthorized
contacts from seeing the 👀 reaction (which fires before run.py's
auth gate and would otherwise reveal that a bot is listening).
"""
if os.getenv("SIGNAL_REACTIONS", "true").lower() in ("false", "0", "no"):
return False
if event is not None:
sender = getattr(getattr(event, "source", None), "user_id", None)
if sender and "*" not in self.dm_allow_from and sender not in self.dm_allow_from:
return False
return True
async def on_processing_start(self, event: MessageEvent) -> None:
"""React with 👀 when processing begins."""
if not self._reactions_enabled(event):
return
target = self._extract_reaction_target(event)
if target:
await self.send_reaction(event.source.chat_id, "👀", *target)
@@ -1440,6 +1471,8 @@ class SignalAdapter(BasePlatformAdapter):
On CANCELLED we leave the 👀 in place: no terminal outcome means
the reaction should keep reflecting "in progress" (matches Telegram).
"""
if not self._reactions_enabled(event):
return
if outcome == ProcessingOutcome.CANCELLED:
return
target = self._extract_reaction_target(event)
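The two gates in `_reactions_enabled` can be sketched as a pure function for testing. `reactions_enabled` and its parameters are hypothetical; the environment is passed in as a mapping instead of being read from `os.environ`.

```python
from typing import Mapping, Optional, Set


def reactions_enabled(
    env: Mapping[str, str],
    dm_allow_from: Set[str],
    sender: Optional[str],
) -> bool:
    """Return True when a reaction may be sent for this sender."""
    # Gate 1: global kill switch.
    if env.get("SIGNAL_REACTIONS", "true").lower() in ("false", "0", "no"):
        return False
    # Gate 2: skip senders outside the DM allowlist unless it is open ("*").
    if sender and "*" not in dm_allow_from and sender not in dm_allow_from:
        return False
    return True
```

One consequence of this logic worth noting: an allowlist set that is empty (rather than containing `"*"`) suppresses reactions for every identified sender.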
+15
@@ -528,6 +528,21 @@ class SlackAdapter(BasePlatformAdapter):
return False
lock_acquired = True
# Close any previous handler before creating a new one so that
# calling connect() a second time (e.g. during a gateway restart or
# in-process reconnect attempt) does not leave a zombie Socket Mode
# connection alive. Both the old and new connections would otherwise
# receive every Slack event and dispatch it twice, producing double
# responses — the same bug that affected DiscordAdapter (#18187).
if self._handler is not None:
try:
await self._handler.close_async()
except Exception:
logger.debug("[%s] Failed to close previous Slack handler", self.name)
finally:
self._handler = None
self._app = None
# First token is the primary — used for AsyncApp / Socket Mode
primary_token = bot_tokens[0]
self._app = AsyncApp(token=primary_token)
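The close-before-reconnect pattern can be shown without Slack at all. In this sketch `FakeHandler` and `Connector` are hypothetical stand-ins; only the `close_async()` call and the try/except/finally shape mirror the diff.

```python
import asyncio
from typing import Optional, Tuple


class FakeHandler:
    """Stand-in for a Socket Mode handler (illustrative only)."""

    def __init__(self) -> None:
        self.closed = False

    async def close_async(self) -> None:
        self.closed = True


class Connector:
    def __init__(self) -> None:
        self._handler: Optional[FakeHandler] = None

    async def connect(self) -> FakeHandler:
        # Tear down any previous handler so two connections never
        # receive (and dispatch) the same events.
        if self._handler is not None:
            try:
                await self._handler.close_async()
            except Exception:
                pass  # best effort; a failed close must not block reconnect
            finally:
                self._handler = None
        self._handler = FakeHandler()
        return self._handler


async def demo() -> Tuple[bool, bool]:
    connector = Connector()
    first = await connector.connect()
    second = await connector.connect()
    return first.closed, second.closed
```

Running `asyncio.run(demo())` shows the first handler closed and the second still open after a repeated `connect()`.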
+55
@@ -512,6 +512,17 @@ class TelegramAdapter(BasePlatformAdapter):
self.name, attempt,
)
self._polling_network_error_count = 0
# start_polling() returning is necessary but not sufficient:
# PTB's Updater can be left in a state where `running` is True
# but the underlying long-poll task is wedged on a stale httpx
# connection and never makes progress. No error_callback fires
# in that state, so the reconnect ladder won't advance on its
# own. Schedule a deferred probe to detect the wedge and
# re-enter the ladder if needed.
if not self.has_fatal_error:
probe = asyncio.ensure_future(self._verify_polling_after_reconnect())
self._background_tasks.add(probe)
probe.add_done_callback(self._background_tasks.discard)
except Exception as retry_err:
logger.warning("[%s] Telegram polling reconnect failed: %s", self.name, retry_err)
# start_polling failed — polling is dead and no further error
@@ -523,6 +534,50 @@ class TelegramAdapter(BasePlatformAdapter):
self._background_tasks.add(task)
task.add_done_callback(self._background_tasks.discard)
async def _verify_polling_after_reconnect(self) -> None:
"""Heartbeat probe scheduled after a successful reconnect.
PTB's Updater can survive a botched stop()+start_polling() cycle
with `running=True` but a wedged consumer task. No error callback
fires, so the reconnect ladder doesn't advance on its own. This
probe detects the wedge by:
1. Sleeping HEARTBEAT_PROBE_DELAY so a healthy long-poll has time
to complete at least one cycle.
2. Verifying `Updater.running` is still True.
3. Probing the bot endpoint with a tight asyncio timeout. A
wedged httpx pool fails this probe; a healthy one returns
well under the timeout.
On any failure, re-enter the reconnect ladder so the existing
MAX_NETWORK_RETRIES path can ultimately escalate to fatal-error.
"""
HEARTBEAT_PROBE_DELAY = 60
PROBE_TIMEOUT = 10
await asyncio.sleep(HEARTBEAT_PROBE_DELAY)
if self.has_fatal_error:
return
if not (self._app and self._app.updater and self._app.updater.running):
logger.warning(
"[%s] Updater not running %ds after reconnect — treating as wedged",
self.name, HEARTBEAT_PROBE_DELAY,
)
await self._handle_polling_network_error(
RuntimeError("Updater not running after reconnect heartbeat")
)
return
try:
await asyncio.wait_for(self._app.bot.get_me(), PROBE_TIMEOUT)
except Exception as probe_err:
logger.warning(
"[%s] Polling heartbeat probe failed %ds after reconnect: %s",
self.name, HEARTBEAT_PROBE_DELAY, probe_err,
)
await self._handle_polling_network_error(probe_err)
async def _handle_polling_conflict(self, error: Exception) -> None:
if self.has_fatal_error and self.fatal_error_code == "telegram_polling_conflict":
return
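The core of the heartbeat probe is a bounded wait on a cheap API call. The sketch below isolates that step; `probe_healthy`, `healthy_call`, and `wedged_call` are hypothetical names, and the coroutines only simulate a responsive and a stuck endpoint.

```python
import asyncio
from typing import Awaitable, Callable


async def probe_healthy(
    probe: Callable[[], Awaitable[object]], timeout: float
) -> bool:
    """Return True if the probe completes within ``timeout`` seconds."""
    try:
        await asyncio.wait_for(probe(), timeout)
    except Exception:
        # Timeouts and transport errors are both treated as "wedged".
        return False
    return True


async def healthy_call() -> str:
    return "ok"


async def wedged_call() -> None:
    # Simulates a request stuck on a stale connection.
    await asyncio.sleep(3600)
```

`asyncio.wait_for` cancels the pending probe when the timeout expires, so a stuck call does not linger as a stray task.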
+5 -1
@@ -206,7 +206,11 @@ class WeComAdapter(BasePlatformAdapter):
return False
try:
self._http_client = httpx.AsyncClient(timeout=30.0, follow_redirects=True)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0, follow_redirects=True, limits=platform_httpx_limits(),
)
await self._open_connection()
self._mark_connected()
self._listen_task = asyncio.create_task(self._listen_loop())
+3 -1
@@ -119,7 +119,9 @@ class WecomCallbackAdapter(BasePlatformAdapter):
pass
try:
self._http_client = httpx.AsyncClient(timeout=20.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(timeout=20.0, limits=platform_httpx_limits())
self._app = web.Application()
self._app.router.add_get("/health", self._handle_health)
self._app.router.add_get(self._path, self._handle_verify)
+3 -1
@@ -2030,7 +2030,9 @@ async def send_weixin_direct(
live_adapter = _LIVE_ADAPTERS.get(resolved_token)
send_session = getattr(live_adapter, '_send_session', None)
if live_adapter is not None and send_session is not None and not send_session.closed:
if (live_adapter is not None and send_session is not None
and not send_session.closed
and send_session._loop is asyncio.get_running_loop()):
last_result: Optional[SendResult] = None
cleaned = live_adapter.format_message(message)
if cleaned:
+32 -2
@@ -185,6 +185,13 @@ class WhatsAppAdapter(BasePlatformAdapter):
self._bridge_log: Optional[Path] = None
self._poll_task: Optional[asyncio.Task] = None
self._http_session: Optional["aiohttp.ClientSession"] = None
# Set to True by disconnect() before we SIGTERM our child bridge so
# _check_managed_bridge_exit() can distinguish an intentional
# shutdown-time exit (returncode -15 / -2 / 0) from a real crash.
# Without this, every graceful gateway shutdown/restart would log
# "Fatal whatsapp adapter error" plus dispatch a fatal-error
# notification before the normal "✓ whatsapp disconnected" fires.
self._shutting_down: bool = False
def _whatsapp_require_mention(self) -> bool:
configured = self.config.extra.get("require_mention")
@@ -555,6 +562,21 @@ class WhatsAppAdapter(BasePlatformAdapter):
if returncode is None:
return None
# Planned shutdown: disconnect() sets _shutting_down before it sends
# SIGTERM to the bridge, so a returncode of -15 (SIGTERM), -2 (SIGINT),
# or 0 (clean exit) at that point is expected, not a crash. Treat it
# as informational and skip the fatal-error path.
# getattr-with-default keeps tests that construct the adapter via
# ``WhatsAppAdapter.__new__`` (bypassing __init__) working without
# every _make_adapter() helper having to seed the attribute.
if getattr(self, "_shutting_down", False) and returncode in (0, -2, -15):
logger.info(
"[%s] Bridge exited during shutdown (code %d).",
self.name,
returncode,
)
return None
message = f"WhatsApp bridge process exited unexpectedly (code {returncode})."
if not self.has_fatal_error:
logger.error("[%s] %s", self.name, message)
@@ -565,6 +587,10 @@ class WhatsAppAdapter(BasePlatformAdapter):
async def disconnect(self) -> None:
"""Stop the WhatsApp bridge and clean up any orphaned processes."""
# Flip the shutdown flag BEFORE signalling the child so the exit-check
# path (which runs from other tasks like send() and the poll loop)
# doesn't race us and report the intentional termination as fatal.
self._shutting_down = True
if self._bridge_process:
try:
try:
@@ -876,11 +902,15 @@ class WhatsAppAdapter(BasePlatformAdapter):
try:
import aiohttp
await self._http_session.post(
# Must wrap in `async with` — a bare `await session.post(...)`
# leaves the response object alive until GC, holding its TCP
# socket in CLOSE_WAIT. See #18451.
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/typing",
json={"chatId": chat_id},
timeout=aiohttp.ClientTimeout(total=5)
)
):
pass
except Exception:
pass # Ignore typing indicator failures
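The shutdown-flag logic in `_check_managed_bridge_exit` reduces to a small decision table. The sketch below expresses it as a pure function; `classify_exit` and its string results are hypothetical, while the return codes (0, -2, -15) are the ones named in the diff.

```python
from typing import Optional

# Exit codes that are expected once a shutdown has been requested:
# clean exit, SIGINT, SIGTERM.
EXPECTED_SHUTDOWN_CODES = (0, -2, -15)


def classify_exit(returncode: Optional[int], shutting_down: bool) -> str:
    """Classify a child process state as running, expected, or crash."""
    if returncode is None:
        return "running"
    if shutting_down and returncode in EXPECTED_SHUTDOWN_CODES:
        return "expected"
    return "crash"
```

The same return code (-15) is a crash when the flag is unset and an expected exit once it is set, which is why the flag must be flipped before the signal is sent.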
+461 -87
@@ -15,6 +15,7 @@ Usage:
import asyncio
import dataclasses
import inspect
import json
import logging
import os
@@ -48,6 +49,29 @@ from hermes_cli.config import cfg_get
_AGENT_CACHE_MAX_SIZE = 128
_AGENT_CACHE_IDLE_TTL_SECS = 3600.0 # evict agents idle for >1h
_PLATFORM_CONNECT_TIMEOUT_SECS_DEFAULT = 30.0
_TELEGRAM_COMMAND_MENTION_RE = re.compile(r"(?<![\w:/])/([A-Za-z0-9][A-Za-z0-9_-]*)")
def _telegramize_command_mentions(text: str, platform: Any) -> str:
"""Rewrite slash-command mentions to Telegram-valid command names.
Telegram Bot API command names allow only lowercase letters, digits, and
underscores. Keep other platform renderings unchanged, but normalize
Telegram help text so command mentions remain clickable/valid there.
"""
platform_value = getattr(platform, "value", platform)
if platform_value != "telegram":
return text
from hermes_cli.commands import _sanitize_telegram_name
def _replace(match: re.Match[str]) -> str:
sanitized = _sanitize_telegram_name(match.group(1))
return f"/{sanitized}" if sanitized else match.group(0)
return _TELEGRAM_COMMAND_MENTION_RE.sub(_replace, text)
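To see what the lookbehind in the pattern protects, here is a self-contained sketch. The regular expression is copied from the diff; `sanitize_name` is a hypothetical stand-in for `_sanitize_telegram_name` (whose real behavior is not shown here) that lowercases, maps hyphens to underscores, and drops other characters.

```python
import re

_COMMAND_MENTION_RE = re.compile(r"(?<![\w:/])/([A-Za-z0-9][A-Za-z0-9_-]*)")


def sanitize_name(raw: str) -> str:
    """Assumed sanitizer: lowercase letters, digits, underscores only."""
    return re.sub(r"[^a-z0-9_]", "", raw.lower().replace("-", "_"))


def telegramize(text: str) -> str:
    def _replace(match: "re.Match[str]") -> str:
        sanitized = sanitize_name(match.group(1))
        # Leave the mention untouched if nothing valid remains.
        return f"/{sanitized}" if sanitized else match.group(0)

    return _COMMAND_MENTION_RE.sub(_replace, text)
```

The negative lookbehind rejects a slash preceded by a word character, a colon, or another slash, so URL paths such as `https://example.com/a-b` pass through unchanged while a standalone `/reload-skills` is rewritten.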
# Only auto-continue interrupted gateway turns while the interruption is fresh.
# Stale tool-tail/resume markers can otherwise revive an unrelated old task
# after a gateway restart when the user's next message starts new work.
@@ -282,6 +306,20 @@ def _home_target_env_var(platform_name: str) -> str:
)
def _home_thread_env_var(platform_name: str) -> str:
"""Return the optional thread/topic env var for a platform home target."""
return f"{_home_target_env_var(platform_name)}_THREAD_ID"
def _restart_notification_pending() -> bool:
"""Return True when a /restart completion marker is waiting to be delivered."""
return (_hermes_home / ".restart_notify.json").exists()
# Mark this process as a gateway so cli.py's module-level load_cli_config()
# knows not to clobber TERMINAL_CWD if lazily imported.
os.environ["_HERMES_GATEWAY"] = "1"
_ensure_ssl_certs()
# Add parent directory to path
@@ -406,37 +444,37 @@ if _config_path.exists():
os.environ[_env_map["base_url"]] = _base_url
if _api_key:
os.environ[_env_map["api_key"]] = _api_key
# config.yaml is the documented, authoritative source for these
# settings — it unconditionally wins over .env values. Previously
# the guards below read `if X not in os.environ` and let stale
# .env entries (e.g. HERMES_MAX_ITERATIONS=60 written by an old
# `hermes setup` run) silently shadow the user's current config.
# See PR #18413 / the 60-vs-500 max_turns incident.
_agent_cfg = _cfg.get("agent", {})
if _agent_cfg and isinstance(_agent_cfg, dict):
if "max_turns" in _agent_cfg:
os.environ["HERMES_MAX_ITERATIONS"] = str(_agent_cfg["max_turns"])
# Bridge agent.gateway_timeout → HERMES_AGENT_TIMEOUT env var.
# Env var from .env takes precedence (already in os.environ).
if "gateway_timeout" in _agent_cfg and "HERMES_AGENT_TIMEOUT" not in os.environ:
if "gateway_timeout" in _agent_cfg:
os.environ["HERMES_AGENT_TIMEOUT"] = str(_agent_cfg["gateway_timeout"])
if "gateway_timeout_warning" in _agent_cfg and "HERMES_AGENT_TIMEOUT_WARNING" not in os.environ:
if "gateway_timeout_warning" in _agent_cfg:
os.environ["HERMES_AGENT_TIMEOUT_WARNING"] = str(_agent_cfg["gateway_timeout_warning"])
if "gateway_notify_interval" in _agent_cfg and "HERMES_AGENT_NOTIFY_INTERVAL" not in os.environ:
if "gateway_notify_interval" in _agent_cfg:
os.environ["HERMES_AGENT_NOTIFY_INTERVAL"] = str(_agent_cfg["gateway_notify_interval"])
if "restart_drain_timeout" in _agent_cfg and "HERMES_RESTART_DRAIN_TIMEOUT" not in os.environ:
if "restart_drain_timeout" in _agent_cfg:
os.environ["HERMES_RESTART_DRAIN_TIMEOUT"] = str(_agent_cfg["restart_drain_timeout"])
if (
"gateway_auto_continue_freshness" in _agent_cfg
and "HERMES_AUTO_CONTINUE_FRESHNESS" not in os.environ
):
if "gateway_auto_continue_freshness" in _agent_cfg:
os.environ["HERMES_AUTO_CONTINUE_FRESHNESS"] = str(
_agent_cfg["gateway_auto_continue_freshness"]
)
_display_cfg = _cfg.get("display", {})
if _display_cfg and isinstance(_display_cfg, dict):
if "busy_input_mode" in _display_cfg and "HERMES_GATEWAY_BUSY_INPUT_MODE" not in os.environ:
if "busy_input_mode" in _display_cfg:
os.environ["HERMES_GATEWAY_BUSY_INPUT_MODE"] = str(_display_cfg["busy_input_mode"])
if "busy_ack_enabled" in _display_cfg and "HERMES_GATEWAY_BUSY_ACK_ENABLED" not in os.environ:
if "busy_ack_enabled" in _display_cfg:
os.environ["HERMES_GATEWAY_BUSY_ACK_ENABLED"] = str(_display_cfg["busy_ack_enabled"])
# Timezone: bridge config.yaml → HERMES_TIMEZONE env var.
# HERMES_TIMEZONE from .env takes precedence (already in os.environ).
_tz_cfg = _cfg.get("timezone", "")
if _tz_cfg and isinstance(_tz_cfg, str) and "HERMES_TIMEZONE" not in os.environ:
if _tz_cfg and isinstance(_tz_cfg, str):
os.environ["HERMES_TIMEZONE"] = _tz_cfg.strip()
# Security settings
_security_cfg = _cfg.get("security", {})
@@ -444,8 +482,24 @@ if _config_path.exists():
_redact = _security_cfg.get("redact_secrets")
if _redact is not None:
os.environ["HERMES_REDACT_SECRETS"] = str(_redact).lower()
except Exception:
pass # Non-fatal; gateway can still run with .env values
except Exception as _bridge_err:
# Previously this was silent (`except Exception: pass`), which
# hid partial bridge failures and let .env defaults shadow
# config.yaml values — users observed max_turns=500 in config
# but a 60-iteration cap in practice. Surface the failure to
# stderr so operators see it even though `logger` is not yet
# initialized at module-import time (logger is defined further
# down this module).
print(
f" Warning: config.yaml → env bridge failed: "
f"{type(_bridge_err).__name__}: {_bridge_err}",
file=sys.stderr,
)
print(
" Gateway will fall back to .env values, which may not match "
"your current config.yaml. Run `hermes doctor` to investigate.",
file=sys.stderr,
)
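The precedence change (config.yaml now always overrides `.env` for these keys) can be sketched with plain dictionaries. `bridge_agent_config` and `AGENT_KEY_TO_ENV` are hypothetical names, and only two of the keys from the diff are included for brevity.

```python
from typing import Any, Dict, MutableMapping

# Subset of the config.yaml -> env var mapping shown in the diff.
AGENT_KEY_TO_ENV = {
    "max_turns": "HERMES_MAX_ITERATIONS",
    "gateway_timeout": "HERMES_AGENT_TIMEOUT",
}


def bridge_agent_config(cfg: Dict[str, Any], env: MutableMapping[str, str]) -> None:
    """Copy agent settings into env, overwriting any existing values."""
    agent_cfg = cfg.get("agent", {})
    if not isinstance(agent_cfg, dict):
        return
    for key, env_name in AGENT_KEY_TO_ENV.items():
        if key in agent_cfg:
            # No "not in env" guard: config.yaml wins unconditionally.
            env[env_name] = str(agent_cfg[key])
```

Keys absent from the config leave the existing environment value in place, so `.env` still acts as the fallback.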
# Apply IPv4 preference if configured (before any HTTP clients are created).
try:
@@ -490,6 +544,8 @@ from gateway.config import (
Platform,
_BUILTIN_PLATFORM_VALUES,
GatewayConfig,
HomeChannel,
PlatformConfig,
load_gateway_config,
)
from gateway.session import (
@@ -673,11 +729,69 @@ def _is_control_interrupt_message(message: Optional[str]) -> bool:
return normalized in _CONTROL_INTERRUPT_MESSAGES
def _skill_slug_from_frontmatter(skill_md: Path) -> tuple[str | None, str | None]:
"""Derive the /command slug and declared frontmatter name from a SKILL.md.
Matches the exact normalization used by
:func:`agent.skill_commands.scan_skill_commands` so the slug here is the
same string a user types after the leading ``/`` (e.g. a skill with
frontmatter ``name: Stable Diffusion Image Generation`` resolves to
``stable-diffusion-image-generation``, NOT the parent directory name,
which is commonly shorter/different, e.g. ``stable-diffusion``).
Using the directory name silently broke :func:`_check_unavailable_skill`
for every skill whose directory name drifted from its frontmatter name
(19 such skills on a standard install as of 2026-05), causing a generic
"unknown command" response where a "disabled — enable with …" or
"not installed — install with …" hint was expected.
Returns ``(slug, declared_name)`` or ``(None, None)`` when the file
can't be read or lacks a ``name:`` in its frontmatter.
"""
try:
content = skill_md.read_text(encoding="utf-8", errors="replace")
except Exception:
return None, None
if not content.startswith("---"):
return None, None
end = content.find("\n---", 3)
if end < 0:
return None, None
declared_name: str | None = None
for line in content[3:end].splitlines():
line = line.strip()
if line.startswith("name:"):
raw = line.split(":", 1)[1].strip()
# Strip YAML quote wrappers if present
if len(raw) >= 2 and raw[0] == raw[-1] and raw[0] in ('"', "'"):
raw = raw[1:-1]
declared_name = raw.strip()
break
if not declared_name:
return None, None
slug = declared_name.lower().replace(" ", "-").replace("_", "-")
# Mirror _SKILL_INVALID_CHARS and _SKILL_MULTI_HYPHEN from skill_commands
import re as _re
slug = _re.sub(r"[^a-z0-9-]", "", slug)
slug = _re.sub(r"-{2,}", "-", slug).strip("-")
if not slug:
return None, declared_name
return slug, declared_name
def _check_unavailable_skill(command_name: str) -> str | None:
"""Check if a command matches a known-but-inactive skill.
Returns a helpful message if the skill exists but is disabled or only
available as an optional install. Returns None if no match found.
The slug for each on-disk skill is derived from its frontmatter ``name:``
(via :func:`_skill_slug_from_frontmatter`), NOT from its containing
directory name, because the two can differ (e.g. directory
``stable-diffusion`` + frontmatter ``Stable Diffusion Image Generation``
yields slug ``stable-diffusion-image-generation``). Matching on
directory name would miss that slug entirely and fall through to the
generic "unknown command" path.
"""
# Normalize: command uses hyphens, skill names may use hyphens or underscores
normalized = command_name.lower().replace("_", "-")
@@ -693,8 +807,12 @@ def _check_unavailable_skill(command_name: str) -> str | None:
for skill_md in skills_dir.rglob("SKILL.md"):
if any(part in ('.git', '.github', '.hub', '.archive') for part in skill_md.parts):
continue
name = skill_md.parent.name.lower().replace("_", "-")
if name == normalized and name in disabled:
slug, declared_name = _skill_slug_from_frontmatter(skill_md)
if not slug or not declared_name:
continue
# disabled is keyed by the declared frontmatter name (what
# skills.disabled / skills.platform_disabled store).
if slug == normalized and declared_name in disabled:
return (
f"The **{command_name}** skill is installed but disabled.\n"
f"Enable it with: `hermes skills config`"
@@ -706,8 +824,10 @@ def _check_unavailable_skill(command_name: str) -> str | None:
optional_dir = get_optional_skills_dir(repo_root / "optional-skills")
if optional_dir.exists():
for skill_md in optional_dir.rglob("SKILL.md"):
name = skill_md.parent.name.lower().replace("_", "-")
if name == normalized:
slug, _declared = _skill_slug_from_frontmatter(skill_md)
if not slug:
continue
if slug == normalized:
# Build install path: official/<category>/<name>
rel = skill_md.parent.relative_to(optional_dir)
parts = list(rel.parts)
@@ -1068,6 +1188,10 @@ class GatewayRunner:
# Per-chat voice reply mode: "off" | "voice_only" | "all"
self._voice_mode: Dict[str, str] = self._load_voice_modes()
# Recent voice transcripts per (guild,user) for duplicate suppression.
# Protects against the same utterance being emitted twice by the voice
# capture / STT pipeline, which otherwise produces a second delayed reply.
self._recent_voice_transcripts: Dict[tuple[int, int], List[tuple[float, str]]] = {}
# Track background tasks to prevent garbage collection mid-execution
self._background_tasks: set = set()
@@ -2176,15 +2300,13 @@ class GatewayRunner:
logger.debug("Failed interrupting agent during shutdown: %s", e)
async def _notify_active_sessions_of_shutdown(self) -> None:
"""Send a notification to every chat with an active agent.
"""Send shutdown/restart notifications to active chats and home channels.
Called at the very start of stop(), while adapters are still connected, so
messages can be delivered. Best-effort: individual send failures are
messages can be delivered. Best-effort: individual send failures are
logged and swallowed so they never block the shutdown sequence.
"""
active = self._snapshot_running_agents()
if not active:
return
action = "restarting" if self._restart_requested else "shutting down"
hint = (
@@ -2195,7 +2317,7 @@ class GatewayRunner:
)
msg = f"⚠️ Gateway {action}{hint}"
notified: set = set()
notified: set[tuple[str, str, Optional[str]]] = set()
for session_key in active:
source = None
try:
@@ -2212,7 +2334,7 @@ class GatewayRunner:
if source is not None:
platform_str = source.platform.value
chat_id = source.chat_id
chat_id = str(source.chat_id)
thread_id = source.thread_id
else:
# Fall back to parsing the session key when no persisted
@@ -2224,9 +2346,10 @@ class GatewayRunner:
chat_id = _parsed["chat_id"]
thread_id = _parsed.get("thread_id")
# Deduplicate: one notification per chat, even if multiple
# sessions (different users/threads) share the same chat.
dedup_key = (platform_str, chat_id)
# Deduplicate only identical delivery targets. Thread/topic-aware
# platforms can share a parent chat while still routing to distinct
# destinations via metadata.
dedup_key = (platform_str, chat_id, str(thread_id) if thread_id else None)
if dedup_key in notified:
continue
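Review note: a minimal sketch of the thread-aware dedup key introduced above. `make_dedup_key` is a hypothetical helper name; the tuple shape and the string coercion follow the hunk.

```python
from typing import Optional, Tuple

DedupKey = Tuple[str, str, Optional[str]]


def make_dedup_key(platform: str, chat_id, thread_id=None) -> DedupKey:
    """Build a key that is equal only for identical delivery targets."""
    # Coerce ids to str so 42 and "42" are treated as the same chat.
    return (platform, str(chat_id), str(thread_id) if thread_id else None)


notified = set()
targets = [("telegram", -100123, 7), ("telegram", "-100123", 8), ("telegram", -100123, "7")]
for platform, chat_id, thread_id in targets:
    key = make_dedup_key(platform, chat_id, thread_id)
    if key in notified:
        continue  # same chat AND same thread: already notified
    notified.add(key)
```

With the old two-element key, the second target (a different topic in the same chat) would have been skipped; with the three-element key only the third, truly identical target is.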
@@ -2240,10 +2363,19 @@ class GatewayRunner:
# correct forum topic / thread.
metadata = {"thread_id": thread_id} if thread_id else None
await adapter.send(chat_id, msg, metadata=metadata)
result = await adapter.send(chat_id, msg, metadata=metadata)
if result is not None and getattr(result, "success", True) is False:
logger.debug(
"Failed to send shutdown notification to %s:%s: %s",
platform_str,
chat_id,
getattr(result, "error", "send returned success=False"),
)
continue
notified.add(dedup_key)
logger.info(
"Sent shutdown notification to %s:%s",
"Sent shutdown notification to active chat %s:%s",
platform_str, chat_id,
)
except Exception as e:
@@ -2252,6 +2384,44 @@ class GatewayRunner:
platform_str, chat_id, e,
)
for platform, adapter in self.adapters.items():
home = self.config.get_home_channel(platform)
if not home or not home.chat_id:
continue
dedup_key = (platform.value, str(home.chat_id), str(home.thread_id) if home.thread_id else None)
if dedup_key in notified:
continue
try:
metadata = {"thread_id": home.thread_id} if home.thread_id else None
if metadata:
result = await adapter.send(str(home.chat_id), msg, metadata=metadata)
else:
result = await adapter.send(str(home.chat_id), msg)
if result is not None and getattr(result, "success", True) is False:
logger.debug(
"Failed to send shutdown notification to home channel %s:%s: %s",
platform.value,
home.chat_id,
getattr(result, "error", "send returned success=False"),
)
continue
notified.add(dedup_key)
logger.info(
"Sent shutdown notification to home channel %s:%s",
platform.value,
home.chat_id,
)
except Exception as e:
logger.debug(
"Failed to send shutdown notification to home channel %s:%s: %s",
platform.value,
home.chat_id,
e,
)
def _finalize_shutdown_agents(self, active_agents: Dict[str, Any]) -> None:
for agent in active_agents.values():
try:
@@ -2519,6 +2689,18 @@ class GatewayRunner:
"""
logger.info("Starting Hermes Gateway...")
logger.info("Session storage: %s", self.config.sessions_dir)
# Log the resolved max_iterations budget so operators can verify the
# config.yaml → env bridge did the right thing at a glance (instead
# of silently running at a stale .env value for weeks).
try:
_effective_max_iter = int(os.getenv("HERMES_MAX_ITERATIONS", "90"))
logger.info(
"Agent budget: max_iterations=%d (agent.max_turns from config.yaml, "
"or HERMES_MAX_ITERATIONS from .env, or default 90)",
_effective_max_iter,
)
except Exception:
pass
try:
from hermes_cli.profiles import get_active_profile_name
_profile = get_active_profile_name()
@@ -2662,7 +2844,7 @@ class GatewayRunner:
try:
suspended = self.session_store.suspend_recently_active()
if suspended:
logger.info("Suspended %d in-flight session(s) from previous run", suspended)
logger.info("Marked %d in-flight session(s) as resumable from previous run", suspended)
except Exception as e:
logger.warning("Session suspension on startup failed: %s", e)
@@ -2860,8 +3042,28 @@ class GatewayRunner:
):
self._schedule_update_notification_watch()
# Give freshly connected platform adapters a brief moment to settle
# before sending restart/startup lifecycle messages. In practice this
# helps Discord thread deliveries right after reconnect.
if connected_count > 0:
await asyncio.sleep(1.0)
# Notify the chat that initiated /restart that the gateway is back.
await self._send_restart_notification()
restart_notification_pending = _restart_notification_pending()
delivered_restart_target = await self._send_restart_notification()
# Broadcast a lightweight "gateway is back" message to configured
# home channels only when this startup is resuming from /restart. If a
# /restart requester already received a direct completion notice in the
# same chat, skip the generic broadcast there to avoid duplicates while
# still allowing a home-channel fallback when the direct send fails.
if restart_notification_pending or delivered_restart_target is not None:
skip_home_targets = (
{delivered_restart_target} if delivered_restart_target else None
)
await self._send_home_channel_startup_notifications(
skip_targets=skip_home_targets,
)
# Drain any recovered process watchers (from crash recovery checkpoint)
try:
@@ -3889,7 +4091,9 @@ class GatewayRunner:
if not check_discord_requirements():
logger.warning("Discord: discord.py not installed")
return None
return DiscordAdapter(config)
adapter = DiscordAdapter(config)
adapter.gateway_runner = self # For cross-platform admin alerts on unauthorized slash
return adapter
elif platform == Platform.WHATSAPP:
from gateway.platforms.whatsapp import WhatsAppAdapter, check_whatsapp_requirements
@@ -4630,9 +4834,6 @@ class GatewayRunner:
if event.get_command() == "status":
return await self._handle_status_command(event)
if event.get_command() == "recap":
return await self._handle_recap_command(event)
# Resolve the command once for all early-intercept checks below.
from hermes_cli.commands import (
ACTIVE_SESSION_BYPASS_COMMANDS as _DEDICATED_HANDLERS,
@@ -4937,6 +5138,28 @@ class GatewayRunner:
_cmd_def = _resolve_cmd(command) if command else None
canonical = _cmd_def.name if _cmd_def else command
# Expand alias quick commands before built-in dispatch so targets like
# /model openai/gpt-5.5 --provider openrouter reach the /model handler.
# Preserve built-in precedence; aliases only need early handling when
# the typed command is not already known.
if command and _cmd_def is None:
if isinstance(self.config, dict):
quick_commands = self.config.get("quick_commands", {}) or {}
else:
quick_commands = getattr(self.config, "quick_commands", {}) or {}
if isinstance(quick_commands, dict) and command in quick_commands:
qcmd = quick_commands[command]
if qcmd.get("type") == "alias":
target = qcmd.get("target", "").strip()
if target:
target = target if target.startswith("/") else f"/{target}"
target_command = target.lstrip("/")
user_args = event.get_command_args().strip()
event.text = f"{target} {user_args}".strip()
command = target_command.split()[0] if target_command else target_command
_cmd_def = _resolve_cmd(command) if command else None
canonical = _cmd_def.name if _cmd_def else command
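Review note: the alias expansion above can be exercised on its own. This is a sketch, not the gateway code: `expand_alias` is a hypothetical function, and the `quick_commands` dict shape (`type` / `target` keys) is taken from the hunk.

```python
from typing import Optional, Tuple


def expand_alias(command: str, user_args: str, quick_commands: dict) -> Optional[Tuple[str, str]]:
    """Return (new_command, rewritten_text) for an alias, or None if not applicable."""
    qcmd = quick_commands.get(command)
    if not isinstance(qcmd, dict) or qcmd.get("type") != "alias":
        return None
    target = (qcmd.get("target") or "").strip()
    if not target:
        return None
    target = target if target.startswith("/") else f"/{target}"
    parts = target.lstrip("/").split()
    if not parts:
        return None
    # Only the first token is the command; the rest are pre-filled arguments.
    text = f"{target} {user_args.strip()}".strip()
    return parts[0], text


aliases = {"gpt": {"type": "alias", "target": "model openai/gpt-5.5 --provider openrouter"}}
print(expand_alias("gpt", "", aliases))
```

Taking only the first token as the command is the point of the `split()[0]` change: a target with embedded arguments still resolves to the `/model` handler.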
# Fire the ``command:<canonical>`` hook for any recognized slash
# command — built-in OR plugin-registered. Handlers can return a
# dict with ``{"decision": "deny" | "handled" | "rewrite", ...}``
@@ -5007,9 +5230,6 @@ class GatewayRunner:
if canonical == "status":
return await self._handle_status_command(event)
if canonical == "recap":
return await self._handle_recap_command(event)
if canonical == "agents":
return await self._handle_agents_command(event)
@@ -5153,7 +5373,7 @@ class GatewayRunner:
target_command = target.lstrip("/")
user_args = event.get_command_args().strip()
event.text = f"{target} {user_args}".strip()
command = target_command
command = target_command.split()[0] if target_command else target_command
# Fall through to normal command dispatch below
else:
return f"Quick command '/{command}' has no target defined."
@@ -6852,34 +7072,6 @@ class GatewayRunner:
return "\n".join(lines)
async def _handle_recap_command(self, event: MessageEvent) -> str:
"""Handle /recap command — compact summary of recent session activity.
Inspired by Claude Code's ``/recap`` (v2.1.114, April 2026). Purely
local: reads the transcript from SessionStore and builds a text
summary with no LLM call. Safe for prompt-cache integrity.
"""
from hermes_cli.session_recap import build_recap
source = event.source
session_entry = self.session_store.get_or_create_session(source)
history = self.session_store.load_transcript(session_entry.session_id) or []
title = None
try:
if self._session_db:
title = self._session_db.get_session_title(session_entry.session_id)
except Exception:
title = None
platform_name = source.platform.value if source and source.platform else None
return build_recap(
history,
session_title=title,
session_id=session_entry.session_id,
platform=platform_name,
)
async def _handle_agents_command(self, event: MessageEvent) -> str:
"""Handle /agents command - list active agents and running tasks."""
from tools.process_registry import format_uptime_short, process_registry
@@ -7159,7 +7351,10 @@ class GatewayRunner:
lines.append(f"\n... and {len(sorted_cmds) - 10} more. Use `/commands` for the full paginated list.")
except Exception:
pass
return "\n".join(lines)
return _telegramize_command_mentions(
"\n".join(lines),
getattr(getattr(event, "source", None), "platform", None),
)
async def _handle_commands_command(self, event: MessageEvent) -> str:
"""Handle /commands [page] - paginated list of all commands and skills."""
@@ -7212,7 +7407,10 @@ class GatewayRunner:
lines.extend(["", " | ".join(nav_parts)])
if page != requested_page:
lines.append(f"_(Requested page {requested_page} was out of range, showing page {page}.)_")
return "\n".join(lines)
return _telegramize_command_mentions(
"\n".join(lines),
getattr(getattr(event, "source", None), "platform", None),
)
async def _handle_model_command(self, event: MessageEvent) -> Optional[str]:
"""Handle /model command — switch model for this session.
@@ -7826,24 +8024,33 @@ class GatewayRunner:
msg = decision.get("message") or ""
# Send the status line back to the user so they see the judge's
# verdict. Fire-and-forget via the adapter.
# verdict. Fire-and-forget via the adapter's ``send()`` method —
# adapters expose ``send(chat_id, content, reply_to, metadata)``,
# not a ``send_message(source, msg)`` wrapper, so an earlier
# ``hasattr(adapter, "send_message")`` gate here was dead code and
# users never saw ``✓ Goal achieved`` / ``⏸ budget exhausted``
# verdicts.
if msg and source is not None:
try:
adapter = self.adapters.get(source.platform)
if adapter and hasattr(adapter, "send_message"):
if adapter is not None and hasattr(adapter, "send"):
import asyncio as _asyncio
coro = adapter.send_message(source, msg)
thread_meta = (
{"thread_id": source.thread_id} if source.thread_id else None
)
coro = adapter.send(
chat_id=source.chat_id,
content=msg,
metadata=thread_meta,
)
if _asyncio.iscoroutine(coro):
try:
loop = _asyncio.get_event_loop()
if loop.is_running():
loop.create_task(coro)
else:
loop.run_until_complete(coro)
loop = _asyncio.get_running_loop()
loop.create_task(coro)
except RuntimeError:
# No event loop in this thread — schedule on the main one.
# No running loop in this thread — best effort.
try:
_asyncio.run_coroutine_threadsafe(coro, self._loop)
_asyncio.run(coro)
except Exception:
pass
except Exception as exc:
@@ -7906,14 +8113,33 @@ class GatewayRunner:
chat_name = source.chat_name or chat_id
env_key = _home_target_env_var(platform_name)
thread_env_key = _home_thread_env_var(platform_name)
thread_id = source.thread_id
# Save to .env so it persists across restarts
try:
from hermes_cli.config import save_env_value
save_env_value(env_key, str(chat_id))
# Keep thread/topic routing explicit and clear stale values when
# /sethome is run from the parent chat instead of a thread.
save_env_value(thread_env_key, str(thread_id or ""))
except Exception as e:
return f"Failed to save home channel: {e}"
# Keep the running gateway config in sync too. The pre-restart
# notification path reads self.config before the process reloads env.
if source.platform:
platform_config = self.config.platforms.setdefault(
source.platform,
PlatformConfig(enabled=True),
)
platform_config.home_channel = HomeChannel(
platform=source.platform,
chat_id=str(chat_id),
name=chat_name,
thread_id=str(thread_id) if thread_id else None,
)
return (
f"✅ Home channel set to **{chat_name}** (ID: {chat_id}).\n"
f"Cron jobs and cross-platform messages will be delivered here."
@@ -8094,6 +8320,47 @@ class GatewayRunner:
adapter = self.adapters.get(Platform.DISCORD)
self._set_adapter_auto_tts_disabled(adapter, chat_id, disabled=True)
def _is_duplicate_voice_transcript(self, guild_id: int, user_id: int, transcript: str) -> bool:
"""Suppress repeated STT outputs for the same recent utterance.
Voice capture can occasionally emit the same utterance twice a few
seconds apart, which creates a second queued agent run and overlapping
spoken replies. Dedup exact and near-exact repeats per guild/user over a
short window while allowing genuinely new turns through.
"""
from difflib import SequenceMatcher
normalized = re.sub(r"\s+", " ", transcript).strip().lower()
normalized = re.sub(r"[^\w\s]", "", normalized)
if not normalized:
return False
now = time.monotonic()
window_seconds = 12.0
key = (guild_id, user_id)
recent_store = getattr(self, "_recent_voice_transcripts", None)
if not isinstance(recent_store, dict):
recent_store = {}
self._recent_voice_transcripts = recent_store
recent = [
(ts, txt)
for ts, txt in recent_store.get(key, [])
if now - ts <= window_seconds
]
for _, prior in recent:
if prior == normalized:
recent_store[key] = recent
return True
if len(prior) >= 16 and len(normalized) >= 16:
if SequenceMatcher(None, prior, normalized).ratio() >= 0.95:
recent_store[key] = recent
return True
recent.append((now, normalized))
recent_store[key] = recent[-5:]
return False
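Review note: a standalone sketch of the duplicate-transcript check, useful for reasoning about the 12-second window and the 0.95 similarity threshold. The function name `is_duplicate`, the explicit `store` argument, and the injected `now` value are assumptions made for testability; the method above reads `time.monotonic()` and instance state instead.

```python
import re
from difflib import SequenceMatcher

WINDOW_SECONDS = 12.0


def is_duplicate(store: dict, key, transcript: str, now: float) -> bool:
    """Return True when transcript repeats a recent utterance for key."""
    normalized = re.sub(r"\s+", " ", transcript).strip().lower()
    normalized = re.sub(r"[^\w\s]", "", normalized)
    if not normalized:
        return False
    # Drop entries that have aged out of the window.
    recent = [(ts, txt) for ts, txt in store.get(key, []) if now - ts <= WINDOW_SECONDS]
    for _, prior in recent:
        near = (
            len(prior) >= 16
            and len(normalized) >= 16
            and SequenceMatcher(None, prior, normalized).ratio() >= 0.95
        )
        if prior == normalized or near:
            store[key] = recent
            return True
    recent.append((now, normalized))
    store[key] = recent[-5:]  # keep a short history per speaker
    return False
```

The 16-character floor keeps short commands such as "yes" and "yet" from being fuzzy-matched against each other; only exact repeats are suppressed for short utterances.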
async def _handle_voice_channel_input(
self, guild_id: int, user_id: int, transcript: str
):
@@ -8131,6 +8398,15 @@ class GatewayRunner:
logger.debug("Unauthorized voice input from user %d, ignoring", user_id)
return
if self._is_duplicate_voice_transcript(guild_id, user_id, transcript):
logger.info(
"Suppressing duplicate voice transcript for guild=%s user=%s: %s",
guild_id,
user_id,
transcript[:100],
)
return
# Show transcript in text channel (after auth, with mention sanitization)
try:
channel = adapter._client.get_channel(text_ch_id)
@@ -9657,6 +9933,28 @@ class GatewayRunner:
removed = result.get("removed", []) # [{"name", "description"}, ...]
total = result.get("total", 0)
# Let each connected adapter refresh any platform-side state
# that cached the skill list at startup. Today that's the
# Discord /skill autocomplete (registered once per connect);
# without this call, new skills stay invisible in the
# dropdown and deleted skills error out when clicked. Other
# adapters that don't override refresh_skill_group (Telegram's
# BotCommand menu, Slack subcommand map, etc.) are silently
# skipped — the in-process reload above is enough for them.
for adapter in list(self.adapters.values()):
refresh = getattr(adapter, "refresh_skill_group", None)
if not callable(refresh):
continue
try:
maybe = refresh()
if inspect.isawaitable(maybe):
await maybe
except Exception as exc:
logger.warning(
"Adapter %s refresh_skill_group raised: %s",
getattr(adapter, "name", adapter), exc,
)
lines = ["🔄 **Skills Reloaded**\n"]
if not added and not removed:
lines.append("No new skills detected.")
@@ -10375,11 +10673,11 @@ class GatewayRunner:
return True
async def _send_restart_notification(self) -> None:
async def _send_restart_notification(self) -> Optional[tuple[str, str, Optional[str]]]:
"""Notify the chat that initiated /restart that the gateway is back."""
notify_path = _hermes_home / ".restart_notify.json"
if not notify_path.exists():
return
return None
try:
data = json.loads(notify_path.read_text())
@@ -10388,7 +10686,7 @@ class GatewayRunner:
thread_id = data.get("thread_id")
if not platform_str or not chat_id:
return
return None
platform = Platform(platform_str)
adapter = self.adapters.get(platform)
@@ -10397,24 +10695,94 @@ class GatewayRunner:
"Restart notification skipped: %s adapter not connected",
platform_str,
)
return
return None
metadata = {"thread_id": thread_id} if thread_id else None
await adapter.send(
chat_id,
result = await adapter.send(
str(chat_id),
"♻ Gateway restarted successfully. Your session continues.",
metadata=metadata,
)
# adapter.send() catches provider errors (e.g. "Chat not found")
# and returns SendResult(success=False) rather than raising, so
# we must inspect the result before claiming success — otherwise
# the log line is misleading and hides real delivery failures.
if result is not None and getattr(result, "success", True) is False:
logger.warning(
"Restart notification to %s:%s was not delivered: %s",
platform_str,
chat_id,
getattr(result, "error", "send returned success=False"),
)
return None
logger.info(
"Sent restart notification to %s:%s",
platform_str,
chat_id,
)
return str(platform_str), str(chat_id), str(thread_id) if thread_id else None
except Exception as e:
logger.warning("Restart notification failed: %s", e)
return None
finally:
notify_path.unlink(missing_ok=True)
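Review note: the same result check now appears in several send paths in this diff. A sketch of it as a single predicate follows; `SendResult` here is a hypothetical stand-in for whatever the adapters return, and `delivery_failed` is not a function in the diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SendResult:
    """Hypothetical stand-in for an adapter's send result."""
    success: bool
    error: Optional[str] = None


def delivery_failed(result) -> bool:
    """True only when the adapter explicitly reported success=False."""
    # None and results without a ``success`` attribute count as delivered,
    # matching the permissive getattr(..., True) default used in the hunks.
    return result is not None and getattr(result, "success", True) is False


print(delivery_failed(SendResult(success=False, error="Chat not found")))
```

The permissive default matters for adapters that return nothing from `send()`: they keep being treated as successful rather than as failures.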
async def _send_home_channel_startup_notifications(
self,
*,
skip_targets: Optional[set[tuple[str, str, Optional[str]]]] = None,
) -> set[tuple[str, str, Optional[str]]]:
"""Notify configured home channels that the gateway is back online.
The notification is best-effort and sent once per connected platform
home channel. ``skip_targets`` lets startup avoid duplicate messages
when a more specific restart notification is queued for the same chat.
"""
delivered: set[tuple[str, str, Optional[str]]] = set()
skipped = skip_targets or set()
message = "♻️ Gateway online — Hermes is back and ready."
for platform, adapter in self.adapters.items():
home = self.config.get_home_channel(platform)
if not home or not home.chat_id:
continue
target = (platform.value, str(home.chat_id), str(home.thread_id) if home.thread_id else None)
if target in skipped or target in delivered:
continue
try:
metadata = {"thread_id": home.thread_id} if home.thread_id else None
if metadata:
result = await adapter.send(str(home.chat_id), message, metadata=metadata)
else:
result = await adapter.send(str(home.chat_id), message)
if result is not None and getattr(result, "success", True) is False:
logger.warning(
"Home-channel startup notification failed for %s:%s: %s",
platform.value,
home.chat_id,
getattr(result, "error", "send returned success=False"),
)
continue
delivered.add(target)
logger.info(
"Sent home-channel startup notification to %s:%s",
platform.value,
home.chat_id,
)
except Exception as exc:
logger.warning(
"Home-channel startup notification failed for %s:%s: %s",
platform.value,
home.chat_id,
exc,
)
return delivered
def _set_session_env(self, context: SessionContext) -> list:
"""Set session context variables for the current async task.
@@ -11052,6 +11420,12 @@ class GatewayRunner:
if not session_key:
return
pending_skills_reload_notes = getattr(
self, "_pending_skills_reload_notes", None
)
if isinstance(pending_skills_reload_notes, dict):
pending_skills_reload_notes.pop(session_key, None)
pending_approvals = getattr(self, "_pending_approvals", None)
if isinstance(pending_approvals, dict):
pending_approvals.pop(session_key, None)
+17 -12
@@ -1086,19 +1086,22 @@ class SessionStore:
return len(removed_keys)
def suspend_recently_active(self, max_age_seconds: int = 120) -> int:
"""Mark recently-active sessions as suspended.
"""Mark recently-active sessions as resumable after an unexpected exit.
Called on gateway startup to prevent sessions that were likely
in-flight when the gateway last exited from being blindly resumed
(#7536). Only suspends sessions updated within *max_age_seconds*
to avoid resetting long-idle sessions that are harmless to resume.
Returns the number of sessions that were suspended.
Called on gateway startup after a crash or fast restart to preserve
in-flight sessions instead of destroying their conversation history
(#7536). Only marks sessions updated within *max_age_seconds* to
avoid touching long-idle sessions. Sets ``resume_pending=True`` so
the next incoming message on the same session_key auto-resumes from
the existing transcript.
Entries flagged ``resume_pending=True`` are skipped; those were
marked intentionally by the drain-timeout path as recoverable.
Terminal escalation for genuinely stuck ``resume_pending`` sessions
is handled by the existing ``.restart_failure_counts`` stuck-loop
counter, which runs after this method on startup.
Entries already flagged ``resume_pending=True`` are skipped. Entries
explicitly ``suspended=True`` (from /stop or stuck-loop escalation)
are also skipped. Terminal escalation for genuinely stuck sessions
is still handled by the existing ``.restart_failure_counts`` counter
(threshold 3), which runs after this method and sets ``suspended=True``.
Returns the number of sessions marked resumable.
"""
from datetime import timedelta
@@ -1110,7 +1113,9 @@ class SessionStore:
if entry.resume_pending:
continue
if not entry.suspended and entry.updated_at >= cutoff:
entry.suspended = True
entry.resume_pending = True
entry.resume_reason = "restart_interrupted"
entry.last_resume_marked_at = _now()
count += 1
if count:
self._save()
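Review note: a sketch of the new marking rule with plain dataclasses, to make the skip conditions explicit. `Entry` and `mark_resumable` are hypothetical names; field names follow the hunk, and persistence (`_save`) and `last_resume_marked_at` are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass
class Entry:
    """Hypothetical, trimmed-down session entry."""
    updated_at: datetime
    suspended: bool = False
    resume_pending: bool = False
    resume_reason: Optional[str] = None


def mark_resumable(entries: Iterable[Entry], now: datetime, max_age_seconds: int = 120) -> int:
    """Flag recently active sessions as resumable; return how many were flagged."""
    cutoff = now - timedelta(seconds=max_age_seconds)
    count = 0
    for entry in entries:
        if entry.resume_pending:
            continue  # already marked by another path
        if not entry.suspended and entry.updated_at >= cutoff:
            entry.resume_pending = True
            entry.resume_reason = "restart_interrupted"
            count += 1
    return count


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
fresh = Entry(updated_at=now - timedelta(seconds=30))
stale = Entry(updated_at=now - timedelta(minutes=10))
stopped = Entry(updated_at=now - timedelta(seconds=30), suspended=True)
marked = mark_resumable([fresh, stale, stopped], now)
```

Only `fresh` is flagged: `stale` falls outside the window and `stopped` was explicitly suspended, so neither is touched.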
+33 -1
@@ -5,11 +5,43 @@ Provides subcommands for:
- hermes chat - Interactive chat (same as ./hermes)
- hermes gateway - Run gateway in foreground
- hermes gateway start - Start gateway service
- hermes gateway stop - Stop gateway service
- hermes gateway stop - Stop gateway service
- hermes setup - Interactive setup wizard
- hermes status - Show status of all components
- hermes cron - Manage cron jobs
"""
import os
import sys
__version__ = "0.12.0"
__release_date__ = "2026.4.30"
def _ensure_utf8():
"""Force UTF-8 stdout/stderr on Windows to prevent UnicodeEncodeError.
Windows services and terminals default to cp1252, which cannot encode
box-drawing characters used in CLI output. This causes unhandled
UnicodeEncodeError crashes on gateway startup.
"""
if sys.platform != "win32":
return
os.environ.setdefault("PYTHONUTF8", "1")
os.environ.setdefault("PYTHONIOENCODING", "utf-8")
for stream_name in ("stdout", "stderr"):
stream = getattr(sys, stream_name, None)
if stream is None:
continue
try:
if getattr(stream, "encoding", "").lower().replace("-", "") != "utf8":
new_stream = open(
stream.fileno(), "w", encoding="utf-8",
buffering=1, closefd=False,
)
setattr(sys, stream_name, new_stream)
except (AttributeError, OSError):
pass
_ensure_utf8()
+4 -2
@@ -4283,7 +4283,8 @@ def _minimax_oauth_login(
print(f"Portal: {portal_base_url}")
with httpx.Client(timeout=httpx.Timeout(timeout_seconds),
headers={"Accept": "application/json"}) as client:
headers={"Accept": "application/json"},
follow_redirects=True) as client:
code_data = _minimax_request_user_code(
client, portal_base_url=portal_base_url,
client_id=pconfig.client_id,
@@ -4360,7 +4361,8 @@ def _refresh_minimax_oauth_state(
return state
portal_base_url = state["portal_base_url"]
with httpx.Client(timeout=httpx.Timeout(timeout_seconds)) as client:
with httpx.Client(timeout=httpx.Timeout(timeout_seconds),
follow_redirects=True) as client:
response = client.post(
f"{portal_base_url}/oauth/token",
data={
+142 -61
@@ -10,6 +10,7 @@ To add an alias: set ``aliases=("short",)`` on the existing ``CommandDef``.
from __future__ import annotations
import logging
import os
import re
import shutil
@@ -21,6 +22,8 @@ from typing import Any
from utils import is_truthy_value
logger = logging.getLogger(__name__)
# prompt_toolkit is an optional CLI dependency — only needed for
# SlashCommandCompleter and SlashCommandAutoSuggest. Gateway and test
# environments that lack it must still be able to import this module
@@ -68,7 +71,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
cli_only=True),
CommandDef("history", "Show conversation history", "Session",
cli_only=True),
CommandDef("recap", "Summarize recent activity in this session", "Session"),
CommandDef("save", "Save the current conversation", "Session",
cli_only=True),
CommandDef("retry", "Retry the last message (resend to agent)", "Session"),
@@ -320,7 +322,6 @@ ACTIVE_SESSION_BYPASS_COMMANDS: frozenset[str] = frozenset(
"new",
"profile",
"queue",
"recap",
"restart",
"status",
"steer",
@@ -398,6 +399,11 @@ def _is_gateway_available(cmd: CommandDef, config_overrides: set[str] | None = N
return False
def _requires_argument(args_hint: str) -> bool:
"""Return True when selecting a command without text would be incomplete."""
return args_hint.strip().startswith("<")
def gateway_help_lines() -> list[str]:
"""Generate gateway help text lines from the registry."""
overrides = _resolve_config_gates()
@@ -454,7 +460,9 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
Telegram command names cannot contain hyphens, so they are replaced with
underscores. Aliases are skipped -- Telegram shows one menu entry per
canonical command.
canonical command. Commands that require arguments are skipped because
selecting a Telegram BotCommand sends only ``/command`` and would execute
an incomplete command.
Plugin-registered slash commands are included so plugins get native
autocomplete in Telegram without touching core code.
@@ -464,10 +472,14 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
for cmd in COMMAND_REGISTRY:
if not _is_gateway_available(cmd, overrides):
continue
if _requires_argument(cmd.args_hint):
continue
tg_name = _sanitize_telegram_name(cmd.name)
if tg_name:
result.append((tg_name, cmd.description))
for name, description, _args_hint in _iter_plugin_command_entries():
for name, description, args_hint in _iter_plugin_command_entries():
if _requires_argument(args_hint):
continue
tg_name = _sanitize_telegram_name(name)
if tg_name:
result.append((tg_name, description))
@@ -501,9 +513,9 @@ def _sanitize_telegram_name(raw: str) -> str:
def _clamp_command_names(
entries: list[tuple[str, str]],
entries: list[tuple[str, ...]],
reserved: set[str],
) -> list[tuple[str, str]]:
) -> list[tuple[str, ...]]:
"""Enforce 32-char command name limit with collision avoidance.
Both Telegram and Discord cap slash command names at 32 characters.
@@ -511,10 +523,15 @@ def _clamp_command_names(
(against *reserved* names or earlier entries in the same batch), the name is
shortened to 31 chars and a digit ``0``-``9`` is appended to differentiate.
If all 10 digit slots are taken the entry is silently dropped.
Accepts tuples of any length >= 2. Extra elements beyond ``(name, desc)``
(e.g. ``cmd_key``) are passed through unchanged, so callers can attach
metadata that survives the rename.
"""
used: set[str] = set(reserved)
result: list[tuple[str, str]] = []
for name, desc in entries:
result: list[tuple] = []
for entry in entries:
name, desc, *extra = entry
if len(name) > _CMD_NAME_LIMIT:
candidate = name[:_CMD_NAME_LIMIT]
if candidate in used:
@@ -530,7 +547,7 @@ def _clamp_command_names(
if name in used:
continue
used.add(name)
result.append((name, desc))
result.append((name, desc, *extra))
return result
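Review note: a self-contained sketch of the clamp with payload pass-through, reconstructed from the docstring and the visible lines. The body of the collision loop is elided in the diff, so the digit-suffix loop below is an assumption based on the docstring; `clamp_names` and `CMD_NAME_LIMIT` are illustrative names.

```python
CMD_NAME_LIMIT = 32


def clamp_names(entries, reserved):
    """Clamp names to 32 chars, avoid collisions, keep extra tuple elements."""
    used = set(reserved)
    result = []
    for entry in entries:
        name, desc, *extra = entry
        if len(name) > CMD_NAME_LIMIT:
            candidate = name[:CMD_NAME_LIMIT]
            if candidate in used:
                # ASSUMPTION: shorten to 31 chars and try digits 0-9.
                prefix = name[:CMD_NAME_LIMIT - 1]
                for digit in range(10):
                    candidate = f"{prefix}{digit}"
                    if candidate not in used:
                        break
                else:
                    continue  # all ten slots taken: drop the entry
            name = candidate
        if name in used:
            continue
        used.add(name)
        result.append((name, desc, *extra))
    return result


long_name = "a" * 40
out = clamp_names(
    [(long_name, "first", "key-1"), (long_name + "b", "second", "key-2"), ("short", "third")],
    {"status"},
)
```

Because the extra elements travel with each entry, the caller no longer needs the old `(name, desc) -> cmd_key` lookup, which could not find the key once a name had been renamed by the clamp.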
@@ -613,13 +630,26 @@ def _collect_gateway_skill_entries(
try:
from agent.skill_commands import get_skill_commands
from tools.skills_tool import SKILLS_DIR
from agent.skill_utils import get_external_skills_dirs
_skills_dir = str(SKILLS_DIR.resolve())
_hub_dir = str((SKILLS_DIR / ".hub").resolve())
_hub_dir = str((SKILLS_DIR / ".hub").resolve()).rstrip("/") + "/"
# Build set of allowed directory prefixes: local skills dir + any
# user-configured ``skills.external_dirs``. Ensure each prefix ends
# with ``/`` so ``/my-skills`` does not also match ``/my-skills-extra``.
# Without this widening, external skills are visible in
# ``hermes skills list`` and the agent's ``/skill-name`` dispatch but
# silently excluded from gateway slash menus (#8110).
_allowed_prefixes = [_skills_dir.rstrip("/") + "/"]
_allowed_prefixes.extend(
str(d).rstrip("/") + "/" for d in get_external_skills_dirs()
)
skill_cmds = get_skill_commands()
for cmd_key in sorted(skill_cmds):
info = skill_cmds[cmd_key]
skill_path = info.get("skill_md_path", "")
if not skill_path.startswith(_skills_dir):
if not skill_path:
continue
if not any(skill_path.startswith(prefix) for prefix in _allowed_prefixes):
continue
if skill_path.startswith(_hub_dir):
continue
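Review note: the trailing-slash prefix check can be illustrated with plain strings. This sketch assumes POSIX-style paths, as the `"/"` literal in the hunk does; `is_gateway_visible` and the example directories are hypothetical.

```python
def with_trailing_slash(path: str) -> str:
    """Normalize a directory path so prefix checks stop at a path boundary."""
    return path.rstrip("/") + "/"


def is_gateway_visible(skill_path: str, skills_dir: str, external_dirs, hub_dir: str) -> bool:
    """True when the skill lives under an allowed root and outside the hub dir."""
    if not skill_path:
        return False
    allowed = [with_trailing_slash(skills_dir)]
    allowed.extend(with_trailing_slash(d) for d in external_dirs)
    if not any(skill_path.startswith(prefix) for prefix in allowed):
        return False
    return not skill_path.startswith(with_trailing_slash(hub_dir))


SKILLS = "/home/u/.hermes/skills"
HUB = SKILLS + "/.hub"
EXTERNAL = ["/my-skills"]
print(is_gateway_visible("/my-skills-extra/foo/SKILL.md", SKILLS, EXTERNAL, HUB))
```

Without the trailing slash, `/my-skills-extra/...` would pass a bare `startswith("/my-skills")` check, which is the false positive the comment in the hunk calls out.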
@@ -637,17 +667,15 @@ def _collect_gateway_skill_entries(
except Exception:
pass
# Clamp names; _clamp_command_names works on (name, desc) pairs so we
# need to zip/unzip.
skill_pairs = [(n, d) for n, d, _ in skill_triples]
key_by_pair = {(n, d): k for n, d, k in skill_triples}
skill_pairs = _clamp_command_names(skill_pairs, reserved_names)
# Clamp names; cmd_key is passed through as extra payload so it survives
# any clamp-induced renames.
skill_triples = _clamp_command_names(skill_triples, reserved_names)
# Skills fill remaining slots — only tier that gets trimmed
remaining = max(0, max_slots - len(all_entries))
hidden_count = max(0, len(skill_pairs) - remaining)
for n, d in skill_pairs[:remaining]:
all_entries.append((n, d, key_by_pair.get((n, d), "")))
hidden_count = max(0, len(skill_triples) - remaining)
for n, d, k in skill_triples[:remaining]:
all_entries.append((n, d, k))
return all_entries[:max_slots], hidden_count
@@ -723,24 +751,40 @@ def discord_skill_commands(
def discord_skill_commands_by_category(
reserved_names: set[str],
) -> tuple[dict[str, list[tuple[str, str, str]]], list[tuple[str, str, str]], int]:
"""Return skill entries organized by category for Discord ``/skill`` subcommand groups.
"""Return skill entries organized by category for Discord ``/skill`` autocomplete.
Skills whose directory is nested at least 2 levels under ``SKILLS_DIR``
Skills whose directory is nested at least 2 levels under a scan root
(e.g. ``creative/ascii-art/SKILL.md``) are grouped by their top-level
category. Root-level skills (e.g. ``dogfood/SKILL.md``) are returned as
*uncategorized*; the caller should register them as direct subcommands
of the ``/skill`` group.
*uncategorized*.
The same filtering as :func:`discord_skill_commands` is applied: hub
skills excluded, per-platform disabled excluded, names clamped.
Scan roots include the local ``SKILLS_DIR`` **and** any configured
``skills.external_dirs``, matching the widened filter applied to the
flat ``discord_skill_commands()`` collector in #18741. Without this
parity, external-dir skills are visible via ``hermes skills list`` and
the agent's ``/skill-name`` dispatch but silently absent from Discord's
``/skill`` autocomplete.
Filtering mirrors :func:`discord_skill_commands`: hub skills excluded,
per-platform disabled excluded, names clamped to 32 chars, descriptions
clamped to 100 chars.
The legacy 25-group × 25-subcommand caps (from the old nested
``/skill <cat> <name>`` layout) are **not** applied: the live caller
(``_register_skill_group`` in ``gateway/platforms/discord.py``, refactored
in PR #11580) flattens these results and feeds them into a single
autocomplete callback, which scales to thousands of entries without any
per-command payload concerns. ``hidden_count`` is retained in the return
tuple for backward compatibility and still reports skills dropped for
other reasons (32-char clamp collision vs a reserved name).
Returns:
``(categories, uncategorized, hidden_count)``
- *categories*: ``{category_name: [(name, description, cmd_key), ...]}``
- *uncategorized*: ``[(name, description, cmd_key), ...]``
- *hidden_count*: skills dropped due to Discord group limits
(25 subcommand groups, 25 subcommands per group)
- *hidden_count*: skills dropped due to name clamp collisions
against already-registered command names.
"""
from pathlib import Path as _P
@@ -754,14 +798,33 @@ def discord_skill_commands_by_category(
# Collect raw skill data --------------------------------------------------
categories: dict[str, list[tuple[str, str, str]]] = {}
uncategorized: list[tuple[str, str, str]] = []
_names_used: set[str] = set(reserved_names)
# Map clamped-32-char-name → what it came from, so we can emit an
# actionable warning on collision. Reserved (gateway-builtin) command
# names are marked with a sentinel so the warning distinguishes
# "skill collided with a reserved command" from "two skills collided
# on the 32-char clamp" — the latter is the rename-worthy case.
_names_used: dict[str, str] = {n: "<reserved>" for n in reserved_names}
hidden = 0
try:
from agent.skill_commands import get_skill_commands
from agent.skill_utils import get_external_skills_dirs
from tools.skills_tool import SKILLS_DIR
_skills_dir = SKILLS_DIR.resolve()
_hub_dir = (SKILLS_DIR / ".hub").resolve()
# Build list of (resolved_root, is_local) tuples. Each external dir
# becomes its own scan root for category derivation — a skill at
# ``<external>/mlops/foo/SKILL.md`` is still categorized as "mlops".
_scan_roots: list[_P] = [_skills_dir]
try:
for ext in get_external_skills_dirs():
try:
_scan_roots.append(_P(ext).resolve())
except Exception:
continue
except Exception:
pass
skill_cmds = get_skill_commands()
for cmd_key in sorted(skill_cmds):
@@ -770,33 +833,72 @@ def discord_skill_commands_by_category(
if not skill_path:
continue
sp = _P(skill_path).resolve()
# Skip skills outside SKILLS_DIR or from the hub
if not str(sp).startswith(str(_skills_dir)):
continue
# Hub skills are loaded via the skill hub, not surfaced as
# slash commands.
if str(sp).startswith(str(_hub_dir)):
continue
# Accept skill if it lives under any scan root; record the
# matching root so we can derive the category correctly.
matched_root: _P | None = None
for root in _scan_roots:
try:
sp.relative_to(root)
except ValueError:
continue
matched_root = root
break
if matched_root is None:
continue
skill_name = info.get("name", "")
if skill_name in _platform_disabled:
continue
raw_name = cmd_key.lstrip("/")
# Clamp to 32 chars (Discord limit)
# Clamp to 32 chars (Discord per-command name limit)
discord_name = raw_name[:32]
if discord_name in _names_used:
# Two skills whose first 32 chars are identical. One wins
# (the first one seen, which is alphabetical because the
# caller iterates ``sorted(skill_cmds)``); the other is
# dropped from Discord's /skill autocomplete.
#
# Silently counting this as ``hidden`` (the old behavior)
# meant skill authors had no way to discover the drop —
# their skill just didn't appear in the picker. Emit a
# WARNING naming both sides so the author can rename the
# losing skill's frontmatter name to something with a
# distinct 32-char prefix.
prior = _names_used[discord_name]
if prior == "<reserved>":
logger.warning(
"Discord /skill: %r (from %r) collides on its 32-char "
"clamp with a reserved gateway command name %r — the "
"skill will not appear in the /skill autocomplete. "
"Rename the skill's frontmatter ``name:`` to differ "
"in its first 32 chars.",
discord_name, cmd_key, discord_name,
)
else:
logger.warning(
"Discord /skill: %r and %r both clamp to %r on "
"Discord's 32-char command-name limit — only %r "
"will appear in the /skill autocomplete. Rename "
"one skill's frontmatter ``name:`` to differ in "
"its first 32 chars.",
prior, cmd_key, discord_name, prior,
)
hidden += 1
continue
_names_used.add(discord_name)
_names_used[discord_name] = cmd_key
desc = info.get("description", "")
if len(desc) > 100:
desc = desc[:97] + "..."
# Determine category from the relative path within SKILLS_DIR.
# e.g. creative/ascii-art/SKILL.md → parts = ("creative", "ascii-art")
try:
rel = sp.parent.relative_to(_skills_dir)
except ValueError:
continue
# Determine category from the relative path within the matched
# scan root. e.g. creative/ascii-art/SKILL.md → ("creative", ...)
rel = sp.parent.relative_to(matched_root)
parts = rel.parts
if len(parts) >= 2:
cat = parts[0]
@@ -806,28 +908,7 @@ def discord_skill_commands_by_category(
except Exception:
pass
# Enforce Discord limits: 25 subcommand groups, 25 subcommands each ------
_MAX_GROUPS = 25
_MAX_PER_GROUP = 25
trimmed_categories: dict[str, list[tuple[str, str, str]]] = {}
group_count = 0
for cat in sorted(categories):
if group_count >= _MAX_GROUPS:
hidden += len(categories[cat])
continue
entries = categories[cat][:_MAX_PER_GROUP]
hidden += max(0, len(categories[cat]) - _MAX_PER_GROUP)
trimmed_categories[cat] = entries
group_count += 1
# Uncategorized skills also count against the 25 top-level limit
remaining_slots = _MAX_GROUPS - group_count
if len(uncategorized) > remaining_slots:
hidden += len(uncategorized) - remaining_slots
uncategorized = uncategorized[:remaining_slots]
return trimmed_categories, uncategorized, hidden
return categories, uncategorized, hidden
# ---------------------------------------------------------------------------
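The 32-char clamp and collision bookkeeping in the hunk above can be sketched in isolation. This is a minimal illustration, not the gateway's code: `clamp_skill_names`, the `RESERVED` sentinel constant, and the logger name are hypothetical names chosen for the example, and the warning text is shortened.

```python
import logging

logger = logging.getLogger("skill_clamp_sketch")  # hypothetical logger name

RESERVED = "<reserved>"  # sentinel marking gateway built-in command names


def clamp_skill_names(cmd_keys, reserved_names, limit=32):
    """Clamp each command key to `limit` chars; return (kept, hidden_count)."""
    # Map clamped name -> origin, so a collision can name both sides.
    names_used = {name: RESERVED for name in reserved_names}
    kept = []
    hidden = 0
    for cmd_key in sorted(cmd_keys):  # alphabetical order: first seen wins
        clamped = cmd_key.lstrip("/")[:limit]
        prior = names_used.get(clamped)
        if prior is not None:
            if prior == RESERVED:
                logger.warning("%r collides with reserved command %r", cmd_key, clamped)
            else:
                logger.warning("%r and %r both clamp to %r", prior, cmd_key, clamped)
            hidden += 1
            continue
        names_used[clamped] = cmd_key
        kept.append((clamped, cmd_key))
    return kept, hidden
```

Storing the origin of each clamped name in a dict, rather than keeping a bare set of names, is what lets the warning say which skill won and which one was dropped.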
+22 -3
@@ -400,7 +400,12 @@ DEFAULT_CONFIG = {
# The gateway stops accepting new work, waits for running agents
# to finish, then interrupts any remaining runs after the timeout.
# 0 = no drain, interrupt immediately.
"restart_drain_timeout": 60,
#
# 180s is calibrated for realistic in-flight agent turns: a typical
# coding conversation mid-reasoning runs 60-150s per call, so a 60s
# budget routinely interrupted legitimate work on /restart. Raise
# further in config.yaml if you run very-long-reasoning models.
"restart_drain_timeout": 180,
# Max app-level retry attempts for API errors (connection drops,
# provider timeouts, 5xx, etc.) before the agent surfaces the
# failure. The OpenAI SDK already does its own low-level retries
@@ -639,6 +644,18 @@ DEFAULT_CONFIG = {
"cache_ttl": "5m",
},
# OpenRouter-specific settings.
# response_cache: enable OpenRouter response caching (X-OpenRouter-Cache header).
# When enabled, identical requests return cached responses for free (zero billing).
# This is separate from Anthropic prompt caching and works alongside it.
# See: https://openrouter.ai/docs/guides/features/response-caching
# response_cache_ttl: how long cached responses remain valid, in seconds (1-86400).
# Default 300 (5 minutes). Only used when response_cache is enabled.
"openrouter": {
"response_cache": True,
"response_cache_ttl": 300,
},
# AWS Bedrock provider configuration.
# Only used when model.provider is "bedrock".
"bedrock": {
@@ -825,7 +842,7 @@ DEFAULT_CONFIG = {
# Voices: alloy, echo, fable, onyx, nova, shimmer
},
"xai": {
"voice_id": "eve",
"voice_id": "eve", # or custom voice ID — see https://docs.x.ai/developers/model-capabilities/audio/custom-voices
"language": "en",
"sample_rate": 24000,
"bit_rate": 128000,
@@ -4658,7 +4675,9 @@ def set_config_value(key: str, value: str):
"terminal.vercel_runtime": "TERMINAL_VERCEL_RUNTIME",
"terminal.docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
"terminal.docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
"terminal.cwd": "TERMINAL_CWD",
# terminal.cwd intentionally excluded — CLI resolves at runtime,
# gateway bridges it in gateway/run.py. Persisting to .env causes
# stale values to poison child processes.
"terminal.timeout": "TERMINAL_TIMEOUT",
"terminal.sandbox_dir": "TERMINAL_SANDBOX_DIR",
"terminal.persistent_shell": "TERMINAL_PERSISTENT_SHELL",
+13 -1
@@ -302,9 +302,21 @@ def _cmd_rollback(args) -> int:
print(f" reason: {manifest.get('reason', '?')}")
print(f" created_at: {manifest.get('created_at', '?')}")
print(f" skill files: {manifest.get('skill_files', '?')}")
cron = manifest.get("cron_jobs") or {}
if isinstance(cron, dict):
if cron.get("backed_up"):
print(
f" cron jobs: {cron.get('jobs_count', 0)} "
f"(will be restored for skill-link fields only)"
)
else:
reason = cron.get("reason", "not captured")
print(f" cron jobs: not in snapshot ({reason})")
print(
"\nThis will replace the current ~/.hermes/skills/ tree (a safety "
"snapshot of the current state is taken first so this is undoable)."
"snapshot of the current state is taken first so this is undoable). "
"Cron jobs that still exist will have their skills/skill fields "
"restored from the snapshot; all other cron fields are left alone."
)
if not getattr(args, "yes", False):
+6
@@ -156,6 +156,8 @@ def curses_checklist(
flush_stdin()
return result_holder[0] if result_holder[0] is not None else cancel_returns
except KeyboardInterrupt:
return cancel_returns
except Exception:
return _numbered_fallback(title, items, selected, cancel_returns, status_fn)
@@ -278,6 +280,8 @@ def curses_radiolist(
flush_stdin()
return result_holder[0] if result_holder[0] is not None else cancel_returns
except KeyboardInterrupt:
return cancel_returns
except Exception:
return _radio_numbered_fallback(title, items, selected, cancel_returns)
@@ -401,6 +405,8 @@ def curses_single_select(
return None
return result_holder[0]
except KeyboardInterrupt:
return None
except Exception:
all_items = list(items) + [cancel_label]
cancel_idx = len(items)
+82 -7
@@ -1,12 +1,19 @@
"""``hermes debug`` debug tools for Hermes Agent.
"""``hermes debug`` debug tools for Hermes Agent.
Currently supports:
hermes debug share Upload debug report (system info + logs) to a
paste service and print a shareable URL.
By default, log content is run through
``agent.redact.redact_sensitive_text`` with
``force=True`` before upload so credentials in
``~/.hermes/logs/*.log`` are not leaked into
the public paste service. Pass ``--no-redact``
to disable.
"""
import io
import json
import logging
import sys
import time
import urllib.error
@@ -19,6 +26,16 @@ from typing import Optional
from hermes_constants import get_hermes_home
from utils import atomic_replace
logger = logging.getLogger(__name__)
# Banner prepended to upload-bound log content when redaction is enabled.
# Visible in the public paste so reviewers know the content was sanitized.
# Kept short; the trailing newline guarantees the banner sits on its own line.
_REDACTION_BANNER = (
"[hermes debug share: log content redacted at upload time. "
"run with --no-redact to disable]\n"
)
# ---------------------------------------------------------------------------
# Paste services — try paste.rs first, dpaste.com as fallback.
@@ -368,17 +385,40 @@ def _resolve_log_path(log_name: str) -> Optional[Path]:
return None
def _redact_log_text(text: str) -> str:
"""Run ``redact_sensitive_text`` with ``force=True`` over upload-bound text.
Uses ``force=True`` so redaction fires regardless of the operator's
``security.redact_secrets`` setting. The local on-disk log file is
not modified; only the in-memory copy headed for the public paste
service is sanitized. Returns the redacted text (or the original
when empty / non-string).
"""
if not text:
return text
from agent.redact import redact_sensitive_text
return redact_sensitive_text(text, force=True)
def _capture_log_snapshot(
log_name: str,
*,
tail_lines: int,
max_bytes: int = _MAX_LOG_BYTES,
redact: bool = True,
) -> LogSnapshot:
"""Capture a log once and derive summary/full-log views from it.
The report tail and standalone log upload must come from the same file
snapshot. Otherwise a rotation/truncate between reads can make the report
look newer than the uploaded ``agent.log`` paste.
When ``redact`` is True (the default), both ``tail_text`` and
``full_text`` are run through ``_redact_log_text`` so the snapshot
returned is upload-safe. The on-disk log file is never modified.
Pass ``redact=False`` to capture original log content (used by
``hermes debug share --no-redact``).
"""
log_path = _resolve_log_path(log_name)
if log_path is None:
@@ -438,18 +478,34 @@ def _capture_log_snapshot(
if truncated:
full_text = f"[... truncated — showing last ~{max_bytes // 1024}KB ...]\n{full_text}"
if redact:
tail_text = _redact_log_text(tail_text)
full_text = _redact_log_text(full_text)
return LogSnapshot(path=log_path, tail_text=tail_text, full_text=full_text)
except Exception as exc:
return LogSnapshot(path=log_path, tail_text=f"(error reading: {exc})", full_text=None)
def _capture_default_log_snapshots(log_lines: int) -> dict[str, LogSnapshot]:
"""Capture all logs used by debug-share exactly once."""
def _capture_default_log_snapshots(
log_lines: int, *, redact: bool = True
) -> dict[str, LogSnapshot]:
"""Capture all logs used by debug-share exactly once.
``redact`` is forwarded to each ``_capture_log_snapshot`` call so all
captured logs share the same redaction policy for a given run.
"""
errors_lines = min(log_lines, 100)
return {
"agent": _capture_log_snapshot("agent", tail_lines=log_lines),
"errors": _capture_log_snapshot("errors", tail_lines=errors_lines),
"gateway": _capture_log_snapshot("gateway", tail_lines=errors_lines),
"agent": _capture_log_snapshot(
"agent", tail_lines=log_lines, redact=redact
),
"errors": _capture_log_snapshot(
"errors", tail_lines=errors_lines, redact=redact
),
"gateway": _capture_log_snapshot(
"gateway", tail_lines=errors_lines, redact=redact
),
}
@@ -532,6 +588,7 @@ def run_debug_share(args):
log_lines = getattr(args, "lines", 200)
expiry = getattr(args, "expire", 7)
local_only = getattr(args, "local", False)
redact = not getattr(args, "no_redact", False)
if not local_only:
print(_PRIVACY_NOTICE)
@@ -539,8 +596,16 @@ def run_debug_share(args):
print("Collecting debug report...")
# Capture dump once — prepended to every paste for context.
# The dump is already redacted at extract time via dump.py:_redact;
# log_snapshots are redacted by _capture_default_log_snapshots when
# redact=True so credentials never reach the public paste service.
dump_text = _capture_dump()
log_snapshots = _capture_default_log_snapshots(log_lines)
log_snapshots = _capture_default_log_snapshots(log_lines, redact=redact)
if redact:
logger.info(
"hermes debug share: applied force-mode redaction to log snapshots before upload"
)
report = collect_debug_report(
log_lines=log_lines,
@@ -556,6 +621,15 @@ def run_debug_share(args):
if gateway_log:
gateway_log = dump_text + "\n\n--- full gateway.log ---\n" + gateway_log
# Visible banner so reviewers reading the public paste know redaction
# was applied at upload time. Banner is omitted under --no-redact.
if redact:
report = _REDACTION_BANNER + report
if agent_log:
agent_log = _REDACTION_BANNER + agent_log
if gateway_log:
gateway_log = _REDACTION_BANNER + gateway_log
if local_only:
print(report)
if agent_log:
@@ -666,6 +740,7 @@ def run_debug(args):
print(" --lines N Number of log lines to include (default: 200)")
print(" --expire N Paste expiry in days (default: 7)")
print(" --local Print report locally instead of uploading")
print(" --no-redact Disable upload-time secret redaction (default: redact)")
print()
print("Options (delete):")
print(" <url> ... One or more paste URLs to delete")
+5 -2
@@ -263,8 +263,11 @@ def run_doctor(args):
if env_path.exists():
check_ok(f"{_DHH}/.env file exists")
# Check for common issues
content = env_path.read_text()
# Check for common issues. Pin encoding to UTF-8 because .env files are
# written as UTF-8 everywhere in the codebase, while Path.read_text()
# defaults to the system locale — which crashes on non-UTF-8 Windows
# locales (e.g. GBK) as soon as the file contains any non-ASCII byte.
content = env_path.read_text(encoding="utf-8")
if _has_provider_env_config(content):
check_ok("API key or custom endpoint configured")
else:
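The reason for pinning the encoding in the hunk above can be reproduced without a Windows machine. The sketch below writes a UTF-8 `.env`-style file into a temporary directory, then decodes the same bytes as GBK to simulate a non-UTF-8 locale default. The file content is made up for the example. Depending on the byte sequence, the mismatched decode either raises or silently yields different characters, and the sketch handles both outcomes.

```python
import tempfile
from pathlib import Path

original = "GREETING=café\n"  # made-up content with one non-ASCII character

with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text(original, encoding="utf-8")

    # Pinned decoding round-trips regardless of the system locale.
    pinned = env_path.read_text(encoding="utf-8")

    # Simulate a GBK locale default on the same bytes.
    try:
        as_gbk = env_path.read_bytes().decode("gbk")
    except UnicodeDecodeError:
        as_gbk = None  # the crash case described in the comment above
```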
+67 -11
@@ -188,7 +188,7 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
SIGUSR1 is wired in gateway/run.py to ``request_restart(via_service=True)``
which drains in-flight agent runs (up to ``agent.restart_drain_timeout``
seconds), then exits with code 75. Both systemd (``Restart=on-failure``
seconds), then exits with code 75. Both systemd (``Restart=always``
+ ``RestartForceExitStatus=75``) and launchd (``KeepAlive.SuccessfulExit
= false``) relaunch the process after the graceful exit.
@@ -237,6 +237,26 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
return False
def _get_ancestor_pids() -> set[int]:
"""Return the set of PIDs in the current process's ancestor chain.
Walks from the current PID up to PID 1 (init) so that process-table scans
never match the calling CLI process or any of its parents. This prevents
``hermes gateway status`` from falsely counting the ``hermes`` CLI that
invoked it as a running gateway instance (see #13242).
"""
ancestors: set[int] = set()
pid = os.getpid()
# Cap iterations to avoid infinite loops on exotic platforms.
for _ in range(64):
ancestors.add(pid)
parent = _get_parent_pid(pid)
if parent is None or parent <= 0 or parent in ancestors:
break
pid = parent
return ancestors
def _append_unique_pid(pids: list[int], pid: int | None, exclude_pids: set[int]) -> None:
if pid is None or pid <= 0:
return
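The ancestor-chain walk added in `_get_ancestor_pids` can be exercised against a fake process table. In this sketch the parent lookup is passed in as a callable so no real process inspection is needed; `ancestor_pids` and the `_TABLE` mapping are hypothetical names for illustration, not part of the gateway module.

```python
def ancestor_pids(start_pid, get_parent, max_depth=64):
    """Collect start_pid and its parents, stopping at the root or on a cycle."""
    ancestors = set()
    pid = start_pid
    for _ in range(max_depth):  # cap iterations, as the original does
        ancestors.add(pid)
        parent = get_parent(pid)
        if parent is None or parent <= 0 or parent in ancestors:
            break
        pid = parent
    return ancestors


# Hypothetical process table: child PID -> parent PID.
_TABLE = {4242: 4000, 4000: 1, 1: 0, 5000: 1}

chain = ancestor_pids(4242, _TABLE.get)

# A process scan that matched the CLI (4242), its shell (4000), and a real
# gateway (5000) keeps only the gateway once the chain is excluded.
candidates = [4242, 4000, 5000]
gateways = [p for p in candidates if p not in chain]
```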
@@ -252,6 +272,10 @@ def _scan_gateway_pids(exclude_pids: set[int], all_profiles: bool = False) -> li
a live gateway when the PID file is stale/missing, and ``--all`` sweeps can
discover gateways outside the current profile.
"""
# Exclude the entire ancestor chain so the CLI process that invoked this
# scan (e.g. ``hermes gateway status``) is never mistaken for a running
# gateway. See #13242.
exclude_pids = exclude_pids | _get_ancestor_pids()
pids: list[int] = []
patterns = [
"hermes_cli.main gateway",
@@ -690,6 +714,32 @@ def _print_gateway_process_mismatch(snapshot: GatewayRuntimeSnapshot) -> None:
print(" can refuse to start another copy until this process stops.")
def _print_other_profiles_gateway_status() -> None:
"""Print a summary of gateway status across all profiles.
Shown at the bottom of ``hermes gateway status`` output so users with
multiple profiles can tell at a glance which gateways are running and
avoid confusing another profile's process with the current one.
"""
try:
from hermes_cli.profiles import get_active_profile_name
current = get_active_profile_name()
other_processes = [
p for p in find_profile_gateway_processes()
if p.profile != current
]
if not other_processes:
return
print()
print("Other profiles:")
for proc in other_processes:
print(f"{proc.profile:<16s} — PID {proc.pid}")
except Exception:
pass
def kill_gateway_processes(force: bool = False, exclude_pids: set | None = None,
all_profiles: bool = False) -> int:
"""Kill any running gateway processes. Returns count killed.
@@ -1655,8 +1705,7 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
Description={SERVICE_DESCRIPTION}
After=network-online.target
Wants=network-online.target
StartLimitIntervalSec=600
StartLimitBurst=5
StartLimitIntervalSec=0
[Service]
Type=simple
@@ -1670,8 +1719,10 @@ Environment="LOGNAME={username}"
Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Environment="HERMES_HOME={hermes_home}"
Restart=on-failure
RestartSec=30
Restart=always
RestartSec=60
RestartMaxDelaySec=300
RestartSteps=5
RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
KillMode=mixed
KillSignal=SIGTERM
@@ -1691,9 +1742,9 @@ WantedBy=multi-user.target
sane_path = ":".join(path_entries)
return f"""[Unit]
Description={SERVICE_DESCRIPTION}
After=network.target
StartLimitIntervalSec=600
StartLimitBurst=5
After=network-online.target
Wants=network-online.target
StartLimitIntervalSec=0
[Service]
Type=simple
@@ -1702,8 +1753,10 @@ WorkingDirectory={working_dir}
Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Environment="HERMES_HOME={hermes_home}"
Restart=on-failure
RestartSec=30
Restart=always
RestartSec=60
RestartMaxDelaySec=300
RestartSteps=5
RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
KillMode=mixed
KillSignal=SIGTERM
@@ -2451,7 +2504,7 @@ def run_gateway(verbose: int = 0, quiet: bool = False, replace: bool = False):
print()
# Exit with code 1 if gateway fails to connect any platform,
# so systemd Restart=on-failure will retry on transient errors
# so systemd Restart=always will retry on transient errors
verbosity = None if quiet else verbose
try:
success = asyncio.run(start_gateway(replace=replace, verbosity=verbosity))
@@ -4453,6 +4506,9 @@ def _gateway_command_inner(args):
print(" hermes gateway install # Install as user service")
print(" sudo hermes gateway install --system # Install as boot-time system service")
# Show other profiles' gateway status for multi-profile awareness
_print_other_profiles_gateway_status()
elif subcmd == "migrate-legacy":
# Stop, disable, and remove legacy Hermes gateway unit files from
# pre-rename installs (e.g. hermes.service). Profile units and
+1 -1
@@ -366,7 +366,7 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
# --- log ---
p_log = sub.add_parser(
"log",
help="Print the worker log for a task (from $HERMES_HOME/kanban/logs/)",
help="Print the worker log for a task (from <kanban-root>/kanban/logs/)",
)
p_log.add_argument("task_id")
p_log.add_argument("--tail", type=int, default=None,
+91 -19
@@ -1,8 +1,28 @@
"""SQLite-backed Kanban board for multi-profile collaboration.
The board lives at ``$HERMES_HOME/kanban.db`` (profile-agnostic on purpose:
multiple profiles on the same machine all see the same board, which IS the
coordination primitive).
The board lives at ``<root>/kanban.db`` where ``<root>`` is the **shared
Hermes root** (the parent of any active profile). Profiles intentionally
collapse onto a single board: it IS the cross-profile coordination
primitive. A worker spawned with ``hermes -p <profile>`` joins the same
board as the dispatcher that claimed the task. The same applies to
``<root>/kanban/workspaces/`` and ``<root>/kanban/logs/``.
In standard installs ``<root>`` is ``~/.hermes``. In Docker / custom
deployments where ``HERMES_HOME`` points outside ``~/.hermes`` (e.g.
``/opt/hermes``), ``<root>`` is ``HERMES_HOME``. Three env-var overrides
are available (highest precedence first, all optional):
* ``HERMES_KANBAN_DB``: pin the database file path directly.
* ``HERMES_KANBAN_WORKSPACES_ROOT``: pin the workspaces root directly.
* ``HERMES_KANBAN_HOME``: pin the umbrella root that anchors all three
kanban paths (db + workspaces + logs). Useful for tests and unusual
deployments where a single override is enough.
The dispatcher injects ``HERMES_KANBAN_DB`` and
``HERMES_KANBAN_WORKSPACES_ROOT`` into the worker subprocess env as a
defense-in-depth measure: even if the worker's ``get_default_hermes_root()``
resolution somehow disagrees with the dispatcher's (unusual symlink or
Docker layout), the two processes still converge on the same files.
Schema is intentionally small: tasks, task_links, task_comments,
task_events. The ``workspace_kind`` field decouples coordination from git
@@ -61,16 +81,57 @@ _CTX_MAX_COMMENT_BYTES = 2 * 1024 # 2 KB per comment
# Paths
# ---------------------------------------------------------------------------
def kanban_home() -> Path:
"""Return the shared Hermes root that anchors the kanban board.
Resolution order:
1. ``HERMES_KANBAN_HOME`` env var when set and non-empty (explicit
override for tests and unusual deployments).
2. ``get_default_hermes_root()``, which already returns ``<root>``
when ``HERMES_HOME`` is ``<root>/profiles/<name>``, and returns
``HERMES_HOME`` directly for Docker / custom deployments.
The kanban board is shared across profiles **by design** (see the
module docstring). Resolving the kanban paths through the active
profile's ``HERMES_HOME`` would silently fork the board per profile,
which breaks the dispatcher / worker handoff.
"""
override = os.environ.get("HERMES_KANBAN_HOME", "").strip()
if override:
return Path(override).expanduser()
from hermes_constants import get_default_hermes_root
return get_default_hermes_root()
def kanban_db_path() -> Path:
"""Return the path to ``kanban.db`` inside the active HERMES_HOME."""
from hermes_constants import get_hermes_home
return get_hermes_home() / "kanban.db"
"""Return the path to the shared ``kanban.db``.
Anchored at :func:`kanban_home`, not the active profile's
``HERMES_HOME``, so profile workers and the dispatcher converge on
the same board. ``HERMES_KANBAN_DB`` pins the path directly (highest
precedence); the dispatcher injects this into worker subprocess env
as defense-in-depth.
"""
override = os.environ.get("HERMES_KANBAN_DB", "").strip()
if override:
return Path(override).expanduser()
return kanban_home() / "kanban.db"
def workspaces_root() -> Path:
"""Return the directory under which ``scratch`` workspaces are created."""
from hermes_constants import get_hermes_home
return get_hermes_home() / "kanban" / "workspaces"
"""Return the directory under which ``scratch`` workspaces are created.
Anchored at :func:`kanban_home` so workspace paths are stable across
profile workers spawned by the dispatcher.
``HERMES_KANBAN_WORKSPACES_ROOT`` pins the path directly (highest
precedence); the dispatcher injects this into worker subprocess env
as defense-in-depth.
"""
override = os.environ.get("HERMES_KANBAN_WORKSPACES_ROOT", "").strip()
if override:
return Path(override).expanduser()
return kanban_home() / "kanban" / "workspaces"
# ---------------------------------------------------------------------------
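The override precedence implemented by `kanban_home()`, `kanban_db_path()`, and `workspaces_root()` can be summarized in one function. This is a sketch under stated assumptions: `resolve_kanban_paths` is a hypothetical helper, the environment is passed in as a plain mapping, and `default_root` stands in for whatever `get_default_hermes_root()` would return.

```python
from pathlib import Path


def resolve_kanban_paths(env, default_root):
    """Resolve db/workspaces/logs paths using the precedence described above."""

    def override(name):
        # Empty or whitespace-only values count as unset.
        value = env.get(name, "").strip()
        return Path(value).expanduser() if value else None

    # The umbrella root anchors all three paths unless a direct pin wins.
    home = override("HERMES_KANBAN_HOME") or Path(default_root)
    db = override("HERMES_KANBAN_DB") or home / "kanban.db"
    workspaces = override("HERMES_KANBAN_WORKSPACES_ROOT") or home / "kanban" / "workspaces"
    return {"db": db, "workspaces": workspaces, "logs": home / "kanban" / "logs"}


defaults = resolve_kanban_paths({}, "/opt/hermes")
pinned = resolve_kanban_paths(
    {"HERMES_KANBAN_HOME": "/srv/board", "HERMES_KANBAN_DB": "/data/k.db"},
    "/opt/hermes",
)
```

Note that the logs directory has no direct pin of its own in the diff; it follows the umbrella root only.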
@@ -1516,12 +1577,15 @@ def archive_task(conn: sqlite3.Connection, task_id: str) -> bool:
def resolve_workspace(task: Task) -> Path:
"""Resolve (and create if needed) the workspace for a task.
- ``scratch``: a fresh dir under ``$HERMES_HOME/kanban/workspaces/<id>/``.
- ``scratch``: a fresh dir under ``<kanban-root>/kanban/workspaces/<id>/``,
where ``<kanban-root>`` is the shared Hermes root (see
:func:`kanban_home`). The path is the same for the dispatcher and
every profile worker, so handoff is path-stable.
- ``dir:<path>``: the path stored in ``workspace_path``. Created
if missing. MUST be absolute; relative paths are rejected to
prevent confused-deputy traversal where ``../../../tmp/attacker``
resolves against the dispatcher's CWD instead of a meaningful
root. Users who want a HERMES_HOME-relative workspace should
root. Users who want a kanban-root-relative workspace should
compute the absolute path themselves.
- ``worktree``: a git worktree at ``workspace_path``. Not created
automatically in v1 -- the kanban-worker skill documents
@@ -2070,6 +2134,14 @@ def _default_spawn(task: Task, workspace: str) -> Optional[int]:
env["HERMES_TENANT"] = task.tenant
env["HERMES_KANBAN_TASK"] = task.id
env["HERMES_KANBAN_WORKSPACE"] = workspace
# Pin the shared board + workspaces root the dispatcher resolved, so
# that even when the worker activates a profile (`hermes -p <name>`
# rewrites HERMES_HOME), its kanban paths still match the
# dispatcher's. Belt-and-braces with the `get_default_hermes_root()`
# resolution in `kanban_home()` — symmetric resolution is the norm,
# but unusual symlink / Docker layouts are caught here too.
env["HERMES_KANBAN_DB"] = str(kanban_db_path())
env["HERMES_KANBAN_WORKSPACES_ROOT"] = str(workspaces_root())
# HERMES_PROFILE is the author the kanban_comment tool defaults to.
# `hermes -p <assignee>` activates the profile, but the env var is
# what the tool reads — set it explicitly here so comments are
@@ -2104,9 +2176,10 @@ def _default_spawn(task: Task, workspace: str) -> Optional[int]:
"chat",
"-q", prompt,
])
# Redirect output to a per-task log under HERMES_HOME/kanban/logs/.
from hermes_constants import get_hermes_home
log_dir = get_hermes_home() / "kanban" / "logs"
# Redirect output to a per-task log under <kanban-root>/kanban/logs/.
# Anchored at the shared kanban root, not the worker's profile home,
# so `hermes kanban tail` reads the same file the worker writes to.
log_dir = kanban_home() / "kanban" / "logs"
log_dir.mkdir(parents=True, exist_ok=True)
log_path = log_dir / f"{task.id}.log"
_rotate_worker_log(log_path, DEFAULT_LOG_ROTATE_BYTES)
@@ -2591,8 +2664,7 @@ def gc_worker_logs(
"""Delete worker log files older than ``older_than_seconds``. Returns
the number of files removed. Kept separate from ``gc_events`` because
log files live on disk, not in SQLite."""
from hermes_constants import get_hermes_home
log_dir = get_hermes_home() / "kanban" / "logs"
log_dir = kanban_home() / "kanban" / "logs"
if not log_dir.exists():
return 0
cutoff = time.time() - older_than_seconds
@@ -2614,8 +2686,7 @@ def gc_worker_logs(
def worker_log_path(task_id: str) -> Path:
"""Return the path to a worker's log file. The file may not exist
(task never spawned, or log already GC'd)."""
from hermes_constants import get_hermes_home
return get_hermes_home() / "kanban" / "logs" / f"{task_id}.log"
return kanban_home() / "kanban" / "logs" / f"{task_id}.log"
def read_worker_log(
@@ -2661,7 +2732,8 @@ def list_profiles_on_disk() -> list[str]:
``config.yaml``; a bare dir without config isn't a real profile.
"""
try:
home = Path.home() / ".hermes" / "profiles"
from hermes_constants import get_default_hermes_root
home = get_default_hermes_root() / "profiles"
except Exception:
return []
if not home.is_dir():
+36 -5
@@ -289,7 +289,7 @@ def _has_any_provider_configured() -> bool:
env_file = get_env_path()
if env_file.exists():
try:
for line in env_file.read_text().splitlines():
for line in env_file.read_text(encoding="utf-8").splitlines():
line = line.strip()
if line.startswith("#") or "=" not in line:
continue
@@ -837,7 +837,17 @@ def _print_tui_exit_summary(session_id: Optional[str], active_session_file: Opti
)
_NPM_LOCK_RUNTIME_KEYS = frozenset({"ideallyInert"})
_NPM_LOCK_RUNTIME_KEYS = frozenset({"ideallyInert", "peer"})
"""Lockfile fields npm writes non-deterministically at install time.
``ideallyInert`` is npm's runtime annotation for packages it skipped installing
(per-platform opt-outs). ``peer`` is dropped from the hidden ``.package-lock.json``
on dev-dependencies that are *also* declared as peers the canonical
``package-lock.json`` records the dual role, but npm 9's actualized tree strips
it. Neither key represents a real skew between what was declared and what was
installed, so we exclude them from the comparison in :func:`_tui_need_npm_install`
to avoid false-positive reinstalls on every launch.
"""
def _tui_need_npm_install(root: Path) -> bool:
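The body of `_tui_need_npm_install` is not shown in this hunk, so the following is only an assumed shape of the comparison the docstring describes: drop the runtime-only keys from each package entry, then compare what remains. `strip_runtime_keys`, `locks_differ`, and the sample lockfile fragments are hypothetical.

```python
_RUNTIME_KEYS = frozenset({"ideallyInert", "peer"})


def strip_runtime_keys(packages):
    """Return a copy of a lockfile 'packages' mapping without runtime-only keys."""
    return {
        name: {k: v for k, v in meta.items() if k not in _RUNTIME_KEYS}
        for name, meta in packages.items()
    }


def locks_differ(declared, installed):
    """True when the two lockfiles disagree on anything other than runtime keys."""
    return strip_runtime_keys(declared) != strip_runtime_keys(installed)


# Made-up fragments: the canonical lock records `peer`, the hidden one does not.
declared = {"node_modules/react": {"version": "18.2.0", "dev": True, "peer": True}}
installed = {"node_modules/react": {"version": "18.2.0", "dev": True}}
bumped = {"node_modules/react": {"version": "18.3.1", "dev": True}}
```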
@@ -1042,17 +1052,21 @@ def _make_tui_argv(tui_dir: Path, tui_dev: bool) -> tuple[list[str], Path]:
if _tui_need_npm_install(tui_dir):
if not os.environ.get("HERMES_QUIET"):
print("Installing TUI dependencies…")
# Capture stdout as well as stderr — some npm errors (notably EACCES on a
# root-owned node_modules in containers) are emitted on stdout, and a
# bare "npm install failed." with no preview defeats debugging. We keep
# the failure-only print path so a successful install stays silent.
result = subprocess.run(
[npm, "install", "--silent", "--no-fund", "--no-audit", "--progress=false"],
cwd=str(tui_dir),
stdout=subprocess.DEVNULL,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
env={**os.environ, "CI": "1"},
)
if result.returncode != 0:
err = (result.stderr or "").strip()
preview = "\n".join(err.splitlines()[-30:])
combined = f"{result.stdout or ''}\n{result.stderr or ''}".strip()
preview = "\n".join(combined.splitlines()[-30:])
print("npm install failed.")
if preview:
print(preview)
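The hunk above switches the failure preview from stderr only to stdout plus stderr. A small sketch of the same pattern follows, assuming a stand-in Python child process instead of npm; `failure_preview` is a hypothetical helper name, not something in the codebase.

```python
import subprocess
import sys


def failure_preview(stdout, stderr, max_lines=30):
    """Join both streams and keep only the last ``max_lines`` lines."""
    combined = f"{stdout or ''}\n{stderr or ''}".strip()
    return "\n".join(combined.splitlines()[-max_lines:])


# Stand-in for a tool that reports its error on stdout and exits non-zero.
child = "import sys; print('EACCES: permission denied'); sys.exit(1)"
result = subprocess.run(
    [sys.executable, "-c", child],
    stdout=subprocess.PIPE,  # DEVNULL here would discard the only error text
    stderr=subprocess.PIPE,
    text=True,
)
if result.returncode != 0:
    print("install failed.")
    preview = failure_preview(result.stdout, result.stderr)
    if preview:
        print(preview)
```

Because output is printed only on a non-zero return code, a successful run stays silent, as the comment in the hunk intends.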
@@ -8460,6 +8474,12 @@ def main():
)
slack_parser.set_defaults(func=cmd_slack)
# =========================================================================
# send command — pipe shell-script output to any configured platform
# =========================================================================
from hermes_cli.send_cmd import register_send_subparser
register_send_subparser(subparsers)
# =========================================================================
# login command
# =========================================================================
@@ -8891,6 +8911,7 @@ Examples:
hermes debug share --lines 500 Include more log lines
hermes debug share --expire 30 Keep paste for 30 days
hermes debug share --local Print report locally (no upload)
hermes debug share --no-redact Disable upload-time secret redaction
hermes debug delete <url> Delete a previously uploaded paste
""",
)
@@ -8916,6 +8937,16 @@ Examples:
action="store_true",
help="Print the report locally instead of uploading",
)
share_parser.add_argument(
"--no-redact",
action="store_true",
help=(
"Disable upload-time secret redaction (default: redact). Logs "
"are normally run through agent.redact.redact_sensitive_text "
"with force=True before upload so credentials are not leaked "
"into the public paste service."
),
)
delete_parser = debug_sub.add_parser(
"delete",
help="Delete a paste uploaded by 'hermes debug share'",
+1 -1
@@ -361,7 +361,7 @@ def _write_env_vars(env_path: Path, env_writes: dict) -> None:
existing_lines = []
if env_path.exists():
existing_lines = env_path.read_text().splitlines()
existing_lines = env_path.read_text(encoding="utf-8").splitlines()
updated_keys = set()
new_lines = []
+63 -6
@@ -904,6 +904,26 @@ def switch_model(
if any(m.get("name") == new_model for m in cfg_models if isinstance(m, dict)):
override = True
break
# Also check custom_providers list — models declared there should be accepted
# even if the remote /v1/models endpoint doesn't list them.
if not override and custom_providers and isinstance(custom_providers, list):
for entry in custom_providers:
if not isinstance(entry, dict):
continue
# Match by provider slug (custom:<name>) or by base_url
entry_name = entry.get("name", "")
entry_slug = f"custom:{entry_name}" if entry_name else ""
entry_url = entry.get("base_url", "")
if entry_slug == target_provider or entry_url == base_url:
# Check if the requested model matches the entry's model
entry_model = entry.get("model", "")
entry_models = entry.get("models", {})
if new_model == entry_model:
override = True
break
if isinstance(entry_models, dict) and new_model in entry_models:
override = True
break
if override:
validation = {"accepted": True, "persist": True, "recognized": False, "message": validation.get("message", "")}
else:
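The `custom_providers` check added above can be read as a standalone predicate. The sketch below restates it under that assumption; the function name and the sample provider entry (`lab`, the localhost URL, `model-a`, `model-b`) are hypothetical and only serve to exercise the matching rules.

```python
def declared_in_custom_providers(custom_providers, target_provider, base_url, new_model):
    """True when a matching custom provider entry declares ``new_model``."""
    if not isinstance(custom_providers, list):
        return False
    for entry in custom_providers:
        if not isinstance(entry, dict):
            continue
        name = entry.get("name", "")
        slug = f"custom:{name}" if name else ""
        # An entry matches by provider slug (custom:<name>) or by base_url.
        if slug != target_provider and entry.get("base_url", "") != base_url:
            continue
        models = entry.get("models", {})
        if new_model == entry.get("model", ""):
            return True
        if isinstance(models, dict) and new_model in models:
            return True
    return False


providers = [
    {
        "name": "lab",
        "base_url": "http://localhost:8000/v1",
        "model": "model-a",
        "models": {"model-b": {}},
    }
]

print(declared_in_custom_providers(providers, "custom:lab", "", "model-a"))
print(declared_in_custom_providers(providers, "other", "http://localhost:8000/v1", "model-b"))
print(declared_in_custom_providers(providers, "custom:lab", "", "model-c"))
```

As in the hunk, an entry with an empty `base_url` also matches a caller that passes an empty `base_url`, so the slug is the more reliable key.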
@@ -1057,6 +1077,45 @@ def list_authenticated_providers(
if normed:
_builtin_endpoints.add(normed)
def _has_fast_aws_sdk_signal() -> bool:
"""Return True when explicit AWS auth config is present.
This intentionally avoids botocore's full credential chain. Provider
picker/model-switch discovery can run for non-Bedrock providers, and
botocore may otherwise probe EC2 IMDS (169.254.169.254) on local
machines before returning no credentials.
"""
if os.environ.get("AWS_BEARER_TOKEN_BEDROCK", "").strip():
return True
if (
os.environ.get("AWS_ACCESS_KEY_ID", "").strip()
and os.environ.get("AWS_SECRET_ACCESS_KEY", "").strip()
):
return True
return any(
os.environ.get(name, "").strip()
for name in (
"AWS_PROFILE",
"AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
"AWS_CONTAINER_CREDENTIALS_FULL_URI",
"AWS_WEB_IDENTITY_TOKEN_FILE",
)
)
def _has_aws_sdk_creds_for_listing(slug: str) -> bool:
"""Credential check for AWS SDK providers in non-runtime discovery."""
slug_norm = str(slug or "").strip().lower()
current_norm = str(current_provider or "").strip().lower()
if _has_fast_aws_sdk_signal():
return True
if slug_norm != current_norm:
return False
try:
from agent.bedrock_adapter import has_aws_credentials
return bool(has_aws_credentials())
except Exception:
return False
data = fetch_models_dev()
# Build curated model lists keyed by hermes provider ID
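The fast AWS signal check above only inspects environment variables, so it can be exercised without botocore. The sketch below takes the environment as a plain mapping to make that visible; `has_fast_aws_signal` and its `env` parameter are illustrative assumptions, while the variable names come from the hunk.

```python
from typing import Mapping

# Variables that count as a signal on their own (names taken from the hunk above).
_SINGLE_VAR_SIGNALS = (
    "AWS_BEARER_TOKEN_BEDROCK",
    "AWS_PROFILE",
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
    "AWS_WEB_IDENTITY_TOKEN_FILE",
)


def has_fast_aws_signal(env: Mapping[str, str]) -> bool:
    """Return True when explicit AWS auth config is present in ``env``."""

    def present(name: str) -> bool:
        return bool(env.get(name, "").strip())

    # Static keys only count as a pair.
    if present("AWS_ACCESS_KEY_ID") and present("AWS_SECRET_ACCESS_KEY"):
        return True
    return any(present(name) for name in _SINGLE_VAR_SIGNALS)


print(has_fast_aws_signal({"AWS_PROFILE": "dev"}))
print(has_fast_aws_signal({"AWS_ACCESS_KEY_ID": "example-id"}))
print(has_fast_aws_signal({"AWS_PROFILE": "   "}))
```

Nothing here touches the network, which is the point of the change: the full credential chain is reserved for the currently selected provider.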
@@ -1184,7 +1243,9 @@ def list_authenticated_providers(
# Check if credentials exist
has_creds = False
if overlay.extra_env_vars:
if overlay.auth_type == "aws_sdk":
has_creds = _has_aws_sdk_creds_for_listing(hermes_slug)
elif overlay.extra_env_vars:
has_creds = any(os.environ.get(ev) for ev in overlay.extra_env_vars)
# Also check api_key_env_vars from PROVIDER_REGISTRY for api_key auth_type
if not has_creds and overlay.auth_type == "api_key":
@@ -1324,11 +1385,7 @@ def list_authenticated_providers(
# credentials come from the boto3 credential chain (env vars,
# ~/.aws/credentials, instance roles, etc.)
if not _cp_has_creds and _cp_config and getattr(_cp_config, "auth_type", "") == "aws_sdk":
try:
from agent.bedrock_adapter import has_aws_credentials
_cp_has_creds = has_aws_credentials()
except Exception:
pass
_cp_has_creds = _has_aws_sdk_creds_for_listing(_cp.slug)
if not _cp_has_creds:
continue
+1 -1
@@ -3087,7 +3087,7 @@ def validate_requested_model(
"message": f"Model `{requested}` was not found in LM Studio's model listing.",
}
if normalized == "custom":
if normalized == "custom" or normalized.startswith("custom:"):
# Try probing with correct auth for the api_mode.
if api_mode == "anthropic_messages":
probe = probe_api_models(api_key, base_url, api_mode=api_mode)
+445
@@ -0,0 +1,445 @@
"""CLI subcommand: ``hermes send`` — pipe text from shell scripts to any
configured messaging platform (Telegram, Discord, Slack, Signal, SMS, etc.).
This is a thin wrapper around ``tools.send_message_tool.send_message_tool``
that exposes its functionality as a standalone CLI entry point so ops
scripts, cron jobs, CI hooks, and monitoring daemons can reuse the gateway's
already-configured credentials without having to reimplement each platform's
REST API client.
Design notes:
* No LLM, no agent loop: the subcommand just resolves arguments, reads the
message body, calls the shared tool function, and prints/returns the
result. It is intentionally fast, cheap, and side-effect-only.
* For platforms that send via bot token (Telegram, Discord, Slack, Signal,
SMS, WhatsApp-CloudAPI, ...) no running gateway is required. The tool
talks directly to each platform's REST endpoint. For platforms that rely
on a persistent adapter connection (plugin platforms, Matrix in some
modes, ...) a live gateway is needed; the underlying tool surfaces that
error to the caller.
* Exit codes follow the classic Unix convention:
0 delivery (or list) succeeded
1 delivery failed at the platform level
2 usage / argument / config error (argparse already uses 2)
"""
from __future__ import annotations
import argparse
import json
import sys
from pathlib import Path
from typing import Optional
_USAGE_EXIT = 2
_FAILURE_EXIT = 1
_SUCCESS_EXIT = 0
def _read_message_body(
positional: Optional[str],
file_path: Optional[str],
) -> Optional[str]:
"""Resolve the message body from (in order):
1. An explicit positional message argument.
2. ``--file PATH`` or ``--file -`` (where ``-`` means stdin).
3. Piped stdin when it is not attached to a TTY.
Returns ``None`` when nothing is available; callers must treat that as
a usage error.
"""
if positional:
return positional
if file_path:
if file_path == "-":
return sys.stdin.read()
try:
return Path(file_path).read_text(encoding="utf-8")
except OSError as exc:
print(f"hermes send: cannot read {file_path}: {exc}", file=sys.stderr)
sys.exit(_USAGE_EXIT)
# Piped input: only consume stdin when it is not a TTY. Reading from a
# TTY would block the user in a half-broken "type your message" state,
# which is a poor default for an ops CLI.
if not sys.stdin.isatty():
data = sys.stdin.read()
if data:
return data
return None
def _resolve_target(arg_to: Optional[str]) -> Optional[str]:
"""Return a cleaned ``--to`` value, or ``None`` when nothing is set."""
if arg_to and arg_to.strip():
return arg_to.strip()
return None
def _emit_result(
result_json: str,
*,
json_mode: bool,
quiet: bool,
) -> int:
"""Print the tool result in the requested format and return the exit code.
The underlying ``send_message_tool`` always returns a JSON string. We
parse it, decide success/failure, and format accordingly.
"""
try:
payload = json.loads(result_json) if result_json else {}
except json.JSONDecodeError:
# Shouldn't happen with the shared tool, but be defensive — pass the
# raw string through so the user can still see what went wrong.
payload = {"error": "invalid JSON from send_message_tool", "raw": result_json}
if json_mode:
print(json.dumps(payload, indent=2))
elif quiet:
pass
else:
if payload.get("error"):
print(f"hermes send: {payload['error']}", file=sys.stderr)
elif payload.get("success"):
note = payload.get("note")
if note:
print(note)
else:
print("sent")
else:
# Unknown shape — dump it so nothing is silently dropped.
print(json.dumps(payload, indent=2))
if payload.get("error"):
return _FAILURE_EXIT
if payload.get("skipped"):
return _SUCCESS_EXIT
if payload.get("success"):
return _SUCCESS_EXIT
# Unknown / unexpected — treat as failure so scripts notice.
return _FAILURE_EXIT
def _list_targets(platform_filter: Optional[str], *, json_mode: bool) -> int:
"""Print the channel directory (all configured targets across platforms).
Uses ``load_directory()`` for structured JSON output and
``format_directory_for_display()`` for the human-readable rendering that
the send_message tool itself shows to the model, which keeps the two surfaces
identical.
"""
try:
from gateway.channel_directory import (
format_directory_for_display,
load_directory,
)
except Exception as exc:
print(f"hermes send: failed to load channel directory: {exc}", file=sys.stderr)
return _FAILURE_EXIT
try:
raw = load_directory()
except Exception as exc:
print(f"hermes send: failed to read channel directory: {exc}", file=sys.stderr)
return _FAILURE_EXIT
platforms = dict(raw.get("platforms") or {})
if platform_filter:
key = platform_filter.strip().lower()
filtered = {k: v for k, v in platforms.items() if k.lower() == key}
if not filtered:
print(
f"hermes send: no targets found for platform '{platform_filter}'. "
f"Configured: {', '.join(sorted(platforms)) or '(none)'}",
file=sys.stderr,
)
return _FAILURE_EXIT
platforms = filtered
if json_mode:
print(json.dumps({"platforms": platforms}, indent=2, default=str))
return _SUCCESS_EXIT
if not any(platforms.values()):
print("No messaging platforms configured or no channels discovered yet.")
print("Set one up with `hermes gateway setup`, or run the gateway once so")
print("channel discovery can populate ~/.hermes/channel_directory.json.")
return _SUCCESS_EXIT
# Human display — when unfiltered, reuse the shared formatter the agent
# already sees. When filtered, build a minimal view ourselves.
if platform_filter is None:
print(format_directory_for_display())
return _SUCCESS_EXIT
for plat_name in sorted(platforms):
channels = platforms[plat_name]
print(f"{plat_name}:")
if not channels:
print(" (no channels discovered yet)")
continue
for ch in channels:
name = ch.get("name", "?")
chat_id = ch.get("id") or ch.get("chat_id") or ""
suffix = f" [{chat_id}]" if chat_id and chat_id != name else ""
print(f" {plat_name}:{name}{suffix}")
print()
return _SUCCESS_EXIT
def _load_hermes_env() -> None:
"""Populate ``os.environ`` from ``~/.hermes/.env`` AND bridge top-level
``config.yaml`` keys into the environment so the underlying gateway
config loader sees platform credentials and home channel IDs.
``send_message_tool`` reads tokens and home-channel IDs via
``os.getenv(...)`` on each call. The gateway process does two things at
startup that ``hermes send`` must replicate when invoked standalone:
1. ``load_dotenv(~/.hermes/.env)``: brings bot tokens into the env.
2. Bridge top-level simple values from ``~/.hermes/config.yaml`` into
``os.environ`` (without overriding existing env vars). This is where
``TELEGRAM_HOME_CHANNEL`` and friends live when the user saved them
via ``hermes config set``.
See ``gateway/run.py`` for the canonical version of this bridge; we
intentionally reimplement the minimum needed here so ``hermes send``
doesn't pull in the full gateway module just to resolve a home channel.
"""
# Step 1: dotenv
try:
from dotenv import load_dotenv
except Exception:
load_dotenv = None # type: ignore[assignment]
try:
from hermes_cli.config import get_hermes_home
home = get_hermes_home()
except Exception:
return
env_path = home / ".env"
if load_dotenv and env_path.exists():
try:
load_dotenv(str(env_path), override=True, encoding="utf-8")
except UnicodeDecodeError:
try:
load_dotenv(str(env_path), override=True, encoding="latin-1")
except Exception:
pass
except Exception:
pass
# Step 2: bridge top-level config.yaml values into the environment so
# gateway.config.load_gateway_config() sees them. Scalars only; don't
# override values already in the env.
import os
config_path = home / "config.yaml"
if not config_path.exists():
return
try:
import yaml # type: ignore[import-not-found]
except Exception:
return
try:
with open(config_path, "r", encoding="utf-8") as fh:
raw = yaml.safe_load(fh) or {}
except Exception:
return
try:
from hermes_cli.config import _expand_env_vars
raw = _expand_env_vars(raw)
except Exception:
pass
if not isinstance(raw, dict):
return
for key, val in raw.items():
if not isinstance(val, (str, int, float, bool)):
continue
if key in os.environ:
continue
os.environ[key] = str(val)
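Step 2 of `_load_hermes_env` above can be isolated from file loading. The sketch below assumes the YAML has already been parsed into a dict and uses a plain dict in place of `os.environ`; `bridge_scalars` and the `DEBUG` key are hypothetical, chosen only to show the three rules (scalars only, no override, stringified values).

```python
def bridge_scalars(config: dict, environ: dict) -> list:
    """Copy top-level scalar config values into ``environ``; return the keys added."""
    added = []
    for key, val in config.items():
        if not isinstance(val, (str, int, float, bool)):
            continue  # nested sections such as ``agent:`` are skipped
        if key in environ:
            continue  # an existing environment value always wins
        environ[key] = str(val)
        added.append(key)
    return added


env = {"TELEGRAM_HOME_CHANNEL": "from-env"}
cfg = {
    "TELEGRAM_HOME_CHANNEL": "from-config",
    "SLACK_HOME_CHANNEL": "C0123ABCD",
    "agent": {"max_turns": 90},
    "DEBUG": True,
}

print(bridge_scalars(cfg, env))
print(env["TELEGRAM_HOME_CHANNEL"], env["DEBUG"])
```

Note that booleans are bridged through `str()`, so `True` becomes the string `"True"`; any consumer reading the variable has to parse it accordingly.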
def cmd_send(args: argparse.Namespace) -> None:
"""Entry point wired into the top-level argparse dispatcher."""
# Bridge ~/.hermes/.env and ~/.hermes/config.yaml into os.environ so the
# gateway config loader (invoked downstream by send_message_tool and by
# the channel directory) can see platform credentials and home channels.
_load_hermes_env()
# --list short-circuits everything else.
if getattr(args, "list_targets", False):
# When `--list telegram` is used, argparse stores "telegram" in the
# `message` positional (since list_targets takes no argument).
platform_filter = getattr(args, "message", None)
exit_code = _list_targets(platform_filter, json_mode=getattr(args, "json", False))
sys.exit(exit_code)
target = _resolve_target(getattr(args, "to", None))
if not target:
print(
"hermes send: --to PLATFORM[:channel[:thread]] is required\n"
"Examples:\n"
" hermes send --to telegram \"hello\"\n"
" hermes send --to discord:#ops --file report.md\n"
" hermes send --list # list available targets",
file=sys.stderr,
)
sys.exit(_USAGE_EXIT)
message = _read_message_body(
getattr(args, "message", None),
getattr(args, "file", None),
)
if message is None or not message.strip():
print(
"hermes send: no message provided. Pass text as a positional "
"argument, use --file PATH, or pipe data via stdin.",
file=sys.stderr,
)
sys.exit(_USAGE_EXIT)
# Optional: prepend a subject line. Useful for alerting scripts that
# want a consistent header without inlining it into every call.
subject = getattr(args, "subject", None)
if subject:
message = f"{subject}\n\n{message.lstrip()}"
# Import lazily so `hermes send --help` stays fast and does not pull in
# the full tool registry / gateway config stack.
from tools.send_message_tool import send_message_tool
# send_message_tool auto-loads gateway config + env and routes to the
# appropriate platform adapter (bot-token path for Telegram/Discord/Slack/
# Signal/SMS/WhatsApp; live-adapter path for plugin platforms).
#
# It expects the standard tool-call dict and returns a JSON string.
tool_args = {
"action": "send",
"target": target,
"message": message,
}
result = send_message_tool(tool_args)
exit_code = _emit_result(
result,
json_mode=getattr(args, "json", False),
quiet=getattr(args, "quiet", False),
)
sys.exit(exit_code)
def register_send_subparser(subparsers) -> argparse.ArgumentParser:
"""Create the ``send`` subparser and return it.
Kept as a standalone function so the top-level parser builder can wire
it in next to the other messaging subcommands without cluttering
``_parser.py`` or ``main.py``.
"""
parser = subparsers.add_parser(
"send",
help="Send a message to a configured platform (scripts, cron jobs, CI).",
description=(
"Pipe text from any shell script to any messaging platform Hermes "
"is already configured for. Reuses the gateway's platform "
"credentials (~/.hermes/.env + ~/.hermes/config.yaml) — no LLM, "
"no agent loop, no running gateway required for bot-token "
"platforms like Telegram/Discord/Slack/Signal."
),
epilog=(
"Examples:\n"
" hermes send --to telegram \"deploy finished\"\n"
" echo \"RAM 92%\" | hermes send --to telegram:-1001234567890\n"
" hermes send --to discord:#ops --file /tmp/report.md\n"
" hermes send --to slack:#eng --subject \"[CI]\" --file build.log\n"
" hermes send --list # all platforms\n"
" hermes send --list telegram # filter by platform\n"
"\n"
"Exit codes: 0 ok, 1 delivery/backend error, 2 usage error."
),
formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument(
"-t",
"--to",
metavar="TARGET",
default=None,
help=(
"Delivery target. Format: 'platform' (home channel), "
"'platform:chat_id', 'platform:chat_id:thread_id', or "
"'platform:#channel-name'. Examples: telegram, "
"telegram:-1001234567890:17585, discord:#ops, slack:C0123ABCD, "
"signal:+15551234567."
),
)
parser.add_argument(
"message",
nargs="?",
default=None,
help="Message text. If omitted, read from --file or stdin.",
)
# Legacy / convenience positional removed — use --to for clarity.
parser.add_argument(
"-f",
"--file",
metavar="PATH",
default=None,
help="Read message body from PATH. Use '-' to force stdin.",
)
parser.add_argument(
"-s",
"--subject",
metavar="LINE",
default=None,
help="Prepend a subject/header line before the message body.",
)
parser.add_argument(
"-l",
"--list",
dest="list_targets",
action="store_true",
default=False,
help="List available targets. Optional positional filter: `hermes send --list telegram`.",
)
parser.add_argument(
"-q",
"--quiet",
action="store_true",
default=False,
help="Suppress stdout on success (exit code only).",
)
parser.add_argument(
"--json",
action="store_true",
default=False,
help="Emit raw JSON result instead of human-readable output.",
)
parser.set_defaults(func=cmd_send)
return parser
__all__ = ["cmd_send", "register_send_subparser"]
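The `--to` help text above lists three target shapes: `platform`, `platform:chat_id`, and `platform:chat_id:thread_id` (with `#channel-name` as a chat value). The real parsing lives in `send_message_tool` and is not shown in this diff, so the following is only an illustrative sketch of how those shapes could be split; `Target` and `split_target` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    platform: str
    chat: Optional[str]
    thread: Optional[str]


def split_target(target: str) -> Target:
    """Split 'platform[:chat[:thread]]' into its parts (illustrative only)."""
    parts = target.strip().split(":", 2)
    platform = parts[0].lower()
    chat = parts[1] if len(parts) > 1 and parts[1] else None
    thread = parts[2] if len(parts) > 2 and parts[2] else None
    return Target(platform, chat, thread)


for example in ("telegram", "telegram:-1001234567890:17585", "discord:#ops", "signal:+15551234567"):
    print(example, "->", split_target(example))
```

A bare platform name leaves `chat` as `None`, which corresponds to the home-channel case in the help text. Chat identifiers that themselves contain colons would need different handling than this naive split.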
-316
@@ -1,316 +0,0 @@
"""Session recap — summarize what's happened in the current session.
Inspired by Claude Code's `/recap` command (v2.1.114, April 2026), which
shows a one-line summary of what happened while a terminal was unfocused
so users juggling multiple sessions can re-orient quickly.
Source: https://code.claude.com/docs/en/whats-new/2026-w17
Differences from Claude Code:
- Pure local computation from the in-memory conversation history. No
LLM call, no auxiliary model, no prompt-cache invalidation. A
recap should be instant and free.
- Works unchanged on CLI and every gateway platform (Telegram,
Discord, Slack, ...) because both call into the same ``build_recap``
helper. Claude Code only shows this on the CLI.
- Tailored to hermes-agent's tool vocabulary (``terminal``, ``patch``,
``write_file``, ``delegate_task``, ``browser_*``, ``web_*``); the
recap surfaces which classes of work were most active.
"""
from __future__ import annotations
import os
from collections import Counter
from typing import Any, Iterable, List, Mapping, Optional, Sequence, Tuple
# How many recent user/assistant turns we consider "recent activity".
_RECENT_TURN_WINDOW = 20
# How many characters of the latest user prompt to show.
_PROMPT_PREVIEW_CHARS = 140
# How many characters of the latest assistant text to show.
_ASSISTANT_PREVIEW_CHARS = 200
# How many recently-touched files to list.
_MAX_FILES_LISTED = 5
# Tool names that identify a file-editing action and the argument key that
# holds the path.
_FILE_EDIT_TOOLS: Mapping[str, str] = {
"write_file": "path",
"patch": "path",
"read_file": "path",
"skill_manage": "file_path",
"skill_view": "file_path",
}
def _coerce_text(value: Any) -> str:
"""Flatten assistant/user ``content`` into a plain string.
Content can be a string or a list of content blocks (for multimodal
or reasoning models). We concatenate every text-like block and
ignore the rest.
"""
if value is None:
return ""
if isinstance(value, str):
return value
if isinstance(value, list):
parts: List[str] = []
for block in value:
if isinstance(block, str):
parts.append(block)
continue
if isinstance(block, Mapping):
text = block.get("text")
if isinstance(text, str) and text:
parts.append(text)
return "\n".join(parts)
return str(value)
def _tool_call_name_and_args(tool_call: Any) -> Tuple[str, Mapping[str, Any]]:
"""Extract ``(name, arguments_dict)`` from a tool_call entry.
``arguments`` may be a JSON string or a dict depending on provider.
Return an empty dict if it cannot be parsed.
"""
if not isinstance(tool_call, Mapping):
return "", {}
fn = tool_call.get("function") or {}
if not isinstance(fn, Mapping):
return "", {}
name = str(fn.get("name") or "") or ""
raw_args = fn.get("arguments")
if isinstance(raw_args, Mapping):
return name, raw_args
if isinstance(raw_args, str) and raw_args:
try:
import json
parsed = json.loads(raw_args)
if isinstance(parsed, Mapping):
return name, parsed
except Exception:
return name, {}
return name, {}
def _iter_assistant_tool_calls(
messages: Sequence[Mapping[str, Any]],
) -> Iterable[Tuple[str, Mapping[str, Any]]]:
for msg in messages:
if not isinstance(msg, Mapping):
continue
if msg.get("role") != "assistant":
continue
tool_calls = msg.get("tool_calls") or []
if not isinstance(tool_calls, list):
continue
for tc in tool_calls:
name, args = _tool_call_name_and_args(tc)
if name:
yield name, args
def _count_visible_turns(
messages: Sequence[Mapping[str, Any]],
) -> Tuple[int, int, int]:
"""Return ``(user_turn_count, assistant_turn_count, tool_message_count)``."""
users = assistants = tools = 0
for msg in messages:
if not isinstance(msg, Mapping):
continue
role = msg.get("role")
if role == "user":
users += 1
elif role == "assistant":
assistants += 1
elif role == "tool":
tools += 1
return users, assistants, tools
def _latest_user_prompt(
messages: Sequence[Mapping[str, Any]],
) -> Optional[str]:
for msg in reversed(messages):
if isinstance(msg, Mapping) and msg.get("role") == "user":
text = _coerce_text(msg.get("content")).strip()
if text:
return text
return None
def _latest_assistant_text(
messages: Sequence[Mapping[str, Any]],
) -> Optional[str]:
for msg in reversed(messages):
if not isinstance(msg, Mapping):
continue
if msg.get("role") != "assistant":
continue
text = _coerce_text(msg.get("content")).strip()
if text:
return text
return None
def _recent_window(
messages: Sequence[Mapping[str, Any]], window: int = _RECENT_TURN_WINDOW
) -> List[Mapping[str, Any]]:
"""Return the tail slice of ``messages`` covering at most ``window``
user+assistant turns (tool messages ride along inside the window).
Iterating from the end, we count user and assistant messages and
keep everything from the first message that falls within the window.
"""
count = 0
cut = 0
for i in range(len(messages) - 1, -1, -1):
msg = messages[i]
if isinstance(msg, Mapping) and msg.get("role") in ("user", "assistant"):
count += 1
if count >= window:
cut = i
break
else:
return list(messages)
return list(messages[cut:])
def _shortened_path(path: str) -> str:
"""Show a path relative to cwd when possible, otherwise with ~ expansion."""
if not path:
return path
try:
abs_path = os.path.abspath(os.path.expanduser(path))
cwd = os.getcwd()
if abs_path == cwd:
return "."
if abs_path.startswith(cwd + os.sep):
return abs_path[len(cwd) + 1 :]
home = os.path.expanduser("~")
if abs_path.startswith(home + os.sep):
return "~/" + abs_path[len(home) + 1 :]
return abs_path
except Exception:
return path
def _summarise_tool_activity(
tool_calls: Sequence[Tuple[str, Mapping[str, Any]]],
) -> Tuple[List[Tuple[str, int]], List[str]]:
"""Return ``(tool_counts_sorted, recently_edited_files)``.
``tool_counts_sorted`` is descending by count, keeping the full list
so callers can truncate for display. ``recently_edited_files`` lists
distinct paths (most recent first) from file-editing tools.
"""
counter: Counter[str] = Counter()
files_seen: List[str] = []
files_set: set[str] = set()
# Walk in reverse so "most recent first" drops out of order-preserved iteration.
for name, args in reversed(list(tool_calls)):
counter[name] += 1
arg_key = _FILE_EDIT_TOOLS.get(name)
if arg_key:
path = args.get(arg_key)
if isinstance(path, str) and path and path not in files_set:
files_set.add(path)
files_seen.append(_shortened_path(path))
# Counter ignores order, so walking in reverse does not affect the counts.
# The reversal only matters for files_seen, which comes out newest→oldest,
# the order we want for display.
tool_counts = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
return tool_counts, files_seen
def _truncate(text: str, limit: int) -> str:
text = " ".join(text.split()) # collapse newlines for a compact one-liner
if len(text) <= limit:
return text
return text[: limit - 1].rstrip() + "…"
def build_recap(
messages: Sequence[Mapping[str, Any]],
*,
session_title: Optional[str] = None,
session_id: Optional[str] = None,
platform: Optional[str] = None,
) -> str:
"""Build a multi-line recap of recent activity.
Inputs:
messages: the full conversation history as a list of
chat-completion-style dicts (``role``, ``content``,
``tool_calls``, ...).
session_title: optional human title (from SessionDB).
session_id: optional session id.
platform: optional hint (``"cli"``, ``"telegram"``, ...). Does not
change behavior today but is accepted for forward compat.
The output is plain text designed to render well in both a terminal
(with 80-col wrapping) and a gateway message bubble.
"""
_ = platform # reserved for future use
lines: List[str] = []
header_bits: List[str] = ["Session recap"]
if session_title:
header_bits.append(f"{session_title}")
elif session_id:
header_bits.append(f"{session_id[:8]}")
lines.append(" ".join(header_bits))
if not messages:
lines.append(" (nothing to recap — no messages yet)")
return "\n".join(lines)
users, assistants, tool_msgs = _count_visible_turns(messages)
window = _recent_window(messages)
win_users, win_assistants, _ = _count_visible_turns(window)
scope = (
f"{win_users} user turn{'s' if win_users != 1 else ''} / "
f"{win_assistants} assistant repl{'ies' if win_assistants != 1 else 'y'}"
)
if (users, assistants) != (win_users, win_assistants):
scope += f" (of {users}/{assistants} total)"
lines.append(f" Recent: {scope}, {tool_msgs} tool result{'s' if tool_msgs != 1 else ''}")
tool_calls = list(_iter_assistant_tool_calls(window))
tool_counts, files = _summarise_tool_activity(tool_calls)
if tool_counts:
top = ", ".join(f"{name}×{count}" for name, count in tool_counts[:5])
extra = len(tool_counts) - 5
if extra > 0:
top += f" (+{extra} more)"
lines.append(f" Tools used: {top}")
if files:
shown = files[:_MAX_FILES_LISTED]
extra = len(files) - len(shown)
entry = ", ".join(shown)
if extra > 0:
entry += f" (+{extra} more)"
lines.append(f" Files touched: {entry}")
latest_user = _latest_user_prompt(window)
if latest_user:
lines.append(f" Last ask: {_truncate(latest_user, _PROMPT_PREVIEW_CHARS)}")
latest_reply = _latest_assistant_text(window)
if latest_reply:
lines.append(f" Last reply: {_truncate(latest_reply, _ASSISTANT_PREVIEW_CHARS)}")
if len(lines) == 2:
# Only the header + scope line — nothing substantive to show.
lines.append(" (no assistant activity yet in this window)")
return "\n".join(lines)
__all__ = ["build_recap"]
+55 -12
@@ -1190,6 +1190,13 @@ def _setup_tts_provider(config: dict):
"Falling back to Edge TTS."
)
selected = "edge"
if selected == "xai":
print()
voice_id = prompt("xAI voice_id (Enter for 'eve', or paste a custom voice ID)")
if voice_id and voice_id.strip():
config.setdefault("tts", {}).setdefault("xai", {})["voice_id"] = voice_id.strip()
print_success(f"xAI voice_id set to: {voice_id.strip()}")
elif selected == "minimax":
existing = get_env_value("MINIMAX_API_KEY")
@@ -1321,15 +1328,13 @@ def setup_terminal_backend(config: dict):
print_success("Terminal backend: Local")
print_info("Commands run directly on this machine.")
# CWD for messaging
# Gateway/cron working directory
print()
print_info("Working directory for messaging sessions:")
print_info(" When using Hermes via Telegram/Discord, this is where")
print_info(
" the agent starts. CLI mode always starts in the current directory."
)
print_info("Gateway working directory:")
print_info(" Used by Telegram/Discord/cron sessions.")
print_info(" CLI/TUI always uses your launch directory instead.")
current_cwd = cfg_get(config, "terminal", "cwd", default="")
cwd = prompt(" Messaging working directory", current_cwd or str(Path.home()))
cwd = prompt(" Gateway working directory", current_cwd or str(Path.home()))
if cwd:
config["terminal"]["cwd"] = cwd
@@ -1643,7 +1648,11 @@ def setup_terminal_backend(config: dict):
def _apply_default_agent_settings(config: dict):
"""Apply recommended defaults for all agent settings without prompting."""
config.setdefault("agent", {})["max_turns"] = 90
save_env_value("HERMES_MAX_ITERATIONS", "90")
# config.yaml is the authoritative source for max_turns; the gateway
# bridges it into HERMES_MAX_ITERATIONS at startup. We no longer write
# to .env to avoid the dual-source inconsistency that caused the
# 60-vs-500 bug (stale .env entry silently shadowing config.yaml).
remove_env_value("HERMES_MAX_ITERATIONS")
config.setdefault("display", {})["tool_progress"] = "all"
@@ -1673,9 +1682,10 @@ def setup_agent_settings(config: dict):
print()
# ── Max Iterations ──
current_max = get_env_value("HERMES_MAX_ITERATIONS") or str(
cfg_get(config, "agent", "max_turns", default=90)
)
# config.yaml is authoritative; read from there. If a legacy .env
# entry is still around (from pre-PR#18413 setups), prefer the
# config value so we don't surface a stale number to the user.
current_max = str(cfg_get(config, "agent", "max_turns", default=90))
print_info("Maximum tool-calling iterations per conversation.")
print_info("Higher = more complex tasks, but costs more tokens.")
print_info(
@@ -1686,9 +1696,13 @@ def setup_agent_settings(config: dict):
try:
max_iter = int(max_iter_str)
if max_iter > 0:
save_env_value("HERMES_MAX_ITERATIONS", str(max_iter))
# Write to config.yaml (authoritative) only. Also clean up any
# stale .env entry from earlier setup runs — the gateway's
# bridge in gateway/run.py now unconditionally derives
# HERMES_MAX_ITERATIONS from agent.max_turns at startup.
config.setdefault("agent", {})["max_turns"] = max_iter
config.pop("max_turns", None)
remove_env_value("HERMES_MAX_ITERATIONS")
print_success(f"Max iterations set to {max_iter}")
except ValueError:
print_warning("Invalid number, keeping current value")
@@ -2033,6 +2047,16 @@ def _setup_slack():
print_warning("⚠️ No Slack allowlist set - unpaired users will be denied by default.")
print_info(" Set SLACK_ALLOW_ALL_USERS=true or GATEWAY_ALLOW_ALL_USERS=true only if you intentionally want open workspace access.")
print()
print_info("📬 Home Channel: where Hermes delivers cron job results,")
print_info(" cross-platform messages, and notifications.")
print_info(" To get a channel ID: open the channel in Slack, then right-click")
print_info(" the channel name → Copy link — the ID starts with C (e.g. C01ABC2DE3F).")
print_info(" You can also set this later by typing /set-home in a Slack channel.")
home_channel = prompt("Home channel ID (leave empty to set later with /set-home)")
if home_channel:
save_env_value("SLACK_HOME_CHANNEL", home_channel.strip())
def _write_slack_manifest_and_instruct():
"""Generate the Slack manifest, write it under HERMES_HOME, and print
@@ -2979,6 +3003,21 @@ def run_setup_wizard(args):
config = load_config()
hermes_home = get_hermes_home()
# Back up existing config before setup modifies it (#3522)
config_path = get_config_path()
if config_path.exists():
from datetime import datetime as _dt
_backup_path = config_path.with_suffix(
f".yaml.bak.{_dt.now().strftime('%Y%m%d_%H%M%S')}"
)
try:
import shutil
shutil.copy2(config_path, _backup_path)
except Exception:
_backup_path = None
else:
_backup_path = None
# Detect non-interactive environments (headless SSH, Docker, CI/CD)
non_interactive = getattr(args, 'non_interactive', False)
if not non_interactive and not is_interactive_stdin():
@@ -3148,6 +3187,10 @@ def run_setup_wizard(args):
# Save and show summary
save_config(config)
if _backup_path and _backup_path.exists():
print_info(f"Previous config backed up to: {_backup_path}")
print_info("If setup changed a value you customized, restore it with:")
print_info(f" cp {_backup_path} {config_path}")
_print_setup_summary(config, hermes_home)
_offer_launch_chat()
+25 -2
@@ -56,6 +56,7 @@ CONFIGURABLE_TOOLSETS = [
("file", "📁 File Operations", "read, write, patch, search"),
("code_execution", "⚡ Code Execution", "execute_code"),
("vision", "👁️ Vision / Image Analysis", "vision_analyze"),
("video", "🎬 Video Analysis", "video_analyze (requires video-capable model)"),
("image_gen", "🎨 Image Generation", "image_generate"),
("moa", "🧠 Mixture of Agents", "mixture_of_agents"),
("tts", "🔊 Text-to-Speech", "text_to_speech"),
@@ -78,7 +79,7 @@ CONFIGURABLE_TOOLSETS = [
# Toolsets that are OFF by default for new installs.
# They're still in _HERMES_CORE_TOOLS (available at runtime if enabled),
# but the setup checklist won't pre-select them for first-time users.
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin"}
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin", "video"}
# Platform-scoped toolsets: only appear in the `hermes tools` checklist for
# these platforms, and only resolve/save for these platforms. A toolset
@@ -1822,7 +1823,7 @@ def _reconfigure_tool(config: dict):
cat = TOOL_CATEGORIES.get(ts_key)
reqs = TOOLSET_ENV_REQUIREMENTS.get(ts_key)
if cat or reqs:
if _toolset_has_keys(ts_key, config):
if _toolset_has_keys(ts_key, config) or _toolset_enabled_for_reconfigure(ts_key, config):
configurable.append((ts_key, ts_label))
if not configurable:
@@ -1848,6 +1849,28 @@ def _reconfigure_tool(config: dict):
save_config(config)
def _toolset_enabled_for_reconfigure(ts_key: str, config: dict) -> bool:
"""Return True if a configurable toolset is enabled anywhere.
Reconfigure must include enabled-but-unconfigured categories so users can
finish provider/API-key setup without disabling and re-enabling the toolset.
"""
for platform in PLATFORMS:
if not _toolset_allowed_for_platform(ts_key, platform):
continue
try:
enabled = _get_platform_tools(
config,
platform,
include_default_mcp_servers=False,
)
except Exception:
continue
if ts_key in enabled:
return True
return False
def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):
"""Reconfigure a tool category - provider selection + API key update."""
icon = cat.get("icon", "")
+36 -8
@@ -470,10 +470,23 @@ except (ValueError, TypeError):
)
_GATEWAY_HEALTH_TIMEOUT = 3.0
# DEPRECATED (scheduled for removal): GATEWAY_HEALTH_URL / GATEWAY_HEALTH_TIMEOUT.
# Cross-container / cross-host gateway liveness detection will be folded into a
# first-class dashboard config key so it's no longer Docker-adjacent lore buried
# in env vars. The env vars still work for now so existing Compose deployments
# don't break. Do not add new callers — wire new uses through the planned
# config surface.
def _probe_gateway_health() -> tuple[bool, dict | None]:
"""Probe the gateway via its HTTP health endpoint (cross-container).
.. deprecated::
Driven by the deprecated ``GATEWAY_HEALTH_URL`` /
``GATEWAY_HEALTH_TIMEOUT`` env vars. Scheduled for removal alongside
a move to a first-class dashboard config key. See
:data:`_GATEWAY_HEALTH_URL` for context.
Uses ``/health/detailed`` first (returns full state), falling back to
the simpler ``/health`` endpoint. Returns ``(is_alive, body_dict)``.
@@ -2882,6 +2895,25 @@ _VALID_CHANNEL_RE = re.compile(r"^[A-Za-z0-9._-]{1,128}$")
# loopback so tests don't need to rewrite request scope.
_LOOPBACK_HOSTS = frozenset({"127.0.0.1", "::1", "localhost", "testclient"})
def _is_public_bind() -> bool:
"""True when bound to all-interfaces (operator used --insecure)."""
return getattr(app.state, "bound_host", "") in ("0.0.0.0", "::")
def _ws_client_is_allowed(ws: "WebSocket") -> bool:
"""Check if the WebSocket client IP is acceptable.
Allows loopback always; allows any IP when bound to all-interfaces
(--insecure mode, guarded by session token auth).
"""
if _is_public_bind():
return True
client_host = ws.client.host if ws.client else ""
if not client_host:
return True
return client_host in _LOOPBACK_HOSTS
# Per-channel subscriber registry used by /api/pub (PTY-side gateway → dashboard)
# and /api/events (dashboard → browser sidebar). Keyed by an opaque channel id
# the chat tab generates on mount; entries auto-evict when the last subscriber
@@ -2972,8 +3004,7 @@ async def pty_ws(ws: WebSocket) -> None:
await ws.close(code=4401)
return
client_host = ws.client.host if ws.client else ""
if client_host and client_host not in _LOOPBACK_HOSTS:
if not _ws_client_is_allowed(ws):
await ws.close(code=4403)
return
@@ -3080,8 +3111,7 @@ async def gateway_ws(ws: WebSocket) -> None:
await ws.close(code=4401)
return
client_host = ws.client.host if ws.client else ""
if client_host and client_host not in _LOOPBACK_HOSTS:
if not _ws_client_is_allowed(ws):
await ws.close(code=4403)
return
@@ -3113,8 +3143,7 @@ async def pub_ws(ws: WebSocket) -> None:
await ws.close(code=4401)
return
client_host = ws.client.host if ws.client else ""
if client_host and client_host not in _LOOPBACK_HOSTS:
if not _ws_client_is_allowed(ws):
await ws.close(code=4403)
return
@@ -3143,8 +3172,7 @@ async def events_ws(ws: WebSocket) -> None:
await ws.close(code=4401)
return
client_host = ws.client.host if ws.client else ""
if client_host and client_host not in _LOOPBACK_HOSTS:
if not _ws_client_is_allowed(ws):
await ws.close(code=4403)
return
+51 -1
@@ -8,14 +8,64 @@ import os
from pathlib import Path
_profile_fallback_warned: bool = False
def get_hermes_home() -> Path:
"""Return the Hermes home directory (default: ~/.hermes).
Reads HERMES_HOME env var, falls back to ~/.hermes.
This is the single source of truth; all other copies should import this.
When ``HERMES_HOME`` is unset but an ``active_profile`` file indicates
a non-default profile is active, writes a loud one-shot warning to
stderr so cross-profile data corruption is diagnosable instead
of silent. Behavior is otherwise unchanged: we still return
``~/.hermes`` because raising here would brick 30+ module-level
callers that import this at load time. Subprocess spawners are
expected to propagate ``HERMES_HOME`` explicitly (see the systemd
template in ``hermes_cli/gateway.py`` and the kanban dispatcher in
``hermes_cli/kanban_db.py``). See https://github.com/NousResearch/hermes-agent/issues/18594.
"""
val = os.environ.get("HERMES_HOME", "").strip()
return Path(val) if val else Path.home() / ".hermes"
if val:
return Path(val)
# Guard: if a non-default profile is sticky-active, warn once that
# the fallback to the default profile is almost certainly wrong.
global _profile_fallback_warned
if not _profile_fallback_warned:
try:
# Inline the default-root resolution from get_default_hermes_root()
# to stay import-safe (this function is called from module scope
# in 30+ files; we cannot afford to trigger logging setup here).
active_path = (Path.home() / ".hermes" / "active_profile")
active = active_path.read_text().strip() if active_path.exists() else ""
except (UnicodeDecodeError, OSError):
active = ""
if active and active != "default":
_profile_fallback_warned = True
# Write directly to stderr. We intentionally do NOT route this
# through ``logging`` because (a) this function is called at
# module-import time from 30+ sites, often before logging is
# configured, and (b) root-logger propagation would double-emit
# on consoles where a StreamHandler is already attached.
import sys
msg = (
f"[HERMES_HOME fallback] HERMES_HOME is unset but active "
f"profile is {active!r}. Falling back to ~/.hermes, which "
f"is the DEFAULT profile — not {active!r}. Any data this "
f"process writes will land in the wrong profile. The "
f"subprocess spawner should pass HERMES_HOME explicitly "
f"(see issue #18594)."
)
try:
sys.stderr.write(msg + "\n")
sys.stderr.flush()
except Exception:
pass
return Path.home() / ".hermes"
def get_default_hermes_root() -> Path:
@@ -0,0 +1,206 @@
---
name: kanban-video-orchestrator
description: Plan, set up, and monitor a multi-agent video production pipeline backed by Hermes Kanban. Use when the user wants to make ANY video — narrative film, product/marketing, music video, explainer, ASCII/terminal art, abstract/generative loop, comic, 3D, real-time/installation — and the work warrants decomposition into specialized profiles (writer, designer, animator, renderer, voice, editor, etc.) coordinated through a kanban board. Performs adaptive discovery to scope the brief, designs an appropriate team for the requested style, generates the setup script that creates Hermes profiles + initial kanban task, then helps monitor execution and intervene when tasks stall or fail. Routes scenes to whichever Hermes rendering / audio / design skill fits each beat (`ascii-video`, `manim-video`, `p5js`, `comfyui`, `touchdesigner-mcp`, `blender-mcp`, `pixel-art`, `baoyu-comic`, `claude-design`, `excalidraw`, `songsee`, `heartmula`, …) plus external APIs for TTS, image-gen, and image-to-video as needed.
version: 1.0.0
author: [SHL0MS, alt-glitch]
license: MIT
metadata:
hermes:
tags: [video, kanban, multi-agent, orchestration, production-pipeline]
related_skills: [kanban-orchestrator, kanban-worker, ascii-video, manim-video, p5js, comfyui, touchdesigner-mcp, blender-mcp, pixel-art, ascii-art, songwriting-and-ai-music, heartmula, songsee, spotify, youtube-content, claude-design, excalidraw, architecture-diagram, concept-diagrams, baoyu-comic, baoyu-infographic, humanizer, gif-search, meme-generation]
credits: |
The single-project workspace layout, profile-config patching pattern,
SOUL.md-per-profile model, TEAM.md task-graph convention, and
`--workspace dir:<path>` discipline are adapted from alt-glitch's
original multi-agent video pipeline at
https://github.com/NousResearch/kanban-video-pipeline.
---
# Kanban Video Orchestrator
Wrap any video request — from a 15-second product teaser to a 5-minute narrative
short to a music video to an ASCII loop — in a Hermes Kanban pipeline that
decomposes the work to specialized agent profiles.
This skill does **not** render anything itself. It is a meta-pipeline that:
1. **Scopes** the request through targeted discovery
2. **Designs** an appropriate team (which roles, which tools per role) based on the style
3. **Generates** a setup script that creates Hermes profiles, project workspace, and the initial kanban task
4. **Hands off** to the director profile, which decomposes via the kanban
5. **Monitors** execution, helps intervene when tasks stall or fail
The actual rendering happens inside the kanban once it's running, via whichever
existing skills + tools fit the scenes — `ascii-video`, `manim-video`, `p5js`,
`comfyui`, `touchdesigner-mcp`, `blender-mcp`, `songwriting-and-ai-music`,
`heartmula`, external APIs, or plain Python with PIL + ffmpeg.
## When NOT to use this skill
- The video is one continuous procedural project that needs no specialists. Just write the code directly.
- The user wants a quick one-shot conversion (e.g. "convert this mp4 to a GIF") — use ffmpeg directly.
- The output is a static image, GIF, or audio-only artifact — use the matching specific skill (`ascii-art`, `gifs`, `meme-generation`, `songwriting-and-ai-music`).
- The work fits a single existing skill cleanly (e.g. a pure ASCII video — just use `ascii-video`).
## Workflow
```
DISCOVER → BRIEF → TEAM DESIGN → SETUP → EXECUTE → MONITOR
```
### Step 1 — Discover (ask the right questions)
The discovery process is **adaptive**: ask only what is actually needed. Always
start with three questions to identify the broad shape:
- **What is the video?** (one-sentence brief)
- **How long?** (5-30s teaser / 30-90s short / 90s-3min explainer / 3-10min film / longer)
- **What aspect ratio + target platform?** (1:1 / 9:16 / 16:9; X, IG, YouTube, internal, etc.)
From the answers, classify the style category. The style determines which
follow-up questions to ask. **Do not ask all questions at once.** Ask 2-4 at a
time, listen, then proceed. Make reasonable assumptions whenever the user
implies an answer.
For complete intake patterns and per-style question banks, see
**[references/intake.md](references/intake.md)**.
### Step 2 — Brief
Once enough is known, produce a structured `brief.md` using the template in
`assets/brief.md.tmpl`. Stages:
1. **Concept** — the one-sentence pitch + emotional north star
2. **Scope** — duration, aspect, platform, deadline
3. **Style** — visual references, brand constraints, tone
4. **Scenes** — beat-by-beat breakdown (durations, content, target tool)
5. **Audio** — narration / music / SFX / silent (per scene if needed)
6. **Deliverables** — file format, resolution, optional alternates (vertical cut, GIF, etc.)
Show the brief to the user for confirmation before designing the team. **The
brief is the contract** — every downstream task references it.
### Step 3 — Team design
Pick role archetypes from the library that fit this video. **Compose, don't
clone.** Most videos need 4-7 profiles. The director is always present; the
rest are picked by what the brief actually requires.
For the role library and per-style team compositions, see
**[references/role-archetypes.md](references/role-archetypes.md)**.
For mapping role → which Hermes skills + toolsets it loads, see
**[references/tool-matrix.md](references/tool-matrix.md)**.
### Step 4 — Setup
Generate a setup script (`setup.sh`) and run it. The script:
1. Creates the project workspace (`~/projects/video-pipeline/<slug>/`)
2. Copies any provided assets into `taste/`, `audio/`, `assets/`
3. Creates each Hermes profile via `hermes profile create --clone`
4. Writes per-profile `SOUL.md` (personality + role definition)
5. Configures profile YAML (toolsets, always_load skills, cwd)
6. Writes `brief.md`, `TEAM.md`, and `taste/` content
7. Fires the initial `hermes kanban create` task assigned to the director
Use `scripts/bootstrap_pipeline.py` to generate setup.sh from a brief +
team-design JSON. See **[references/kanban-setup.md](references/kanban-setup.md)**
for the setup script structure, profile config patterns, and the critical
"shared workspace" rule.
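
The template-filling part of this step can be sketched as follows. This is a minimal illustration, not the actual `scripts/bootstrap_pipeline.py`: the `render_template` helper and its fail-on-missing behaviour are assumptions. It relies only on the `{{NAME}}` placeholder style that `assets/setup.sh.tmpl` uses.

```python
import re

# Matches {{NAME}} placeholders as used in assets/setup.sh.tmpl.
PLACEHOLDER = re.compile(r"\{\{([A-Z0-9_]+)\}\}")


def render_template(template: str, values: dict[str, str]) -> str:
    """Fill every {{NAME}} placeholder (hypothetical helper).

    Raises KeyError if the template contains a placeholder that has no
    value, so a half-rendered setup.sh is never written.
    """
    missing = sorted({name for name in PLACEHOLDER.findall(template)
                      if name not in values})
    if missing:
        raise KeyError(f"unfilled placeholders: {', '.join(missing)}")
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)


if __name__ == "__main__":
    tmpl = 'PROJECT_SLUG="{{SLUG}}"\nTENANT="{{TENANT}}"\n'
    print(render_template(tmpl, {"SLUG": "noir-short", "TENANT": "noir-short"}))
```

Failing on an unfilled placeholder is a design choice for this sketch: a literal `{{TENANT}}` left in a generated script would otherwise only surface once the kanban task is fired.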
### Step 5 — Execute
Run `setup.sh`. Then provide the user with monitoring commands:
```bash
hermes kanban watch --tenant <project-tenant> # live events
hermes kanban list --tenant <project-tenant> # board snapshot
hermes dashboard # visual board UI
```
The director profile takes over from here, decomposing the work and routing
tasks to specialist profiles via the kanban toolset.
### Step 6 — Monitor and intervene
Stay engaged — the kanban runs autonomously but a stuck task or bad output
needs human (or AI) judgment.
Monitoring patterns: poll `kanban list` periodically, inspect any RUNNING task
that exceeds its expected duration with `kanban show <id>`, and check
heartbeats. When a worker's output fails review, the standard interventions are:
1. Comment on the worker's task with specific feedback (`kanban_comment`)
2. Create a re-run task with the original as parent
3. Adjust the brief's scope and let the director re-decompose
For diagnostic patterns, intervention recipes, and the "task is stuck"
playbook, see **[references/monitoring.md](references/monitoring.md)**.
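
The "exceeds its expected duration" check can be sketched like this. The record shape (`id`, `status`, `started_at`, `max_runtime_s`) is hypothetical, since the output format of `hermes kanban list` is not shown here; treat it as a stand-in for whatever the board snapshot actually returns.

```python
from datetime import datetime, timedelta, timezone


def find_overdue(tasks: list[dict], now: datetime) -> list[str]:
    """Return ids of RUNNING tasks that have outlived their runtime budget.

    Each task is a hypothetical dict with keys: id, status, started_at
    (aware datetime) and max_runtime_s (seconds).
    """
    overdue = []
    for task in tasks:
        if task["status"] != "RUNNING":
            continue
        elapsed = now - task["started_at"]
        if elapsed > timedelta(seconds=task["max_runtime_s"]):
            overdue.append(task["id"])
    return overdue


if __name__ == "__main__":
    now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
    board = [  # illustrative data only
        {"id": "T3", "status": "RUNNING",
         "started_at": now - timedelta(hours=5), "max_runtime_s": 4 * 3600},
        {"id": "T4", "status": "RUNNING",
         "started_at": now - timedelta(minutes=10), "max_runtime_s": 3600},
        {"id": "T1", "status": "DONE",
         "started_at": now - timedelta(hours=9), "max_runtime_s": 3600},
    ]
    print(find_overdue(board, now))
```

Each id this returns is a candidate for `kanban show <id>` before choosing one of the interventions above.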
## Reference: worked examples
Six concrete pipelines covering very different video styles — narrative film,
product/marketing, music video, math/algorithm explainer, ASCII video, real-time
installation — showing how the same workflow yields very different teams and
task graphs. See **[references/examples.md](references/examples.md)**.
## Critical rules
1. **Discovery before action.** Never start generating a brief or team without
asking at least the three baseline questions. A bad brief cascades through
the entire pipeline.
2. **Match the team to the video.** Don't reuse the same 4-profile setup for
every job. A music video that doesn't have a beat-analysis profile will
misfire. A narrative film that doesn't have a writer profile will produce
incoherent scenes. See `references/role-archetypes.md`.
3. **One workspace per project.** All profiles for a given video share the same
`dir:` workspace. Tasks pass artifacts via shared filesystem and structured
handoffs. **Every** `kanban_create` call passes
`workspace_kind="dir"` + `workspace_path="<absolute project path>"`.
4. **Tenant every project.** Use a project-specific tenant
(`--tenant <project-slug>`). Keeps the dashboard scoped and prevents
cross-pollination with other ongoing kanbans.
5. **Respect existing skills.** When a scene fits an existing skill, the
relevant renderer should load that skill via `--skill <name>` on its task
or `always_load` in its profile. Do not re-derive what a skill already
provides.
6. **The director never executes.** Even with the full `kanban + terminal +
file` toolset, the director's `SOUL.md` rules forbid it from executing
work itself. It decomposes and routes only — every concrete task becomes
a `hermes kanban create` call to a specialist profile. The
`kanban-orchestrator` skill spells this out further.
7. **Don't over-decompose.** A 30-second product video does NOT need 20 tasks.
Aim for the smallest task graph that still parallelizes well and exposes the
right human-review gates.
8. **Verify API keys BEFORE firing.** External APIs (TTS, image-gen,
image-to-video) need keys in `~/.hermes/.env` or the user's secret store.
A worker that hits a missing-key error wastes a task slot. The setup
script's `check_key` helper aborts cleanly if a required key is missing.
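
For rule 8, the `.env` half of the key check could look roughly like this in Python. It is a sketch under assumptions: `missing_keys` is a hypothetical helper, the key names are placeholders, and the Keychain fallback that the setup script's `check_key` also performs is not covered.

```python
def missing_keys(env_text: str, required: list[str]) -> list[str]:
    """Return required names that are absent or empty in KEY=VALUE text.

    Mirrors the shell check: a key counts as present only if some line
    starts with "NAME=" and has a non-empty value after the first "=".
    """
    present = set()
    for line in env_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and value:
            present.add(name)
    return [key for key in required if key not in present]


if __name__ == "__main__":
    # Placeholder key names, not real Hermes settings.
    env = "EXAMPLE_TTS_KEY=abc123\nEXAMPLE_IMAGE_KEY=\n# a comment\n"
    print(missing_keys(env, ["EXAMPLE_TTS_KEY", "EXAMPLE_IMAGE_KEY", "EXAMPLE_VIDEO_KEY"]))
```

Collecting every missing key before aborting lets the user fix them all in one pass instead of re-running setup once per key.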
## File map
```
SKILL.md ← this file (workflow + rules)
references/
intake.md ← discovery question banks per style
role-archetypes.md ← role library (writer, designer, animator, …)
tool-matrix.md ← skill + toolset mapping per role
kanban-setup.md ← setup script structure & profile config
monitoring.md ← watch + intervene patterns
examples.md ← six worked pipelines
assets/
brief.md.tmpl ← brief skeleton
setup.sh.tmpl ← setup script skeleton
soul.md.tmpl ← profile personality skeleton
scripts/
bootstrap_pipeline.py ← generate setup.sh from brief + team JSON
monitor.py ← polling + intervention helpers
```
@@ -0,0 +1,79 @@
# Video Brief — {{TITLE}}
> Slug: `{{SLUG}}` · Tenant: `{{TENANT}}` · Project workspace: `{{WORKSPACE}}`
## 1. Concept
**One-line pitch.** {{ONE_LINE_PITCH}}
**Emotional north star.** {{EMOTIONAL_NORTH_STAR}}
*(What should the viewer feel walking away?)*
## 2. Scope
| | |
|---|---|
| Duration | {{DURATION_S}} seconds |
| Aspect ratio | {{ASPECT}} |
| Resolution | {{RESOLUTION}} |
| Frame rate | {{FPS}} fps |
| Target platforms | {{PLATFORMS}} |
| Deadline | {{DEADLINE}} |
| Quality bar | {{QUALITY_BAR}} *(rough draft / polished / archival)* |
## 3. Style
**Visual references.** {{VISUAL_REFS}}
**Tone.** {{TONE}}
**Brand constraints.** {{BRAND_CONSTRAINTS}}
*(colors, typography, motion language; or "n/a")*
**Aesthetic rules.**
{{AESTHETIC_RULES}}
## 4. Scenes
Beat-by-beat breakdown. Each scene gets a row.
| # | Time | Content | Target tool / skill | Audio | Notes |
|---|------|---------|---------------------|-------|-------|
| 1 | 0:00-0:0X | {{SCENE_1_CONTENT}} | {{SCENE_1_TOOL}} | {{SCENE_1_AUDIO}} | {{SCENE_1_NOTES}} |
| 2 | 0:0X-0:0Y | ... | ... | ... | ... |
## 5. Audio
**Approach.** {{AUDIO_APPROACH}}
*(narration / music-only / synced to track / silent / mixed)*
**Voiceover.** {{VO_DETAILS}}
*(provider, voice, language, script source — "n/a" if no VO)*
**Music.** {{MUSIC_DETAILS}}
*(provided track path / commission via Suno / commission via heartmula /
license-free / "n/a")*
**SFX.** {{SFX_DETAILS}}
*(generated, library, or "n/a")*
## 6. Deliverables
| Format | Resolution | Notes |
|--------|-----------|-------|
| {{PRIMARY_FORMAT}} | {{PRIMARY_RES}} | The main output |
| {{ALT_FORMAT_1}} | {{ALT_RES_1}} | {{ALT_NOTES_1}} |
**Final filename.** `output/final.mp4`
*(plus optional `output/final-9x16.mp4`, `output/captions.srt`, etc.)*
## 7. Constraints
- API keys required: {{API_KEYS_REQUIRED}}
- External dependencies: {{EXT_DEPS}}
- Source assets to incorporate: {{SOURCE_ASSETS}}
---
**This brief is the contract. The director and every downstream profile read
it. If the brief changes, the kanban must be re-fired — don't edit live.**
@@ -0,0 +1,185 @@
#!/usr/bin/env bash
# ═══════════════════════════════════════════════════════════════════════
# Video Pipeline Setup — {{TITLE}}
#
# Generated by kanban-video-orchestrator skill.
#
# Slug: {{SLUG}}
# Workspace: {{WORKSPACE}}
# Tenant: {{TENANT}}
# ═══════════════════════════════════════════════════════════════════════
set -euo pipefail
PROJECT_SLUG="{{SLUG}}"
WORKSPACE="$HOME/projects/video-pipeline/${PROJECT_SLUG}"
TENANT="{{TENANT}}"
# ─────────────────────────────────────────────────────────────────────
# 1. Verify required API keys
# ─────────────────────────────────────────────────────────────────────
echo "═══ Checking required API keys ═══"
check_key() {
local var="$1"
local kc_account="${2:-hermes}"
local kc_service="${3:-$1}"
if grep -q "^${var}=" "$HOME/.hermes/.env" 2>/dev/null && \
[ -n "$(grep "^${var}=" "$HOME/.hermes/.env" | cut -d= -f2-)" ]; then
echo " ✓ ${var} (env)"
return 0
fi
if command -v security >/dev/null 2>&1 && \
security find-generic-password -a "${kc_account}" -s "${kc_service}" -w >/dev/null 2>&1; then
echo " ✓ ${var} (Keychain ${kc_account}/${kc_service})"
return 0
fi
echo " ✗ ${var} not set in ~/.hermes/.env or Keychain (${kc_account}/${kc_service})"
return 1
}
# Customize this list per project — only check keys actually used:
{{KEY_CHECKS}}
# ─────────────────────────────────────────────────────────────────────
# 2. Create project workspace
# ─────────────────────────────────────────────────────────────────────
echo "═══ Creating project workspace ═══"
mkdir -p "$WORKSPACE"/{taste,audio/{voiceover,sfx},assets,scenes,checkpoints,tools,output}
{{SCENE_DIRS}}
echo " ✓ $WORKSPACE"
# ─────────────────────────────────────────────────────────────────────
# 3. Create Hermes profiles
# ─────────────────────────────────────────────────────────────────────
echo "═══ Creating Hermes profiles ═══"
{{PROFILE_CREATE_COMMANDS}}
# ─────────────────────────────────────────────────────────────────────
# 4. Configure profiles (toolsets, skills, cwd)
# ─────────────────────────────────────────────────────────────────────
echo "═══ Configuring profiles ═══"
configure_profile() {
local profile="$1"
local toolsets_json="$2" # JSON array string, e.g. '["kanban","terminal","file"]'
local skills_json="$3" # JSON array string, e.g. '["kanban-worker","ascii-video"]'
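# Under `set -e` a failing command aborts the script immediately, so the
# failure is reported inline rather than by inspecting `$?` afterwards.
python3 - "$profile" "$toolsets_json" "$skills_json" "$WORKSPACE" <<'PY' || { echo " ✗ failed to configure ${profile}" >&2; exit 1; }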
"""Patch a Hermes profile config.yaml using PyYAML so we don't depend on the
exact default-config string format. Validates the patch took effect and exits
non-zero if anything's off."""
import json
import os
import sys
try:
import yaml
except ImportError:
print("ERROR: PyYAML required. pip install pyyaml", file=sys.stderr)
sys.exit(1)
profile, toolsets_json, skills_json, workspace = sys.argv[1:5]
toolsets = json.loads(toolsets_json)
skills = json.loads(skills_json)
p = os.path.expanduser(f"~/.hermes/profiles/{profile}/config.yaml")
if not os.path.exists(p):
print(f" ✗ profile config not found: {p}", file=sys.stderr)
sys.exit(1)
with open(p) as f:
cfg = yaml.safe_load(f) or {}
# Apply our changes — only the keys we actually want to set.
cfg["toolsets"] = toolsets
cfg.setdefault("skills", {})
cfg["skills"]["always_load"] = skills
# Note: we do NOT touch cfg["approvals"] — that's a security-sensitive
# setting (manual confirmation of tool calls). Workspace cwd is overridden
# per-task by `--workspace dir:<path>` on `hermes kanban create`, so we
# don't need to mutate cfg["terminal"]["cwd"] either.
with open(p, "w") as f:
yaml.safe_dump(cfg, f, sort_keys=False)
# Validate
with open(p) as f:
after = yaml.safe_load(f)
errors = []
if after.get("toolsets") != toolsets:
errors.append(f"toolsets mismatch: {after.get('toolsets')!r}")
if after.get("skills", {}).get("always_load") != skills:
errors.append(f"skills.always_load mismatch: {after.get('skills', {}).get('always_load')!r}")
if errors:
print(f" ✗ {profile}: " + "; ".join(errors), file=sys.stderr)
sys.exit(1)
PY
if [ $? -ne 0 ]; then
echo " ✗ failed to configure ${profile}" >&2
exit 1
fi
echo " ✓ ${profile}"
}
{{PROFILE_CONFIG_COMMANDS}}
# ─────────────────────────────────────────────────────────────────────
# 5. Write SOUL.md per profile
# ─────────────────────────────────────────────────────────────────────
echo "═══ Writing profile personalities ═══"
{{SOUL_WRITES}}
# ─────────────────────────────────────────────────────────────────────
# 6. Copy brief, TEAM.md, and any provided assets
# ─────────────────────────────────────────────────────────────────────
echo "═══ Writing brief + taste ═══"
cat > "$WORKSPACE/brief.md" <<'BRIEF_EOF'
{{BRIEF_CONTENTS}}
BRIEF_EOF
cat > "$WORKSPACE/TEAM.md" <<'TEAM_EOF'
{{TEAM_CONTENTS}}
TEAM_EOF
{{TASTE_WRITES}}
{{ASSET_COPIES}}
# ─────────────────────────────────────────────────────────────────────
# 7. Fire the initial kanban task
# ─────────────────────────────────────────────────────────────────────
echo "═══ Firing initial kanban task ═══"
hermes kanban create "Direct production of {{TITLE}}" \
--assignee director \
--workspace dir:"$WORKSPACE" \
--tenant "$TENANT" \
--priority 2 \
--max-runtime 4h \
--body "$(cat <<EOF
Read brief.md, TEAM.md, and taste/.
Decompose into the team graph defined in TEAM.md.
All child tasks MUST use:
workspace_kind="dir"
workspace_path="$WORKSPACE"
tenant="$TENANT"
Do not execute the work yourself — route every concrete subtask to the
appropriate profile via kanban_create.
EOF
)"
echo ""
echo "═══ Setup complete ═══"
echo ""
echo "Monitor with:"
echo " hermes kanban watch --tenant $TENANT"
echo " hermes kanban list --tenant $TENANT"
echo " hermes dashboard"
echo ""
echo "Workspace: $WORKSPACE"
@@ -0,0 +1,38 @@
# {{ROLE_NAME}}
You are the **{{ROLE_NAME}}** for this video production.
## Project context
- **Brief:** read `brief.md` in your CWD
- **Team graph:** read `TEAM.md` in your CWD
- **Style spec:** read `taste/brand-guide.md` and `taste/emotional-dna.md` in
your CWD
## What you do
{{ROLE_RESPONSIBILITIES}}
## Inputs you read
{{INPUTS_READ}}
## Outputs you produce
{{OUTPUTS_PRODUCED}}
## Tools and skills available
- **Toolsets:** {{TOOLSETS}}
- **Skills loaded:** {{SKILLS}}
- **External APIs / CLIs:** {{EXTERNAL_TOOLS}}
## Rules
{{ROLE_RULES}}
{{COMMON_RULES}}
## Common reference commands
{{COMMON_COMMANDS}}
@@ -0,0 +1,227 @@
# Worked Examples
Six concrete pipelines covering different video styles. Each shows the team
composition, task graph, and skill/tool choices the orchestrator would make
for that brief. **These are illustrative, not templates** — adapt to the
actual brief.
## Example 1 — Narrative short film (text-to-image → image-to-video → cut)
**Brief:** A 90-second noir-style short. A detective walks through a rainy
city. Voiceover narration. AI-generated visuals.
**Team:**
- `director` — vision, decomposition, approval
- `writer` — script + voiceover copy (loads `humanizer` for natural voice)
- `storyboarder` — beat-by-beat shot list (loads `excalidraw`)
- `image-generator` — generates each shot's still via local ComfyUI workflows
(loads `comfyui`)
- `image-to-video-generator` — animates each still (Runway/Kling, OR
ComfyUI's AnimateDiff/WAN workflows via `comfyui`)
- `voice-talent` — narration via ElevenLabs
- `audio-mixer` — VO + ambient pad
- `editor` — assembly + transitions
- `reviewer` — final QA
**Task graph:**
```
T0 director decompose
T1 writer script + voiceover.md (parent: T0)
T2 storyboarder shot list with framing per beat (parent: T1)
T3 image-generator one still per shot (~12 shots) (parent: T2)
T4 image-to-video animate each still (parent: T3)
T5 voice-talent generate narration audio (parent: T1)
T6 audio-mixer mix VO + ambient (parent: T5)
T7 editor cut + transitions + audio mux (parents: T4, T6)
T8 reviewer final QA (parent: T7)
```
**Key choices:**
- Local ComfyUI via `comfyui` skill is preferred over external API for
cost/control — but external APIs are fine if ComfyUI isn't installed
- `editor` profile is ffmpeg-only, no Hermes skill required beyond
`kanban-worker`
- Storyboarder produces `storyboard.excalidraw` alongside the markdown
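
To see which tasks in the Example 1 graph can run in parallel, the parent links can be fed to a topological sort. This is only an illustration using Python's standard `graphlib`, with `execution_waves` as a hypothetical helper; it does not describe how the kanban dispatcher actually schedules work.

```python
from graphlib import TopologicalSorter

# Task -> parents, copied from the Example 1 task graph above.
GRAPH = {
    "T0": [],
    "T1": ["T0"],
    "T2": ["T1"],
    "T3": ["T2"],
    "T4": ["T3"],
    "T5": ["T1"],
    "T6": ["T5"],
    "T7": ["T4", "T6"],
    "T8": ["T7"],
}


def execution_waves(graph: dict[str, list[str]]) -> list[list[str]]:
    """Group tasks into waves; each task's parents are in earlier waves."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for i, wave in enumerate(execution_waves(GRAPH)):
        print(i, wave)
```

For this graph the waves are `T0`, `T1`, `T2 + T5`, `T3 + T6`, `T4`, `T7`, `T8`: the narration branch (T5, T6) runs alongside the visual branch and only joins at the editor.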
## Example 2 — Product / marketing teaser
**Brief:** A 30-second product teaser for a developer tool. Shows code +
terminal + UI screen recordings, voiceover, CTA at end. Square 1:1.
**Team:**
- `director`
- `copywriter` — taglines, voiceover script, CTA (loads `humanizer`)
- `concept-artist` — style frames (loads `claude-design` for UI mockups)
- `renderer-motion-graphics` — animated UI sequences (Remotion CLI)
- `renderer-ascii` — terminal-style demo scenes (loads `ascii-video`)
- `voice-talent` — VO via ElevenLabs
- `editor` — assembly + brand-color treatment
- `audio-mixer` — VO + light music bed
- `captioner` — burned subtitles for muted-autoplay platforms
- `masterer` — produces 1:1 + 9:16 + 16:9 variants
**Task graph:**
```
T0 director decompose
T1 copywriter copy.md + cta + vo script (parent: T0)
T2 concept-artist visual-spec.md + style frames (parent: T1)
T3a renderer-motion-graphics scene 1: UI sequence (parent: T2)
T3b renderer-ascii scene 2: terminal demo (parent: T2)
T3c renderer-motion-graphics scene 3: feature highlight (parent: T2)
T3d renderer-motion-graphics scene 4: CTA card (parent: T2)
T4 voice-talent narration (parent: T1)
T5 audio-mixer VO + music bed (parent: T4)
T6 editor cut + transitions (parents: T3*, T5)
T7 captioner SRT + burned subtitles (parent: T6)
T8 masterer 1:1, 9:16, 16:9 variants (parent: T7)
```
**Key choices:**
- Multiple specialized renderers (motion-graphics + ASCII) coexist
- Captioner is included because muted autoplay is the norm on social
- `claude-design` skill for UI mockups maps directly to the product video idiom
## Example 3 — Music video (synced to provided track)
**Brief:** A 3-minute music video for a provided lo-fi hip-hop track. Visuals
should pulse with the beat. Generative + ASCII hybrid. Vertical 9:16.
**Team:**
- `director`
- `music-supervisor` — analyze track, emit `audio/beats.json` (loads `songsee`)
- `storyboarder` — beat-aligned shot list (loads `excalidraw`)
- `renderer-ascii` — ASCII scenes synced to bass kicks (loads `ascii-video`)
- `renderer-p5js` — generative particle scenes synced to highs (loads `p5js`)
- `editor` — beat-cut assembly using `beats.json`
- `reviewer` — sync QA
**Task graph:**
```
T0 director decompose
T1 music-supervisor analyze track → beats.json + spectrogram (parent: T0)
T2 storyboarder shot list aligned to beats (parents: T1, T0)
T3a renderer-ascii scene 1: bass-driven ASCII (parent: T2)
T3b renderer-p5js scene 2: high-end particle field (parent: T2)
... (more scenes)
T4 editor cut to beats + mux track (parents: T3*, T1)
T5 reviewer sync QA + final approval (parent: T4)
```
**Key choices:**
- `music-supervisor` runs FIRST — `beats.json` gates the renderers
- `editor` uses `beats.json` directly to align cuts to bass kicks
- No voice-talent — music is the audio
- Two specialized renderers (`ascii-video` + `p5js`) for visual variety
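
How an editor might use `beats.json` to align cuts can be sketched as snapping each planned cut to the nearest beat. The schema of `beats.json` is not specified here, so this assumes it reduces to a sorted list of beat timestamps in seconds; `snap_to_beat` and `snap_cuts` are hypothetical helpers.

```python
from bisect import bisect_left


def snap_to_beat(t: float, beats: list[float]) -> float:
    """Return the beat timestamp closest to t.

    `beats` must be sorted ascending and non-empty. Ties go to the
    earlier beat.
    """
    i = bisect_left(beats, t)
    if i == 0:
        return beats[0]
    if i == len(beats):
        return beats[-1]
    before, after = beats[i - 1], beats[i]
    return before if t - before <= after - t else after


def snap_cuts(cuts: list[float], beats: list[float]) -> list[float]:
    """Snap every planned cut time (seconds) to its nearest beat."""
    return [snap_to_beat(c, beats) for c in cuts]


if __name__ == "__main__":
    beats = [0.0, 0.5, 1.0, 1.5]  # assumed: seconds, sorted
    print(snap_cuts([0.2, 0.8, 1.6, 9.0], beats))
```

Cuts past the last beat clamp to it, which keeps the final cut inside the analysed track.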
## Example 4 — Math/algorithm explainer
**Brief:** A 2-minute explainer of an algorithm. 3Blue1Brown-style. Animated
diagrams, equations, narration. Square 1:1.
**Team:**
- `director`
- `writer` — narration script (loads `humanizer`)
- `cinematographer` — visual spec (loads `manim-video`)
- `renderer-manim` — all animated scenes (loads `manim-video`)
- `voice-talent` — narration via ElevenLabs
- `editor` — assembly + audio mux
- `captioner` — burned subtitles
**Task graph:**
```
T0 director decompose
T1 writer script + narration (parent: T0)
T2 cinematographer visual spec for all scenes (parent: T1)
T3a..T3n renderer-manim scenes 1..N (parent: T2)
T4 voice-talent narration audio (parent: T1)
T5 editor cut + mux (parents: T3*, T4)
T6 captioner SRT + burn (parent: T5)
```
**Key choices:**
- `manim-video` skill drives both the cinematographer (visual language) and
the renderer (actual scene production)
- The `manim-video` skill's reference docs (animation-design-thinking,
scene-planning, equations) auto-load when needed via the renderer's pinned skill
## Example 5 — ASCII video, music-track-only
**Brief:** A 60-second pure-ASCII video reactive to an existing track. No
voiceover, no other tools. Square 1:1.
**Team:**
- `director`
- `music-supervisor` — track analysis (loads `songsee`)
- `renderer-ascii` — all visuals (loads `ascii-video`)
- `editor` — assembly + audio mux
**Task graph:**
```
T0 director decompose
T1 music-supervisor analyze track (parent: T0)
T2a renderer-ascii scene 1 (parents: T1, T0)
T2b renderer-ascii scene 2 (parents: T1, T0)
T2c renderer-ascii scene 3 (parents: T1, T0)
T3 editor stitch + mux audio (parents: T2*)
```
**Key choices:**
- Minimal team (4 profiles) for a focused single-tool project
- No reviewer — short experimental piece, director approves directly
- All scenes run through one `renderer-ascii` profile because the `ascii-video`
skill covers everything
This example illustrates the rule: **don't over-decompose**. Three scenes
through one renderer is fine. Don't spawn three renderer profiles.
## Example 6 — Real-time / installation art
**Brief:** A 2-minute audio-reactive visual for a gallery installation. Driven
by an audio input feed. TouchDesigner-based. 16:9 4K.
**Team:**
- `director`
- `cinematographer` — visual language spec (loads `touchdesigner-mcp`)
- `renderer-touchdesigner` — all visuals + record-to-disk
(loads `touchdesigner-mcp`)
- `audio-mixer` — final loudness pass on the captured audio (optional if
pre-mixed source)
- `editor` — assemble final clip from TouchDesigner recording
- `reviewer` — visual QA
**Task graph:**
```
T0 director decompose
T1 cinematographer TD operator graph spec (parent: T0)
T2 renderer-touchdesigner build TD network + record output (parent: T1)
T3 editor trim + audio mux (parent: T2)
T4 reviewer final QA (parent: T3)
```
**Key choices:**
- `touchdesigner-mcp` controls a running TouchDesigner instance — the
cinematographer designs the operator graph, renderer builds it
- Output is a recording from the running TD network, not a render-to-frames
process; editor mostly just trims
## Pattern recognition
When the user describes a video, look for these signals to map to an example:
- **Plot, characters, scripted dialogue** → Example 1 (narrative)
- **Specific product, CTA, brand colors, voiceover** → Example 2 (marketing)
- **Track file provided, "synced to music"** → Example 3 (music video)
- **"Explain how X works", math/algorithm/concept walkthrough** → Example 4 (manim explainer)
- **Terminal aesthetic, ASCII, retro pixel** → Example 5 (ASCII)
- **"Audio-reactive", "real-time", "installation"** → Example 6 (TouchDesigner)
- **Comic-style narrative** → use `renderer-comic` (`baoyu-comic` skill)
- **Retro game / pixel-art aesthetic** → use `renderer-pixel` (`pixel-art` skill)
- **3D scene, photoreal environment** → use `renderer-3d` (`blender-mcp`)
- **Generative art, particle system, shader** → use `renderer-p5js` (`p5js`)
- **AI-generated photoreal stills + animation** → use `renderer-comfyui`
(`comfyui`) for both stills and image-to-video
- **"video about how the system works", recursive demo** → composable from
any of the above; the recursion is a rendering technique, not a style
The actual team should be derived from the specific brief — these examples are
starting points, not endpoints.
@@ -0,0 +1,166 @@
# Intake — Discovery Question Banks
The discovery process is **adaptive**. Always start with three baseline
questions to identify the broad style category, then drill into a per-style
question bank. Ask 2-4 questions at a time, listen, then proceed. Make
reasonable assumptions whenever the user implies an answer.
## Tier 0 — Baseline (always ask)
1. **What is the video?** — One-sentence pitch
2. **How long?** — Approximate duration
3. **Aspect ratio + target platform?** — 16:9 / 9:16 / 1:1 / 4:5; X, IG, YouTube, internal, etc.
From these answers, classify the style category and pick the relevant Tier 1
follow-ups. **Do not** move on to Tier 1 until all three are answered.
## Style classification
Map the brief to one of these archetypes (or a hybrid):
| Archetype | Tells |
|-----------|-------|
| **Narrative film** | Plot, characters, scenes-with-events, dialogue, location |
| **Product / marketing** | A specific product or feature being shown / sold; CTA at end |
| **Music video** | A specific track exists; visuals sync to music |
| **Explainer / educational** | A concept being taught; voiceover-driven |
| **Tutorial / changelog** | Software demo, terminal-heavy, technical |
| **ASCII / terminal art** | Retro terminal aesthetic explicit, character-grid |
| **Abstract / loop** | Generative, no plot, often perfect-loop |
| **Documentary / interview cut** | Real footage, transcription-driven |
| **Real-time / installation** | Audio-reactive, gallery installation, VJ output |
If ambiguous, **ask** which category fits — don't guess. Hybrids are common
(e.g., a product video with a narrative arc); decompose into the dominant
mode + secondary modifiers.
**Recursive / meta** ("a video that shows its own production") is a
*rendering technique*, not a separate style — compose it from any of the
above by adding a two-pass render step where pass 2 uses pass 1's output as
texture inside the final scene.
## Tier 1 — Per-style follow-ups
### Narrative film
- **Setting / world?** — When and where the story takes place
- **Characters?** — How many, archetypes, who carries dialogue
- **Beat list or full script?** — Has the user written the story or do we draft it
- **Dialogue language?** — Spoken lines, on-screen subs only, silent
- **Visual generation approach?** — Text-to-image (FAL/Midjourney/Imagen) →
image-to-video (Runway/Kling), 3D animation (Blender), 2D animation,
procedural, or hybrid
- **Voice approach?** — TTS (which voice), recorded VO, no dialogue
- **Music / score?** — Commissioned (via `songwriting-and-ai-music` Suno
prompts, or local `heartmula`), licensed track provided, silent
### Product / marketing
- **Product?** — Name, what it does, key feature being shown
- **Target audience?** — Who's watching, what they care about
- **CTA?** — Visit URL, install, sign up, etc.
- **Tone?** — Serious, playful, technical, premium, edgy
- **Brand assets available?** — Logo files, color palette, fonts, existing footage
- **Animation style?** — Motion graphics (Remotion / AE-style), screen recording,
generative, illustrated
- **Voiceover?** — Yes (which voice / language) or text-only
- **Music?** — Track provided, license-free needed, custom-composed
### Music video
- **Track file?** — Path to the audio (essential — we'll analyze BPM + beats)
- **Track length to use?** — Full song or a section
- **Genre / energy?** — Tells what visual rhythm and density to use
- **Lyric / narrative content?** — Are there lyrics to render on screen,
or is it purely visual?
- **Visual reference style?** — Existing music videos / artists for reference
- **Performer footage?** — None, has clips, will provide
- **Visual generation approach?** — Per-beat generative, edit-driven cuts of stock
footage, illustrated, hybrid
### Explainer / educational
- **What concept is being taught?** — One-sentence concept, key takeaway
- **Audience expertise?** — Beginner / intermediate / expert
- **Diagram density?** — Heavy math / formulas / code / abstract concepts
- **Voiceover?** — TTS / recorded / on-screen text only
- **Tool preference?** — `manim-video` (math), `p5js` (generative),
Remotion (UI motion graphics), `comfyui` (AI-generated visuals),
`ascii-video` (technical/retro), hybrid
- **Pacing?** — Fast and dense (3Blue1Brown) or slow and contemplative
### Tutorial / changelog / software demo
- **Software being demonstrated?** — Name, what it does
- **Demo script?** — Sequence of commands / screens to show
- **Terminal-only or with GUI?**
- **Voiceover for narration?**
- **Diagram support needed?** — Often these benefit from a diagram skill
alongside the screen-capture/render step (`excalidraw`,
`architecture-diagram`, `concept-diagrams`)
### ASCII / terminal art
- **Source material?** — Generative / driven by audio / converting existing
video / static image starting point
- **Color palette?** — Brand-driven (gold/black/blue), Matrix green, full
rainbow, monochrome
- **Audio reactivity?** — None / loose mood / tight beat sync / FFT-driven
- **Character set?** — ASCII only / Unicode block-drawing / mystic glyphs
- **Loop or narrative?** — Perfect loop or one-shot
### Abstract / loop
- **Mood / emotion?** — One word that captures the feel
- **Motion type?** — Zoom-into-itself, particle drift, wave, geometric, organic
- **Loop required?** — Perfect loop (Droste-style) or just satisfying ending
- **Audio?** — Silent, ambient pad, beat-synced
### Documentary / interview cut
- **Source footage?** — Provided clips, length per clip
- **Transcript / subtitles?** — Provided or to be generated
- **Story structure?** — Chronological / thematic / arc
- **B-roll approach?** — Generated, stock library, none
### Real-time / installation
- **Output environment?** — Gallery wall, projector, screen, web embed
- **Audio source?** — Live audio input, pre-recorded track, both
- **Reactivity tightness?** — Mood-level (loose) vs. tight beat-sync vs. live
parameter control
- **Tool preference?** — `touchdesigner-mcp` for full TD operator graphs;
`p5js` for web-canvas; `comfyui` for generative-AI fed by audio features
## Tier 2 — Always ask near the end
- **Brand assets path?** — Where logo / color palette / fonts / music library lives
- **Output format requirements?** — Codec preference, target file size, accepted
alternates (vertical cut, GIF, audio-only)
- **Deadline?** — Affects task `max_runtime_seconds` and acceptable scope
- **Quality bar?** — Rough draft for review / polished final / archival
- **Existing footage / assets to reuse?** — Anything that should appear, not just inform
## Reasonable assumption defaults
When the user under-specifies, fill in these defaults rather than asking:
| Parameter | Default |
|----------|---------|
| Frame rate | 30 fps for X / IG; 60 fps for tutorials/explainers; 24 fps for narrative film |
| Resolution | 1080×1080 for square, 1920×1080 for 16:9, 1080×1920 for 9:16 |
| Codec | H.264 / yuv420p, CRF 18 |
| Audio codec | AAC 192 kbps |
| Voice | Provider's mid-range neutral voice unless brand calls for distinctive timbre |
| Music | Silent (require user to specify if music is wanted) |
| Captions | On for explainer/tutorial; off for narrative/abstract unless requested |
| Quality bar | Polished final unless user says draft |
State the assumption explicitly: *"Assuming 30 fps and AAC audio unless you say otherwise — proceed?"*
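One way to apply the table while still being able to state each assumption is to track which keys were filled in. This is a rough sketch: the brief keys (`style`, `platform`, `aspect`) and the precedence of style over platform for frame rate are assumptions made for illustration, not something the table specifies.

```python
# Values copied from the defaults table above.
RESOLUTION_BY_ASPECT = {
    "1:1": "1080x1080",
    "16:9": "1920x1080",
    "9:16": "1080x1920",
}
FPS_BY_STYLE = {"narrative": 24, "tutorial": 60, "explainer": 60}
FPS_BY_PLATFORM = {"x": 30, "ig": 30}
FIXED_DEFAULTS = {
    "codec": "H.264 / yuv420p, CRF 18",
    "audio_codec": "AAC 192 kbps",
    "music": "silent",
    "quality_bar": "polished final",
}


def fill_defaults(brief):
    """Return (filled_brief, assumed_keys) without overwriting user answers."""
    filled = dict(brief)
    assumed = []

    def assume(key, value):
        # Only fill a key the user left out, and only when a default exists.
        if key not in filled and value is not None:
            filled[key] = value
            assumed.append(key)

    # Assumption: style takes precedence over platform for frame rate.
    fps = FPS_BY_STYLE.get(filled.get("style"))
    if fps is None:
        fps = FPS_BY_PLATFORM.get(filled.get("platform"))
    assume("fps", fps)
    assume("resolution", RESOLUTION_BY_ASPECT.get(filled.get("aspect")))
    for key, value in FIXED_DEFAULTS.items():
        assume(key, value)
    return filled, assumed
```

The `assumed` list is what gets read back to the user; anything with no default in the table (for example a 4:5 resolution) stays unset and has to be asked.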
## Anti-patterns
- **Asking 10 questions at once.** Maximum 4 per turn.
- **Asking for things the brief already implies.** If the user said "music video for my track," do not ask "is there a track?"
- **Failing to classify before drilling in.** Tier-1 questions depend on classification; mixing them up wastes turns.
- **Treating "make a video" as enough to proceed.** Always confirm the three baseline questions.
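The pacing rules above (baseline first, at most four questions per turn, skip what the brief already implies) can be sketched as two small helpers. The question keys are hypothetical labels chosen for this example.

```python
MAX_PER_TURN = 4  # "Maximum 4 per turn" from the anti-patterns above

# Hypothetical keys for the three Tier 0 baseline questions.
BASELINE = ("pitch", "duration", "aspect_and_platform")


def ready_for_tier1(answered):
    """True once every baseline question has an answer (stated or implied)."""
    return all(key in answered for key in BASELINE)


def next_batch(questions, answered, limit=MAX_PER_TURN):
    """Return the next questions to ask, skipping ones already answered."""
    pending = [q for q in questions if q not in answered]
    return pending[:limit]
```

Here `answered` holds both explicit answers and anything the brief implies, so a brief that says "music video for my track" would already contain the track key and that question never reaches the user.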
@@ -0,0 +1,276 @@
# Kanban Setup — Project Bootstrap & Profile Configuration
Once the brief is locked and the team is designed, the next step is producing
the actual `setup.sh` that creates the project workspace, configures Hermes
profiles, and fires the initial kanban task.
This file documents the patterns. The companion script
`scripts/bootstrap_pipeline.py` automates most of it from a structured input
JSON.
> **Credit:** the single-project-workspace layout, profile-config patching
> approach, SOUL.md-per-profile convention, and `--workspace dir:<path>` rule
> are adapted from alt-glitch's original multi-agent video pipeline:
> [NousResearch/kanban-video-pipeline](https://github.com/NousResearch/kanban-video-pipeline).
> This skill generalizes those patterns across video styles and replaces the
> string-replacement config patcher with a PyYAML-based one.
## Project workspace structure
Every video project gets one workspace under `~/projects/video-pipeline/<slug>/`:
```
~/projects/video-pipeline/<slug>/
├── brief.md ← the contract; all tasks reference
├── TEAM.md ← team composition + task graph (director reads this)
├── taste/
│ ├── brand-guide.md ← color, typography, motion rules
│ ├── emotional-dna.md ← what the piece should FEEL like
│ └── style-frames/ ← optional: visual references
├── audio/
│ ├── track.mp3 ← provided music (if any)
│ ├── voiceover/ ← per-line TTS clips
│ └── sfx/ ← sound effects
├── assets/
│ ├── logos/
│ ├── fonts/
│ └── existing-footage/ ← reusable provided clips
├── scenes/
│ ├── scene-01/
│ │ ├── VISUAL_SPEC.md ← cinematographer's per-scene spec
│ │ ├── render.py ← renderer's code (or sketch.html, etc.)
│ │ ├── checkpoints/ ← preview frames for QA
│ │ └── clip.mp4 ← the deliverable for this scene
│ ├── scene-02/...
│ └── ...
├── checkpoints/ ← global review frames
├── tools/ ← optional project-local helpers
└── output/
├── final.mp4 ← stitched + audio
├── final-noaudio.mp4
├── final-9x16.mp4 ← optional: vertical alternate
└── captions.srt ← optional: subtitle file
```
**The slug** is derived from the brief title: lowercase, hyphen-separated.
Example: `q3-product-teaser`, `ascii-mood-loop`, `interview-cut-2026-q1`.
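A possible shape for the slug rule and the directory creation step, sketched in Python with only the standard library. The function names are illustrative, and dropping every character outside `a-z0-9` is an assumption about how the slug handles punctuation and non-ASCII text.

```python
import re
from pathlib import Path

# Directories from the workspace tree above (scene folders are added per scene).
WORKSPACE_DIRS = [
    "taste/style-frames",
    "audio/voiceover",
    "audio/sfx",
    "assets/logos",
    "assets/fonts",
    "assets/existing-footage",
    "scenes",
    "checkpoints",
    "tools",
    "output",
]


def slugify(title):
    """Lowercase the title and join runs of letters/digits with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def create_workspace(root, title, scene_count=0):
    """Create the project tree under root/<slug>/ and return its path."""
    workspace = Path(root).expanduser() / slugify(title)
    for rel in WORKSPACE_DIRS:
        (workspace / rel).mkdir(parents=True, exist_ok=True)
    for n in range(1, scene_count + 1):
        scene = workspace / "scenes" / f"scene-{n:02d}" / "checkpoints"
        scene.mkdir(parents=True, exist_ok=True)
    return workspace
```

Because every `mkdir` uses `exist_ok=True`, calling `create_workspace("~/projects/video-pipeline", "Q3 Product Teaser", scene_count=3)` a second time leaves an existing tree untouched.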
## The setup.sh script
The setup script does six things in order:
1. **Create workspace tree** — all directories above
2. **Create profiles** — `hermes profile create <name> --clone`
3. **Configure profiles** — patch each profile's
`~/.hermes/profiles/<name>/config.yaml` to set toolsets and always_load
skills (nothing else; `terminal.cwd` stays untouched)
4. **Write SOUL.md per profile** — the personality + role definition
5. **Copy any provided assets + write `brief.md`, `TEAM.md`, and `taste/`**
6. **Fire the initial kanban task** — `hermes kanban create` assigned to the director
See `assets/setup.sh.tmpl` for the skeleton.
### Profile creation pattern
```bash
hermes profile create director --clone 2>/dev/null || true
```
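The `--clone` flag clones from the active profile (preserving model, base
config). The `|| true` keeps the script idempotent: re-running won't error if
the profile already exists. Because `2>/dev/null` also hides genuine failures,
confirm afterwards with `hermes profile list` that the profile exists.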
### Profile config patching
Each profile has a YAML config at `~/.hermes/profiles/<name>/config.yaml`. The
setup script edits exactly two keys:
1. `toolsets:` — replace the default with the role's required toolsets
2. `skills.always_load:` — list the role's must-load skills (may be empty)
**Do NOT** modify `approvals.mode` (controls user-confirmation of tool calls
— a security setting that must stay as the user configured it). **Do NOT**
modify `terminal.cwd` — the kanban dispatcher overrides cwd per-task via
`--workspace dir:<path>`, so the profile's cwd is irrelevant to the kanban
work and changing it could break the user's interactive use of the profile.
Use **PyYAML**, not string replacement, so the patch is robust against
default-config schema drift:
```bash
configure_profile() {
local profile="$1"
local toolsets_json="$2" # JSON array, e.g. '["kanban","terminal","file"]'
local skills_json="$3" # JSON array, e.g. '["kanban-worker","ascii-video"]'
python3 - "$profile" "$toolsets_json" "$skills_json" <<'PY'
import json, os, sys, yaml
profile, ts_json, sk_json = sys.argv[1:4]
p = os.path.expanduser(f"~/.hermes/profiles/{profile}/config.yaml")
with open(p) as f:
cfg = yaml.safe_load(f) or {}
cfg["toolsets"] = json.loads(ts_json)
cfg.setdefault("skills", {})["always_load"] = json.loads(sk_json)
with open(p, "w") as f:
yaml.safe_dump(cfg, f, sort_keys=False)
PY
}
```
PyYAML must be installed in the user's Python (it ships with most Hermes
installs). If absent: `pip install pyyaml`.
The setup script should also **validate** the patch by re-reading the file
and comparing — see `assets/setup.sh.tmpl` for the validation pattern.
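The actual validation pattern lives in `assets/setup.sh.tmpl`, which is not reproduced here. As a rough sketch of the comparison step only, the function below takes the mapping obtained by re-reading the file (for example with `yaml.safe_load`) and reports any difference from the intended patch; the function name is hypothetical.

```python
def patch_problems(cfg, toolsets, skills):
    """Return mismatches between a re-read config and the intended patch."""
    problems = []
    found_toolsets = cfg.get("toolsets")
    if found_toolsets != toolsets:
        problems.append(f"toolsets: expected {toolsets}, found {found_toolsets}")
    # "skills" may be missing or null in a freshly cloned config.
    found_skills = (cfg.get("skills") or {}).get("always_load")
    if found_skills != skills:
        problems.append(
            f"skills.always_load: expected {skills}, found {found_skills}"
        )
    return problems
```

An empty list means the two patched keys round-tripped; the setup script can abort on anything else before firing the kanban.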
### SOUL.md per profile
Each profile gets a `SOUL.md` at `~/.hermes/profiles/<name>/SOUL.md` that
defines its role, voice, and rules. See `assets/soul.md.tmpl` for the
template. Customize per role and per project.
The director's SOUL.md should be the most opinionated — its voice flavors
the entire production. **Critical content for the director's SOUL.md:**
- **Anti-temptation rules:** "Do not execute the work yourself. For every
concrete task, create a kanban task and assign it. Decompose, route, comment,
approve — that's the whole job." (The `kanban-orchestrator` skill provides
the deeper playbook; load it.)
- **Decomposition steps:** Read `brief.md`, `TEAM.md`, `taste/`. Use the team
graph in `TEAM.md` to fan out tasks.
- **The workspace_path rule** (see below).
Other profiles' SOUL.md is briefer; mostly mechanical: who you are, what you
read, what you produce, what skills/tools to use, where to write outputs.
Most non-director profiles should `always_load: kanban-worker` for the
deeper-than-baseline kanban guidance.
### Initial kanban task
The final action of setup.sh is firing the kanban:
```bash
hermes kanban create "Direct production of <video title>" \
--assignee director \
--workspace dir:"$HOME/projects/video-pipeline/${PROJECT_SLUG}" \
--tenant "${PROJECT_SLUG}" \
--priority 2 \
--max-runtime 4h \
--body "$(cat <<EOF
Read brief.md, TEAM.md, and taste/.
Decompose into the team graph defined in TEAM.md.
All child tasks MUST use:
workspace_kind="dir"
workspace_path="$HOME/projects/video-pipeline/${PROJECT_SLUG}"
tenant="${PROJECT_SLUG}"
EOF
)"
```
The `--workspace dir:<path>` flag is **critical** — it tells the kanban that
all child tasks share this workspace. Skipping or using `worktree` will
isolate profiles and break artifact sharing.
## The TEAM.md file
Alongside `brief.md`, write a `TEAM.md` that the director reads. It documents
the team composition + task graph the orchestrator should follow. This
removes ambiguity and prevents the director from inventing extra steps.
Example structure (for an ASCII video with a music track and narration):
```markdown
# Team & Task Graph — <video title>
## Team
- `director` (this profile) — vision, decomposition, approval
- `cinematographer` — visual spec, quality review (loads `ascii-video`)
- `renderer-ascii` — ASCII scenes (loads `ascii-video`)
- `music-supervisor` — track analysis (loads `songsee`)
- `voice-talent` — narration (uses ElevenLabs API)
- `audio-mixer` — final mix (ffmpeg)
- `editor` — assembly (ffmpeg)
- `reviewer` — final QA gate
## Task Graph
T0: this task — decompose
├── T1: cinematographer "Design visual language" (parent: T0)
│ │
│ ├── T2a: renderer-ascii "Scene 1 — title card" (parent: T1)
│ ├── T2b: renderer-ascii "Scene 2 — main beat" (parent: T1)
│ ├── T2c: renderer-ascii "Scene 3 — outro" (parent: T1)
├── T3: music-supervisor "Analyze track + emit beats.json" (parent: T0)
├── T4: voice-talent "Generate narration" (parent: T0)
├── T5: audio-mixer "Mix VO + bg music" (parents: T3, T4)
├── T6: editor "Assemble cut + mux audio" (parents: T2*, T5)
└── T7: reviewer "Final QA" (parent: T6)
```
The director turns this into actual `kanban_create` calls.
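To make the shared-workspace requirement concrete, here is a sketch of the arguments the director would attach to each child task. Only `workspace_kind`, `workspace_path`, and `tenant` come from this document; the `title`, `assignee`, and `parents` field names are assumptions about the `kanban_create` call and should be checked against the tool's real schema.

```python
from pathlib import Path


def child_task_args(title, assignee, parents, project_slug, home=None):
    """Build one argument dict; title/assignee/parents are assumed field names."""
    home = Path(home) if home else Path.home()
    workspace = home / "projects" / "video-pipeline" / project_slug
    return {
        "title": title,
        "assignee": assignee,
        "parents": list(parents),
        # The three fields every child task must carry.
        "workspace_kind": "dir",
        "workspace_path": str(workspace),
        "tenant": project_slug,
    }


def fan_out(rows, project_slug, home=None):
    """Turn (title, assignee, parents) rows from TEAM.md into argument dicts."""
    return [child_task_args(t, a, p, project_slug, home) for t, a, p in rows]
```

Building every child through one helper means the workspace path and tenant cannot drift between tasks.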
## API-key prerequisites check
Before firing the kanban, verify required keys are available. Check both
`~/.hermes/.env` and macOS Keychain (if on macOS):
```bash
check_key() {
local var="$1"
local kc_account="$2"
local kc_service="$3"
if grep -q "^${var}=" ~/.hermes/.env 2>/dev/null && \
[ -n "$(grep "^${var}=" ~/.hermes/.env | cut -d= -f2-)" ]; then
return 0
fi
if command -v security >/dev/null 2>&1 && \
security find-generic-password -a "${kc_account}" -s "${kc_service}" -w >/dev/null 2>&1; then
return 0
fi
echo "ERROR: ${var} not set in ~/.hermes/.env or Keychain (${kc_account}/${kc_service})"
return 1
}
check_key ELEVENLABS_API_KEY hermes ELEVENLABS_API_KEY || exit 1
check_key OPENROUTER_API_KEY hermes OPENROUTER_API_KEY || exit 1
# ...
```
If a key is missing, the script aborts with a clear message rather than
firing a kanban that will hit credential errors mid-execution.
## Critical rules
1. **`workspace_kind="dir"` + `workspace_path="<absolute>"` on every kanban_create.** Otherwise profiles can't share artifacts.
2. **Tenant every task.** `--tenant <project-slug>` keeps the dashboard scoped
and prevents cross-pollination with other ongoing kanbans.
3. **Idempotency keys.** For tasks that should not duplicate on re-run (e.g.,
setup creating profiles), use the `idempotency_key` argument or check
existence first.
4. **`max_runtime_seconds` per task.** Renderers that get stuck eat compute.
Standard defaults:
- Renderer task: 1800s (30min)
- Editor task: 600s (10min)
- Voice-talent task: 300s (5min)
- Image-generator task: 600s (10min)
- Image-to-video-generator task: 900s (15min)
5. **Heartbeats for long renders.** Tasks expected to run >5min should emit
`kanban_heartbeat` periodically with progress. Renderers should report
frame counts; the editor should report assembly progress.
6. **The `audio/` and `taste/` dirs are populated BEFORE firing the kanban.**
Don't ask the director's pipeline to source these — copy at setup time.
7. **`brief.md` is read-only after setup.** If the brief changes during
execution, that's a significant pivot — re-fire the kanban rather than edit
live.
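Rules 4 and 5 can be combined into one lookup. This is a sketch under one stated assumption: a specialized variant such as `renderer-ascii` inherits the cap of its base role, which the rules above do not spell out.

```python
# Standard caps from rule 4, in seconds.
MAX_RUNTIME_SECONDS = {
    "renderer": 1800,
    "editor": 600,
    "voice-talent": 300,
    "image-generator": 600,
    "image-to-video-generator": 900,
}
HEARTBEAT_THRESHOLD_SECONDS = 300  # rule 5: tasks expected to run >5min


def runtime_policy(profile):
    """Return (max_runtime_seconds, needs_heartbeat) for a profile name.

    Assumption: variants like "renderer-ascii" fall back to the cap of the
    role named before the first hyphen ("renderer").
    """
    role = profile if profile in MAX_RUNTIME_SECONDS else profile.split("-")[0]
    seconds = MAX_RUNTIME_SECONDS.get(role)
    if seconds is None:
        return None, False  # no standard default listed for this role
    return seconds, seconds > HEARTBEAT_THRESHOLD_SECONDS
```

Roles without a listed default return `None`, which signals that the cap has to be chosen per project.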
@@ -0,0 +1,180 @@
# Monitoring — Watch the Pipeline + Intervene
After `setup.sh` fires the kanban, the work runs autonomously. The role of
this skill in the execution phase is to help the user (and the AI overseeing
the session) detect problems early and intervene effectively.
## Live monitoring commands
```bash
# Live event stream — task spawns, status changes, heartbeats, completions
hermes kanban watch --tenant <project-slug>
# Snapshot of the board
hermes kanban list --tenant <project-slug>
hermes kanban list --tenant <project-slug> --json # machine-readable
# Per-status counts + oldest-ready age
hermes kanban stats --tenant <project-slug>
# Visual dashboard (browser)
hermes dashboard
# Inspect a specific task (includes comments + events)
hermes kanban show <task-id>
# Follow a single task's event stream
hermes kanban tail <task-id>
```
Verify available subcommands with `hermes kanban --help` — the kanban CLI
ships with `init / create / list / show / assign / link / unlink / claim /
comment / complete / block / unblock / archive / tail / dispatch / watch /
stats / heartbeat / log / runs / context / gc`.
The companion `scripts/monitor.py` polls the kanban via the CLI and surfaces
common issues (stuck tasks, missing heartbeats, repeated retries, dependency
deadlocks).
## What to watch for
### Healthy pipeline indicators
- Tasks transition `READY → RUNNING → DONE` in roughly the expected order
- Renderers emit periodic `kanban_heartbeat` events with progress (e.g. "frame
240/720")
- Each task's runtime is well under its `max_runtime_seconds` cap
- No task accumulates more than 1 retry
- Dependency arrows resolve (children unblock as parents complete)
### Warning signs
| Symptom | Likely cause | Action |
|---------|--------------|--------|
| Task RUNNING but no heartbeat in 2+ min | Worker stuck, infinite loop, blocked on input | `hermes kanban show <id>` — read the worker's last events. The dispatcher SIGTERMs tasks that exceed their `max-runtime`; if you need to stop one earlier, `hermes kanban block <id>` then `hermes kanban archive <id>`, and create a re-run task. |
| Same task retried 2+ times | Reproducible failure (missing key, bad spec, broken tool) | `hermes kanban show <id>` to read failure events. Fix root cause before re-running. |
| RUNNING longer than max_runtime | Task is slow but progressing OR genuinely stuck | Check heartbeats with `hermes kanban tail <id>`. If progressing, the dispatcher will SIGTERM eventually anyway — raise `max-runtime` on a re-created task. |
| Child task READY but parents still RUNNING for >2× expected | Cascade slow, dependency miswired | Check the dependency graph. Inspect the parent: sometimes it completed but its handoff fields (summary, metadata) were empty so the child has nothing to consume. |
| New tasks not appearing | Director is hung in decomposition | Inspect director task with `kanban show`. Often a malformed `kanban_create` call. |
| Specialist tasks completing instantly | Decomposition created tasks without bodies | Director didn't pass enough context. Re-create with explicit body content. |
| Tasks created but never picked up | Profile not running, or tenant mismatch, or dispatcher not running | Check `hermes profile list` (profile exists?), `hermes status` (gateway/dispatcher up?), and verify tenant. |
| Specific renderer task fails → review note → renderer redoes → fails again | Brief is asking for the impossible | Pivot the brief, not the renderer. |
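The first two rows of the table lend themselves to an automated check. The sketch below works on a list of task dictionaries; the field names (`id`, `status`, `last_heartbeat_at`, `retries`) are hypothetical and would need to be mapped from whatever `hermes kanban list --json` actually emits.

```python
from datetime import datetime, timedelta, timezone

HEARTBEAT_GRACE = timedelta(minutes=2)  # "no heartbeat in 2+ min"
RETRY_LIMIT = 2                         # "retried 2+ times"


def find_warnings(tasks, now=None):
    """Return (task_id, reason) pairs for the first two warning signs."""
    # Timestamps are expected to be timezone-aware ISO 8601 strings.
    now = now or datetime.now(timezone.utc)
    warnings = []
    for task in tasks:
        # Field names here are hypothetical, not the real --json schema.
        if task["status"] == "RUNNING":
            last = datetime.fromisoformat(task["last_heartbeat_at"])
            if now - last >= HEARTBEAT_GRACE:
                warnings.append((task["id"], "no recent heartbeat"))
        if task.get("retries", 0) >= RETRY_LIMIT:
            warnings.append((task["id"], "repeated retries"))
    return warnings
```

Each warning still needs a human look with `hermes kanban show <id>`; the check only narrows down where to look.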
## Intervention recipes
### Rejecting bad output
When a renderer ships a clip that doesn't pass review:
```bash
# 1. Comment on the renderer's task with specific feedback
hermes kanban comment <renderer-task-id> "Scene 3 looks too sparse \
— increase visual density. Tighten color palette to brand spec."
# 2. Create a re-render task with the original as parent
hermes kanban create "Scene 3 — re-render with feedback" \
--assignee renderer-ascii \
--parent <renderer-task-id> \
--workspace dir:"$HOME/projects/video-pipeline/<slug>" \
--tenant <slug> \
--skill ascii-video \
--max-runtime 30m
```
### Adding a new dependency mid-flight
When the editor needs an asset that wasn't originally planned (e.g., a captions
file):
```bash
# 1. Create the new task and capture its id
NEW_TASK_ID=$(hermes kanban create "Generate SRT captions from voiceover" \
--assignee captioner \
--workspace dir:"$HOME/projects/video-pipeline/<slug>" \
--tenant <slug> \
--json | python3 -c "import json,sys;print(json.load(sys.stdin)['id'])")
# 2. Wire it as a parent of the editor's task with `kanban link`
hermes kanban link "$NEW_TASK_ID" <editor-task-id>
```
`kanban link` takes `parent_id child_id` (parent first). Use `kanban unlink`
to remove a dependency.
### Stopping a worker that's stuck
The kanban dispatcher will SIGTERM (then SIGKILL) any task that exceeds its
`--max-runtime` automatically. To stop one sooner:
```bash
# Mark blocked so the dispatcher leaves it alone, then archive
hermes kanban block <task-id>
hermes kanban archive <task-id>
# Diagnose what happened
hermes kanban show <task-id> # task body, comments, recent events
hermes kanban tail <task-id> # follow the live event stream
hermes kanban log <task-id> # worker process log
```
After stopping, decide: fix root cause + re-create the task, or skip and
adjust dependent tasks.
### Pivoting the brief
If during execution the user wants something fundamentally different:
1. Stop the active director task and all RUNNING children (`hermes kanban block`, then `hermes kanban archive`)
2. Edit `brief.md` and `TEAM.md`
3. Re-fire the initial `hermes kanban create` for the director
Don't try to "edit while running" — the kanban's audit trail makes a clean
pivot more legible than mid-stream changes.
## Periodic check-in script
A simple polling pattern for hands-off monitoring:
```bash
while true; do
clear
hermes kanban list --tenant <slug>
echo "---"
hermes kanban stats --tenant <slug>
sleep 30
done
```
For a live event feed, run `hermes kanban watch --tenant <slug>` in a
separate terminal — it streams task lifecycle events as they happen.
For automated intervention (auto-restart stuck tasks, auto-create re-render on
review failure), see the `scripts/monitor.py` patterns.
## When to call it done
The pipeline is finished when:
1. All renderer tasks complete and pass review
2. The editor's `output/final.mp4` exists and `ffprobe` confirms expected
duration + streams
3. The reviewer (if present) has approved
4. Optional masterer variants exist
At this point, present the final.mp4 path to the user along with any review
notes. Do NOT delete the workspace — the user may want to iterate on a single
scene without re-running the whole pipeline.
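The `ffprobe` check in item 2 could look roughly like this. The sketch asks `ffprobe` for JSON output and then verifies duration and stream types; the 0.5 s tolerance and the requirement of one video plus one audio stream are assumptions, so adjust them to the brief (a silent piece would drop the audio check).

```python
import json
import subprocess


def probe(path):
    """Run ffprobe and return its JSON report for duration and stream types."""
    completed = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "format=duration:stream=codec_type",
         "-of", "json", path],
        check=True, capture_output=True, text=True,
    )
    return json.loads(completed.stdout)


def check_report(report, expected_seconds, tolerance=0.5):
    """Return a list of problems; an empty list means the file looks right."""
    problems = []
    duration = float(report["format"]["duration"])
    if abs(duration - expected_seconds) > tolerance:
        problems.append(f"duration {duration:.2f}s, expected {expected_seconds}s")
    kinds = {stream["codec_type"] for stream in report.get("streams", [])}
    for needed in ("video", "audio"):
        if needed not in kinds:
            problems.append(f"missing {needed} stream")
    return problems
```

Typical use would be `check_report(probe("output/final.mp4"), expected_seconds=60)`, run from the project workspace.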
## Common gotchas
- **Tenant mismatches.** A task created with the wrong tenant won't appear in
monitoring. Always pass `--tenant <slug>` consistently.
- **Profile process not running.** Tasks queue indefinitely in READY if no
worker for that profile is online. Check `hermes profile list` and start
any missing profiles.
- **Workspace permissions.** All profiles need read+write to the workspace
directory. `chmod -R u+rw <workspace>` if any worker reports permission
errors.
- **Audio/visual sync.** The editor's clip stitching must match the
renderer's actual output durations. Don't hardcode scene durations in
the editor — read from the renderer's handoff metadata.
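For the last gotcha, the editor can derive its cut points from measured clip lengths instead of planned ones. A minimal sketch, assuming the durations have already been pulled out of each renderer's handoff metadata into a plain list (the metadata format itself is not specified here):

```python
def scene_offsets(durations):
    """Return (start, end) times per scene from measured clip durations."""
    offsets = []
    start = 0.0
    for seconds in durations:
        offsets.append((start, start + seconds))
        start += seconds
    return offsets
```

The end of the last tuple is the total video length, which is also the duration the final `ffprobe` check should expect.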
@@ -0,0 +1,298 @@
# Role Archetypes
The library of role archetypes for video production. **Compose a team from this
list, don't clone a fixed roster.** Most videos need 4-7 profiles. The director
is always present; everything else is conditional on the brief.
Each role's profile name is by convention `kebab-case` (e.g. `creative-director`,
`image-generator`). Multiple instances of the same role get descriptive suffixes
when they need different focus (e.g., `renderer-ascii`, `renderer-3d`).
For toolset + skill mapping per role, see [tool-matrix.md](tool-matrix.md).
## Always present
### director
The vision-holder. Reads the brief and brand guide, decomposes into a task
graph, comments to steer creative direction, approves the final cut.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-orchestrator`. The kanban plugin auto-injects baseline
orchestration guidance for free; `kanban-orchestrator` is the deeper
decomposition playbook. Add `creative-ideation` if the brief is wide-open
and needs framing help.
- **Personality:** Tied to the brand voice — see `assets/soul.md.tmpl`
The director keeps execution-capable toolsets (terminal, file), but its
`SOUL.md` rules **forbid** execution. The "decompose, don't execute" discipline
is enforced by personality + the kanban-orchestrator skill, not by missing tools.
## Pre-production roles
Pick based on what the brief needs.
### writer / screenwriter
Writes scripts, dialogue, voiceover copy, narration. Use for any video with
spoken or written words beyond a tagline.
- **Toolsets:** kanban, file
- **Skills:** `kanban-worker`, `humanizer` (post-process to strip AI-tells)
- **Outputs:** `script.md`, `narration.md`, `dialogue/scene-NN.md`
### copywriter
Like `writer` but specifically for marketing copy: taglines, CTAs, voiceover
scripts for product videos.
- **Toolsets:** kanban, file
- **Skills:** `kanban-worker`, `humanizer`
- **Outputs:** `copy.md`
### concept-artist / visual-designer
Develops the visual identity: mood board, style frames, color palette
rationale, typography choices. Produces a `visual-spec.md` that all generators
follow. Often produces still reference frames using image-generation APIs or
local skills.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker` plus any project-specific design skill —
`claude-design` (UI/web), `sketch` (quick mockup variants),
`popular-web-designs` (matching known web aesthetic), `pixel-art` (retro),
`ascii-art` (terminal/retro), `excalidraw` (hand-drawn frames),
`design-md` (text-based design docs)
- **Outputs:** `visual-spec.md`, `taste/style-frames/*.png`
### storyboarder
Maps the brief to a beat-by-beat shot list with timing. Critical for narrative
film and music video. Often pairs with a diagramming tool.
- **Toolsets:** kanban, file
- **Skills:** `kanban-worker` plus a diagram skill — `excalidraw` (sketch),
`architecture-diagram` (technical/system), `concept-diagrams` (educational/
scientific)
- **Outputs:** `storyboard.md` with one row per scene/shot, optional
storyboard sketches
### cinematographer / dp
Designs the visual language: framing, color, motion, transitions. Reviews
generator output for visual consistency. Hands off per-scene `VISUAL_SPEC.md`.
- **Toolsets:** kanban, terminal, file, video, vision
- **Skills:** `kanban-worker` plus the visual skill that matches the project
(e.g., `ascii-video` for ASCII work, `manim-video` for explainers,
`touchdesigner-mcp` for real-time visuals, etc.)
- **Outputs:** `scenes/scene-NN/VISUAL_SPEC.md`, review comments on renderer
tasks
- **Reviews via:** `video_analyze` (sends full clip to multimodal LLM for
native review), `vision_analyze` for spot-checking frames, ffprobe summaries
## Production roles
### renderer (generic)
A worker that produces visual content for one or more scenes. Loaded with
whichever creative skill fits the scene's style. Multiple renderers can run in
parallel, each pinned to a different skill via `always_load` in their profile
or `--skill` on the task.
- **Toolsets:** kanban, terminal, file
- **Skills:** one creative skill (see specialized variants below)
- **Outputs:** `scenes/scene-NN/clip.mp4`
### Specialized renderer variants
When scenes need very different tools, create specialized renderer profiles
instead of overloading one. Each loads a different creative skill.
| Variant | Skill | Best for |
|---------|-------|----------|
| `renderer-ascii` | `ascii-video` | Terminal aesthetic, retro pixel, audio-reactive grid, video-to-ASCII conversion |
| `renderer-manim` | `manim-video` | Math, algorithms, 3Blue1Brown-style explainers, equation derivations |
| `renderer-p5js` | `p5js` | Generative art, particles, shaders, organic motion, web-canvas content |
| `renderer-comfyui` | `comfyui` | AI-generated stills + video using local ComfyUI workflows (img-to-img, img-to-video, etc.) |
| `renderer-touchdesigner` | `touchdesigner-mcp` | Real-time, audio-reactive, installation art, VJ-style content |
| `renderer-3d` | `blender-mcp` *(optional)* | 3D modeling, animation, photoreal environments, character animation |
| `renderer-pixel` | `pixel-art` | Retro game aesthetic with era-correct palettes |
| `renderer-comic` | `baoyu-comic` | Knowledge-comic style narrative scenes |
| `renderer-meme` | `meme-generation` *(optional)* | Meme-style stills for satirical/social content |
| `renderer-procedural` | (none — Python with PIL + ffmpeg directly) | Custom procedural content where no skill fits |
| `renderer-video` | (external image-to-video API: Runway / Kling / Luma) | Animating still images in narrative film |
| `renderer-motion-graphics` | (external — Remotion CLI) | Motion graphics, kinetic typography, UI animations |
For external-API renderers, the profile holds the API client logic; only
`kanban-worker` is loaded, plus the terminal toolset and the API key.
### image-generator
Specifically for text-to-image generation. Often produces stills that go to
`renderer-video` for animation.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`, optionally `comfyui` (drives a local
ComfyUI install for image generation)
- **External APIs (alternative to local ComfyUI):** FAL, Replicate, OpenAI
Images, Midjourney
- **Outputs:** `scenes/scene-NN/stills/*.png`
### image-to-video-generator
Takes still images and animates them via Runway/Kling/Luma APIs, or via
ComfyUI's image-to-video workflows locally. Almost always follows
`image-generator` in narrative film pipelines.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`, optionally `comfyui` (for local image-to-video
workflows like AnimateDiff or WAN)
- **External APIs:** Runway, Kling, Luma, Pika
- **Outputs:** `scenes/scene-NN/clip.mp4`
### music-supervisor
Sources, analyzes, and prepares the music track. For music videos, also
produces a beat/BPM map and key-moment timestamps. Uses `songsee` for
spectrograms when the editor or renderer needs a visual reference of the
audio's energy.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`, `songsee` (audio visualization), plus one of:
  - `songwriting-and-ai-music` — when commissioning lyrics + Suno prompts
  - `heartmula` — when generating music with the open-source local model
  - `spotify` — when sourcing existing tracks
- **Outputs:** `audio/track.mp3`, `audio/beats.json`, optional
`audio/track-spectrogram.png`
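
As a rough illustration of what a beat map could contain, the sketch below derives evenly spaced beat timestamps from a fixed BPM. The `beats.json` layout (keys `bpm` and `beats`) and the function name `beat_map` are assumptions made for this example; the source does not define the file's schema, and real tracks usually need onset detection rather than a constant tempo.

```python
import json


def beat_map(bpm: float, duration_s: float, offset_s: float = 0.0) -> dict:
    """Return evenly spaced beat times (seconds) for a constant-tempo track.

    Hypothetical helper: the {"bpm", "beats"} schema is an assumption.
    """
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    interval = 60.0 / bpm  # seconds per beat
    beats = []
    i = 0
    while offset_s + i * interval < duration_s:
        beats.append(round(offset_s + i * interval, 3))
        i += 1
    return {"bpm": bpm, "beats": beats}


if __name__ == "__main__":
    # Print a JSON document that could be saved as audio/beats.json.
    print(json.dumps(beat_map(120, 2.0)))
```

A renderer or editor could read such a file to align cuts with beats, provided the team agrees on the schema in `TEAM.md` first.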
### voice-talent / narrator
Generates voiceover audio. Calls a TTS API directly; no Hermes skill required
beyond `kanban-worker`. The user can also supply pre-recorded VO instead of
generation.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **External APIs:** ElevenLabs, OpenAI TTS, etc.
- **Outputs:** `audio/voiceover/line-NN.mp3`, `audio/voiceover/timeline.mp3`
### foley / sfx-designer
Sound effects and ambient design. Often optional unless the brief calls for
sound design specifically.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`, `songsee` for audio-feature visualization when
designing to a track
- **Outputs:** `audio/sfx/*.mp3`
## Post-production roles
### editor
Assembles the final cut from clips. Uses ffmpeg for stitching, fades,
transitions. Reviews each clip for pacing and quality before assembly.
- **Toolsets:** kanban, terminal, file, video, vision
- **Skills:** `kanban-worker`
- **External tools:** ffmpeg, ffprobe
- **Outputs:** `output/final.mp4`, `output/final-noaudio.mp4`
### colorist
Color grading. Usually optional — if the renderers already produce
brand-consistent output and the editor just stitches, the colorist is overkill.
Worth including for narrative film with hero shots.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **Outputs:** `output/final-graded.mp4`
### audio-mixer
Mixes voiceover + music + SFX into a final audio track. Sets levels, ducks
music under VO, normalizes loudness (LUFS).
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **External tools:** ffmpeg with `loudnorm` filter, optional `sox`
- **Outputs:** `audio/final-mix.mp3`
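
To make the loudness-normalization step concrete, here is a minimal sketch that builds an ffmpeg command line using the `loudnorm` filter. The function name and the default targets (-16 LUFS integrated, -1.5 dBTP true peak, loudness range 11) are assumptions chosen for illustration, not values from the source; pick targets that match the delivery platform.

```python
import subprocess


def loudnorm_cmd(src: str, dst: str, lufs: float = -16.0,
                 true_peak: float = -1.5, lra: float = 11.0) -> list:
    """Build an ffmpeg argv that normalizes loudness in a single pass.

    Hypothetical helper; the default targets are illustrative assumptions.
    """
    audio_filter = f"loudnorm=I={lufs}:TP={true_peak}:LRA={lra}"
    return ["ffmpeg", "-y", "-i", src, "-af", audio_filter, dst]


def run(cmd: list) -> None:
    """Execute the command and raise if ffmpeg exits non-zero."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the command here; call run(...) to actually invoke ffmpeg.
    print(" ".join(loudnorm_cmd("audio/premix.wav", "audio/final-mix.mp3")))
```

Building the argv as a list (rather than a shell string) avoids quoting problems when file names contain spaces.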
### captioner
Burns subtitles into the video, generates SRT, handles accessibility. Can also
generate captions from audio via Whisper.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **External tools:** Whisper (CLI or API), ffmpeg subtitle filters
- **Outputs:** `output/captions.srt`, `output/final-captioned.mp4`
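
The SRT output can be produced with a few lines of Python once timed segments are available. The sketch below assumes segments arrive as `(start_seconds, end_seconds, text)` tuples, which is an assumption for illustration; a transcription tool's real output would need to be mapped into that shape first.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms_total = int(round(seconds * 1000))
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start, end, text) tuples as an SRT document."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0.0, 1.25, "Hello"), (1.5, 3.0, "World")]))
```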
### masterer
Final encode + format variants. Produces deliverables for each platform target
(square for IG, vertical for TikTok, full HD for YouTube, etc.).
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **Outputs:** `output/final-1080.mp4`, `output/final-9x16.mp4`, etc.
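
One way to produce the per-platform variants is a scale-then-center-crop ffmpeg filter chain. The sketch below is illustrative only: the `PLATFORM_SIZES` table, the output naming, and the function name are assumptions, and a center crop can cut off important content, so a real masterer may prefer padding or a reframed render.

```python
# Hypothetical platform table; the sizes are common defaults, not from the source.
PLATFORM_SIZES = {
    "youtube": (1920, 1080),
    "tiktok": (1080, 1920),
    "instagram-square": (1080, 1080),
}


def variant_cmd(src: str, platform: str, out_dir: str = "output") -> list:
    """Build an ffmpeg argv that fills the target frame, then center-crops."""
    width, height = PLATFORM_SIZES[platform]
    video_filter = (
        f"scale={width}:{height}:force_original_aspect_ratio=increase,"
        f"crop={width}:{height}"
    )
    dst = f"{out_dir}/final-{platform}.mp4"
    return ["ffmpeg", "-y", "-i", src, "-vf", video_filter, "-c:a", "copy", dst]


if __name__ == "__main__":
    for name in PLATFORM_SIZES:
        print(" ".join(variant_cmd("output/final.mp4", name)))
```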
## QA roles
### reviewer
A neutral quality gate. Reads the brief, watches the cut, comments
specifically on what's off (pacing, sync, brand alignment, technical
quality). Distinct from the cinematographer (who reviews visuals during
production) and the editor (who reviews for assembly).
- **Toolsets:** kanban, terminal, file, video, vision
- **Skills:** `kanban-worker`
- **Review tools:** `video_analyze` (native clip review via multimodal LLM),
`vision_analyze` (frame/thumbnail review), ffprobe
- **Outputs:** `review-notes.md`, comments on tasks
### brand-cop
Reviews specifically for brand compliance — colors, typography, voice. Use
when the brand guidelines are detailed and a generic reviewer might miss
violations.
- **Toolsets:** kanban, file
- **Skills:** `kanban-worker`
- **Outputs:** comments + `brand-review.md`
## Composing teams — heuristics
- **Always:** director + at least one renderer + editor.
- **Add writer** if scripted dialogue / narration / on-screen text exceeds a
tagline.
- **Add storyboarder** if the brief has more than 5 distinct beats and the
director hasn't already laid out a beat list.
- **Add cinematographer** if multiple renderer instances need consistent
visual language. (For a single-tool video, the renderer's own skill spec
is enough.)
- **Add image-generator + image-to-video-generator pair** for narrative film
with photorealistic visuals.
- **Add music-supervisor** when music is provided and rhythm matters
(music videos always; explainers sometimes).
- **Add voice-talent** for any voiceover / narrative dialogue.
- **Add audio-mixer** when there are 2+ audio sources (VO + music, music + SFX).
- **Add captioner** for accessibility-priority projects (explainer, tutorial,
any platform that defaults to muted playback).
- **Add reviewer** for high-stakes projects. Skip for quick experimental loops.
- **Add masterer** when multiple platform deliverables are needed.
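
The heuristics above can be sketched as a small function. This is only an illustration: the `brief` dictionary keys (`has_script`, `beats`, `renderer_count`, and so on) are hypothetical names invented for this example and are not part of the plan.json schema.

```python
def compose_team(brief: dict) -> list:
    """Pick role archetypes from brief traits, following the heuristics list.

    All brief keys used here are hypothetical and for illustration only.
    """
    team = ["director", "renderer", "editor"]  # always present
    if brief.get("has_script"):
        team.append("writer")
    if brief.get("beats", 0) > 5 and not brief.get("director_beat_list"):
        team.append("storyboarder")
    if brief.get("renderer_count", 1) > 1:
        team.append("cinematographer")
    if brief.get("photoreal_narrative"):
        team += ["image-generator", "image-to-video-generator"]
    if brief.get("music_driven"):
        team.append("music-supervisor")
    if brief.get("voiceover"):
        team.append("voice-talent")
    if brief.get("audio_sources", 0) >= 2:
        team.append("audio-mixer")
    if brief.get("needs_captions"):
        team.append("captioner")
    if brief.get("high_stakes"):
        team.append("reviewer")
    if brief.get("platform_targets", 1) > 1:
        team.append("masterer")
    return team


if __name__ == "__main__":
    print(compose_team({"beats": 8, "voiceover": True, "audio_sources": 2}))
```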
## Anti-patterns
- **One renderer doing everything.** If scenes use very different tools
(ASCII + 3D + motion graphics), use specialized renderer variants. The
renderer loads ONE creative skill at a time; mixing styles in a single
renderer causes thrashing.
- **A separate profile per scene.** No. Profiles are per-role, not per-scene.
Eight scenes use one or two renderer profiles, not eight.
- **A "general" profile that does everything.** Worse than no specialization.
The kanban routing breaks down if every task fits every profile.
- **No reviewer for important deliverables.** Saves an hour of pipeline time
but ships flaws.
@@ -0,0 +1,317 @@
# Tool Matrix — Skills + Toolsets per Role
Maps each role archetype to the Hermes skills it should `always_load` and the
toolsets it needs. Only references skills that ship in the public hermes-agent
repository (under `skills/` or `optional-skills/`). External APIs and CLIs are
called from the terminal toolset; they don't appear in `always_load`.
## Hermes skills relevant to video production
### Visual / rendering skills (`hermes-agent/skills/creative/`)
| Skill | What it does | Best fit for |
|-------|--------------|--------------|
| `ascii-video` | Production pipeline for ASCII art video — generative, audio-reactive, video-to-ASCII | Renderer for ASCII / terminal / retro pixel content; cinematographer for ASCII projects |
| `ascii-art` | Static ASCII art generation | Concept artist for ASCII style frames; secondary tool for ASCII renderer |
| `manim-video` | Manim CE animations — math, algorithms, 3Blue1Brown-style explainers | Renderer for math, algorithm walkthroughs, technical concept explainers |
| `p5js` | p5.js sketches — generative art, shaders, interactive, 3D | Renderer for generative art, particle systems, organic motion, web-canvas content |
| `comfyui` | Generate images, video, audio with ComfyUI workflows (image-to-image, image-to-video, etc.) | image-generator, image-to-video-generator, or general renderer for AI-generated content |
| `touchdesigner-mcp` | Control a running TouchDesigner instance — real-time visuals, audio-reactive installation art, VJ | Renderer for real-time/audio-reactive content; installation art; live performance |
| `blender-mcp` *(optional)* | Control Blender 4.3+ via MCP — 3D modeling, animation, rendering | Renderer for 3D scenes, photoreal environments, character animation |
| `pixel-art` | Pixel art with era palettes (NES, Game Boy, PICO-8) | Renderer for retro game aesthetic; concept artist for pixel-style frames |
| `baoyu-comic` | Knowledge-comic generation (educational, biography, tutorial) | Renderer for comic-style narrative; explainer in panel form |
| `baoyu-infographic` | Infographic generation | Renderer for data-driven explainer scenes |
| `meme-generation` *(optional)* | Generate meme images by overlaying text on templates | Generator for satirical/social content; meme-style stills |
### Design / pre-production skills (`hermes-agent/skills/creative/`)
| Skill | What it does | Best fit for |
|-------|--------------|--------------|
| `claude-design` | Design one-off HTML artifacts (landing, deck, prototype) | Concept artist for product video style frames; storyboarder for UI-heavy content |
| `design-md` | Design markdown docs | Concept artist documenting visual specs |
| `popular-web-designs` | Reference patterns for popular web designs | Concept artist; cinematographer when matching a known UI aesthetic |
| `sketch` | Throwaway HTML mockups (2-3 design variants to compare) | Concept artist exploring directions; storyboarder for UI flows |
| `excalidraw` | Excalidraw-style hand-drawn diagrams | Storyboarder; concept artist for sketch-style frames |
| `architecture-diagram` | Software architecture diagrams | Storyboarder for technical content; explainer scenes about systems |
| `concept-diagrams` *(optional)* | Flat, minimal SVG diagrams (educational visual language; physics, chemistry, math, anatomy, etc.) | Renderer / storyboarder for explainer scenes with clean educational diagrams |
| `pretext` | Mathematical/scientific content authoring | Writer / cinematographer for technical-explainer pretexts |
| `creative-ideation` | Constraint-driven project ideation | Director / cinematographer when the brief is wide-open and needs framing |
| `humanizer` | Strip AI-isms from text, add real voice | Writer / copywriter post-process to avoid AI-tells in scripts and VO copy |
### Audio / media skills (`hermes-agent/skills/creative/` + `skills/media/`)
| Skill | What it does | Best fit for |
|-------|--------------|--------------|
| `songwriting-and-ai-music` | Songwriting craft + Suno prompt patterns | Music supervisor when commissioning a track via Suno |
| `heartmula` | Open-source music generation (Apache-2.0, Suno-like) | Music supervisor generating bespoke tracks without external APIs |
| `songsee` | Spectrograms, mel/chroma/MFCC of audio files | Music supervisor analyzing tracks; foley-designer designing to a beat; editor visualizing a mix |
| `spotify` | Spotify control — play, search, queue, manage playlists | Music supervisor sourcing existing tracks; reference research |
| `youtube-content` | Fetch transcripts + transform to chapters/summaries/posts | Documentary cut, content adaptation, research for explainers |
| `gif-search` | Find existing GIFs | Editor / concept artist sourcing references |
| `gifs` | GIF tooling | Masterer producing GIF deliverables |
### Kanban infrastructure (`hermes-agent/skills/devops/`)
| Skill | What it does | When to load |
|-------|--------------|--------------|
| `kanban-orchestrator` | Decomposition playbook + anti-temptation rules for orchestrator profiles | Director only |
| `kanban-worker` | Pitfalls, examples, edge cases for kanban workers (deeper than auto-injected guidance) | Any profile — load when handling tricky multi-step workflows |
The kanban plugin auto-injects baseline orchestration guidance into every
worker's system prompt — the `kanban_create` fan-out pattern, claim/handoff
lifecycle, and the "decompose, don't execute" rule for orchestrators.
`kanban-orchestrator` and `kanban-worker` are deeper playbooks loaded when a
profile needs them.
## External tools (called from terminal toolset)
These are **not** Hermes skills but external CLIs / APIs that profiles invoke.
They don't appear in `always_load`; instead the role's terminal commands hit
them directly.
| Tool | What it does | Profile that uses it |
|------|--------------|----------------------|
| `ffmpeg` | Video / audio encode, splice, mux | renderer, editor, audio-mixer, masterer |
| `ffprobe` | Inspect media | All media-touching profiles |
| Whisper (CLI or API) | Speech-to-text for captions | captioner |
| Text-to-image API (FAL / Replicate / OpenAI / Midjourney) | Stills generation | image-generator (alternative to local `comfyui`) |
| Image-to-video API (Runway / Kling / Luma / Pika) | Animate stills | image-to-video-generator |
| Text-to-speech API (ElevenLabs / OpenAI TTS / etc.) | Voiceover generation | voice-talent |
| Suno API or web | Track composition (paired with `songwriting-and-ai-music`) | music-supervisor |
| Remotion CLI (`npx remotion render`) | React-based motion graphics | renderer-motion-graphics |
| Manim CE (`manim`) | Math animation render (driven by `manim-video` skill's recipes) | renderer-manim |
| Blender (`blender -b`) | 3D rendering (alternative to `blender-mcp`) | renderer-3d |
## Built-in Hermes tools for media review
These are native Hermes tools — not invoked via terminal but through their own
toolsets. Enable them per-profile by adding the toolset to the profile config.
| Tool | Toolset | What it does | Profile that uses it |
|------|---------|--------------|----------------------|
| `video_analyze` | `video` (opt-in — `hermes tools enable video`) | Native video understanding — sends full clip to a multimodal LLM (Gemini via OpenRouter) for review without frame extraction. Supports mp4, webm, mov, avi, mkv. 50 MB cap. Model: `AUXILIARY_VIDEO_MODEL` env → `AUXILIARY_VISION_MODEL` fallback. | reviewer, cinematographer, editor |
| `vision_analyze` | `vision` (core — enabled by default) | Image/frame analysis — review stills, thumbnails, exported frames. Already available to all profiles without opt-in. | reviewer, cinematographer, concept-artist |
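
Because `video_analyze` has a format list and a size cap, a profile could run a quick pre-flight check before submitting a clip. The following is a minimal sketch under stated assumptions: the helper name is hypothetical, and "50 MB" is interpreted here as 50 × 1024 × 1024 bytes, which the source does not specify.

```python
from pathlib import Path
from typing import Optional

SUPPORTED_EXTS = {".mp4", ".webm", ".mov", ".avi", ".mkv"}
MAX_BYTES = 50 * 1024 * 1024  # assumption: the 50 MB cap is read as 50 MiB


def preflight(path: str, size_bytes: Optional[int] = None) -> list:
    """Return a list of problems; an empty list means the clip looks acceptable."""
    clip = Path(path)
    problems = []
    if clip.suffix.lower() not in SUPPORTED_EXTS:
        problems.append(f"unsupported container: {clip.suffix or '(none)'}")
    if size_bytes is None:
        size_bytes = clip.stat().st_size  # read the size from disk if not given
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_BYTES}-byte cap")
    return problems


if __name__ == "__main__":
    print(preflight("scenes/scene-01/clip.mp4", size_bytes=12_000_000))
```

When a clip exceeds the cap, one option is to re-encode a lower-bitrate review proxy with ffmpeg and analyze that instead.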
## Standard toolset configurations per role
### director
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-orchestrator
```
The director's terminal access is conventional but the SOUL.md rules forbid
execution. Audit logs catch violations.
### writer / copywriter
```yaml
toolsets:
  - kanban
  - file
skills:
  always_load:
    - kanban-worker
    - humanizer  # post-process scripts to strip AI-tells
```
No terminal — writers don't need it.
### concept-artist
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
    # plus one or more (style-dependent):
    # - claude-design (UI / web product video)
    # - sketch (quick mockup variants)
    # - excalidraw (hand-drawn frames)
    # - ascii-art (ASCII style frames)
    # - pixel-art (retro/game aesthetic)
    # - popular-web-designs (matching known web aesthetic)
    # - design-md (text-based design docs)
```
### storyboarder
```yaml
toolsets:
  - kanban
  - file
skills:
  always_load:
    - kanban-worker
    # one of:
    # - excalidraw (sketch storyboards)
    # - architecture-diagram (technical/system content)
    # - concept-diagrams (educational / scientific content)
```
### cinematographer
```yaml
toolsets:
  - kanban
  - terminal
  - file
  - video   # video_analyze — review full clips natively
  - vision  # vision_analyze — review stills / exported frames
skills:
  always_load:
    - kanban-worker
    # the visual skill that matches the project, e.g.:
    # - ascii-video (ASCII projects)
    # - manim-video (math/explainer)
    # - p5js (generative)
    # - comfyui (AI-generated visuals)
    # - blender-mcp (3D)
    # - touchdesigner-mcp (real-time/installation)
```
### renderer (specialized variants)
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
    # ONE skill per renderer variant (or empty for external-API renderers):
    # - ascii-video (renderer-ascii)
    # - manim-video (renderer-manim)
    # - p5js (renderer-p5js)
    # - comfyui (renderer-comfyui — img/video AI gen)
    # - touchdesigner-mcp (renderer-touchdesigner)
    # - blender-mcp (renderer-3d)
    # - pixel-art (renderer-pixel)
    # - baoyu-comic (renderer-comic)
    # - meme-generation (renderer-meme)
```
For external-API renderers (image-to-video-generator using Runway, voice-talent
using ElevenLabs, renderer-motion-graphics using Remotion), `always_load` only
contains `kanban-worker` — the role's work is API-driven and the API key +
terminal commands suffice.
For multi-skill renderer setups (rare — usually one variant per skill is
cleaner) use `--skill <name>` on individual `kanban_create` calls to override
which skill loads for that specific task.
### image-generator / image-to-video-generator / voice-talent
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
    # for image-generator that drives ComfyUI locally:
    # - comfyui
env_required:
  # populate based on the chosen API:
  - FAL_KEY             # or REPLICATE_API_TOKEN, OPENAI_API_KEY for image-gen
  - RUNWAY_API_KEY      # or KLING_API_KEY, LUMA_API_KEY for image-to-video
  - ELEVENLABS_API_KEY  # or OPENAI_API_KEY for TTS
```
If the user's setup has ComfyUI installed locally, the `comfyui` skill can
replace the external image-gen API entirely (cheaper, more control, supports
custom workflows for image-to-video too).
### music-supervisor
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
    - songsee  # spectrograms / audio analysis
    # plus (depending on what the project needs):
    # - songwriting-and-ai-music (commissioning Suno tracks)
    # - heartmula (commissioning open-source local generation)
    # - spotify (sourcing existing tracks)
```
### editor / audio-mixer / captioner / masterer
```yaml
toolsets:
  - kanban
  - terminal
  - file
  - video   # video_analyze — editor reviews assembled cuts natively
  - vision  # vision_analyze — spot-check frames
skills:
  always_load:
    - kanban-worker
```
These are mostly ffmpeg-driven; no special skill needed beyond `kanban-worker`.
For captioner add Whisper invocation patterns to the SOUL.md.
### reviewer / brand-cop
```yaml
toolsets:
  - kanban
  - terminal  # for media inspection (ffprobe, etc.)
  - file
  - video     # video_analyze — review full clips natively
  - vision    # vision_analyze — review stills / exported frames
skills:
  always_load:
    - kanban-worker
```
## API key requirements
Track these in the project setup. The setup script should verify each required
key is present in `~/.hermes/.env` (or macOS Keychain) before firing the kanban.
| Service | Env var | Used by |
|---------|---------|---------|
| ElevenLabs | `ELEVENLABS_API_KEY` | voice-talent |
| OpenAI | `OPENAI_API_KEY` | image-generator (DALL-E), voice-talent (TTS) |
| OpenRouter | `OPENROUTER_API_KEY` | reviewer, cinematographer, editor (`video_analyze` routes through `AUXILIARY_VIDEO_MODEL` → OpenRouter) |
| FAL | `FAL_KEY` | image-generator (FAL flux models) |
| Replicate | `REPLICATE_API_TOKEN` | image-generator (alternate provider) |
| Runway | `RUNWAY_API_KEY` | image-to-video-generator |
| Kling | `KLING_API_KEY` | image-to-video-generator (alternate) |
| Luma | `LUMA_API_KEY` | image-to-video-generator (alternate) |
| Suno | `SUNO_API_KEY` | music-supervisor (paired with `songwriting-and-ai-music`) |
| Spotify | `SPOTIFY_CLIENT_ID` + `SPOTIFY_CLIENT_SECRET` | music-supervisor (paired with `spotify` skill) |
| Anthropic | `ANTHROPIC_API_KEY` | every Hermes profile (Claude) |
If a key is missing, prompt the user to add it. Storage methods, in order of
preference: macOS Keychain → `~/.hermes/.env` → environment variable.
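
A setup script could verify keys along these lines. This is a minimal sketch, not the project's actual check: the function names are hypothetical, the `.env` parsing handles only simple `KEY=value` lines, and the macOS Keychain lookup mentioned above is deliberately left out.

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(required, env_text: str, environ=None) -> list:
    """Return required keys found neither in the .env text nor the environment."""
    environ = os.environ if environ is None else environ
    file_values = parse_env_text(env_text)
    return [k for k in required if not (file_values.get(k) or environ.get(k))]


if __name__ == "__main__":
    env_path = Path.home() / ".hermes" / ".env"
    text = env_path.read_text() if env_path.exists() else ""
    print(missing_keys(["ELEVENLABS_API_KEY", "OPENROUTER_API_KEY"], text))
```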
## Skill version pinning
If a specific skill version is desired, pass it via the per-task
`--skill <name>=<version>` flag. The default is whatever's installed.
## Adding a new skill to the matrix
When a new Hermes-public video skill ships:
1. Add a row to the relevant table at the top of this file
2. If it warrants a specialized renderer variant, add to `role-archetypes.md`
3. Update relevant per-style examples in `examples.md`
@@ -0,0 +1,501 @@
#!/usr/bin/env python3
"""
Bootstrap a video production kanban from a structured plan JSON.
Reads a plan.json describing the team + brief, expands templates from
../assets/, and writes a setup.sh that creates Hermes profiles and fires the
initial kanban task.
Profile-config patching, SOUL.md-per-profile, TEAM.md task-graph convention,
and the `hermes kanban create --workspace dir:` initial-task pattern are
adapted from alt-glitch's NousResearch/kanban-video-pipeline.
Usage:
bootstrap_pipeline.py plan.json [--out setup.sh]
The plan.json schema is documented inline below; see the `validate_plan`
function. A minimal example:
    {
      "title": "Q3 Product Teaser",
      "slug": "q3-product-teaser",
      "tenant": "q3-product-teaser",
      "duration_s": 30,
      "aspect": "1:1",
      "resolution": "1080x1080",
      "fps": 30,
      "team": [
        {
          "profile": "director",
          "role": "director",
          "toolsets": ["kanban", "terminal", "file"],
          "skills": [],
          "responsibilities": "...",
          "inputs": "brief.md, TEAM.md, taste/",
          "outputs": "kanban tasks for the team"
        },
        ...
      ],
      "scenes": [
        {"n": 1, "time": "0:00-0:08", "content": "...", "tool": "renderer-ascii"},
        ...
      ],
      "audio": {"approach": "voiceover + music bed", "vo": "ElevenLabs Lily",
                "music": "license-free", "sfx": "n/a"},
      "deliverables": [
        {"format": "mp4", "resolution": "1080x1080", "notes": "primary"}
      ],
      "api_keys_required": ["ELEVENLABS_API_KEY", "OPENROUTER_API_KEY"],
      "brief_extra": {
        "concept_one_liner": "...",
        "emotional_north_star": "...",
        "visual_refs": "...",
        "tone": "...",
        "brand_constraints": "..."
      }
    }
"""
from __future__ import annotations
import argparse
import json
import os
import re
import sys
from pathlib import Path
ASSETS_DIR = Path(__file__).resolve().parent.parent / "assets"
def load_template(name: str) -> str:
    return (ASSETS_DIR / name).read_text()
PROFILE_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")
SLUG_RE = re.compile(r"^[a-z0-9][a-z0-9-]+$")
def validate_plan(plan: dict) -> list[str]:
    """Return a list of validation error strings; empty list = valid."""
    errors = []
    required_top = ["title", "slug", "tenant", "duration_s", "aspect",
                    "resolution", "fps", "team", "scenes", "audio",
                    "deliverables"]
    for k in required_top:
        if k not in plan:
            errors.append(f"missing required key: {k}")
    if "team" in plan:
        if not isinstance(plan["team"], list) or not plan["team"]:
            errors.append("team must be a non-empty list")
        else:
            roles = [t.get("role") for t in plan["team"]]
            if "director" not in roles:
                errors.append("team must include a director role")
            seen_profiles = set()
            for i, t in enumerate(plan["team"]):
                for k in ["profile", "role", "toolsets", "skills",
                          "responsibilities"]:
                    if k not in t:
                        errors.append(f"team[{i}] missing {k}")
                # Profile name must match Hermes's regex (lowercase
                # alphanumeric + hyphens + underscores, up to 64 chars).
                if "profile" in t:
                    if not PROFILE_NAME_RE.match(t["profile"]):
                        errors.append(
                            f"team[{i}].profile {t['profile']!r} must match "
                            f"[a-z0-9][a-z0-9_-]{{0,63}} per Hermes profile rules"
                        )
                    if t["profile"] in seen_profiles:
                        errors.append(
                            f"team[{i}].profile {t['profile']!r} is duplicated"
                        )
                    seen_profiles.add(t["profile"])
                # Toolsets / skills must be lists, not strings.
                if "toolsets" in t and not isinstance(t["toolsets"], list):
                    errors.append(
                        f"team[{i}].toolsets must be a list of strings"
                    )
                if "skills" in t and not isinstance(t["skills"], list):
                    errors.append(
                        f"team[{i}].skills must be a list of strings"
                    )
    if "slug" in plan:
        if not SLUG_RE.match(plan["slug"]):
            errors.append("slug must be lowercase, hyphenated, "
                          "starting with [a-z0-9]")
    return errors
def render_brief(plan: dict) -> str:
    """Render brief.md from the plan."""
    tmpl = load_template("brief.md.tmpl")
    extra = plan.get("brief_extra", {})
    # Scene table rows
    scene_rows = []
    for s in plan["scenes"]:
        scene_rows.append(
            f"| {s.get('n', '?')} | {s.get('time', '?')} | "
            f"{s.get('content', '')} | {s.get('tool', '')} | "
            f"{s.get('audio', '')} | {s.get('notes', '')} |"
        )
    scene_table = "\n".join(scene_rows) if scene_rows else "_(none yet)_"
    # Deliverable rows
    deliv_rows = []
    for d in plan["deliverables"]:
        deliv_rows.append(
            f"| {d.get('format', '?')} | {d.get('resolution', '?')} | "
            f"{d.get('notes', '')} |"
        )
    deliv_table = "\n".join(deliv_rows) if deliv_rows else "_(none)_"
    # Replacements (single-pass)
    replacements = {
        "TITLE": plan["title"],
        "SLUG": plan["slug"],
        "TENANT": plan["tenant"],
        "WORKSPACE": f"~/projects/video-pipeline/{plan['slug']}",
        "ONE_LINE_PITCH": extra.get("concept_one_liner", "_(TBD)_"),
        "EMOTIONAL_NORTH_STAR": extra.get("emotional_north_star", "_(TBD)_"),
        "DURATION_S": str(plan["duration_s"]),
        "ASPECT": plan["aspect"],
        "RESOLUTION": plan["resolution"],
        "FPS": str(plan["fps"]),
        "PLATFORMS": extra.get("platforms", "_(TBD)_"),
        "DEADLINE": extra.get("deadline", "_(none)_"),
        "QUALITY_BAR": extra.get("quality_bar", "polished"),
        "VISUAL_REFS": extra.get("visual_refs", "_(none)_"),
        "TONE": extra.get("tone", "_(TBD)_"),
        "BRAND_CONSTRAINTS": extra.get("brand_constraints", "_(none)_"),
        "AESTHETIC_RULES": extra.get("aesthetic_rules", "_(TBD)_"),
        "AUDIO_APPROACH": plan["audio"].get("approach", "_(TBD)_"),
        "VO_DETAILS": plan["audio"].get("vo", "_(n/a)_"),
        "MUSIC_DETAILS": plan["audio"].get("music", "_(n/a)_"),
        "SFX_DETAILS": plan["audio"].get("sfx", "_(n/a)_"),
        "PRIMARY_FORMAT": plan["deliverables"][0]["format"],
        "PRIMARY_RES": plan["deliverables"][0]["resolution"],
        "ALT_FORMAT_1": (plan["deliverables"][1]["format"]
                         if len(plan["deliverables"]) > 1 else "_(none)_"),
        "ALT_RES_1": (plan["deliverables"][1]["resolution"]
                      if len(plan["deliverables"]) > 1 else ""),
        "ALT_NOTES_1": (plan["deliverables"][1].get("notes", "")
                        if len(plan["deliverables"]) > 1 else ""),
        "API_KEYS_REQUIRED": ", ".join(plan.get("api_keys_required", [])) or "none",
        "EXT_DEPS": extra.get("ext_deps", "ffmpeg, Python 3.11+"),
        "SOURCE_ASSETS": extra.get("source_assets", "_(none)_"),
    }
    out = tmpl
    for k, v in replacements.items():
        out = out.replace("{{" + k + "}}", str(v))
    # Scene + deliv tables: replace the placeholder row in the template.
    # A callable replacement keeps backslashes in scene text literal instead
    # of letting re.sub interpret them as escape sequences.
    out = re.sub(
        r"\|\s*1\s*\|\s*0:000:0X.+?\n\|\s*2\s*\|.+?\n",
        lambda _match: scene_table + "\n",
        out, flags=re.DOTALL,
    )
    return out
def render_team_md(plan: dict) -> str:
"""Render TEAM.md from the team list + scene → tool mapping."""
lines = [f"# Team & Task Graph — {plan['title']}", "", "## Team", ""]
for t in plan["team"]:
skills = (
f"loads `{', '.join(t['skills'])}`"
if t["skills"] else "no skills required"
)
lines.append(
f"- `{t['profile']}` — {t['responsibilities']} ({skills})"
)
lines.extend(["", "## Task Graph", "", "```"])
# Build a simple task graph based on conventions
profiles_by_role = {t["role"]: t["profile"] for t in plan["team"]}
director = profiles_by_role.get("director", "director")
lines.append(f"T0 {director} — decompose")
next_id = 1
parents_for_renderer: list[str] = ["T0"]
if "cinematographer" in profiles_by_role:
cid = f"T{next_id}"
lines.append(
f"{cid:5} {profiles_by_role['cinematographer']} — visual spec for all scenes (parent: T0)"
)
parents_for_renderer = [cid]
next_id += 1
if "music-supervisor" in profiles_by_role:
cid = f"T{next_id}"
lines.append(
f"{cid:5} {profiles_by_role['music-supervisor']} — track analysis + beats.json (parent: T0)"
)
next_id += 1
ms_id = cid
else:
ms_id = None
# Scenes
scene_ids = []
for s in plan["scenes"]:
cid = f"T{next_id}"
renderer_profile = s.get("tool") or "renderer"
# Lookup the actual profile name
for t in plan["team"]:
if t["role"] == renderer_profile or t["profile"] == renderer_profile:
renderer_profile = t["profile"]
break
parents = parents_for_renderer + ([ms_id] if ms_id else [])
parent_str = ", ".join(parents)
lines.append(
f"{cid:5} {renderer_profile} — scene {s.get('n', '?')}: "
f"{s.get('content', '')[:50]} (parents: {parent_str})"
)
scene_ids.append(cid)
next_id += 1
# VO + audio mix
if "voice-talent" in profiles_by_role:
vo_id = f"T{next_id}"
lines.append(f"{vo_id:5} {profiles_by_role['voice-talent']} — narration (parent: T0)")
next_id += 1
else:
vo_id = None
if "audio-mixer" in profiles_by_role:
am_id = f"T{next_id}"
am_parents = [p for p in [ms_id, vo_id] if p]
lines.append(
f"{am_id:5} {profiles_by_role['audio-mixer']} — mix audio (parents: {', '.join(am_parents)})"
)
next_id += 1
else:
am_id = None
# Editor
if "editor" in profiles_by_role:
ed_id = f"T{next_id}"
ed_parents = scene_ids + [p for p in [am_id, vo_id, ms_id] if p and p not in scene_ids]
lines.append(
f"{ed_id:5} {profiles_by_role['editor']} — assemble + mux (parents: {', '.join(ed_parents)})"
)
next_id += 1
else:
ed_id = None
# Captioner
if "captioner" in profiles_by_role and ed_id:
cap_id = f"T{next_id}"
lines.append(
f"{cap_id:5} {profiles_by_role['captioner']} — SRT + burn (parent: {ed_id})"
)
next_id += 1
last = cap_id
else:
last = ed_id
# Reviewer
if "reviewer" in profiles_by_role and last:
rv_id = f"T{next_id}"
lines.append(
f"{rv_id:5} {profiles_by_role['reviewer']} — final QA (parent: {last})"
)
lines.append("```")
lines.extend([
"",
"## Per-task workspace requirement",
"",
"All `kanban_create` calls MUST pass:",
"```",
'workspace_kind="dir"',
f'workspace_path="$HOME/projects/video-pipeline/{plan["slug"]}"',
f'tenant="{plan["tenant"]}"',
"```",
])
return "\n".join(lines)
def render_setup_sh(plan: dict, brief_md: str, team_md: str) -> str:
"""Render setup.sh from the plan."""
tmpl = load_template("setup.sh.tmpl")
# API key checks
key_checks = []
for key in plan.get("api_keys_required", []):
key_checks.append(f'check_key {key} hermes {key} || exit 1')
key_checks_str = "\n".join(key_checks) if key_checks else "# (no API keys required)"
# Scene dirs
scene_dir_lines = []
for s in plan["scenes"]:
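n = s.get("n", "?")
# Zero-pad integer scene numbers; fall back to the raw value otherwise,
# since formatting the "?" default with :02d would raise ValueError.
scene_tag = f"{n:02d}" if isinstance(n, int) else str(n)
scene_dir_lines.append(f'mkdir -p "$WORKSPACE/scenes/scene-{scene_tag}"/checkpoints')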
scene_dirs = "\n".join(scene_dir_lines) if scene_dir_lines else ""
# Profile create
profile_creates = []
for t in plan["team"]:
profile_creates.append(
f'hermes profile create {t["profile"]} --clone 2>/dev/null || true'
)
# Profile config — emit JSON arrays so the bash function can pass them
# safely through to the Python YAML patcher.
profile_configs = []
for t in plan["team"]:
ts_json = json.dumps(t["toolsets"])
sk_json = json.dumps(t["skills"])
# repr() emits single-quoted strings, which bash accepts as-is provided
# no profile, toolset, or skill name contains a single quote or backslash.
profile_configs.append(
f"configure_profile {t['profile']!r} {ts_json!r} {sk_json!r}"
)
# SOUL writes — uses heredocs per profile
soul_writes = []
for t in plan["team"]:
soul_writes.append(
f'cat > "$HOME/.hermes/profiles/{t["profile"]}/SOUL.md" <<\'SOUL_EOF\'\n'
f"{render_soul_md(t, plan)}\n"
f"SOUL_EOF\n"
f'echo " ✓ SOUL.md for {t["profile"]}"'
)
# Taste writes (placeholder; real content optional)
taste_writes = (
'cat > "$WORKSPACE/taste/brand-guide.md" <<\'TASTE_EOF\'\n'
'# Brand Guide\n\n'
'_(Populate with project-specific colors, typography, motion rules)_\n'
'TASTE_EOF\n'
'cat > "$WORKSPACE/taste/emotional-dna.md" <<\'DNA_EOF\'\n'
'# Emotional DNA\n\n'
'_(What this piece should FEEL like — populate from the brief.)_\n'
'DNA_EOF'
)
# Asset copies — leave empty by default; user fills in
asset_copies = "# Add cp/rsync commands here for any provided assets"
out = tmpl
out = out.replace("{{TITLE}}", plan["title"])
out = out.replace("{{SLUG}}", plan["slug"])
out = out.replace("{{TENANT}}", plan["tenant"])
out = out.replace("{{WORKSPACE}}", f"~/projects/video-pipeline/{plan['slug']}")
out = out.replace("{{KEY_CHECKS}}", key_checks_str)
out = out.replace("{{SCENE_DIRS}}", scene_dirs)
out = out.replace("{{PROFILE_CREATE_COMMANDS}}", "\n".join(profile_creates))
out = out.replace("{{PROFILE_CONFIG_COMMANDS}}", "\n".join(profile_configs))
out = out.replace("{{SOUL_WRITES}}", "\n".join(soul_writes))
out = out.replace("{{BRIEF_CONTENTS}}", brief_md)
out = out.replace("{{TEAM_CONTENTS}}", team_md)
out = out.replace("{{TASTE_WRITES}}", taste_writes)
out = out.replace("{{ASSET_COPIES}}", asset_copies)
return out
def render_soul_md(team_member: dict, plan: dict) -> str:
"""Render a profile's SOUL.md from a team member dict + plan context."""
tmpl = load_template("soul.md.tmpl")
role = team_member["role"]
common_rules = (
"- **Read the brief and team graph** before doing anything else.\n"
"- **Pass `workspace_kind=\"dir\"` and `workspace_path` on every "
"`kanban_create` call.** This keeps the team in one shared workspace.\n"
f"- **Use tenant `{plan['tenant']}`** on every kanban call.\n"
"- **Write outputs to predictable paths.** Other profiles depend on "
"your filename conventions.\n"
"- **Emit heartbeats** during long-running work. Renderers should "
"report frame counts; editors should report assembly progress.\n"
)
if role == "director":
common_rules += (
"- **Do not execute the work yourself.** For every concrete task, "
"create a kanban task and assign it to the appropriate profile.\n"
"- **Decompose, route, comment, approve — that's the whole job.**\n"
"- **Read TEAM.md** for the canonical task graph. Do not invent "
"new roles unless the brief truly demands it.\n"
"- **Load the `kanban-orchestrator` skill** for the deeper "
"decomposition playbook beyond the auto-injected baseline.\n"
)
common_commands = (
"```bash\n"
"# Inspect a clip\n"
"ffprobe -v quiet -show_entries format=duration -show_entries "
"stream=codec_name,width,height,r_frame_rate <file.mp4>\n"
"\n"
"# Extract a frame for QA\n"
"ffmpeg -y -i <input.mp4> -vf \"select='eq(n,30)'\" -vsync vfr <out.png>\n"
"```"
)
out = tmpl
out = out.replace("{{ROLE_NAME}}", role)
out = out.replace("{{ROLE_RESPONSIBILITIES}}", team_member["responsibilities"])
out = out.replace("{{INPUTS_READ}}", team_member.get("inputs", "_(see brief)_"))
out = out.replace("{{OUTPUTS_PRODUCED}}", team_member.get("outputs", "_(see brief)_"))
out = out.replace("{{TOOLSETS}}", ", ".join(team_member["toolsets"]))
out = out.replace(
"{{SKILLS}}",
", ".join(team_member["skills"]) if team_member["skills"] else "(none)"
)
out = out.replace(
"{{EXTERNAL_TOOLS}}",
team_member.get("external_tools", "ffmpeg, ffprobe (via terminal)")
)
out = out.replace(
"{{ROLE_RULES}}",
team_member.get("role_rules", "_(see TEAM.md and brief.md)_")
)
out = out.replace("{{COMMON_RULES}}", common_rules)
out = out.replace("{{COMMON_COMMANDS}}", common_commands)
return out
def main():
ap = argparse.ArgumentParser(description=__doc__,
formatter_class=argparse.RawDescriptionHelpFormatter)
ap.add_argument("plan_json", help="Path to plan.json")
ap.add_argument("--out", default="setup.sh",
help="Output path for setup.sh (default: ./setup.sh)")
ap.add_argument("--brief-out", default=None,
help="Write brief.md alongside (default: skipped)")
ap.add_argument("--team-out", default=None,
help="Write TEAM.md alongside (default: skipped)")
args = ap.parse_args()
plan = json.loads(Path(args.plan_json).read_text())
errors = validate_plan(plan)
if errors:
print("Plan validation failed:", file=sys.stderr)
for e in errors:
print(f" - {e}", file=sys.stderr)
sys.exit(2)
brief = render_brief(plan)
team = render_team_md(plan)
setup = render_setup_sh(plan, brief, team)
Path(args.out).write_text(setup)
os.chmod(args.out, 0o755)
print(f"Wrote {args.out}")
if args.brief_out:
Path(args.brief_out).write_text(brief)
print(f"Wrote {args.brief_out}")
if args.team_out:
Path(args.team_out).write_text(team)
print(f"Wrote {args.team_out}")
if __name__ == "__main__":
main()
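The `configure_profile` lines above rely on `repr()` for shell quoting. A minimal sketch of the same line built with `shlex.quote`, which also handles values containing single quotes. The helper name `configure_profile_line` and the sample skill name are hypothetical and not part of this change.

```python
from __future__ import annotations

import json
import shlex


def configure_profile_line(profile: str, toolsets: list[str], skills: list[str]) -> str:
    """Build one `configure_profile` shell line with POSIX-safe quoting.

    Hypothetical helper: mirrors the argument order used in render_setup_sh.
    """
    args = [profile, json.dumps(toolsets), json.dumps(skills)]
    # shlex.quote leaves safe words untouched and single-quotes everything else,
    # escaping embedded single quotes so bash reads each value as one word.
    return "configure_profile " + " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # "it's-a-skill" is a made-up name chosen only to exercise the apostrophe case.
    print(configure_profile_line("renderer", ["terminal"], ["it's-a-skill"]))
```

Splitting the result with `shlex.split` returns the original three arguments, which is a quick way to check the quoting before writing it into `setup.sh`.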
@@ -0,0 +1,195 @@
#!/usr/bin/env python3
"""
Monitor a running video-production kanban. Polls `hermes kanban list` for a
tenant and surfaces issues (stuck tasks with stale heartbeats, tasks over
their runtime cap, repeated retries).
Usage:
monitor.py --tenant <project-slug> [--interval 30]
Prints a periodic snapshot to stdout. Writes alerts to stderr when issues
are detected. Designed to run alongside the kanban; kill with Ctrl-C when
you're satisfied (or script it to stop on completion).
This is best-effort observability. It does not auto-restart tasks; intervention
decisions should remain human/AI-overseen.
"""
from __future__ import annotations
import argparse
import json
import shutil
import subprocess
import sys
import time
from collections import defaultdict
from datetime import datetime, timedelta
def hermes_available() -> bool:
return shutil.which("hermes") is not None
def kanban_list(tenant: str) -> list[dict]:
"""Returns parsed task rows. Falls back to plain stdout parsing if JSON
output isn't supported by the installed hermes CLI."""
try:
out = subprocess.run(
["hermes", "kanban", "list", "--tenant", tenant, "--json"],
capture_output=True, text=True, check=False,
)
if out.returncode == 0 and out.stdout.strip().startswith("["):
return json.loads(out.stdout)
except (FileNotFoundError, json.JSONDecodeError):
pass
# Fallback: textual parse of `hermes kanban list`
out = subprocess.run(
["hermes", "kanban", "list", "--tenant", tenant],
capture_output=True, text=True, check=False,
)
rows = []
for line in out.stdout.splitlines():
line = line.strip()
if not line or line.startswith("#") or "STATUS" in line.upper():
continue
parts = line.split()
if len(parts) >= 4 and parts[0].startswith("t_"):
rows.append({
"id": parts[0],
"status": parts[1] if len(parts) > 1 else "?",
"assignee": parts[2] if len(parts) > 2 else "?",
"title": " ".join(parts[3:]) if len(parts) > 3 else "",
"started_at": None,
"heartbeat_at": None,
"max_runtime_s": None,
})
return rows
def kanban_show(task_id: str) -> dict | None:
out = subprocess.run(
["hermes", "kanban", "show", task_id, "--json"],
capture_output=True, text=True, check=False,
)
if out.returncode != 0:
return None
try:
return json.loads(out.stdout)
except json.JSONDecodeError:
return None
def detect_issues(tasks: list[dict]) -> list[str]:
"""Return a list of issue strings, one per concern."""
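# Assumption: timestamps ending in "Z" are UTC, so compare against a naive
# UTC "now" rather than local time.
from datetime import timezone
now = datetime.now(timezone.utc).replace(tzinfo=None)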
issues: list[str] = []
by_status = defaultdict(list)
for t in tasks:
by_status[t.get("status", "?")].append(t)
# Stuck tasks: RUNNING with no heartbeat in 2 min
for t in by_status.get("running", []) + by_status.get("RUNNING", []):
hb = t.get("heartbeat_at")
if not hb:
continue
try:
hb_dt = datetime.fromisoformat(str(hb).rstrip("Z"))
except ValueError:
continue
if now - hb_dt > timedelta(minutes=2):
issues.append(
f"STUCK: {t['id']} ({t.get('assignee', '?')}) — "
f"no heartbeat in {(now - hb_dt).total_seconds():.0f}s"
)
# Tasks exceeding max_runtime
for t in by_status.get("running", []) + by_status.get("RUNNING", []):
started = t.get("started_at")
max_rt = t.get("max_runtime_s")
if not started or not max_rt:
continue
try:
started_dt = datetime.fromisoformat(str(started).rstrip("Z"))
except ValueError:
continue
elapsed = (now - started_dt).total_seconds()
if elapsed > max_rt:
issues.append(
f"OVERTIME: {t['id']} ({t.get('assignee', '?')}) — "
f"running {elapsed:.0f}s, cap was {max_rt}s"
)
# Repeated retries
for t in tasks:
retries = t.get("retries", 0)
if retries and retries >= 2:
issues.append(
f"FLAPPING: {t['id']} ({t.get('assignee', '?')}) — "
f"retried {retries}× — fix root cause before next run"
)
return issues
def snapshot(tenant: str) -> tuple[list[dict], list[str]]:
tasks = kanban_list(tenant)
issues = detect_issues(tasks)
return tasks, issues
def print_snapshot(tasks: list[dict], issues: list[str]):
counts = defaultdict(int)
for t in tasks:
counts[str(t.get("status", "?")).lower()] += 1
print(f"\n[{datetime.now().strftime('%H:%M:%S')}] "
f"Total: {len(tasks)} | "
+ " | ".join(f"{k}: {v}" for k, v in sorted(counts.items())))
for t in tasks:
bar = "" if str(t.get("status", "")).lower() == "done" else \
"" if str(t.get("status", "")).lower() == "running" else \
"·" if str(t.get("status", "")).lower() == "ready" else \
"" if str(t.get("status", "")).lower() == "failed" else "?"
print(f" {bar} {t.get('id', '?'):14} {t.get('assignee', '?'):20} "
f"{t.get('title', '')[:60]}")
if issues:
print("\n ⚠ ISSUES:", file=sys.stderr)
for i in issues:
print(f" {i}", file=sys.stderr)
def main():
ap = argparse.ArgumentParser(description=__doc__,
formatter_class=argparse.RawDescriptionHelpFormatter)
ap.add_argument("--tenant", required=True,
help="Project tenant slug to monitor")
ap.add_argument("--interval", type=int, default=30,
help="Poll interval in seconds (default: 30)")
ap.add_argument("--once", action="store_true",
help="Print one snapshot and exit (no polling loop)")
args = ap.parse_args()
if not hermes_available():
print("ERROR: 'hermes' CLI not found in PATH", file=sys.stderr)
sys.exit(1)
if args.once:
tasks, issues = snapshot(args.tenant)
print_snapshot(tasks, issues)
sys.exit(0 if not issues else 2)
print(f"Monitoring tenant '{args.tenant}' every {args.interval}s. "
"Ctrl-C to exit.")
try:
while True:
tasks, issues = snapshot(args.tenant)
print_snapshot(tasks, issues)
time.sleep(args.interval)
except KeyboardInterrupt:
print("\nStopped.")
if __name__ == "__main__":
main()
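The stuck-task check in `detect_issues` compares a parsed heartbeat against the current time. A minimal sketch of that comparison using timezone-aware datetimes, assuming a trailing `Z` or a missing offset both mean UTC (the hermes CLI output format is not confirmed here). The names `parse_utc` and `is_stale` are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp into an aware UTC datetime.

    Assumption: a trailing "Z" or a missing offset both mean UTC.
    """
    text = ts.strip()
    if text.endswith("Z"):
        # Older Python versions do not accept "Z" in fromisoformat().
        text = text[:-1] + "+00:00"
    dt = datetime.fromisoformat(text)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def is_stale(heartbeat_at: str, now: datetime,
             limit: timedelta = timedelta(minutes=2)) -> bool:
    """Return True when the heartbeat is older than `limit` relative to `now`."""
    return now - parse_utc(heartbeat_at) > limit


if __name__ == "__main__":
    current = datetime.now(timezone.utc)
    recent = (current - timedelta(seconds=30)).isoformat()
    print(is_stale(recent, current))
```

Keeping both sides of the subtraction timezone-aware avoids mixing local and UTC clocks, and it avoids the `TypeError` that Python raises when a naive datetime is subtracted from an aware one.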
+8 -1
@@ -43,7 +43,7 @@ class NodeServer:
def __init__(
self,
host: str = "0.0.0.0",
host: str = "127.0.0.1",
port: int = 18789,
token_path: Optional[Path] = None,
display_name: str = "hermes-meet-node",
@@ -76,6 +76,13 @@ class NodeServer:
json.dumps({"token": tok, "generated_at": time.time()}, indent=2),
encoding="utf-8",
)
# Restrict to owner-read-write only — the token grants full RPC
# access to the meet bot (start, transcribe, speak in meetings).
try:
tmp.chmod(0o600)
except (OSError, NotImplementedError):
# Best-effort on non-POSIX filesystems; mode is set on POSIX.
pass
tmp.replace(self.token_path)
self._token = tok
return tok
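The hunk above tightens the token file's mode after writing it. A minimal sketch of an alternative that creates the temporary file as owner-only from the start, assuming a POSIX system where `tempfile.mkstemp` creates files with mode `0o600`. The helper `write_token_file` is hypothetical and not part of `NodeServer`.

```python
import json
import os
import tempfile
import time
from pathlib import Path


def write_token_file(token_path: Path, token: str) -> None:
    """Atomically write a token file that is owner-only from creation.

    Hypothetical helper: same JSON payload shape as the hunk above.
    """
    token_path.parent.mkdir(parents=True, exist_ok=True)
    # mkstemp creates the file readable and writable only by the creating user,
    # so the secret is never on disk with wider permissions.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(token_path.parent), prefix=token_path.name + "."
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump({"token": token, "generated_at": time.time()}, fh, indent=2)
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_name, token_path)
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Creating the temporary file in the target's own directory keeps the final rename on one filesystem, which is what makes the replace atomic.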
+84 -23
@@ -1258,6 +1258,10 @@ class AIAgent:
# after each API call. Accessed by /usage slash command.
self._rate_limit_state: Optional["RateLimitState"] = None
# OpenRouter response cache hit counter — incremented when
# X-OpenRouter-Cache-Status: HIT is seen in streaming response headers.
self._or_cache_hits: int = 0
# Centralized logging — agent.log (INFO+) and errors.log (WARNING+)
# both live under ~/.hermes/logs/. Idempotent, so gateway mode
# (which creates a new AIAgent per message) won't duplicate handlers.
@@ -1421,11 +1425,8 @@ class AIAgent:
client_kwargs["args"] = self.acp_args
effective_base = base_url
if base_url_host_matches(effective_base, "openrouter.ai"):
client_kwargs["default_headers"] = {
"HTTP-Referer": "https://hermes-agent.nousresearch.com",
"X-OpenRouter-Title": "Hermes Agent",
"X-OpenRouter-Categories": "productivity,cli-agent",
}
from agent.auxiliary_client import build_or_headers
client_kwargs["default_headers"] = build_or_headers()
elif base_url_host_matches(effective_base, "api.routermint.com"):
client_kwargs["default_headers"] = _routermint_headers()
elif base_url_host_matches(effective_base, "api.githubcopilot.com"):
@@ -1473,17 +1474,49 @@ class AIAgent:
_env_hint = _pcfg.api_key_env_vars[0]
except Exception:
pass
# --- Init-time fallback (#17929) ---
_fb_entries = []
if isinstance(fallback_model, list):
_fb_entries = [
f for f in fallback_model
if isinstance(f, dict) and f.get("provider") and f.get("model")
]
elif isinstance(fallback_model, dict) and fallback_model.get("provider") and fallback_model.get("model"):
_fb_entries = [fallback_model]
_fb_resolved = False
for _fb in _fb_entries:
_fb_client, _fb_model = resolve_provider_client(
_fb["provider"], model=_fb["model"], raw_codex=True,
explicit_base_url=_fb.get("base_url"),
explicit_api_key=_fb.get("api_key"),
)
if _fb_client is not None:
self.provider = _fb["provider"]
self.model = _fb_model or _fb["model"]
self._fallback_activated = True
client_kwargs = {
"api_key": _fb_client.api_key,
"base_url": str(_fb_client.base_url),
}
if _provider_timeout is not None:
client_kwargs["timeout"] = _provider_timeout
if hasattr(_fb_client, "_default_headers") and _fb_client._default_headers:
client_kwargs["default_headers"] = dict(_fb_client._default_headers)
_fb_resolved = True
break
if not _fb_resolved:
raise RuntimeError(
f"Provider '{_explicit}' is set in config.yaml but no API key "
f"was found. Set the {_env_hint} environment "
f"variable, or switch to a different provider with `hermes model`."
)
if not getattr(self, "_fallback_activated", False):
# No provider configured — reject with a clear message.
raise RuntimeError(
f"Provider '{_explicit}' is set in config.yaml but no API key "
f"was found. Set the {_env_hint} environment "
f"variable, or switch to a different provider with `hermes model`."
"No LLM provider configured. Run `hermes model` to "
"select a provider, or run `hermes setup` for first-time "
"configuration."
)
# No provider configured — reject with a clear message.
raise RuntimeError(
"No LLM provider configured. Run `hermes model` to "
"select a provider, or run `hermes setup` for first-time "
"configuration."
)
self._client_kwargs = client_kwargs # stored for rebuilding after interrupt
@@ -1536,7 +1569,7 @@ class AIAgent:
else:
self._fallback_chain = []
self._fallback_index = 0
self._fallback_activated = False
self._fallback_activated = getattr(self, "_fallback_activated", False)
# Legacy attribute kept for backward compat (tests, external callers)
self._fallback_model = self._fallback_chain[0] if self._fallback_chain else None
if self._fallback_chain and not self.quiet_mode:
@@ -4548,6 +4581,28 @@ class AIAgent:
"""Return the last captured RateLimitState, or None."""
return self._rate_limit_state
def _check_openrouter_cache_status(self, http_response: Any) -> None:
"""Read X-OpenRouter-Cache-Status from response headers and log it.
Increments ``_or_cache_hits`` on HIT so callers can report savings.
"""
if http_response is None:
return
headers = getattr(http_response, "headers", None)
if not headers:
return
try:
status = headers.get("x-openrouter-cache-status")
if not status:
return
if status.upper() == "HIT":
self._or_cache_hits += 1
logger.info("OpenRouter response cache HIT (total: %d)", self._or_cache_hits)
else:
logger.debug("OpenRouter response cache %s", status.upper())
except Exception:
pass # Never let header parsing break the agent loop
def get_activity_summary(self) -> dict:
"""Return a snapshot of the agent's current activity for diagnostics.
@@ -6125,10 +6180,10 @@ class AIAgent:
return True
def _apply_client_headers_for_base_url(self, base_url: str) -> None:
from agent.auxiliary_client import _AI_GATEWAY_HEADERS, _OR_HEADERS
from agent.auxiliary_client import _AI_GATEWAY_HEADERS, build_or_headers
if base_url_host_matches(base_url, "openrouter.ai"):
self._client_kwargs["default_headers"] = dict(_OR_HEADERS)
self._client_kwargs["default_headers"] = build_or_headers()
elif base_url_host_matches(base_url, "ai-gateway.vercel.sh"):
self._client_kwargs["default_headers"] = dict(_AI_GATEWAY_HEADERS)
elif base_url_host_matches(base_url, "api.routermint.com"):
@@ -6748,6 +6803,9 @@ class AIAgent:
# response via .response before any chunks are consumed.
self._capture_rate_limits(getattr(stream, "response", None))
# Log OpenRouter response cache status when present.
self._check_openrouter_cache_status(getattr(stream, "response", None))
content_parts: list = []
tool_calls_acc: dict = {}
tool_gen_notified: set = set()
@@ -10242,7 +10300,10 @@ class AIAgent:
provider_preferences["order"] = self.providers_order
if self.provider_sort:
provider_preferences["sort"] = self.provider_sort
if provider_preferences:
if provider_preferences and (
(self.provider or "").strip().lower() == "openrouter"
or self._is_openrouter_url()
):
summary_extra_body["provider"] = provider_preferences
if summary_extra_body:
@@ -10565,11 +10626,11 @@ class AIAgent:
self.model,
f"{self.context_compressor.context_length:,}",
)
if not self.quiet_mode:
self._safe_print(
f"📦 Preflight compression: ~{_preflight_tokens:,} tokens "
f">= {self.context_compressor.threshold_tokens:,} threshold"
)
self._emit_status(
f"📦 Preflight compression: ~{_preflight_tokens:,} tokens "
f">= {self.context_compressor.threshold_tokens:,} threshold. "
"This may take a moment."
)
# May need multiple passes for very large sessions with small
# context windows (each pass summarises the middle N turns).
for _pass in range(3):
+42
@@ -46,6 +46,7 @@ AUTHOR_MAP = {
"leone.parise@gmail.com": "leoneparise",
"teknium@nousresearch.com": "teknium1",
"127238744+teknium1@users.noreply.github.com": "teknium1",
"159539633+MottledShadow@users.noreply.github.com": "MottledShadow",
"aludwin+gh@gmail.com": "adamludwin",
"2093036+exiao@users.noreply.github.com": "exiao",
"rylen.anil@gmail.com": "rylena",
@@ -66,7 +67,10 @@ AUTHOR_MAP = {
"nbot@liizfq.top": "liizfq",
"274096618+hermes-agent-dhabibi@users.noreply.github.com": "dhabibi",
"dejie.guo@gmail.com": "JayGwod",
"133716830+0xKingBack@users.noreply.github.com": "0xKingBack",
"daixin1204@gmail.com": "SimbaKingjoe",
"maxence@groine.fr": "MaxyMoos",
"61830395+leprincep35700@users.noreply.github.com": "leprincep35700",
# OpenViking viking_read salvage (April 2026)
"hitesh@gmail.com": "htsh",
"pty819@outlook.com": "pty819",
@@ -370,6 +374,10 @@ AUTHOR_MAP = {
"xowiekk@gmail.com": "Xowiek",
"1243352777@qq.com": "zons-zhaozhy",
"e.silacandmr@gmail.com": "Es1la",
"h3057183414@gmail.com": "CoreyNoDream",
"franksong2702@gmail.com": "franksong2702",
"673088860@qq.com": "ambition0802",
"beibei1988@proton.me": "beibi9966",
# ── bulk addition: 75 emails resolved via API, PR salvage bodies, noreply
# crossref, and GH contributor list matching (April 2026 audit) ──
"1115117931@qq.com": "aaronagent",
@@ -500,6 +508,10 @@ AUTHOR_MAP = {
"michel.belleau@malaiwah.com": "malaiwah",
"gnanasekaran.sekareee@gmail.com": "gnanam1990",
"jz.pentest@gmail.com": "0xyg3n",
"7093928+0xyg3n@users.noreply.github.com": "0xyg3n",
"nftpoetrist@gmail.com": "nftpoetrist", # PR #18982
"millerc79@users.noreply.github.com": "millerc79", # PR #19033
"hermes@example.com": "shellybotmoyer", # PR #18915 (bot-committed)
"hypnosis.mda@gmail.com": "Hypn0sis",
"ywt000818@gmail.com": "OwenYWT",
"dhandhalyabhavik@gmail.com": "v1k22",
@@ -611,6 +623,33 @@ AUTHOR_MAP = {
"2114364329@qq.com": "cuyua9",
"2557058999@qq.com": "Disaster-Terminator",
"cine.dreamer.one@gmail.com": "LeonSGP43",
"zyprothh@gmail.com": "Zyproth",
"amitgaur@gmail.com": "amitgaur",
"albuquerque.abner@gmail.com": "mrbob-git",
"kiala@users.noreply.github.com": "kiala9",
"alanxchen@gmail.com": "alanxchen85",
"clawbot@clawbots-Mac-mini.local": "John-tip",
"der@konsi.org": "konsisumer",
"cirwel@The-CIRWEL-Group.local": "CIRWEL",
"molvikar8@gmail.com": "molvikar",
"nftpoetrist@gmail.com": "nftpoetrist",
"dodofun@126.com": "colorcross",
"1615063567@qq.com": "zhao0112",
"ethanguo.2003@gmail.com": "EthanGuo-coder",
"dev0jsh@gmail.com": "tmdgusya",
"leavr@163.com": "leavrcn",
"17683456+wanazhar@users.noreply.github.com": "wanazhar",
"26782336+cixuuz@users.noreply.github.com": "cixuuz",
"aleksandr.pasevin@openzeppelin.com": "pasevin",
"ubuntu@localhost.localdomain": "holynn-q",
"holynn@placeholder.local": "holynn-q",
"agent@hermes.local": "jacdevos",
"sunsky.lau@gmail.com": "liuhao1024",
"qiuqfang98@qq.com": "keepcalmqqf",
"261867348+ai-ag2026@users.noreply.github.com": "ai-ag2026",
"yanzh.su@gmail.com": "YanzhongSu",
"wanderwang@users.noreply.github.com": "WanderWang",
"yueheime@gmail.com": "yuehei",
"leozeli@qq.com": "leozeli",
"linlehao@cuhk.edu.cn": "LehaoLin",
"liutong@isacas.ac.cn": "I3eg1nner",
@@ -668,6 +707,9 @@ AUTHOR_MAP = {
"web3blind@gmail.com": "web3blind",
"ztzheng@163.com": "chengoak", # PR #17467
"24110240104@m.fudan.edu.cn": "YuShu", # co-author only
"charliekerfoot@gmail.com": "CharlieKerfoot", # PR #18951
# Debug share upload-time redaction (May 2026)
"dhuysamen@gmail.com": "GodsBoy", # PR #19318
}
+97 -119
@@ -25,15 +25,15 @@
}
},
"node_modules/@cacheable/memory": {
"version": "2.0.7",
"resolved": "https://registry.npmjs.org/@cacheable/memory/-/memory-2.0.7.tgz",
"integrity": "sha512-RbxnxAMf89Tp1dLhXMS7ceft/PGsDl1Ip7T20z5nZ+pwIAsQ1p2izPjVG69oCLv/jfQ7HDPHTWK0c9rcAWXN3A==",
"version": "2.0.8",
"resolved": "https://registry.npmjs.org/@cacheable/memory/-/memory-2.0.8.tgz",
"integrity": "sha512-FvEb29x5wVwu/Kf93IWwsOOEuhHh6dYCJF3vcKLzXc0KXIW181AOzv6ceT4ZpBHDvAfG60eqb+ekmrnLHIy+jw==",
"license": "MIT",
"dependencies": {
"@cacheable/utils": "^2.3.3",
"@keyv/bigmap": "^1.3.0",
"hookified": "^1.14.0",
"keyv": "^5.5.5"
"@cacheable/utils": "^2.4.0",
"@keyv/bigmap": "^1.3.1",
"hookified": "^1.15.1",
"keyv": "^5.6.0"
}
},
"node_modules/@cacheable/node-cache": {
@@ -51,19 +51,19 @@
}
},
"node_modules/@cacheable/utils": {
"version": "2.3.4",
"resolved": "https://registry.npmjs.org/@cacheable/utils/-/utils-2.3.4.tgz",
"integrity": "sha512-knwKUJEYgIfwShABS1BX6JyJJTglAFcEU7EXqzTdiGCXur4voqkiJkdgZIQtWNFhynzDWERcTYv/sETMu3uJWA==",
"version": "2.4.1",
"resolved": "https://registry.npmjs.org/@cacheable/utils/-/utils-2.4.1.tgz",
"integrity": "sha512-eiFgzCbIneyMlLOmNG4g9xzF7Hv3Mga4LjxjcSC/ues6VYq2+gUbQI8JqNuw/ZM8tJIeIaBGpswAsqV2V7ApgA==",
"license": "MIT",
"dependencies": {
"hashery": "^1.3.0",
"hashery": "^1.5.1",
"keyv": "^5.6.0"
}
},
"node_modules/@emnapi/runtime": {
"version": "1.8.1",
"resolved": "https://registry.npmjs.org/@emnapi/runtime/-/runtime-1.8.1.tgz",
"integrity": "sha512-mehfKSMWjjNol8659Z8KxEMrdSJDDot5SXMq00dM8BN4o+CLNXQ0xH2V7EchNHV4RmbZLmmPdEaXZc5H2FXmDg==",
"version": "1.10.0",
"resolved": "https://registry.npmjs.org/@emnapi/runtime/-/runtime-1.10.0.tgz",
"integrity": "sha512-ewvYlk86xUoGI0zQRNq/mC+16R1QeDlKQy21Ki3oSYXNgLb45GV1P6A0M+/s6nyCuNDqe5VpaY84BzXGwVbwFA==",
"license": "MIT",
"optional": true,
"peer": true,
@@ -87,9 +87,9 @@
"license": "BSD-3-Clause"
},
"node_modules/@img/colour": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/@img/colour/-/colour-1.0.0.tgz",
"integrity": "sha512-A5P/LfWGFSl6nsckYtjw9da+19jB8hkJ6ACTGcDfEJ0aE+l2n2El7dsVM7UVHZQ9s2lmYMWlrS21YLy2IR1LUw==",
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/@img/colour/-/colour-1.1.0.tgz",
"integrity": "sha512-Td76q7j57o/tLVdgS746cYARfSyxk8iEfRxewL9h4OMzYhbW4TAcppl0mT4eyqXddh6L/jwoM75mo7ixa/pCeQ==",
"license": "MIT",
"peer": true,
"engines": {
@@ -617,9 +617,9 @@
"license": "BSD-3-Clause"
},
"node_modules/@protobufjs/codegen": {
"version": "2.0.4",
"resolved": "https://registry.npmjs.org/@protobufjs/codegen/-/codegen-2.0.4.tgz",
"integrity": "sha512-YyFaikqM5sH0ziFZCN3xDC7zeGaB/d0IUb9CATugHWbd1FRFwWwt4ld4OYMPWu5a3Xe01mGAULCdqhMlPl29Jg==",
"version": "2.0.5",
"resolved": "https://registry.npmjs.org/@protobufjs/codegen/-/codegen-2.0.5.tgz",
"integrity": "sha512-zgXFLzW3Ap33e6d0Wlj4MGIm6Ce8O89n/apUaGNB/jx+hw+ruWEp7EwGUshdLKVRCxZW12fp9r40E1mQrf/34g==",
"license": "BSD-3-Clause"
},
"node_modules/@protobufjs/eventemitter": {
@@ -645,9 +645,9 @@
"license": "BSD-3-Clause"
},
"node_modules/@protobufjs/inquire": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/@protobufjs/inquire/-/inquire-1.1.0.tgz",
"integrity": "sha512-kdSefcPdruJiFMVSbn801t4vFK7KB/5gd2fYvrxhuJYg8ILrmn9SKSX2tZdV6V+ksulWqS7aXjBcRXl3wHoD9Q==",
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/@protobufjs/inquire/-/inquire-1.1.1.tgz",
"integrity": "sha512-mnzgDV26ueAvk7rsbt9L7bE0SuAoqyuys/sMMrmVcN5x9VsxpcG3rqAUSgDyLp0UZlmNfIbQ4fHfCtreVBk8Ew==",
"license": "BSD-3-Clause"
},
"node_modules/@protobufjs/path": {
@@ -663,9 +663,9 @@
"license": "BSD-3-Clause"
},
"node_modules/@protobufjs/utf8": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/@protobufjs/utf8/-/utf8-1.1.0.tgz",
"integrity": "sha512-Vvn3zZrhQZkkBE8LSuW3em98c0FwgO4nxzv6OdSxPKJIEKY2bGbHn+mhGIPerzI4twdxaP8/0+06HBpwf345Lw==",
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/@protobufjs/utf8/-/utf8-1.1.1.tgz",
"integrity": "sha512-oOAWABowe8EAbMyWKM0tYDKi8Yaox52D+HWZhAIJqQXbqe0xI/GV7FhLWqlEKreMkfDjshR5FKgi3mnle0h6Eg==",
"license": "BSD-3-Clause"
},
"node_modules/@tokenizer/inflate": {
@@ -714,25 +714,20 @@
"integrity": "sha512-OvjF+z51L3ov0OyAU0duzsYuvO01PH7x4t6DJx+guahgTnBHkhJdG7soQeTSFLWN3efnHyibZ4Z8l2EuWwJN3A==",
"license": "MIT"
},
"node_modules/@types/long": {
"version": "4.0.2",
"resolved": "https://registry.npmjs.org/@types/long/-/long-4.0.2.tgz",
"integrity": "sha512-MqTGEo5bj5t157U6fA/BiDynNkn0YknVdh48CMPkTSpFTVmvao5UQmm7uEF6xBEo7qIMAlY/JSleYaE6VOdpaA==",
"license": "MIT"
},
"node_modules/@types/node": {
"version": "25.3.1",
"resolved": "https://registry.npmjs.org/@types/node/-/node-25.3.1.tgz",
"integrity": "sha512-hj9YIJimBCipHVfHKRMnvmHg+wfhKc0o4mTtXh9pKBjC8TLJzz0nzGmLi5UJsYAUgSvXFHgb0V2oY10DUFtImw==",
"version": "25.6.0",
"resolved": "https://registry.npmjs.org/@types/node/-/node-25.6.0.tgz",
"integrity": "sha512-+qIYRKdNYJwY3vRCZMdJbPLJAtGjQBudzZzdzwQYkEPQd+PJGixUL5QfvCLDaULoLv+RhT3LDkwEfKaAkgSmNQ==",
"license": "MIT",
"dependencies": {
"undici-types": "~7.18.0"
"undici-types": "~7.19.0"
}
},
"node_modules/@whiskeysockets/baileys": {
"name": "baileys",
"version": "7.0.0-rc.9",
"resolved": "git+ssh://git@github.com/WhiskeySockets/Baileys.git#01047debd81beb20da7b7779b08edcb06aa03770",
"integrity": "sha512-letWyB96JHD6NdqpAiseOfaUBi13u8AhiRcKSRqcVjc5Vw5xoPTZGvVnw8K/NvGBFAvyLJkwim9Mjvwzhx/SlA==",
"hasInstallScript": true,
"license": "MIT",
"dependencies": {
@@ -807,9 +802,9 @@
}
},
"node_modules/body-parser": {
"version": "1.20.4",
"resolved": "https://registry.npmjs.org/body-parser/-/body-parser-1.20.4.tgz",
"integrity": "sha512-ZTgYYLMOXY9qKU/57FAo8F+HA2dGX7bqGc71txDRC1rS4frdFI5R7NhluHxH6M0YItAP0sHB4uqAOcYKxO6uGA==",
"version": "1.20.5",
"resolved": "https://registry.npmjs.org/body-parser/-/body-parser-1.20.5.tgz",
"integrity": "sha512-3grm+/2tUOvu2cjJkvsIxrv/wVpfXQW4PsQHYm7yk4vfpu7Ekl6nEsYBoJUL6qDwZUx8wUhQ8tR2qz+ad9c9OA==",
"license": "MIT",
"dependencies": {
"bytes": "~3.1.2",
@@ -820,7 +815,7 @@
"http-errors": "~2.0.1",
"iconv-lite": "~0.4.24",
"on-finished": "~2.4.1",
"qs": "~6.14.0",
"qs": "~6.15.1",
"raw-body": "~2.5.3",
"type-is": "~1.6.18",
"unpipe": "~1.0.0"
@@ -830,6 +825,21 @@
"npm": "1.2.8000 || >= 1.4.16"
}
},
"node_modules/body-parser/node_modules/qs": {
"version": "6.15.1",
"resolved": "https://registry.npmjs.org/qs/-/qs-6.15.1.tgz",
"integrity": "sha512-6YHEFRL9mfgcAvql/XhwTvf5jKcOiiupt2FiJxHkiX1z4j7WL8J/jRHYLluORvc1XxB5rV20KoeK00gVJamspg==",
"license": "BSD-3-Clause",
"dependencies": {
"side-channel": "^1.1.0"
},
"engines": {
"node": ">=0.6"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/bytes": {
"version": "3.1.2",
"resolved": "https://registry.npmjs.org/bytes/-/bytes-3.1.2.tgz",
@@ -840,16 +850,16 @@
}
},
"node_modules/cacheable": {
"version": "2.3.2",
"resolved": "https://registry.npmjs.org/cacheable/-/cacheable-2.3.2.tgz",
"integrity": "sha512-w+ZuRNmex9c1TR9RcsxbfTKCjSL0rh1WA5SABbrWprIHeNBdmyQLSYonlDy9gpD+63XT8DgZ/wNh1Smvc9WnJA==",
"version": "2.3.4",
"resolved": "https://registry.npmjs.org/cacheable/-/cacheable-2.3.4.tgz",
"integrity": "sha512-djgxybDbw9fL/ZWMI3+CE8ZilNxcwFkVtDc1gJ+IlOSSWkSMPQabhV/XCHTQ6pwwN6aivXPZ43omTooZiX06Ew==",
"license": "MIT",
"dependencies": {
"@cacheable/memory": "^2.0.7",
"@cacheable/utils": "^2.3.3",
"@cacheable/memory": "^2.0.8",
"@cacheable/utils": "^2.4.0",
"hookified": "^1.15.0",
"keyv": "^5.5.5",
"qified": "^0.6.0"
"keyv": "^5.6.0",
"qified": "^0.9.0"
}
},
"node_modules/call-bind-apply-helpers": {
@@ -1212,21 +1222,21 @@
}
},
"node_modules/hashery": {
"version": "1.5.0",
"resolved": "https://registry.npmjs.org/hashery/-/hashery-1.5.0.tgz",
"integrity": "sha512-nhQ6ExaOIqti2FDWoEMWARUqIKyjr2VcZzXShrI+A3zpeiuPWzx6iPftt44LhP74E5sW36B75N6VHbvRtpvO6Q==",
"version": "1.5.1",
"resolved": "https://registry.npmjs.org/hashery/-/hashery-1.5.1.tgz",
"integrity": "sha512-iZyKG96/JwPz1N55vj2Ie2vXbhu440zfUfJvSwEqEbeLluk7NnapfGqa7LH0mOsnDxTF85Mx8/dyR6HfqcbmbQ==",
"license": "MIT",
"dependencies": {
"hookified": "^1.14.0"
"hookified": "^1.15.0"
},
"engines": {
"node": ">=20"
}
},
"node_modules/hasown": {
"version": "2.0.2",
"resolved": "https://registry.npmjs.org/hasown/-/hasown-2.0.2.tgz",
"integrity": "sha512-0hJU9SCPvmMzIBdZFqNPXWa6dqh7WdH0cII9y+CyS8rG3nL48Bclra9HmKhVVUHyPWNH5Y7xDwAB7bfgSjkUMQ==",
"version": "2.0.3",
"resolved": "https://registry.npmjs.org/hasown/-/hasown-2.0.3.tgz",
"integrity": "sha512-ej4AhfhfL2Q2zpMmLo7U1Uv9+PyhIZpgQLGT1F9miIGmiCJIoCgSmczFdrc97mWT4kVY72KA+WnnhJ5pghSvSg==",
"license": "MIT",
"dependencies": {
"function-bind": "^1.1.2"
@@ -1327,44 +1337,6 @@
"protobufjs": "6.8.8"
}
},
"node_modules/libsignal/node_modules/@types/node": {
"version": "10.17.60",
"resolved": "https://registry.npmjs.org/@types/node/-/node-10.17.60.tgz",
"integrity": "sha512-F0KIgDJfy2nA3zMLmWGKxcH2ZVEtCZXHHdOQs2gSaQ27+lNeEfGxzkIw90aXswATX7AZ33tahPbzy6KAfUreVw==",
"license": "MIT"
},
"node_modules/libsignal/node_modules/long": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/long/-/long-4.0.0.tgz",
"integrity": "sha512-XsP+KhQif4bjX1kbuSiySJFNAehNxgLb6hPRGJ9QsUr8ajHkuXGdrHmFUTUUXhDwVX2R5bY4JNZEwbUiMhV+MA==",
"license": "Apache-2.0"
},
"node_modules/libsignal/node_modules/protobufjs": {
"version": "6.8.8",
"resolved": "https://registry.npmjs.org/protobufjs/-/protobufjs-6.8.8.tgz",
"integrity": "sha512-AAmHtD5pXgZfi7GMpllpO3q1Xw1OYldr+dMUlAnffGTAhqkg72WdmSY71uKBF/JuyiKs8psYbtKrhi0ASCD8qw==",
"hasInstallScript": true,
"license": "BSD-3-Clause",
"dependencies": {
"@protobufjs/aspromise": "^1.1.2",
"@protobufjs/base64": "^1.1.2",
"@protobufjs/codegen": "^2.0.4",
"@protobufjs/eventemitter": "^1.1.0",
"@protobufjs/fetch": "^1.1.0",
"@protobufjs/float": "^1.0.2",
"@protobufjs/inquire": "^1.1.0",
"@protobufjs/path": "^1.1.2",
"@protobufjs/pool": "^1.1.0",
"@protobufjs/utf8": "^1.1.0",
"@types/long": "^4.0.0",
"@types/node": "^10.1.0",
"long": "^4.0.0"
},
"bin": {
"pbjs": "bin/pbjs",
"pbts": "bin/pbts"
}
},
"node_modules/long": {
"version": "5.3.2",
"resolved": "https://registry.npmjs.org/long/-/long-5.3.2.tgz",
@@ -1372,9 +1344,9 @@
"license": "Apache-2.0"
},
"node_modules/lru-cache": {
"version": "11.2.6",
"resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-11.2.6.tgz",
"integrity": "sha512-ESL2CrkS/2wTPfuend7Zhkzo2u0daGJ/A2VucJOgQ/C48S/zB8MMeMHSGKYpXhIjbPxfuezITkaBH1wqv00DDQ==",
"version": "11.3.5",
"resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-11.3.5.tgz",
"integrity": "sha512-NxVFwLAnrd9i7KUBxC4DrUhmgjzOs+1Qm50D3oF1/oL+r1NpZ4gA7xvG0/zJ8evR7zIKn4vLf7qTNduWFtCrRw==",
"license": "BlueOak-1.0.0",
"engines": {
"node": "20 || >=22"
@@ -1552,12 +1524,12 @@
}
},
"node_modules/p-queue": {
"version": "9.1.0",
"resolved": "https://registry.npmjs.org/p-queue/-/p-queue-9.1.0.tgz",
"integrity": "sha512-O/ZPaXuQV29uSLbxWBGGZO1mCQXV2BLIwUr59JUU9SoH76mnYvtms7aafH/isNSNGwuEfP6W/4xD0/TJXxrizw==",
"version": "9.2.0",
"resolved": "https://registry.npmjs.org/p-queue/-/p-queue-9.2.0.tgz",
"integrity": "sha512-dWgLE8AH0HjQ9fe74pUkKkvzzYT18Inp4zra3lKHnnwqGvcfcUBrvF2EAVX+envufDNBOzpPq/IBUONDbI7+3g==",
"license": "MIT",
"dependencies": {
"eventemitter3": "^5.0.1",
"eventemitter3": "^5.0.4",
"p-timeout": "^7.0.0"
},
"engines": {
@@ -1648,22 +1620,22 @@
"license": "MIT"
},
"node_modules/protobufjs": {
"version": "7.5.4",
"resolved": "https://registry.npmjs.org/protobufjs/-/protobufjs-7.5.4.tgz",
"integrity": "sha512-CvexbZtbov6jW2eXAvLukXjXUW1TzFaivC46BpWc/3BpcCysb5Vffu+B3XHMm8lVEuy2Mm4XGex8hBSg1yapPg==",
"version": "7.5.6",
"resolved": "https://registry.npmjs.org/protobufjs/-/protobufjs-7.5.6.tgz",
"integrity": "sha512-M71sTMB146U3u0di3yup8iM+zv8yPRNQVr1KK4tyBitl3qFvEGucq/rGDRShD2rsJhtN02RJaJ7j5X5hmy8SJg==",
"hasInstallScript": true,
"license": "BSD-3-Clause",
"dependencies": {
"@protobufjs/aspromise": "^1.1.2",
"@protobufjs/base64": "^1.1.2",
"@protobufjs/codegen": "^2.0.4",
"@protobufjs/codegen": "^2.0.5",
"@protobufjs/eventemitter": "^1.1.0",
"@protobufjs/fetch": "^1.1.0",
"@protobufjs/float": "^1.0.2",
"@protobufjs/inquire": "^1.1.0",
"@protobufjs/inquire": "^1.1.1",
"@protobufjs/path": "^1.1.2",
"@protobufjs/pool": "^1.1.0",
"@protobufjs/utf8": "^1.1.0",
"@protobufjs/utf8": "^1.1.1",
"@types/node": ">=13.7.0",
"long": "^5.0.0"
},
@@ -1685,17 +1657,23 @@
}
},
"node_modules/qified": {
"version": "0.6.0",
"resolved": "https://registry.npmjs.org/qified/-/qified-0.6.0.tgz",
"integrity": "sha512-tsSGN1x3h569ZSU1u6diwhltLyfUWDp3YbFHedapTmpBl0B3P6U3+Qptg7xu+v+1io1EwhdPyyRHYbEw0KN2FA==",
"version": "0.9.1",
"resolved": "https://registry.npmjs.org/qified/-/qified-0.9.1.tgz",
"integrity": "sha512-n7mar4T0xQ+39dE2vGTAlbxUEpndwPANH0kDef1/MYsB8Bba9wshkybIRx74qgcvKQPEWErf9AqAdYjhzY2Ilg==",
"license": "MIT",
"dependencies": {
"hookified": "^1.14.0"
"hookified": "^2.1.1"
},
"engines": {
"node": ">=20"
}
},
"node_modules/qified/node_modules/hookified": {
"version": "2.2.0",
"resolved": "https://registry.npmjs.org/hookified/-/hookified-2.2.0.tgz",
"integrity": "sha512-p/LgFzRN5FeoD3DLS6bkUapeye6E4SI6yJs6KetENd18S+FBthqYq2amJUWpt5z0EQwwHemidjY5OqJGEKm5uA==",
"license": "MIT"
},
"node_modules/qrcode-terminal": {
"version": "0.12.0",
"resolved": "https://registry.npmjs.org/qrcode-terminal/-/qrcode-terminal-0.12.0.tgz",
@@ -1922,13 +1900,13 @@
}
},
"node_modules/side-channel-list": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/side-channel-list/-/side-channel-list-1.0.0.tgz",
"integrity": "sha512-FCLHtRD/gnpCiCHEiJLOwdmFP+wzCmDEkc9y7NsYxeF4u7Btsn1ZuwgwJGxImImHicJArLP4R0yX4c2KCrMrTA==",
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/side-channel-list/-/side-channel-list-1.0.1.tgz",
"integrity": "sha512-mjn/0bi/oUURjc5Xl7IaWi/OJJJumuoJFQJfDDyO46+hBWsfaVM65TBHq2eoZBhzl9EchxOijpkbRC8SVBQU0w==",
"license": "MIT",
"dependencies": {
"es-errors": "^1.3.0",
"object-inspect": "^1.13.3"
"object-inspect": "^1.13.4"
},
"engines": {
"node": ">= 0.4"
@@ -2094,9 +2072,9 @@
}
},
"node_modules/undici-types": {
"version": "7.18.2",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-7.18.2.tgz",
"integrity": "sha512-AsuCzffGHJybSaRrmr5eHr81mwJU3kjw6M+uprWvCXiNeN9SOGwQ3Jn8jb8m3Z6izVgknn1R0FTCEAP2QrLY/w==",
"version": "7.19.2",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-7.19.2.tgz",
"integrity": "sha512-qYVnV5OEm2AW8cJMCpdV20CDyaN3g0AjDlOGf1OW4iaDEx8MwdtChUp4zu4H0VP3nDRF/8RKWH+IPp9uW0YGZg==",
"license": "MIT"
},
"node_modules/unpipe": {
@@ -2139,9 +2117,9 @@
"license": "MIT"
},
"node_modules/ws": {
"version": "8.19.0",
"resolved": "https://registry.npmjs.org/ws/-/ws-8.19.0.tgz",
"integrity": "sha512-blAT2mjOEIi0ZzruJfIhb3nps74PRWTCz1IjglWEEpQl5XS/UNama6u2/rjFkDDouqr4L67ry+1aGIALViWjDg==",
"version": "8.20.0",
"resolved": "https://registry.npmjs.org/ws/-/ws-8.20.0.tgz",
"integrity": "sha512-sAt8BhgNbzCtgGbt2OxmpuryO63ZoDk/sqaB/znQm94T4fCEsy/yV+7CdC1kJhOU9lboAEU7R3kquuycDoibVA==",
"license": "MIT",
"engines": {
"node": ">=10.0.0"
+3
@@ -12,5 +12,8 @@
"express": "^4.21.0",
"qrcode-terminal": "^0.12.0",
"pino": "^9.0.0"
},
"overrides": {
"protobufjs": "^7.5.5"
}
}
+4 -3
@@ -178,9 +178,10 @@ class TestMcpRegistrationE2E:
complete_event = completions[0]
assert isinstance(complete_event, ToolCallProgress)
assert complete_event.status == "completed"
# rawOutput should contain the tool result string
assert complete_event.raw_output is not None
assert "hello" in str(complete_event.raw_output)
# Completion should contain human-readable output rather than forcing raw JSON panes.
assert complete_event.content
assert "hello" in complete_event.content[0].content.text
assert complete_event.raw_output is None
def test_patch_mode_tool_start_emits_diff_blocks_for_v4a_patch(self):
update = build_tool_start(
+185 -7
@@ -27,7 +27,10 @@ from acp.schema import (
SetSessionModeResponse,
SessionInfo,
TextContentBlock,
ToolCallProgress,
ToolCallStart,
Usage,
UsageUpdate,
UserMessageChunk,
)
from acp_adapter.server import HermesACPAgent, HERMES_VERSION
@@ -200,6 +203,8 @@ class TestSessionOps:
"context",
"reset",
"compact",
"steer",
"queue",
"version",
]
model_cmd = next(
@@ -208,6 +213,46 @@ class TestSessionOps:
assert model_cmd.input is not None
assert model_cmd.input.root.hint == "model name to switch to"
def test_build_usage_update_for_zed_context_indicator(self, agent, mock_manager):
state = mock_manager.create_session(cwd="/tmp")
state.history = [{"role": "user", "content": "hello"}]
state.agent.context_compressor = MagicMock(context_length=100_000)
state.agent._cached_system_prompt = "system"
state.agent.tools = [{"type": "function", "function": {"name": "demo"}}]
with patch(
"agent.model_metadata.estimate_request_tokens_rough",
return_value=25_000,
):
update = agent._build_usage_update(state)
assert isinstance(update, UsageUpdate)
assert update.session_update == "usage_update"
assert update.size == 100_000
assert update.used == 25_000
@pytest.mark.asyncio
async def test_send_usage_update_to_client(self, agent, mock_manager):
state = mock_manager.create_session(cwd="/tmp")
state.agent.context_compressor = MagicMock(context_length=100_000)
mock_conn = MagicMock(spec=acp.Client)
mock_conn.session_update = AsyncMock()
agent._conn = mock_conn
with patch(
"agent.model_metadata.estimate_request_tokens_rough",
return_value=25_000,
):
await agent._send_usage_update(state)
mock_conn.session_update.assert_awaited_once()
call = mock_conn.session_update.await_args
assert call.kwargs["session_id"] == state.session_id
update = call.kwargs["update"]
assert isinstance(update, UsageUpdate)
assert update.size == 100_000
assert update.used == 25_000
@pytest.mark.asyncio
async def test_cancel_sets_event(self, agent):
resp = await agent.new_session(cwd=".")
@@ -238,11 +283,31 @@ class TestSessionOps:
{"role": "system", "content": "hidden system"},
{"role": "user", "content": "what controls the / slash commands?"},
{"role": "assistant", "content": "HermesACPAgent._ADVERTISED_COMMANDS controls them."},
{"role": "tool", "content": "tool output should not replay"},
{
"role": "assistant",
"content": "",
"tool_calls": [
{
"id": "call_search_1",
"type": "function",
"function": {
"name": "search_files",
"arguments": '{"pattern":"slash commands","path":"."}',
},
}
],
},
{
"role": "tool",
"tool_call_id": "call_search_1",
"content": '{"total_count":1,"matches":[{"path":"cli.py","line":42,"content":"slash commands"}]}',
},
]
mock_conn.session_update.reset_mock()
resp = await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
await asyncio.sleep(0)
await asyncio.sleep(0)
assert isinstance(resp, LoadSessionResponse)
calls = mock_conn.session_update.await_args_list
@@ -257,6 +322,21 @@ class TestSessionOps:
assert isinstance(replay_calls[1].kwargs["update"], AgentMessageChunk)
assert replay_calls[1].kwargs["update"].content.text.startswith("HermesACPAgent")
tool_updates = [
call.kwargs["update"]
for call in calls
if getattr(call.kwargs.get("update"), "session_update", None)
in {"tool_call", "tool_call_update"}
]
assert len(tool_updates) == 2
assert isinstance(tool_updates[0], ToolCallStart)
assert tool_updates[0].tool_call_id == "call_search_1"
assert tool_updates[0].title == "search: slash commands"
assert isinstance(tool_updates[1], ToolCallProgress)
assert tool_updates[1].tool_call_id == "call_search_1"
assert "Search results" in tool_updates[1].content[0].content.text
assert "cli.py:42" in tool_updates[1].content[0].content.text
@pytest.mark.asyncio
async def test_resume_session_replays_persisted_history_to_client(self, agent):
mock_conn = MagicMock(spec=acp.Client)
@@ -269,6 +349,8 @@ class TestSessionOps:
mock_conn.session_update.reset_mock()
resp = await agent.resume_session(cwd="/tmp", session_id=new_resp.session_id)
await asyncio.sleep(0)
await asyncio.sleep(0)
assert isinstance(resp, ResumeSessionResponse)
updates = [call.kwargs["update"] for call in mock_conn.session_update.await_args_list]
@@ -278,6 +360,27 @@ class TestSessionOps:
for update in updates
)
@pytest.mark.asyncio
async def test_load_session_schedules_history_replay_after_response(self, agent):
"""Zed only attaches replayed updates after session/load has completed."""
new_resp = await agent.new_session(cwd="/tmp")
state = agent.session_manager.get_session(new_resp.session_id)
state.history = [{"role": "user", "content": "hello from history"}]
events = []
async def replay_after_response(_state):
events.append("replay")
with patch.object(agent, "_replay_session_history", side_effect=replay_after_response):
resp = await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
events.append("returned")
assert isinstance(resp, LoadSessionResponse)
assert events == ["returned"]
await asyncio.sleep(0)
await asyncio.sleep(0)
assert events == ["returned", "replay"]
@pytest.mark.asyncio
async def test_resume_session_creates_new_if_missing(self, agent):
resume_resp = await agent.resume_session(cwd="/tmp", session_id="nonexistent")
@@ -522,6 +625,11 @@ class TestPrompt:
assert isinstance(resp, PromptResponse)
assert resp.stop_reason == "end_turn"
state.agent.run_conversation.assert_called_once()
assert state.agent.tool_progress_callback is not None
assert state.agent.step_callback is not None
assert state.agent.stream_delta_callback is not None
assert state.agent.reasoning_callback is not None
assert state.agent.thinking_callback is None
@pytest.mark.asyncio
async def test_prompt_updates_history(self, agent):
@@ -565,12 +673,40 @@ class TestPrompt:
prompt = [TextContentBlock(type="text", text="help me")]
await agent.prompt(prompt=prompt, session_id=new_resp.session_id)
# session_update should have been called with the final message
# session_update should include the final message (usage_update may follow it)
mock_conn.session_update.assert_called()
# Get the last call's update argument
last_call = mock_conn.session_update.call_args_list[-1]
update = last_call[1].get("update") or last_call[0][1]
assert update.session_update == "agent_message_chunk"
updates = [
call.kwargs.get("update") or call.args[1]
for call in mock_conn.session_update.call_args_list
]
assert any(update.session_update == "agent_message_chunk" for update in updates)
@pytest.mark.asyncio
async def test_prompt_does_not_duplicate_streamed_final_message(self, agent):
"""If ACP already streamed response chunks, final_response should not be sent again."""
new_resp = await agent.new_session(cwd=".")
state = agent.session_manager.get_session(new_resp.session_id)
def mock_run(*args, **kwargs):
state.agent.stream_delta_callback("streamed answer")
return {"final_response": "streamed answer", "messages": []}
state.agent.run_conversation = mock_run
mock_conn = MagicMock(spec=acp.Client)
mock_conn.session_update = AsyncMock()
agent._conn = mock_conn
prompt = [TextContentBlock(type="text", text="hello")]
await agent.prompt(prompt=prompt, session_id=new_resp.session_id)
updates = [
call.kwargs.get("update") or call.args[1]
for call in mock_conn.session_update.call_args_list
]
agent_chunks = [update for update in updates if update.session_update == "agent_message_chunk"]
assert len(agent_chunks) == 1
assert agent_chunks[0].content.text == "streamed answer"
@pytest.mark.asyncio
async def test_prompt_auto_titles_session(self, agent):
@@ -708,6 +844,43 @@ class TestSlashCommands:
assert "2 messages" in result
assert "user: 1" in result
def test_context_shows_usage_and_compression_threshold(self, agent, mock_manager):
state = self._make_state(mock_manager)
state.history = [{"role": "user", "content": "hello"}]
state.agent.context_compressor = MagicMock(
context_length=100_000,
threshold_tokens=80_000,
)
state.agent._cached_system_prompt = "system"
state.agent.tools = [{"type": "function", "function": {"name": "demo"}}]
with patch(
"agent.model_metadata.estimate_request_tokens_rough",
return_value=25_000,
):
result = agent._handle_slash_command("/context", state)
assert "Context usage: ~25,000 / 100,000 tokens (25.0%)" in result
assert "Compression: ~55,000 tokens until threshold (~80,000, 80%)" in result
assert "Tip: run /compact" in result
def test_context_says_compression_due_when_past_threshold(self, agent, mock_manager):
state = self._make_state(mock_manager)
state.history = [{"role": "user", "content": "hello"}]
state.agent.context_compressor = MagicMock(
context_length=100_000,
threshold_tokens=80_000,
)
with patch(
"agent.model_metadata.estimate_request_tokens_rough",
return_value=82_000,
):
result = agent._handle_slash_command("/context", state)
assert "Context usage: ~82,000 / 100,000 tokens (82.0%)" in result
assert "Compression: due now (threshold ~80,000, 80%). Run /compact." in result
def test_reset_clears_history(self, agent, mock_manager):
state = self._make_state(mock_manager)
state.history = [{"role": "user", "content": "hello"}]
@@ -787,7 +960,12 @@ class TestSlashCommands:
resp = await agent.prompt(prompt=prompt, session_id=new_resp.session_id)
assert resp.stop_reason == "end_turn"
mock_conn.session_update.assert_called_once()
updates = [
call.kwargs.get("update") or call.args[1]
for call in mock_conn.session_update.call_args_list
]
assert any(update.session_update == "agent_message_chunk" for update in updates)
assert any(update.session_update == "usage_update" for update in updates)
@pytest.mark.asyncio
async def test_unknown_slash_falls_through_to_llm(self, agent, mock_manager):
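The `/context` assertions in the hunks above pin down some simple arithmetic: percentage of the context window used, tokens remaining until the compression threshold, and the threshold as a share of the window. A minimal sketch of that arithmetic follows. `describe_context` is a hypothetical helper written for illustration only, not the adapter's actual implementation; only the output strings are taken from the tests.

```python
def describe_context(used: int, size: int, threshold: int) -> list[str]:
    """Hypothetical helper: format context usage like the strings the tests assert on."""
    pct = used / size * 100                        # share of the window in use
    threshold_pct = round(threshold / size * 100)  # threshold as a share of the window
    lines = [f"Context usage: ~{used:,} / {size:,} tokens ({pct:.1f}%)"]
    if used >= threshold:
        # Past the threshold: compression is due immediately.
        lines.append(
            f"Compression: due now (threshold ~{threshold:,}, {threshold_pct}%). Run /compact."
        )
    else:
        # Below the threshold: report the remaining headroom.
        lines.append(
            f"Compression: ~{threshold - used:,} tokens until threshold "
            f"(~{threshold:,}, {threshold_pct}%)"
        )
    return lines
```

With the mocked values from the tests (25,000 used, 100,000 window, 80,000 threshold) the headroom is 80,000 - 25,000 = 55,000 tokens; at 82,000 used the helper switches to the "due now" wording.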
+232 -5
@@ -52,6 +52,12 @@ class TestToolKindMap:
def test_tool_kind_execute_code(self):
assert get_tool_kind("execute_code") == "execute"
def test_tool_kind_todo(self):
assert get_tool_kind("todo") == "other"
def test_tool_kind_skill_view(self):
assert get_tool_kind("skill_view") == "read"
def test_tool_kind_browser_navigate(self):
assert get_tool_kind("browser_navigate") == "fetch"
@@ -110,6 +116,25 @@ class TestBuildToolTitle:
title = build_tool_title("web_search", {"query": "python asyncio"})
assert "python asyncio" in title
def test_skill_view_title_includes_skill_name(self):
title = build_tool_title("skill_view", {"name": "github-pitfalls"})
assert title == "skill view (github-pitfalls)"
def test_skill_view_title_includes_linked_file(self):
title = build_tool_title("skill_view", {"name": "github-pitfalls", "file_path": "references/api.md"})
assert title == "skill view (github-pitfalls/references/api.md)"
def test_execute_code_title_includes_first_code_line(self):
title = build_tool_title("execute_code", {"code": "\nfrom hermes_tools import terminal\nprint('done')"})
assert title == "python: from hermes_tools import terminal"
def test_skill_manage_title_includes_action_and_target(self):
title = build_tool_title(
"skill_manage",
{"action": "patch", "name": "hermes-agent-operations", "file_path": "references/acp.md"},
)
assert title == "skill patch: hermes-agent-operations/references/acp.md"
def test_unknown_tool_uses_name(self):
title = build_tool_title("some_new_tool", {"foo": "bar"})
assert title == "some_new_tool"
@@ -164,15 +189,23 @@ class TestBuildToolStart:
assert "ls -la /tmp" in text
def test_build_tool_start_for_read_file(self):
"""read_file should include the path in content."""
"""read_file start should stay compact; completion carries file contents."""
args = {"path": "/etc/hosts", "offset": 1, "limit": 50}
result = build_tool_start("tc-3", "read_file", args)
assert isinstance(result, ToolCallStart)
assert result.kind == "read"
assert len(result.content) >= 1
content_item = result.content[0]
assert isinstance(content_item, ContentToolCallContent)
assert "/etc/hosts" in content_item.content.text
assert result.content is None
assert result.raw_input is None
def test_build_tool_start_for_web_extract_is_compact(self):
"""web_extract start should stay compact; title identifies URLs."""
args = {"urls": ["https://example.com/docs"]}
result = build_tool_start("tc-web-start", "web_extract", args)
assert isinstance(result, ToolCallStart)
assert result.title == "extract: https://example.com/docs"
assert result.kind == "fetch"
assert result.content is None
assert result.raw_input is None
def test_build_tool_start_for_search(self):
"""search_files should include pattern in content."""
@@ -181,6 +214,48 @@ class TestBuildToolStart:
assert isinstance(result, ToolCallStart)
assert result.kind == "search"
assert "TODO" in result.content[0].content.text
assert result.raw_input is None
def test_build_tool_start_for_todo_is_human_readable(self):
args = {"todos": [{"id": "one", "content": "Fix ACP rendering", "status": "in_progress"}]}
result = build_tool_start("tc-todo", "todo", args)
assert result.title == "todo (1 item)"
assert "Fix ACP rendering" in result.content[0].content.text
assert result.raw_input is None
def test_build_tool_start_for_skill_view_is_human_readable(self):
result = build_tool_start("tc-skill", "skill_view", {"name": "github-pitfalls"})
assert result.title == "skill view (github-pitfalls)"
assert "github-pitfalls" in result.content[0].content.text
assert result.raw_input is None
def test_build_tool_start_for_execute_code_shows_code_preview(self):
result = build_tool_start("tc-code", "execute_code", {"code": "print('hello')"})
assert result.kind == "execute"
assert result.title == "python: print('hello')"
assert "```python" in result.content[0].content.text
assert "print('hello')" in result.content[0].content.text
assert result.raw_input is None
def test_build_tool_start_for_skill_manage_patch_shows_diff(self):
result = build_tool_start(
"tc-skill-manage",
"skill_manage",
{
"action": "patch",
"name": "hermes-agent-operations",
"file_path": "references/acp.md",
"old_string": "old advice",
"new_string": "new advice",
},
)
assert result.kind == "edit"
assert result.title == "skill patch: hermes-agent-operations/references/acp.md"
assert isinstance(result.content[0], FileEditToolCallContent)
assert result.content[0].path == "skills/hermes-agent-operations/references/acp.md"
assert result.content[0].old_text == "old advice"
assert result.content[0].new_text == "new advice"
assert result.raw_input is None
def test_build_tool_start_generic_fallback(self):
"""Unknown tools should get a generic text representation."""
@@ -205,6 +280,158 @@ class TestBuildToolComplete:
content_item = result.content[0]
assert isinstance(content_item, ContentToolCallContent)
assert "total 42" in content_item.content.text
assert result.raw_output is None
def test_build_tool_complete_for_todo_is_checklist(self):
result = build_tool_complete(
"tc-todo",
"todo",
'{"todos":[{"id":"a","content":"Inspect ACP","status":"completed"},{"id":"b","content":"Patch renderers","status":"in_progress"}],"summary":{"total":2,"pending":0,"in_progress":1,"completed":1,"cancelled":0}}',
)
text = result.content[0].content.text
assert "✅ Inspect ACP" in text
assert "- 🔄 Patch renderers" in text
assert "**Progress:** 1 completed, 1 in progress, 0 pending" in text
assert result.raw_output is None
def test_build_tool_complete_for_skill_view_summarizes_content_without_raw_json(self):
result = build_tool_complete(
"tc-skill",
"skill_view",
'{"success":true,"name":"github-pitfalls","description":"GitHub gotchas","content":"# GitHub Pitfalls\\nUse gh carefully.","path":"github/github-pitfalls/SKILL.md"}',
)
text = result.content[0].content.text
assert "**Skill loaded**" in text
assert "`github-pitfalls`" in text
assert "GitHub gotchas" in text
assert "GitHub Pitfalls" in text
assert "Use gh carefully" not in text
assert "Full skill content is available to the agent" in text
assert result.raw_output is None
def test_build_tool_complete_for_execute_code_formats_output(self):
result = build_tool_complete("tc-code", "execute_code", '{"output":"hello\\n","exit_code":0}')
text = result.content[0].content.text
assert "Exit code: 0" in text
assert "hello" in text
assert result.raw_output is None
def test_build_tool_complete_for_skill_manage_summarizes_without_raw_json(self):
result = build_tool_complete(
"tc-skill-manage",
"skill_manage",
'{"success":true,"message":"Patched references/hermes-acp-zed-rendering.md in skill \'hermes-agent-operations\' (1 replacement)."}',
function_args={
"action": "patch",
"name": "hermes-agent-operations",
"file_path": "references/hermes-acp-zed-rendering.md",
},
)
text = result.content[0].content.text
assert "**✅ Skill updated**" in text
assert "`patch`" in text
assert "`hermes-agent-operations`" in text
assert "references/hermes-acp-zed-rendering.md" in text
assert "{\"success\"" not in text
assert result.raw_output is None
def test_build_tool_complete_for_read_file_formats_content(self):
result = build_tool_complete(
"tc-read",
"read_file",
'{"content":"1|hello\\n2|world","total_lines":2}',
function_args={"path":"README.md","offset":1,"limit":20},
)
text = result.content[0].content.text
assert "Read README.md" in text
assert "```\n1|hello\n2|world\n```" in text
assert result.raw_output is None
def test_build_tool_complete_for_search_files_formats_matches(self):
result = build_tool_complete(
"tc-search",
"search_files",
'{"total_count":2,"matches":[{"path":"README.md","line":3,"content":"TODO: fix this"},{"path":"src/app.py","line":9,"content":"needle"}],"truncated":true}\n\n[Hint: Results truncated. Use offset=12 to see more.]',
)
text = result.content[0].content.text
assert "Search results" in text
assert "Found 2 matches" in text
assert "README.md:3" in text
assert "TODO: fix this" in text
assert "Results truncated" in text
assert result.raw_output is None
def test_build_tool_complete_for_process_list_formats_table(self):
result = build_tool_complete(
"tc-process",
"process",
'{"processes":[{"session_id":"p1","status":"running","pid":123,"command":"npm run dev"}]}',
function_args={"action":"list"},
)
text = result.content[0].content.text
assert "Processes: 1" in text
assert "`p1`" in text
assert "npm run dev" in text
assert result.raw_output is None
def test_build_tool_complete_for_delegate_task_summarizes_children(self):
result = build_tool_complete(
"tc-delegate",
"delegate_task",
'{"results":[{"task_index":0,"status":"completed","summary":"Reviewed ACP rendering.","model":"gpt-5.5","duration_seconds":3.2,"tool_trace":[{"tool":"read_file"}]}],"total_duration_seconds":3.4}',
)
text = result.content[0].content.text
assert "Delegation results: 1 task" in text
assert "Reviewed ACP rendering" in text
assert "gpt-5.5" in text
assert "Tools: read_file" in text
assert result.raw_output is None
def test_build_tool_complete_for_session_search_recent(self):
result = build_tool_complete(
"tc-session",
"session_search",
'{"success":true,"mode":"recent","results":[{"session_id":"s1","title":"ACP work","last_active":"2026-05-02","message_count":12,"preview":"Polished tool rendering."}],"count":1}',
)
text = result.content[0].content.text
assert "Recent sessions" in text
assert "ACP work" in text
assert "Polished tool rendering" in text
assert result.raw_output is None
def test_build_tool_complete_for_memory_avoids_dumping_entries(self):
result = build_tool_complete(
"tc-memory",
"memory",
'{"success":true,"target":"user","entries":["private long memory"],"usage":"1% — 19/2000 chars","entry_count":1,"message":"Entry added."}',
function_args={"action":"add","target":"user","content":"User likes concise ACP rendering."},
)
text = result.content[0].content.text
assert "Memory add saved" in text
assert "User likes concise ACP rendering" in text
assert "private long memory" not in text
assert result.raw_output is None
def test_build_tool_complete_for_web_extract_success_stays_compact(self):
result = build_tool_complete(
"tc-web-extract",
"web_extract",
'{"results":[{"url":"https://example.com","title":"Example","content":"# Intro\\nThis is extracted content."}]}',
)
assert result.content is None
assert result.raw_output is None
def test_build_tool_complete_for_web_extract_error_shows_error(self):
result = build_tool_complete(
"tc-web-extract-error",
"web_extract",
'{"results":[{"url":"https://example.com","title":"Example","error":"timeout"}]}',
)
text = result.content[0].content.text
assert "Web extract failed" in text
assert "https://example.com" in text
assert "timeout" in text
assert result.raw_output is None
def test_build_tool_complete_truncates_large_output(self):
"""Very large outputs should be truncated."""
+52
@@ -1836,3 +1836,55 @@ class TestResolveMessagesMaxTokens:
result = _resolve_anthropic_messages_max_tokens(0.5, "claude-opus-4-6")
assert result > 0
assert result != 0
# ---------------------------------------------------------------------------
# convert_tools_to_anthropic — tool dedup at API boundary
# ---------------------------------------------------------------------------
class TestConvertToolsToAnthropicDedup:
"""convert_tools_to_anthropic must deduplicate tool names.
Anthropic rejects requests with duplicate tool names. This guard converts
a hard failure into a warning log. See:
https://github.com/NousResearch/hermes-agent/issues/18478
"""
def _make_openai_tool(self, name: str) -> dict:
return {
"type": "function",
"function": {
"name": name,
"description": f"Tool {name}",
"parameters": {"type": "object", "properties": {}},
},
}
def test_unique_tools_pass_through(self):
tools = [self._make_openai_tool("alpha"), self._make_openai_tool("beta")]
result = convert_tools_to_anthropic(tools)
assert len(result) == 2
names = [t["name"] for t in result]
assert names == ["alpha", "beta"]
def test_duplicate_tool_names_are_deduplicated(self):
"""RED test — must fail until dedup guard is added."""
tools = [
self._make_openai_tool("lcm_grep"),
self._make_openai_tool("lcm_describe"),
self._make_openai_tool("lcm_grep"), # duplicate
self._make_openai_tool("lcm_expand"),
self._make_openai_tool("lcm_describe"), # duplicate
]
result = convert_tools_to_anthropic(tools)
names = [t["name"] for t in result]
assert len(names) == len(set(names)), (
f"Duplicate tool names found: {names}"
)
assert len(result) == 3 # lcm_grep, lcm_describe, lcm_expand
def test_empty_tools_returns_empty(self):
assert convert_tools_to_anthropic([]) == []
def test_none_tools_returns_empty(self):
assert convert_tools_to_anthropic(None) == []
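The dedup tests above only require that repeated tool names collapse to one entry and that a warning replaces the hard failure. One way such a guard could look is sketched below. `dedupe_tools_by_name` is a hypothetical name, and keeping the first occurrence of each name is an assumption: the tests check the count and uniqueness, not which duplicate survives.

```python
import logging

logger = logging.getLogger(__name__)


def dedupe_tools_by_name(tools):
    """Hypothetical guard: drop OpenAI-style tool dicts whose function name repeats."""
    seen = set()
    unique = []
    for tool in tools or []:  # treat None like an empty list
        name = tool.get("function", {}).get("name")
        if name in seen:
            # Assumption: the first definition wins; later duplicates are logged and skipped.
            logger.warning("Dropping duplicate tool name: %s", name)
            continue
        seen.add(name)
        unique.append(tool)
    return unique
```

Applied to the five-tool list from the test (`lcm_grep`, `lcm_describe`, `lcm_grep`, `lcm_expand`, `lcm_describe`), this keeps three tools in their original order.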
+191
@@ -16,6 +16,7 @@ from agent.auxiliary_client import (
auxiliary_max_tokens_param,
call_llm,
async_call_llm,
_build_call_kwargs,
_read_codex_access_token,
_get_provider_chain,
_is_payment_error,
@@ -1752,3 +1753,193 @@ class TestVisionAutoSkipsKimiCoding:
"kimi-coding",
"kimi-coding-cn",
})
# ---------------------------------------------------------------------------
# _build_call_kwargs — tool dedup at API boundary
# ---------------------------------------------------------------------------
class TestBuildCallKwargsToolDedup:
"""_build_call_kwargs must deduplicate tool names before passing to API.
Providers like Google Vertex, Azure, and Bedrock reject requests with
duplicate tool names (HTTP 400). This guard converts a hard failure into
a warning log so agent turns succeed even if an upstream injection path
regresses. See: https://github.com/NousResearch/hermes-agent/issues/18478
"""
def _make_tool(self, name: str) -> dict:
return {
"type": "function",
"function": {
"name": name,
"description": f"Tool {name}",
"parameters": {"type": "object", "properties": {}},
},
}
def test_unique_tools_pass_through_unchanged(self):
tools = [self._make_tool("alpha"), self._make_tool("beta")]
kwargs = _build_call_kwargs(
provider="openai", model="gpt-4o", messages=[], tools=tools,
)
assert len(kwargs["tools"]) == 2
names = [t["function"]["name"] for t in kwargs["tools"]]
assert names == ["alpha", "beta"]
def test_duplicate_tool_names_are_deduplicated(self):
"""RED test — must fail until dedup guard is added."""
tools = [
self._make_tool("lcm_grep"),
self._make_tool("lcm_describe"),
self._make_tool("lcm_grep"), # duplicate
self._make_tool("lcm_expand"),
self._make_tool("lcm_describe"), # duplicate
]
kwargs = _build_call_kwargs(
provider="google", model="gemini-2.5-pro", messages=[], tools=tools,
)
result_tools = kwargs["tools"]
names = [t["function"]["name"] for t in result_tools]
# Must be deduplicated — no repeated names
assert len(names) == len(set(names)), (
f"Duplicate tool names found: {names}"
)
assert len(result_tools) == 3 # lcm_grep, lcm_describe, lcm_expand
def test_empty_tools_unchanged(self):
kwargs = _build_call_kwargs(
provider="openai", model="gpt-4o", messages=[], tools=[],
)
assert kwargs.get("tools") == [] or "tools" not in kwargs
def test_none_tools_unchanged(self):
kwargs = _build_call_kwargs(
provider="openai", model="gpt-4o", messages=[], tools=None,
)
assert "tools" not in kwargs
@pytest.fixture(autouse=True)
def _clean_env(monkeypatch):
"""Strip provider env vars so each test starts clean."""
for key in (
"OPENROUTER_API_KEY", "OPENAI_BASE_URL", "OPENAI_API_KEY",
):
monkeypatch.delenv(key, raising=False)
class TestOpenRouterExplicitApiKey:
"""Test that explicit_api_key is correctly propagated to _try_openrouter()."""
def test_resolve_provider_client_passes_explicit_api_key_to_openrouter(
self, monkeypatch
):
"""
When resolve_provider_client() is called with explicit_api_key for OpenRouter,
the explicit key should be passed to the OpenAI client instead of falling back
to OPENROUTER_API_KEY env var.
"""
# Set up env var as fallback (should NOT be used when explicit_api_key is provided)
monkeypatch.setenv("OPENROUTER_API_KEY", "env-fallback-key")
# Mock OpenAI to capture the api_key used
mock_openai = MagicMock()
mock_openai.return_value = MagicMock(name="openrouter-client")
with patch("agent.auxiliary_client.OpenAI", mock_openai):
client, model = resolve_provider_client(
provider="openrouter",
explicit_api_key="explicit-pool-key",
)
# Verify a client was created
assert client is not None
# Verify the explicit key was used, not the env var fallback
mock_openai.assert_called_once()
call_kwargs = mock_openai.call_args[1]
assert call_kwargs["api_key"] == "explicit-pool-key", (
f"Expected explicit_api_key to be passed, got: {call_kwargs['api_key']}"
)
assert call_kwargs["api_key"] != "env-fallback-key", (
"Should NOT fall back to OPENROUTER_API_KEY when explicit_api_key is provided"
)
def test_resolve_provider_client_without_explicit_api_key_falls_back_to_env(
self, monkeypatch
):
"""
When resolve_provider_client() is called WITHOUT explicit_api_key for OpenRouter,
it should fall back to OPENROUTER_API_KEY env var.
"""
# Set up env var as fallback (should be used when explicit_api_key is NOT provided)
monkeypatch.setenv("OPENROUTER_API_KEY", "env-fallback-key")
# Mock OpenAI to capture the api_key used
mock_openai = MagicMock()
mock_openai.return_value = MagicMock(name="openrouter-client")
with patch("agent.auxiliary_client.OpenAI", mock_openai):
client, model = resolve_provider_client(
provider="openrouter",
explicit_api_key=None,
)
# Verify a client was created
assert client is not None
# Verify the env var fallback was used
mock_openai.assert_called_once()
call_kwargs = mock_openai.call_args[1]
assert call_kwargs["api_key"] == "env-fallback-key", (
f"Expected env fallback key to be used when explicit_api_key is None, got: {call_kwargs['api_key']}"
)
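The precedence rule exercised by the two tests above (explicit key first, environment variable second) can be sketched in isolation. The function name `pick_api_key` is an assumption made for illustration; the real resolution happens inside `resolve_provider_client()` and `_try_openrouter()`, whose internals are not shown in this diff.

```python
import os
from typing import Optional


def pick_api_key(
    explicit_api_key: Optional[str],
    env_var: str = "OPENROUTER_API_KEY",
) -> Optional[str]:
    """Hypothetical helper: an explicit key wins, the env var is only a fallback."""
    explicit = (explicit_api_key or "").strip()
    if explicit:
        return explicit
    # No usable explicit key: fall back to the environment (None if unset or empty).
    return os.environ.get(env_var) or None
```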
class TestAnthropicExplicitApiKey:
"""Test that explicit_api_key is correctly propagated to _try_anthropic().
Parity with the OpenRouter fix in #18768: resolve_provider_client() passes
explicit_api_key to _try_openrouter(), but the anthropic branch was not
updated: _try_anthropic() always fell back to resolve_anthropic_token()
even when an explicit key was supplied (e.g. from a fallback_model entry).
"""
def test_try_anthropic_uses_explicit_api_key_over_env(self):
"""_try_anthropic(explicit_api_key) must use the supplied key, not the env fallback."""
with patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="env-fallback-key"), \
patch("agent.anthropic_adapter.build_anthropic_client") as mock_build, \
patch("agent.auxiliary_client._select_pool_entry", return_value=(False, None)):
mock_build.return_value = MagicMock()
from agent.auxiliary_client import _try_anthropic
client, model = _try_anthropic("explicit-pool-key")
assert client is not None
assert mock_build.call_args.args[0] == "explicit-pool-key", (
f"Expected explicit_api_key to be passed, got: {mock_build.call_args.args[0]}"
)
assert mock_build.call_args.args[0] != "env-fallback-key"
def test_try_anthropic_without_explicit_key_falls_back_to_resolve(self):
"""Without explicit_api_key, _try_anthropic falls back to resolve_anthropic_token."""
with patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="env-fallback-key"), \
patch("agent.anthropic_adapter.build_anthropic_client") as mock_build, \
patch("agent.auxiliary_client._select_pool_entry", return_value=(False, None)):
mock_build.return_value = MagicMock()
from agent.auxiliary_client import _try_anthropic
client, model = _try_anthropic()
assert client is not None
assert mock_build.call_args.args[0] == "env-fallback-key"
def test_resolve_provider_client_passes_explicit_api_key_to_anthropic(self):
"""resolve_provider_client(provider='anthropic', explicit_api_key=...) must propagate the key."""
with patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="env-key"), \
patch("agent.anthropic_adapter.build_anthropic_client") as mock_build, \
patch("agent.auxiliary_client._select_pool_entry", return_value=(False, None)):
mock_build.return_value = MagicMock()
client, model = resolve_provider_client(
provider="anthropic",
explicit_api_key="explicit-fallback-key",
)
assert client is not None
assert mock_build.call_args.args[0] == "explicit-fallback-key", (
"resolve_provider_client must forward explicit_api_key to _try_anthropic()"
)
@@ -348,6 +348,64 @@ def test_load_pool_seeds_env_api_key(tmp_path, monkeypatch):
assert entry.access_token == "sk-or-seeded"
def test_load_pool_prefers_dotenv_over_stale_os_environ(tmp_path, monkeypatch):
"""Regression for #18254: stale OPENROUTER_API_KEY in os.environ (inherited
from a parent shell) must NOT shadow the fresh key in ~/.hermes/.env when
seeding the credential pool. Before the fix, `get_env_value()` preferred
os.environ and silently wrote the stale value into auth.json, causing
persistent 401 errors after key rotation.
"""
hermes_home = tmp_path / "hermes"
hermes_home.mkdir()
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
# Simulate the bug: parent shell exported a stale test key
monkeypatch.setenv("OPENROUTER_API_KEY", "sk-or-STALE-from-shell")
# User edited ~/.hermes/.env with the fresh key
(hermes_home / ".env").write_text(
"OPENROUTER_API_KEY=sk-or-FRESH-from-dotenv\n"
)
_write_auth_store(tmp_path, {"version": 1, "providers": {}})
from agent.credential_pool import load_pool
pool = load_pool("openrouter")
entry = pool.select()
assert entry is not None
assert entry.source == "env:OPENROUTER_API_KEY"
# The fresh key from .env must win over the stale shell export
assert entry.access_token == "sk-or-FRESH-from-dotenv", (
f"Expected .env to win, got {entry.access_token!r}"
)
def test_load_pool_falls_back_to_os_environ_when_dotenv_empty(tmp_path, monkeypatch):
"""When ~/.hermes/.env does not define OPENROUTER_API_KEY (typical Docker /
K8s / systemd deployment), seeding must still pick up the key from
os.environ. Guards against regressions that would break production
deployments relying on runtime-injected env vars.
"""
hermes_home = tmp_path / "hermes"
hermes_home.mkdir()
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.setenv("OPENROUTER_API_KEY", "sk-or-from-runtime-env")
# .env exists but does not define OPENROUTER_API_KEY
(hermes_home / ".env").write_text("SOME_OTHER_VAR=unrelated\n")
_write_auth_store(tmp_path, {"version": 1, "providers": {}})
from agent.credential_pool import load_pool
pool = load_pool("openrouter")
entry = pool.select()
assert entry is not None
assert entry.access_token == "sk-or-from-runtime-env"
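The lookup order described in the two docstrings above (the `.env` file wins, `os.environ` is the fallback) can be sketched like this. Both function names are hypothetical, and the tiny `KEY=value` parser is a stand-in written for this example; the project's actual `get_env_value()` may parse and cache differently.

```python
import os
from pathlib import Path
from typing import Optional


def read_dotenv(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines; comments and blank lines are skipped."""
    values: dict[str, str] = {}
    if not path.exists():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_env_value_sketch(name: str, dotenv_path: Path) -> Optional[str]:
    """Prefer the value on disk; use the process environment only if it is absent."""
    value = read_dotenv(dotenv_path).get(name)
    if value:
        return value
    return os.environ.get(name) or None
```

Reading the file first means a key rotated in `.env` takes effect even when a long-lived parent shell still exports the old value, while container deployments that inject the key at runtime keep working.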
def test_load_pool_removes_stale_seeded_env_entry(tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
@@ -645,6 +645,86 @@ def test_review_model_honors_auxiliary_curator_slot(curator_env):
)
def test_review_runtime_passes_auxiliary_curator_credentials(curator_env):
"""Per-slot api_key/base_url must ride into resolve_runtime_provider (not main-only creds)."""
curator = curator_env["curator"]
cfg = {
"model": {"provider": "openrouter", "default": "openai/gpt-5.5"},
"auxiliary": {
"curator": {
"provider": "custom",
"model": "local-mini",
"api_key": "sk-curator-only",
"base_url": "http://localhost:11434/v1",
},
},
}
binding = curator._resolve_review_runtime(cfg)
assert binding.provider == "custom"
assert binding.model == "local-mini"
assert binding.explicit_api_key == "sk-curator-only"
assert binding.explicit_base_url == "http://localhost:11434/v1"
def test_review_runtime_strips_blank_aux_credentials(curator_env):
curator = curator_env["curator"]
cfg = {
"model": {"provider": "openrouter", "default": "openai/gpt-5.5"},
"auxiliary": {
"curator": {
"provider": "openrouter",
"model": "x/y",
"api_key": " ",
"base_url": "",
},
},
}
binding = curator._resolve_review_runtime(cfg)
assert binding.explicit_api_key is None
assert binding.explicit_base_url is None
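The blank-credential handling asserted above amounts to a small normalization step. A sketch, assuming a helper named `clean_credential` (hypothetical; the diff does not show how `_resolve_review_runtime` does this internally):

```python
from typing import Any, Optional


def clean_credential(value: Any) -> Optional[str]:
    """Hypothetical helper: whitespace-only or non-string values become None."""
    if not isinstance(value, str):
        return None
    stripped = value.strip()
    return stripped or None
```

Collapsing blanks to `None` lets downstream code treat "not configured" uniformly instead of sending an empty API key to a provider.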
def test_review_runtime_ignores_auxiliary_credentials_when_using_main(curator_env):
"""Falling through to main model must not pick up stray auxiliary.curator secrets."""
curator = curator_env["curator"]
cfg = {
"model": {"provider": "openrouter", "default": "openai/gpt-5.5"},
"auxiliary": {
"curator": {
"provider": "auto",
"model": "",
"api_key": "must-not-leak",
"base_url": "http://curator-slot-ignored/",
},
},
}
binding = curator._resolve_review_runtime(cfg)
assert (binding.provider, binding.model) == ("openrouter", "openai/gpt-5.5")
assert binding.explicit_api_key is None
assert binding.explicit_base_url is None
def test_review_runtime_legacy_auxiliary_carry_credentials(curator_env, caplog):
curator = curator_env["curator"]
cfg = {
"model": {"provider": "openrouter", "default": "openai/gpt-5.5"},
"curator": {
"auxiliary": {
"provider": "custom",
"model": "m",
"api_key": "legacy-key",
"base_url": "http://legacy/v1",
},
},
}
import logging
with caplog.at_level(logging.INFO, logger="agent.curator"):
binding = curator._resolve_review_runtime(cfg)
assert binding.explicit_api_key == "legacy-key"
assert binding.explicit_base_url == "http://legacy/v1"
assert any("deprecated curator.auxiliary" in rec.message for rec in caplog.records)
def test_review_model_auxiliary_curator_partial_override_falls_back(curator_env):
"""Only one of slot provider/model set → fall back to the main pair.
@@ -314,3 +314,281 @@ def test_dry_run_skips_snapshot(backup_env, monkeypatch):
assert not any(r.get("reason") == "pre-curator-run" for r in rows), (
"dry-run must not create a pre-run snapshot"
)
# ---------------------------------------------------------------------------
# cron-jobs backup + rollback (the part issue #18671's follow-up adds)
# ---------------------------------------------------------------------------
def _write_cron_jobs(home: Path, jobs: list) -> Path:
"""Write a synthetic cron/jobs.json under HERMES_HOME. Returns the path.
Mirrors cron.jobs.save_jobs() wrapper shape: `{"jobs": [...], "updated_at": ...}`.
"""
cron_dir = home / "cron"
cron_dir.mkdir(parents=True, exist_ok=True)
path = cron_dir / "jobs.json"
path.write_text(
json.dumps({"jobs": jobs, "updated_at": "2026-05-01T00:00:00Z"}, indent=2),
encoding="utf-8",
)
return path
def _reload_cron_jobs(home: Path):
"""Reload cron.jobs so its module-level HERMES_DIR picks up the tmp HOME."""
import hermes_constants
importlib.reload(hermes_constants)
if "cron.jobs" in sys.modules:
import cron.jobs as _cj
importlib.reload(_cj)
else:
import cron.jobs as _cj # noqa: F401
import cron.jobs as cj
return cj
def test_snapshot_includes_cron_jobs(backup_env):
"""With a cron/jobs.json present, snapshot writes cron-jobs.json and records it in manifest."""
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
_write_cron_jobs(backup_env["home"], [
{"id": "job-a", "name": "a", "schedule": "every 1h", "skills": ["alpha"]},
{"id": "job-b", "name": "b", "schedule": "every 2h", "skill": "alpha"},
])
snap = cb.snapshot_skills(reason="test")
assert snap is not None
assert (snap / cb.CRON_JOBS_FILENAME).exists()
mf = json.loads((snap / "manifest.json").read_text(encoding="utf-8"))
assert mf["cron_jobs"]["backed_up"] is True
assert mf["cron_jobs"]["jobs_count"] == 2
def test_snapshot_without_cron_jobs_file_still_succeeds(backup_env):
"""No cron/jobs.json on disk → snapshot succeeds, manifest records absence."""
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
# Deliberately do not create ~/.hermes/cron/jobs.json
snap = cb.snapshot_skills(reason="test")
assert snap is not None
assert not (snap / cb.CRON_JOBS_FILENAME).exists()
mf = json.loads((snap / "manifest.json").read_text(encoding="utf-8"))
assert mf["cron_jobs"]["backed_up"] is False
assert "cron/jobs.json" in mf["cron_jobs"]["reason"]
def test_snapshot_cron_jobs_malformed_json_still_captured(backup_env):
"""Malformed jobs.json is still copied to the snapshot (fidelity over
validation); the manifest notes the parse warning."""
cb = backup_env["cb"]
_write_skill(backup_env["skills"], "alpha")
(backup_env["home"] / "cron").mkdir()
(backup_env["home"] / "cron" / "jobs.json").write_text("{oh no", encoding="utf-8")
snap = cb.snapshot_skills(reason="test")
assert snap is not None
# Raw file was copied even though we couldn't parse it
assert (snap / cb.CRON_JOBS_FILENAME).read_text() == "{oh no"
mf = json.loads((snap / "manifest.json").read_text(encoding="utf-8"))
assert mf["cron_jobs"]["backed_up"] is True
assert mf["cron_jobs"]["jobs_count"] == 0
assert "parse_warning" in mf["cron_jobs"]
def test_rollback_restores_cron_skill_links(backup_env):
"""End-to-end: snapshot with job [alpha,beta], curator-style in-place
rewrite to [umbrella], then rollback: skills are restored to [alpha, beta]."""
cb = backup_env["cb"]
home = backup_env["home"]
_write_skill(backup_env["skills"], "alpha")
_write_skill(backup_env["skills"], "beta")
_write_skill(backup_env["skills"], "umbrella")
cj = _reload_cron_jobs(home)
cj.create_job(name="weekly", prompt="p", schedule="every 7d",
skills=["alpha", "beta"])
snap = cb.snapshot_skills(reason="pre-curator-run")
assert snap is not None
# Simulate the curator's in-place cron rewrite after consolidation
cj.rewrite_skill_refs(
consolidated={"alpha": "umbrella", "beta": "umbrella"},
pruned=[],
)
live_after_curator = cj.load_jobs()
assert live_after_curator[0]["skills"] == ["umbrella"]
# Now roll back
ok, msg, _ = cb.rollback(backup_id=snap.name)
assert ok, msg
assert "cron links" in msg
live_after_rollback = cj.load_jobs()
# skills restored; legacy `skill` mirror follows first element
assert live_after_rollback[0]["skills"] == ["alpha", "beta"]
def test_rollback_only_touches_skill_fields(backup_env):
"""Every field other than skills/skill must remain untouched across rollback.
Schedule, enabled, prompt, timestamps: all live state, hands off."""
cb = backup_env["cb"]
home = backup_env["home"]
_write_skill(backup_env["skills"], "alpha")
# Hand-rolled jobs.json with varied fields (no real create_job — we want
# exact field control).
_write_cron_jobs(home, [{
"id": "stable-id",
"name": "original-name",
"prompt": "original prompt",
"schedule": "every 1h",
"skills": ["alpha"],
"enabled": True,
"last_run_at": "2026-04-01T00:00:00Z",
}])
snap = cb.snapshot_skills(reason="pre-curator-run")
assert snap is not None
# User/scheduler activity AFTER the snapshot: rename the job, change
# the schedule, update timestamps, and (curator) rewrite the skills list.
cj = _reload_cron_jobs(home)
jobs = cj.load_jobs()
jobs[0]["name"] = "renamed-since-snapshot"
jobs[0]["schedule"] = "every 30m"
jobs[0]["last_run_at"] = "2026-05-01T12:00:00Z"
jobs[0]["skills"] = ["umbrella"] # pretend curator did this
cj.save_jobs(jobs)
ok, _, _ = cb.rollback(backup_id=snap.name)
assert ok
after = cj.load_jobs()
job = after[0]
# skills: restored
assert job["skills"] == ["alpha"]
# everything else: untouched (live state preserved)
assert job["name"] == "renamed-since-snapshot"
assert job["schedule"] == "every 30m"
assert job["last_run_at"] == "2026-05-01T12:00:00Z"
assert job["prompt"] == "original prompt"
def test_rollback_skips_jobs_the_user_deleted(backup_env):
"""If the user deleted a cron job after the snapshot, rollback must
NOT resurrect it: the user's delete is a later, explicit choice."""
cb = backup_env["cb"]
home = backup_env["home"]
_write_skill(backup_env["skills"], "alpha")
_write_cron_jobs(home, [
{"id": "keep-me", "name": "keep", "schedule": "every 1h", "skills": ["alpha"]},
{"id": "delete-me", "name": "gone", "schedule": "every 1h", "skills": ["alpha"]},
])
snap = cb.snapshot_skills(reason="pre-curator-run")
# User deletes one job after the snapshot
cj = _reload_cron_jobs(home)
cj.save_jobs([j for j in cj.load_jobs() if j["id"] != "delete-me"])
ok, _, _ = cb.rollback(backup_id=snap.name)
assert ok
live_after = cj.load_jobs()
live_ids = {j["id"] for j in live_after}
assert "keep-me" in live_ids
assert "delete-me" not in live_ids # not resurrected
def test_rollback_leaves_new_jobs_untouched(backup_env):
"""Jobs created AFTER the snapshot must pass through rollback unchanged."""
cb = backup_env["cb"]
home = backup_env["home"]
_write_skill(backup_env["skills"], "alpha")
_write_cron_jobs(home, [
{"id": "original", "name": "o", "schedule": "every 1h", "skills": ["alpha"]},
])
snap = cb.snapshot_skills(reason="pre-curator-run")
cj = _reload_cron_jobs(home)
jobs = cj.load_jobs()
jobs.append({"id": "new-after-snapshot", "name": "new",
"schedule": "every 15m", "skills": ["brand-new-skill"]})
cj.save_jobs(jobs)
ok, _, _ = cb.rollback(backup_id=snap.name)
assert ok
live = cj.load_jobs()
by_id = {j["id"]: j for j in live}
assert "new-after-snapshot" in by_id
# New job's fields completely preserved
assert by_id["new-after-snapshot"]["skills"] == ["brand-new-skill"]
assert by_id["new-after-snapshot"]["schedule"] == "every 15m"
def test_rollback_with_snapshot_missing_cron_succeeds(backup_env):
"""Older snapshots (created before this feature shipped) have no
cron-jobs.json. Rollback must still restore the skills tree and not
error out."""
cb = backup_env["cb"]
home = backup_env["home"]
_write_skill(backup_env["skills"], "alpha")
# No cron/jobs.json at snapshot time — simulates a pre-feature snapshot
snap = cb.snapshot_skills(reason="test")
assert snap is not None
assert not (snap / cb.CRON_JOBS_FILENAME).exists()
# Later the user created a cron job
_write_cron_jobs(home, [
{"id": "later-job", "name": "l", "schedule": "every 1h", "skills": ["x"]},
])
ok, msg, _ = cb.rollback(backup_id=snap.name)
# Main rollback still succeeds; cron report notes the missing file.
assert ok, msg
# jobs.json untouched (nothing to restore from)
cj = _reload_cron_jobs(home)
jobs = cj.load_jobs()
assert jobs[0]["id"] == "later-job"
assert jobs[0]["skills"] == ["x"]
def test_restore_cron_skill_links_standalone(backup_env):
"""Unit-level test on _restore_cron_skill_links without the full rollback.
Verifies the report structure carefully."""
cb = backup_env["cb"]
home = backup_env["home"]
# Prime a snapshot dir manually with cron-jobs.json
backups_dir = home / "skills" / ".curator_backups" / "fake-id"
backups_dir.mkdir(parents=True)
(backups_dir / cb.CRON_JOBS_FILENAME).write_text(json.dumps([
{"id": "job-1", "name": "one", "skills": ["narrow-a", "narrow-b"]},
{"id": "job-2", "name": "two", "skill": "legacy-single"},
{"id": "job-gone", "name": "deleted", "skills": ["whatever"]},
]), encoding="utf-8")
# Live jobs: job-1 got rewritten, job-2 unchanged, job-gone deleted
_write_cron_jobs(home, [
{"id": "job-1", "name": "one", "skills": ["umbrella"], "schedule": "every 1h"},
{"id": "job-2", "name": "two", "skill": "legacy-single", "schedule": "every 1h"},
{"id": "job-new", "name": "new", "skills": ["x"], "schedule": "every 1h"},
])
_reload_cron_jobs(home)
report = cb._restore_cron_skill_links(backups_dir)
assert report["attempted"] is True
assert report["error"] is None
assert report["unchanged"] == 1 # job-2 matched
assert len(report["restored"]) == 1 # job-1 got restored
assert report["restored"][0]["job_id"] == "job-1"
assert report["restored"][0]["to"]["skills"] == ["narrow-a", "narrow-b"]
assert len(report["skipped_missing"]) == 1
assert report["skipped_missing"][0]["job_id"] == "job-gone"
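The rollback rules covered by the tests above (restore only `skills`/`skill`, match jobs by `id`, never resurrect deleted jobs, leave newer jobs alone) can be sketched as a pure function. The name `restore_skill_links` and the exact report shape are assumptions modelled on the assertions; the real `_restore_cron_skill_links` also reads and writes files and reports `attempted`/`error`, which this sketch omits.

```python
from typing import Any

SKILL_FIELDS = ("skills", "skill")


def restore_skill_links(
    snapshot_jobs: list[dict[str, Any]],
    live_jobs: list[dict[str, Any]],
) -> dict[str, Any]:
    """Hypothetical sketch: copy skill fields from the snapshot onto live jobs, in place."""
    snapshot_by_id = {job["id"]: job for job in snapshot_jobs if "id" in job}
    live_ids = {job.get("id") for job in live_jobs}
    report: dict[str, Any] = {"restored": [], "unchanged": 0, "skipped_missing": []}
    for job in live_jobs:
        snap = snapshot_by_id.get(job.get("id"))
        if snap is None:
            continue  # created after the snapshot: pass through untouched
        before = {key: job.get(key) for key in SKILL_FIELDS}
        wanted = {key: snap.get(key) for key in SKILL_FIELDS}
        if before == wanted:
            report["unchanged"] += 1
            continue
        for key in SKILL_FIELDS:
            if wanted[key] is None:
                job.pop(key, None)
            else:
                job[key] = wanted[key]
        report["restored"].append({"job_id": job["id"], "from": before, "to": wanted})
    for job_id in snapshot_by_id:
        if job_id not in live_ids:
            # Deleted by the user after the snapshot: report it, do not recreate it.
            report["skipped_missing"].append({"job_id": job_id})
    return report
```

Iterating over the live jobs rather than the snapshot is what keeps schedule, name, and timestamps as live state: only the two skill fields are ever written.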
@@ -220,6 +220,81 @@ def test_classify_handles_malformed_arguments_string(curator_env):
assert len(result["pruned"]) == 1
def test_classify_no_false_positive_short_name_in_file_path(curator_env):
"""Short skill name that is a substring of another filename = pruned, not consolidated."""
# e.g. "api" should NOT match "references/api-design.md"
result = curator_env._classify_removed_skills(
removed=["api"],
added=[],
after_names={"conventions"},
tool_calls=[
{
"name": "skill_manage",
"arguments": json.dumps({
"action": "write_file",
"name": "conventions",
"file_path": "references/api-design.md",
"file_content": "# API Design\n...",
}),
},
],
)
assert result["consolidated"] == [], (
"Short name 'api' should NOT match file_path 'references/api-design.md'"
)
assert len(result["pruned"]) == 1
assert result["pruned"][0]["name"] == "api"
def test_classify_no_false_positive_short_name_in_content(curator_env):
"""Short skill name embedded in longer word in content = pruned, not consolidated."""
# e.g. "test" should NOT match content "running latest tests"
result = curator_env._classify_removed_skills(
removed=["test"],
added=[],
after_names={"umbrella"},
tool_calls=[
{
"name": "skill_manage",
"arguments": json.dumps({
"action": "patch",
"name": "umbrella",
"old_string": "old",
"new_string": "running latest tests with pytest",
}),
},
],
)
assert result["consolidated"] == [], (
"Short name 'test' should NOT match 'latest' via word boundary"
)
assert len(result["pruned"]) == 1
def test_classify_still_matches_exact_word_in_content(curator_env):
"""Word-boundary match still works for exact word occurrences."""
# "api" SHOULD match content "use the api gateway"
result = curator_env._classify_removed_skills(
removed=["api"],
added=[],
after_names={"gateway"},
tool_calls=[
{
"name": "skill_manage",
"arguments": json.dumps({
"action": "edit",
"name": "gateway",
"content": "# Gateway\n\nUse the api gateway for all requests.\n",
}),
},
],
)
assert len(result["consolidated"]) == 1, (
"'api' should match as a standalone word in content"
)
assert result["consolidated"][0]["into"] == "gateway"
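One way to get the matching behaviour the three tests above describe is a regex with lookarounds that treat hyphens as part of a name. This is a sketch under that assumption: `mentions_skill` is a hypothetical name, and `_classify_removed_skills` may implement the check differently. Note that a plain `\b` boundary would not be enough here, because `\b` matches between `api` and `-` in `api-design.md`.

```python
import re


def mentions_skill(name: str, text: str) -> bool:
    """Hypothetical check: `name` appears as a whole token, not inside a longer one."""
    # Reject matches that are glued to a word character or a hyphen on either side.
    pattern = rf"(?<![\w-]){re.escape(name)}(?![\w-])"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None
```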
def test_report_md_splits_consolidated_and_pruned_sections(curator_env):
"""End-to-end: REPORT.md shows both sections distinctly."""
curator = curator_env
@@ -548,3 +623,266 @@ def test_reconcile_model_block_visible_in_full_report(curator_env):
md = (run_dir / "REPORT.md").read_text()
assert "duplicate content, now a subsection" in md
assert "pre-curator junk" in md
# ---------------------------------------------------------------------------
# _extract_absorbed_into_declarations — authoritative signal from delete calls
# ---------------------------------------------------------------------------
def test_extract_absorbed_into_picks_up_consolidation(curator_env):
"""Delete call with absorbed_into=<umbrella> yields a declaration."""
declarations = curator_env._extract_absorbed_into_declarations([
{
"name": "skill_manage",
"arguments": json.dumps({
"action": "delete",
"name": "narrow-skill",
"absorbed_into": "umbrella",
}),
},
])
assert declarations == {
"narrow-skill": {"into": "umbrella", "declared": True},
}
def test_extract_absorbed_into_empty_string_is_explicit_prune(curator_env):
"""absorbed_into='' is recorded as an explicit prune declaration."""
declarations = curator_env._extract_absorbed_into_declarations([
{
"name": "skill_manage",
"arguments": json.dumps({
"action": "delete",
"name": "stale",
"absorbed_into": "",
}),
},
])
assert declarations == {"stale": {"into": "", "declared": True}}
def test_extract_absorbed_into_missing_arg_ignored(curator_env):
"""Delete call without absorbed_into is skipped — fallback to heuristic."""
declarations = curator_env._extract_absorbed_into_declarations([
{
"name": "skill_manage",
"arguments": json.dumps({
"action": "delete",
"name": "legacy-skill",
}),
},
])
assert declarations == {}
def test_extract_absorbed_into_ignores_non_delete_actions(curator_env):
"""Patch, create, write_file etc. must not leak into declarations."""
declarations = curator_env._extract_absorbed_into_declarations([
{
"name": "skill_manage",
"arguments": json.dumps({
"action": "patch",
"name": "umbrella",
"old_string": "...",
"new_string": "...",
"absorbed_into": "something", # bogus on non-delete, must be ignored
}),
},
])
assert declarations == {}
def test_extract_absorbed_into_accepts_dict_arguments(curator_env):
"""arguments can arrive as a dict (defensive path) — still works."""
declarations = curator_env._extract_absorbed_into_declarations([
{
"name": "skill_manage",
"arguments": {
"action": "delete",
"name": "narrow",
"absorbed_into": "umbrella",
},
},
])
assert declarations == {"narrow": {"into": "umbrella", "declared": True}}
def test_extract_absorbed_into_strips_whitespace(curator_env):
declarations = curator_env._extract_absorbed_into_declarations([
{
"name": "skill_manage",
"arguments": json.dumps({
"action": "delete",
"name": " narrow ",
"absorbed_into": " umbrella ",
}),
},
])
assert declarations == {"narrow": {"into": "umbrella", "declared": True}}
def test_extract_absorbed_into_ignores_non_skill_manage_calls(curator_env):
declarations = curator_env._extract_absorbed_into_declarations([
{"name": "terminal", "arguments": json.dumps({"command": "ls"})},
{"name": "read_file", "arguments": json.dumps({"path": "/tmp/x"})},
])
assert declarations == {}
def test_extract_absorbed_into_handles_malformed_arguments(curator_env):
"""Garbage JSON in arguments must not crash the extractor."""
declarations = curator_env._extract_absorbed_into_declarations([
{"name": "skill_manage", "arguments": "{not json"},
{"name": "skill_manage", "arguments": None},
{"name": "skill_manage"}, # no arguments key at all
])
assert declarations == {}
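Taken together, the extractor tests above describe a small, defensive parser over tool calls. A minimal sketch, assuming the function name `extract_absorbed_into` (hypothetical) and the declaration shape shown in the assertions:

```python
import json
from typing import Any


def extract_absorbed_into(tool_calls: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Hypothetical sketch: collect absorbed_into declarations from skill_manage deletes."""
    declarations: dict[str, dict[str, Any]] = {}
    for call in tool_calls:
        if call.get("name") != "skill_manage":
            continue
        args = call.get("arguments")
        if isinstance(args, str):
            try:
                args = json.loads(args)
            except json.JSONDecodeError:
                continue  # garbage JSON must not crash the extractor
        if not isinstance(args, dict) or args.get("action") != "delete":
            continue
        name = args.get("name")
        target = args.get("absorbed_into")
        if not isinstance(name, str) or not isinstance(target, str):
            continue  # no declaration: leave this skill to the heuristic
        # An empty target is kept on purpose: it records an explicit prune.
        declarations[name.strip()] = {"into": target.strip(), "declared": True}
    return declarations
```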
# ---------------------------------------------------------------------------
# _reconcile_classification with absorbed_into declarations (authoritative)
# ---------------------------------------------------------------------------
def test_reconcile_absorbed_into_beats_everything_else(curator_env):
"""Model declared absorbed_into at delete; YAML/heuristic disagree — declaration wins.
This is the exact #18671 regression: the model forgets to emit the YAML
summary block, and the heuristic's substring match misses because the
umbrella's patch content doesn't literally contain the old skill's
slug. Previously this fell through to 'no-evidence fallback' prune,
which dropped the cron ref instead of rewriting. With absorbed_into
declared, the model tells us directly.
"""
out = curator_env._reconcile_classification(
removed=["pr-review-format"],
heuristic={"consolidated": [], "pruned": [{"name": "pr-review-format"}]},
model_block={"consolidations": [], "prunings": []}, # model forgot YAML block
destinations={"hermes-agent-dev"},
absorbed_declarations={
"pr-review-format": {"into": "hermes-agent-dev", "declared": True},
},
)
assert len(out["consolidated"]) == 1
assert out["pruned"] == []
e = out["consolidated"][0]
assert e["name"] == "pr-review-format"
assert e["into"] == "hermes-agent-dev"
assert "absorbed_into" in e["source"]
def test_reconcile_absorbed_into_empty_is_explicit_prune(curator_env):
"""absorbed_into='' takes precedence and routes to pruned, not fallback."""
out = curator_env._reconcile_classification(
removed=["stale"],
heuristic={"consolidated": [], "pruned": [{"name": "stale"}]},
model_block={"consolidations": [], "prunings": []},
destinations=set(),
absorbed_declarations={
"stale": {"into": "", "declared": True},
},
)
assert out["consolidated"] == []
assert len(out["pruned"]) == 1
assert "model-declared prune" in out["pruned"][0]["source"]
def test_reconcile_absorbed_into_nonexistent_target_falls_through(curator_env):
"""If the declared umbrella doesn't exist in destinations, fall through to
heuristic/YAML logic. Shouldn't happen in practice (the tool validates at
delete time) but the reconciler is defensive."""
out = curator_env._reconcile_classification(
removed=["thing"],
heuristic={
"consolidated": [{"name": "thing", "into": "real-umbrella", "evidence": "..."}],
"pruned": [],
},
model_block={"consolidations": [], "prunings": []},
destinations={"real-umbrella"},
absorbed_declarations={
"thing": {"into": "ghost-umbrella", "declared": True},
},
)
assert len(out["consolidated"]) == 1
assert out["consolidated"][0]["into"] == "real-umbrella"
assert "tool-call audit" in out["consolidated"][0]["source"]
def test_reconcile_declaration_preserves_yaml_reason(curator_env):
"""When the model both declared absorbed_into AND emitted YAML with reason,
the reason carries through so REPORT.md still has it."""
out = curator_env._reconcile_classification(
removed=["narrow"],
heuristic={"consolidated": [], "pruned": []},
model_block={
"consolidations": [{
"from": "narrow",
"into": "umbrella",
"reason": "duplicate of umbrella's main content",
}],
"prunings": [],
},
destinations={"umbrella"},
absorbed_declarations={
"narrow": {"into": "umbrella", "declared": True},
},
)
assert len(out["consolidated"]) == 1
e = out["consolidated"][0]
assert e["into"] == "umbrella"
assert "absorbed_into" in e["source"]
assert e["reason"] == "duplicate of umbrella's main content"
def test_reconcile_without_declarations_preserves_legacy_behavior(curator_env):
"""Backward compat: no absorbed_declarations arg → all existing logic intact."""
out = curator_env._reconcile_classification(
removed=["thing"],
heuristic={
"consolidated": [{"name": "thing", "into": "umbrella", "evidence": "..."}],
"pruned": [],
},
model_block={"consolidations": [], "prunings": []},
destinations={"umbrella"},
# no absorbed_declarations — defaults to None → behaves identically to pre-change
)
assert len(out["consolidated"]) == 1
assert out["consolidated"][0]["into"] == "umbrella"
def test_reconcile_mixed_declarations_and_legacy_calls(curator_env):
"""Real-world run: some deletes declared absorbed_into, some didn't.
Declared ones use the authoritative path; others fall through to YAML/heuristic.
"""
out = curator_env._reconcile_classification(
removed=["declared-cons", "declared-prune", "legacy-cons", "legacy-prune"],
heuristic={
"consolidated": [
{"name": "legacy-cons", "into": "umbrella-a", "evidence": "..."},
],
"pruned": [{"name": "legacy-prune"}],
},
model_block={"consolidations": [], "prunings": []},
destinations={"umbrella-a", "umbrella-b"},
absorbed_declarations={
"declared-cons": {"into": "umbrella-b", "declared": True},
"declared-prune": {"into": "", "declared": True},
},
)
cons_by_name = {e["name"]: e for e in out["consolidated"]}
pruned_by_name = {e["name"]: e for e in out["pruned"]}
assert "declared-cons" in cons_by_name
assert cons_by_name["declared-cons"]["into"] == "umbrella-b"
assert "absorbed_into" in cons_by_name["declared-cons"]["source"]
assert "legacy-cons" in cons_by_name
assert cons_by_name["legacy-cons"]["into"] == "umbrella-a"
assert "tool-call audit" in cons_by_name["legacy-cons"]["source"]
assert "declared-prune" in pruned_by_name
assert "model-declared prune" in pruned_by_name["declared-prune"]["source"]
assert "legacy-prune" in pruned_by_name
assert "no-evidence fallback" in pruned_by_name["legacy-prune"]["source"]
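The precedence these reconcile tests establish can be summarized per removed skill: a valid declaration wins, an unknown declared target falls through, and the heuristic is consulted before the no-evidence prune. The sketch below is a simplification under stated assumptions: `classify_removed` and its `kind` field are hypothetical, and the YAML `model_block` path of `_reconcile_classification` is deliberately left out.

```python
from typing import Any, Optional


def classify_removed(
    name: str,
    destinations: set[str],
    heuristic_into: Optional[str] = None,
    declaration: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Hypothetical sketch of the precedence order for one removed skill."""
    if declaration and declaration.get("declared"):
        target = declaration.get("into", "")
        if target == "":
            return {"name": name, "kind": "pruned", "source": "model-declared prune"}
        if target in destinations:
            return {"name": name, "kind": "consolidated", "into": target,
                    "source": "absorbed_into declaration"}
        # Declared target does not exist: fall through to weaker evidence.
    if heuristic_into and heuristic_into in destinations:
        return {"name": name, "kind": "consolidated", "into": heuristic_into,
                "source": "tool-call audit"}
    return {"name": name, "kind": "pruned", "source": "no-evidence fallback"}
```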
@@ -0,0 +1,284 @@
"""Tests for OpenRouter response caching header injection."""
from types import SimpleNamespace
from unittest.mock import patch
import pytest
# ---------------------------------------------------------------------------
# build_or_headers
# ---------------------------------------------------------------------------
class TestBuildOrHeaders:
"""Test the build_or_headers() helper in agent/auxiliary_client.py."""
def test_base_attribution_always_present(self):
"""Attribution headers must always be included regardless of cache setting."""
from agent.auxiliary_client import build_or_headers
headers = build_or_headers(or_config={"response_cache": False})
assert headers["HTTP-Referer"] == "https://hermes-agent.nousresearch.com"
assert headers["X-OpenRouter-Title"] == "Hermes Agent"
assert headers["X-OpenRouter-Categories"] == "productivity,cli-agent"
def test_cache_enabled(self):
"""When response_cache is True, X-OpenRouter-Cache header is set."""
from agent.auxiliary_client import build_or_headers
headers = build_or_headers(or_config={"response_cache": True})
assert headers["X-OpenRouter-Cache"] == "true"
def test_cache_disabled(self):
"""When response_cache is False, no cache header is sent."""
from agent.auxiliary_client import build_or_headers
headers = build_or_headers(or_config={"response_cache": False})
assert "X-OpenRouter-Cache" not in headers
assert "X-OpenRouter-Cache-TTL" not in headers
def test_cache_disabled_by_default_empty_config(self):
"""Empty config dict means no cache headers (response_cache defaults to False)."""
from agent.auxiliary_client import build_or_headers
headers = build_or_headers(or_config={})
assert "X-OpenRouter-Cache" not in headers
def test_ttl_default(self):
"""Default TTL (300) is included when cache is enabled."""
from agent.auxiliary_client import build_or_headers
headers = build_or_headers(or_config={"response_cache": True, "response_cache_ttl": 300})
assert headers["X-OpenRouter-Cache-TTL"] == "300"
def test_ttl_custom(self):
"""Custom TTL values within range are sent."""
from agent.auxiliary_client import build_or_headers
headers = build_or_headers(or_config={"response_cache": True, "response_cache_ttl": 3600})
assert headers["X-OpenRouter-Cache-TTL"] == "3600"
def test_ttl_max(self):
"""Maximum TTL (86400) is accepted."""
from agent.auxiliary_client import build_or_headers
headers = build_or_headers(or_config={"response_cache": True, "response_cache_ttl": 86400})
assert headers["X-OpenRouter-Cache-TTL"] == "86400"
def test_ttl_out_of_range_too_high(self):
"""TTL above 86400 is silently ignored (no TTL header sent)."""
from agent.auxiliary_client import build_or_headers
headers = build_or_headers(or_config={"response_cache": True, "response_cache_ttl": 100000})
assert "X-OpenRouter-Cache-TTL" not in headers
# But cache is still enabled
assert headers["X-OpenRouter-Cache"] == "true"
def test_ttl_out_of_range_zero(self):
"""TTL of 0 is below minimum — no TTL header sent."""
from agent.auxiliary_client import build_or_headers
headers = build_or_headers(or_config={"response_cache": True, "response_cache_ttl": 0})
assert "X-OpenRouter-Cache-TTL" not in headers
def test_ttl_negative(self):
"""Negative TTL is ignored."""
from agent.auxiliary_client import build_or_headers
headers = build_or_headers(or_config={"response_cache": True, "response_cache_ttl": -5})
assert "X-OpenRouter-Cache-TTL" not in headers
def test_ttl_not_a_number(self):
"""Non-numeric TTL is ignored."""
from agent.auxiliary_client import build_or_headers
headers = build_or_headers(or_config={"response_cache": True, "response_cache_ttl": "five"})
assert "X-OpenRouter-Cache-TTL" not in headers
def test_ttl_float_truncated(self):
"""Float TTL values are truncated to int."""
from agent.auxiliary_client import build_or_headers
headers = build_or_headers(or_config={"response_cache": True, "response_cache_ttl": 600.7})
assert headers["X-OpenRouter-Cache-TTL"] == "600"
def test_returns_fresh_dict(self):
"""Each call returns a new dict so mutations don't leak."""
from agent.auxiliary_client import build_or_headers
cfg = {"response_cache": True}
h1 = build_or_headers(or_config=cfg)
h2 = build_or_headers(or_config=cfg)
assert h1 is not h2
assert h1 == h2
def test_none_config_falls_back_to_load_config(self):
"""When or_config is None, build_or_headers reads from load_config()."""
from agent.auxiliary_client import build_or_headers
fake_cfg = {
"openrouter": {"response_cache": True, "response_cache_ttl": 900},
}
with patch("hermes_cli.config.load_config", return_value=fake_cfg):
headers = build_or_headers(or_config=None)
assert headers["X-OpenRouter-Cache"] == "true"
assert headers["X-OpenRouter-Cache-TTL"] == "900"
def test_none_config_load_config_fails_gracefully(self):
"""When load_config() fails, build_or_headers still returns base headers."""
from agent.auxiliary_client import build_or_headers
with patch("hermes_cli.config.load_config", side_effect=RuntimeError("boom")):
headers = build_or_headers(or_config=None)
# Should have base attribution but no cache headers
assert "HTTP-Referer" in headers
assert "X-OpenRouter-Cache" not in headers
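Reviewer note: the TTL cases above imply a small validation rule (integers from 1 to 86400, floats truncated, anything else dropped). The sketch below restates that rule in isolation. `parse_ttl` and `cache_headers` are hypothetical names used for illustration and are not the helpers in `agent/auxiliary_client.py`; only the header names and bounds are taken from the tests.

```python
from typing import Any, Dict, Optional

MIN_TTL, MAX_TTL = 1, 86400  # bounds implied by the tests, not read from the real module


def parse_ttl(value: Any) -> Optional[int]:
    """Return an in-range integer TTL, or None when the value should be dropped."""
    try:
        ttl = int(value)  # int(600.7) -> 600; int("five") and int(None) raise
    except (TypeError, ValueError, OverflowError):
        return None
    return ttl if MIN_TTL <= ttl <= MAX_TTL else None


def cache_headers(or_config: Dict[str, Any]) -> Dict[str, str]:
    """Build only the cache-related headers from a config dict (hypothetical helper)."""
    headers: Dict[str, str] = {}
    if not or_config.get("response_cache", False):
        return headers
    headers["X-OpenRouter-Cache"] = "true"
    ttl = parse_ttl(or_config.get("response_cache_ttl"))
    if ttl is not None:
        headers["X-OpenRouter-Cache-TTL"] = str(ttl)
    return headers
```

An out-of-range TTL leaves the cache header in place and only omits the TTL header, which matches what `test_ttl_out_of_range_too_high` asserts.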
# ---------------------------------------------------------------------------
# Environment variable overrides
# ---------------------------------------------------------------------------
class TestEnvVarOverrides:
"""Test env var precedence over config.yaml for response caching."""
def test_env_enables_cache(self, monkeypatch):
"""HERMES_OPENROUTER_CACHE=true enables cache even when config disables it."""
from agent.auxiliary_client import build_or_headers
monkeypatch.setenv("HERMES_OPENROUTER_CACHE", "true")
headers = build_or_headers(or_config={"response_cache": False})
assert headers["X-OpenRouter-Cache"] == "true"
def test_env_disables_cache(self, monkeypatch):
"""HERMES_OPENROUTER_CACHE=false disables cache even when config enables it."""
from agent.auxiliary_client import build_or_headers
monkeypatch.setenv("HERMES_OPENROUTER_CACHE", "false")
headers = build_or_headers(or_config={"response_cache": True})
assert "X-OpenRouter-Cache" not in headers
@pytest.mark.parametrize("value", ["1", "true", "TRUE", "yes", "Yes", "on"])
def test_truthy_values(self, monkeypatch, value):
"""Various truthy strings enable caching."""
from agent.auxiliary_client import build_or_headers
monkeypatch.setenv("HERMES_OPENROUTER_CACHE", value)
headers = build_or_headers(or_config={})
assert headers["X-OpenRouter-Cache"] == "true"
    @pytest.mark.parametrize("value", ["0", "false", "no", "off", "maybe", ""])
    def test_non_truthy_values(self, monkeypatch, value):
        """Non-truthy strings do not enable caching (empty falls through to config)."""
        from agent.auxiliary_client import build_or_headers
        monkeypatch.setenv("HERMES_OPENROUTER_CACHE", value)
        # Empty string falls through to config; others are explicitly non-truthy
        if value == "":
            # Empty env var falls through to config default (False)
            headers = build_or_headers(or_config={"response_cache": False})
        else:
            headers = build_or_headers(or_config={"response_cache": True})
        assert "X-OpenRouter-Cache" not in headers
def test_env_ttl_overrides_config(self, monkeypatch):
"""HERMES_OPENROUTER_CACHE_TTL overrides config TTL."""
from agent.auxiliary_client import build_or_headers
monkeypatch.setenv("HERMES_OPENROUTER_CACHE", "true")
monkeypatch.setenv("HERMES_OPENROUTER_CACHE_TTL", "1800")
headers = build_or_headers(or_config={"response_cache_ttl": 300})
assert headers["X-OpenRouter-Cache-TTL"] == "1800"
@pytest.mark.parametrize("ttl", ["0", "86401", "abc", "-1", "12.5"])
def test_invalid_env_ttl_dropped(self, monkeypatch, ttl):
"""Invalid TTL env values are ignored; cache still enabled without TTL."""
from agent.auxiliary_client import build_or_headers
monkeypatch.setenv("HERMES_OPENROUTER_CACHE", "1")
monkeypatch.setenv("HERMES_OPENROUTER_CACHE_TTL", ttl)
headers = build_or_headers(or_config={})
assert headers["X-OpenRouter-Cache"] == "true"
assert "X-OpenRouter-Cache-TTL" not in headers
@pytest.mark.parametrize("ttl", ["1", "300", "86400"])
def test_valid_env_ttl_boundaries(self, monkeypatch, ttl):
"""Boundary TTL values (1, 300, 86400) are accepted."""
from agent.auxiliary_client import build_or_headers
monkeypatch.setenv("HERMES_OPENROUTER_CACHE", "yes")
monkeypatch.setenv("HERMES_OPENROUTER_CACHE_TTL", ttl)
assert build_or_headers(or_config={})["X-OpenRouter-Cache-TTL"] == ttl
def test_no_env_vars_falls_through_to_config(self, monkeypatch):
"""Without env vars, config.yaml controls behavior."""
from agent.auxiliary_client import build_or_headers
monkeypatch.delenv("HERMES_OPENROUTER_CACHE", raising=False)
monkeypatch.delenv("HERMES_OPENROUTER_CACHE_TTL", raising=False)
headers = build_or_headers(or_config={"response_cache": True, "response_cache_ttl": 600})
assert headers["X-OpenRouter-Cache"] == "true"
assert headers["X-OpenRouter-Cache-TTL"] == "600"
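Reviewer note: the precedence exercised by `TestEnvVarOverrides` can be summarised as "a non-empty env var decides, otherwise config decides". The sketch below is one way to express that, under two assumptions that the tests do not settle: an invalid env TTL is dropped rather than falling back to the config TTL, and the env flag is compared case-insensitively without trimming. `cache_enabled` and `effective_ttl` are hypothetical names.

```python
from typing import Any, Mapping, Optional

TRUTHY = {"1", "true", "yes", "on"}  # values accepted by test_truthy_values


def cache_enabled(or_config: Mapping[str, Any], env: Mapping[str, str]) -> bool:
    """A non-empty HERMES_OPENROUTER_CACHE overrides config in either direction."""
    raw = env.get("HERMES_OPENROUTER_CACHE", "")
    if raw == "":
        # Unset or empty: fall through to config.yaml
        return bool(or_config.get("response_cache", False))
    return raw.lower() in TRUTHY


def effective_ttl(or_config: Mapping[str, Any], env: Mapping[str, str]) -> Optional[int]:
    """Pick the TTL source, then validate it against the 1..86400 range."""
    raw = env.get("HERMES_OPENROUTER_CACHE_TTL", "")
    source = raw if raw != "" else or_config.get("response_cache_ttl")
    try:
        ttl = int(source)  # int("12.5") raises, so fractional env strings are dropped
    except (TypeError, ValueError, OverflowError):
        return None
    return ttl if 1 <= ttl <= 86400 else None
```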
class TestDefaultConfig:
"""Verify the openrouter config section is in DEFAULT_CONFIG."""
def test_openrouter_section_exists(self):
from hermes_cli.config import DEFAULT_CONFIG
assert "openrouter" in DEFAULT_CONFIG
or_cfg = DEFAULT_CONFIG["openrouter"]
assert or_cfg["response_cache"] is True
assert or_cfg["response_cache_ttl"] == 300
# ---------------------------------------------------------------------------
# _check_openrouter_cache_status
# ---------------------------------------------------------------------------
class TestCheckOpenrouterCacheStatus:
"""Test the _check_openrouter_cache_status method on AIAgent."""
def _make_agent(self):
"""Create a minimal AIAgent-like object with just the method under test."""
from run_agent import AIAgent
# Use object.__new__ to skip __init__, then set the attributes we need
agent = object.__new__(AIAgent)
agent._or_cache_hits = 0
return agent
def test_hit_increments_counter(self):
agent = self._make_agent()
resp = SimpleNamespace(headers={"x-openrouter-cache-status": "HIT"})
agent._check_openrouter_cache_status(resp)
assert agent._or_cache_hits == 1
# Second hit increments
agent._check_openrouter_cache_status(resp)
assert agent._or_cache_hits == 2
def test_miss_does_not_increment(self):
agent = self._make_agent()
resp = SimpleNamespace(headers={"x-openrouter-cache-status": "MISS"})
agent._check_openrouter_cache_status(resp)
assert getattr(agent, "_or_cache_hits", 0) == 0
def test_no_header_is_noop(self):
agent = self._make_agent()
resp = SimpleNamespace(headers={})
agent._check_openrouter_cache_status(resp)
assert getattr(agent, "_or_cache_hits", 0) == 0
def test_none_response_is_safe(self):
agent = self._make_agent()
agent._check_openrouter_cache_status(None) # no crash
def test_no_headers_attr_is_safe(self):
agent = self._make_agent()
agent._check_openrouter_cache_status(object()) # no crash
def test_case_insensitive(self):
agent = self._make_agent()
resp = SimpleNamespace(headers={"x-openrouter-cache-status": "hit"})
agent._check_openrouter_cache_status(resp)
assert agent._or_cache_hits == 1
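Reviewer note: the behaviour pinned down by `TestCheckOpenrouterCacheStatus` is a tolerant header check that only counts hits. A minimal sketch follows, assuming the response exposes a dict-like `headers` attribute; `CacheHitCounter` is a hypothetical stand-in and not the `AIAgent` method itself.

```python
from typing import Any


class CacheHitCounter:
    """Counts responses whose x-openrouter-cache-status header equals HIT."""

    def __init__(self) -> None:
        self.hits = 0

    def check(self, response: Any) -> None:
        # None and objects without a headers attribute are ignored
        headers = getattr(response, "headers", None)
        if not headers:
            return
        try:
            status = headers.get("x-openrouter-cache-status")
        except AttributeError:
            return  # headers is not dict-like
        if isinstance(status, str) and status.upper() == "HIT":
            self.hits += 1
```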
+52
@@ -125,6 +125,58 @@ class TestScanSkillCommands:
assert "/knowledge-brain" in result
assert result["/knowledge-brain"]["name"] == "knowledge-brain"
def test_get_skill_commands_rescans_when_platform_scope_changes(self, tmp_path):
"""Platform-specific disabled-skill caches must not leak across platforms.
Regression test for #14536: a gateway process serving Telegram
and Discord concurrently would seed the process-global cache
with whichever platform scanned first, and subsequent
``get_skill_commands()`` calls from the other platform silently
inherited that filter.
"""
import agent.skill_commands as sc_mod
from agent.skill_commands import get_skill_commands
def _disabled_skills():
platform = os.getenv("HERMES_PLATFORM")
if platform == "telegram":
return {"telegram-only"}
if platform == "discord":
return {"discord-only"}
return set()
with (
patch("tools.skills_tool.SKILLS_DIR", tmp_path),
patch("tools.skills_tool._get_disabled_skill_names", side_effect=_disabled_skills),
patch.object(sc_mod, "_skill_commands", {}),
patch.object(sc_mod, "_skill_commands_platform", None),
):
_make_skill(tmp_path, "shared")
_make_skill(tmp_path, "telegram-only")
_make_skill(tmp_path, "discord-only")
with patch.dict(os.environ, {"HERMES_PLATFORM": "telegram"}):
telegram_commands = dict(get_skill_commands())
assert "/shared" in telegram_commands
assert "/discord-only" in telegram_commands
assert "/telegram-only" not in telegram_commands
with patch.dict(os.environ, {"HERMES_PLATFORM": "discord"}):
discord_commands = dict(get_skill_commands())
assert "/shared" in discord_commands
assert "/telegram-only" in discord_commands
assert "/discord-only" not in discord_commands
# Switching back to telegram must also rescan — not re-serve
# the discord view that was just cached.
with patch.dict(os.environ, {"HERMES_PLATFORM": "telegram"}):
telegram_again = dict(get_skill_commands())
assert "/telegram-only" not in telegram_again
assert "/discord-only" in telegram_again
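Reviewer note: the regression above comes down to a process-global cache that must remember which platform it was built for. The sketch below shows that idea with a hypothetical `PlatformScopedCache` class; the real module uses the `_skill_commands` and `_skill_commands_platform` globals patched in the test, and its scan logic is not reproduced here.

```python
from typing import Callable, Dict, Optional


class PlatformScopedCache:
    """Caches a scan result together with the platform it was computed for."""

    def __init__(self, scan: Callable[[Optional[str]], Dict[str, dict]]) -> None:
        self._scan = scan
        self._loaded = False
        self._platform: Optional[str] = None
        self._commands: Dict[str, dict] = {}

    def get(self, platform: Optional[str]) -> Dict[str, dict]:
        # Rescan on first use and whenever the requesting platform changes
        if not self._loaded or platform != self._platform:
            self._commands = self._scan(platform)
            self._platform = platform
            self._loaded = True
        return self._commands
```

Keying on a single "last platform" value keeps memory flat but rescans on every switch; a per-platform dict would trade memory for fewer scans.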
def test_special_chars_stripped_from_cmd_key(self, tmp_path):
"""Skill names with +, /, or other special chars produce clean cmd keys."""
@@ -126,6 +126,20 @@ class TestCodexBuildKwargs:
)
assert kw.get("extra_headers", {}).get("x-grok-conv-id") == "conv-123"
def test_xai_headers_preserve_request_override_headers(self, transport):
messages = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(
model="grok-3", messages=messages, tools=[],
session_id="conv-123",
is_xai_responses=True,
request_overrides={"extra_headers": {"X-Test": "1", "X-Trace": "abc"}},
)
assert kw.get("extra_headers") == {
"X-Test": "1",
"X-Trace": "abc",
"x-grok-conv-id": "conv-123",
}
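Reviewer note: the expected dict above implies that caller-supplied `extra_headers` are copied first and the conversation header is added on top. A sketch of that merge, using a hypothetical `merge_extra_headers` helper; which side wins if the caller also sets `x-grok-conv-id` is an assumption, since the test does not cover it.

```python
from typing import Any, Dict, Optional


def merge_extra_headers(
    request_overrides: Optional[Dict[str, Any]],
    session_id: Optional[str],
) -> Dict[str, str]:
    """Copy caller headers, then add the conversation id header."""
    headers = dict((request_overrides or {}).get("extra_headers") or {})
    if session_id:
        # Assumption: the session-derived value overrides a caller-supplied one
        headers["x-grok-conv-id"] = session_id
    return headers
```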
def test_minimal_effort_clamped(self, transport):
messages = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(
+80 -86
@@ -1,107 +1,101 @@
"""Tests that load_cli_config() guards against lazy-import TERMINAL_CWD clobbering.
"""Tests for CLI/TUI CWD resolution in load_cli_config().
When the gateway resolves TERMINAL_CWD at startup and cli.py is later
imported lazily (via delegate_tool CLI_CONFIG), load_cli_config() must
not overwrite the already-resolved value with os.getcwd().
config.yaml terminal.cwd is the canonical source of truth.
.env TERMINAL_CWD and MESSAGING_CWD are deprecated.
See issue #10817.
Rules:
- Local backend CLI/TUI: always os.getcwd(), ignoring config and inherited env.
- Non-local with placeholder: pop cwd for backend default.
- Non-local with explicit path: keep as-is.
"""
import os
import pytest
# The sentinel values that mean "resolve at runtime"
_CWD_PLACEHOLDERS = (".", "auto", "cwd")
def _resolve_terminal_cwd(terminal_config: dict, defaults: dict, env: dict):
"""Simulate the CWD resolution logic from load_cli_config().
def _resolve_cwd(terminal_config: dict, defaults: dict, env: dict):
"""Mirror the CWD resolution logic from cli.py load_cli_config()."""
effective_backend = terminal_config.get("env_type", "local")
This mirrors the code in cli.py that checks for a pre-resolved
TERMINAL_CWD before falling back to os.getcwd().
"""
if terminal_config.get("cwd") in _CWD_PLACEHOLDERS:
_existing_cwd = env.get("TERMINAL_CWD", "")
if _existing_cwd and _existing_cwd not in _CWD_PLACEHOLDERS and os.path.isabs(_existing_cwd):
terminal_config["cwd"] = _existing_cwd
defaults["terminal"]["cwd"] = _existing_cwd
else:
effective_backend = terminal_config.get("env_type", "local")
if effective_backend == "local":
terminal_config["cwd"] = "/fake/getcwd" # stand-in for os.getcwd()
defaults["terminal"]["cwd"] = terminal_config["cwd"]
else:
terminal_config.pop("cwd", None)
if effective_backend == "local":
terminal_config["cwd"] = "/fake/getcwd"
defaults["terminal"]["cwd"] = terminal_config["cwd"]
elif terminal_config.get("cwd") in _CWD_PLACEHOLDERS:
terminal_config.pop("cwd", None)
# Simulate the bridging loop: write terminal_config["cwd"] to env
_file_has_terminal = defaults.get("_file_has_terminal", False)
# Bridge: TERMINAL_CWD always exported in CLI, skipped in gateway
_is_gateway = env.get("_HERMES_GATEWAY") == "1"
if "cwd" in terminal_config:
if _file_has_terminal or "TERMINAL_CWD" not in env:
if _is_gateway:
pass # don't touch env
else:
env["TERMINAL_CWD"] = str(terminal_config["cwd"])
return env.get("TERMINAL_CWD", "")
class TestLazyImportGuard:
"""TERMINAL_CWD resolved by gateway must survive a lazy cli.py import."""
class TestLocalBackendCli:
"""Local backend always uses os.getcwd()."""
def test_gateway_resolved_cwd_survives(self):
"""Gateway set TERMINAL_CWD → lazy cli import must not clobber."""
env = {"TERMINAL_CWD": "/home/user/workspace"}
terminal_config = {"cwd": ".", "env_type": "local"}
defaults = {"terminal": {"cwd": "."}, "_file_has_terminal": False}
result = _resolve_terminal_cwd(terminal_config, defaults, env)
assert result == "/home/user/workspace"
def test_gateway_resolved_cwd_survives_with_file_terminal(self):
"""Even when config.yaml has a terminal: section, resolved CWD survives."""
env = {"TERMINAL_CWD": "/home/user/workspace"}
terminal_config = {"cwd": ".", "env_type": "local"}
defaults = {"terminal": {"cwd": "."}, "_file_has_terminal": True}
result = _resolve_terminal_cwd(terminal_config, defaults, env)
assert result == "/home/user/workspace"
class TestConfigCwdResolution:
"""config.yaml terminal.cwd is the canonical source of truth."""
def test_explicit_config_cwd_wins(self):
"""terminal.cwd: /explicit/path always wins."""
env = {"TERMINAL_CWD": "/old/gateway/value"}
terminal_config = {"cwd": "/explicit/path"}
defaults = {"terminal": {"cwd": "/explicit/path"}, "_file_has_terminal": True}
result = _resolve_terminal_cwd(terminal_config, defaults, env)
assert result == "/explicit/path"
def test_dot_cwd_resolves_to_getcwd_when_no_prior(self):
"""With no pre-set TERMINAL_CWD, "." resolves to os.getcwd()."""
def test_explicit_config_ignored(self):
env = {}
terminal_config = {"cwd": "."}
defaults = {"terminal": {"cwd": "."}, "_file_has_terminal": False}
tc = {"cwd": "/explicit/path", "env_type": "local"}
d = {"terminal": {"cwd": "/explicit/path"}}
assert _resolve_cwd(tc, d, env) == "/fake/getcwd"
result = _resolve_terminal_cwd(terminal_config, defaults, env)
def test_inherited_env_overwritten(self):
env = {"TERMINAL_CWD": "/parent/hermes"}
tc = {"cwd": "/home/user", "env_type": "local"}
d = {"terminal": {"cwd": "/home/user"}}
assert _resolve_cwd(tc, d, env) == "/fake/getcwd"
def test_placeholder_resolved(self):
env = {}
tc = {"cwd": "."}
d = {"terminal": {"cwd": "."}}
assert _resolve_cwd(tc, d, env) == "/fake/getcwd"
def test_env_and_no_config_file(self):
env = {"TERMINAL_CWD": "/stale/value"}
tc = {"cwd": ".", "env_type": "local"}
d = {"terminal": {"cwd": "."}}
assert _resolve_cwd(tc, d, env) == "/fake/getcwd"
class TestNonLocalBackends:
"""Non-local backends use config or per-backend defaults."""
def test_placeholder_popped(self):
env = {}
tc = {"cwd": ".", "env_type": "docker"}
d = {"terminal": {"cwd": "."}}
assert _resolve_cwd(tc, d, env) == ""
def test_explicit_path_kept(self):
env = {}
tc = {"cwd": "/srv/app", "env_type": "ssh"}
d = {"terminal": {"cwd": "/srv/app"}}
assert _resolve_cwd(tc, d, env) == "/srv/app"
def test_auto_placeholder_popped(self):
env = {}
tc = {"cwd": "auto", "env_type": "modal"}
d = {"terminal": {"cwd": "auto"}}
assert _resolve_cwd(tc, d, env) == ""
class TestGatewayLazyImport:
"""Gateway lazy import of cli.py must not clobber TERMINAL_CWD."""
def test_gateway_cwd_preserved(self):
env = {"_HERMES_GATEWAY": "1", "TERMINAL_CWD": "/home/user/project"}
tc = {"cwd": "/home/user", "env_type": "local"}
d = {"terminal": {"cwd": "/home/user"}}
result = _resolve_cwd(tc, d, env)
assert result == "/home/user/project"
def test_cli_overwrites_stale_env(self):
env = {"TERMINAL_CWD": "/stale/from/dotenv"}
tc = {"cwd": "/home/user", "env_type": "local"}
d = {"terminal": {"cwd": "/home/user"}}
result = _resolve_cwd(tc, d, env)
assert result == "/fake/getcwd"
def test_remote_backend_pops_cwd(self):
"""Remote backend + placeholder cwd → popped for backend default."""
env = {}
terminal_config = {"cwd": ".", "env_type": "docker"}
defaults = {"terminal": {"cwd": "."}, "_file_has_terminal": False}
result = _resolve_terminal_cwd(terminal_config, defaults, env)
assert result == "" # cwd popped, no env var set
def test_remote_backend_with_prior_cwd_preserves(self):
"""Remote backend + pre-resolved TERMINAL_CWD → adopted."""
env = {"TERMINAL_CWD": "/project"}
terminal_config = {"cwd": ".", "env_type": "docker"}
defaults = {"terminal": {"cwd": "."}, "_file_has_terminal": False}
result = _resolve_terminal_cwd(terminal_config, defaults, env)
assert result == "/project"
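Reviewer note: because this hunk interleaves removed and added lines without markers, the new rules are easier to follow as a standalone function. The sketch below is a reading of the new-side helper (`_resolve_cwd`) as exercised by the new test classes; `getcwd` is injected as a parameter purely for illustration, and this is not the code in `cli.py`.

```python
from typing import Callable, Dict

CWD_PLACEHOLDERS = (".", "auto", "cwd")


def resolve_cwd(terminal_config: dict, env: Dict[str, str], getcwd: Callable[[], str]) -> str:
    """Apply the three rules from the new module docstring and return TERMINAL_CWD."""
    backend = terminal_config.get("env_type", "local")
    if backend == "local":
        # Local backend: always the launch directory, ignoring config and inherited env
        terminal_config["cwd"] = getcwd()
    elif terminal_config.get("cwd") in CWD_PLACEHOLDERS:
        # Non-local with placeholder: drop cwd so the backend default applies
        terminal_config.pop("cwd", None)
    # Non-local with explicit path: kept as-is

    is_gateway = env.get("_HERMES_GATEWAY") == "1"
    if "cwd" in terminal_config and not is_gateway:
        # CLI exports the resolved value; the gateway leaves its own value untouched
        env["TERMINAL_CWD"] = str(terminal_config["cwd"])
    return env.get("TERMINAL_CWD", "")
```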
+68
@@ -647,6 +647,74 @@ class TestGetDueJobs:
assert get_due_jobs() == []
assert get_job("oneshot-stale")["next_run_at"] is None
def test_broken_cron_without_next_run_is_recovered(self, tmp_cron_dir, monkeypatch):
now = datetime(2026, 3, 18, 10, 0, 0, tzinfo=timezone.utc)
monkeypatch.setattr("cron.jobs._hermes_now", lambda: now)
save_jobs(
[{
"id": "cron-recover",
"name": "AI Daily Digest",
"prompt": "...",
"schedule": {"kind": "cron", "expr": "0 12 * * *", "display": "0 12 * * *"},
"schedule_display": "0 12 * * *",
"repeat": {"times": None, "completed": 0},
"enabled": True,
"state": "scheduled",
"paused_at": None,
"paused_reason": None,
"created_at": "2026-03-18T09:00:00+00:00",
"next_run_at": None,
"last_run_at": None,
"last_status": None,
"last_error": None,
"deliver": "local",
"origin": None,
}]
)
assert get_due_jobs() == []
recovered = get_job("cron-recover")["next_run_at"]
assert recovered is not None
recovered_dt = datetime.fromisoformat(recovered)
if recovered_dt.tzinfo is None:
recovered_dt = recovered_dt.replace(tzinfo=timezone.utc)
assert recovered_dt > now
def test_broken_interval_without_next_run_is_recovered(self, tmp_cron_dir, monkeypatch):
now = datetime(2026, 3, 18, 10, 0, 0, tzinfo=timezone.utc)
monkeypatch.setattr("cron.jobs._hermes_now", lambda: now)
save_jobs(
[{
"id": "interval-recover",
"name": "Hourly heartbeat",
"prompt": "...",
"schedule": {"kind": "interval", "minutes": 60, "display": "every 60m"},
"schedule_display": "every 1h",
"repeat": {"times": None, "completed": 0},
"enabled": True,
"state": "scheduled",
"paused_at": None,
"paused_reason": None,
"created_at": "2026-03-18T09:00:00+00:00",
"next_run_at": None,
"last_run_at": None,
"last_status": None,
"last_error": None,
"deliver": "local",
"origin": None,
}]
)
assert get_due_jobs() == []
recovered = get_job("interval-recover")["next_run_at"]
assert recovered is not None
recovered_dt = datetime.fromisoformat(recovered)
if recovered_dt.tzinfo is None:
recovered_dt = recovered_dt.replace(tzinfo=timezone.utc)
assert recovered_dt > now
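Reviewer note: both recovery tests only require that a recurring job with `next_run_at: None` ends up with a timestamp strictly after "now". For the interval case, one plausible recovery is "now plus the interval", sketched below. `recover_interval_next_run` is a hypothetical helper and the exact timestamp chosen by `cron.jobs` may differ; cron expressions would need a cron parser and are left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def recover_interval_next_run(job: dict, now: datetime) -> Optional[str]:
    """Fill in a missing next_run_at for an interval job and return it."""
    schedule = job.get("schedule") or {}
    if job.get("next_run_at") is not None or schedule.get("kind") != "interval":
        return job.get("next_run_at")  # nothing to recover
    # Assumption: schedule the next run one full interval from now
    next_run = now + timedelta(minutes=schedule["minutes"])
    job["next_run_at"] = next_run.isoformat()
    return job["next_run_at"]
```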
class TestEnabledToolsets:
def test_enabled_toolsets_stored(self, tmp_cron_dir):
+81
@@ -46,6 +46,29 @@ class TestResolveOrigin:
job = {"origin": {}}
assert _resolve_origin(job) is None
@pytest.mark.parametrize(
"non_dict_origin",
[
"combined-digest-replaces-x-and-y-20260503",
123,
["telegram", "12345"],
("platform", "chat_id"),
42.0,
],
)
def test_non_dict_origin_returns_none_instead_of_crashing(self, non_dict_origin):
"""Non-dict origins (provenance strings from hand-edited or migrated
jobs.json) must be treated as missing instead of crashing the
scheduler tick on ``origin.get('platform')`` with
``'str' object has no attribute 'get'`` (#18722).
Before this guard a job in this state crashed every fire attempt
forever; ``mark_job_run`` recorded the error but the next tick
re-loaded the poisoned origin and crashed identically.
"""
job = {"origin": non_dict_origin}
assert _resolve_origin(job) is None
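Reviewer note: the guard described in the docstring above is a type check before any `.get()` call. A minimal sketch, using a hypothetical `resolve_origin` function; requiring both `platform` and `chat_id` is an assumption made for the example, since only the empty-dict and non-dict cases are visible here.

```python
from typing import Any, Dict, Optional


def resolve_origin(job: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the origin dict, or None when it is missing or malformed."""
    origin = job.get("origin")
    if not isinstance(origin, dict):
        # Strings, numbers, lists and tuples from hand-edited files are treated as missing
        return None
    if not origin.get("platform") or not origin.get("chat_id"):
        return None  # assumption: both fields are needed to route a delivery
    return origin
```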
class TestResolveDeliveryTarget:
def test_origin_delivery_preserves_thread_id(self):
@@ -118,6 +141,16 @@ class TestResolveDeliveryTarget:
"thread_id": None,
}
def test_bare_platform_delivery_preserves_home_thread_id(self, monkeypatch):
monkeypatch.setenv("DISCORD_HOME_CHANNEL", "parent-42")
monkeypatch.setenv("DISCORD_HOME_CHANNEL_THREAD_ID", "topic-7")
assert _resolve_delivery_target({"deliver": "discord"}) == {
"platform": "discord",
"chat_id": "parent-42",
"thread_id": "topic-7",
}
def test_explicit_telegram_topic_target_with_thread_id(self):
"""deliver: 'telegram:chat_id:thread_id' parses correctly."""
job = {
@@ -1824,6 +1857,54 @@ class TestBuildJobPromptMissingSkill:
assert "go" in result
class TestBuildJobPromptBumpUse:
"""Verify that cron jobs bump skill usage counters so the curator sees them as active."""
def test_bump_use_called_for_loaded_skill(self):
"""bump_use is called for each successfully loaded skill."""
def _skill_view(name: str) -> str:
return json.dumps({"success": True, "content": f"Content for {name}."})
with patch("tools.skills_tool.skill_view", side_effect=_skill_view), \
patch("tools.skill_usage.bump_use") as mock_bump:
_build_job_prompt({"skills": ["alpha", "beta"], "prompt": "go"})
assert mock_bump.call_count == 2
calls = [c[0][0] for c in mock_bump.call_args_list]
assert "alpha" in calls
assert "beta" in calls
def test_bump_use_not_called_for_missing_skill(self):
"""bump_use is NOT called when a skill fails to load."""
def _missing_view(name: str) -> str:
return json.dumps({"success": False, "error": "not found"})
with patch("tools.skills_tool.skill_view", side_effect=_missing_view), \
patch("tools.skill_usage.bump_use") as mock_bump:
_build_job_prompt({"skills": ["ghost"], "prompt": "go"})
assert mock_bump.call_count == 0
def test_bump_failure_does_not_break_prompt(self, caplog):
"""If bump_use raises, the prompt still builds — error is logged at DEBUG."""
def _skill_view(name: str) -> str:
return json.dumps({"success": True, "content": "Works."})
with patch("tools.skills_tool.skill_view", side_effect=_skill_view), \
patch("tools.skill_usage.bump_use", side_effect=RuntimeError("boom")), \
caplog.at_level(logging.DEBUG, logger="cron.scheduler"):
result = _build_job_prompt({"skills": ["good-skill"], "prompt": "go"})
# Prompt should still contain the skill content and original instruction
assert "Works." in result
assert "go" in result
# The error should be logged at DEBUG level, not crash
assert any("failed to bump" in r.message for r in caplog.records)
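Reviewer note: the third test describes a best-effort side effect, where a failing usage bump is logged at DEBUG and never propagates. A sketch of that pattern with a hypothetical `bump_best_effort` wrapper; the logger name and message are placeholders and are not the ones used in `cron.scheduler`.

```python
import logging
from typing import Callable

logger = logging.getLogger("sketch.cron_bump")  # placeholder logger name


def bump_best_effort(skill_name: str, bump: Callable[[str], None]) -> bool:
    """Call bump(skill_name); swallow and log any failure. Returns True on success."""
    try:
        bump(skill_name)
    except Exception as exc:  # usage tracking must never break prompt building
        logger.debug("failed to bump usage for skill %s: %s", skill_name, exc)
        return False
    return True
```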
class TestSendMediaViaAdapter:
"""Unit tests for _send_media_via_adapter — routes files to typed adapter methods."""
+23
@@ -138,6 +138,29 @@ class TestSlashCommands:
response_text = send.call_args[1].get("content") or send.call_args[0][1]
assert "compress" in response_text.lower() or "context" in response_text.lower()
@pytest.mark.asyncio
async def test_quick_command_alias_targets_builtin_command_with_args(
self, adapter, runner, platform
):
"""Alias targets with args must reach the built-in command handler."""
runner.config.quick_commands = {
"s": {"type": "alias", "target": "/status extra-arg"}
}
async def _handle_status(event):
assert event.get_command_args() == "extra-arg"
return "status via alias"
runner._handle_status_command = AsyncMock(side_effect=_handle_status)
send = await send_and_capture(adapter, "/s", platform)
send.assert_called_once()
response_text = send.call_args[1].get("content") or send.call_args[0][1]
assert response_text == "status via alias"
runner._handle_status_command.assert_awaited_once()
runner._handle_message_with_agent.assert_not_awaited()
class TestSessionLifecycle:
"""Verify session state changes across command sequences."""
+17 -1
@@ -12,6 +12,7 @@ class RestartTestAdapter(BasePlatformAdapter):
def __init__(self):
super().__init__(PlatformConfig(enabled=True, token="***"), Platform.TELEGRAM)
self.sent: list[str] = []
self.sent_calls: list[tuple[str, str, object]] = []
async def connect(self):
return True
@@ -21,6 +22,7 @@ class RestartTestAdapter(BasePlatformAdapter):
async def send(self, chat_id, content, reply_to=None, metadata=None):
self.sent.append(content)
self.sent_calls.append((chat_id, content, metadata))
return SendResult(success=True, message_id="1")
async def send_typing(self, chat_id, metadata=None):
@@ -30,12 +32,17 @@ class RestartTestAdapter(BasePlatformAdapter):
         return {"id": chat_id}
-def make_restart_source(chat_id: str = "123456", chat_type: str = "dm") -> SessionSource:
+def make_restart_source(
+    chat_id: str = "123456",
+    chat_type: str = "dm",
+    thread_id: str | None = None,
+) -> SessionSource:
     return SessionSource(
         platform=Platform.TELEGRAM,
         chat_id=chat_id,
         chat_type=chat_type,
         user_id="u1",
+        thread_id=thread_id,
     )
@@ -81,6 +88,15 @@ def make_restart_runner(
runner._handle_restart_command = GatewayRunner._handle_restart_command.__get__(
runner, GatewayRunner
)
runner._handle_set_home_command = GatewayRunner._handle_set_home_command.__get__(
runner, GatewayRunner
)
runner._send_restart_notification = GatewayRunner._send_restart_notification.__get__(
runner, GatewayRunner
)
runner._send_home_channel_startup_notifications = (
GatewayRunner._send_home_channel_startup_notifications.__get__(runner, GatewayRunner)
)
runner._status_action_label = GatewayRunner._status_action_label.__get__(
runner, GatewayRunner
)
+42
@@ -240,6 +240,48 @@ class TestAdapterInit:
"http://127.0.0.1:3000",
)
def test_invalid_port_from_env_falls_back_to_default(self, monkeypatch):
monkeypatch.setenv("API_SERVER_PORT", "not-a-port")
config = PlatformConfig(enabled=True)
adapter = APIServerAdapter(config)
assert adapter._port == 8642
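Reviewer note: the fallback asserted above amounts to defensive parsing of `API_SERVER_PORT`. A sketch, with a hypothetical `port_from_env` helper; the default of 8642 comes from the test, while the 1..65535 range check is an added assumption.

```python
from typing import Mapping

DEFAULT_PORT = 8642  # default asserted by the test


def port_from_env(env: Mapping[str, str]) -> int:
    """Parse API_SERVER_PORT, falling back to the default on bad input."""
    raw = env.get("API_SERVER_PORT")
    if raw is None:
        return DEFAULT_PORT
    try:
        port = int(raw)
    except ValueError:
        return DEFAULT_PORT
    # Assumption: out-of-range values also fall back
    return port if 1 <= port <= 65535 else DEFAULT_PORT
```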
def test_create_agent_forwards_config_reasoning_effort(self, monkeypatch):
captured = {}
class FakeAgent:
def __init__(self, **kwargs):
captured.update(kwargs)
monkeypatch.setattr("run_agent.AIAgent", FakeAgent)
monkeypatch.setattr(
"gateway.run._resolve_runtime_agent_kwargs",
lambda: {
"provider": "openai-codex",
"base_url": "https://example.test/v1",
"api_mode": "codex_responses",
},
)
monkeypatch.setattr("gateway.run._resolve_gateway_model", lambda: "gpt-5.5")
monkeypatch.setattr(
"gateway.run._load_gateway_config",
lambda: {"agent": {"reasoning_effort": "xhigh"}},
)
monkeypatch.setattr(
"gateway.run.GatewayRunner._load_reasoning_config",
staticmethod(lambda: {"enabled": True, "effort": "xhigh"}),
)
monkeypatch.setattr("gateway.run.GatewayRunner._load_fallback_model", staticmethod(lambda: None))
monkeypatch.setattr("hermes_cli.tools_config._get_platform_tools", lambda *_: set())
adapter = APIServerAdapter(PlatformConfig(enabled=True))
monkeypatch.setattr(adapter, "_ensure_session_db", lambda: None)
agent = adapter._create_agent(session_id="api-session")
assert isinstance(agent, FakeAgent)
assert captured["reasoning_config"] == {"enabled": True, "effort": "xhigh"}
# ---------------------------------------------------------------------------
# Auth checking
+12 -10
@@ -49,9 +49,10 @@ class TestSuspendRecentlyActive:
         count = store.suspend_recently_active()
         assert count == 1
-        # Re-fetch — should be suspended now
+        # Re-fetch — should be resume_pending (preserved, not wiped)
         refreshed = store.get_or_create_session(source)
-        assert refreshed.was_auto_reset
+        assert refreshed.resume_pending
+        assert refreshed.session_id == entry.session_id  # same session preserved
     def test_does_not_suspend_old_sessions(self, tmp_path):
         store = _make_store(tmp_path)
@@ -66,21 +67,22 @@ class TestSuspendRecentlyActive:
         count = store.suspend_recently_active(max_age_seconds=120)
         assert count == 0
-    def test_already_suspended_not_double_counted(self, tmp_path):
+    def test_already_resume_pending_not_double_counted(self, tmp_path):
         store = _make_store(tmp_path)
         source = _make_source()
         entry = store.get_or_create_session(source)
-        # Suspend once
+        # Mark resume_pending once
         count1 = store.suspend_recently_active()
         assert count1 == 1
-        # Create a new session (the old one got reset on next access)
+        # Re-fetch returns the SAME session (preserved, not reset)
         entry2 = store.get_or_create_session(source)
+        assert entry2.session_id == entry.session_id
-        # Suspend again — the new session is recent but not yet suspended
+        # Second call skips already-resume_pending entries
         count2 = store.suspend_recently_active()
-        assert count2 == 1
+        assert count2 == 0
# ---------------------------------------------------------------------------
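Reviewer note: the changed assertions describe a switch from "reset on next access" to "flag and preserve". The sketch below illustrates the counting rule (recent entries are flagged once, already-flagged entries are skipped). `Entry` and `mark_resume_pending` are hypothetical and much simpler than the real session store; in particular, `max_age_seconds` is a required argument here because the store's default is not visible in this diff.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Entry:
    session_id: str
    updated_at: float  # seconds since epoch
    resume_pending: bool = False


def mark_resume_pending(entries: Iterable[Entry], now: float, max_age_seconds: float) -> int:
    """Flag recently active sessions and return how many were newly flagged."""
    count = 0
    for entry in entries:
        if entry.resume_pending:
            continue  # already flagged: not double counted
        if now - entry.updated_at > max_age_seconds:
            continue  # too old to have been interrupted mid-turn
        entry.resume_pending = True  # session_id and history stay intact
        count += 1
    return count
```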
@@ -180,11 +182,11 @@ class TestCleanShutdownMarker:
         else:
             store.suspend_recently_active()
-        # Session SHOULD be suspended (crash recovery)
+        # Session SHOULD be resume_pending (crash recovery preserves history)
         with store._lock:
             store._ensure_loaded_locked()
-            suspended_count = sum(1 for e in store._entries.values() if e.suspended)
-        assert suspended_count == 1, "Session should be suspended after crash (no marker)"
+            resume_count = sum(1 for e in store._entries.values() if e.resume_pending)
+        assert resume_count == 1, "Session should be resume_pending after crash (no marker)"
     def test_marker_written_on_restart_stop(self, tmp_path, monkeypatch):
         """stop(restart=True) should also write the marker."""
@@ -0,0 +1,166 @@
"""Regression tests for the config.yaml → env var bridge in gateway/run.py.
Guards against the 60-vs-500 bug where a stale `.env HERMES_MAX_ITERATIONS=60`
entry silently shadowed `agent.max_turns: 500` in config.yaml because the
bridge used `if X not in os.environ` guards. After PR#18413 the bridge
treats config.yaml as authoritative and unconditionally overwrites .env
values for `agent.*`, `display.*`, `timezone`, and `security.*` keys.
"""
from __future__ import annotations
import os
import subprocess
import sys
import textwrap
from pathlib import Path
import pytest
PROJECT_ROOT = Path(__file__).resolve().parents[2]
def _run_gateway_import(hermes_home: Path, initial_env: dict[str, str]) -> dict[str, str]:
    """Import gateway.run in a clean subprocess and return the post-import env.

    The bridge runs at module-import time, so simply importing is enough
    to exercise it. Running in a subprocess isolates the test from other
    import side effects and makes the "what ends up in os.environ" check
    deterministic.
    """
    script = textwrap.dedent(
        f"""
        import os, sys
        sys.path.insert(0, {str(PROJECT_ROOT)!r})
        try:
            from gateway import run  # noqa: F401 — module import triggers bridge
        except Exception as exc:
            print(f"IMPORT_ERROR:{{type(exc).__name__}}:{{exc}}", file=sys.stderr)
            sys.exit(2)
        for k in (
            "HERMES_MAX_ITERATIONS",
            "HERMES_AGENT_TIMEOUT",
            "HERMES_AGENT_TIMEOUT_WARNING",
            "HERMES_GATEWAY_BUSY_INPUT_MODE",
            "HERMES_TIMEZONE",
        ):
            v = os.environ.get(k)
            if v is not None:
                print(f"{{k}}={{v}}")
        """
    )
    env = dict(initial_env)
    env["HERMES_HOME"] = str(hermes_home)
    # Keep PATH / PYTHONPATH so venv imports resolve.
    for k in ("PATH", "PYTHONPATH", "VIRTUAL_ENV", "HOME"):
        if k in os.environ and k not in env:
            env[k] = os.environ[k]
    result = subprocess.run(
        [sys.executable, "-c", script],
        env=env,
        capture_output=True,
        text=True,
        timeout=60,
    )
    if result.returncode != 0:
        pytest.fail(
            f"gateway.run import failed (rc={result.returncode})\n"
            f"stderr:\n{result.stderr}\nstdout:\n{result.stdout}"
        )
    out: dict[str, str] = {}
    for line in result.stdout.splitlines():
        if "=" in line:
            k, v = line.split("=", 1)
            out[k] = v
    return out
def _write_config(home: Path, agent_cfg: dict | None = None, display_cfg: dict | None = None,
timezone: str | None = None) -> None:
import yaml
cfg: dict = {}
if agent_cfg:
cfg["agent"] = agent_cfg
if display_cfg:
cfg["display"] = display_cfg
if timezone:
cfg["timezone"] = timezone
(home / "config.yaml").write_text(yaml.safe_dump(cfg))
def _write_env(home: Path, entries: dict[str, str]) -> None:
lines = [f"{k}={v}\n" for k, v in entries.items()]
(home / ".env").write_text("".join(lines))
@pytest.fixture
def hermes_home(tmp_path: Path) -> Path:
home = tmp_path / ".hermes"
home.mkdir()
return home
def test_config_max_turns_wins_over_stale_env(hermes_home: Path) -> None:
"""Regression: config.yaml:agent.max_turns=500 must beat .env=60."""
_write_config(hermes_home, agent_cfg={"max_turns": 500})
_write_env(hermes_home, {"HERMES_MAX_ITERATIONS": "60"})
env = _run_gateway_import(hermes_home, initial_env={})
assert env.get("HERMES_MAX_ITERATIONS") == "500", (
f"expected config.yaml max_turns=500 to win; got {env.get('HERMES_MAX_ITERATIONS')!r}. "
"Stale .env value is shadowing config — the bridge lost its override."
)
def test_config_gateway_timeout_wins_over_stale_env(hermes_home: Path) -> None:
"""Every agent.* bridge key must be config-authoritative, not .env-authoritative."""
_write_config(hermes_home, agent_cfg={
"gateway_timeout": 1800,
"gateway_timeout_warning": 900,
})
_write_env(hermes_home, {
"HERMES_AGENT_TIMEOUT": "60",
"HERMES_AGENT_TIMEOUT_WARNING": "30",
})
env = _run_gateway_import(hermes_home, initial_env={})
assert env.get("HERMES_AGENT_TIMEOUT") == "1800"
assert env.get("HERMES_AGENT_TIMEOUT_WARNING") == "900"
def test_config_display_busy_input_mode_wins_over_stale_env(hermes_home: Path) -> None:
_write_config(hermes_home, display_cfg={"busy_input_mode": "interrupt"})
_write_env(hermes_home, {"HERMES_GATEWAY_BUSY_INPUT_MODE": "queue"})
env = _run_gateway_import(hermes_home, initial_env={})
assert env.get("HERMES_GATEWAY_BUSY_INPUT_MODE") == "interrupt"
def test_config_timezone_wins_over_stale_env(hermes_home: Path) -> None:
_write_config(hermes_home, timezone="America/Los_Angeles")
_write_env(hermes_home, {"HERMES_TIMEZONE": "UTC"})
env = _run_gateway_import(hermes_home, initial_env={})
assert env.get("HERMES_TIMEZONE") == "America/Los_Angeles"
def test_env_value_survives_when_config_omits_key(hermes_home: Path) -> None:
"""If config.yaml doesn't set max_turns, .env value must still pass through.
The bridge only overwrites when the config key is present; an absent
config key should NOT clobber the .env value.
"""
_write_config(hermes_home, agent_cfg={}) # no max_turns
_write_env(hermes_home, {"HERMES_MAX_ITERATIONS": "123"})
env = _run_gateway_import(hermes_home, initial_env={})
assert env.get("HERMES_MAX_ITERATIONS") == "123"
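The precedence rule these regression tests pin down can be sketched in isolation. This is a minimal illustration, not the code in gateway/run.py: the function name `bridge_config_to_env`, the `_BRIDGE_KEYS` table, and the `_lookup` helper are hypothetical, and the key-to-variable mapping is inferred only from the assertions above. It assumes that a key present in config.yaml always overwrites the environment, while an absent key leaves the existing value alone.

```python
from __future__ import annotations

from typing import Any, MutableMapping

# Hypothetical mapping, inferred from the env vars the tests above assert on.
_BRIDGE_KEYS: dict[tuple[str, ...], str] = {
    ("agent", "max_turns"): "HERMES_MAX_ITERATIONS",
    ("agent", "gateway_timeout"): "HERMES_AGENT_TIMEOUT",
    ("agent", "gateway_timeout_warning"): "HERMES_AGENT_TIMEOUT_WARNING",
    ("display", "busy_input_mode"): "HERMES_GATEWAY_BUSY_INPUT_MODE",
    ("timezone",): "HERMES_TIMEZONE",
}

_MISSING = object()


def _lookup(cfg: dict[str, Any], path: tuple[str, ...]) -> Any:
    """Walk a nested dict; return _MISSING if any path segment is absent."""
    node: Any = cfg
    for part in path:
        if not isinstance(node, dict) or part not in node:
            return _MISSING
        node = node[part]
    return node


def bridge_config_to_env(cfg: dict[str, Any], env: MutableMapping[str, str]) -> None:
    """Copy config values into env; config wins whenever the key is present."""
    for path, env_name in _BRIDGE_KEYS.items():
        value = _lookup(cfg, path)
        if value is _MISSING or value is None:
            continue  # absent key: keep whatever .env already provided
        env[env_name] = str(value)  # unconditional overwrite, no "not in env" guard
```

Passing a plain dict as `env` keeps the sketch testable without touching `os.environ`; the buggy variant described in the docstring would differ only by guarding the assignment with `if env_name not in env`.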
@@ -0,0 +1,230 @@
"""Security regression tests: Discord component views honor role allowlists.
The four interactive component views (ExecApprovalView, SlashConfirmView,
UpdatePromptView, ModelPickerView) historically accepted only
``allowed_user_ids``. Deployments that configure DISCORD_ALLOWED_ROLES
without DISCORD_ALLOWED_USERS therefore had a wide-open component
surface: any guild member who could see the prompt could approve exec
commands, cancel slash confirmations, or switch the model -- even when
the same user would be rejected at the slash and on_message gates.
These tests pin the user-or-role OR semantics and the fail-closed
behavior on missing role data so the parity cannot regress.
"""
from types import SimpleNamespace
import pytest
# Trigger the shared discord mock from tests/gateway/conftest.py before
# importing the production module.
from gateway.platforms.discord import ( # noqa: E402
ExecApprovalView,
ModelPickerView,
SlashConfirmView,
UpdatePromptView,
_component_check_auth,
)
# ---------------------------------------------------------------------------
# Direct helper coverage -- the four views all delegate to this helper, so
# pinning the helper's contract pins all four call sites.
# ---------------------------------------------------------------------------
def _interaction(user_id, role_ids=None, *, drop_user=False, drop_roles=False):
"""Build a mock interaction with the requested user/role shape.
drop_user simulates a payload whose .user attribute is None.
drop_roles simulates a payload where .user has no .roles attribute
at all (DM-context Member, raw User payload).
"""
if drop_user:
return SimpleNamespace(user=None)
user_kwargs = {"id": user_id}
if not drop_roles:
user_kwargs["roles"] = [SimpleNamespace(id=r) for r in (role_ids or [])]
return SimpleNamespace(user=SimpleNamespace(**user_kwargs))
# ── back-compat: empty allowlists -> allow everyone ────────────────────────
def test_component_check_empty_allowlists_allows_everyone():
"""SECURITY-CRITICAL backwards-compat: deployments without any
DISCORD_ALLOWED_* env vars set must continue to allow component
interactions from anyone (no regression for unconfigured setups)."""
interaction = _interaction(11111)
assert _component_check_auth(interaction, set(), set()) is True
assert _component_check_auth(interaction, None, None) is True
# ── user allowlist ─────────────────────────────────────────────────────────
def test_component_check_user_in_user_allowlist_passes():
interaction = _interaction(11111)
assert _component_check_auth(interaction, {"11111"}, set()) is True
def test_component_check_user_not_in_user_allowlist_rejected():
interaction = _interaction(99999)
assert _component_check_auth(interaction, {"11111"}, set()) is False
# ── role allowlist OR semantics ────────────────────────────────────────────
def test_component_check_role_only_user_with_matching_role_passes():
"""Role-only deployment (DISCORD_ALLOWED_ROLES set, DISCORD_ALLOWED_USERS
empty) where the user is not in the empty user list but DOES carry a
matching role: must pass. This is the regression that prompted the
fix -- previously _check_auth allowed everyone when the user set was
empty, ignoring the role allowlist."""
interaction = _interaction(99999, role_ids=[42])
assert _component_check_auth(interaction, set(), {42}) is True
def test_component_check_role_only_user_without_matching_role_rejected():
"""Role-only deployment where the user has no matching role: reject.
Previously this allowed everyone because allowed_user_ids was empty."""
interaction = _interaction(99999, role_ids=[7, 8])
assert _component_check_auth(interaction, set(), {42}) is False
def test_component_check_user_or_role_user_match():
"""Both allowlists set; user matches user allowlist: pass."""
interaction = _interaction(11111, role_ids=[7])
assert _component_check_auth(interaction, {"11111"}, {42}) is True
def test_component_check_user_or_role_role_match():
"""Both allowlists set; user not in user list but in role list: pass."""
interaction = _interaction(99999, role_ids=[42])
assert _component_check_auth(interaction, {"11111"}, {42}) is True
def test_component_check_user_or_role_neither_match():
"""Both allowlists set; user matches neither: reject."""
interaction = _interaction(99999, role_ids=[7])
assert _component_check_auth(interaction, {"11111"}, {42}) is False
# ── fail-closed on missing role data ───────────────────────────────────────
def test_component_check_role_policy_with_no_roles_attr_rejects():
"""Role allowlist configured but interaction.user has no .roles
attribute (DM-context Member, raw User payload): must reject. A user
without resolvable roles cannot satisfy a role allowlist."""
interaction = _interaction(11111, drop_roles=True)
assert _component_check_auth(interaction, set(), {42}) is False
def test_component_check_missing_user_with_allowlist_rejects():
"""interaction.user is None with any allowlist configured: fail
closed without raising AttributeError."""
interaction = _interaction(0, drop_user=True)
assert _component_check_auth(interaction, {"11111"}, set()) is False
assert _component_check_auth(interaction, set(), {42}) is False
# ---------------------------------------------------------------------------
# View construction: every view must accept allowed_role_ids and route
# through the shared helper. Default value preserves prior call-sites.
# ---------------------------------------------------------------------------
def test_exec_approval_view_accepts_role_allowlist():
view = ExecApprovalView(
session_key="sess-1",
allowed_user_ids={"11111"},
allowed_role_ids={42},
)
# Role-only user passes
assert view._check_auth(_interaction(99999, role_ids=[42])) is True
# Neither user nor role match: reject
assert view._check_auth(_interaction(99999, role_ids=[7])) is False
def test_exec_approval_view_role_default_is_empty_set():
"""Existing call sites that pass only allowed_user_ids must continue
working with the legacy semantics (no role gate)."""
view = ExecApprovalView(session_key="sess-1", allowed_user_ids={"11111"})
assert view.allowed_role_ids == set()
assert view._check_auth(_interaction(11111)) is True
assert view._check_auth(_interaction(99999)) is False
def test_slash_confirm_view_accepts_role_allowlist():
view = SlashConfirmView(
session_key="sess-1",
confirm_id="c1",
allowed_user_ids=set(),
allowed_role_ids={42},
)
assert view._check_auth(_interaction(99999, role_ids=[42])) is True
assert view._check_auth(_interaction(99999, role_ids=[7])) is False
def test_update_prompt_view_accepts_role_allowlist():
view = UpdatePromptView(
session_key="sess-1",
allowed_user_ids=set(),
allowed_role_ids={42},
)
assert view._check_auth(_interaction(99999, role_ids=[42])) is True
assert view._check_auth(_interaction(99999, role_ids=[7])) is False
def test_model_picker_view_accepts_role_allowlist():
async def _noop(*_a, **_k):
return ""
view = ModelPickerView(
providers=[],
current_model="m",
current_provider="p",
session_key="sess-1",
on_model_selected=_noop,
allowed_user_ids=set(),
allowed_role_ids={42},
)
assert view._check_auth(_interaction(99999, role_ids=[42])) is True
assert view._check_auth(_interaction(99999, role_ids=[7])) is False
# ---------------------------------------------------------------------------
# Empty allowlists across views: legacy "allow everyone" must hold.
# ---------------------------------------------------------------------------
@pytest.mark.parametrize(
"view_factory",
[
lambda: ExecApprovalView(session_key="s", allowed_user_ids=set()),
lambda: SlashConfirmView(session_key="s", confirm_id="c", allowed_user_ids=set()),
lambda: UpdatePromptView(session_key="s", allowed_user_ids=set()),
],
)
def test_views_empty_allowlists_allow_everyone(view_factory):
view = view_factory()
assert view._check_auth(_interaction(99999)) is True
def test_model_picker_view_empty_allowlists_allow_everyone():
async def _noop(*_a, **_k):
return ""
view = ModelPickerView(
providers=[],
current_model="m",
current_provider="p",
session_key="s",
on_model_selected=_noop,
allowed_user_ids=set(),
)
assert view.allowed_role_ids == set()
assert view._check_auth(_interaction(99999)) is True
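The contract the tests above pin for `_component_check_auth` can be restated as a small standalone function. The sketch below is an assumption about the behavior, reconstructed only from the assertions in this file; `component_check_auth_sketch` is a hypothetical name, and the real helper in gateway/platforms/discord.py may be structured differently. It assumes user IDs are compared as strings and role IDs as integers, as the test data suggests.

```python
from types import SimpleNamespace


def component_check_auth_sketch(interaction, allowed_user_ids, allowed_role_ids) -> bool:
    """User-or-role allowlist check that fails closed on missing data."""
    users = set(allowed_user_ids or ())
    roles = set(allowed_role_ids or ())

    # Back-compat: no allowlist configured means no restriction.
    if not users and not roles:
        return True

    user = getattr(interaction, "user", None)
    if user is None:
        return False  # a policy is configured but there is nobody to check

    # User allowlist: IDs are stored as strings in the tests above.
    if str(getattr(user, "id", "")) in users:
        return True

    # Role allowlist: OR semantics with the user allowlist.
    if roles:
        member_roles = getattr(user, "roles", None)
        if member_roles is None:
            return False  # no resolvable roles cannot satisfy a role policy
        return any(getattr(role, "id", None) in roles for role in member_roles)

    return False


# Usage: a role-only deployment where the caller carries role 42.
caller = SimpleNamespace(user=SimpleNamespace(id=99999, roles=[SimpleNamespace(id=42)]))
print(component_check_auth_sketch(caller, set(), {42}))
```

The order of the checks matters: the empty-allowlist shortcut comes first so unconfigured deployments see no change, and every later branch defaults to rejection.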
@@ -172,6 +172,69 @@ async def test_connect_only_requests_members_intent_when_needed(monkeypatch, all
await adapter.disconnect()
@pytest.mark.asyncio
async def test_reconnect_closes_previous_client_to_prevent_zombie_websocket(monkeypatch):
"""Regression for #18187: calling connect() twice without disconnect() in
between (e.g. during an in-process reconnect attempt) must close the old
commands.Bot before creating a new one. Without this guard, two websockets
stay alive and both fire on_message, producing double responses with
different wording.
"""
adapter = DiscordAdapter(PlatformConfig(enabled=True, token="test-token"))
monkeypatch.setattr("gateway.status.acquire_scoped_lock", lambda scope, identity, metadata=None: (True, None))
monkeypatch.setattr("gateway.status.release_scoped_lock", lambda scope, identity: None)
intents = SimpleNamespace(
message_content=False, dm_messages=False, guild_messages=False,
members=False, voice_states=False,
)
monkeypatch.setattr(discord_platform.Intents, "default", lambda: intents)
class TrackedBot(FakeBot):
"""FakeBot that records close() calls and reports open/closed state."""
_closed = False
def is_closed(self):
return self._closed
async def close(self):
self._closed = True
created: list[TrackedBot] = []
def fake_bot_factory(*, command_prefix, intents, proxy=None, allowed_mentions=None, **_):
bot = TrackedBot(intents=intents, allowed_mentions=allowed_mentions)
created.append(bot)
return bot
monkeypatch.setattr(discord_platform.commands, "Bot", fake_bot_factory)
monkeypatch.setattr(adapter, "_resolve_allowed_usernames", AsyncMock())
# First connect — fresh adapter, no prior client.
assert await adapter.connect() is True
assert len(created) == 1
first_bot = created[0]
assert first_bot._closed is False, "first bot should still be open after connect()"
# Second connect WITHOUT disconnect — simulates an in-process reconnect.
# Without the fix, first_bot would remain open (zombie), and both would
# receive every Discord event, causing double responses.
assert await adapter.connect() is True
assert len(created) == 2
second_bot = created[1]
# The first bot must be closed before the second is assigned.
assert first_bot._closed is True, (
"First Discord client must be closed on re-entry of connect() to prevent "
"zombie websocket (#18187)"
)
assert second_bot._closed is False, "second bot should still be open"
assert adapter._client is second_bot
await adapter.disconnect()
@pytest.mark.asyncio
async def test_connect_releases_token_lock_on_timeout(monkeypatch):
adapter = DiscordAdapter(PlatformConfig(enabled=True, token="test-token"))
@@ -0,0 +1,737 @@
"""Security regression tests: slash commands honor on_message authorization gates.
Slash invocations (``_run_simple_slash``, ``_handle_thread_create_slash``)
historically bypassed every gate ``on_message`` enforces: DISCORD_ALLOWED_USERS,
DISCORD_ALLOWED_ROLES, DISCORD_ALLOWED_CHANNELS, DISCORD_IGNORED_CHANNELS.
Any guild member could invoke ``/background``, ``/restart``, etc. as the
operator. ``_check_slash_authorization`` mirrors all four gates one-for-one.
These tests pin the security-correct behavior so the bypass cannot regress.
"""
import asyncio
import logging
import sys
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock
import pytest
from gateway.config import PlatformConfig
# ---------------------------------------------------------------------------
# Discord module mock — borrowed from test_discord_slash_commands.py so this
# file runs on machines without discord.py installed.
# ---------------------------------------------------------------------------
def _ensure_discord_mock():
if "discord" in sys.modules and hasattr(sys.modules["discord"], "__file__"):
return # real discord installed
if sys.modules.get("discord") is None:
discord_mod = MagicMock()
discord_mod.Intents.default.return_value = MagicMock()
discord_mod.DMChannel = type("DMChannel", (), {})
discord_mod.Thread = type("Thread", (), {})
discord_mod.ForumChannel = type("ForumChannel", (), {})
discord_mod.Interaction = object
class _FakePermissions:
def __init__(self, value=0, **_):
self.value = value
discord_mod.Permissions = _FakePermissions
class _FakeGroup:
def __init__(self, *, name, description, parent=None):
self.name = name
self.description = description
self.parent = parent
self._children: dict[str, object] = {}
if parent is not None:
parent.add_command(self)
def add_command(self, cmd):
self._children[cmd.name] = cmd
class _FakeCommand:
def __init__(self, *, name, description, callback, parent=None):
self.name = name
self.description = description
self.callback = callback
self.parent = parent
self.default_permissions = None
discord_mod.app_commands = SimpleNamespace(
describe=lambda **kwargs: (lambda fn: fn),
choices=lambda **kwargs: (lambda fn: fn),
autocomplete=lambda **kwargs: (lambda fn: fn),
Choice=lambda **kwargs: SimpleNamespace(**kwargs),
Group=_FakeGroup,
Command=_FakeCommand,
)
ext_mod = MagicMock()
commands_mod = MagicMock()
commands_mod.Bot = MagicMock
ext_mod.commands = commands_mod
sys.modules["discord"] = discord_mod
sys.modules.setdefault("discord.ext", ext_mod)
sys.modules.setdefault("discord.ext.commands", commands_mod)
_ensure_discord_mock()
from gateway.platforms.discord import DiscordAdapter # noqa: E402
@pytest.fixture(autouse=True)
def _isolate_discord_env(monkeypatch):
for var in (
"DISCORD_ALLOWED_USERS",
"DISCORD_ALLOWED_ROLES",
"DISCORD_ALLOWED_CHANNELS",
"DISCORD_IGNORED_CHANNELS",
"DISCORD_HIDE_SLASH_COMMANDS",
"DISCORD_ALLOW_BOTS",
):
monkeypatch.delenv(var, raising=False)
@pytest.fixture(autouse=True)
def _stub_discord_permissions(monkeypatch):
"""Pin discord.Permissions to a plain stand-in so tests can assert the
bitfield value regardless of whether real discord.py or a sibling test
module's MagicMock is loaded."""
import discord
class _Perm:
def __init__(self, value=0, **_):
self.value = value
monkeypatch.setattr(discord, "Permissions", _Perm)
@pytest.fixture
def adapter():
config = PlatformConfig(enabled=True, token="***")
a = DiscordAdapter(config)
a._client = SimpleNamespace(user=SimpleNamespace(id=99999, name="HermesBot"), guilds=[])
return a
_SENTINEL = object()
def _make_interaction(
user_id, *, channel_id=12345, guild_id=42, in_dm=False, in_thread=False,
parent_channel_id=None, user=_SENTINEL,
):
"""Build a mock Discord Interaction with a still-unresponded response.
``channel_id`` may be set to ``None`` to simulate a guild interaction
payload missing a resolvable channel id (fail-closed exercise).
Pass ``user=None`` to simulate a payload missing the user object.
"""
import discord
response = SimpleNamespace(send_message=AsyncMock(), defer=AsyncMock())
if in_dm:
channel = discord.DMChannel()
elif in_thread:
channel = discord.Thread()
channel.id = channel_id
channel.parent_id = parent_channel_id
elif channel_id is None:
channel = None
else:
channel = SimpleNamespace(id=channel_id)
if user is _SENTINEL:
user_obj = SimpleNamespace(id=int(user_id), name=f"user_{user_id}")
else:
user_obj = user
return SimpleNamespace(
user=user_obj,
guild=SimpleNamespace(owner_id=999),
guild_id=guild_id,
channel_id=channel_id,
channel=channel,
response=response,
)
# ---------------------------------------------------------------------------
# Backwards-compat: empty allowlist → everything passes (matches on_message)
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_no_allowlist_allows_everyone(adapter):
"""SECURITY-CRITICAL backwards-compat: deployments without any allowlist
env vars set must see ZERO behavior change. on_message lets everyone
through in this case (returns True at line 1890); slash must do the same.
"""
interaction = _make_interaction("999999999")
assert await adapter._check_slash_authorization(interaction, "/help") is True
interaction.response.send_message.assert_not_awaited()
@pytest.mark.asyncio
async def test_no_allowlist_dm_also_allowed(adapter):
"""Same for DMs — no allowlist means no restriction, matching on_message."""
interaction = _make_interaction("999999999", in_dm=True)
assert await adapter._check_slash_authorization(interaction, "/help") is True
# ---------------------------------------------------------------------------
# User allowlist (DISCORD_ALLOWED_USERS) parity
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_allowed_user_passes(adapter):
adapter._allowed_user_ids = {"100200300"}
interaction = _make_interaction("100200300")
assert await adapter._check_slash_authorization(interaction, "/background hi") is True
interaction.response.send_message.assert_not_awaited()
@pytest.mark.asyncio
async def test_disallowed_user_rejected_with_ephemeral(adapter, caplog):
adapter._allowed_user_ids = {"100200300"}
interaction = _make_interaction("999999999")
with caplog.at_level(logging.WARNING):
assert await adapter._check_slash_authorization(interaction, "/background hi") is False
interaction.response.send_message.assert_awaited_once()
args, kwargs = interaction.response.send_message.call_args
assert kwargs.get("ephemeral") is True
assert "not authorized" in (args[0] if args else kwargs.get("content", "")).lower()
assert any("Unauthorized slash attempt" in r.message for r in caplog.records)
assert any("DISCORD_ALLOWED_USERS" in r.message for r in caplog.records)
# ---------------------------------------------------------------------------
# Role allowlist (DISCORD_ALLOWED_ROLES) parity
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_role_member_passes(adapter):
"""A user whose Member.roles includes an allowed role passes the gate."""
adapter._allowed_role_ids = {1234}
interaction = _make_interaction("999999999")
interaction.user.roles = [SimpleNamespace(id=1234)]
assert await adapter._check_slash_authorization(interaction, "/help") is True
@pytest.mark.asyncio
async def test_role_non_member_rejected(adapter):
"""A user without any matching role is rejected even if no user allowlist."""
adapter._allowed_role_ids = {1234}
interaction = _make_interaction("999999999")
interaction.user.roles = [SimpleNamespace(id=9999)] # different role
assert await adapter._check_slash_authorization(interaction, "/help") is False
# ---------------------------------------------------------------------------
# Channel allowlist (DISCORD_ALLOWED_CHANNELS) parity — the gate prajer used
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_channel_not_in_allowlist_rejected(adapter, monkeypatch, caplog):
"""on_message blocks messages in channels not in DISCORD_ALLOWED_CHANNELS;
slash must do the same. This is the EXACT bypass prajer exploited.
"""
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "1111,2222")
interaction = _make_interaction("100200300", channel_id=9999)
with caplog.at_level(logging.WARNING):
assert await adapter._check_slash_authorization(interaction, "/background hi") is False
assert any("DISCORD_ALLOWED_CHANNELS" in r.message for r in caplog.records)
@pytest.mark.asyncio
async def test_channel_in_allowlist_passes(adapter, monkeypatch):
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "1111,2222")
interaction = _make_interaction("100200300", channel_id=1111)
assert await adapter._check_slash_authorization(interaction, "/help") is True
@pytest.mark.asyncio
async def test_channel_allowlist_wildcard_passes(adapter, monkeypatch):
"""``*`` in DISCORD_ALLOWED_CHANNELS = allow any channel, matching on_message."""
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "*")
interaction = _make_interaction("100200300", channel_id=9999)
assert await adapter._check_slash_authorization(interaction, "/help") is True
@pytest.mark.asyncio
async def test_channel_allowlist_does_not_apply_to_dms(adapter, monkeypatch):
"""DMs aren't channel-gated — they go through on_message's DM lockdown."""
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "1111")
interaction = _make_interaction("100200300", in_dm=True)
assert await adapter._check_slash_authorization(interaction, "/help") is True
# ---------------------------------------------------------------------------
# Channel blocklist (DISCORD_IGNORED_CHANNELS) parity
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_ignored_channel_rejected(adapter, monkeypatch, caplog):
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "9999")
interaction = _make_interaction("100200300", channel_id=9999)
with caplog.at_level(logging.WARNING):
assert await adapter._check_slash_authorization(interaction, "/help") is False
assert any("DISCORD_IGNORED_CHANNELS" in r.message for r in caplog.records)
@pytest.mark.asyncio
async def test_ignored_channel_wildcard_blocks_all(adapter, monkeypatch):
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "*")
interaction = _make_interaction("100200300", channel_id=9999)
assert await adapter._check_slash_authorization(interaction, "/help") is False
# ---------------------------------------------------------------------------
# Cross-platform admin notification
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_unauthorized_attempt_notifies_telegram(adapter):
from gateway.session import Platform
telegram_adapter = SimpleNamespace(send=AsyncMock())
home = SimpleNamespace(chat_id="987654321")
runner = SimpleNamespace(
adapters={Platform.TELEGRAM: telegram_adapter},
config=SimpleNamespace(get_home_channel=lambda p: home if p is Platform.TELEGRAM else None),
)
adapter.gateway_runner = runner
adapter._allowed_user_ids = {"100200300"}
interaction = _make_interaction("999999999")
await adapter._check_slash_authorization(interaction, "/background hi")
# Notify is fire-and-forget — let the scheduled task run.
await asyncio.sleep(0)
await asyncio.sleep(0)
telegram_adapter.send.assert_awaited_once()
chat_id, msg = telegram_adapter.send.call_args.args
assert chat_id == "987654321"
assert "Unauthorized" in msg
assert "999999999" in msg
assert "/background hi" in msg
assert "DISCORD_ALLOWED_USERS" in msg
@pytest.mark.asyncio
async def test_notify_silently_no_ops_without_runner(adapter):
adapter.gateway_runner = None
await adapter._notify_unauthorized_slash("u", "1", 2, 3, "/x", "reason") # must not raise
@pytest.mark.asyncio
async def test_notify_falls_back_to_slack_if_no_telegram(adapter):
from gateway.session import Platform
slack_adapter = SimpleNamespace(send=AsyncMock())
home_slack = SimpleNamespace(chat_id="C12345")
runner = SimpleNamespace(
adapters={Platform.SLACK: slack_adapter},
config=SimpleNamespace(
get_home_channel=lambda p: home_slack if p is Platform.SLACK else None,
),
)
adapter.gateway_runner = runner
await adapter._notify_unauthorized_slash("u", "1", 2, 3, "/x", "reason")
slack_adapter.send.assert_awaited_once()
# ---------------------------------------------------------------------------
# Opt-in visibility hide
# ---------------------------------------------------------------------------
def test_visibility_hide_off_by_default_is_noop(adapter, monkeypatch):
"""DISCORD_HIDE_SLASH_COMMANDS unset → don't touch any command's permissions."""
cmd = SimpleNamespace(name="x", default_permissions="UNCHANGED")
tree = SimpleNamespace(get_commands=lambda: [cmd])
# There is no clean way to simulate the env-gated branch of
# _register_slash_commands from here, so this test only confirms that the
# env var is unset by default and that the helper can be called directly.
assert os.environ.get("DISCORD_HIDE_SLASH_COMMANDS") is None
# Helper should still work when called directly:
adapter._apply_owner_only_visibility(tree)
# When called directly the helper applies — env gating is at the call site,
# which we exercise in an integration-style test below.
def test_visibility_hide_helper_zeroes_perms(adapter):
cmd_a = SimpleNamespace(name="a", default_permissions=None)
cmd_b = SimpleNamespace(name="b", default_permissions=None)
tree = SimpleNamespace(get_commands=lambda: [cmd_a, cmd_b])
adapter._apply_owner_only_visibility(tree)
assert cmd_a.default_permissions is not None
assert cmd_b.default_permissions is not None
assert cmd_a.default_permissions.value == 0
assert cmd_b.default_permissions.value == 0
def test_visibility_hide_tolerates_unsetable_command(adapter, caplog):
class _Frozen:
__slots__ = ("name",)
def __init__(self, name):
self.name = name
cmd_ok = SimpleNamespace(name="ok", default_permissions=None)
cmd_bad = _Frozen("bad")
tree = SimpleNamespace(get_commands=lambda: [cmd_bad, cmd_ok])
with caplog.at_level(logging.DEBUG):
adapter._apply_owner_only_visibility(tree)
assert cmd_ok.default_permissions.value == 0
# os import for test_visibility_hide_off_by_default_is_noop
import os # noqa: E402
# ---------------------------------------------------------------------------
# Fail-closed parity on malformed slash auth context
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_missing_channel_id_rejected_when_channel_policy_configured(
adapter, monkeypatch,
):
"""A guild interaction without a resolvable channel id must fail
closed when DISCORD_ALLOWED_CHANNELS is configured. Without this
guard the entire channel-policy block silently fell through."""
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "1111,2222")
interaction = _make_interaction("100200300", channel_id=None)
assert await adapter._check_slash_authorization(interaction, "/help") is False
interaction.response.send_message.assert_awaited_once()
@pytest.mark.asyncio
async def test_missing_channel_id_allowed_when_no_channel_policy(adapter):
"""No DISCORD_ALLOWED_CHANNELS configured + missing channel id: still
pass through the channel block (matches no-allowlist default)."""
interaction = _make_interaction("100200300", channel_id=None)
assert await adapter._check_slash_authorization(interaction, "/help") is True
@pytest.mark.asyncio
async def test_missing_user_rejected_when_allowlist_configured(adapter):
"""interaction.user is None with a user/role allowlist active:
fail closed without raising AttributeError."""
adapter._allowed_user_ids = {"100200300"}
interaction = _make_interaction("100200300", user=None)
# Must not raise — must return False with an ephemeral rejection
assert await adapter._check_slash_authorization(interaction, "/help") is False
interaction.response.send_message.assert_awaited_once()
@pytest.mark.asyncio
async def test_missing_user_allowed_when_no_allowlist_configured(adapter):
"""interaction.user is None but no allowlist configured: allow
(preserves no-allowlist back-compat -- anyone is allowed when no
policy is in effect)."""
interaction = _make_interaction("100200300", user=None)
assert await adapter._check_slash_authorization(interaction, "/help") is True
# ---------------------------------------------------------------------------
# Thread parent channel allowlist parity
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_thread_parent_in_allowlist_passes(adapter, monkeypatch):
"""Thread whose parent channel is on DISCORD_ALLOWED_CHANNELS passes
even though the thread id itself isn't on the list."""
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "5555")
interaction = _make_interaction(
"100200300", channel_id=9999, in_thread=True, parent_channel_id=5555,
)
assert await adapter._check_slash_authorization(interaction, "/help") is True
@pytest.mark.asyncio
async def test_thread_parent_in_ignorelist_rejects(adapter, monkeypatch):
"""Thread whose parent channel is on DISCORD_IGNORED_CHANNELS rejects
even when the thread id itself isn't ignored."""
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "5555")
interaction = _make_interaction(
"100200300", channel_id=9999, in_thread=True, parent_channel_id=5555,
)
assert await adapter._check_slash_authorization(interaction, "/help") is False
@pytest.mark.asyncio
async def test_ignored_beats_allowed(adapter, monkeypatch):
"""Channel listed in BOTH allowed and ignored: the ignored entry wins.
Anything else would be a foot-gun where adding to ignored does nothing
if the channel is also explicitly allowed."""
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "1111")
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "1111")
interaction = _make_interaction("100200300", channel_id=1111)
assert await adapter._check_slash_authorization(interaction, "/help") is False
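The channel-policy rules exercised by the tests above (wildcards, thread parents, ignored-beats-allowed, fail-closed on a missing channel id) fit in one small decision function. This is a sketch under stated assumptions, not the body of `_check_slash_authorization`: `channel_allowed_sketch` is a hypothetical name, the comma-separated parsing is inferred from the env values used in the tests, and DM handling is left out entirely. Rejecting a missing channel id when only DISCORD_IGNORED_CHANNELS is set is also an assumption, since the tests cover only the allowlist case.

```python
def _parse_ids(raw: str) -> set[str]:
    """Split a comma-separated env value into a set of trimmed entries."""
    return {part.strip() for part in raw.split(",") if part.strip()}


def channel_allowed_sketch(channel_id, parent_id, allowed_raw: str, ignored_raw: str) -> bool:
    """Return True if a guild interaction passes the channel policy."""
    allowed = _parse_ids(allowed_raw)
    ignored = _parse_ids(ignored_raw)

    # No channel policy configured: nothing to enforce.
    if not allowed and not ignored:
        return True

    # Policy configured but no resolvable channel: fail closed.
    if channel_id is None:
        return False

    # A thread is matched by its own id or by its parent channel's id.
    ids = {str(channel_id)}
    if parent_id is not None:
        ids.add(str(parent_id))

    # Ignored entries are checked first so they beat the allowlist.
    if "*" in ignored or ids & ignored:
        return False

    if allowed and "*" not in allowed and not (ids & allowed):
        return False

    return True
```

Checking the ignore list before the allowlist is what makes a channel listed in both sets resolve to a rejection.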
# ---------------------------------------------------------------------------
# Admin notify soft-fail fallback
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_notify_falls_back_to_slack_on_telegram_soft_fail(adapter):
"""adapter.send returning SendResult(success=False) must NOT short-
circuit the fallback chain. Treating a soft failure as delivered
means a Telegram outage swallows alerts silently."""
from gateway.session import Platform
soft_fail = SimpleNamespace(success=False, error="rate limited")
telegram_adapter = SimpleNamespace(send=AsyncMock(return_value=soft_fail))
slack_adapter = SimpleNamespace(send=AsyncMock())
home_tg = SimpleNamespace(chat_id="987654321")
home_sl = SimpleNamespace(chat_id="C12345")
homes = {Platform.TELEGRAM: home_tg, Platform.SLACK: home_sl}
runner = SimpleNamespace(
adapters={
Platform.TELEGRAM: telegram_adapter,
Platform.SLACK: slack_adapter,
},
config=SimpleNamespace(get_home_channel=lambda p: homes.get(p)),
)
adapter.gateway_runner = runner
await adapter._notify_unauthorized_slash("u", "1", 2, 3, "/x", "reason")
telegram_adapter.send.assert_awaited_once()
slack_adapter.send.assert_awaited_once()
@pytest.mark.asyncio
async def test_notify_returns_on_telegram_truthy_success(adapter):
"""adapter.send returning SendResult(success=True) -- or any object
without a falsy success attribute -- should still short-circuit at
Telegram. (This guards against the soft-fail patch over-correcting.)"""
from gateway.session import Platform
ok = SimpleNamespace(success=True, message_id="m1")
telegram_adapter = SimpleNamespace(send=AsyncMock(return_value=ok))
slack_adapter = SimpleNamespace(send=AsyncMock())
home_tg = SimpleNamespace(chat_id="987654321")
home_sl = SimpleNamespace(chat_id="C12345")
homes = {Platform.TELEGRAM: home_tg, Platform.SLACK: home_sl}
runner = SimpleNamespace(
adapters={
Platform.TELEGRAM: telegram_adapter,
Platform.SLACK: slack_adapter,
},
config=SimpleNamespace(get_home_channel=lambda p: homes.get(p)),
)
adapter.gateway_runner = runner
await adapter._notify_unauthorized_slash("u", "1", 2, 3, "/x", "reason")
telegram_adapter.send.assert_awaited_once()
slack_adapter.send.assert_not_awaited()
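The two notify tests above describe a fallback chain in which a returned result with a falsy `success` counts as "not delivered". A minimal sketch of that loop, not the adapter's real `_notify_unauthorized_slash`. `notify_with_fallback` and the demo senders are hypothetical, and treating a raised exception the same as a soft failure is an assumption of this sketch:

```python
import asyncio
from types import SimpleNamespace
from typing import Any, Awaitable, Callable, List, Tuple

Sender = Callable[[str], Awaitable[Any]]


async def notify_with_fallback(senders: List[Tuple[str, Sender]], text: str) -> str:
    """Try each sender in order; return the name of the first that delivered, or ""."""
    for name, send in senders:
        try:
            result = await send(text)
        except Exception:
            # Assumption: a hard failure also falls through to the next sender.
            continue
        # Soft failure: the result explicitly reports a falsy ``success``.
        # Results without a ``success`` attribute are treated as delivered.
        if not getattr(result, "success", True):
            continue
        return name
    return ""


# Demo senders (hypothetical stand-ins for platform adapters).
async def soft_fail_sender(_text: str) -> SimpleNamespace:
    return SimpleNamespace(success=False, error="rate limited")


async def ok_sender(_text: str) -> SimpleNamespace:
    return SimpleNamespace(success=True, message_id="m1")


async def raising_sender(_text: str) -> SimpleNamespace:
    raise RuntimeError("platform down")
```

With `[("telegram", soft_fail_sender), ("slack", ok_sender)]` the chain moves on to Slack; with a successful first sender it stops there, which mirrors the over-correction guard in the second test.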
# ---------------------------------------------------------------------------
# /skill autocomplete + callback gating
# ---------------------------------------------------------------------------
def _capture_skill_registration(adapter, monkeypatch, entries):
"""Run ``_register_skill_group`` against a stubbed skill catalog and
return ``(handler_callback, autocomplete_callback)``.
The autocomplete callback is captured by monkeypatching
``discord.app_commands.autocomplete`` -- the production decorator is
a no-op stub in this test file's discord mock, so capturing the
callback through it is the direct route in tests.
"""
import discord
captured: dict = {}
def fake_categories(reserved_names):
# Match discord_skill_commands_by_category's tuple shape:
# (categories_dict, uncategorized_list, hidden_count)
return ({}, list(entries), 0)
import hermes_cli.commands as _hc
monkeypatch.setattr(
_hc, "discord_skill_commands_by_category", fake_categories,
)
def capture_autocomplete(**kwargs):
# Only one autocomplete in /skill registration: name=...
captured["autocomplete"] = kwargs.get("name")
def _passthrough(fn):
return fn
return _passthrough
monkeypatch.setattr(
discord.app_commands, "autocomplete", capture_autocomplete,
raising=False,
)
registered: list = []
class _Tree:
def get_commands(self):
return []
def add_command(self, cmd):
registered.append(cmd)
adapter._register_skill_group(_Tree())
assert registered, "_register_skill_group did not register a command"
return registered[0].callback, captured["autocomplete"]
@pytest.mark.asyncio
async def test_skill_autocomplete_returns_empty_for_unauthorized(
adapter, monkeypatch,
):
"""Autocomplete must not leak the installed skill catalog to users
who can't run /skill. With DISCORD_ALLOWED_USERS configured and the
interaction user outside it, the autocomplete callback returns []."""
adapter._allowed_user_ids = {"100200300"}
entries = [
("alpha", "First skill", "/alpha"),
("beta", "Second skill", "/beta"),
]
_handler, autocomplete = _capture_skill_registration(
adapter, monkeypatch, entries,
)
interaction = _make_interaction("999999999")
result = await autocomplete(interaction, "")
assert result == []
@pytest.mark.asyncio
async def test_skill_autocomplete_returns_choices_for_authorized(
adapter, monkeypatch,
):
"""Sanity: an authorized user still gets the autocomplete suggestions."""
adapter._allowed_user_ids = {"100200300"}
entries = [
("alpha", "First skill", "/alpha"),
("beta", "Second skill", "/beta"),
]
_handler, autocomplete = _capture_skill_registration(
adapter, monkeypatch, entries,
)
interaction = _make_interaction("100200300")
result = await autocomplete(interaction, "")
assert len(result) == 2
assert {choice.value for choice in result} == {"alpha", "beta"}
@pytest.mark.asyncio
async def test_skill_handler_rejects_before_dispatch_for_unauthorized(
adapter, monkeypatch,
):
"""The /skill handler must call _check_slash_authorization BEFORE
skill_lookup. Otherwise unknown vs known names produce divergent
responses ("Unknown skill: foo" vs auth rejection) which is a
catalog-probing oracle."""
adapter._allowed_user_ids = {"100200300"}
entries = [("alpha", "First skill", "/alpha")]
handler, _autocomplete = _capture_skill_registration(
adapter, monkeypatch, entries,
)
# Patch _run_simple_slash so we can detect any leak through it.
dispatched: list = []
async def fake_dispatch(_interaction, text):
dispatched.append(text)
adapter._run_simple_slash = fake_dispatch # type: ignore[assignment]
interaction = _make_interaction("999999999")
await handler(interaction, "alpha", "")
interaction.response.send_message.assert_awaited_once()
args, kwargs = interaction.response.send_message.call_args
assert kwargs.get("ephemeral") is True
assert "not authorized" in (
args[0] if args else kwargs.get("content", "")
).lower()
# Critically: nothing was dispatched. That the rejection text does not
# vary with the skill name is asserted by the known-vs-unknown test below.
assert dispatched == []
@pytest.mark.asyncio
async def test_skill_handler_known_and_unknown_produce_same_rejection(
adapter, monkeypatch,
):
"""An unauthorized user probing for valid skill names must see the
same rejection text regardless of whether the name they tried is
on the registered catalog."""
adapter._allowed_user_ids = {"100200300"}
entries = [("alpha", "First skill", "/alpha")]
handler, _ = _capture_skill_registration(adapter, monkeypatch, entries)
adapter._run_simple_slash = AsyncMock() # type: ignore[assignment]
known_interaction = _make_interaction("999999999")
unknown_interaction = _make_interaction("999999999")
await handler(known_interaction, "alpha", "")
await handler(unknown_interaction, "definitely-not-a-skill", "")
known_interaction.response.send_message.assert_awaited_once()
unknown_interaction.response.send_message.assert_awaited_once()
known_args, known_kwargs = known_interaction.response.send_message.call_args
unknown_args, unknown_kwargs = (
unknown_interaction.response.send_message.call_args
)
assert known_args == unknown_args
assert known_kwargs == unknown_kwargs
@pytest.mark.asyncio
async def test_skill_handler_dispatches_for_authorized(
adapter, monkeypatch,
):
"""Sanity: an authorized user reaches _run_simple_slash with the
resolved cmd_key and arguments."""
adapter._allowed_user_ids = {"100200300"}
entries = [("alpha", "First skill", "/alpha")]
handler, _ = _capture_skill_registration(adapter, monkeypatch, entries)
dispatched: list = []
async def fake_dispatch(_interaction, text):
dispatched.append(text)
adapter._run_simple_slash = fake_dispatch # type: ignore[assignment]
interaction = _make_interaction("100200300")
await handler(interaction, "alpha", "extra args")
assert dispatched == ["/alpha extra args"]
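The /skill handler tests above rely on one ordering rule: authorize first, look up the skill second, so an unauthorized caller gets the same reply whether or not the name exists. A minimal sketch of that ordering. `handle_skill`, `REJECTION`, and the string return values are hypothetical stand-ins, not the adapter's real handler or messages:

```python
from typing import Dict, Set

# Hypothetical rejection text; the real message is not shown in this diff.
REJECTION = "You are not authorized to use this command."


def handle_skill(
    user_id: str,
    name: str,
    allowed_users: Set[str],
    catalog: Dict[str, str],
) -> str:
    """Return the reply for a /skill invocation."""
    # Gate first: the reply to an unauthorized caller must not depend on
    # ``name``, otherwise known vs unknown names become a probing oracle.
    if allowed_users and user_id not in allowed_users:
        return REJECTION

    # Only authorized callers ever reach the catalog lookup.
    cmd_key = catalog.get(name)
    if cmd_key is None:
        return f"Unknown skill: {name}"
    return f"dispatch {cmd_key}"
```

If the lookup ran before the gate, the "Unknown skill" branch would be reachable by anyone and would reveal which names are installed.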
+16 -1
@@ -107,6 +107,10 @@ def adapter():
user=SimpleNamespace(id=99999, name="HermesBot"),
)
adapter._text_batch_delay_seconds = 0 # disable batching for tests
# Slash auth is exercised in test_discord_slash_auth.py — bypass it here
# so registration / dispatch / thread behavior tests don't have to
# construct a full auth context (allowlist / channel scope).
adapter._check_slash_authorization = AsyncMock(return_value=True)
return adapter
@@ -117,6 +121,10 @@ def adapter():
@pytest.mark.asyncio
async def test_registers_native_thread_slash_command(adapter):
# The /thread slash closure now delegates ALL the work — including
# defer() — to _handle_thread_create_slash so the auth gate can send
# an ephemeral rejection on the still-unresponded interaction. The
# closure should just forward.
adapter._handle_thread_create_slash = AsyncMock()
adapter._register_slash_commands()
@@ -127,7 +135,9 @@ async def test_registers_native_thread_slash_command(adapter):
await command(interaction, name="Planning", message="", auto_archive_duration=1440)
-interaction.response.defer.assert_awaited_once_with(ephemeral=True)
+# defer is now performed inside _handle_thread_create_slash, AFTER the
+# auth check passes, not by the closure.
+interaction.response.defer.assert_not_awaited()
adapter._handle_thread_create_slash.assert_awaited_once_with(interaction, "Planning", "", 1440)
@@ -298,6 +308,7 @@ async def test_handle_thread_create_slash_reports_success(adapter):
user=SimpleNamespace(display_name="Jezza", id=42),
guild=SimpleNamespace(name="TestGuild"),
followup=SimpleNamespace(send=AsyncMock()),
response=SimpleNamespace(defer=AsyncMock()),
)
await adapter._handle_thread_create_slash(interaction, "Planning", "Kickoff", 1440)
@@ -326,6 +337,7 @@ async def test_handle_thread_create_slash_dispatches_session_when_message_provid
user=SimpleNamespace(display_name="Jezza", id=42),
guild=SimpleNamespace(name="TestGuild"),
followup=SimpleNamespace(send=AsyncMock()),
response=SimpleNamespace(defer=AsyncMock()),
)
adapter._dispatch_thread_session = AsyncMock()
@@ -348,6 +360,7 @@ async def test_handle_thread_create_slash_no_dispatch_without_message(adapter):
user=SimpleNamespace(display_name="Jezza", id=42),
guild=SimpleNamespace(name="TestGuild"),
followup=SimpleNamespace(send=AsyncMock()),
response=SimpleNamespace(defer=AsyncMock()),
)
adapter._dispatch_thread_session = AsyncMock()
@@ -371,6 +384,7 @@ async def test_handle_thread_create_slash_falls_back_to_seed_message(adapter):
user=SimpleNamespace(display_name="Jezza", id=42),
guild=SimpleNamespace(name="TestGuild"),
followup=SimpleNamespace(send=AsyncMock()),
response=SimpleNamespace(defer=AsyncMock()),
)
await adapter._handle_thread_create_slash(interaction, "Planning", "Kickoff", 1440)
@@ -395,6 +409,7 @@ async def test_handle_thread_create_slash_reports_failure(adapter):
channel_id=123,
user=SimpleNamespace(display_name="Jezza", id=42),
followup=SimpleNamespace(send=AsyncMock()),
response=SimpleNamespace(defer=AsyncMock()),
)
await adapter._handle_thread_create_slash(interaction, "Planning", "", 1440)
+63
@@ -1771,6 +1771,69 @@ class TestAdapterBehavior(unittest.TestCase):
self.assertIn("GIF downgraded to file", caption)
self.assertIn("look", caption)
def test_download_remote_document_reads_response_before_httpx_client_closes(self):
"""#18451 — snapshot Content-Type + body while the httpx.AsyncClient
context is still active so pooled connections fully release on
exit. Otherwise the response is only readable because httpx
eagerly buffers it; a future refactor to .stream() would silently
read-after-close."""
from gateway.config import PlatformConfig
from gateway.platforms.feishu import FeishuAdapter
events: list[str] = []
class _FakeResponse:
headers = {"Content-Type": "application/octet-stream"}
def raise_for_status(self) -> None:
events.append("raise_for_status")
@property
def content(self) -> bytes:
events.append("content_read")
return b"doc-bytes"
class _FakeAsyncClient:
def __init__(self, *_a: object, **_k: object) -> None:
pass
async def __aenter__(self) -> "_FakeAsyncClient":
events.append("client_enter")
return self
async def __aexit__(self, *exc: object) -> None:
events.append("client_exit")
async def get(self, *_a: object, **_k: object) -> _FakeResponse:
events.append("get")
return _FakeResponse()
with tempfile.TemporaryDirectory() as tmp:
with patch.dict(os.environ, {"HERMES_HOME": tmp}, clear=False):
adapter = FeishuAdapter(PlatformConfig())
async def _run() -> tuple[str, str]:
with patch("tools.url_safety.is_safe_url", return_value=True):
with patch("httpx.AsyncClient", _FakeAsyncClient):
with patch(
"gateway.platforms.feishu.cache_document_from_bytes",
return_value="/tmp/cached-doc.bin",
):
return await adapter._download_remote_document(
"https://example.com/doc.bin",
default_ext=".bin",
preferred_name="doc",
)
path, filename = asyncio.run(_run())
self.assertEqual(path, "/tmp/cached-doc.bin")
self.assertTrue(filename)
# content_read MUST happen before client_exit — otherwise we're
# reading response body after the connection pool has been torn
# down, which only works by accident (httpx's eager buffering).
self.assertLess(events.index("content_read"), events.index("client_exit"))
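The test above checks an ordering property: headers and body are snapshotted while the client context is still open. A minimal sketch of that pattern with a recording fake, so it runs without a network. `fetch_document`, `FakeClient`, `FakeResponse`, and `EVENTS` are hypothetical names, not the Feishu adapter's code:

```python
import asyncio
from typing import Any, Callable, List, Tuple

EVENTS: List[str] = []  # records call order for the demo


async def fetch_document(url: str, client_factory: Callable[[], Any]) -> Tuple[str, bytes]:
    """Return (content_type, body), both captured inside the client context."""
    async with client_factory() as client:
        response = await client.get(url)
        response.raise_for_status()
        # Snapshot everything we need before the context exits.
        content_type = response.headers.get("Content-Type", "")
        body = response.content
    # Only plain values leave the ``async with`` block.
    return content_type, body


class FakeResponse:
    headers = {"Content-Type": "application/octet-stream"}

    def raise_for_status(self) -> None:
        EVENTS.append("raise_for_status")

    @property
    def content(self) -> bytes:
        EVENTS.append("content_read")
        return b"doc-bytes"


class FakeClient:
    async def __aenter__(self) -> "FakeClient":
        EVENTS.append("client_enter")
        return self

    async def __aexit__(self, *exc: object) -> None:
        EVENTS.append("client_exit")

    async def get(self, _url: str) -> FakeResponse:
        EVENTS.append("get")
        return FakeResponse()
```

Passing `httpx.AsyncClient` as `client_factory` should behave the same way, since it is an async context manager with an awaitable `get()`; that substitution is not exercised here.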
def test_dedup_state_persists_across_adapter_restart(self):
from gateway.config import PlatformConfig
from gateway.platforms.feishu import FeishuAdapter
@@ -0,0 +1,78 @@
"""Gateway command help rendering tests."""
import pytest
from gateway.config import Platform
from gateway.platforms.base import MessageEvent
from gateway.session import SessionSource
def _make_event(text: str, platform: Platform) -> MessageEvent:
return MessageEvent(
text=text,
source=SessionSource(
platform=platform,
chat_id="chat-1",
user_id="user-1",
user_name="tester",
chat_type="dm",
),
)
def _make_runner():
from gateway.run import GatewayRunner
return object.__new__(GatewayRunner)
@pytest.mark.asyncio
async def test_help_sanitizes_slash_command_mentions_for_telegram(monkeypatch):
"""Telegram help output must not expose invalid uppercase/hyphenated slashes."""
monkeypatch.setattr(
"agent.skill_commands.get_skill_commands",
lambda: {
"/Linear": {"description": "Open Linear"},
"/Custom-Thing": {"description": "Run a custom thing"},
},
)
result = await _make_runner()._handle_help_command(
_make_event("/help", Platform.TELEGRAM)
)
assert "`/linear`" in result
assert "`/custom_thing`" in result
assert "`/Linear`" not in result
assert "`/Custom-Thing`" not in result
@pytest.mark.asyncio
async def test_commands_sanitizes_slash_command_mentions_for_telegram(monkeypatch):
"""Paginated Telegram /commands output uses Telegram-valid slash mentions."""
monkeypatch.setattr(
"agent.skill_commands.get_skill_commands",
lambda: {"/Linear": {"description": "Open Linear"}},
)
result = await _make_runner()._handle_commands_command(
_make_event("/commands 999", Platform.TELEGRAM)
)
assert "`/linear`" in result
assert "`/Linear`" not in result
@pytest.mark.asyncio
async def test_help_keeps_non_telegram_slash_command_mentions_unchanged(monkeypatch):
"""Only Telegram needs slash mentions rewritten to Telegram command names."""
monkeypatch.setattr(
"agent.skill_commands.get_skill_commands",
lambda: {"/Linear": {"description": "Open Linear"}},
)
result = await _make_runner()._handle_help_command(
_make_event("/help", Platform.DISCORD)
)
assert "`/Linear`" in result
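The help tests above expect skill names to be rewritten only for Telegram, whose Bot API restricts command names to lowercase English letters, digits, and underscores (1 to 32 characters). A minimal sketch of such a rewrite. `telegram_command_name` and `render_mention` are hypothetical helpers; mapping every other invalid character to `_` and truncating to 32 characters are assumptions beyond what the tests show (lowercasing and hyphen to underscore):

```python
import re


def telegram_command_name(name: str) -> str:
    """Map a slash command like '/Custom-Thing' to a Telegram-valid '/custom_thing'."""
    bare = name.lstrip("/").lower()
    # Assumption: any character outside [a-z0-9_] becomes an underscore.
    bare = re.sub(r"[^a-z0-9_]", "_", bare)
    # Assumption: over-long names are truncated to the 32-character limit.
    return "/" + bare[:32]


def render_mention(name: str, platform: str) -> str:
    """Render a backticked command mention, rewriting only for Telegram."""
    shown = telegram_command_name(name) if platform == "telegram" else name
    return f"`{shown}`"
```

Keeping the rewrite behind a platform check is what lets the Discord test still see `/Linear` unchanged.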
+217
@@ -0,0 +1,217 @@
"""Tests for gateway /goal verdict-message delivery.
The judge verdict message ("✓ Goal achieved", "⏸ budget exhausted", etc.)
must reach the user after each turn. Before this fix the code checked
``hasattr(adapter, "send_message")`` but adapters expose ``send()``,
never ``send_message``, so the check always evaluated False and users
never saw verdicts. This test locks in the fix.
"""
from __future__ import annotations
import asyncio
from datetime import datetime
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
from gateway.config import GatewayConfig, Platform, PlatformConfig
from gateway.session import SessionEntry, SessionSource, build_session_key
@pytest.fixture()
def hermes_home(tmp_path, monkeypatch):
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(home))
from hermes_cli import goals
goals._DB_CACHE.clear()
yield home
goals._DB_CACHE.clear()
def _make_source() -> SessionSource:
return SessionSource(
platform=Platform.TELEGRAM,
user_id="u1",
chat_id="c1",
user_name="tester",
chat_type="dm",
)
class _RecordingAdapter:
"""Minimal adapter that records send() invocations."""
def __init__(self) -> None:
self._pending_messages: dict = {}
self.sends: list[dict] = []
async def send(self, chat_id: str, content: str, reply_to=None, metadata=None):
self.sends.append({"chat_id": chat_id, "content": content, "metadata": metadata})
class _R:
success = True
message_id = "mock-msg"
return _R()
def _make_runner_with_adapter():
from gateway.run import GatewayRunner
runner = object.__new__(GatewayRunner)
runner.config = GatewayConfig(
platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="***")},
)
runner.adapters = {}
runner._running_agents = {}
runner._running_agents_ts = {}
runner._queued_events = {}
src = _make_source()
session_entry = SessionEntry(
session_key=build_session_key(src),
session_id="goal-sess-1",
created_at=datetime.now(),
updated_at=datetime.now(),
platform=Platform.TELEGRAM,
chat_type="dm",
)
runner.session_store = MagicMock()
runner.session_store.get_or_create_session.return_value = session_entry
runner.session_store._generate_session_key.return_value = build_session_key(src)
adapter = _RecordingAdapter()
runner.adapters[Platform.TELEGRAM] = adapter
return runner, adapter, session_entry, src
@pytest.mark.asyncio
async def test_goal_verdict_done_sent_via_adapter_send(hermes_home):
"""When the judge says done, the '✓ Goal achieved' message must reach
the user through the adapter's ``send()`` method."""
runner, adapter, session_entry, src = _make_runner_with_adapter()
from hermes_cli.goals import GoalManager
mgr = GoalManager(session_entry.session_id)
mgr.set("ship the feature")
with patch("hermes_cli.goals.judge_goal", return_value=("done", "the feature shipped")):
runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="I shipped the feature.",
)
# fire-and-forget create_task — give the loop a tick
await asyncio.sleep(0.05)
assert len(adapter.sends) == 1, f"expected 1 send, got {len(adapter.sends)}: {adapter.sends}"
msg = adapter.sends[0]
assert msg["chat_id"] == "c1"
assert "Goal achieved" in msg["content"]
assert "the feature shipped" in msg["content"]
@pytest.mark.asyncio
async def test_goal_verdict_continue_enqueues_continuation(hermes_home):
"""When the judge says continue, both the 'continuing' status and the
continuation-prompt event must be delivered. The continuation prompt is
routed through the adapter's pending-messages FIFO so the goal loop
proceeds on the next turn."""
runner, adapter, session_entry, src = _make_runner_with_adapter()
from hermes_cli.goals import GoalManager
mgr = GoalManager(session_entry.session_id)
mgr.set("polish the docs")
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "still needs work")):
runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="here's a partial edit",
)
await asyncio.sleep(0.05)
# Status line sent back
assert len(adapter.sends) == 1
assert "Continuing toward goal" in adapter.sends[0]["content"]
# Continuation prompt enqueued for next turn
assert adapter._pending_messages, "continuation prompt must be enqueued in pending_messages"
@pytest.mark.asyncio
async def test_goal_verdict_budget_exhausted_sends_pause(hermes_home):
"""When the budget is exhausted, a '⏸ Goal paused' message must be sent
and no further continuation enqueued."""
runner, adapter, session_entry, src = _make_runner_with_adapter()
from hermes_cli.goals import GoalManager, save_goal
mgr = GoalManager(session_entry.session_id, default_max_turns=2)
state = mgr.set("tiny goal", max_turns=2)
state.turns_used = 2
save_goal(session_entry.session_id, state)
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "keep going")):
runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="still partial",
)
await asyncio.sleep(0.05)
assert len(adapter.sends) == 1
content = adapter.sends[0]["content"]
assert "paused" in content.lower()
assert "turns used" in content.lower()
# No continuation enqueued when budget is exhausted
assert not adapter._pending_messages
@pytest.mark.asyncio
async def test_goal_verdict_skipped_when_no_active_goal(hermes_home):
"""No goal set → the hook is a no-op. Nothing is sent, nothing enqueued."""
runner, adapter, session_entry, src = _make_runner_with_adapter()
runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="anything",
)
await asyncio.sleep(0.05)
assert adapter.sends == []
assert adapter._pending_messages == {}
@pytest.mark.asyncio
async def test_goal_verdict_survives_adapter_without_send(hermes_home):
"""Bad adapter (no ``send`` attribute) must not crash the judge hook."""
runner, _adapter, session_entry, src = _make_runner_with_adapter()
from hermes_cli.goals import GoalManager
GoalManager(session_entry.session_id).set("survive missing send")
class _NoSendAdapter:
def __init__(self):
self._pending_messages: dict = {}
runner.adapters[Platform.TELEGRAM] = _NoSendAdapter()
with patch("hermes_cli.goals.judge_goal", return_value=("done", "ok")):
# must not raise
runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="whatever",
)
await asyncio.sleep(0.05)
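The module docstring above attributes the original bug to probing for `send_message` when adapters only expose `send()`, and the last test requires that an adapter without `send` not crash the hook. A minimal sketch of a tolerant, fire-and-forget delivery step under those assumptions. `schedule_verdict` and the demo adapters are hypothetical, not the body of `_post_turn_goal_continuation`:

```python
import asyncio
from typing import Any, List, Optional, Tuple


def schedule_verdict(adapter: Any, chat_id: str, text: str) -> Optional[asyncio.Task]:
    """Schedule adapter.send(chat_id, text) without awaiting it.

    Must be called while an event loop is running. Returns None when the
    adapter has no callable ``send`` instead of raising.
    """
    send = getattr(adapter, "send", None)  # probe the method adapters actually expose
    if not callable(send):
        return None
    return asyncio.create_task(send(chat_id, text))


class RecordingAdapter:
    def __init__(self) -> None:
        self.sends: List[Tuple[str, str]] = []

    async def send(self, chat_id: str, content: str) -> None:
        self.sends.append((chat_id, content))


class NoSendAdapter:
    """Adapter missing ``send`` entirely."""


async def demo() -> List[Any]:
    good = RecordingAdapter()
    task = schedule_verdict(good, "c1", "Goal achieved")
    skipped = schedule_verdict(NoSendAdapter(), "c1", "Goal achieved")
    if task is not None:
        await task  # the tests above instead sleep briefly to let the task run
    return [good.sends, skipped]
```

Because the send is scheduled rather than awaited, a caller that wants to observe it has to yield to the loop first, which is why the tests above sleep for a tick before asserting.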

Some files were not shown because too many files have changed in this diff.