Compare commits

88 Commits

Author SHA1 Message Date
teknium1 9e17ddcead feat(cli): add hermes send to pipe script output to any messaging platform
Introduces a thin CLI wrapper around the existing send_message_tool so
shell scripts, cron scripts, CI hooks, and monitoring daemons can reuse
the gateway's already-configured platform credentials without
reimplementing each platform's REST client.

## What

  hermes send --to telegram "deploy finished"
  echo "RAM 92%" | hermes send --to telegram:-1001234567890
  hermes send --to discord:#ops --file report.md
  hermes send --to slack:#eng --subject "[CI]" --file build.log
  hermes send --list                  # all targets
  hermes send --list telegram         # filter by platform

Supports all platforms the send_message tool already does (Telegram,
Discord, Slack, Signal, SMS, WhatsApp, Matrix, Feishu, DingTalk, WeCom,
Weixin, Email, etc.), including threaded targets and #channel-name
resolution via the channel directory.

## How

hermes_cli/send_cmd.py delegates to tools.send_message_tool.send_message_tool,
which means there is zero new platform-specific code. The subcommand just:

1. Bridges ~/.hermes/.env and top-level ~/.hermes/config.yaml scalars into
   os.environ (same bootstrap the gateway does at startup) — required so
   TELEGRAM_HOME_CHANNEL and friends are visible to load_gateway_config().
2. Resolves the message body from positional arg, --file, or piped stdin.
3. Calls the shared tool and translates its JSON result to exit codes:
   0 success, 1 delivery failure, 2 usage error.
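
Steps 2 and 3 could look roughly like the sketch below. This is an illustration, not the code in hermes_cli/send_cmd.py: the function names, the precedence order between the three body sources, and the "error" key in the tool's JSON result are all assumptions.

```python
import io
import json
from typing import Optional, TextIO

EXIT_OK, EXIT_DELIVERY_FAILURE, EXIT_USAGE = 0, 1, 2


def resolve_body(positional: Optional[str], file_arg: Optional[str],
                 stdin: TextIO) -> Optional[str]:
    """Pick the message body (assumed order: positional, --file, piped stdin)."""
    if positional:
        return positional
    if file_arg == "-":
        # "--file -" means "read the body from stdin".
        return stdin.read()
    if file_arg:
        with open(file_arg, encoding="utf-8") as fh:
            return fh.read()
    # Only consume stdin when it is a pipe, never an interactive terminal.
    if not stdin.isatty():
        return stdin.read()
    return None  # caller treats a missing body as a usage error (exit 2)


def exit_code_for(result_json: str) -> int:
    """Map the tool's JSON result to an exit code (hypothetical 'error' key)."""
    result = json.loads(result_json)
    return EXIT_DELIVERY_FAILURE if result.get("error") else EXIT_OK
```

A caller would return EXIT_USAGE itself when resolve_body() yields None or --to is missing, so only delivery outcomes go through exit_code_for().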

No running gateway is required for bot-token platforms (Telegram, Discord,
Slack, Signal, SMS, WhatsApp) — the tool hits each platform's REST API
directly. Plugin platforms that rely on a live adapter connection still
need the gateway running; the error message is forwarded verbatim.

## Docs

- New guide: website/docs/guides/pipe-script-output.md covering real-world
  patterns (memory watchdogs, CI hooks, cron pipes, long-running task
  completion pings) and the security/gateway notes.
- Cross-links added from automate-with-cron.md ("no LLM? use hermes send")
  and developer-guide/gateway-internals.md (delivery-path section).

## Tests

tests/hermes_cli/test_send_cmd.py (20 tests, all green):

- Happy paths: positional message, stdin, --file, --file -, --subject,
  --json, --quiet.
- Error paths: missing --to, missing body, file not found, tool returns
  error payload (exit 1), tool skipped-send result (exit 0).
- --list: human output, --json output, platform filter, unknown platform.
- Env loader: bridges config.yaml scalars into env, does not override
  existing env vars, gracefully handles missing files.
- Registrar contract: register_send_subparser() returns a working parser.
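
The env-loader behaviour tested above (bridge scalars, never override an existing variable, skip non-scalars) can be sketched as follows. The helper name, the verbatim use of config keys as variable names, and the lowercase rendering of booleans are assumptions made for illustration.

```python
import os
from typing import Any, Mapping, MutableMapping


def bridge_scalars(config: Mapping[str, Any],
                   environ: MutableMapping[str, str] = os.environ) -> None:
    """Copy top-level scalar config values into the environment."""
    for key, value in config.items():
        if isinstance(value, bool):
            # bool is checked first because it is a subclass of int.
            text = "true" if value else "false"
        elif isinstance(value, (str, int, float)):
            text = str(value)
        else:
            continue  # nested dicts and lists are not scalars
        # setdefault leaves variables that are already set untouched.
        environ.setdefault(str(key), text)
```

Passing a plain dict instead of os.environ keeps the loader testable without touching the real process environment.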

Smoke-tested end-to-end against a live Telegram bot before commit.
2026-05-04 02:32:49 -07:00
Teknium cac4f2c0e6 test(kanban): update worker-prompt header assertion to match #19427
PR #19427 dropped the 'You are a Kanban worker' identity line from
KANBAN_GUIDANCE so SOUL.md stays authoritative for profile identity.
This test assertion was stale against that change; update it to the
new protocol-only header.
2026-05-04 02:00:42 -07:00
pdonizete deb59eab72 fix: allow kanban tools for orchestrator profiles with kanban toolset
The _check_kanban_mode() gating function only checked for
HERMES_KANBAN_TASK env var, which is only set by the dispatcher
when spawning workers. This prevented orchestrator profiles (like
techlead) from using kanban_create, kanban_link, etc. even when
they had 'kanban' explicitly in their toolsets config.

Now uses load_config() from hermes_cli.config (which has mtime-based
caching) to check if 'kanban' is in the profile's toolsets list.
This enables orchestrators to route work via Kanban while workers
continue using the dispatcher env var.

Fixes #18968
2026-05-04 02:00:42 -07:00
nftpoetrist 9faaa292b4 fix(delegate): inherit parent fallback_chain in _build_child_agent
_build_child_agent constructed child AIAgents without passing
fallback_model, leaving _fallback_chain=[] for every subagent.
When a subagent hit a rate-limit or credential exhaustion the
runtime fallback check (run_agent.py:7486 / 12267) found an empty
chain and failed immediately — even though the parent agent was
configured with fallback_providers and would have recovered.

The cron scheduler already propagates fallback_model correctly
(scheduler.py:1038). Fix closes the parity gap by reading the
parent's _fallback_chain (the normalised list form accepted by
AIAgent's fallback_model parameter) and threading it through.

Empty chains coerce to None so AIAgent initialises _fallback_chain=[]
as usual rather than iterating an empty list.
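
A minimal sketch of the coercion described above, assuming the parent exposes its chain as a `_fallback_chain` list; the helper name is hypothetical and the real `_build_child_agent` wiring is not shown.

```python
from typing import Any, List, Optional


def inherited_fallback(parent: Any) -> Optional[List[Any]]:
    """Return a copy of the parent's fallback chain, or None when it is empty."""
    chain = getattr(parent, "_fallback_chain", None)
    # An empty or missing chain becomes None so the child takes its default path.
    return list(chain) if chain else None
```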
2026-05-04 01:48:56 -07:00
molvikar cb33c73418 fix(run_agent): gate iteration-limit provider routing to OpenRouter 2026-05-04 01:45:59 -07:00
Asunfly 8a364df2c8 fix: inherit reasoning config in API server runs 2026-05-04 01:44:16 -07:00
SHL0MS aede94e757 fix: back up config.yaml before hermes setup modifies it
Create a timestamped backup (~/.hermes/config.yaml.bak.YYYYMMDD_HHMMSS)
before the setup wizard runs any configuration sections. After setup
completes, show the backup path and a restore command.

This protects user-customized values (compression thresholds, provider
routing, PII redaction, auxiliary model configs) from being silently
overwritten by setup defaults.

Addresses #3522
2026-05-04 01:43:17 -07:00
memosr 2c7d7a9b2f fix(security): bind Meet node server to localhost and restrict token file to owner read 2026-05-04 01:42:59 -07:00
yuehei cdde0c8411 fix(feishu): enable MEDIA attachment delivery in send_message tool
The _send_feishu() function already supports media_files (images, video,
audio, documents) via the adapter's send_image_file/send_video/send_voice
/send_document methods, but _send_to_platform() never routed Feishu into
the early media-handling branch — media attachments were silently dropped
with a "not supported" warning.

Add a Feishu-specific media branch (matching the existing Yuanbao/Signal
pattern) so that MEDIA:<path> tags in send_message calls are correctly
delivered as native Feishu attachments. Also update the two error/warning
message strings to include feishu in the supported platform list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-04 01:42:40 -07:00
WanderWang 45fd45103d fix: _chromium_installed() now checks AGENT_BROWSER_EXECUTABLE_PATH and system Chrome
Before this fix, _chromium_installed() only searched Playwright-style
chromium-* / chromium_headless_shell-* directories, which meant users
with system Chrome or AGENT_BROWSER_EXECUTABLE_PATH configured still
had all browser_* tools gated.

Now checks three sources in priority order:
1. AGENT_BROWSER_EXECUTABLE_PATH env var (if set and points to a real binary)
2. System Chrome/Chromium via shutil.which() (google-chrome, chromium-browser, chrome)
3. Playwright browser cache (existing logic, kept as fallback)
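
The priority order above could be sketched like this. The function name and the `cache_dir` parameter are hypothetical, and the real check returns a boolean rather than a path; only the env var name, the binary names, and the directory patterns come from this commit.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def find_browser(cache_dir: Path) -> Optional[str]:
    """Return the first usable browser location, or None if nothing is found."""
    # 1. Explicit override, accepted only if it is an executable file.
    override = os.environ.get("AGENT_BROWSER_EXECUTABLE_PATH")
    if override and os.path.isfile(override) and os.access(override, os.X_OK):
        return override
    # 2. System Chrome/Chromium on PATH.
    for name in ("google-chrome", "chromium-browser", "chrome"):
        found = shutil.which(name)
        if found:
            return found
    # 3. Playwright-style cache directories (fallback).
    if cache_dir.is_dir():
        for pattern in ("chromium-*", "chromium_headless_shell-*"):
            for entry in sorted(cache_dir.glob(pattern)):
                if entry.is_dir():
                    return str(entry)
    return None
```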

Closes #19294
2026-05-04 01:42:23 -07:00
Yanzhong Su c653f5dc3f Clarify session_search auxiliary model docs 2026-05-04 01:42:07 -07:00
ai-ag2026 8bdec80882 fix(agent): surface preflight compression status
Preflight compression can run synchronously before the first model call when a loaded session exceeds the active context threshold. Gateway users saw no visible progress while the compression LLM call was in flight, which can look like a dropped message during long compactions.

Emit the existing lifecycle status through _emit_status before starting preflight compression so CLI, gateway, and WebUI status callbacks all get immediate feedback.

Adds a regression assertion for the preflight path.
2026-05-04 01:41:51 -07:00
qiqufang d8be50d772 fix(web): add missing icons for config page category sidebar
Add icon mappings for 9 categories that fell back to FileQuestion:
- bedrock (Cloud), curator (Sparkles), kanban (LayoutDashboard)
- model_catalog (BookOpen), openrouter (Route), sessions (History)
- tool_loop_guardrails (Shield), tool_output (FileOutput), updates (RefreshCw)
2026-05-04 01:41:27 -07:00
Teknium 06031229e8 fix(tests): tolerate ps ancestor-walk in find_gateway_pids fallback test (#19590)
Follow-up to #19586 (@cixuuz salvage): _get_ancestor_pids walks ps -o ppid=
up the process tree, which the pre-existing mock in
test_find_gateway_pids_falls_back_to_pid_file_when_process_scan_fails didn't
expect. Return empty stdout so the ancestor loop terminates cleanly and the
original fallback assertion still passes.
2026-05-04 01:40:39 -07:00
liuhao1024 9c93fc5775 fix(tui): call process.exit(0) after Ink exit to trigger terminal cleanup
Ink's exit() calls unmount() which resets terminal modes (kitty keyboard,
mouse, etc.) but does NOT call process.exit().  The Node process stays
alive because stdin is still open (Ink listens on it), so the
process.on('exit') handler in entry.tsx — which sends the final
resetTerminalModes() — never fires.

This left kitty keyboard protocol and other terminal modes enabled in the
parent shell after /quit, Ctrl+C, or Ctrl+D, breaking arrow keys and
other input in subsequent programs.

Add explicit process.exit(0) after exit() in die() so the process
actually terminates and the exit handler runs.

Fixes #19194
2026-05-04 01:39:39 -07:00
Hermes Agent 74c997d985 fix(gateway): move quick-command dispatch before built-in handlers
Quick commands of type "alias" that target built-in slash commands
(e.g. /h -> /model) were processed too late in _handle_message — after
the if-canonical=="model" checks. This meant alias expansion never
reached the target handler and fell through to the LLM as raw text.

Two fixes:
1. Move the quick_commands block before built-in dispatch so alias
   targets (like /model) hit the correct handler after expansion.
2. Extract bare command name from target_command via .split()[0] to
   feed _resolve_cmd() correctly (was using the full arg-string).
2026-05-04 01:39:23 -07:00
holynn c857592558 fix(cli): allow custom:* provider slugs in model validation
Two related fixes for custom_providers model switching:

1. validate_requested_model() now recognizes custom:<name> slugs
   (e.g. custom:volcengine) as custom endpoints, not generic providers.
   Previously only the bare 'custom' slug matched the relaxed validation
   branch, causing model validation to fail with 'not found in provider
   listing' for all named custom providers.

2. switch_model() now consults the custom_providers list when deciding
   whether to override a validation rejection. If the requested model
   matches the entry's 'model' field or any key in its 'models' dict,
   the switch is accepted even when the remote /v1/models endpoint does
   not list it.

Both changes are covered by existing tests (86 passed).
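
The two checks can be illustrated with the sketch below. Both helper names are hypothetical; the entry shape (a 'model' field plus an optional 'models' dict) follows the description above.

```python
from typing import Any, Dict


def is_custom_slug(provider: str) -> bool:
    """True for the bare 'custom' slug and for named 'custom:<name>' slugs."""
    return provider == "custom" or provider.startswith("custom:")


def model_known_to_entry(entry: Dict[str, Any], model: str) -> bool:
    """True if the model matches the entry's 'model' or a key of 'models'."""
    return model == entry.get("model") or model in (entry.get("models") or {})
```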
2026-05-04 01:39:06 -07:00
Byrn Tong e8cdcf5328 fix: exclude ancestor PIDs from gateway process scan (#13242)
_scan_gateway_pids() uses ps-based pattern matching to find running
gateways. When invoked from the CLI (e.g. `hermes gateway status`),
the calling process itself matches gateway patterns, causing false
positives — the CLI is mistakenly counted as a running gateway.

Add _get_ancestor_pids() that walks the process tree from the current
PID up to init (PID 1). Merge this set into exclude_pids at the top
of _scan_gateway_pids() so the entire ancestor chain is filtered out.

This complements the existing os.getpid() exclusion in
_append_unique_pid() by also covering parent/grandparent processes
(e.g. when hermes is invoked via a wrapper script or shell).
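
A rough sketch of the ancestor walk, assuming a POSIX `ps` that accepts `-o ppid= -p <pid>`. The function names and the injectable `parent_of` lookup are illustrative choices, not the signature of `_get_ancestor_pids()`.

```python
import subprocess
from typing import Callable, Optional, Set


def ps_parent(pid: int) -> Optional[int]:
    """Ask ps for the parent PID; return None if the lookup fails."""
    try:
        out = subprocess.run(
            ["ps", "-o", "ppid=", "-p", str(pid)],
            capture_output=True, text=True, timeout=5,
        ).stdout.strip()
    except (OSError, subprocess.SubprocessError):
        return None
    return int(out) if out.isdigit() else None


def ancestor_pids(pid: int,
                  parent_of: Callable[[int], Optional[int]] = ps_parent) -> Set[int]:
    """Collect parent, grandparent, ... until PID 1 or a failed lookup."""
    seen: Set[int] = set()
    current = parent_of(pid)
    # The 'not in seen' check stops the loop if ps ever reports a cycle.
    while current is not None and current > 0 and current not in seen:
        seen.add(current)
        if current == 1:
            break
        current = parent_of(current)
    return seen
```

Because empty ps output maps to None, a mock that returns empty stdout ends the loop immediately.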

Closes #13242
2026-05-04 01:38:41 -07:00
Aleksandr Pasevin 8a4fe80f8d fix(signal): skip reactions for unauthorized senders
The on_processing_start hook fired a reaction emoji (👀) on every
inbound Signal message before run.py's _is_user_authorized check.
This meant contacts not in SIGNAL_ALLOWED_USERS would see the bot
react to their messages even though Hermes silently dropped them —
leaking the presence of the bot and causing confusing UX.

Two changes to gateway/platforms/signal.py:

1. Read SIGNAL_ALLOWED_USERS into self.dm_allow_from in __init__
   (mirrors the group_allow_from pattern already in place).

2. Add _reactions_enabled(event) — two-gate check:
   - SIGNAL_REACTIONS=false/0/no disables reactions globally
   - If SIGNAL_ALLOWED_USERS is set, only react to senders in
     the allowlist (skips unauthorized contacts)

Both on_processing_start and on_processing_complete now call this
guard before sending any reaction.
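
The two-gate check could be sketched as follows. The standalone function, the explicit `env` mapping, and the comma-separated allowlist format are assumptions; the real guard is a method that takes the event.

```python
from typing import Mapping


def reactions_enabled(sender: str, env: Mapping[str, str]) -> bool:
    """Return True only if reactions are on and the sender may see them."""
    # Gate 1: global switch.
    if env.get("SIGNAL_REACTIONS", "").strip().lower() in {"false", "0", "no"}:
        return False
    # Gate 2: if an allowlist is configured, the sender must be on it.
    raw = env.get("SIGNAL_ALLOWED_USERS", "")
    allowed = {item.strip() for item in raw.split(",") if item.strip()}
    return not allowed or sender in allowed
```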

Telegram already has an equivalent _reactions_enabled() guard
(controlled by TELEGRAM_REACTIONS). This brings Signal to parity.
2026-05-04 01:38:21 -07:00
nftpoetrist e89376d66f fix(setup): add missing SLACK_HOME_CHANNEL prompt to _setup_slack()
_setup_slack() was the only platform setup function that did not prompt
for a home channel. All four sibling setups (_setup_telegram,
_setup_discord, _setup_mattermost, _setup_bluebubbles) close with an
identical home-channel block, and setup_gateway() already checks for
SLACK_HOME_CHANNEL presence at the end of the wizard — but the value
was never collected, leaving cron delivery and cross-platform
notifications silently broken for Slack after a fresh hermes setup run.

Add the standard home-channel prompt at the end of _setup_slack(),
symmetric with the Discord implementation. Add two unit tests that
verify the prompt is saved when provided and skipped when left blank.
2026-05-04 01:37:18 -07:00
Byrn Tong 81ce945450 fix(gateway): show other profiles in gateway status to prevent confusion
When multiple gateway profiles are running (e.g. default and wx1),
`hermes gateway status` can be misleading — stopping one profile's
gateway and checking status may still show the other profile's process
without indicating which profile it belongs to.

Add `_print_other_profiles_gateway_status()` which displays running
gateways from other profiles at the bottom of the status output:

    Other profiles:
      ✓ wx1              — PID 166893

This uses the existing `find_profile_gateway_processes()` and
`get_active_profile_name()` — no new dependencies.

Closes #19113
Related: #4402, #4587
2026-05-04 01:37:02 -07:00
wanazhar df88375f0d fix: treat ctrl-c as curses cancel 2026-05-04 01:36:44 -07:00
leavr ccb5d87076 test: cover max-iterations summary message sanitization 2026-05-04 01:36:27 -07:00
tmdgusya a1cb811cb8 fix(cli): avoid voice TTS restart race 2026-05-04 01:36:07 -07:00
Teknium 314fe9f827 chore(release): add AUTHOR_MAP entries for upcoming salvage batch
Pre-adds author-email mappings for the 21 Tier 1b salvage PRs so
their cherry-picked commits land with mapped GitHub logins in the
release notes.
2026-05-04 01:34:32 -07:00
ethan 645b99aadd test(cron): cover null next_run_at recovery and non-dict origin tolerance
Adds four regression tests guarding the bugfix in the previous commit:
- TestGetDueJobs::test_broken_cron_without_next_run_is_recovered exercises
  cron schedules whose next_run_at was lost; expects compute_next_run to
  repopulate it within get_due_jobs() rather than silently skipping the job.
- TestGetDueJobs::test_broken_interval_without_next_run_is_recovered does
  the same for interval schedules.
- TestResolveOrigin::test_string_origin_is_tolerated and
  test_non_dict_origin_is_tolerated confirm _resolve_origin() returns None
  for legacy/hand-edited origins (string, list, int) instead of raising.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-05-04 01:32:58 -07:00
ethan 78b635ee3c fix(cron): recover null next_run_at jobs and tolerate non-dict origin
Fixes #18722

get_due_jobs() now recomputes next_run_at via compute_next_run() for
cron/interval jobs that arrived with null next_run_at (e.g. via direct
jobs.json edits) instead of silently skipping them. _resolve_origin()
guards with isinstance(origin, dict), and _deliver_result() now routes
through _resolve_origin() so string/non-dict origins no longer crash
the ticker.
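
The isinstance guard amounts to something like the sketch below. It is a simplified stand-in for `_resolve_origin()`, assuming the job is a dict with an optional "origin" key.

```python
from typing import Any, Dict, Optional


def resolve_origin(job: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the job's origin only when it is a dict; otherwise None."""
    origin = job.get("origin")
    # Legacy or hand-edited origins (string, list, int) are ignored, not raised on.
    return origin if isinstance(origin, dict) else None
```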

References: #18735 (an open competing fix from an automated bulk PR touching 79 files). This PR is a focused single-issue contribution and adds the missing interval-recovery test variant.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-05-04 01:32:58 -07:00
Teknium 91ea3ae4b2 test(skills): add bytes-vs-str equivalence and on-disk hash parity tests
Follow-up on #9925 cherry-pick adding two additional tests:
- bytes content hashes identically to its str-decoded form
- mixed bytes+str bundle hash equals the on-disk content_hash from
  skills_guard (the production invariant used to detect drift)

Also map dodofun@126.com and 1615063567@qq.com in AUTHOR_MAP so the
CI contributor check passes for the cherry-picked commit.

Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>
Co-authored-by: zhao0112 <1615063567@qq.com>
2026-05-04 01:28:12 -07:00
dh 3072e5543b skills-hub: hash binary skill bundle files correctly 2026-05-04 01:28:12 -07:00
Teknium c90f25dd1f chore(release): map daixin1204@gmail.com to @SimbaKingjoe 2026-05-04 01:21:23 -07:00
daixin1204 744079ffe6 fix(curator): prevent false-positive consolidation from substring matching
_classify_removed_skills used naive 'in' substring matching to detect
whether a removed skill's name appeared in skill_manage arguments.
Short/common skill names (api, git, test, foo, etc.) matched
incorrectly when they appeared as substrings of longer words in file
paths (references/api-design.md) or content (latest, testing).

Replace with field-aware matching:
- file_path: needle must match a complete filename stem or directory
  name, with -/_ normalised for variant tolerance
- content fields: word-boundary regex (\b) prevents embedding in
  longer words
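
A sketch of the two matching rules, with hypothetical helper names. It assumes POSIX-style paths and lower-cases names before comparing, which the commit does not state.

```python
import re
from pathlib import PurePosixPath


def _norm(name: str) -> str:
    """Lower-case and treat '_' and '-' as the same character."""
    return name.lower().replace("_", "-")


def name_in_path(needle: str, file_path: str) -> bool:
    """Match only a whole directory name or the filename stem."""
    path = PurePosixPath(file_path)
    parts = [_norm(p) for p in path.parts[:-1]] + [_norm(path.stem)]
    return _norm(needle) in parts


def name_in_content(needle: str, content: str) -> bool:
    """Match the name only at word boundaries."""
    return re.search(rf"\b{re.escape(needle)}\b", content) is not None
```

With these rules, "api" no longer matches references/api-design.md, and "test" no longer matches "latest" or "testing".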

Also add 3 regression tests covering the false-positive scenarios.
2026-05-04 01:21:23 -07:00
Clooooode c0300575c1 fix(kanban): use get_default_hermes_root() in list_profiles_on_disk
Path.home() / ".hermes" / "profiles" breaks custom-root deployments
(e.g. HERMES_HOME=/opt/data). Switch to get_default_hermes_root() so
profile discovery is consistent with kanban_db_path() and
workspaces_root() fixed in #18985.

Fixes #19017.
Related to #18442, #18985.
2026-05-04 01:21:14 -07:00
Clooooode 1964b0565b test(kanban): add failing test for list_profiles_on_disk with custom HERMES_HOME
list_profiles_on_disk() hardcodes Path.home() / ".hermes" / "profiles",
ignoring HERMES_HOME when set to a custom root (e.g. /opt/data).

Add test_list_profiles_on_disk_custom_root to cover this case.

Related to #18442, #18985.
2026-05-04 01:21:14 -07:00
Siddharth Balyan 8163d37192 fix(skill): reference built-in video_analyze/vision_analyze tools in kanban-video-orchestrator (#19562)
The tool-matrix.md had a vague 'Gemini multimodal / Claude vision' entry
in the external tools table that didn't point to the actual built-in
Hermes tools. Now that video_analyze exists (merged in #19301), update
the skill to reference it properly:

- Add 'Built-in Hermes tools for media review' section with proper
  toolset names, enablement instructions, and capability details
- Add video + vision toolsets to cinematographer, editor, and reviewer
  profile configs
- Update role-archetypes.md to reference tools by name
- Update API key table to explain video_analyze routing
2026-05-04 12:54:50 +05:30
Siddharth Balyan a11aed1acc fix(cli): local backend CLI always uses launch directory, stops .env sync of TERMINAL_CWD (#19334)
The old CWD heuristic was fooled by:
1. TERMINAL_CWD persisted to .env by `hermes config set terminal.cwd`
2. Inherited TERMINAL_CWD from parent hermes processes
3. Only resolved when config had a placeholder value (not explicit paths)

Fix:
- load_cli_config() unconditionally uses os.getcwd() for local backend
- TERMINAL_CWD always force-exported in CLI mode (overrides stale values)
- Gateway sets _HERMES_GATEWAY=1 marker so lazy cli.py imports don't clobber
- Remove terminal.cwd from config-set .env sync map (prevents re-poisoning)
- Clarify setup wizard label as 'Gateway working directory'

Closes #19214
2026-05-04 11:36:19 +05:30
Ben Barclay 434d70d8bc Merge pull request #19540 from NousResearch/single_container_for_all
feat(docker): launch dashboard as side-process via HERMES_DASHBOARD=1
2026-05-04 15:38:19 +10:00
Ben 5671059f62 feat(docker): launch dashboard as side-process via HERMES_DASHBOARD=1
Adds an optional dashboard side-process to the container entrypoint,
toggled by `HERMES_DASHBOARD=1` (also accepts `true` / `yes`).  When set,
the entrypoint backgrounds `hermes dashboard` before `exec`-ing the main
command so the user's chosen foreground process (gateway, chat, `sleep
infinity`, …) remains PID-of-interest for the container runtime.

  docker run -d \
    -v ~/.hermes:/opt/data \
    -p 8642:8642 -p 9119:9119 \
    -e HERMES_DASHBOARD=1 \
    nousresearch/hermes-agent gateway run

Defaults chosen for the container case:
 - Host: 0.0.0.0 (reachable through published port; can override to
   127.0.0.1 via HERMES_DASHBOARD_HOST for sidecar/reverse-proxy setups)
 - Port: 9119 (matches `hermes dashboard`)
 - Auto-adds `--insecure` when binding to non-localhost, matching the
   dashboard's own safety gate for exposing API keys
 - HERMES_DASHBOARD_TUI is read by `hermes dashboard` directly — no
   entrypoint plumbing needed
Dashboard output is prefixed with `[dashboard]` via `stdbuf`+`sed -u` so
it's easy to separate from gateway logs in `docker logs`.  No supervision:
if the dashboard crashes it stays down until the container restarts
(documented in the `:::note` panel).
Other changes bundled in:
 - Deprecate GATEWAY_HEALTH_URL / GATEWAY_HEALTH_TIMEOUT env vars in
   hermes_cli/web_server.py with a DEPRECATED block comment and a
   `.. deprecated::` note on _probe_gateway_health.  The feature still
   works for this release; it'll be removed alongside the move to a
   first-class dashboard config key.
 - Rewrite the "Running the dashboard" doc section around the new
   single-container pattern.  Drops the previously-documented
   dashboard-as-its-own-container setup — that pattern relied on the
   deprecated env vars for cross-container gateway-liveness detection,
   and without them the dashboard would permanently report the gateway
   as "not running".
 - Collapse the two-service Compose example (gateway + dashboard
   container) into a single service with HERMES_DASHBOARD=1.  Removes
   the now-unnecessary bridge network and `depends_on`.
 - Drop the ":::warning" caveat about "Running a dashboard container
   alongside the gateway is safe" — that case no longer exists.
2026-05-04 15:37:27 +10:00
Ben Barclay 95f395027f Merge pull request #19520 from NousResearch/fix_docker_tui
fix(docker/tui): tolerate npm's peer-flag drop in lockfile comparison
2026-05-04 14:29:43 +10:00
Ben 2f2998bb1b fix(tui): tolerate npm's peer-flag drop in lockfile comparison
`_tui_need_npm_install()` compares the canonical `package-lock.json` against
the hidden `node_modules/.package-lock.json` to decide whether `npm install`
needs to re-run. npm 9 drops the `"peer": true` field from the hidden lock
on dev-deps that are *also* declared as peers (the canonical lock preserves
the dual annotation). That made the check flag 16 packages (`@babel/core`,
`@types/node`, `@types/react`, `@typescript-eslint/*`, `react`, `vite`,
`tsx`, `typescript`, …) as mismatched on every launch, triggering a runtime
`npm install`.
Inside the Docker image, that runtime install then fails with EACCES because
`/opt/hermes/ui-tui/node_modules/` is root-owned from build time, so
`docker run … hermes-agent --tui` prints:

    Installing TUI dependencies…
    npm install failed.

…and exits 1, with no preview. The empty preview is a second bug: the
launcher captured only stderr, but npm 9 writes EACCES to stdout, which
was DEVNULL'd.
Fixes:
 - Add `"peer"` to `_NPM_LOCK_RUNTIME_KEYS` so the comparison ignores the
   non-deterministic field, alongside the existing `"ideallyInert"`.
 - Capture stdout as well as stderr in the install subprocess so future
   failures surface a useful preview instead of a bare "failed." line.
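
The comparison fix can be pictured with the sketch below. It is not the code behind `_tui_need_npm_install()`: the function names are hypothetical, the lock data is assumed to be already parsed into dicts with a "packages" map, and skipping the root "" entry is an assumption of this sketch.

```python
from typing import Any, Dict

# Keys npm may add or drop in the hidden lock without a real dependency change.
RUNTIME_KEYS = {"peer", "ideallyInert"}


def _strip(packages: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Drop runtime-only keys and the root entry before comparing."""
    return {
        name: {k: v for k, v in meta.items() if k not in RUNTIME_KEYS}
        for name, meta in packages.items()
        if name != ""  # assumption: the root package entry is not compared
    }


def locks_differ(canonical: Dict[str, Any], hidden: Dict[str, Any]) -> bool:
    """True when the two locks disagree on anything other than runtime keys."""
    return _strip(canonical.get("packages", {})) != _strip(hidden.get("packages", {}))
```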
Regression tests:
 - `test_no_install_when_only_peer_annotation_differs` — the exact scenario
 - `test_install_when_version_differs_even_with_peer_drop` — guards against
   the peer-drop tolerance masking a real version skew
On-host impact: the same false-positive was firing on every `hermes --tui`
invocation from a normal checkout, silently running a no-op `npm install`
each time (it converged because the host's `node_modules/` is writable).
Startup time on the TUI should drop noticeably.
2026-05-04 14:13:38 +10:00
Chris Danis 363cc93674 fix(cron): bump skill usage when cron jobs load skills
Cron jobs that reference skills via their skills: config never bumped
the usage counters in .usage.json, so the curator could auto-archive
skills actively used by cron jobs based on stale timestamps.

Now _build_job_prompt() calls bump_use(skill_name) for each
successfully loaded skill so the curator sees them as active.
2026-05-03 17:06:48 -07:00
nftpoetrist 808fee151d fix(auxiliary): propagate explicit_api_key to _try_anthropic()
_try_anthropic() lacked the explicit_api_key parameter added to
_try_openrouter() in #18768. When resolve_provider_client() is called
with provider="anthropic" and an explicit key (e.g. from a fallback_model
entry with api_key set), the key was silently ignored — _try_anthropic()
always fell back to resolve_anthropic_token(), so the fallback returned
None,None for users without a default Anthropic credential configured.

Fix: add explicit_api_key: str = None to _try_anthropic() and use
explicit_api_key or <pool/env fallback> in both the pool-present and
no-pool paths. Pass explicit_api_key=explicit_api_key at the call site
in resolve_provider_client(). Symmetric with the _try_openrouter() fix.
No behavior change when explicit_api_key is None.
2026-05-03 17:00:55 -07:00
molvikar 74636f9c4a fix(gateway): clear queued reload-skills notes on new/resume/branch 2026-05-03 17:00:31 -07:00
Kenny Wang 222767e5e8 fix: sanitize Telegram help command mentions 2026-05-03 17:00:09 -07:00
konsisumer 6fda92aa7f fix(gateway): bridge top-level require_mention to Telegram config
Users commonly place `require_mention: true` at the top level of
config.yaml alongside `group_sessions_per_user`, expecting it to gate
Telegram group messages. The key was silently ignored because the
config loader only checked `yaml_cfg["telegram"]["require_mention"]`.

When `require_mention` is found at the top level and no telegram-specific
value is set, the fix now:
- adds it to platforms_data["telegram"]["extra"] so _telegram_require_mention()
  picks it up via the primary config.extra path
- sets TELEGRAM_REQUIRE_MENTION env var for the secondary fallback path

A telegram-specific value (telegram.require_mention) still takes
precedence over the top-level shorthand.
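
The precedence rule reads roughly like this sketch. The function name is hypothetical and it only resolves the value; the bridging into platforms_data and the TELEGRAM_REQUIRE_MENTION env var is not reproduced here.

```python
from typing import Any, Dict, Optional


def resolve_require_mention(yaml_cfg: Dict[str, Any]) -> Optional[bool]:
    """Prefer telegram.require_mention, then the top-level shorthand."""
    telegram = yaml_cfg.get("telegram")
    if isinstance(telegram, dict) and "require_mention" in telegram:
        return bool(telegram["require_mention"])
    if "require_mention" in yaml_cfg:
        return bool(yaml_cfg["require_mention"])
    return None  # nothing configured; caller applies its own default
```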

Also corrects telegram.md: bare /cmd without @botname is rejected when
require_mention is enabled; only /cmd@botname (bot-menu form) passes.

Fixes #3979
2026-05-03 16:59:46 -07:00
clawbot 1bd975c0ba fix(gateway): suppress duplicate voice transcripts
Deduplicate exact and near-exact Discord voice STT transcripts per guild/user over a short window to avoid duplicate delayed agent replies.

Adds regression tests for exact and near-duplicate voice transcript suppression.
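
One way to do this kind of suppression is sketched below using difflib from the standard library. The class name, the 10-second window, and the 0.9 similarity threshold are illustrative assumptions, and the commit does not say which similarity measure it uses.

```python
import time
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple


class TranscriptDeduper:
    """Remember the last transcript per (guild, user) and flag close repeats."""

    def __init__(self, window_s: float = 10.0, threshold: float = 0.9) -> None:
        self.window_s = window_s
        self.threshold = threshold
        self._last: Dict[Tuple[str, str], Tuple[str, float]] = {}

    def is_duplicate(self, guild: str, user: str, text: str,
                     now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        key = (guild, user)
        # Normalise case and whitespace so trivial differences do not matter.
        norm = " ".join(text.lower().split())
        prev = self._last.get(key)
        self._last[key] = (norm, now)
        if prev is None or now - prev[1] > self.window_s:
            return False
        return SequenceMatcher(None, prev[0], norm).ratio() >= self.threshold
```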
2026-05-03 16:59:21 -07:00
Teknium b58db237e4 fix(kanban): drop worker identity claim from KANBAN_GUIDANCE (#19427)
KANBAN_GUIDANCE layer 3 of the system prompt started with 'You are a
Kanban worker', overriding the profile's SOUL.md identity at layer 1.
Profiles with strict role boundaries (e.g. a reviewer profile that
never writes code) still executed implementation tasks because the
kanban identity claim diluted SOUL's.

Drop the identity line. Layer 3 now describes the task-execution
protocol only; SOUL.md remains the sole identity slot.

Fixes #19351
2026-05-03 16:59:00 -07:00
LeonSGP43 6713274a42 fix(file): strip leaked terminal fences from reads 2026-05-03 16:58:50 -07:00
Alan Chen 2d7543c61f fix(windows): enforce UTF-8 stdout/stderr to prevent UnicodeEncodeError crash
On Windows, services and terminals default to cp1252 encoding. The CLI
uses box-drawing characters (┌│├└─) in banners, doctor output, and
status displays. When print() tries to encode these under cp1252, an
unhandled UnicodeEncodeError crashes the gateway on startup.

This fix adds early UTF-8 enforcement in hermes_cli/__init__.py:
- Sets PYTHONUTF8=1 and PYTHONIOENCODING=utf-8
- Re-opens stdout/stderr with UTF-8 encoding if not already UTF-8

Runs at import time so it protects all CLI subcommands. No effect on
Unix (gated on sys.platform == "win32"). Backwards-compatible: on
systems already using UTF-8, the function is a no-op.
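
A sketch of the guard, not the code in hermes_cli/__init__.py. It uses `reconfigure()` on the existing text streams instead of re-opening them, and the use of setdefault and `errors="replace"` are assumptions.

```python
import os
import sys


def enforce_utf8_stdio() -> None:
    """On Windows, make stdout/stderr encode as UTF-8; no-op elsewhere."""
    if sys.platform != "win32":
        return
    os.environ.setdefault("PYTHONUTF8", "1")
    os.environ.setdefault("PYTHONIOENCODING", "utf-8")
    for stream in (sys.stdout, sys.stderr):
        encoding = (getattr(stream, "encoding", "") or "").lower().replace("-", "")
        # Streams already on UTF-8, or without reconfigure(), are left alone.
        if encoding != "utf8" and hasattr(stream, "reconfigure"):
            stream.reconfigure(encoding="utf-8", errors="replace")
```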

Fixes #10956
2026-05-03 16:58:25 -07:00
Teknium 2ababfe6ed chore(release): map 0xKingBack noreply email 2026-05-03 16:55:16 -07:00
0xKingBack 3c42024539 fix(curator): pass auxiliary curator api_key/base_url into runtime resolution
Curator review fork now forwards per-slot credentials from auxiliary.curator
and legacy curator.auxiliary to resolve_runtime_provider, matching the
canonical aux task schema. Add regression tests for binding and main fallback.
2026-05-03 16:55:16 -07:00
Kiala 3792b77bd1 fix(send_message): support QQBot C2C and group chats
The _send_qqbot function was hardcoded to use the guild channel
endpoint (/channels/{id}/messages), which fails for C2C private
chats and QQ groups with 'channel does not exist' (code 11263).

This change tries the appropriate endpoints in order:
1. /channels/{id}/messages     (guild channels)
2. /v2/users/{id}/messages     (C2C private chats)
3. /v2/groups/{id}/messages    (QQ groups)

Fixes active sending to QQBot C2C and group recipients.
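
The ordered fallback can be sketched as below. The `post` callable stands in for the real HTTP call and is hypothetical, as is the helper name; only the three endpoint paths come from this commit.

```python
from typing import Callable, Tuple

ENDPOINTS = (
    "/channels/{id}/messages",   # guild channels
    "/v2/users/{id}/messages",   # C2C private chats
    "/v2/groups/{id}/messages",  # QQ groups
)


def send_with_fallback(target_id: str,
                       post: Callable[[str], Tuple[bool, str]]) -> Tuple[bool, str]:
    """Try each endpoint in order; return the first success or the last error."""
    last_error = "no endpoint tried"
    for template in ENDPOINTS:
        ok, detail = post(template.format(id=target_id))
        if ok:
            return True, detail
        last_error = detail
    return False, last_error
```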
2026-05-03 16:54:39 -07:00
MrBob 86e64c1d3b fix(gateway): hide required-arg commands from Telegram menu 2026-05-03 15:29:06 -07:00
sprmn24 408dd8aa28 fix(compressor): skip non-string tool content in dedup pass to prevent AttributeError 2026-05-03 15:28:30 -07:00
sprmn24 5bd937533c fix(vision): guard user_prompt type in video_analyze_tool before debug_call_data construction 2026-05-03 15:28:04 -07:00
sprmn24 6c4aca7adc fix(vision): guard user_prompt type before debug_call_data construction 2026-05-03 15:27:40 -07:00
Zyproth a5cae16496 fix(api_server): fall back to default port on malformed API_SERVER_PORT 2026-05-03 15:27:03 -07:00
Amit Gaur 65bebb9b80 fix(cli): follow 307 redirects in MiniMax OAuth httpx clients
The MiniMax OAuth API endpoints have moved from api.minimax.io to
account.minimax.io and the old paths now respond with HTTP 307.
httpx defaults to follow_redirects=False (unlike requests), so the
device-code and token-refresh flows fail with "Temporary Redirect".

Adds follow_redirects=True to the two httpx.Client instances in
hermes_cli/auth.py used by the MiniMax OAuth flow. This is forward-
compatible -- if endpoints move again, the redirect chain is
followed automatically.

Repro before patch:
  curl -i -X POST https://api.minimax.io/oauth/code  # -> 307
  curl -i -X POST https://api.minimax.io/oauth/token # -> 307

Verified end-to-end against a real MiniMax Plus account on macOS;
the existing tests/test_minimax_oauth.py suite (15 tests) still
passes.
2026-05-03 15:26:33 -07:00
Zyproth dfdd7b6e6f fix(codex-transport): preserve request override headers for xai responses 2026-05-03 15:25:45 -07:00
LeonSGP43 4a2f822137 fix(mcp): reconnect on terminated sessions 2026-05-03 15:23:33 -07:00
teknium1 2658494e81 fix(kanban): add per-path env overrides + dispatcher env injection
Layers defense-in-depth on top of the shared-root anchoring (base commit).

Changes in hermes_cli/kanban_db.py:
- kanban_db_path() now honours HERMES_KANBAN_DB first, then falls through
  to kanban_home()/kanban.db.
- workspaces_root() now honours HERMES_KANBAN_WORKSPACES_ROOT first, then
  falls through to kanban_home()/kanban/workspaces.
- All three overrides (HERMES_KANBAN_HOME, HERMES_KANBAN_DB,
  HERMES_KANBAN_WORKSPACES_ROOT) now call .expanduser() for consistency.
- _default_spawn() injects HERMES_KANBAN_DB and
  HERMES_KANBAN_WORKSPACES_ROOT into the worker subprocess env. Even
  when the worker's get_default_hermes_root() resolution somehow
  disagrees with the dispatcher's (symlinks, unusual Docker layouts),
  the two processes still open the same SQLite file.

Module docstring updated to describe all three overrides and the
dispatcher env-injection contract.

Tests (tests/hermes_cli/test_kanban_db.py, TestSharedBoardPaths):
- test_hermes_kanban_db_pin_beats_kanban_home
- test_hermes_kanban_workspaces_root_pin_beats_kanban_home
- test_empty_per_path_overrides_fall_through
- test_dispatcher_spawn_injects_kanban_db_and_workspaces_root
  (monkeypatches subprocess.Popen, asserts both env vars reach the
  child even after HERMES_HOME is rewritten by `hermes -p <profile>`.)

Docs: website/docs/reference/environment-variables.md gets entries
for the three kanban env vars.

This fusion is built on the cleanest of the seven competing PRs that
targeted issue #18442:

* Base commit (from PR #19350 by @GodsBoy): add `kanban_home()` helper
  anchored at `get_default_hermes_root()`, reroute all 5 kanban path
  sites through it (including the 3 sibling log-dir sites that the
  other six PRs missed), 8-test regression class.
* Dispatcher env-var injection approach drawn from PRs #18300
  (@quocanh261997) and #19100 (@cg2aigc).
* Per-path env overrides drawn from PR #19100 (@cg2aigc).
* get_default_hermes_root() resolution direction first proposed in
  PR #18503 (@beibi9966) and PR #18985 (@Gosuj).

Closes the duplicate/competing PRs: #18300, #18503, #18670, #18985,
#19037, #19056, #19100. Fixes #18442 and #19348.

Co-authored-by: quocanh261997 <17986614+quocanh261997@users.noreply.github.com>
Co-authored-by: cg2aigc <232694053+cg2aigc@users.noreply.github.com>
Co-authored-by: beibi9966 <beibei1988@proton.me>
Co-authored-by: Gosuj <123411271+Gosuj@users.noreply.github.com>
Co-authored-by: LeonSGP43 <154585401+LeonSGP43@users.noreply.github.com>
2026-05-03 15:13:39 -07:00
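The override precedence described in the commit above can be sketched roughly as follows. This is an illustration reconstructed from the commit message only, not the actual `hermes_cli/kanban_db.py`: `default_root()` is a hypothetical stand-in for `get_default_hermes_root()`, and `_env_path()` and `worker_env()` are names invented for this sketch.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def default_root() -> Path:
    # Hypothetical stand-in for get_default_hermes_root().
    return Path.home() / ".hermes"


def _env_path(name: str) -> Optional[Path]:
    # Empty values count as unset so they fall through to the next level.
    # Treating whitespace-only values the same way is an assumption.
    raw = os.environ.get(name, "").strip()
    return Path(raw).expanduser() if raw else None


def kanban_home() -> Path:
    return _env_path("HERMES_KANBAN_HOME") or default_root()


def kanban_db_path() -> Path:
    # Per-path pin first, then the shared kanban home.
    return _env_path("HERMES_KANBAN_DB") or kanban_home() / "kanban.db"


def workspaces_root() -> Path:
    return (
        _env_path("HERMES_KANBAN_WORKSPACES_ROOT")
        or kanban_home() / "kanban" / "workspaces"
    )


def worker_env(base: Dict[str, str]) -> Dict[str, str]:
    # Pin the dispatcher's resolved paths into the child environment so the
    # worker opens the same SQLite file even if its own resolution differs.
    env = dict(base)
    env["HERMES_KANBAN_DB"] = str(kanban_db_path())
    env["HERMES_KANBAN_WORKSPACES_ROOT"] = str(workspaces_root())
    return env
```

A dispatcher following this pattern would pass `worker_env(os.environ)` as the `env` argument when spawning the worker subprocess.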
GodsBoy f5bd77b3e1 fix(kanban): anchor board, workspaces, and worker logs at the shared Hermes root
The Kanban board is documented as shared across all Hermes profiles, but
`kanban_db_path()` and `workspaces_root()` resolved through `get_hermes_home()`,
which returns the active profile's HERMES_HOME. When the dispatcher spawned a
worker with `hermes -p <profile> --skills kanban-worker chat -q "work kanban
task <id>"`, the worker rewrote HERMES_HOME to the profile subdirectory before
kanban_db.py imported, opening a profile-local `kanban.db` that did not contain
the dispatcher's task. `kanban_show` and `kanban_complete` failed; the
dispatcher's row stayed `running` and was retried/crashed. The same defect
applied to `_default_spawn`'s log directory and `worker_log_path`, so
`hermes kanban tail` did not see the worker's output.

Add `kanban_home()` in `hermes_cli/kanban_db.py` that resolves through
`HERMES_KANBAN_HOME` (explicit override) then `get_default_hermes_root()`,
which already understands the `<root>/profiles/<name>` and Docker / custom
HERMES_HOME shapes. Reroute `kanban_db_path`, `workspaces_root`, the
`_default_spawn` log directory, `gc_worker_logs`, and `worker_log_path`
through it. Profile-specific config, `.env`, memory, and sessions stay
isolated as before; only the kanban surface is shared.

Add a `TestSharedBoardPaths` regression class to `tests/hermes_cli/test_kanban_db.py`
covering: default install, profile-worker convergence, Docker custom HERMES_HOME,
Docker profile layout, explicit `HERMES_KANBAN_HOME` override, and a real
SQLite round-trip across dispatcher and worker HERMES_HOME perspectives.
The dispatcher/worker convergence tests fail on origin/main and pass after
the fix.

Update the `kanban.md` user-guide page and the misleading docstrings in
`kanban_db.py` to describe the shared-root behavior.

Fixes #19348
2026-05-03 15:13:39 -07:00
Siddharth Balyan 167b5648ea Revert "fix(cli): CLI/TUI on local backend always uses launch directory, ignores terminal.cwd (#19242)" (#19329)
This reverts commit 9eaddfafa3.
2026-05-04 00:43:58 +05:30
Siddharth Balyan 9eaddfafa3 fix(cli): CLI/TUI on local backend always uses launch directory, ignores terminal.cwd (#19242)
CLI/TUI sessions on the local backend now unconditionally use
os.getcwd() as the working directory. The terminal.cwd config value is
only consumed by gateway/cron/delegation modes (where there's no shell
to cd from).

Previously, 'hermes setup' would write an absolute path (e.g. $HOME)
into terminal.cwd which then pinned the CLI to that directory regardless
of where the user launched hermes from. This was a silent foot-gun —
the user's 'cd' was being ignored.

Changes:

1. cli.py: Restructured CWD resolution — if TERMINAL_CWD is not already
   set by the gateway, and the backend is local, always use os.getcwd().
   Config terminal.cwd is irrelevant for interactive CLI/TUI sessions.

2. setup.py: Moved the cwd prompt from setup_terminal_backend() to
   setup_gateway(). It now only appears when configuring messaging
   platforms and is labeled 'Gateway working directory'.

3. Tests: Rewrote test_cwd_env_respect.py to validate the new behavior:
   explicit config paths are ignored for CLI, gateway pre-set values are
   preserved, non-local backends keep their config paths.

4. Docs: Updated configuration.md, profiles.md, and
   environment-variables.md to clarify that terminal.cwd only affects
   gateway/cron mode on local backend.

Closes #19214
2026-05-04 00:14:36 +05:30
GodsBoy b8ae8cc801 fix(debug): redact log content at upload time in hermes debug share
Apply agent.redact.redact_sensitive_text with force=True to log content
captured by _capture_log_snapshot before it reaches upload_to_pastebin.
On-disk logs are untouched. Compatible with the off-by-default local
redaction policy from #16794: this is upload-time-only and applies
regardless of security.redact_secrets because the public paste service
is the leak surface. A visible banner is prepended to each uploaded log
paste so reviewers know redaction was applied. --no-redact preserves
deliberate unredacted sharing for maintainer-coordinated cases.

The bug-report, setup-help, and feature-request issue templates direct
users to run hermes debug share and paste the resulting public URLs.
With redaction off by default per #16794, those uploads have been
carrying credentials onto paste.rs and dpaste.com.

force=True is non-negotiable: without it, redact_sensitive_text
short-circuits at agent/redact.py:322 when the env var is unset, so the
fix would silently be a no-op for its target audience. A regression
test pins this down.

Fixes #19316
2026-05-03 11:42:20 -07:00
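Why `force=True` matters can be shown with a small stand-in. Everything below is hypothetical: the secret pattern, the `HERMES_REDACT_SECRETS` variable name, the banner text, and `prepare_log_for_upload()` are assumptions made for illustration, and the real `agent/redact.py` is certainly more thorough.

```python
import os
import re

# Hypothetical pattern; a real redactor covers many more credential shapes.
_SECRET = re.compile(r"\b(sk-[A-Za-z0-9]{8,})\b")
BANNER = "[log content redacted at upload time]\n"


def redact_sensitive_text(text: str, force: bool = False) -> str:
    # Stand-in: local redaction is off unless an env var opts in.
    # The variable name here is invented for this sketch.
    enabled = os.environ.get("HERMES_REDACT_SECRETS", "").lower() in ("1", "true", "yes")
    if not (enabled or force):
        return text  # short-circuit: without force this is a no-op by default
    return _SECRET.sub("[REDACTED]", text)


def prepare_log_for_upload(log_text: str, no_redact: bool = False) -> str:
    # Upload-time only: the on-disk log is never modified.
    if no_redact:
        return log_text
    return BANNER + redact_sensitive_text(log_text, force=True)
```

With the opt-in variable unset, `redact_sensitive_text(text)` returns its input unchanged, which is the silent no-op the commit message warns about.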
Siddharth Balyan c9a3f36f56 feat: add video_analyze tool for native video understanding (#19301)
* feat: add video_analyze tool for native video understanding

Adds a video_analyze tool that sends video files to multimodal LLMs
(e.g. Gemini) for analysis via the OpenRouter-compatible video_url
content type. Mirrors vision_analyze in structure, error handling,
and registration pattern.

Key design:
- Base64 encodes entire video (no frame extraction, no ffmpeg dep)
- Uses 'video_url' content block type (OpenRouter standard)
- Supports mp4, webm, mov, avi, mkv, mpeg formats
- 50 MB hard cap, 20 MB warning threshold
- 180s minimum timeout (videos take longer than images)
- AUXILIARY_VIDEO_MODEL env override, falls back to AUXILIARY_VISION_MODEL
- Same SSRF protection, retry logic, and cleanup as vision_analyze

Default disabled: registered in 'video' toolset (not in _HERMES_CORE_TOOLS).
Users opt in via: hermes tools enable video, or enabled_toolsets=['video'].

* feat(video): add models.dev capability pre-check + CONFIGURABLE_TOOLSETS entry

- Pre-checks model video capability via models.dev modalities.input
  before expensive base64 encoding. Fails early with helpful message
  suggesting video-capable alternatives (gemini, mimo-v2.5-pro).
- Passes optimistically if model unknown or lookup fails.
- Adds ModelInfo.supports_video_input() helper.
- Adds 'video' to CONFIGURABLE_TOOLSETS and _DEFAULT_OFF_TOOLSETS
  so 'hermes tools enable video' works from CLI.
- 8 new tests for the capability check (37 total).

* refactor(video): remove models.dev capability pre-check

Removes _check_video_model_capability and ModelInfo.supports_video_input.
The vision_analyze tool doesn't pre-check image capability either — both
tools rely on the same pattern: send request, handle API errors gracefully
with categorized user-facing messages. The pre-check was inconsistent
(only worked for some providers/models) so drop it for parity.

* cleanup: compress comments, fix fragile timeout coupling

- Replace _VISION_DOWNLOAD_TIMEOUT * 2 with hardcoded 60s (no silent
  breakage if vision timeout changes independently)
- Strip verbose comments and redundant log lines throughout
- No behavioral changes
2026-05-04 00:04:36 +05:30
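The "key design" points above (whole-file base64, a `video_url` content block, a format allowlist, and size limits) might look roughly like this. It is a sketch under assumptions: `build_video_block()` and `should_warn()` are invented names, the nested `{"video_url": {"url": ...}}` shape is assumed by analogy with image blocks, and reading "MB" as MiB is a guess.

```python
import base64
from pathlib import PurePath

# Formats listed in the commit message, mapped to common MIME types.
_MIME_BY_EXT = {
    ".mp4": "video/mp4",
    ".webm": "video/webm",
    ".mov": "video/quicktime",
    ".avi": "video/x-msvideo",
    ".mkv": "video/x-matroska",
    ".mpeg": "video/mpeg",
}
MAX_BYTES = 50 * 1024 * 1024   # hard cap (assumed to mean MiB)
WARN_BYTES = 20 * 1024 * 1024  # warning threshold (assumed to mean MiB)


def should_warn(data: bytes) -> bool:
    # Large but still allowed: the caller may want to log a warning.
    return WARN_BYTES < len(data) <= MAX_BYTES


def build_video_block(filename: str, data: bytes) -> dict:
    ext = PurePath(filename).suffix.lower()
    mime = _MIME_BY_EXT.get(ext)
    if mime is None:
        raise ValueError(f"unsupported video format: {ext or filename!r}")
    if len(data) > MAX_BYTES:
        raise ValueError("video exceeds the 50 MB hard cap")
    # The whole file is encoded as-is: no frame extraction, no ffmpeg.
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "video_url",
        "video_url": {"url": f"data:{mime};base64,{encoded}"},
    }
```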
SHL0MS 0dd8e3f8d8 rename: video-orchestrator → kanban-video-orchestrator
The kanban prefix makes the skill discoverable alongside `kanban-orchestrator`
and `kanban-worker`, and signals up front that this skill drives the kanban
plugin rather than being a generic video tool.

Updated:
- directory rename
- SKILL.md frontmatter `name:` and H1
- setup.sh.tmpl header
2026-05-03 10:26:54 -07:00
SHL0MS 511add7249 feat(skill): add video-orchestrator optional creative skill
Meta-pipeline that wraps any video request — narrative film, product /
marketing, music video, explainer, ASCII, generative, comic, 3D,
real-time/installation — in a Hermes Kanban pipeline. Performs adaptive
discovery, designs an appropriate team for the requested style, generates
the setup script that creates Hermes profiles + initial kanban task, and
helps monitor execution.

Routes scenes to whichever existing Hermes skill fits each beat
(`ascii-video`, `manim-video`, `p5js`, `comfyui`, `touchdesigner-mcp`,
`blender-mcp`, `pixel-art`, `baoyu-comic`, `claude-design`, `excalidraw`,
`songsee`, `heartmula`, …) plus external APIs for TTS, image-gen, and
image-to-video. Kanban orchestration uses the `kanban-orchestrator` and
`kanban-worker` skills.

The single-project workspace layout, profile-config patching pattern,
SOUL.md-per-profile model, and `--workspace dir:<path>` discipline are
adapted from alt-glitch's original kanban-video-pipeline at
https://github.com/NousResearch/kanban-video-pipeline. This skill
generalizes those patterns across video styles and replaces the original
string-replacement config patcher with a PyYAML-based one that touches
only `toolsets` and `skills.always_load` (preserving security-sensitive
fields like `approvals.mode`).

Includes:
- SKILL.md — workflow + critical rules
- references/ — intake, role archetypes, tool matrix, kanban setup,
  monitoring, six worked examples
- assets/ — brief / setup.sh / soul.md templates
- scripts/ — bootstrap_pipeline.py (plan.json -> setup.sh) and
  monitor.py (poll + issue detection)

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
2026-05-03 10:26:54 -07:00
brooklyn! e97a9993b9 Merge pull request #19307 from NousResearch/bb/fix-terminal-resize-jumble
fix(tui): clear Apple Terminal resize artifacts
2026-05-03 10:17:15 -07:00
Brooklyn Nicholson 279b656adc fix(tui): clear Apple Terminal resize artifacts
Use a deeper alt-screen clear for Apple Terminal resize repaints so host reflow artifacts do not survive the recovery frame.
2026-05-03 12:11:24 -05:00
Bartok9 e527240b27 fix(tools): write_file handler now rejects missing 'content'/'path' args instead of silently writing zero-byte files (#19096)
Under context pressure, frontier models sometimes emit tool calls with
required fields dropped. Previously _handle_write_file() used
args.get('content', '') which substituted an empty string for the missing
key, returned success with bytes_written=0, and created a zero-byte file
on disk. The model had no way to detect the failure.

Changes:
- Reject calls where 'path' is absent or not a non-empty string
- Reject calls where 'content' key is entirely absent (key-presence check,
  not truthiness) — distinguishing a legitimately empty file from a dropped arg
- Reject calls where 'content' is a non-string type
- All error messages include guidance to re-emit the tool call or switch
  to execute_code with hermes_tools.write_file() for large payloads
- Explicit empty string content (file truncation) continues to work

Regression tests added for all four cases: missing path, missing content,
explicit-empty content, and wrong content type.

Fixes #19096
2026-05-03 08:52:41 -07:00
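The key-presence check described above, as opposed to `args.get('content', '')`, can be sketched like this. The helper name `validate_write_file_args()` and the error wording are assumptions for illustration and are not taken from the actual handler.

```python
from typing import Any, Dict, Optional


def validate_write_file_args(args: Dict[str, Any]) -> Optional[str]:
    """Return an error message, or None when the call is acceptable."""
    path = args.get("path")
    if not isinstance(path, str) or not path.strip():
        return "write_file: 'path' is required; re-emit the tool call with it."
    # Key presence, not truthiness: "" is a legitimate request to truncate
    # a file, while a missing key means the argument was dropped.
    if "content" not in args:
        return "write_file: 'content' is missing; re-emit the tool call with it."
    if not isinstance(args["content"], str):
        return "write_file: 'content' must be a string."
    return None
```

The third check is what separates the two empty-looking cases: `{"path": "a.txt", "content": ""}` passes, while `{"path": "a.txt"}` is rejected.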
Tranquil-Flow 6b4fb9f878 fix(cron): treat non-dict origin as missing instead of crashing tick
``_resolve_origin`` called ``origin.get('platform')`` on whatever
``job.get('origin')`` returned. The leading ``if not origin: return None``
short-circuited the falsy cases (None, empty dict, "") but a non-empty
string passed that guard and then crashed with
``AttributeError: 'str' object has no attribute 'get'`` on every fire
attempt. Observed in the wild after a migration script tagged jobs with
free-form provenance strings (e.g.
``"combined-digest-replaces-x-and-y-20260503"``).

``mark_job_run`` did record ``last_status: error,
last_error: "'str' object has no attribute 'get'"`` once, but the next
tick re-loaded the same poisoned origin and crashed identically. The
job stayed enabled, fired every tick, and accumulated cascading errors
in the log until ``origin`` was patched manually.

Replace the falsy guard with ``isinstance(origin, dict)``. Non-dict
origins (string, int, list, tuple, float — anything that survived a
hand-edit, JSON-script write, or migration) are now treated the same
as a missing origin: the job continues with ``deliver`` falling back
through its normal home-channel path instead of crashing the scheduler
loop.

Test parametrises the non-dict shapes that can appear in jobs.json
through external writers and asserts ``_resolve_origin`` returns None
for each.

Note: this fix scope is the non-dict-``origin`` crash only. The
``next_run_at: null`` recurring-job recovery (the second sub-bug in
#18722) is independently addressed by the in-flight #18825, which
extends the never-silently-disable defense from #16265 to
``get_due_jobs()`` — that approach is well-aligned with the existing
recovery pattern and ships fine without a competing change here.

Fixes #18722 (non-dict origin crash; recurring-job recovery covered by #18825)
2026-05-03 08:51:50 -07:00
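The difference between the falsy guard and the `isinstance` guard can be sketched as below. `resolve_origin()` is a simplified, hypothetical stand-in for `_resolve_origin`; returning `None` when `platform` is absent is an assumption of this sketch.

```python
from typing import Any, Dict, Optional


def resolve_origin(job: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    origin = job.get("origin")
    # A non-empty string is truthy, so `if not origin` lets it through and
    # the later origin.get(...) raises AttributeError. isinstance() treats
    # every non-dict shape the same as a missing origin.
    if not isinstance(origin, dict):
        return None
    if not origin.get("platform"):
        return None
    return origin
```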
JasonOA888 69dd0f7cf1 fix(approval): extend sensitive write target to cover shell RC and credential files
Terminal commands can write to shell RC files (~/.bashrc, ~/.zshrc,
~/.profile) and credential files (~/.netrc, ~/.pgpass, ~/.npmrc,
~/.pypirc) via redirection or tee without triggering approval, even
though write_file already blocks these paths in file_safety.py.

This creates an inconsistency: write_file protects these paths but
terminal shell redirections bypass the same protection. An agent
prompted via indirect injection could install persistent backdoors
(e.g. PATH manipulation, alias overrides) or write credential entries
without user approval.

Extend _SENSITIVE_WRITE_TARGET with two new regex groups matching the
same paths that file_safety.py's WRITE_DENIED_PATHS already covers:
  _SHELL_RC_FILES  — ~/.bashrc, ~/.zshrc, ~/.profile, ~/.bash_profile,
                     ~/.zprofile
  _CREDENTIAL_FILES — ~/.netrc, ~/.pgpass, ~/.npmrc, ~/.pypirc

All 130 existing tests pass.
2026-05-03 08:49:13 -07:00
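A pattern in the spirit of the two new regex groups might look like the following. The group names echo the commit message, but every pattern body here is an assumption written for illustration; the real `_SENSITIVE_WRITE_TARGET` is not shown in this page and likely differs.

```python
import re

# Hypothetical reconstructions of the two file groups named in the commit.
_SHELL_RC_FILES = r"\.(?:bashrc|zshrc|profile|bash_profile|zprofile)"
_CREDENTIAL_FILES = r"\.(?:netrc|pgpass|npmrc|pypirc)"

_HOME_PREFIX = r"(?:~|\$HOME|\$\{HOME\})/"
# The negative lookahead keeps names like ~/.bashrc_backup from matching.
_SENSITIVE_TARGET = (
    rf"{_HOME_PREFIX}(?:{_SHELL_RC_FILES}|{_CREDENTIAL_FILES})(?![\w.-])"
)
# Redirection (> or >>) or tee with optional flags such as -a.
_WRITE_OP = r"(?:>>?|\btee\b(?:\s+-\w+)*)"
_SENSITIVE_WRITE = re.compile(rf"{_WRITE_OP}\s*[\"']?{_SENSITIVE_TARGET}")


def needs_approval(command: str) -> bool:
    # True when the command writes to a protected dotfile in the home dir.
    return _SENSITIVE_WRITE.search(command) is not None
```

Reading one of these files (for example `cat ~/.bashrc`) does not match; only a redirection or `tee` aimed at the file does.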
teknium1 3c59566cc5 chore(release): map leprincep35700 email for PR #18440 salvage 2026-05-03 08:47:49 -07:00
leprincep35700 b59bb4e351 fix(gateway): preserve home-channel thread targets across restart notifications 2026-05-03 08:47:49 -07:00
Teknium d87fd9f039 fix(goals): make /goal work in TUI and fix gateway verdict delivery (#19209)
/goal was silently broken outside the classic CLI.

TUI: /goal was routed through the HermesCLI slash-worker subprocess,
which set the goal row in SessionDB but then called
_pending_input.put(state.goal) — the subprocess has no reader for that
queue, so the kickoff message was discarded. No post-turn judge was
wired into prompt.submit either, so even a manual kickoff would not
continue the goal loop. Intercept /goal in command.dispatch instead,
drive GoalManager directly, and return {type: send, notice, message}
so the TUI client renders the Goal-set notice and fires the kickoff.
Run the judge in _run_prompt_submit after message.complete, surface
the verdict via status.update {kind: goal}, and chain the continuation
turn after the running guard is released.

Gateway: _post_turn_goal_continuation was gated on
hasattr(adapter, 'send_message'), but adapters only expose send().
That branch was dead on every platform — users never saw
'✓ Goal achieved', 'Continuing toward goal', or budget-exhausted
messages. Replace the dead call with adapter.send(chat_id, content,
metadata) and drop a broken reference to self._loop.

Tests:
- tests/tui_gateway/test_goal_command.py — full /goal dispatch matrix
  (set / status / pause / resume / clear / stop / done / whitespace)
  plus regressions for slash.exec → 4018 and 'goal' staying in
  _PENDING_INPUT_COMMANDS.
- tests/gateway/test_goal_verdict_send.py — locks in the adapter.send
  path for done / continue / budget-exhausted and verifies the hook
  no-ops when no goal is set or the adapter lacks send().
2026-05-03 05:49:12 -07:00
Teknium 55647a5813 fix(whatsapp): pin protobufjs >=7.5.5 via npm overrides to clear 3 critical vulns (#19204)
The whatsapp-bridge pulls @whiskeysockets/baileys at a pinned git
commit whose transitive dep tree ships protobufjs <7.5.5, triggering
GHSA-xq3m-2v4x-88gg (critical, arbitrary code execution). npm audit
reported 3 cascading criticals: protobufjs, @whiskeysockets/libsignal-node
(pulls protobufjs), and baileys itself (effect rollup).

Fix: add npm overrides block pinning protobufjs to ^7.5.5. Deduplicates
to a single 7.5.6 copy at node_modules/protobufjs that both libsignal-node
and any other consumers resolve through normal module resolution.

Why not bump baileys: npm-published baileys@6.17.16 is deprecated by the
maintainers (wrong version), 7.0.0-rc.* still pulls the same vulnerable
libsignal-node, and upstream Baileys HEAD adds a 4th vuln (music-metadata).
The override is the minimal, behavior-preserving fix.

Validation:
- npm audit: 3 critical -> 0 vulnerabilities
- node -e "import('@whiskeysockets/baileys')" -> all 5 named exports
  (makeWASocket, useMultiFileAuthState, DisconnectReason,
  fetchLatestBaileysVersion, downloadMediaMessage) resolve
- node bridge.js loads all modules and reaches Express bind
  (exits only on EADDRINUSE because the live gateway owns :3000)
- Single deduped protobufjs@7.5.6 in the tree
2026-05-03 05:22:30 -07:00
kshitijk4poor 6f2dab248a fix: update tests for resume_pending semantics + add AUTHOR_MAP entries
Tests updated to reflect suspend_recently_active now setting
resume_pending=True (preserves session) instead of suspended=True
(wipes session history).

AUTHOR_MAP entries: millerc79 (#19033), shellybotmoyer (#18915)
2026-05-03 03:54:03 -07:00
charliekerfoot 1148c46241 fix(gateway): correct ws scheme conversion for https urls 2026-05-03 03:54:03 -07:00
kshitijk4poor 7a22c639dc chore: add shellybotmoyer to AUTHOR_MAP 2026-05-03 03:54:03 -07:00
Hermes Agent 934103476f fix(gateway): send /new response before cancel_session_processing to avoid race (#18912)
When /new is issued while an agent is actively processing, the confirmation response was never sent to the user because cancel_session_processing() was called before _send_with_retry(). Task cancellation side effects could silently drop the response.

Fix: reorder to send the response BEFORE cancelling the old task. Add logging at the send point (matching the pattern at line 2800 in _process_message_background) so future failures are visible.

Closes: #18912
2026-05-03 03:54:03 -07:00
kshitijk4poor bf3239472f chore: add millerc79 to AUTHOR_MAP 2026-05-03 03:54:03 -07:00
millerc79 f1e0292517 fix(gateway): resume sessions after crash/restart instead of blanket suspend
suspend_recently_active() was unconditionally setting suspended=True on
startup, causing get_or_create_session() to wipe conversation history on
every restart. Change to set resume_pending=True instead, so sessions
auto-resume while still allowing stuck-loop escalation after 3 failures.
2026-05-03 03:54:03 -07:00
kshitijk4poor 0a97ce6bff chore: add nftpoetrist to AUTHOR_MAP 2026-05-03 03:47:49 -07:00
nftpoetrist 6c1322b997 fix(slack): close previous handler in connect() to prevent zombie Socket Mode connections
SlackAdapter.connect() overwrote self._handler, self._app, and
self._socket_mode_task without closing the prior AsyncSocketModeHandler
first. If connect() was called a second time on the same adapter (e.g.
during a gateway restart or in-process reconnect attempt), the old Socket
Mode websocket stayed alive. Both the old and new connections received
every Slack event and dispatched it twice — producing double responses
with different wording, the same bug that affected DiscordAdapter (#18187,
fixed in #18758).

Fix: add a close-before-reassign guard at the start of the connection
setup path, mirroring the guard DiscordAdapter.connect() already has.
When self._handler is None (fresh adapter, first connect()) the block is
a harmless no-op. Scoped to the handler/app fields only — no behavior
change for any path that does not call connect() twice.

Fixes #18980
2026-05-03 03:47:49 -07:00
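The close-before-reassign guard can be sketched with a fake handler. `FakeHandler` and `Adapter` are stand-ins invented for this sketch; the real adapter wraps Slack's `AsyncSocketModeHandler`, and its setup code is not reproduced here.

```python
import asyncio
from typing import Optional, Tuple


class FakeHandler:
    """Stand-in for a Socket Mode handler holding a live websocket."""

    def __init__(self) -> None:
        self.closed = False

    async def close_async(self) -> None:
        self.closed = True


class Adapter:
    def __init__(self) -> None:
        self._handler: Optional[FakeHandler] = None

    async def connect(self) -> FakeHandler:
        # Close the previous handler before overwriting the reference,
        # otherwise the old connection keeps receiving every event.
        if self._handler is not None:
            try:
                await self._handler.close_async()
            except Exception:
                pass  # best effort: a failed close must not block reconnect
            self._handler = None
        self._handler = FakeHandler()
        return self._handler


async def demo() -> Tuple[FakeHandler, FakeHandler]:
    adapter = Adapter()
    first = await adapter.connect()   # guard is a no-op on first connect
    second = await adapter.connect()  # guard closes `first`
    return first, second
```

Running `asyncio.run(demo())` returns the two handlers; the first has been closed and only the second remains live.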
kshitijk4poor c14bf441a3 chore: add 0xyg3n noreply email to AUTHOR_MAP 2026-05-03 03:44:55 -07:00
0xyg3n 19ba9e43b6 fix(gateway/discord): require allowlist auth on slash commands
Slash commands (_run_simple_slash, _handle_thread_create_slash) bypassed
every DISCORD_ALLOWED_* gate enforced by on_message. Any guild member
could invoke /background (RCE via terminal), /restart, /model, /skill,
etc. CVSS 9.8 Critical.

- _evaluate_slash_authorization mirrors on_message gates (user, role,
  channel, ignored channel) with fail-closed semantics
- _check_slash_authorization sends ephemeral reject + logs + admin alert
- Auth gate runs before defer() so rejections are ephemeral
- /skill autocomplete returns [] for unauthorized users (no catalog leak)
- Component views (ExecApproval, SlashConfirm, UpdatePrompt, ModelPicker)
  now honor role allowlists via shared _component_check_auth helper
- Optional DISCORD_HIDE_SLASH_COMMANDS defense-in-depth
- Cross-platform admin alert (Telegram/Slack fallback) on unauthorized attempts

Based on PR #18125 by @0xyg3n.
2026-05-03 03:44:55 -07:00
kshitijk4poor 5d5b8912be test: add tests for cmd_key preservation through name clamping
- TestClampCommandNamesTriples: unit tests for 3-tuple support in
  _clamp_command_names (short names, long names, collisions, multiple
  entries, backward compat with 2-tuples)
- TestDiscordSkillCmdKeyDispatch: integration test through the full
  discord_skill_commands pipeline verifying long skill names retain
  their original cmd_key after clamping
- Add contributor CharlieKerfoot to AUTHOR_MAP
2026-05-03 03:25:45 -07:00
charliekerfoot c4c0e5abc2 fix: After _clamp_command_names truncates skill names to fit the 32-cha… 2026-05-03 03:25:45 -07:00
124 changed files with 10150 additions and 559 deletions
+4 -4
@@ -1529,7 +1529,7 @@ def _build_codex_client(model: str) -> Tuple[Optional[Any], Optional[str]]:
return CodexAuxiliaryClient(real_client, model), model
-def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
+def _try_anthropic(explicit_api_key: str = None) -> Tuple[Optional[Any], Optional[str]]:
try:
from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token
except ImportError:
@@ -1539,10 +1539,10 @@ def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
if pool_present:
if entry is None:
return None, None
-token = _pool_runtime_api_key(entry)
+token = explicit_api_key or _pool_runtime_api_key(entry)
else:
entry = None
-token = resolve_anthropic_token()
+token = explicit_api_key or resolve_anthropic_token()
if not token:
return None, None
@@ -2336,7 +2336,7 @@ def resolve_provider_client(
if pconfig.auth_type == "api_key":
if provider == "anthropic":
-client, default_model = _try_anthropic()
+client, default_model = _try_anthropic(explicit_api_key=explicit_api_key)
if client is None:
logger.warning("resolve_provider_client: anthropic requested but no Anthropic credentials found")
return None, None
+2
@@ -569,6 +569,8 @@ class ContextCompressor(ContextEngine):
# Skip multimodal content (list of content blocks)
if isinstance(content, list):
continue
+if not isinstance(content, str):
+continue
if len(content) < 200:
continue
h = hashlib.md5(content.encode("utf-8", errors="replace")).hexdigest()[:12]
+110 -35
@@ -24,11 +24,12 @@ from __future__ import annotations
import json
import logging
import os
+import re
import tempfile
import threading
from datetime import datetime, timedelta, timezone
from pathlib import Path
-from typing import Any, Callable, Dict, List, Optional, Set
+from typing import Any, Callable, Dict, List, NamedTuple, Optional, Set
from hermes_constants import get_hermes_home
from tools import skill_usage
@@ -36,6 +37,22 @@ from tools import skill_usage
logger = logging.getLogger(__name__)
def _strip_aux_credential(value: Any) -> Optional[str]:
if value is None:
return None
text = str(value).strip()
return text or None
class _ReviewRuntimeBinding(NamedTuple):
"""Provider/model for the curator review fork plus optional per-slot overrides."""
provider: str
model: str
explicit_api_key: Optional[str]
explicit_base_url: Optional[str]
DEFAULT_INTERVAL_HOURS = 24 * 7 # 7 days
DEFAULT_MIN_IDLE_HOURS = 2
DEFAULT_STALE_AFTER_DAYS = 30
@@ -453,6 +470,24 @@ def _reports_root() -> Path:
return root
def _needle_in_path_component(needle: str, path: str) -> bool:
"""Check if *needle* is a complete filename stem or directory name in *path*.
Unlike simple substring matching, this avoids false positives where short
skill names are embedded in longer filenames (e.g. "api" matching
"references/api-design.md"). Hyphens and underscores are normalised so
"open-webui-setup" matches "open_webui_setup.md".
"""
norm_needle = needle.replace("-", "_")
for part in path.replace("\\", "/").split("/"):
if not part:
continue
stem = part.rsplit(".", 1)[0] if "." in part else part
if stem.replace("-", "_") == norm_needle:
return True
return False
def _classify_removed_skills(
removed: List[str],
added: List[str],
@@ -531,15 +566,29 @@ def _classify_removed_skills(
continue
# Look for the removed skill's name in file_path / content / raw.
-haystacks: List[str] = []
+# Matching strategy differs by field type:
+# file_path — needle must be a complete path component
+# (filename stem or directory name), so "api" does NOT
+# falsely match "references/api-design.md".
+# content fields — word-boundary regex so "test" does NOT
+# falsely match "latest" or "testing".
+haystacks: List[tuple[str, str]] = []
for key in ("file_path", "file_content", "content", "new_string", "_raw"):
v = args.get(key)
if isinstance(v, str):
-haystacks.append(v)
+haystacks.append((key, v))
hit = False
-for hay in haystacks:
+for key, hay in haystacks:
for needle in needles:
-if needle and needle in hay:
+if not needle:
+continue
+if key == "file_path":
+matched = _needle_in_path_component(needle, hay)
+else:
+matched = bool(
+re.search(rf'\b{re.escape(needle)}\b', hay)
+)
+if matched:
hit = True
evidence = (
f"skill_manage action={args.get('action', '?')} "
@@ -1398,6 +1447,52 @@ def run_curator_review(
}
def _resolve_review_runtime(cfg: Dict[str, Any]) -> _ReviewRuntimeBinding:
"""Resolve provider/model and per-slot credentials for the curator review fork.
Same precedence as `_resolve_review_model()`. Non-empty ``api_key`` /
``base_url`` from the active slot are returned as explicit overrides so
``resolve_runtime_provider`` does not silently reuse the main chat
credential chain for a routed auxiliary model.
"""
_main = cfg.get("model", {}) if isinstance(cfg.get("model"), dict) else {}
_main_provider = _main.get("provider") or "auto"
_main_model = _main.get("default") or _main.get("model") or ""
# 1. Canonical aux task slot
_aux = cfg.get("auxiliary", {}) if isinstance(cfg.get("auxiliary"), dict) else {}
_cur_task = _aux.get("curator", {}) if isinstance(_aux.get("curator"), dict) else {}
_task_provider = (_cur_task.get("provider") or "").strip() or None
_task_model = (_cur_task.get("model") or "").strip() or None
if _task_provider and _task_provider != "auto" and _task_model:
return _ReviewRuntimeBinding(
_task_provider,
_task_model,
_strip_aux_credential(_cur_task.get("api_key")),
_strip_aux_credential(_cur_task.get("base_url")),
)
# 2. Legacy curator.auxiliary.{provider,model} (deprecated, pre-unification)
_cur = cfg.get("curator", {}) if isinstance(cfg.get("curator"), dict) else {}
_legacy = _cur.get("auxiliary", {}) if isinstance(_cur.get("auxiliary"), dict) else {}
_legacy_provider = _legacy.get("provider") or None
_legacy_model = _legacy.get("model") or None
if _legacy_provider and _legacy_model:
logger.info(
"curator: using deprecated curator.auxiliary.{provider,model} "
"config — please migrate to auxiliary.curator.{provider,model}"
)
return _ReviewRuntimeBinding(
str(_legacy_provider),
str(_legacy_model),
_strip_aux_credential(_legacy.get("api_key")),
_strip_aux_credential(_legacy.get("base_url")),
)
# 3. Fall through to the main chat model
return _ReviewRuntimeBinding(_main_provider, _main_model, None, None)
def _resolve_review_model(cfg: Dict[str, Any]) -> tuple[str, str]:
"""Pick (provider, model) for the curator review fork.
@@ -1413,32 +1508,8 @@ def _resolve_review_model(cfg: Dict[str, Any]) -> tuple[str, str]:
2. Legacy ``curator.auxiliary.{provider,model}`` when both are set
3. Main ``model.{provider,default/model}`` pair
"""
-_main = cfg.get("model", {}) if isinstance(cfg.get("model"), dict) else {}
-_main_provider = _main.get("provider") or "auto"
-_main_model = _main.get("default") or _main.get("model") or ""
-# 1. Canonical aux task slot
-_aux = cfg.get("auxiliary", {}) if isinstance(cfg.get("auxiliary"), dict) else {}
-_cur_task = _aux.get("curator", {}) if isinstance(_aux.get("curator"), dict) else {}
-_task_provider = (_cur_task.get("provider") or "").strip() or None
-_task_model = (_cur_task.get("model") or "").strip() or None
-if _task_provider and _task_provider != "auto" and _task_model:
-return _task_provider, _task_model
-# 2. Legacy curator.auxiliary.{provider,model} (deprecated, pre-unification)
-_cur = cfg.get("curator", {}) if isinstance(cfg.get("curator"), dict) else {}
-_legacy = _cur.get("auxiliary", {}) if isinstance(_cur.get("auxiliary"), dict) else {}
-_legacy_provider = _legacy.get("provider") or None
-_legacy_model = _legacy.get("model") or None
-if _legacy_provider and _legacy_model:
-logger.info(
-"curator: using deprecated curator.auxiliary.{provider,model} "
-"config — please migrate to auxiliary.curator.{provider,model}"
-)
-return _legacy_provider, _legacy_model
-# 3. Fall through to the main chat model
-return _main_provider, _main_model
+b = _resolve_review_runtime(cfg)
+return b.provider, b.model
def _run_llm_review(prompt: str) -> Dict[str, Any]:
@@ -1477,10 +1548,10 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
# arguments hits an auto-resolution path that fails for OAuth-only
# providers and for pool-backed credentials.
#
-# `_resolve_review_model()` honors `auxiliary.curator.{provider,model}`
+# `_resolve_review_runtime()` honors `auxiliary.curator.{provider,model,...}`
# (canonical aux-task slot, wired through `hermes model` → auxiliary
# picker and the dashboard Models tab), with a legacy fallback to
-# `curator.auxiliary.{provider,model}`. See docs/user-guide/features/curator.md.
+# `curator.auxiliary.{provider,model,...}`. See docs/user-guide/features/curator.md.
_api_key = None
_base_url = None
_api_mode = None
@@ -1490,9 +1561,13 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
from hermes_cli.config import load_config
from hermes_cli.runtime_provider import resolve_runtime_provider
_cfg = load_config()
_provider, _model_name = _resolve_review_model(_cfg)
_binding = _resolve_review_runtime(_cfg)
_provider, _model_name = _binding.provider, _binding.model
_rp = resolve_runtime_provider(
requested=_provider, target_model=_model_name
requested=_provider,
target_model=_model_name,
explicit_api_key=_binding.explicit_api_key,
explicit_base_url=_binding.explicit_base_url,
)
_api_key = _rp.get("api_key")
_base_url = _rp.get("base_url")
+2 -2
@@ -183,8 +183,8 @@ SKILLS_GUIDANCE = (
)
KANBAN_GUIDANCE = (
"# You are a Kanban worker\n"
"You were spawned by the Hermes Kanban dispatcher to execute ONE task from "
"# Kanban task execution protocol\n"
"You have been assigned ONE task from "
"the shared board at `~/.hermes/kanban.db`. Your task id is in "
"`$HERMES_KANBAN_TASK`; your workspace is `$HERMES_KANBAN_WORKSPACE`. "
"The `kanban_*` tools in your schema are your primary coordination surface — "
+12 -1
@@ -143,7 +143,18 @@ class ResponsesApiTransport(ProviderTransport):
kwargs["max_output_tokens"] = max_tokens
if is_xai_responses and session_id:
kwargs["extra_headers"] = {"x-grok-conv-id": session_id}
existing_extra_headers = kwargs.get("extra_headers")
merged_extra_headers: Dict[str, str] = {}
if isinstance(existing_extra_headers, dict):
merged_extra_headers.update(
{
str(key): str(value)
for key, value in existing_extra_headers.items()
if key and value is not None
}
)
merged_extra_headers["x-grok-conv-id"] = session_id
kwargs["extra_headers"] = merged_extra_headers
return kwargs
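The header merge in this hunk can be tried in isolation. The following is a minimal sketch under the assumption that `kwargs` is a plain dict; `merge_conv_header` is a hypothetical name, not a function from the repository.

```python
from typing import Any, Dict


def merge_conv_header(kwargs: Dict[str, Any], session_id: str) -> Dict[str, Any]:
    """Add x-grok-conv-id without discarding headers that are already set."""
    existing = kwargs.get("extra_headers")
    merged: Dict[str, str] = {}
    if isinstance(existing, dict):
        # Keep existing headers, coerced to strings; drop empty keys and None values.
        merged.update(
            {str(k): str(v) for k, v in existing.items() if k and v is not None}
        )
    merged["x-grok-conv-id"] = session_id
    kwargs["extra_headers"] = merged
    return kwargs
```

Compared with the previous one-line assignment, caller-supplied headers now survive, and the conversation id still wins if both set the same key.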
+34 -35
@@ -459,32 +459,19 @@ def load_cli_config() -> Dict[str, Any]:
if "backend" in terminal_config:
terminal_config["env_type"] = terminal_config["backend"]
# Handle special cwd values: "." or "auto" means use current working directory.
# Only resolve to the host's CWD for the local backend where the host
# filesystem is directly accessible. For ALL remote/container backends
# (ssh, docker, modal, singularity), the host path doesn't exist on the
# target -- remove the key so terminal_tool.py uses its per-backend default.
#
# GUARD: If TERMINAL_CWD is already set to a real absolute path (by the
# gateway's config bridge earlier in the process), don't clobber it.
# This prevents a lazy import of cli.py during gateway runtime from
# rewriting TERMINAL_CWD to the service's working directory.
# See issue #10817.
# CWD resolution for CLI/TUI. The gateway has its own config bridge in
# gateway/run.py but may lazily import cli.py (triggering this code).
# Local backend: always os.getcwd(). Use `cd /dir && hermes` to control it.
# Non-local with placeholder: pop so terminal_tool uses its per-backend default.
# Non-local with explicit path: keep as-is.
_CWD_PLACEHOLDERS = (".", "auto", "cwd")
if terminal_config.get("cwd") in _CWD_PLACEHOLDERS:
_existing_cwd = os.environ.get("TERMINAL_CWD", "")
if _existing_cwd and _existing_cwd not in _CWD_PLACEHOLDERS and os.path.isabs(_existing_cwd):
# Gateway (or earlier startup) already resolved a real path — keep it
terminal_config["cwd"] = _existing_cwd
defaults["terminal"]["cwd"] = _existing_cwd
else:
effective_backend = terminal_config.get("env_type", "local")
if effective_backend == "local":
terminal_config["cwd"] = os.getcwd()
defaults["terminal"]["cwd"] = terminal_config["cwd"]
else:
# Remove so TERMINAL_CWD stays unset → tool picks backend default
terminal_config.pop("cwd", None)
effective_backend = terminal_config.get("env_type", "local")
if effective_backend == "local":
terminal_config["cwd"] = os.getcwd()
defaults["terminal"]["cwd"] = terminal_config["cwd"]
elif terminal_config.get("cwd") in _CWD_PLACEHOLDERS:
terminal_config.pop("cwd", None)
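The simplified CWD rules (local backend always uses the process CWD, non-local placeholders are dropped, non-local explicit paths are kept) can be sketched as follows. `resolve_cli_cwd` is a hypothetical helper that works on a bare dict and leaves out the `defaults` bookkeeping shown in the diff.

```python
import os
from typing import Any, Dict

CWD_PLACEHOLDERS = (".", "auto", "cwd")


def resolve_cli_cwd(terminal_config: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the three CWD rules described in the comment block above."""
    backend = terminal_config.get("env_type", "local")
    if backend == "local":
        # Local backend: always the current working directory of the process.
        terminal_config["cwd"] = os.getcwd()
    elif terminal_config.get("cwd") in CWD_PLACEHOLDERS:
        # Non-local placeholder: remove so the backend default applies.
        terminal_config.pop("cwd", None)
    # Non-local explicit path: left untouched.
    return terminal_config
```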
env_mappings = {
"env_type": "TERMINAL_ENV",
@@ -517,13 +504,18 @@ def load_cli_config() -> Dict[str, Any]:
"sudo_password": "SUDO_PASSWORD",
}
# Apply config values to env vars so terminal_tool picks them up.
# If the config file explicitly has a [terminal] section, those values are
# authoritative and override any .env settings. When using defaults only
# (no config file or no terminal section), don't overwrite env vars that
# were already set by .env -- the user's .env is the fallback source.
# Bridge config env vars for terminal_tool. TERMINAL_CWD is force-exported
# UNLESS we're inside a gateway process (detected by _HERMES_GATEWAY marker)
# where it was already set correctly by gateway/run.py's config bridge.
_is_gateway = os.environ.get("_HERMES_GATEWAY") == "1"
for config_key, env_var in env_mappings.items():
if config_key in terminal_config:
if env_var == "TERMINAL_CWD":
if _is_gateway:
continue
# CLI: always export (overrides stale .env or inherited values)
os.environ[env_var] = str(terminal_config[config_key])
continue
if _file_has_terminal_config or env_var not in os.environ:
val = terminal_config[config_key]
if isinstance(val, list):
@@ -8383,6 +8375,17 @@ class HermesCLI:
_cprint(f"{_DIM}Voice auto-restart failed: {e}{_RST}")
threading.Thread(target=_restart_recording, daemon=True).start()
def _voice_speak_response_async(self, text: str) -> None:
"""Schedule TTS and mark it pending before continuous recording can restart."""
if not self._voice_tts or not text:
return
self._voice_tts_done.clear()
threading.Thread(
target=self._voice_speak_response,
args=(text,),
daemon=True,
).start()
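The point of the new helper is ordering: the "TTS done" event is cleared on the calling thread before the worker starts, so a recording restart cannot observe a stale "done" state. A minimal sketch of that pattern follows; `VoiceSketch` is hypothetical, and the assumption that the worker sets the event when it finishes is not shown in this diff.

```python
import threading
from typing import List, Optional


class VoiceSketch:
    """Hypothetical reduction of the TTS scheduling pattern."""

    def __init__(self) -> None:
        self.tts_enabled = True
        self.tts_done = threading.Event()
        self.tts_done.set()  # idle: nothing pending
        self.spoken: List[str] = []

    def speak_async(self, text: str) -> Optional[threading.Thread]:
        if not self.tts_enabled or not text:
            return None
        # Mark TTS as pending *before* the thread exists, so there is no
        # window in which a waiter sees the event still set.
        self.tts_done.clear()
        worker = threading.Thread(target=self._speak, args=(text,), daemon=True)
        worker.start()
        return worker

    def _speak(self, text: str) -> None:
        try:
            self.spoken.append(text)  # stand-in for real speech synthesis
        finally:
            self.tts_done.set()  # assumption: worker signals completion
```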
def _voice_speak_response(self, text: str):
"""Speak the agent's response aloud using TTS (runs in background thread)."""
if not self._voice_tts:
@@ -9543,11 +9546,7 @@ class HermesCLI:
# Speak response aloud if voice TTS is enabled
# Skip batch TTS when streaming TTS already handled it
if self._voice_tts and response and not use_streaming_tts:
threading.Thread(
target=self._voice_speak_response,
args=(response,),
daemon=True,
).start()
self._voice_speak_response_async(response)
# Re-queue the interrupt message (and any that arrived while we were
+19 -2
@@ -797,19 +797,36 @@ def get_due_jobs() -> List[Dict[str, Any]]:
next_run = job.get("next_run_at")
if not next_run:
schedule = job.get("schedule", {})
kind = schedule.get("kind")
# One-shot jobs use a small grace window via the dedicated helper.
recovered_next = _recoverable_oneshot_run_at(
job.get("schedule", {}),
schedule,
now,
last_run_at=job.get("last_run_at"),
)
recovery_kind = "one-shot" if recovered_next else None
# Recurring jobs reach here only when something — typically a
# direct jobs.json edit that bypassed add_job() — left
# next_run_at unset. Without this branch, such jobs are
# silently skipped forever; recompute next_run_at from the
# schedule so they pick up at their next scheduled tick.
if not recovered_next and kind in ("cron", "interval"):
recovered_next = compute_next_run(schedule, now.isoformat())
if recovered_next:
recovery_kind = kind
if not recovered_next:
continue
job["next_run_at"] = recovered_next
next_run = recovered_next
logger.info(
"Job '%s' had no next_run_at; recovering one-shot run at %s",
"Job '%s' had no next_run_at; recovering %s run at %s",
job.get("name", job["id"]),
recovery_kind,
recovered_next,
)
for rj in raw_jobs:
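The recovery branch for recurring jobs can be illustrated with a reduced example. This is a sketch only: `compute_next_interval_run` is a hypothetical stand-in for `compute_next_run` that handles interval schedules alone, and the `minutes` field is an assumed schedule shape, not one confirmed by the diff.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, Optional


def compute_next_interval_run(schedule: Dict[str, Any], now_iso: str) -> Optional[str]:
    """Hypothetical simplified scheduler: next run is now + interval."""
    minutes = schedule.get("minutes")  # assumed field name
    if not isinstance(minutes, int) or minutes <= 0:
        return None
    return (datetime.fromisoformat(now_iso) + timedelta(minutes=minutes)).isoformat()


def recover_next_run(job: Dict[str, Any], now: datetime) -> Optional[str]:
    """Recompute a missing next_run_at instead of skipping the job forever."""
    if job.get("next_run_at"):
        return job["next_run_at"]
    schedule = job.get("schedule", {})
    if schedule.get("kind") == "interval":
        recovered = compute_next_interval_run(schedule, now.isoformat())
        if recovered:
            job["next_run_at"] = recovered
            return recovered
    return None  # nothing recoverable: caller skips this job
```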
+35 -5
@@ -123,9 +123,19 @@ _LOCK_FILE = _LOCK_DIR / ".tick.lock"
def _resolve_origin(job: dict) -> Optional[dict]:
"""Extract origin info from a job, preserving any extra routing metadata."""
"""Extract origin info from a job, preserving any extra routing metadata.
Treats non-dict origins (free-form provenance strings, ints, lists from
migration scripts or hand-edited jobs.json) as missing instead of
crashing with ``AttributeError`` on ``origin.get(...)``. Without this
guard, a job tagged with e.g. ``"combined-digest-replaces-x-and-y"``
crashed every fire attempt with
``'str' object has no attribute 'get'`` — ``mark_job_run`` recorded the
failure, but the next tick re-loaded the same poisoned origin and
crashed identically until the field was patched manually (#18722).
"""
origin = job.get("origin")
if not origin:
if not isinstance(origin, dict):
return None
platform = origin.get("platform")
chat_id = origin.get("chat_id")
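The effect of switching from `if not origin` to an `isinstance` check can be seen in a reduced version. `resolve_origin` below is a hypothetical sketch; requiring both `platform` and `chat_id` is an assumption, since the rest of the real function is outside this hunk.

```python
from typing import Any, Dict, Optional


def resolve_origin(job: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Treat any non-dict origin as missing instead of calling .get() on it."""
    origin = job.get("origin")
    if not isinstance(origin, dict):
        # Strings, ints and lists from hand-edited jobs.json land here.
        return None
    # Assumption: an origin is only usable with both routing fields present.
    if origin.get("platform") and origin.get("chat_id"):
        return origin
    return None
```

With the old truthiness check, a non-empty string origin passed the guard and failed on `origin.get(...)`; here it is simply ignored.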
@@ -147,6 +157,19 @@ def _get_home_target_chat_id(platform_name: str) -> str:
return value
def _get_home_target_thread_id(platform_name: str) -> Optional[str]:
"""Return the optional thread/topic ID for a platform home target."""
env_var = _HOME_TARGET_ENV_VARS.get(platform_name.lower())
if not env_var:
return None
value = os.getenv(f"{env_var}_THREAD_ID", "").strip()
if not value:
legacy = _LEGACY_HOME_TARGET_ENV_VARS.get(env_var)
if legacy:
value = os.getenv(f"{legacy}_THREAD_ID", "").strip()
return value or None
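A self-contained version of the thread-ID lookup shows the `<VAR>_THREAD_ID` naming and the legacy fallback. The two mapping dicts below are hypothetical excerpts; the real `_HOME_TARGET_ENV_VARS` and `_LEGACY_HOME_TARGET_ENV_VARS` contents are not shown in this diff.

```python
import os
from typing import Dict, Optional

# Hypothetical excerpts of the real mappings.
HOME_TARGET_ENV_VARS: Dict[str, str] = {
    "telegram": "TELEGRAM_HOME_CHANNEL",
    "qqbot": "QQBOT_HOME_CHANNEL",
}
LEGACY_HOME_TARGET_ENV_VARS: Dict[str, str] = {
    "QQBOT_HOME_CHANNEL": "QQ_HOME_CHANNEL",
}


def home_target_thread_id(platform_name: str) -> Optional[str]:
    """Return the optional thread/topic ID for a platform home target."""
    env_var = HOME_TARGET_ENV_VARS.get(platform_name.lower())
    if not env_var:
        return None
    value = os.getenv(f"{env_var}_THREAD_ID", "").strip()
    if not value:
        # Fall back to the legacy variable name, if one is registered.
        legacy = LEGACY_HOME_TARGET_ENV_VARS.get(env_var)
        if legacy:
            value = os.getenv(f"{legacy}_THREAD_ID", "").strip()
    return value or None
```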
def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[dict]:
"""Resolve one concrete auto-delivery target for a cron job."""
@@ -175,7 +198,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
return {
"platform": platform_name,
"chat_id": chat_id,
"thread_id": None,
"thread_id": _get_home_target_thread_id(platform_name),
}
return None
@@ -229,7 +252,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
return {
"platform": platform_name,
"chat_id": chat_id,
"thread_id": None,
"thread_id": _get_home_target_thread_id(platform_name),
}
@@ -394,7 +417,7 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
thread_id = target.get("thread_id")
# Diagnostic: log thread_id for topic-aware delivery debugging
origin = job.get("origin") or {}
origin = _resolve_origin(job) or {}
origin_thread = origin.get("thread_id")
if origin_thread and not thread_id:
logger.warning(
@@ -759,6 +782,7 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
return prompt
from tools.skills_tool import skill_view
from tools.skill_usage import bump_use
parts = []
skipped: list[str] = []
@@ -770,6 +794,12 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
skipped.append(skill_name)
continue
# Bump usage so the curator sees this skill as actively used.
try:
bump_use(skill_name)
except Exception:
logger.debug("Cron job: failed to bump skill usage for '%s'", skill_name, exc_info=True)
content = str(loaded.get("content") or "").strip()
if parts:
parts.append("")
+35
@@ -86,6 +86,41 @@ if [ -d "$INSTALL_DIR/skills" ]; then
python3 "$INSTALL_DIR/tools/skills_sync.py"
fi
# Optionally start `hermes dashboard` as a side-process.
#
# Toggled by HERMES_DASHBOARD=1 (also accepts "true"/"yes", case-insensitive).
# Host/port/TUI can be overridden via:
# HERMES_DASHBOARD_HOST (default 0.0.0.0 — exposed outside the container)
# HERMES_DASHBOARD_PORT (default 9119, matches `hermes dashboard` default)
# HERMES_DASHBOARD_TUI (already honored by `hermes dashboard` itself)
#
# The dashboard is a long-lived server. We background it *before* the final
# `exec hermes "$@"` so the user's chosen foreground command (chat, gateway,
# sleep infinity, …) remains PID-of-interest for the container runtime. When
# the container stops the whole process tree is torn down, so no explicit
# cleanup is needed.
case "${HERMES_DASHBOARD:-}" in
1|true|TRUE|True|yes|YES|Yes)
dash_host="${HERMES_DASHBOARD_HOST:-0.0.0.0}"
dash_port="${HERMES_DASHBOARD_PORT:-9119}"
dash_args=(--host "$dash_host" --port "$dash_port" --no-open)
# Binding to anything other than localhost requires --insecure — the
# dashboard refuses otherwise because it exposes API keys. Inside a
# container this is the expected deployment (host reaches it via
# published port), so opt in automatically.
if [ "$dash_host" != "127.0.0.1" ] && [ "$dash_host" != "localhost" ]; then
dash_args+=(--insecure)
fi
echo "Starting hermes dashboard on ${dash_host}:${dash_port} (background)"
# Prefix dashboard output so it's distinguishable from the main
# process in `docker logs`. stdbuf keeps the pipe line-buffered.
(
stdbuf -oL -eL hermes dashboard "${dash_args[@]}" 2>&1 \
| sed -u 's/^/[dashboard] /'
) &
;;
esac
# Final exec: two supported invocation patterns.
#
# docker run <image> -> exec `hermes` with no args (legacy default)
+45 -4
@@ -186,18 +186,24 @@ class HomeChannel:
Default destination for a platform.
When a cron job specifies deliver="telegram" without a specific chat ID,
messages are sent to this home channel.
messages are sent to this home channel. Thread-aware platforms may also
store a thread/topic ID so the bare platform target routes to the exact
conversation where /sethome was run.
"""
platform: Platform
chat_id: str
name: str # Human-readable name for display
thread_id: Optional[str] = None
def to_dict(self) -> Dict[str, Any]:
return {
result = {
"platform": self.platform.value,
"chat_id": self.chat_id,
"name": self.name,
}
if self.thread_id:
result["thread_id"] = self.thread_id
return result
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "HomeChannel":
@@ -205,6 +211,7 @@ class HomeChannel:
platform=Platform(data["platform"]),
chat_id=str(data["chat_id"]),
name=data.get("name", "Home"),
thread_id=str(data["thread_id"]) if data.get("thread_id") else None,
)
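The optional `thread_id` round-trip can be checked with a reduced copy of the dataclass. This is a sketch: the `Platform` enum here is a hypothetical one-member subset, and everything else follows the fields shown in the hunk.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional


class Platform(Enum):
    TELEGRAM = "telegram"  # hypothetical subset of the real enum


@dataclass
class HomeChannel:
    platform: Platform
    chat_id: str
    name: str
    thread_id: Optional[str] = None

    def to_dict(self) -> Dict[str, Any]:
        result: Dict[str, Any] = {
            "platform": self.platform.value,
            "chat_id": self.chat_id,
            "name": self.name,
        }
        if self.thread_id:
            # Only emitted when set, so older configs stay byte-identical.
            result["thread_id"] = self.thread_id
        return result

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "HomeChannel":
        return cls(
            platform=Platform(data["platform"]),
            chat_id=str(data["chat_id"]),
            name=data.get("name", "Home"),
            thread_id=str(data["thread_id"]) if data.get("thread_id") else None,
        )
```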
@@ -839,11 +846,25 @@ def load_gateway_config() -> GatewayConfig:
if yaml_key in allow_mentions_cfg and not os.getenv(env_key):
os.environ[env_key] = str(allow_mentions_cfg[yaml_key]).lower()
# Bridge top-level require_mention to Telegram when the telegram: section
# does not already provide one. Users often write "require_mention: true"
# at the top level alongside group_sessions_per_user, expecting it to work
# the same way (#3979).
_tl_require_mention = yaml_cfg.get("require_mention")
if _tl_require_mention is not None:
_tg_section = yaml_cfg.get("telegram") or {}
if "require_mention" not in _tg_section:
_tg_plat = platforms_data.setdefault(Platform.TELEGRAM.value, {})
_tg_extra = _tg_plat.setdefault("extra", {})
_tg_extra.setdefault("require_mention", _tl_require_mention)
# Telegram settings → env vars (env vars take precedence)
telegram_cfg = yaml_cfg.get("telegram", {})
if isinstance(telegram_cfg, dict):
if "require_mention" in telegram_cfg and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
os.environ["TELEGRAM_REQUIRE_MENTION"] = str(telegram_cfg["require_mention"]).lower()
# Prefer telegram.require_mention; fall back to the top-level shorthand.
_effective_rm = telegram_cfg.get("require_mention", yaml_cfg.get("require_mention"))
if _effective_rm is not None and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
os.environ["TELEGRAM_REQUIRE_MENTION"] = str(_effective_rm).lower()
if "mention_patterns" in telegram_cfg and not os.getenv("TELEGRAM_MENTION_PATTERNS"):
os.environ["TELEGRAM_MENTION_PATTERNS"] = json.dumps(telegram_cfg["mention_patterns"])
frc = telegram_cfg.get("free_response_chats")
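The precedence for `require_mention` (environment variable, then `telegram.require_mention`, then the top-level shorthand) can be expressed as a pure function. `effective_require_mention` is a hypothetical helper that takes the environment as a dict so it can be exercised without touching `os.environ`.

```python
from typing import Any, Dict, Optional


def effective_require_mention(
    yaml_cfg: Dict[str, Any], env: Dict[str, str]
) -> Optional[str]:
    """Return the value TELEGRAM_REQUIRE_MENTION would end up with, if any."""
    # An already-set, non-empty env var always wins.
    if env.get("TELEGRAM_REQUIRE_MENTION"):
        return env["TELEGRAM_REQUIRE_MENTION"]
    telegram_cfg = yaml_cfg.get("telegram", {})
    if not isinstance(telegram_cfg, dict):
        return None
    # Prefer telegram.require_mention; fall back to the top-level shorthand.
    value = telegram_cfg.get("require_mention", yaml_cfg.get("require_mention"))
    return None if value is None else str(value).lower()
```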
@@ -1071,6 +1092,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.TELEGRAM,
chat_id=telegram_home,
name=os.getenv("TELEGRAM_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("TELEGRAM_HOME_CHANNEL_THREAD_ID") or None,
)
# Discord
@@ -1087,6 +1109,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.DISCORD,
chat_id=discord_home,
name=os.getenv("DISCORD_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("DISCORD_HOME_CHANNEL_THREAD_ID") or None,
)
# Reply threading mode for Discord (off/first/all)
@@ -1108,6 +1131,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WHATSAPP,
chat_id=whatsapp_home,
name=os.getenv("WHATSAPP_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WHATSAPP_HOME_CHANNEL_THREAD_ID") or None,
)
# Slack
@@ -1135,6 +1159,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SLACK,
chat_id=slack_home,
name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
thread_id=os.getenv("SLACK_HOME_CHANNEL_THREAD_ID") or None,
)
# Signal
@@ -1155,6 +1180,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SIGNAL,
chat_id=signal_home,
name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("SIGNAL_HOME_CHANNEL_THREAD_ID") or None,
)
# Mattermost
@@ -1174,6 +1200,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.MATTERMOST,
chat_id=mattermost_home,
name=os.getenv("MATTERMOST_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("MATTERMOST_HOME_CHANNEL_THREAD_ID") or None,
)
# Matrix
@@ -1205,6 +1232,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.MATRIX,
chat_id=matrix_home,
name=os.getenv("MATRIX_HOME_ROOM_NAME", "Home"),
thread_id=os.getenv("MATRIX_HOME_ROOM_THREAD_ID") or None,
)
# Home Assistant
@@ -1238,6 +1266,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.EMAIL,
chat_id=email_home,
name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
thread_id=os.getenv("EMAIL_HOME_ADDRESS_THREAD_ID") or None,
)
# SMS (Twilio)
@@ -1253,6 +1282,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SMS,
chat_id=sms_home,
name=os.getenv("SMS_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("SMS_HOME_CHANNEL_THREAD_ID") or None,
)
# API Server
@@ -1315,6 +1345,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.DINGTALK,
chat_id=dingtalk_home,
name=os.getenv("DINGTALK_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("DINGTALK_HOME_CHANNEL_THREAD_ID") or None,
)
# Feishu / Lark
@@ -1342,6 +1373,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.FEISHU,
chat_id=feishu_home,
name=os.getenv("FEISHU_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("FEISHU_HOME_CHANNEL_THREAD_ID") or None,
)
# WeCom (Enterprise WeChat)
@@ -1364,6 +1396,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WECOM,
chat_id=wecom_home,
name=os.getenv("WECOM_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WECOM_HOME_CHANNEL_THREAD_ID") or None,
)
# WeCom callback mode (self-built apps)
@@ -1422,6 +1455,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WEIXIN,
chat_id=weixin_home,
name=os.getenv("WEIXIN_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WEIXIN_HOME_CHANNEL_THREAD_ID") or None,
)
# BlueBubbles (iMessage)
@@ -1445,6 +1479,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.BLUEBUBBLES,
chat_id=bluebubbles_home,
name=os.getenv("BLUEBUBBLES_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("BLUEBUBBLES_HOME_CHANNEL_THREAD_ID") or None,
)
# QQ (Official Bot API v2)
@@ -1482,6 +1517,11 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.QQBOT,
chat_id=qq_home,
name=os.getenv("QQBOT_HOME_CHANNEL_NAME") or os.getenv(qq_home_name_env, "Home"),
thread_id=(
os.getenv("QQBOT_HOME_CHANNEL_THREAD_ID")
or os.getenv("QQ_HOME_CHANNEL_THREAD_ID")
or None
),
)
# Yuanbao — YUANBAO_APP_ID preferred
@@ -1512,6 +1552,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.YUANBAO,
chat_id=yuanbao_home,
name=os.getenv("YUANBAO_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("YUANBAO_HOME_CHANNEL_THREAD_ID") or None,
)
yuanbao_dm_policy = os.getenv("YUANBAO_DM_POLICY")
if yuanbao_dm_policy:
+15 -3
@@ -62,6 +62,14 @@ MAX_NORMALIZED_TEXT_LENGTH = 65_536 # 64 KB cap for normalized content parts
MAX_CONTENT_LIST_SIZE = 1_000 # Max items when content is an array
def _coerce_port(value: Any, default: int = DEFAULT_PORT) -> int:
"""Parse a listen port without letting malformed env/config values crash startup."""
try:
return int(value)
except (TypeError, ValueError):
return default
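How the helper behaves with malformed input, and how it combines with the config-then-env lookup used by the adapter in this change, can be sketched as below. `resolve_port` is a hypothetical wrapper, the environment is passed in as a dict for testability, and the `DEFAULT_PORT` value is a placeholder rather than the project's actual default.

```python
from typing import Any, Dict

DEFAULT_PORT = 8000  # placeholder value, not the project's real default


def coerce_port(value: Any, default: int = DEFAULT_PORT) -> int:
    """Parse a port, returning the default for anything int() rejects."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


def resolve_port(extra: Dict[str, Any], env: Dict[str, str]) -> int:
    """Config value first; env var only when the config value is absent."""
    raw = extra.get("port")
    if raw is None:
        raw = env.get("API_SERVER_PORT", str(DEFAULT_PORT))
    return coerce_port(raw, DEFAULT_PORT)
```

One consequence worth knowing: a malformed value falls back silently to the default, so a typo in the port setting changes the listen port rather than failing startup.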
def _normalize_chat_content(
content: Any, *, _max_depth: int = 10, _depth: int = 0,
) -> str:
@@ -573,7 +581,10 @@ class APIServerAdapter(BasePlatformAdapter):
super().__init__(config, Platform.API_SERVER)
extra = config.extra or {}
self._host: str = extra.get("host", os.getenv("API_SERVER_HOST", DEFAULT_HOST))
self._port: int = int(extra.get("port", os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))))
raw_port = extra.get("port")
if raw_port is None:
raw_port = os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))
self._port: int = _coerce_port(raw_port, DEFAULT_PORT)
self._api_key: str = extra.get("key", os.getenv("API_SERVER_KEY", ""))
self._cors_origins: tuple[str, ...] = self._parse_cors_origins(
extra.get("cors_origins", os.getenv("API_SERVER_CORS_ORIGINS", "")),
@@ -727,10 +738,11 @@ class APIServerAdapter(BasePlatformAdapter):
gateway platforms), falling back to the hermes-api-server default.
"""
from run_agent import AIAgent
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config, GatewayRunner
from hermes_cli.tools_config import _get_platform_tools
runtime_kwargs = _resolve_runtime_agent_kwargs()
reasoning_config = GatewayRunner._load_reasoning_config()
model = _resolve_gateway_model()
user_config = _load_gateway_config()
@@ -740,7 +752,6 @@ class APIServerAdapter(BasePlatformAdapter):
# Load fallback provider chain so the API server platform has the
# same fallback behaviour as Telegram/Discord/Slack (fixes #4954).
from gateway.run import GatewayRunner
fallback_model = GatewayRunner._load_fallback_model()
agent = AIAgent(
@@ -759,6 +770,7 @@ class APIServerAdapter(BasePlatformAdapter):
tool_complete_callback=tool_complete_callback,
session_db=self._ensure_session_db(),
fallback_model=fallback_model,
reasoning_config=reasoning_config,
)
return agent
+19 -7
@@ -2489,15 +2489,20 @@ class BasePlatformAdapter(ABC):
try:
response = await self._message_handler(event)
# Old adapter task (if any) is cancelled AFTER the runner has
# fully handled the command — keeps ordering deterministic.
await self.cancel_session_processing(
session_key,
release_guard=False,
discard_pending=False,
)
_text, _eph_ttl = self._unwrap_ephemeral(response)
# Send the response BEFORE cancelling the old task so the send
# cannot be affected by task-cancellation side effects (race
# condition fix — issue #18912). Previously the send happened
# after cancel_session_processing, which could silently drop the
# "/new" confirmation when an agent was actively running.
if _text:
logger.info(
"[%s] Sending command '/%s' response (%d chars) to %s",
self.name,
cmd,
len(_text),
event.source.chat_id,
)
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=_text,
@@ -2510,6 +2515,13 @@ class BasePlatformAdapter(ABC):
message_id=_r.message_id,
ttl_seconds=_eph_ttl,
)
# Old adapter task (if any) is cancelled AFTER the response has
# been sent — keeps ordering deterministic and avoids the race.
await self.cancel_session_processing(
session_key,
release_guard=False,
discard_pending=False,
)
except Exception:
# On failure, restore the original guard if one still exists so
# we don't leave the session in a half-reset state.
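The reordering in this hunk (send the command response first, cancel the old task afterwards) can be demonstrated with plain `asyncio`. This is an illustrative sketch only: `handle_command` and the event log are hypothetical and stand in for the adapter's handler, `_send_with_retry`, and `cancel_session_processing`.

```python
import asyncio
from typing import List


async def handle_command(events: List[str]) -> None:
    """Record the order of operations for a command that replaces a running task."""

    async def old_agent_task() -> None:
        try:
            await asyncio.sleep(3600)  # stand-in for a long-running agent turn
        except asyncio.CancelledError:
            events.append("old task cancelled")
            raise

    old = asyncio.create_task(old_agent_task())
    await asyncio.sleep(0)  # let the old task start running

    events.append("handler ran")
    events.append("response sent")  # 1. deliver the confirmation first
    old.cancel()                    # 2. only then cancel the old task
    try:
        await old
    except asyncio.CancelledError:
        pass  # expected: the old task was cancelled on purpose
```

Running it with `asyncio.run(handle_command(log))` leaves the log in send-then-cancel order, which is the property the fix relies on.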
+406 -16
@@ -497,6 +497,7 @@ class DiscordAdapter(BasePlatformAdapter):
self._ready_event = asyncio.Event()
self._allowed_user_ids: set = set() # For button approval authorization
self._allowed_role_ids: set = set() # For DISCORD_ALLOWED_ROLES filtering
self.gateway_runner = None # Set by gateway/run.py for cross-platform delivery
# Voice channel state (per-guild)
self._voice_clients: Dict[int, Any] = {} # guild_id -> VoiceClient
self._voice_locks: Dict[int, asyncio.Lock] = {} # guild_id -> serialize join/leave
@@ -1929,6 +1930,225 @@ class DiscordAdapter(BasePlatformAdapter):
return True
return False
# ── Slash command authorization ─────────────────────────────────────
# Slash commands (``_run_simple_slash`` and ``_handle_thread_create_slash``)
# are a separate Discord interaction surface from regular messages and
# historically ran with NO authorization check — bypassing every gate
# ``on_message`` enforces (DISCORD_ALLOWED_USERS, DISCORD_ALLOWED_ROLES,
# DISCORD_ALLOWED_CHANNELS, DISCORD_IGNORED_CHANNELS). Any guild member
# could invoke ``/background``, ``/restart``, ``/sethome``, etc. as the
# operator. ``_check_slash_authorization`` mirrors the on_message gates
# one-for-one so the slash surface honors the same trust boundary.
#
# By design, this is a no-op for deployments with no allowlist env vars
# set — ``_is_allowed_user`` returns True and the channel checks early-out
# — preserving the existing "single-tenant, all guild members trusted"
# default. Deployments that DO set any DISCORD_ALLOWED_* var get slash
# parity with on_message.
def _evaluate_slash_authorization(
self, interaction: "discord.Interaction",
) -> Tuple[bool, Optional[str]]:
"""Evaluate slash authorization without producing any response.
Returns ``(allowed, reason)``. ``reason`` is populated only when
``allowed`` is False. This is the shared core used by both the
responding wrapper (``_check_slash_authorization``) and side-effect-
free callers like the ``/skill`` autocomplete callback, which must
return an empty list for unauthorized users instead of leaking an
ephemeral rejection per-keystroke.
Fail-closed semantics for malformed payloads: when an allowlist is
configured but the interaction is missing the data needed to
evaluate it (no channel id with channel policy active, no user
with user/role policy active), the gate REJECTS rather than
falling through. Without these guards a guild interaction that
happens to deserialize without a channel id would silently bypass
``DISCORD_ALLOWED_CHANNELS`` and a payload missing ``user`` would
raise ``AttributeError`` in the user check below, surfacing as
an opaque interaction failure rather than a clean rejection.
"""
chan_obj = getattr(interaction, "channel", None)
in_dm = isinstance(chan_obj, discord.DMChannel) if chan_obj is not None else False
# ── Channel scope (mirrors on_message lines 3374-3388) ──
# DMs aren't channel-gated — DMs follow on_message's DM lockdown
# path which has its own user-allowlist enforcement.
if not in_dm:
chan_id_raw = getattr(interaction, "channel_id", None) or getattr(
chan_obj, "id", None,
)
channel_ids: set = set()
if chan_id_raw is not None:
channel_ids.add(str(chan_id_raw))
# Mirror on_message: also test the parent channel for threads
# so per-channel allow/deny lists work consistently.
if isinstance(chan_obj, discord.Thread):
parent_id = self._get_parent_channel_id(chan_obj)
if parent_id:
channel_ids.add(str(parent_id))
allowed_raw = os.getenv("DISCORD_ALLOWED_CHANNELS", "")
if allowed_raw:
allowed = {c.strip() for c in allowed_raw.split(",") if c.strip()}
if "*" not in allowed:
if not channel_ids:
# Channel policy is configured but the interaction
# has no resolvable channel id. Fail closed.
return (
False,
"channel id missing with DISCORD_ALLOWED_CHANNELS configured",
)
if not (channel_ids & allowed):
return (False, "channel not in DISCORD_ALLOWED_CHANNELS")
# Ignored beats allowed: even when a thread's parent channel
# is on the allowlist, an explicit DISCORD_IGNORED_CHANNELS
# entry on the thread or its parent rejects the interaction.
ignored_raw = os.getenv("DISCORD_IGNORED_CHANNELS", "")
if ignored_raw and channel_ids:
ignored = {c.strip() for c in ignored_raw.split(",") if c.strip()}
if "*" in ignored or (channel_ids & ignored):
return (False, "channel in DISCORD_IGNORED_CHANNELS")
# ── User / role allowlist (mirrors on_message line 681) ──
user = getattr(interaction, "user", None)
allowed_users = getattr(self, "_allowed_user_ids", set()) or set()
allowed_roles = getattr(self, "_allowed_role_ids", set()) or set()
if user is None or getattr(user, "id", None) is None:
# No identifiable user. With any user/role allowlist
# configured, fail closed rather than raise AttributeError
# on ``interaction.user.id`` below. With no allowlist this
# is the existing "no allowlist = everyone" backwards-compat.
if allowed_users or allowed_roles:
return (False, "missing interaction.user with allowlist configured")
return (True, None)
user_id = str(user.id)
if not self._is_allowed_user(user_id, author=user):
return (
False,
"user not in DISCORD_ALLOWED_USERS / DISCORD_ALLOWED_ROLES",
)
return (True, None)
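The channel half of the gate is easiest to reason about as a pure function over the candidate channel ids and the two raw env strings. The sketch below follows the logic shown above; `evaluate_channel_gate` and `_parse_ids` are hypothetical names, and the user/role half is left out.

```python
from typing import Optional, Set, Tuple


def _parse_ids(raw: str) -> Set[str]:
    """Split a comma-separated env value into a set of trimmed ids."""
    return {c.strip() for c in raw.split(",") if c.strip()}


def evaluate_channel_gate(
    channel_ids: Set[str], allowed_raw: str, ignored_raw: str
) -> Tuple[bool, Optional[str]]:
    """Return (allowed, reason); reason is set only on rejection."""
    if allowed_raw:
        allowed = _parse_ids(allowed_raw)
        if "*" not in allowed:
            if not channel_ids:
                # Policy configured but no resolvable channel: fail closed.
                return (False, "channel id missing with DISCORD_ALLOWED_CHANNELS configured")
            if not (channel_ids & allowed):
                return (False, "channel not in DISCORD_ALLOWED_CHANNELS")
    if ignored_raw and channel_ids:
        ignored = _parse_ids(ignored_raw)
        # Ignored beats allowed, for the thread itself or its parent.
        if "*" in ignored or (channel_ids & ignored):
            return (False, "channel in DISCORD_IGNORED_CHANNELS")
    return (True, None)
```

For a thread, `channel_ids` would hold both the thread id and its parent id, which is why an allowlisted parent can still be rejected by an ignore entry on the thread.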
async def _check_slash_authorization(
self, interaction: "discord.Interaction", command_text: str,
) -> bool:
"""Mirror on_message's user/role/channel gates onto a slash invocation.
Returns True to proceed. Returns False *after* sending an ephemeral
rejection, logging a warning, and scheduling a cross-platform admin
alert; the caller must stop on False (the interaction has already
been responded to).
"""
allowed, reason = self._evaluate_slash_authorization(interaction)
if allowed:
return True
return await self._reject_slash(
interaction, command_text, reason=reason or "unauthorized",
)
async def _reject_slash(
self, interaction: "discord.Interaction", command_text: str, *, reason: str,
) -> bool:
"""Send ephemeral reject + log warning + schedule admin alert. Returns False.
Tolerates a missing ``interaction.user`` -- the fail-closed branch
in ``_evaluate_slash_authorization`` deliberately routes here for
malformed payloads (no user) when an allowlist is configured, and
``str(interaction.user.id)`` would raise AttributeError before the
ephemeral rejection could be sent.
"""
user = getattr(interaction, "user", None)
if user is not None:
user_id = str(getattr(user, "id", "?"))
user_name = getattr(user, "name", "?")
else:
user_id = "?"
user_name = "?"
chan_id = getattr(interaction, "channel_id", None) or getattr(
getattr(interaction, "channel", None), "id", None,
)
guild_id = getattr(interaction, "guild_id", None)
logger.warning(
"[Discord] Unauthorized slash attempt: user=%s id=%s channel=%s "
"guild=%s cmd=%r reason=%r",
user_name, user_id, chan_id, guild_id, command_text, reason,
)
try:
await interaction.response.send_message(
"You're not authorized to use this command.",
ephemeral=True,
)
except Exception as e:
# Interaction may already be responded to (e.g. caller deferred
# before the auth check, or Discord retried). Best-effort only.
logger.debug("[Discord] Could not send unauthorized ephemeral: %s", e)
# Fire-and-forget: don't block the interaction handler on admin-alert I/O (Telegram, then Slack).
try:
asyncio.create_task(self._notify_unauthorized_slash(
user_name, user_id, chan_id, guild_id, command_text, reason,
))
except Exception as e:
logger.debug("[Discord] Could not schedule admin notify task: %s", e)
return False
async def _notify_unauthorized_slash(
self, user_name: str, user_id: str, chan_id, guild_id,
command_text: str, reason: str,
) -> None:
"""Best-effort cross-platform alert to the gateway operator.
Tries TELEGRAM first (most operators set TELEGRAM_HOME_CHANNEL),
then SLACK. Silently no-ops if no other platform is configured
with a home channel.
A soft send failure -- adapter.send() returning a result with
``success=False`` rather than raising -- continues the fallback
chain. Treating a SendResult(success=False) as delivered would
mean a Telegram outage that the adapter politely surfaces (e.g.
rate-limit, auth failure) silently swallows the alert without
attempting Slack. Hard exceptions still take the same path via
the except branch below.
"""
runner = getattr(self, "gateway_runner", None)
if not runner:
return
for target in (Platform.TELEGRAM, Platform.SLACK):
try:
adapter = runner.adapters.get(target)
if not adapter:
continue
home = runner.config.get_home_channel(target)
if not home or not getattr(home, "chat_id", None):
continue
msg = (
"⚠️ Unauthorized Discord slash attempt\n"
f"User: {user_name} ({user_id})\n"
f"Channel: {chan_id} (guild {guild_id})\n"
f"Command: {command_text}\n"
f"Reason: {reason}"
)
result = await adapter.send(str(home.chat_id), msg)
# Only return on confirmed delivery. SendResult(success=False)
# -> continue to the next platform.
if getattr(result, "success", None) is False:
logger.debug(
"[Discord] Admin notify via %s returned success=False"
" (error=%r); falling through",
target, getattr(result, "error", None),
)
continue
return
except Exception as e:
logger.debug("[Discord] Admin notify via %s failed: %s", target, e)
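The fallback rule (stop only on confirmed delivery; treat both `success=False` and exceptions as "try the next platform") can be sketched without any gateway objects. `SendResult` here is a hypothetical minimal shape, and `notify_first_available` is an illustrative helper, not the adapter method.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, List, Optional, Tuple


@dataclass
class SendResult:
    """Hypothetical minimal result shape with a success flag."""
    success: bool
    error: Optional[str] = None


Sender = Callable[[str], Awaitable[Any]]


async def notify_first_available(
    senders: List[Tuple[str, Sender]], msg: str
) -> Optional[str]:
    """Return the name of the first platform that delivered, else None."""
    for name, send in senders:
        try:
            result = await send(msg)
            # Soft failure: the adapter reported success=False without raising.
            if getattr(result, "success", None) is False:
                continue
            return name
        except Exception:
            # Hard failure: same outcome, move on to the next platform.
            continue
    return None
```

A result object without a `success` attribute counts as delivered, matching the `is False` comparison in the diff.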
async def send_image_file(
self,
chat_id: str,
@@ -2316,6 +2536,11 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception:
pass # logging must never block command dispatch
# Auth gate — must run before defer() so an ephemeral rejection can
# be delivered on the still-unresponded interaction.
if not await self._check_slash_authorization(interaction, command_text):
return
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, command_text)
await self.handle_message(event)
@@ -2460,7 +2685,8 @@ class DiscordAdapter(BasePlatformAdapter):
message: str = "",
auto_archive_duration: int = 1440,
):
await interaction.response.defer(ephemeral=True)
# defer() is performed inside the handler *after* the auth gate
# so a rejected invoker can receive an ephemeral rejection.
await self._handle_thread_create_slash(interaction, name, message, auto_archive_duration)
@tree.command(name="queue", description="Queue a prompt for the next turn (doesn't interrupt)")
@@ -2581,6 +2807,54 @@ class DiscordAdapter(BasePlatformAdapter):
# supporting up to 25 categories × 25 skills = 625 skills.
self._register_skill_group(tree)
# Optional defense-in-depth: hide every slash command from non-admin
# guild members in Discord's slash picker. Server-side authorization
# (``_check_slash_authorization``) is the actual gate; this is purely
# UX so users don't see commands they can't invoke. Off by default
# to preserve the slash UX for deployments that intentionally allow
# everyone in the guild.
if os.getenv("DISCORD_HIDE_SLASH_COMMANDS", "false").strip().lower() in (
"true", "1", "yes", "on",
):
self._apply_owner_only_visibility(tree)
def _apply_owner_only_visibility(self, tree) -> None:
"""Set default_member_permissions=0 on every registered slash command.
Discord interprets ``Permissions(0)`` as "requires no permissions",
which paradoxically means the command is hidden from every guild
member except those with the Administrator permission. Server admins
can re-grant per user/role via Server Settings -> Integrations ->
<bot> -> Permissions.
Authoritative gate is ``_check_slash_authorization`` on every
invocation, which catches stale clients, role grants made by
mistake, and direct API calls bypassing Discord's UI hide.
"""
try:
no_perms = discord.Permissions(0)
except Exception as e:
logger.warning(
"[Discord] _apply_owner_only_visibility: cannot build Permissions(0): %s",
e,
)
return
applied = 0
for cmd in tree.get_commands():
try:
cmd.default_permissions = no_perms
applied += 1
except Exception as e:
logger.debug(
"[Discord] Could not set default_permissions on %r: %s",
getattr(cmd, "name", "?"), e,
)
logger.info(
"[Discord] Hid %d slash command(s) from non-admin guild members "
"(opt-in defense in depth via DISCORD_HIDE_SLASH_COMMANDS).",
applied,
)
def _register_skill_group(self, tree) -> None:
"""Register a single ``/skill`` command with autocomplete on the name.
@@ -2635,9 +2909,25 @@ class DiscordAdapter(BasePlatformAdapter):
PDFs even if the name doesn't. Discord caps this list at
25 entries per query.
Authorization: a quiet pre-check evaluates the slash
allowlists and returns ``[]`` for unauthorized users so
the installed skill catalog is not leaked to anyone who
can see the command in the picker. Returning a generic
empty list here is intentional: sending a per-keystroke
ephemeral rejection would produce a barrage of error
popups during typing.
Reads ``self._skill_entries`` so a ``/reload-skills`` run
since process start shows up on the very next keystroke.
"""
try:
allowed, _reason = self._evaluate_slash_authorization(interaction)
except Exception:
# Defensive: never raise from autocomplete. Fail
# closed by returning an empty suggestion list.
return []
if not allowed:
return []
q = (current or "").strip().lower()
choices: list = []
for name, desc, _key in self._skill_entries:
@@ -2664,6 +2954,12 @@ class DiscordAdapter(BasePlatformAdapter):
async def _skill_handler(
interaction: "discord.Interaction", name: str, args: str = "",
):
# Authorize BEFORE any skill lookup so that known and
# unknown skill names produce identical rejections for
# unauthorized users (no probing the installed catalog
# via "Unknown skill: <name>" responses).
if not await self._check_slash_authorization(interaction, "/skill"):
return
entry = self._skill_lookup.get(name)
if not entry:
await interaction.response.send_message(
@@ -2811,6 +3107,9 @@ class DiscordAdapter(BasePlatformAdapter):
auto_archive_duration: int = 1440,
) -> None:
"""Create a Discord thread from a slash command and start a session in it."""
if not await self._check_slash_authorization(interaction, "/thread"):
return
await interaction.response.defer(ephemeral=True)
result = await self._create_thread(
interaction,
name=name,
@@ -3105,6 +3404,7 @@ class DiscordAdapter(BasePlatformAdapter):
view = ExecApprovalView(
session_key=session_key,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
@@ -3143,6 +3443,7 @@ class DiscordAdapter(BasePlatformAdapter):
session_key=session_key,
confirm_id=confirm_id,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
@@ -3177,6 +3478,7 @@ class DiscordAdapter(BasePlatformAdapter):
view = UpdatePromptView(
session_key=session_key,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
return SendResult(success=True, message_id=str(msg.id))
@@ -3234,6 +3536,7 @@ class DiscordAdapter(BasePlatformAdapter):
session_key=session_key,
on_model_selected=on_model_selected,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
@@ -3789,6 +4092,72 @@ class DiscordAdapter(BasePlatformAdapter):
# Discord UI Components (outside the adapter class)
# ---------------------------------------------------------------------------
def _component_check_auth(
interaction,
allowed_user_ids: Optional[set],
allowed_role_ids: Optional[set],
) -> bool:
"""Shared user-or-role OR semantics for component view button clicks.
Mirrors ``DiscordAdapter._is_allowed_user`` / the slash and on_message
gates so every Discord interaction surface honors the same trust
boundary. Component views (ExecApprovalView, SlashConfirmView,
UpdatePromptView, ModelPickerView) used to receive only
``allowed_user_ids``: in role-only deployments
(DISCORD_ALLOWED_ROLES set, DISCORD_ALLOWED_USERS empty) the user
set was empty and the legacy "no allowlist = allow everyone" branch
let any guild member click the buttons -- approving exec commands,
cancelling slash confirmations, switching the model.
Behavior:
- both allowlists empty -> allow (preserves existing no-allowlist
deployments, no regression)
- user is in user allowlist -> allow
- role allowlist set + user has a role in it -> allow
- role allowlist set + interaction.user has no resolvable
``roles`` attribute (e.g. DM context with a role policy active)
-> reject (fail closed)
- otherwise -> reject
"""
user_set = allowed_user_ids or set()
role_set = allowed_role_ids or set()
has_users = bool(user_set)
has_roles = bool(role_set)
if not has_users and not has_roles:
return True
user = getattr(interaction, "user", None)
if user is None:
return False
if has_users:
try:
uid = str(user.id)
except AttributeError:
uid = ""
if uid and uid in user_set:
return True
if has_roles:
roles_attr = getattr(user, "roles", None)
if roles_attr is None:
# Role policy is configured but the interaction doesn't
# carry role data (DM-context Member, raw User payload).
# Fail closed: a user without a resolvable role list cannot
# satisfy a role allowlist.
return False
try:
user_role_ids = {getattr(r, "id", None) for r in roles_attr}
except TypeError:
return False
if user_role_ids & role_set:
return True
return False
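The behavior table in the `_component_check_auth` docstring can be exercised with a condensed restatement of the same rule. This is a sketch under assumptions: `check_auth` is a hypothetical name, user IDs are compared as strings as in the hunk above, and role IDs are assumed to be stored as integers in the allowlist (the diff does not show how `allowed_role_ids` is populated).

```python
from types import SimpleNamespace
from typing import Optional


def check_auth(user, allowed_user_ids: Optional[set], allowed_role_ids: Optional[set]) -> bool:
    """Condensed user-or-role rule (illustrative restatement only)."""
    user_set = allowed_user_ids or set()
    role_set = allowed_role_ids or set()
    if not user_set and not role_set:
        return True  # no allowlist configured: open deployment
    if user is None:
        return False
    uid = str(getattr(user, "id", "") or "")
    if uid and uid in user_set:
        return True
    roles = getattr(user, "roles", None)
    if role_set and roles is not None:
        # Assumption: role IDs in the allowlist use the same type as role.id.
        return bool({getattr(r, "id", None) for r in roles} & role_set)
    return False  # role policy active but no role data, or no match


member = SimpleNamespace(id=1, roles=[SimpleNamespace(id=10)])
dm_user = SimpleNamespace(id=2)  # no "roles" attribute, as in a DM context

print(check_auth(member, None, None))
print(check_auth(member, set(), {10}))
print(check_auth(dm_user, set(), {10}))
```

The last call is the case the change closes: with a role-only allowlist, a user without resolvable roles is rejected instead of falling into the "no allowlist" branch.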
if DISCORD_AVAILABLE:
class ExecApprovalView(discord.ui.View):
@@ -3801,17 +4170,23 @@ if DISCORD_AVAILABLE:
Only users in the allowed list can click. Times out after 5 minutes.
"""
def __init__(self, session_key: str, allowed_user_ids: set):
def __init__(
self,
session_key: str,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=300) # 5-minute timeout
self.session_key = session_key
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
def _check_auth(self, interaction: discord.Interaction) -> bool:
"""Verify the user clicking is authorized."""
if not self.allowed_user_ids:
return True # No allowlist = anyone can approve
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
async def _resolve(
self, interaction: discord.Interaction, choice: str,
@@ -3903,17 +4278,24 @@ if DISCORD_AVAILABLE:
5 minutes (matches the gateway primitive's timeout).
"""
def __init__(self, session_key: str, confirm_id: str, allowed_user_ids: set):
def __init__(
self,
session_key: str,
confirm_id: str,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=300)
self.session_key = session_key
self.confirm_id = confirm_id
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
def _check_auth(self, interaction: discord.Interaction) -> bool:
if not self.allowed_user_ids:
return True
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
async def _resolve(
self, interaction: discord.Interaction, choice: str,
@@ -3991,16 +4373,22 @@ if DISCORD_AVAILABLE:
5-minute timeout on its side).
"""
def __init__(self, session_key: str, allowed_user_ids: set):
def __init__(
self,
session_key: str,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=300)
self.session_key = session_key
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
def _check_auth(self, interaction: discord.Interaction) -> bool:
if not self.allowed_user_ids:
return True
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
async def _respond(
self, interaction: discord.Interaction, answer: str,
@@ -4077,6 +4465,7 @@ if DISCORD_AVAILABLE:
session_key: str,
on_model_selected,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=120)
self.providers = providers
@@ -4085,15 +4474,16 @@ if DISCORD_AVAILABLE:
self.session_key = session_key
self.on_model_selected = on_model_selected
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
self._selected_provider: str = ""
self._build_provider_select()
def _check_auth(self, interaction: discord.Interaction) -> bool:
if not self.allowed_user_ids:
return True
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
def _build_provider_select(self):
"""Build the provider dropdown menu."""
+1 -1
@@ -139,7 +139,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):
async def _ws_connect(self) -> bool:
"""Establish WebSocket connection and authenticate."""
ws_url = self._hass_url.replace("http://", "ws://").replace("https://", "wss://")
ws_url = self._hass_url.replace("https://", "wss://").replace("http://", "ws://")
ws_url = f"{ws_url}/api/websocket"
self._session = aiohttp.ClientSession(
+31
@@ -192,6 +192,15 @@ class SignalAdapter(BasePlatformAdapter):
group_allowed_str = os.getenv("SIGNAL_GROUP_ALLOWED_USERS", "")
self.group_allow_from = set(_parse_comma_list(group_allowed_str))
# DM allowlist — mirrors SIGNAL_ALLOWED_USERS checked by run.py.
# Stored here so the reaction hooks can skip unauthorized senders
# (reactions fire before run.py's auth gate, so without this check
# every inbound DM from any contact gets a 👀 reaction).
# "*" means all users allowed (open mode) and is the default when the
# variable is unset (run.py still enforces auth separately).
dm_allowed_str = os.getenv("SIGNAL_ALLOWED_USERS", "*")
self.dm_allow_from = set(_parse_comma_list(dm_allowed_str))
# HTTP client
self.client: Optional[httpx.AsyncClient] = None
@@ -1430,8 +1439,28 @@ class SignalAdapter(BasePlatformAdapter):
return None
return (author, ts)
def _reactions_enabled(self, event: "MessageEvent" = None) -> bool:
"""Check if message reactions are enabled for this event.
Two gates:
1. SIGNAL_REACTIONS env var set to false/0/no to disable globally.
2. DM allowlist: if SIGNAL_ALLOWED_USERS is set, only react to
messages from senders in that list. This prevents unauthorized
contacts from seeing the 👀 reaction (which fires before run.py's
auth gate and would otherwise reveal that a bot is listening).
"""
if os.getenv("SIGNAL_REACTIONS", "true").lower() in ("false", "0", "no"):
return False
if event is not None:
sender = getattr(getattr(event, "source", None), "user_id", None)
if sender and "*" not in self.dm_allow_from and sender not in self.dm_allow_from:
return False
return True
async def on_processing_start(self, event: MessageEvent) -> None:
"""React with 👀 when processing begins."""
if not self._reactions_enabled(event):
return
target = self._extract_reaction_target(event)
if target:
await self.send_reaction(event.source.chat_id, "👀", *target)
@@ -1442,6 +1471,8 @@ class SignalAdapter(BasePlatformAdapter):
On CANCELLED we leave the 👀 in place: no terminal outcome means
the reaction should keep reflecting "in progress" (matches Telegram).
"""
if not self._reactions_enabled(event):
return
if outcome == ProcessingOutcome.CANCELLED:
return
target = self._extract_reaction_target(event)
+15
@@ -528,6 +528,21 @@ class SlackAdapter(BasePlatformAdapter):
return False
lock_acquired = True
# Close any previous handler before creating a new one so that
# calling connect() a second time (e.g. during a gateway restart or
# in-process reconnect attempt) does not leave a zombie Socket Mode
# connection alive. Both the old and new connections would otherwise
# receive every Slack event and dispatch it twice, producing double
# responses — the same bug that affected DiscordAdapter (#18187).
if self._handler is not None:
try:
await self._handler.close_async()
except Exception:
logger.debug("[%s] Failed to close previous Slack handler", self.name)
finally:
self._handler = None
self._app = None
# First token is the primary — used for AsyncApp / Socket Mode
primary_token = bot_tokens[0]
self._app = AsyncApp(token=primary_token)
+321 -40
@@ -49,6 +49,29 @@ from hermes_cli.config import cfg_get
_AGENT_CACHE_MAX_SIZE = 128
_AGENT_CACHE_IDLE_TTL_SECS = 3600.0 # evict agents idle for >1h
_PLATFORM_CONNECT_TIMEOUT_SECS_DEFAULT = 30.0
_TELEGRAM_COMMAND_MENTION_RE = re.compile(r"(?<![\w:/])/([A-Za-z0-9][A-Za-z0-9_-]*)")
def _telegramize_command_mentions(text: str, platform: Any) -> str:
"""Rewrite slash-command mentions to Telegram-valid command names.
Telegram Bot API command names allow only lowercase letters, digits, and
underscores. Keep other platform renderings unchanged, but normalize
Telegram help text so command mentions remain clickable/valid there.
"""
platform_value = getattr(platform, "value", platform)
if platform_value != "telegram":
return text
from hermes_cli.commands import _sanitize_telegram_name
def _replace(match: re.Match[str]) -> str:
sanitized = _sanitize_telegram_name(match.group(1))
return f"/{sanitized}" if sanitized else match.group(0)
return _TELEGRAM_COMMAND_MENTION_RE.sub(_replace, text)
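To see what the lookbehind in `_TELEGRAM_COMMAND_MENTION_RE` does and does not match, here is a small standalone sketch. The regex is copied from the hunk above; `sanitize_name` is a hypothetical stand-in for `_sanitize_telegram_name` (whose real behavior is not shown in this diff) that lowercases, maps `-` to `_`, and drops anything else Telegram would reject.

```python
import re

# Same pattern as in the diff: a "/" not preceded by a word char, ":" or "/".
COMMAND_MENTION_RE = re.compile(r"(?<![\w:/])/([A-Za-z0-9][A-Za-z0-9_-]*)")


def sanitize_name(name: str) -> str:
    """Hypothetical stand-in: lowercase, '-' -> '_', strip other characters."""
    return re.sub(r"[^a-z0-9_]", "", name.lower().replace("-", "_"))


def rewrite_mentions(text: str) -> str:
    """Rewrite slash-command mentions while leaving URLs and paths alone."""
    def _replace(match: re.Match) -> str:
        sanitized = sanitize_name(match.group(1))
        # Keep the original mention if sanitizing leaves nothing usable.
        return f"/{sanitized}" if sanitized else match.group(0)

    return COMMAND_MENTION_RE.sub(_replace, text)


sample = "Run /reload-skills, see https://example.com/some-page or docs/setup-guide."
print(rewrite_mentions(sample))
```

Only the standalone `/reload-skills` mention is rewritten; the slashes inside the URL and the relative path are preceded by `:`, `/`, or a word character, so the negative lookbehind rejects them.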
# Only auto-continue interrupted gateway turns while the interruption is fresh.
# Stale tool-tail/resume markers can otherwise revive an unrelated old task
# after a gateway restart when the user's next message starts new work.
@@ -283,6 +306,20 @@ def _home_target_env_var(platform_name: str) -> str:
)
def _home_thread_env_var(platform_name: str) -> str:
"""Return the optional thread/topic env var for a platform home target."""
return f"{_home_target_env_var(platform_name)}_THREAD_ID"
def _restart_notification_pending() -> bool:
"""Return True when a /restart completion marker is waiting to be delivered."""
return (_hermes_home / ".restart_notify.json").exists()
# Mark this process as a gateway so cli.py's module-level load_cli_config()
# knows not to clobber TERMINAL_CWD if lazily imported.
os.environ["_HERMES_GATEWAY"] = "1"
_ensure_ssl_certs()
# Add parent directory to path
@@ -507,6 +544,8 @@ from gateway.config import (
Platform,
_BUILTIN_PLATFORM_VALUES,
GatewayConfig,
HomeChannel,
PlatformConfig,
load_gateway_config,
)
from gateway.session import (
@@ -1149,6 +1188,10 @@ class GatewayRunner:
# Per-chat voice reply mode: "off" | "voice_only" | "all"
self._voice_mode: Dict[str, str] = self._load_voice_modes()
# Recent voice transcripts per (guild,user) for duplicate suppression.
# Protects against the same utterance being emitted twice by the voice
# capture / STT pipeline, which otherwise produces a second delayed reply.
self._recent_voice_transcripts: Dict[tuple[int, int], List[tuple[float, str]]] = {}
# Track background tasks to prevent garbage collection mid-execution
self._background_tasks: set = set()
@@ -2257,15 +2300,13 @@ class GatewayRunner:
logger.debug("Failed interrupting agent during shutdown: %s", e)
async def _notify_active_sessions_of_shutdown(self) -> None:
"""Send a notification to every chat with an active agent.
"""Send shutdown/restart notifications to active chats and home channels.
Called at the very start of stop(), while adapters are still connected, so
messages can be delivered. Best-effort: individual send failures are
messages can be delivered. Best-effort: individual send failures are
logged and swallowed so they never block the shutdown sequence.
"""
active = self._snapshot_running_agents()
if not active:
return
action = "restarting" if self._restart_requested else "shutting down"
hint = (
@@ -2276,7 +2317,7 @@ class GatewayRunner:
)
msg = f"⚠️ Gateway {action}{hint}"
notified: set = set()
notified: set[tuple[str, str, Optional[str]]] = set()
for session_key in active:
source = None
try:
@@ -2293,7 +2334,7 @@ class GatewayRunner:
if source is not None:
platform_str = source.platform.value
chat_id = source.chat_id
chat_id = str(source.chat_id)
thread_id = source.thread_id
else:
# Fall back to parsing the session key when no persisted
@@ -2305,9 +2346,10 @@ class GatewayRunner:
chat_id = _parsed["chat_id"]
thread_id = _parsed.get("thread_id")
# Deduplicate: one notification per chat, even if multiple
# sessions (different users/threads) share the same chat.
dedup_key = (platform_str, chat_id)
# Deduplicate only identical delivery targets. Thread/topic-aware
# platforms can share a parent chat while still routing to distinct
# destinations via metadata.
dedup_key = (platform_str, chat_id, str(thread_id) if thread_id else None)
if dedup_key in notified:
continue
@@ -2321,10 +2363,19 @@ class GatewayRunner:
# correct forum topic / thread.
metadata = {"thread_id": thread_id} if thread_id else None
await adapter.send(chat_id, msg, metadata=metadata)
result = await adapter.send(chat_id, msg, metadata=metadata)
if result is not None and getattr(result, "success", True) is False:
logger.debug(
"Failed to send shutdown notification to %s:%s: %s",
platform_str,
chat_id,
getattr(result, "error", "send returned success=False"),
)
continue
notified.add(dedup_key)
logger.info(
"Sent shutdown notification to %s:%s",
"Sent shutdown notification to active chat %s:%s",
platform_str, chat_id,
)
except Exception as e:
@@ -2333,6 +2384,44 @@ class GatewayRunner:
platform_str, chat_id, e,
)
for platform, adapter in self.adapters.items():
home = self.config.get_home_channel(platform)
if not home or not home.chat_id:
continue
dedup_key = (platform.value, str(home.chat_id), str(home.thread_id) if home.thread_id else None)
if dedup_key in notified:
continue
try:
metadata = {"thread_id": home.thread_id} if home.thread_id else None
if metadata:
result = await adapter.send(str(home.chat_id), msg, metadata=metadata)
else:
result = await adapter.send(str(home.chat_id), msg)
if result is not None and getattr(result, "success", True) is False:
logger.debug(
"Failed to send shutdown notification to home channel %s:%s: %s",
platform.value,
home.chat_id,
getattr(result, "error", "send returned success=False"),
)
continue
notified.add(dedup_key)
logger.info(
"Sent shutdown notification to home channel %s:%s",
platform.value,
home.chat_id,
)
except Exception as e:
logger.debug(
"Failed to send shutdown notification to home channel %s:%s: %s",
platform.value,
home.chat_id,
e,
)
def _finalize_shutdown_agents(self, active_agents: Dict[str, Any]) -> None:
for agent in active_agents.values():
try:
@@ -2755,7 +2844,7 @@ class GatewayRunner:
try:
suspended = self.session_store.suspend_recently_active()
if suspended:
logger.info("Suspended %d in-flight session(s) from previous run", suspended)
logger.info("Marked %d in-flight session(s) as resumable from previous run", suspended)
except Exception as e:
logger.warning("Session suspension on startup failed: %s", e)
@@ -2953,8 +3042,28 @@ class GatewayRunner:
):
self._schedule_update_notification_watch()
# Give freshly connected platform adapters a brief moment to settle
# before sending restart/startup lifecycle messages. In practice this
# helps Discord thread deliveries right after reconnect.
if connected_count > 0:
await asyncio.sleep(1.0)
# Notify the chat that initiated /restart that the gateway is back.
await self._send_restart_notification()
restart_notification_pending = _restart_notification_pending()
delivered_restart_target = await self._send_restart_notification()
# Broadcast a lightweight "gateway is back" message to configured
# home channels only when this startup is resuming from /restart. If a
# /restart requester already received a direct completion notice in the
# same chat, skip the generic broadcast there to avoid duplicates while
# still allowing a home-channel fallback when the direct send fails.
if restart_notification_pending or delivered_restart_target is not None:
skip_home_targets = (
{delivered_restart_target} if delivered_restart_target else None
)
await self._send_home_channel_startup_notifications(
skip_targets=skip_home_targets,
)
# Drain any recovered process watchers (from crash recovery checkpoint)
try:
@@ -3982,7 +4091,9 @@ class GatewayRunner:
if not check_discord_requirements():
logger.warning("Discord: discord.py not installed")
return None
return DiscordAdapter(config)
adapter = DiscordAdapter(config)
adapter.gateway_runner = self # For cross-platform admin alerts on unauthorized slash
return adapter
elif platform == Platform.WHATSAPP:
from gateway.platforms.whatsapp import WhatsAppAdapter, check_whatsapp_requirements
@@ -5027,6 +5138,28 @@ class GatewayRunner:
_cmd_def = _resolve_cmd(command) if command else None
canonical = _cmd_def.name if _cmd_def else command
# Expand alias quick commands before built-in dispatch so targets like
# /model openai/gpt-5.5 --provider openrouter reach the /model handler.
# Preserve built-in precedence; aliases only need early handling when
# the typed command is not already known.
if command and _cmd_def is None:
if isinstance(self.config, dict):
quick_commands = self.config.get("quick_commands", {}) or {}
else:
quick_commands = getattr(self.config, "quick_commands", {}) or {}
if isinstance(quick_commands, dict) and command in quick_commands:
qcmd = quick_commands[command]
if qcmd.get("type") == "alias":
target = qcmd.get("target", "").strip()
if target:
target = target if target.startswith("/") else f"/{target}"
target_command = target.lstrip("/")
user_args = event.get_command_args().strip()
event.text = f"{target} {user_args}".strip()
command = target_command.split()[0] if target_command else target_command
_cmd_def = _resolve_cmd(command) if command else None
canonical = _cmd_def.name if _cmd_def else command
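The alias expansion above, including why only the first token of the target becomes the command name, can be sketched as a pure function. This is an illustration under assumptions: `expand_alias` is a hypothetical helper, and the `quick_commands` shape (`type` and `target` keys) is inferred from the hunk above.

```python
from typing import Tuple


def expand_alias(command: str, user_args: str, quick_commands: dict) -> Tuple[str, str]:
    """Return (command, event_text) after expanding an alias quick command."""
    original_text = f"/{command} {user_args}".strip()
    qcmd = quick_commands.get(command)
    if not isinstance(qcmd, dict) or qcmd.get("type") != "alias":
        return command, original_text
    target = (qcmd.get("target") or "").strip()
    if not target:
        return command, original_text
    target = target if target.startswith("/") else f"/{target}"
    target_command = target.lstrip("/")
    text = f"{target} {user_args}".strip()
    # Targets may carry their own arguments, so dispatch on the first token only.
    name = target_command.split()[0] if target_command else target_command
    return name, text


quick = {"gpt": {"type": "alias", "target": "model openai/gpt-5.5 --provider openrouter"}}
print(expand_alias("gpt", "--global", quick))
```

Without the `split()[0]` step the command name would be the whole target string, which would not resolve to the `/model` handler.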
# Fire the ``command:<canonical>`` hook for any recognized slash
# command — built-in OR plugin-registered. Handlers can return a
# dict with ``{"decision": "deny" | "handled" | "rewrite", ...}``
@@ -5240,7 +5373,7 @@ class GatewayRunner:
target_command = target.lstrip("/")
user_args = event.get_command_args().strip()
event.text = f"{target} {user_args}".strip()
command = target_command
command = target_command.split()[0] if target_command else target_command
# Fall through to normal command dispatch below
else:
return f"Quick command '/{command}' has no target defined."
@@ -7218,7 +7351,10 @@ class GatewayRunner:
lines.append(f"\n... and {len(sorted_cmds) - 10} more. Use `/commands` for the full paginated list.")
except Exception:
pass
return "\n".join(lines)
return _telegramize_command_mentions(
"\n".join(lines),
getattr(getattr(event, "source", None), "platform", None),
)
async def _handle_commands_command(self, event: MessageEvent) -> str:
"""Handle /commands [page] - paginated list of all commands and skills."""
@@ -7271,7 +7407,10 @@ class GatewayRunner:
lines.extend(["", " | ".join(nav_parts)])
if page != requested_page:
lines.append(f"_(Requested page {requested_page} was out of range, showing page {page}.)_")
return "\n".join(lines)
return _telegramize_command_mentions(
"\n".join(lines),
getattr(getattr(event, "source", None), "platform", None),
)
async def _handle_model_command(self, event: MessageEvent) -> Optional[str]:
"""Handle /model command — switch model for this session.
@@ -7885,24 +8024,33 @@ class GatewayRunner:
msg = decision.get("message") or ""
# Send the status line back to the user so they see the judge's
# verdict. Fire-and-forget via the adapter.
# verdict. Fire-and-forget via the adapter's ``send()`` method —
# adapters expose ``send(chat_id, content, reply_to, metadata)``,
# not a ``send_message(source, msg)`` wrapper, so an earlier
# ``hasattr(adapter, "send_message")`` gate here was dead code and
# users never saw ``✓ Goal achieved`` / ``⏸ budget exhausted``
# verdicts.
if msg and source is not None:
try:
adapter = self.adapters.get(source.platform)
if adapter and hasattr(adapter, "send_message"):
if adapter is not None and hasattr(adapter, "send"):
import asyncio as _asyncio
coro = adapter.send_message(source, msg)
thread_meta = (
{"thread_id": source.thread_id} if source.thread_id else None
)
coro = adapter.send(
chat_id=source.chat_id,
content=msg,
metadata=thread_meta,
)
if _asyncio.iscoroutine(coro):
try:
loop = _asyncio.get_event_loop()
if loop.is_running():
loop.create_task(coro)
else:
loop.run_until_complete(coro)
loop = _asyncio.get_running_loop()
loop.create_task(coro)
except RuntimeError:
# No event loop in this thread — schedule on the main one.
# No running loop in this thread — best effort.
try:
_asyncio.run_coroutine_threadsafe(coro, self._loop)
_asyncio.run(coro)
except Exception:
pass
except Exception as exc:
@@ -7965,14 +8113,33 @@ class GatewayRunner:
chat_name = source.chat_name or chat_id
env_key = _home_target_env_var(platform_name)
thread_env_key = _home_thread_env_var(platform_name)
thread_id = source.thread_id
# Save to .env so it persists across restarts
try:
from hermes_cli.config import save_env_value
save_env_value(env_key, str(chat_id))
# Keep thread/topic routing explicit and clear stale values when
# /sethome is run from the parent chat instead of a thread.
save_env_value(thread_env_key, str(thread_id or ""))
except Exception as e:
return f"Failed to save home channel: {e}"
# Keep the running gateway config in sync too. The pre-restart
# notification path reads self.config before the process reloads env.
if source.platform:
platform_config = self.config.platforms.setdefault(
source.platform,
PlatformConfig(enabled=True),
)
platform_config.home_channel = HomeChannel(
platform=source.platform,
chat_id=str(chat_id),
name=chat_name,
thread_id=str(thread_id) if thread_id else None,
)
return (
f"✅ Home channel set to **{chat_name}** (ID: {chat_id}).\n"
f"Cron jobs and cross-platform messages will be delivered here."
@@ -8153,6 +8320,47 @@ class GatewayRunner:
adapter = self.adapters.get(Platform.DISCORD)
self._set_adapter_auto_tts_disabled(adapter, chat_id, disabled=True)
def _is_duplicate_voice_transcript(self, guild_id: int, user_id: int, transcript: str) -> bool:
"""Suppress repeated STT outputs for the same recent utterance.
Voice capture can occasionally emit the same utterance twice a few
seconds apart, which creates a second queued agent run and overlapping
spoken replies. Dedup exact and near-exact repeats per guild/user over a
short window while allowing genuinely new turns through.
"""
from difflib import SequenceMatcher
normalized = re.sub(r"\s+", " ", transcript).strip().lower()
normalized = re.sub(r"[^\w\s]", "", normalized)
if not normalized:
return False
now = time.monotonic()
window_seconds = 12.0
key = (guild_id, user_id)
recent_store = getattr(self, "_recent_voice_transcripts", None)
if not isinstance(recent_store, dict):
recent_store = {}
self._recent_voice_transcripts = recent_store
recent = [
(ts, txt)
for ts, txt in recent_store.get(key, [])
if now - ts <= window_seconds
]
for _, prior in recent:
if prior == normalized:
recent_store[key] = recent
return True
if len(prior) >= 16 and len(normalized) >= 16:
if SequenceMatcher(None, prior, normalized).ratio() >= 0.95:
recent_store[key] = recent
return True
recent.append((now, normalized))
recent_store[key] = recent[-5:]
return False
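The windowed deduplication above is easier to follow with the clock made explicit. The sketch below restates the same steps (normalize, prune by window, exact or near-exact match, keep the last few entries) in a hypothetical `TranscriptDeduper` class that takes `now` as a parameter instead of calling `time.monotonic()`; the constants mirror the values in the hunk.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


class TranscriptDeduper:
    """Illustrative per-(guild, user) duplicate filter with an injected clock."""

    def __init__(self, window_seconds: float = 12.0, keep: int = 5, threshold: float = 0.95):
        self.window = window_seconds
        self.keep = keep
        self.threshold = threshold
        self._recent: Dict[Tuple[int, int], List[Tuple[float, str]]] = {}

    @staticmethod
    def normalize(text: str) -> str:
        # Collapse whitespace, lowercase, then drop punctuation.
        text = re.sub(r"\s+", " ", text).strip().lower()
        return re.sub(r"[^\w\s]", "", text)

    def is_duplicate(self, key: Tuple[int, int], transcript: str, now: float) -> bool:
        normalized = self.normalize(transcript)
        if not normalized:
            return False
        # Drop entries that have aged out of the window.
        recent = [(ts, txt) for ts, txt in self._recent.get(key, []) if now - ts <= self.window]
        for _, prior in recent:
            near = (
                len(prior) >= 16
                and len(normalized) >= 16
                and SequenceMatcher(None, prior, normalized).ratio() >= self.threshold
            )
            if prior == normalized or near:
                self._recent[key] = recent
                return True
        recent.append((now, normalized))
        self._recent[key] = recent[-self.keep:]
        return False


deduper = TranscriptDeduper()
first = deduper.is_duplicate((1, 2), "Turn on the kitchen lights, please.", now=0.0)
repeat = deduper.is_duplicate((1, 2), "turn on the kitchen lights please", now=3.0)
later = deduper.is_duplicate((1, 2), "turn on the kitchen lights please", now=20.0)
print(first, repeat, later)
```

Note that a suppressed duplicate does not refresh the stored timestamp, so the window is measured from the first occurrence of an utterance, not from the most recent repeat.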
async def _handle_voice_channel_input(
self, guild_id: int, user_id: int, transcript: str
):
@@ -8190,6 +8398,15 @@ class GatewayRunner:
logger.debug("Unauthorized voice input from user %d, ignoring", user_id)
return
if self._is_duplicate_voice_transcript(guild_id, user_id, transcript):
logger.info(
"Suppressing duplicate voice transcript for guild=%s user=%s: %s",
guild_id,
user_id,
transcript[:100],
)
return
# Show transcript in text channel (after auth, with mention sanitization)
try:
channel = adapter._client.get_channel(text_ch_id)
@@ -10456,11 +10673,11 @@ class GatewayRunner:
return True
async def _send_restart_notification(self) -> None:
async def _send_restart_notification(self) -> Optional[tuple[str, str, Optional[str]]]:
"""Notify the chat that initiated /restart that the gateway is back."""
notify_path = _hermes_home / ".restart_notify.json"
if not notify_path.exists():
return
return None
try:
data = json.loads(notify_path.read_text())
@@ -10469,7 +10686,7 @@ class GatewayRunner:
thread_id = data.get("thread_id")
if not platform_str or not chat_id:
return
return None
platform = Platform(platform_str)
adapter = self.adapters.get(platform)
@@ -10478,11 +10695,11 @@ class GatewayRunner:
"Restart notification skipped: %s adapter not connected",
platform_str,
)
return
return None
metadata = {"thread_id": thread_id} if thread_id else None
result = await adapter.send(
chat_id,
str(chat_id),
"♻ Gateway restarted successfully. Your session continues.",
metadata=metadata,
)
@@ -10490,24 +10707,82 @@ class GatewayRunner:
# and returns SendResult(success=False) rather than raising, so
# we must inspect the result before claiming success — otherwise
# the log line is misleading and hides real delivery failures.
if getattr(result, "success", False):
logger.info(
"Sent restart notification to %s:%s",
platform_str,
chat_id,
)
else:
if result is not None and getattr(result, "success", True) is False:
logger.warning(
"Restart notification to %s:%s was not delivered: %s",
platform_str,
chat_id,
getattr(result, "error", "unknown error"),
getattr(result, "error", "send returned success=False"),
)
return None
logger.info(
"Sent restart notification to %s:%s",
platform_str,
chat_id,
)
return str(platform_str), str(chat_id), str(thread_id) if thread_id else None
except Exception as e:
logger.warning("Restart notification failed: %s", e)
return None
finally:
notify_path.unlink(missing_ok=True)
async def _send_home_channel_startup_notifications(
self,
*,
skip_targets: Optional[set[tuple[str, str, Optional[str]]]] = None,
) -> set[tuple[str, str, Optional[str]]]:
"""Notify configured home channels that the gateway is back online.
The notification is best-effort and sent once per connected platform
home channel. ``skip_targets`` lets startup avoid duplicate messages
when a more specific restart notification is queued for the same chat.
"""
delivered: set[tuple[str, str, Optional[str]]] = set()
skipped = skip_targets or set()
message = "♻️ Gateway online — Hermes is back and ready."
for platform, adapter in self.adapters.items():
home = self.config.get_home_channel(platform)
if not home or not home.chat_id:
continue
target = (platform.value, str(home.chat_id), str(home.thread_id) if home.thread_id else None)
if target in skipped or target in delivered:
continue
try:
metadata = {"thread_id": home.thread_id} if home.thread_id else None
if metadata:
result = await adapter.send(str(home.chat_id), message, metadata=metadata)
else:
result = await adapter.send(str(home.chat_id), message)
if result is not None and getattr(result, "success", True) is False:
logger.warning(
"Home-channel startup notification failed for %s:%s: %s",
platform.value,
home.chat_id,
getattr(result, "error", "send returned success=False"),
)
continue
delivered.add(target)
logger.info(
"Sent home-channel startup notification to %s:%s",
platform.value,
home.chat_id,
)
except Exception as exc:
logger.warning(
"Home-channel startup notification failed for %s:%s: %s",
platform.value,
home.chat_id,
exc,
)
return delivered
def _set_session_env(self, context: SessionContext) -> list:
"""Set session context variables for the current async task.
@@ -11145,6 +11420,12 @@ class GatewayRunner:
if not session_key:
return
pending_skills_reload_notes = getattr(
self, "_pending_skills_reload_notes", None
)
if isinstance(pending_skills_reload_notes, dict):
pending_skills_reload_notes.pop(session_key, None)
pending_approvals = getattr(self, "_pending_approvals", None)
if isinstance(pending_approvals, dict):
pending_approvals.pop(session_key, None)
+17 -12
@@ -1086,19 +1086,22 @@ class SessionStore:
return len(removed_keys)
def suspend_recently_active(self, max_age_seconds: int = 120) -> int:
"""Mark recently-active sessions as suspended.
"""Mark recently-active sessions as resumable after an unexpected exit.
Called on gateway startup to prevent sessions that were likely
in-flight when the gateway last exited from being blindly resumed
(#7536). Only suspends sessions updated within *max_age_seconds*
to avoid resetting long-idle sessions that are harmless to resume.
Returns the number of sessions that were suspended.
Called on gateway startup after a crash or fast restart to preserve
in-flight sessions instead of destroying their conversation history
(#7536). Only marks sessions updated within *max_age_seconds* to
avoid touching long-idle sessions. Sets ``resume_pending=True`` so
the next incoming message on the same session_key auto-resumes from
the existing transcript.
Entries flagged ``resume_pending=True`` are skipped; those were
marked intentionally by the drain-timeout path as recoverable.
Terminal escalation for genuinely stuck ``resume_pending`` sessions
is handled by the existing ``.restart_failure_counts`` stuck-loop
counter, which runs after this method on startup.
Entries already flagged ``resume_pending=True`` are skipped. Entries
explicitly ``suspended=True`` (from /stop or stuck-loop escalation)
are also skipped. Terminal escalation for genuinely stuck sessions
is still handled by the existing ``.restart_failure_counts`` counter
(threshold 3), which runs after this method and sets ``suspended=True``.
Returns the number of sessions marked resumable.
"""
from datetime import timedelta
@@ -1110,7 +1113,9 @@ class SessionStore:
if entry.resume_pending:
continue
if not entry.suspended and entry.updated_at >= cutoff:
entry.suspended = True
entry.resume_pending = True
entry.resume_reason = "restart_interrupted"
entry.last_resume_marked_at = _now()
count += 1
if count:
self._save()
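The marking rule described in the new docstring can be sketched in isolation. This is a minimal illustration, not the SessionStore code itself: the `Entry` dataclass and the `mark_recently_active` helper are hypothetical stand-ins, and only the field names (`resume_pending`, `suspended`, `resume_reason`, `last_resume_marked_at`) come from the diff above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Entry:
    """Hypothetical stand-in for a session entry (not the real class)."""
    updated_at: datetime
    suspended: bool = False
    resume_pending: bool = False
    resume_reason: Optional[str] = None
    last_resume_marked_at: Optional[datetime] = None


def mark_recently_active(entries, now, max_age_seconds=120):
    """Flag recently-updated sessions as resumable and return the count."""
    cutoff = now - timedelta(seconds=max_age_seconds)
    count = 0
    for entry in entries:
        # Already marked (e.g. by the drain-timeout path): leave untouched.
        if entry.resume_pending:
            continue
        # Explicitly suspended sessions and long-idle sessions are skipped.
        if not entry.suspended and entry.updated_at >= cutoff:
            entry.resume_pending = True
            entry.resume_reason = "restart_interrupted"
            entry.last_resume_marked_at = now
            count += 1
    return count
```

Passing `now` explicitly is a choice made for this sketch so the cutoff is easy to test; the real method appears to use its own `_now()` helper.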
+33 -1
@@ -5,11 +5,43 @@ Provides subcommands for:
- hermes chat - Interactive chat (same as ./hermes)
- hermes gateway - Run gateway in foreground
- hermes gateway start - Start gateway service
- hermes gateway stop - Stop gateway service
- hermes gateway stop - Stop gateway service
- hermes setup - Interactive setup wizard
- hermes status - Show status of all components
- hermes cron - Manage cron jobs
"""
import os
import sys
__version__ = "0.12.0"
__release_date__ = "2026.4.30"
def _ensure_utf8():
"""Force UTF-8 stdout/stderr on Windows to prevent UnicodeEncodeError.
Windows services and terminals default to cp1252, which cannot encode
box-drawing characters used in CLI output. This causes unhandled
UnicodeEncodeError crashes on gateway startup.
"""
if sys.platform != "win32":
return
os.environ.setdefault("PYTHONUTF8", "1")
os.environ.setdefault("PYTHONIOENCODING", "utf-8")
for stream_name in ("stdout", "stderr"):
stream = getattr(sys, stream_name, None)
if stream is None:
continue
try:
if getattr(stream, "encoding", "").lower().replace("-", "") != "utf8":
new_stream = open(
stream.fileno(), "w", encoding="utf-8",
buffering=1, closefd=False,
)
setattr(sys, stream_name, new_stream)
except (AttributeError, OSError):
pass
_ensure_utf8()
+4 -2
@@ -4283,7 +4283,8 @@ def _minimax_oauth_login(
print(f"Portal: {portal_base_url}")
with httpx.Client(timeout=httpx.Timeout(timeout_seconds),
headers={"Accept": "application/json"}) as client:
headers={"Accept": "application/json"},
follow_redirects=True) as client:
code_data = _minimax_request_user_code(
client, portal_base_url=portal_base_url,
client_id=pconfig.client_id,
@@ -4360,7 +4361,8 @@ def _refresh_minimax_oauth_state(
return state
portal_base_url = state["portal_base_url"]
with httpx.Client(timeout=httpx.Timeout(timeout_seconds)) as client:
with httpx.Client(timeout=httpx.Timeout(timeout_seconds),
follow_redirects=True) as client:
response = client.post(
f"{portal_base_url}/oauth/token",
data={
+29 -15
@@ -399,6 +399,11 @@ def _is_gateway_available(cmd: CommandDef, config_overrides: set[str] | None = N
return False
def _requires_argument(args_hint: str) -> bool:
"""Return True when selecting a command without text would be incomplete."""
return args_hint.strip().startswith("<")
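As a rough illustration of how the argument check filters a menu, the sketch below applies the same `startswith("<")` test to a small hand-written command list. The command names and hints are invented for the example, and the reading that `<x>` marks a required argument while `[x]` marks an optional one is an assumption inferred from the helper, not something the diff states.

```python
def requires_argument(args_hint: str) -> bool:
    """Same test as the helper above: a hint starting with '<' is required."""
    return args_hint.strip().startswith("<")


def menu_entries(commands):
    """Keep only commands that are complete when sent as a bare /command.

    `commands` is a list of (name, description, args_hint) tuples; the
    shape is assumed for this sketch.
    """
    return [
        (name, description)
        for name, description, args_hint in commands
        if not requires_argument(args_hint)
    ]


# Hypothetical commands, for illustration only.
commands = [
    ("status", "Show status", ""),
    ("model", "Switch model", "[name]"),
    ("remind", "Set a reminder", "<when> <text>"),
]
print(menu_entries(commands))
```

Only the entry whose hint begins with `<` is filtered out; commands with an empty or bracketed hint stay in the menu.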
def gateway_help_lines() -> list[str]:
"""Generate gateway help text lines from the registry."""
overrides = _resolve_config_gates()
@@ -455,7 +460,9 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
Telegram command names cannot contain hyphens, so they are replaced with
underscores. Aliases are skipped -- Telegram shows one menu entry per
canonical command.
canonical command. Commands that require arguments are skipped because
selecting a Telegram BotCommand sends only ``/command`` and would execute
an incomplete command.
Plugin-registered slash commands are included so plugins get native
autocomplete in Telegram without touching core code.
@@ -465,10 +472,14 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
for cmd in COMMAND_REGISTRY:
if not _is_gateway_available(cmd, overrides):
continue
if _requires_argument(cmd.args_hint):
continue
tg_name = _sanitize_telegram_name(cmd.name)
if tg_name:
result.append((tg_name, cmd.description))
for name, description, _args_hint in _iter_plugin_command_entries():
for name, description, args_hint in _iter_plugin_command_entries():
if _requires_argument(args_hint):
continue
tg_name = _sanitize_telegram_name(name)
if tg_name:
result.append((tg_name, description))
@@ -502,9 +513,9 @@ def _sanitize_telegram_name(raw: str) -> str:
def _clamp_command_names(
entries: list[tuple[str, str]],
entries: list[tuple[str, ...]],
reserved: set[str],
) -> list[tuple[str, str]]:
) -> list[tuple[str, ...]]:
"""Enforce 32-char command name limit with collision avoidance.
Both Telegram and Discord cap slash command names at 32 characters.
@@ -512,10 +523,15 @@ def _clamp_command_names(
(against *reserved* names or earlier entries in the same batch), the name is
shortened to 31 chars and a digit ``0``-``9`` is appended to differentiate.
If all 10 digit slots are taken the entry is silently dropped.
Accepts tuples of any length >= 2. Extra elements beyond ``(name, desc)``
(e.g. ``cmd_key``) are passed through unchanged, so callers can attach
metadata that survives the rename.
"""
used: set[str] = set(reserved)
result: list[tuple[str, str]] = []
for name, desc in entries:
result: list[tuple] = []
for entry in entries:
name, desc, *extra = entry
if len(name) > _CMD_NAME_LIMIT:
candidate = name[:_CMD_NAME_LIMIT]
if candidate in used:
@@ -531,7 +547,7 @@ def _clamp_command_names(
if name in used:
continue
used.add(name)
result.append((name, desc))
result.append((name, desc, *extra))
return result
@@ -651,17 +667,15 @@ def _collect_gateway_skill_entries(
except Exception:
pass
# Clamp names; _clamp_command_names works on (name, desc) pairs so we
# need to zip/unzip.
skill_pairs = [(n, d) for n, d, _ in skill_triples]
key_by_pair = {(n, d): k for n, d, k in skill_triples}
skill_pairs = _clamp_command_names(skill_pairs, reserved_names)
# Clamp names; cmd_key is passed through as extra payload so it survives
# any clamp-induced renames.
skill_triples = _clamp_command_names(skill_triples, reserved_names)
# Skills fill remaining slots — only tier that gets trimmed
remaining = max(0, max_slots - len(all_entries))
hidden_count = max(0, len(skill_pairs) - remaining)
for n, d in skill_pairs[:remaining]:
all_entries.append((n, d, key_by_pair.get((n, d), "")))
hidden_count = max(0, len(skill_triples) - remaining)
for n, d, k in skill_triples[:remaining]:
all_entries.append((n, d, k))
return all_entries[:max_slots], hidden_count
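The pass-through behavior of the clamp can be sketched as a standalone function. The hunks above elide the middle of `_clamp_command_names`, so the digit-suffix loop here is reconstructed from the docstring (truncate to 31 characters, append `0` to `9`, drop the entry if all ten are taken) and should be read as an assumption about the elided lines rather than a copy of them.

```python
CMD_NAME_LIMIT = 32  # both Telegram and Discord cap names at 32 chars


def clamp_command_names(entries, reserved):
    """Clamp names to the limit, keeping any extra tuple elements intact."""
    used = set(reserved)
    result = []
    for entry in entries:
        name, desc, *extra = entry
        if len(name) > CMD_NAME_LIMIT:
            candidate = name[:CMD_NAME_LIMIT]
            if candidate in used:
                prefix = name[:CMD_NAME_LIMIT - 1]
                for digit in range(10):
                    candidate = f"{prefix}{digit}"
                    if candidate not in used:
                        break
                else:
                    # All ten digit slots are taken: drop the entry.
                    continue
            name = candidate
        if name in used:
            continue
        used.add(name)
        # Extra payload (e.g. cmd_key) rides along with the renamed entry.
        result.append((name, desc, *extra))
    return result
```

Because the extra elements travel with the entry, callers no longer need a side table keyed by `(name, desc)` that goes stale when a name is shortened.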
+3 -1
@@ -4675,7 +4675,9 @@ def set_config_value(key: str, value: str):
"terminal.vercel_runtime": "TERMINAL_VERCEL_RUNTIME",
"terminal.docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
"terminal.docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
"terminal.cwd": "TERMINAL_CWD",
# terminal.cwd intentionally excluded — CLI resolves at runtime,
# gateway bridges it in gateway/run.py. Persisting to .env causes
# stale values to poison child processes.
"terminal.timeout": "TERMINAL_TIMEOUT",
"terminal.sandbox_dir": "TERMINAL_SANDBOX_DIR",
"terminal.persistent_shell": "TERMINAL_PERSISTENT_SHELL",
+6
@@ -156,6 +156,8 @@ def curses_checklist(
flush_stdin()
return result_holder[0] if result_holder[0] is not None else cancel_returns
except KeyboardInterrupt:
return cancel_returns
except Exception:
return _numbered_fallback(title, items, selected, cancel_returns, status_fn)
@@ -278,6 +280,8 @@ def curses_radiolist(
flush_stdin()
return result_holder[0] if result_holder[0] is not None else cancel_returns
except KeyboardInterrupt:
return cancel_returns
except Exception:
return _radio_numbered_fallback(title, items, selected, cancel_returns)
@@ -401,6 +405,8 @@ def curses_single_select(
return None
return result_holder[0]
except KeyboardInterrupt:
return None
except Exception:
all_items = list(items) + [cancel_label]
cancel_idx = len(items)
+82 -7
@@ -1,12 +1,19 @@
"""``hermes debug`` -- debug tools for Hermes Agent.
"""``hermes debug`` -- debug tools for Hermes Agent.
Currently supports:
hermes debug share Upload debug report (system info + logs) to a
paste service and print a shareable URL.
By default, log content is run through
``agent.redact.redact_sensitive_text`` with
``force=True`` before upload so credentials in
``~/.hermes/logs/*.log`` are not leaked into
the public paste service. Pass ``--no-redact``
to disable.
"""
import io
import json
import logging
import sys
import time
import urllib.error
@@ -19,6 +26,16 @@ from typing import Optional
from hermes_constants import get_hermes_home
from utils import atomic_replace
logger = logging.getLogger(__name__)
# Banner prepended to upload-bound log content when redaction is enabled.
# Visible in the public paste so reviewers know the content was sanitized.
# Kept short; the trailing newline guarantees the banner sits on its own line.
_REDACTION_BANNER = (
"[hermes debug share: log content redacted at upload time. "
"run with --no-redact to disable]\n"
)
# ---------------------------------------------------------------------------
# Paste services — try paste.rs first, dpaste.com as fallback.
@@ -368,17 +385,40 @@ def _resolve_log_path(log_name: str) -> Optional[Path]:
return None
def _redact_log_text(text: str) -> str:
"""Run ``redact_sensitive_text`` with ``force=True`` over upload-bound text.
Uses ``force=True`` so redaction fires regardless of the operator's
``security.redact_secrets`` setting. The local on-disk log file is
not modified; only the in-memory copy headed for the public paste
service is sanitized. Returns the redacted text (or the original
when empty / non-string).
"""
if not text:
return text
from agent.redact import redact_sensitive_text
return redact_sensitive_text(text, force=True)
def _capture_log_snapshot(
log_name: str,
*,
tail_lines: int,
max_bytes: int = _MAX_LOG_BYTES,
redact: bool = True,
) -> LogSnapshot:
"""Capture a log once and derive summary/full-log views from it.
The report tail and standalone log upload must come from the same file
snapshot. Otherwise a rotation/truncate between reads can make the report
look newer than the uploaded ``agent.log`` paste.
When ``redact`` is True (the default), both ``tail_text`` and
``full_text`` are run through ``_redact_log_text`` so the snapshot
returned is upload-safe. The on-disk log file is never modified.
Pass ``redact=False`` to capture original log content (used by
``hermes debug share --no-redact``).
"""
log_path = _resolve_log_path(log_name)
if log_path is None:
@@ -438,18 +478,34 @@ def _capture_log_snapshot(
if truncated:
full_text = f"[... truncated — showing last ~{max_bytes // 1024}KB ...]\n{full_text}"
if redact:
tail_text = _redact_log_text(tail_text)
full_text = _redact_log_text(full_text)
return LogSnapshot(path=log_path, tail_text=tail_text, full_text=full_text)
except Exception as exc:
return LogSnapshot(path=log_path, tail_text=f"(error reading: {exc})", full_text=None)
def _capture_default_log_snapshots(log_lines: int) -> dict[str, LogSnapshot]:
"""Capture all logs used by debug-share exactly once."""
def _capture_default_log_snapshots(
log_lines: int, *, redact: bool = True
) -> dict[str, LogSnapshot]:
"""Capture all logs used by debug-share exactly once.
``redact`` is forwarded to each ``_capture_log_snapshot`` call so all
captured logs share the same redaction policy for a given run.
"""
errors_lines = min(log_lines, 100)
return {
"agent": _capture_log_snapshot("agent", tail_lines=log_lines),
"errors": _capture_log_snapshot("errors", tail_lines=errors_lines),
"gateway": _capture_log_snapshot("gateway", tail_lines=errors_lines),
"agent": _capture_log_snapshot(
"agent", tail_lines=log_lines, redact=redact
),
"errors": _capture_log_snapshot(
"errors", tail_lines=errors_lines, redact=redact
),
"gateway": _capture_log_snapshot(
"gateway", tail_lines=errors_lines, redact=redact
),
}
@@ -532,6 +588,7 @@ def run_debug_share(args):
log_lines = getattr(args, "lines", 200)
expiry = getattr(args, "expire", 7)
local_only = getattr(args, "local", False)
redact = not getattr(args, "no_redact", False)
if not local_only:
print(_PRIVACY_NOTICE)
@@ -539,8 +596,16 @@ def run_debug_share(args):
print("Collecting debug report...")
# Capture dump once — prepended to every paste for context.
# The dump is already redacted at extract time via dump.py:_redact;
# log_snapshots are redacted by _capture_default_log_snapshots when
# redact=True so credentials never reach the public paste service.
dump_text = _capture_dump()
log_snapshots = _capture_default_log_snapshots(log_lines)
log_snapshots = _capture_default_log_snapshots(log_lines, redact=redact)
if redact:
logger.info(
"hermes debug share: applied force-mode redaction to log snapshots before upload"
)
report = collect_debug_report(
log_lines=log_lines,
@@ -556,6 +621,15 @@ def run_debug_share(args):
if gateway_log:
gateway_log = dump_text + "\n\n--- full gateway.log ---\n" + gateway_log
# Visible banner so reviewers reading the public paste know redaction
# was applied at upload time. Banner is omitted under --no-redact.
if redact:
report = _REDACTION_BANNER + report
if agent_log:
agent_log = _REDACTION_BANNER + agent_log
if gateway_log:
gateway_log = _REDACTION_BANNER + gateway_log
if local_only:
print(report)
if agent_log:
@@ -666,6 +740,7 @@ def run_debug(args):
print(" --lines N Number of log lines to include (default: 200)")
print(" --expire N Paste expiry in days (default: 7)")
print(" --local Print report locally instead of uploading")
print(" --no-redact Disable upload-time secret redaction (default: redact)")
print()
print("Options (delete):")
print(" <url> ... One or more paste URLs to delete")
+53
@@ -237,6 +237,26 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
return False
def _get_ancestor_pids() -> set[int]:
"""Return the set of PIDs in the current process's ancestor chain.
Walks from the current PID up to PID 1 (init) so that process-table scans
never match the calling CLI process or any of its parents. This prevents
``hermes gateway status`` from falsely counting the ``hermes`` CLI that
invoked it as a running gateway instance (see #13242).
"""
ancestors: set[int] = set()
pid = os.getpid()
# Cap iterations to avoid infinite loops on exotic platforms.
for _ in range(64):
ancestors.add(pid)
parent = _get_parent_pid(pid)
if parent is None or parent <= 0 or parent in ancestors:
break
pid = parent
return ancestors
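The ancestor walk is easier to reason about with the parent lookup injected. In this sketch `get_parent` is a hypothetical callable standing in for `_get_parent_pid`, whose implementation is not shown in this diff; a plain dict lookup is used so the loop can be exercised without touching the real process table.

```python
from typing import Callable, Optional, Set


def ancestor_pids(
    start_pid: int,
    get_parent: Callable[[int], Optional[int]],
    max_depth: int = 64,
) -> Set[int]:
    """Collect start_pid and every ancestor reachable through get_parent."""
    ancestors: Set[int] = set()
    pid = start_pid
    # The depth cap guards against pathological parent chains.
    for _ in range(max_depth):
        ancestors.add(pid)
        parent = get_parent(pid)
        # Stop at the top of the tree, on a failed lookup, or on a cycle.
        if parent is None or parent <= 0 or parent in ancestors:
            break
        pid = parent
    return ancestors


# Hypothetical process tree: 300 -> 200 -> 1 -> 0
parents = {300: 200, 200: 1, 1: 0}
print(sorted(ancestor_pids(300, parents.get)))
```

The resulting set would then be merged into `exclude_pids`, so a scan never reports the invoking CLI or its parents.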
def _append_unique_pid(pids: list[int], pid: int | None, exclude_pids: set[int]) -> None:
if pid is None or pid <= 0:
return
@@ -252,6 +272,10 @@ def _scan_gateway_pids(exclude_pids: set[int], all_profiles: bool = False) -> li
a live gateway when the PID file is stale/missing, and ``--all`` sweeps can
discover gateways outside the current profile.
"""
# Exclude the entire ancestor chain so the CLI process that invoked this
# scan (e.g. ``hermes gateway status``) is never mistaken for a running
# gateway. See #13242.
exclude_pids = exclude_pids | _get_ancestor_pids()
pids: list[int] = []
patterns = [
"hermes_cli.main gateway",
@@ -690,6 +714,32 @@ def _print_gateway_process_mismatch(snapshot: GatewayRuntimeSnapshot) -> None:
print(" can refuse to start another copy until this process stops.")
def _print_other_profiles_gateway_status() -> None:
"""Print a summary of gateway status across all profiles.
Shown at the bottom of ``hermes gateway status`` output so users with
multiple profiles can tell at a glance which gateways are running and
avoid confusing another profile's process with the current one.
"""
try:
from hermes_cli.profiles import get_active_profile_name
current = get_active_profile_name()
other_processes = [
p for p in find_profile_gateway_processes()
if p.profile != current
]
if not other_processes:
return
print()
print("Other profiles:")
for proc in other_processes:
print(f"{proc.profile:<16s} — PID {proc.pid}")
except Exception:
pass
def kill_gateway_processes(force: bool = False, exclude_pids: set | None = None,
all_profiles: bool = False) -> int:
"""Kill any running gateway processes. Returns count killed.
@@ -4456,6 +4506,9 @@ def _gateway_command_inner(args):
print(" hermes gateway install # Install as user service")
print(" sudo hermes gateway install --system # Install as boot-time system service")
# Show other profiles' gateway status for multi-profile awareness
_print_other_profiles_gateway_status()
elif subcmd == "migrate-legacy":
# Stop, disable, and remove legacy Hermes gateway unit files from
# pre-rename installs (e.g. hermes.service). Profile units and
+1 -1
@@ -366,7 +366,7 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
# --- log ---
p_log = sub.add_parser(
"log",
help="Print the worker log for a task (from $HERMES_HOME/kanban/logs/)",
help="Print the worker log for a task (from <kanban-root>/kanban/logs/)",
)
p_log.add_argument("task_id")
p_log.add_argument("--tail", type=int, default=None,
+91 -19
@@ -1,8 +1,28 @@
"""SQLite-backed Kanban board for multi-profile collaboration.
The board lives at ``$HERMES_HOME/kanban.db`` (profile-agnostic on purpose:
multiple profiles on the same machine all see the same board, which IS the
coordination primitive).
The board lives at ``<root>/kanban.db`` where ``<root>`` is the **shared
Hermes root** (the parent of any active profile). Profiles intentionally
collapse onto a single board: it IS the cross-profile coordination
primitive. A worker spawned with ``hermes -p <profile>`` joins the same
board as the dispatcher that claimed the task. The same applies to
``<root>/kanban/workspaces/`` and ``<root>/kanban/logs/``.
In standard installs ``<root>`` is ``~/.hermes``. In Docker / custom
deployments where ``HERMES_HOME`` points outside ``~/.hermes`` (e.g.
``/opt/hermes``), ``<root>`` is ``HERMES_HOME``. Three env-var overrides
are available (highest precedence first, all optional):
* ``HERMES_KANBAN_DB`` -- pin the database file path directly.
* ``HERMES_KANBAN_WORKSPACES_ROOT`` -- pin the workspaces root directly.
* ``HERMES_KANBAN_HOME`` -- pin the umbrella root that anchors all three
kanban paths (db + workspaces + logs). Useful for tests and unusual
deployments where a single override is enough.
The dispatcher injects ``HERMES_KANBAN_DB`` and
``HERMES_KANBAN_WORKSPACES_ROOT`` into the worker subprocess env as a
defense-in-depth measure: even if the worker's ``get_default_hermes_root()``
resolution somehow disagrees with the dispatcher's (unusual symlink or
Docker layout), the two processes still converge on the same files.
Schema is intentionally small: tasks, task_links, task_comments,
task_events. The ``workspace_kind`` field decouples coordination from git
@@ -61,16 +81,57 @@ _CTX_MAX_COMMENT_BYTES = 2 * 1024 # 2 KB per comment
# Paths
# ---------------------------------------------------------------------------
def kanban_home() -> Path:
"""Return the shared Hermes root that anchors the kanban board.
Resolution order:
1. ``HERMES_KANBAN_HOME`` env var when set and non-empty (explicit
override for tests and unusual deployments).
2. ``get_default_hermes_root()``, which already returns ``<root>``
when ``HERMES_HOME`` is ``<root>/profiles/<name>``, and returns
``HERMES_HOME`` directly for Docker / custom deployments.
The kanban board is shared across profiles **by design** (see the
module docstring). Resolving the kanban paths through the active
profile's ``HERMES_HOME`` would silently fork the board per profile,
which breaks the dispatcher / worker handoff.
"""
override = os.environ.get("HERMES_KANBAN_HOME", "").strip()
if override:
return Path(override).expanduser()
from hermes_constants import get_default_hermes_root
return get_default_hermes_root()
def kanban_db_path() -> Path:
"""Return the path to ``kanban.db`` inside the active HERMES_HOME."""
from hermes_constants import get_hermes_home
return get_hermes_home() / "kanban.db"
"""Return the path to the shared ``kanban.db``.
Anchored at :func:`kanban_home`, not the active profile's
``HERMES_HOME``, so profile workers and the dispatcher converge on
the same board. ``HERMES_KANBAN_DB`` pins the path directly (highest
precedence); the dispatcher injects this into worker subprocess env
as defense-in-depth.
"""
override = os.environ.get("HERMES_KANBAN_DB", "").strip()
if override:
return Path(override).expanduser()
return kanban_home() / "kanban.db"
def workspaces_root() -> Path:
"""Return the directory under which ``scratch`` workspaces are created."""
from hermes_constants import get_hermes_home
return get_hermes_home() / "kanban" / "workspaces"
"""Return the directory under which ``scratch`` workspaces are created.
Anchored at :func:`kanban_home` so workspace paths are stable across
profile workers spawned by the dispatcher.
``HERMES_KANBAN_WORKSPACES_ROOT`` pins the path directly (highest
precedence); the dispatcher injects this into worker subprocess env
as defense-in-depth.
"""
override = os.environ.get("HERMES_KANBAN_WORKSPACES_ROOT", "").strip()
if override:
return Path(override).expanduser()
return kanban_home() / "kanban" / "workspaces"
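The override precedence for the three path helpers can be summarized in one function. This is a sketch under stated assumptions: `resolve_kanban_paths` is a hypothetical name, the environment is passed in as a plain mapping, and `default_root` stands in for whatever `get_default_hermes_root()` returns.

```python
from pathlib import Path


def resolve_kanban_paths(env, default_root):
    """Resolve db, workspaces, and logs paths with the documented overrides."""

    def override(name):
        # Empty or whitespace-only values count as "not set".
        value = env.get(name, "").strip()
        return Path(value).expanduser() if value else None

    home = override("HERMES_KANBAN_HOME")
    if home is None:
        home = Path(default_root)

    db = override("HERMES_KANBAN_DB")
    if db is None:
        db = home / "kanban.db"

    workspaces = override("HERMES_KANBAN_WORKSPACES_ROOT")
    if workspaces is None:
        workspaces = home / "kanban" / "workspaces"

    # Logs have no dedicated override; they always follow the kanban home.
    return {"db": db, "workspaces": workspaces, "logs": home / "kanban" / "logs"}
```

With no overrides set, all three paths sit under the shared root, which is what keeps a profile worker and the dispatcher on the same board.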
# ---------------------------------------------------------------------------
@@ -1516,12 +1577,15 @@ def archive_task(conn: sqlite3.Connection, task_id: str) -> bool:
def resolve_workspace(task: Task) -> Path:
"""Resolve (and create if needed) the workspace for a task.
- ``scratch``: a fresh dir under ``$HERMES_HOME/kanban/workspaces/<id>/``.
- ``scratch``: a fresh dir under ``<kanban-root>/kanban/workspaces/<id>/``,
where ``<kanban-root>`` is the shared Hermes root (see
:func:`kanban_home`). The path is the same for the dispatcher and
every profile worker, so handoff is path-stable.
- ``dir:<path>``: the path stored in ``workspace_path``. Created
if missing. MUST be absolute; relative paths are rejected to
prevent confused-deputy traversal where ``../../../tmp/attacker``
resolves against the dispatcher's CWD instead of a meaningful
root. Users who want a HERMES_HOME-relative workspace should
root. Users who want a kanban-root-relative workspace should
compute the absolute path themselves.
- ``worktree``: a git worktree at ``workspace_path``. Not created
automatically in v1 -- the kanban-worker skill documents
@@ -2070,6 +2134,14 @@ def _default_spawn(task: Task, workspace: str) -> Optional[int]:
env["HERMES_TENANT"] = task.tenant
env["HERMES_KANBAN_TASK"] = task.id
env["HERMES_KANBAN_WORKSPACE"] = workspace
# Pin the shared board + workspaces root the dispatcher resolved, so
# that even when the worker activates a profile (`hermes -p <name>`
# rewrites HERMES_HOME), its kanban paths still match the
# dispatcher's. Belt-and-braces with the `get_default_hermes_root()`
# resolution in `kanban_home()` — symmetric resolution is the norm,
# but unusual symlink / Docker layouts are caught here too.
env["HERMES_KANBAN_DB"] = str(kanban_db_path())
env["HERMES_KANBAN_WORKSPACES_ROOT"] = str(workspaces_root())
# HERMES_PROFILE is the author the kanban_comment tool defaults to.
# `hermes -p <assignee>` activates the profile, but the env var is
# what the tool reads — set it explicitly here so comments are
@@ -2104,9 +2176,10 @@ def _default_spawn(task: Task, workspace: str) -> Optional[int]:
"chat",
"-q", prompt,
])
# Redirect output to a per-task log under HERMES_HOME/kanban/logs/.
from hermes_constants import get_hermes_home
log_dir = get_hermes_home() / "kanban" / "logs"
# Redirect output to a per-task log under <kanban-root>/kanban/logs/.
# Anchored at the shared kanban root, not the worker's profile home,
# so `hermes kanban tail` reads the same file the worker writes to.
log_dir = kanban_home() / "kanban" / "logs"
log_dir.mkdir(parents=True, exist_ok=True)
log_path = log_dir / f"{task.id}.log"
_rotate_worker_log(log_path, DEFAULT_LOG_ROTATE_BYTES)
@@ -2591,8 +2664,7 @@ def gc_worker_logs(
"""Delete worker log files older than ``older_than_seconds``. Returns
the number of files removed. Kept separate from ``gc_events`` because
log files live on disk, not in SQLite."""
from hermes_constants import get_hermes_home
log_dir = get_hermes_home() / "kanban" / "logs"
log_dir = kanban_home() / "kanban" / "logs"
if not log_dir.exists():
return 0
cutoff = time.time() - older_than_seconds
@@ -2614,8 +2686,7 @@ def gc_worker_logs(
def worker_log_path(task_id: str) -> Path:
"""Return the path to a worker's log file. The file may not exist
(task never spawned, or log already GC'd)."""
from hermes_constants import get_hermes_home
return get_hermes_home() / "kanban" / "logs" / f"{task_id}.log"
return kanban_home() / "kanban" / "logs" / f"{task_id}.log"
def read_worker_log(
@@ -2661,7 +2732,8 @@ def list_profiles_on_disk() -> list[str]:
``config.yaml``; a bare dir without config isn't a real profile.
"""
try:
home = Path.home() / ".hermes" / "profiles"
from hermes_constants import get_default_hermes_root
home = get_default_hermes_root() / "profiles"
except Exception:
return []
if not home.is_dir():
+35 -4
@@ -837,7 +837,17 @@ def _print_tui_exit_summary(session_id: Optional[str], active_session_file: Opti
)
_NPM_LOCK_RUNTIME_KEYS = frozenset({"ideallyInert"})
_NPM_LOCK_RUNTIME_KEYS = frozenset({"ideallyInert", "peer"})
"""Lockfile fields npm writes non-deterministically at install time.
``ideallyInert`` is npm's runtime annotation for packages it skipped installing
(per-platform opt-outs). ``peer`` is dropped from the hidden ``.package-lock.json``
on dev-dependencies that are *also* declared as peers; the canonical
``package-lock.json`` records the dual role, but npm 9's actualized tree strips
it. Neither key represents a real skew between what was declared and what was
installed, so we exclude them from the comparison in :func:`_tui_need_npm_install`
to avoid false-positive reinstalls on every launch.
"""
def _tui_need_npm_install(root: Path) -> bool:
@@ -1042,17 +1052,21 @@ def _make_tui_argv(tui_dir: Path, tui_dev: bool) -> tuple[list[str], Path]:
if _tui_need_npm_install(tui_dir):
if not os.environ.get("HERMES_QUIET"):
print("Installing TUI dependencies…")
# Capture stdout as well as stderr — some npm errors (notably EACCES on a
# root-owned node_modules in containers) are emitted on stdout, and a
# bare "npm install failed." with no preview defeats debugging. We keep
# the failure-only print path so a successful install stays silent.
result = subprocess.run(
[npm, "install", "--silent", "--no-fund", "--no-audit", "--progress=false"],
cwd=str(tui_dir),
stdout=subprocess.DEVNULL,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
env={**os.environ, "CI": "1"},
)
if result.returncode != 0:
err = (result.stderr or "").strip()
preview = "\n".join(err.splitlines()[-30:])
combined = f"{result.stdout or ''}\n{result.stderr or ''}".strip()
preview = "\n".join(combined.splitlines()[-30:])
print("npm install failed.")
if preview:
print(preview)
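The preview logic in this hunk can be pulled out into a small pure function to show why capturing both streams matters. `failure_preview` is a hypothetical name used only for this sketch; the joining and tail-slicing mirror the two changed lines above.

```python
def failure_preview(stdout, stderr, max_lines=30):
    """Return the last max_lines lines of stdout and stderr combined."""
    # Either stream may be None or empty; treat both cases as "".
    combined = f"{stdout or ''}\n{stderr or ''}".strip()
    return "\n".join(combined.splitlines()[-max_lines:])


# An error that only appears on stdout still shows up in the preview.
print(failure_preview("npm ERR! code EACCES", ""))
```

With the earlier `stdout=subprocess.DEVNULL` setting, a message emitted only on stdout would have produced an empty preview.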
@@ -8460,6 +8474,12 @@ def main():
)
slack_parser.set_defaults(func=cmd_slack)
# =========================================================================
# send command — pipe shell-script output to any configured platform
# =========================================================================
from hermes_cli.send_cmd import register_send_subparser
register_send_subparser(subparsers)
# =========================================================================
# login command
# =========================================================================
@@ -8891,6 +8911,7 @@ Examples:
hermes debug share --lines 500 Include more log lines
hermes debug share --expire 30 Keep paste for 30 days
hermes debug share --local Print report locally (no upload)
hermes debug share --no-redact Disable upload-time secret redaction
hermes debug delete <url> Delete a previously uploaded paste
""",
)
@@ -8916,6 +8937,16 @@ Examples:
action="store_true",
help="Print the report locally instead of uploading",
)
share_parser.add_argument(
"--no-redact",
action="store_true",
help=(
"Disable upload-time secret redaction (default: redact). Logs "
"are normally run through agent.redact.redact_sensitive_text "
"with force=True before upload so credentials are not leaked "
"into the public paste service."
),
)
delete_parser = debug_sub.add_parser(
"delete",
help="Delete a paste uploaded by 'hermes debug share'",
+20
@@ -904,6 +904,26 @@ def switch_model(
if any(m.get("name") == new_model for m in cfg_models if isinstance(m, dict)):
override = True
break
# Also check custom_providers list — models declared there should be accepted
# even if the remote /v1/models endpoint doesn't list them.
if not override and custom_providers and isinstance(custom_providers, list):
for entry in custom_providers:
if not isinstance(entry, dict):
continue
# Match by provider slug (custom:<name>) or by base_url
entry_name = entry.get("name", "")
entry_slug = f"custom:{entry_name}" if entry_name else ""
entry_url = entry.get("base_url", "")
if entry_slug == target_provider or entry_url == base_url:
# Check if the requested model matches the entry's model
entry_model = entry.get("model", "")
entry_models = entry.get("models", {})
if new_model == entry_model:
override = True
break
if isinstance(entry_models, dict) and new_model in entry_models:
override = True
break
if override:
validation = {"accepted": True, "persist": True, "recognized": False, "message": validation.get("message", "")}
else:
+1 -1
@@ -3087,7 +3087,7 @@ def validate_requested_model(
"message": f"Model `{requested}` was not found in LM Studio's model listing.",
}
-if normalized == "custom":
+if normalized == "custom" or normalized.startswith("custom:"):
# Try probing with correct auth for the api_mode.
if api_mode == "anthropic_messages":
probe = probe_api_models(api_key, base_url, api_mode=api_mode)
+445
@@ -0,0 +1,445 @@
"""CLI subcommand: ``hermes send`` — pipe text from shell scripts to any
configured messaging platform (Telegram, Discord, Slack, Signal, SMS, etc.).
This is a thin wrapper around ``tools.send_message_tool.send_message_tool``
that exposes its functionality as a standalone CLI entry point so ops
scripts, cron jobs, CI hooks, and monitoring daemons can reuse the gateway's
already-configured credentials without having to reimplement each platform's
REST API client.
Design notes:
* No LLM, no agent loop: the subcommand just resolves arguments, reads the
message body, calls the shared tool function, and prints/returns the
result. It is intentionally fast, cheap, and side-effect-only.
* For platforms that send via bot token (Telegram, Discord, Slack, Signal,
SMS, WhatsApp-CloudAPI, ...), no running gateway is required. The tool
talks directly to each platform's REST endpoint. For platforms that rely
on a persistent adapter connection (plugin platforms, Matrix in some
modes, ...), a live gateway is needed; the underlying tool surfaces that
error to the caller.
* Exit codes follow the classic Unix convention:
0 delivery (or list) succeeded
1 delivery failed at the platform level
2 usage / argument / config error (argparse already uses 2)
"""
from __future__ import annotations
import argparse
import json
import sys
from pathlib import Path
from typing import Optional
_USAGE_EXIT = 2
_FAILURE_EXIT = 1
_SUCCESS_EXIT = 0
def _read_message_body(
positional: Optional[str],
file_path: Optional[str],
) -> Optional[str]:
"""Resolve the message body from (in order):
1. An explicit positional message argument.
2. ``--file PATH`` or ``--file -`` (where ``-`` means stdin).
3. Piped stdin when it is not attached to a TTY.
Returns ``None`` when nothing is available; callers must treat that as
a usage error.
"""
if positional:
return positional
if file_path:
if file_path == "-":
return sys.stdin.read()
try:
return Path(file_path).read_text()
except OSError as exc:
print(f"hermes send: cannot read {file_path}: {exc}", file=sys.stderr)
sys.exit(_USAGE_EXIT)
# Piped input: only consume stdin when it is not a TTY. Reading from a
# TTY would block the user in a half-broken "type your message" state,
# which is a poor default for an ops CLI.
if not sys.stdin.isatty():
data = sys.stdin.read()
if data:
return data
return None
def _resolve_target(arg_to: Optional[str]) -> Optional[str]:
"""Return a cleaned ``--to`` value, or ``None`` when nothing is set."""
if arg_to and arg_to.strip():
return arg_to.strip()
return None
def _emit_result(
result_json: str,
*,
json_mode: bool,
quiet: bool,
) -> int:
"""Print the tool result in the requested format and return the exit code.
The underlying ``send_message_tool`` always returns a JSON string. We
parse it, decide success/failure, and format accordingly.
"""
try:
payload = json.loads(result_json) if result_json else {}
except json.JSONDecodeError:
# Shouldn't happen with the shared tool, but be defensive — pass the
# raw string through so the user can still see what went wrong.
payload = {"error": "invalid JSON from send_message_tool", "raw": result_json}
if json_mode:
print(json.dumps(payload, indent=2))
elif quiet:
pass
else:
if payload.get("error"):
print(f"hermes send: {payload['error']}", file=sys.stderr)
elif payload.get("success"):
note = payload.get("note")
if note:
print(note)
else:
print("sent")
else:
# Unknown shape — dump it so nothing is silently dropped.
print(json.dumps(payload, indent=2))
if payload.get("error"):
return _FAILURE_EXIT
if payload.get("skipped"):
return _SUCCESS_EXIT
if payload.get("success"):
return _SUCCESS_EXIT
# Unknown / unexpected — treat as failure so scripts notice.
return _FAILURE_EXIT
def _list_targets(platform_filter: Optional[str], *, json_mode: bool) -> int:
"""Print the channel directory (all configured targets across platforms).
Uses ``load_directory()`` for structured JSON output and
``format_directory_for_display()`` for the human-readable rendering that
the send_message tool itself shows to the model, which keeps the two
surfaces identical.
"""
try:
from gateway.channel_directory import (
format_directory_for_display,
load_directory,
)
except Exception as exc:
print(f"hermes send: failed to load channel directory: {exc}", file=sys.stderr)
return _FAILURE_EXIT
try:
raw = load_directory()
except Exception as exc:
print(f"hermes send: failed to read channel directory: {exc}", file=sys.stderr)
return _FAILURE_EXIT
platforms = dict(raw.get("platforms") or {})
if platform_filter:
key = platform_filter.strip().lower()
filtered = {k: v for k, v in platforms.items() if k.lower() == key}
if not filtered:
print(
f"hermes send: no targets found for platform '{platform_filter}'. "
f"Configured: {', '.join(sorted(platforms)) or '(none)'}",
file=sys.stderr,
)
return _FAILURE_EXIT
platforms = filtered
if json_mode:
print(json.dumps({"platforms": platforms}, indent=2, default=str))
return _SUCCESS_EXIT
if not any(platforms.values()):
print("No messaging platforms configured or no channels discovered yet.")
print("Set one up with `hermes gateway setup`, or run the gateway once so")
print("channel discovery can populate ~/.hermes/channel_directory.json.")
return _SUCCESS_EXIT
# Human display — when unfiltered, reuse the shared formatter the agent
# already sees. When filtered, build a minimal view ourselves.
if platform_filter is None:
print(format_directory_for_display())
return _SUCCESS_EXIT
for plat_name in sorted(platforms):
channels = platforms[plat_name]
print(f"{plat_name}:")
if not channels:
print(" (no channels discovered yet)")
continue
for ch in channels:
name = ch.get("name", "?")
chat_id = ch.get("id") or ch.get("chat_id") or ""
suffix = f" [{chat_id}]" if chat_id and chat_id != name else ""
print(f" {plat_name}:{name}{suffix}")
print()
return _SUCCESS_EXIT
def _load_hermes_env() -> None:
"""Populate ``os.environ`` from ``~/.hermes/.env`` AND bridge top-level
``config.yaml`` keys into the environment so the underlying gateway
config loader sees platform credentials and home channel IDs.
``send_message_tool`` reads tokens and home-channel IDs via
``os.getenv(...)`` on each call. The gateway process does two things at
startup that ``hermes send`` must replicate when invoked standalone:
1. ``load_dotenv(~/.hermes/.env)`` brings bot tokens into the env.
2. Bridge top-level simple values from ``~/.hermes/config.yaml`` into
``os.environ`` (without overriding existing env vars). This is where
``TELEGRAM_HOME_CHANNEL`` and friends live when the user saved them
via ``hermes config set``.
See ``gateway/run.py`` for the canonical version of this bridge; we
intentionally reimplement the minimum needed here so ``hermes send``
doesn't pull in the full gateway module just to resolve a home channel.
"""
# Step 1: dotenv
try:
from dotenv import load_dotenv
except Exception:
load_dotenv = None # type: ignore[assignment]
try:
from hermes_cli.config import get_hermes_home
home = get_hermes_home()
except Exception:
return
env_path = home / ".env"
if load_dotenv and env_path.exists():
try:
load_dotenv(str(env_path), override=True, encoding="utf-8")
except UnicodeDecodeError:
try:
load_dotenv(str(env_path), override=True, encoding="latin-1")
except Exception:
pass
except Exception:
pass
# Step 2: bridge top-level config.yaml values into the environment so
# gateway.config.load_gateway_config() sees them. Scalars only; don't
# override values already in the env.
import os
config_path = home / "config.yaml"
if not config_path.exists():
return
try:
import yaml # type: ignore[import-not-found]
except Exception:
return
try:
with open(config_path, "r", encoding="utf-8") as fh:
raw = yaml.safe_load(fh) or {}
except Exception:
return
try:
from hermes_cli.config import _expand_env_vars
raw = _expand_env_vars(raw)
except Exception:
pass
if not isinstance(raw, dict):
return
for key, val in raw.items():
if not isinstance(val, (str, int, float, bool)):
continue
if key in os.environ:
continue
os.environ[key] = str(val)
def cmd_send(args: argparse.Namespace) -> None:
"""Entry point wired into the top-level argparse dispatcher."""
# Bridge ~/.hermes/.env and ~/.hermes/config.yaml into os.environ so the
# gateway config loader (invoked downstream by send_message_tool and by
# the channel directory) can see platform credentials and home channels.
_load_hermes_env()
# --list short-circuits everything else.
if getattr(args, "list_targets", False):
# When `--list telegram` is used, argparse stores "telegram" in the
# `message` positional (since list_targets takes no argument).
platform_filter = getattr(args, "message", None)
exit_code = _list_targets(platform_filter, json_mode=getattr(args, "json", False))
sys.exit(exit_code)
target = _resolve_target(getattr(args, "to", None))
if not target:
print(
"hermes send: --to PLATFORM[:channel[:thread]] is required\n"
"Examples:\n"
" hermes send --to telegram \"hello\"\n"
" hermes send --to discord:#ops --file report.md\n"
" hermes send --list # list available targets",
file=sys.stderr,
)
sys.exit(_USAGE_EXIT)
message = _read_message_body(
getattr(args, "message", None),
getattr(args, "file", None),
)
if message is None or not message.strip():
print(
"hermes send: no message provided. Pass text as a positional "
"argument, use --file PATH, or pipe data via stdin.",
file=sys.stderr,
)
sys.exit(_USAGE_EXIT)
# Optional: prepend a subject line. Useful for alerting scripts that
# want a consistent header without inlining it into every call.
subject = getattr(args, "subject", None)
if subject:
message = f"{subject}\n\n{message.lstrip()}"
# Import lazily so `hermes send --help` stays fast and does not pull in
# the full tool registry / gateway config stack.
from tools.send_message_tool import send_message_tool
# send_message_tool auto-loads gateway config + env and routes to the
# appropriate platform adapter (bot-token path for Telegram/Discord/Slack/
# Signal/SMS/WhatsApp; live-adapter path for plugin platforms).
#
# It expects the standard tool-call dict and returns a JSON string.
tool_args = {
"action": "send",
"target": target,
"message": message,
}
result = send_message_tool(tool_args)
exit_code = _emit_result(
result,
json_mode=getattr(args, "json", False),
quiet=getattr(args, "quiet", False),
)
sys.exit(exit_code)
def register_send_subparser(subparsers) -> argparse.ArgumentParser:
"""Create the ``send`` subparser and return it.
Kept as a standalone function so the top-level parser builder can wire
it in next to the other messaging subcommands without cluttering
``_parser.py`` or ``main.py``.
"""
parser = subparsers.add_parser(
"send",
help="Send a message to a configured platform (scripts, cron jobs, CI).",
description=(
"Pipe text from any shell script to any messaging platform Hermes "
"is already configured for. Reuses the gateway's platform "
"credentials (~/.hermes/.env + ~/.hermes/config.yaml) — no LLM, "
"no agent loop, no running gateway required for bot-token "
"platforms like Telegram/Discord/Slack/Signal."
),
epilog=(
"Examples:\n"
" hermes send --to telegram \"deploy finished\"\n"
" echo \"RAM 92%\" | hermes send --to telegram:-1001234567890\n"
" hermes send --to discord:#ops --file /tmp/report.md\n"
" hermes send --to slack:#eng --subject \"[CI]\" --file build.log\n"
" hermes send --list # all platforms\n"
" hermes send --list telegram # filter by platform\n"
"\n"
"Exit codes: 0 ok, 1 delivery/backend error, 2 usage error."
),
formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument(
"-t",
"--to",
metavar="TARGET",
default=None,
help=(
"Delivery target. Format: 'platform' (home channel), "
"'platform:chat_id', 'platform:chat_id:thread_id', or "
"'platform:#channel-name'. Examples: telegram, "
"telegram:-1001234567890:17585, discord:#ops, slack:C0123ABCD, "
"signal:+15551234567."
),
)
parser.add_argument(
"message",
nargs="?",
default=None,
help="Message text. If omitted, read from --file or stdin.",
)
# Legacy / convenience positional removed — use --to for clarity.
parser.add_argument(
"-f",
"--file",
metavar="PATH",
default=None,
help="Read message body from PATH. Use '-' to force stdin.",
)
parser.add_argument(
"-s",
"--subject",
metavar="LINE",
default=None,
help="Prepend a subject/header line before the message body.",
)
parser.add_argument(
"-l",
"--list",
dest="list_targets",
action="store_true",
default=False,
help="List available targets. Optional positional filter: `hermes send --list telegram`.",
)
parser.add_argument(
"-q",
"--quiet",
action="store_true",
default=False,
help="Suppress stdout on success (exit code only).",
)
parser.add_argument(
"--json",
action="store_true",
default=False,
help="Emit raw JSON result instead of human-readable output.",
)
parser.set_defaults(func=cmd_send)
return parser
__all__ = ["cmd_send", "register_send_subparser"]
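A minimal sketch of how a Python caller might drive `hermes send` and interpret the exit codes documented in the module docstring above. The names `build_send_argv`, `notify`, and `EXIT_MEANINGS` are hypothetical and not part of this commit; only the `--to`, `--subject`, and `--quiet` flags and the 0/1/2 exit codes are taken from the diff. The `run` parameter is an assumption added so the wrapper can be exercised without a real `hermes` binary.

```python
import subprocess
from typing import Callable, List, Optional

# Exit codes as documented in the send_cmd.py module docstring.
EXIT_MEANINGS = {0: "sent", 1: "delivery failed", 2: "usage or config error"}


def build_send_argv(
    target: str, subject: Optional[str] = None, quiet: bool = True
) -> List[str]:
    """Build the argv for `hermes send` (hypothetical helper)."""
    argv = ["hermes", "send", "--to", target]
    if subject:
        argv += ["--subject", subject]
    if quiet:
        argv.append("--quiet")
    return argv


def notify(
    target: str,
    body: str,
    subject: Optional[str] = None,
    run: Callable = subprocess.run,
) -> str:
    """Send `body` over stdin and translate the exit code to a short label."""
    # Passing the body via stdin follows the same path as `echo ... | hermes send`.
    proc = run(build_send_argv(target, subject), input=body, text=True)
    return EXIT_MEANINGS.get(
        proc.returncode, f"unexpected exit code {proc.returncode}"
    )
```

Because the child process's stdin is a pipe rather than a TTY, the body should be picked up by the piped-stdin branch of `_read_message_body`.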
+34 -7
@@ -1328,15 +1328,13 @@ def setup_terminal_backend(config: dict):
print_success("Terminal backend: Local")
print_info("Commands run directly on this machine.")
-# CWD for messaging
+# Gateway/cron working directory
print()
-print_info("Working directory for messaging sessions:")
-print_info(" When using Hermes via Telegram/Discord, this is where")
-print_info(
-" the agent starts. CLI mode always starts in the current directory."
-)
+print_info("Gateway working directory:")
+print_info(" Used by Telegram/Discord/cron sessions.")
+print_info(" CLI/TUI always uses your launch directory instead.")
current_cwd = cfg_get(config, "terminal", "cwd", default="")
-cwd = prompt(" Messaging working directory", current_cwd or str(Path.home()))
+cwd = prompt(" Gateway working directory", current_cwd or str(Path.home()))
if cwd:
config["terminal"]["cwd"] = cwd
@@ -2049,6 +2047,16 @@ def _setup_slack():
print_warning("⚠️ No Slack allowlist set - unpaired users will be denied by default.")
print_info(" Set SLACK_ALLOW_ALL_USERS=true or GATEWAY_ALLOW_ALL_USERS=true only if you intentionally want open workspace access.")
print()
print_info("📬 Home Channel: where Hermes delivers cron job results,")
print_info(" cross-platform messages, and notifications.")
print_info(" To get a channel ID: open the channel in Slack, then right-click")
print_info(" the channel name → Copy link — the ID starts with C (e.g. C01ABC2DE3F).")
print_info(" You can also set this later by typing /set-home in a Slack channel.")
home_channel = prompt("Home channel ID (leave empty to set later with /set-home)")
if home_channel:
save_env_value("SLACK_HOME_CHANNEL", home_channel.strip())
def _write_slack_manifest_and_instruct():
"""Generate the Slack manifest, write it under HERMES_HOME, and print
@@ -2995,6 +3003,21 @@ def run_setup_wizard(args):
config = load_config()
hermes_home = get_hermes_home()
# Back up existing config before setup modifies it (#3522)
config_path = get_config_path()
if config_path.exists():
from datetime import datetime as _dt
_backup_path = config_path.with_suffix(
f".yaml.bak.{_dt.now().strftime('%Y%m%d_%H%M%S')}"
)
try:
import shutil
shutil.copy2(config_path, _backup_path)
except Exception:
_backup_path = None
else:
_backup_path = None
# Detect non-interactive environments (headless SSH, Docker, CI/CD)
non_interactive = getattr(args, 'non_interactive', False)
if not non_interactive and not is_interactive_stdin():
@@ -3164,6 +3187,10 @@ def run_setup_wizard(args):
# Save and show summary
save_config(config)
if _backup_path and _backup_path.exists():
print_info(f"Previous config backed up to: {_backup_path}")
print_info("If setup changed a value you customized, restore it with:")
print_info(f" cp {_backup_path} {config_path}")
_print_setup_summary(config, hermes_home)
_offer_launch_chat()
+2 -1
@@ -56,6 +56,7 @@ CONFIGURABLE_TOOLSETS = [
("file", "📁 File Operations", "read, write, patch, search"),
("code_execution", "⚡ Code Execution", "execute_code"),
("vision", "👁️ Vision / Image Analysis", "vision_analyze"),
("video", "🎬 Video Analysis", "video_analyze (requires video-capable model)"),
("image_gen", "🎨 Image Generation", "image_generate"),
("moa", "🧠 Mixture of Agents", "mixture_of_agents"),
("tts", "🔊 Text-to-Speech", "text_to_speech"),
@@ -78,7 +79,7 @@ CONFIGURABLE_TOOLSETS = [
# Toolsets that are OFF by default for new installs.
# They're still in _HERMES_CORE_TOOLS (available at runtime if enabled),
# but the setup checklist won't pre-select them for first-time users.
-_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin"}
+_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin", "video"}
# Platform-scoped toolsets: only appear in the `hermes tools` checklist for
# these platforms, and only resolve/save for these platforms. A toolset
+13
@@ -470,10 +470,23 @@ except (ValueError, TypeError):
)
_GATEWAY_HEALTH_TIMEOUT = 3.0
# DEPRECATED (scheduled for removal): GATEWAY_HEALTH_URL / GATEWAY_HEALTH_TIMEOUT.
# Cross-container / cross-host gateway liveness detection will be folded into a
# first-class dashboard config key so it's no longer Docker-adjacent lore buried
# in env vars. The env vars still work for now so existing Compose deployments
# don't break. Do not add new callers — wire new uses through the planned
# config surface.
def _probe_gateway_health() -> tuple[bool, dict | None]:
"""Probe the gateway via its HTTP health endpoint (cross-container).
.. deprecated::
Driven by the deprecated ``GATEWAY_HEALTH_URL`` /
``GATEWAY_HEALTH_TIMEOUT`` env vars. Scheduled for removal alongside
a move to a first-class dashboard config key. See
:data:`_GATEWAY_HEALTH_URL` for context.
Uses ``/health/detailed`` first (returns full state), falling back to
the simpler ``/health`` endpoint. Returns ``(is_alive, body_dict)``.
@@ -0,0 +1,206 @@
---
name: kanban-video-orchestrator
description: Plan, set up, and monitor a multi-agent video production pipeline backed by Hermes Kanban. Use when the user wants to make ANY video — narrative film, product/marketing, music video, explainer, ASCII/terminal art, abstract/generative loop, comic, 3D, real-time/installation — and the work warrants decomposition into specialized profiles (writer, designer, animator, renderer, voice, editor, etc.) coordinated through a kanban board. Performs adaptive discovery to scope the brief, designs an appropriate team for the requested style, generates the setup script that creates Hermes profiles + initial kanban task, then helps monitor execution and intervene when tasks stall or fail. Routes scenes to whichever Hermes rendering / audio / design skill fits each beat (`ascii-video`, `manim-video`, `p5js`, `comfyui`, `touchdesigner-mcp`, `blender-mcp`, `pixel-art`, `baoyu-comic`, `claude-design`, `excalidraw`, `songsee`, `heartmula`, …) plus external APIs for TTS, image-gen, and image-to-video as needed.
version: 1.0.0
author: [SHL0MS, alt-glitch]
license: MIT
metadata:
hermes:
tags: [video, kanban, multi-agent, orchestration, production-pipeline]
related_skills: [kanban-orchestrator, kanban-worker, ascii-video, manim-video, p5js, comfyui, touchdesigner-mcp, blender-mcp, pixel-art, ascii-art, songwriting-and-ai-music, heartmula, songsee, spotify, youtube-content, claude-design, excalidraw, architecture-diagram, concept-diagrams, baoyu-comic, baoyu-infographic, humanizer, gif-search, meme-generation]
credits: |
The single-project workspace layout, profile-config patching pattern,
SOUL.md-per-profile model, TEAM.md task-graph convention, and
`--workspace dir:<path>` discipline are adapted from alt-glitch's
original multi-agent video pipeline at
https://github.com/NousResearch/kanban-video-pipeline.
---
# Kanban Video Orchestrator
Wrap any video request — from a 15-second product teaser to a 5-minute narrative
short to a music video to an ASCII loop — in a Hermes Kanban pipeline that
decomposes the work to specialized agent profiles.
This skill does **not** render anything itself. It is a meta-pipeline that:
1. **Scopes** the request through targeted discovery
2. **Designs** an appropriate team (which roles, which tools per role) based on the style
3. **Generates** a setup script that creates Hermes profiles, project workspace, and the initial kanban task
4. **Hands off** to the director profile, which decomposes via the kanban
5. **Monitors** execution, helps intervene when tasks stall or fail
The actual rendering happens inside the kanban once it's running, via whichever
existing skills + tools fit the scenes — `ascii-video`, `manim-video`, `p5js`,
`comfyui`, `touchdesigner-mcp`, `blender-mcp`, `songwriting-and-ai-music`,
`heartmula`, external APIs, or plain Python with PIL + ffmpeg.
## When NOT to use this skill
- The video is one continuous procedural project that needs no specialists. Just write the code directly.
- The user wants a quick one-shot conversion (e.g. "convert this mp4 to a GIF") — use ffmpeg directly.
- The output is a static image, GIF, or audio-only artifact — use the matching specific skill (`ascii-art`, `gifs`, `meme-generation`, `songwriting-and-ai-music`).
- The work fits a single existing skill cleanly (e.g. a pure ASCII video — just use `ascii-video`).
## Workflow
```
DISCOVER → BRIEF → TEAM DESIGN → SETUP → EXECUTE → MONITOR
```
### Step 1 — Discover (ask the right questions)
The discovery process is **adaptive**: ask only what is actually needed. Always
start with three questions to identify the broad shape:
- **What is the video?** (one-sentence brief)
- **How long?** (5-30s teaser / 30-90s short / 90s-3min explainer / 3-10min film / longer)
- **What aspect ratio + target platform?** (1:1 / 9:16 / 16:9; X, IG, YouTube, internal, etc.)
From the answers, classify the style category. The style determines which
follow-up questions to ask. **Do not ask all questions at once.** Ask 2-4 at a
time, listen, then proceed. Make reasonable assumptions whenever the user
implies an answer.
For complete intake patterns and per-style question banks, see
**[references/intake.md](references/intake.md)**.
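
As an illustration of the "How long?" question, the length buckets listed above could be encoded as a small lookup. This is only a sketch: `classify_duration` is a hypothetical helper, and because the stated ranges share their endpoints (30s, 90s, 3min), treating each upper bound as inclusive is an assumption.

```python
def classify_duration(seconds: float) -> str:
    """Map a requested runtime to one of the Step 1 length buckets.

    Upper bounds are treated as inclusive (an assumption; the source
    ranges overlap at their endpoints). Anything under 5 seconds is
    still reported as a teaser.
    """
    if seconds <= 0:
        raise ValueError("duration must be positive")
    buckets = [
        (30, "teaser"),      # 5-30s
        (90, "short"),       # 30-90s
        (180, "explainer"),  # 90s-3min
        (600, "film"),       # 3-10min
    ]
    for upper_bound, label in buckets:
        if seconds <= upper_bound:
            return label
    return "longer"
```

The bucket only narrows which follow-up questions to ask; it does not replace the style classification.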
### Step 2 — Brief
Once enough is known, produce a structured `brief.md` using the template in
`assets/brief.md.tmpl`. Sections:
1. **Concept** — the one-sentence pitch + emotional north star
2. **Scope** — duration, aspect, platform, deadline
3. **Style** — visual references, brand constraints, tone
4. **Scenes** — beat-by-beat breakdown (durations, content, target tool)
5. **Audio** — narration / music / SFX / silent (per scene if needed)
6. **Deliverables** — file format, resolution, optional alternates (vertical cut, GIF, etc.)
Show the brief to the user for confirmation before designing the team. **The
brief is the contract** — every downstream task references it.
### Step 3 — Team design
Pick role archetypes from the library that fit this video. **Compose, don't
clone.** Most videos need 4-7 profiles. The director is always present; the
rest are picked by what the brief actually requires.
For the role library and per-style team compositions, see
**[references/role-archetypes.md](references/role-archetypes.md)**.
For mapping role → which Hermes skills + toolsets it loads, see
**[references/tool-matrix.md](references/tool-matrix.md)**.
### Step 4 — Setup
Generate a setup script (`setup.sh`) and run it. The script:
1. Creates the project workspace (`~/projects/video-pipeline/<slug>/`)
2. Copies any provided assets into `taste/`, `audio/`, `assets/`
3. Creates each Hermes profile via `hermes profile create --clone`
4. Writes per-profile `SOUL.md` (personality + role definition)
5. Configures profile YAML (toolsets, always_load skills, cwd)
6. Writes `brief.md`, `TEAM.md`, and `taste/` content
7. Fires the initial `hermes kanban create` task assigned to the director
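
Item 1 of the list above (creating the project workspace) can be sketched in Python as follows. The subdirectory names mirror the `mkdir -p` line in `assets/setup.sh.tmpl`; the function name `workspace_dirs` is hypothetical, and per-scene directories (the template's `{{SCENE_DIRS}}` placeholder) are left out.

```python
from pathlib import Path
from typing import List

# Subdirectories taken from the mkdir line in setup.sh.tmpl.
SUBDIRS = (
    "taste",
    "audio/voiceover",
    "audio/sfx",
    "assets",
    "scenes",
    "checkpoints",
    "tools",
    "output",
)


def workspace_dirs(base: Path, slug: str) -> List[Path]:
    """Return every directory the setup step would create for a project."""
    root = base / slug
    return [root / sub for sub in SUBDIRS]


def create_workspace(base: Path, slug: str) -> Path:
    """Create the layout on disk; safe to call more than once."""
    for directory in workspace_dirs(base, slug):
        directory.mkdir(parents=True, exist_ok=True)
    return base / slug
```

Calling `create_workspace(Path.home() / "projects" / "video-pipeline", "my-slug")` would match the path shown in item 1.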
Use `scripts/bootstrap_pipeline.py` to generate setup.sh from a brief +
team-design JSON. See **[references/kanban-setup.md](references/kanban-setup.md)**
for the setup script structure, profile config patterns, and the critical
"shared workspace" rule.
### Step 5 — Execute
Run `setup.sh`. Then provide the user with monitoring commands:
```bash
hermes kanban watch --tenant <project-tenant> # live events
hermes kanban list --tenant <project-tenant> # board snapshot
hermes dashboard # visual board UI
```
The director profile takes over from here, decomposing the work and routing
tasks to specialist profiles via the kanban toolset.
### Step 6 — Monitor and intervene
Stay engaged — the kanban runs autonomously but a stuck task or bad output
needs human (or AI) judgment.
Monitoring patterns: poll `kanban list` periodically, inspect any RUNNING task
that exceeds its expected duration with `kanban show <id>`, and check
heartbeats. When a worker's output fails review, the standard interventions are:
1. Comment on the worker's task with specific feedback (`kanban_comment`)
2. Create a re-run task with the original as parent
3. Adjust the brief's scope and let the director re-decompose
For diagnostic patterns, intervention recipes, and the "task is stuck"
playbook, see **[references/monitoring.md](references/monitoring.md)**.
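
The "RUNNING task that exceeds its expected duration" check could look roughly like the sketch below. The task dictionary shape (`id`, `status`, `started_at`, `expected_s`) and the `grace` multiplier are assumptions made for illustration; the real output format of `hermes kanban list` is not shown here and may differ.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def find_stalled(tasks: List[Dict], now: datetime, grace: float = 1.5) -> List[str]:
    """Return ids of RUNNING tasks that have outlived their expected duration.

    Each task is a hypothetical dict with keys:
      id (str), status (str), started_at (datetime), expected_s (number).
    `grace` scales the expected duration before a task counts as stalled.
    """
    stalled = []
    for task in tasks:
        if task.get("status") != "RUNNING":
            continue
        limit = timedelta(seconds=task["expected_s"] * grace)
        if now - task["started_at"] > limit:
            stalled.append(task["id"])
    return stalled
```

Any id returned here would be a candidate for `kanban show <id>` and one of the three interventions listed above.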
## Reference: worked examples
Six concrete pipelines covering very different video styles — narrative film,
product/marketing, music video, math/algorithm explainer, ASCII video, real-time
installation — showing how the same workflow yields very different teams and
task graphs. See **[references/examples.md](references/examples.md)**.
## Critical rules
1. **Discovery before action.** Never start generating a brief or team without
asking at least the three baseline questions. A bad brief cascades through
the entire pipeline.
2. **Match the team to the video.** Don't reuse the same 4-profile setup for
every job. A music video that doesn't have a beat-analysis profile will
misfire. A narrative film that doesn't have a writer profile will produce
incoherent scenes. See `references/role-archetypes.md`.
3. **One workspace per project.** All profiles for a given video share the same
`dir:` workspace. Tasks pass artifacts via shared filesystem and structured
handoffs. **Every** `kanban_create` call passes
`workspace_kind="dir"` + `workspace_path="<absolute project path>"`.
4. **Tenant every project.** Use a project-specific tenant
(`--tenant <project-slug>`). Keeps the dashboard scoped and prevents
cross-pollination with other ongoing kanbans.
5. **Respect existing skills.** When a scene fits an existing skill, the
relevant renderer should load that skill via `--skill <name>` on its task
or `always_load` in its profile. Do not re-derive what a skill already
provides.
6. **The director never executes.** Even with the full `kanban + terminal +
file` toolset, the director's `SOUL.md` rules forbid it from executing
work itself. It decomposes and routes only — every concrete task becomes
a `hermes kanban create` call to a specialist profile. The
`kanban-orchestrator` skill spells this out further.
7. **Don't over-decompose.** A 30-second product video does NOT need 20 tasks.
Aim for the smallest task graph that still parallelizes well and exposes the
right human-review gates.
8. **Verify API keys BEFORE firing.** External APIs (TTS, image-gen,
image-to-video) need keys in `~/.hermes/.env` or the user's secret store.
A worker that hits a missing-key error wastes a task slot. The setup
script's `check_key` helper aborts cleanly if a required key is missing.
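
For rule 8, the `.env` half of the key check can be sketched in Python. This mirrors the `grep "^VAR="` plus non-empty-value test that `check_key` performs in `assets/setup.sh.tmpl`; the macOS Keychain fallback is omitted, and the names `env_has_key` and `missing_keys` are hypothetical.

```python
from typing import Iterable, List


def env_has_key(env_text: str, var: str) -> bool:
    """True if `var` is assigned a non-empty value in dotenv-style text.

    Only lines that start with `VAR=` count, so commented-out entries
    such as `# VAR=...` are ignored.
    """
    prefix = f"{var}="
    for line in env_text.splitlines():
        if line.startswith(prefix) and line[len(prefix):] != "":
            return True
    return False


def missing_keys(env_text: str, required: Iterable[str]) -> List[str]:
    """Return the required variables that are absent or empty."""
    return [var for var in required if not env_has_key(env_text, var)]
```

A setup step could read `~/.hermes/.env`, call `missing_keys`, and abort before any kanban task is created if the list is non-empty.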
## File map
```
SKILL.md ← this file (workflow + rules)
references/
intake.md ← discovery question banks per style
role-archetypes.md ← role library (writer, designer, animator, …)
tool-matrix.md ← skill + toolset mapping per role
kanban-setup.md ← setup script structure & profile config
monitoring.md ← watch + intervene patterns
examples.md ← six worked pipelines
assets/
brief.md.tmpl ← brief skeleton
setup.sh.tmpl ← setup script skeleton
soul.md.tmpl ← profile personality skeleton
scripts/
bootstrap_pipeline.py ← generate setup.sh from brief + team JSON
monitor.py ← polling + intervention helpers
```
@@ -0,0 +1,79 @@
# Video Brief — {{TITLE}}
> Slug: `{{SLUG}}` · Tenant: `{{TENANT}}` · Project workspace: `{{WORKSPACE}}`
## 1. Concept
**One-line pitch.** {{ONE_LINE_PITCH}}
**Emotional north star.** {{EMOTIONAL_NORTH_STAR}}
*(What should the viewer feel walking away?)*
## 2. Scope
| | |
|---|---|
| Duration | {{DURATION_S}} seconds |
| Aspect ratio | {{ASPECT}} |
| Resolution | {{RESOLUTION}} |
| Frame rate | {{FPS}} fps |
| Target platforms | {{PLATFORMS}} |
| Deadline | {{DEADLINE}} |
| Quality bar | {{QUALITY_BAR}} *(rough draft / polished / archival)* |
## 3. Style
**Visual references.** {{VISUAL_REFS}}
**Tone.** {{TONE}}
**Brand constraints.** {{BRAND_CONSTRAINTS}}
*(colors, typography, motion language; or "n/a")*
**Aesthetic rules.**
{{AESTHETIC_RULES}}
## 4. Scenes
Beat-by-beat breakdown. Each scene gets a row.
| # | Time | Content | Target tool / skill | Audio | Notes |
|---|------|---------|---------------------|-------|-------|
| 1 | 0:00-0:0X | {{SCENE_1_CONTENT}} | {{SCENE_1_TOOL}} | {{SCENE_1_AUDIO}} | {{SCENE_1_NOTES}} |
| 2 | 0:0X-0:0Y | ... | ... | ... | ... |
## 5. Audio
**Approach.** {{AUDIO_APPROACH}}
*(narration / music-only / synced to track / silent / mixed)*
**Voiceover.** {{VO_DETAILS}}
*(provider, voice, language, script source — "n/a" if no VO)*
**Music.** {{MUSIC_DETAILS}}
*(provided track path / commission via Suno / commission via heartmula /
license-free / "n/a")*
**SFX.** {{SFX_DETAILS}}
*(generated, library, or "n/a")*
## 6. Deliverables
| Format | Resolution | Notes |
|--------|-----------|-------|
| {{PRIMARY_FORMAT}} | {{PRIMARY_RES}} | The main output |
| {{ALT_FORMAT_1}} | {{ALT_RES_1}} | {{ALT_NOTES_1}} |
**Final filename.** `output/final.mp4`
*(plus optional `output/final-9x16.mp4`, `output/captions.srt`, etc.)*
## 7. Constraints
- API keys required: {{API_KEYS_REQUIRED}}
- External dependencies: {{EXT_DEPS}}
- Source assets to incorporate: {{SOURCE_ASSETS}}
---
**This brief is the contract. The director and every downstream profile read
it. If the brief changes, the kanban must be re-fired — don't edit live.**
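
The `{{NAME}}` placeholders in this template could be filled with a small substitution pass like the one below. This is a sketch under assumptions: `render_template` is a hypothetical function, not the actual logic of `scripts/bootstrap_pipeline.py`, and leaving unknown placeholders untouched while reporting them is a design choice made here so that gaps in the brief stay visible.

```python
import re
from typing import Dict, List, Tuple

# Matches {{UPPER_SNAKE_CASE}} placeholders as used in the brief template.
_PLACEHOLDER = re.compile(r"\{\{([A-Z0-9_]+)\}\}")


def render_template(text: str, values: Dict[str, str]) -> Tuple[str, List[str]]:
    """Substitute known placeholders; return the text and any missing names."""
    missing: List[str] = []

    def _substitute(match: "re.Match[str]") -> str:
        key = match.group(1)
        if key in values:
            return values[key]
        if key not in missing:
            missing.append(key)
        return match.group(0)  # leave the placeholder in place

    return _PLACEHOLDER.sub(_substitute, text), missing
```

A caller could refuse to write `brief.md` while the returned list of missing names is non-empty.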
@@ -0,0 +1,185 @@
#!/usr/bin/env bash
# ═══════════════════════════════════════════════════════════════════════
# Video Pipeline Setup — {{TITLE}}
#
# Generated by kanban-video-orchestrator skill.
#
# Slug: {{SLUG}}
# Workspace: {{WORKSPACE}}
# Tenant: {{TENANT}}
# ═══════════════════════════════════════════════════════════════════════
set -euo pipefail
PROJECT_SLUG="{{SLUG}}"
WORKSPACE="$HOME/projects/video-pipeline/${PROJECT_SLUG}"
TENANT="{{TENANT}}"
# ─────────────────────────────────────────────────────────────────────
# 1. Verify required API keys
# ─────────────────────────────────────────────────────────────────────
echo "═══ Checking required API keys ═══"
check_key() {
local var="$1"
local kc_account="${2:-hermes}"
local kc_service="${3:-$1}"
if grep -q "^${var}=" "$HOME/.hermes/.env" 2>/dev/null && \
[ -n "$(grep "^${var}=" "$HOME/.hermes/.env" | cut -d= -f2-)" ]; then
echo " ✓ ${var} (env)"
return 0
fi
if command -v security >/dev/null 2>&1 && \
security find-generic-password -a "${kc_account}" -s "${kc_service}" -w >/dev/null 2>&1; then
echo " ✓ ${var} (Keychain ${kc_account}/${kc_service})"
return 0
fi
echo " ✗ ${var} not set in ~/.hermes/.env or Keychain (${kc_account}/${kc_service})"
return 1
}
# Customize this list per project — only check keys actually used:
{{KEY_CHECKS}}
# ─────────────────────────────────────────────────────────────────────
# 2. Create project workspace
# ─────────────────────────────────────────────────────────────────────
echo "═══ Creating project workspace ═══"
mkdir -p "$WORKSPACE"/{taste,audio/{voiceover,sfx},assets,scenes,checkpoints,tools,output}
{{SCENE_DIRS}}
echo " ✓ $WORKSPACE"
# ─────────────────────────────────────────────────────────────────────
# 3. Create Hermes profiles
# ─────────────────────────────────────────────────────────────────────
echo "═══ Creating Hermes profiles ═══"
{{PROFILE_CREATE_COMMANDS}}
# ─────────────────────────────────────────────────────────────────────
# 4. Configure profiles (toolsets, skills, cwd)
# ─────────────────────────────────────────────────────────────────────
echo "═══ Configuring profiles ═══"
configure_profile() {
  local profile="$1"
  local toolsets_json="$2"  # JSON array string, e.g. '["kanban","terminal","file"]'
  local skills_json="$3"    # JSON array string, e.g. '["kanban-worker","ascii-video"]'
  # Run the patcher inside `if !` so that `set -e` does not abort the script
  # before the failure message below can be printed.
  if ! python3 - "$profile" "$toolsets_json" "$skills_json" "$WORKSPACE" <<'PY'
"""Patch a Hermes profile config.yaml using PyYAML so we don't depend on the
exact default-config string format. Validates the patch took effect and exits
non-zero if anything's off."""
import json
import os
import sys
try:
    import yaml
except ImportError:
    print("ERROR: PyYAML required. pip install pyyaml", file=sys.stderr)
    sys.exit(1)
profile, toolsets_json, skills_json, workspace = sys.argv[1:5]
toolsets = json.loads(toolsets_json)
skills = json.loads(skills_json)
p = os.path.expanduser(f"~/.hermes/profiles/{profile}/config.yaml")
if not os.path.exists(p):
    print(f" ✗ profile config not found: {p}", file=sys.stderr)
    sys.exit(1)
with open(p) as f:
    cfg = yaml.safe_load(f) or {}
# Apply our changes — only the keys we actually want to set.
cfg["toolsets"] = toolsets
cfg.setdefault("skills", {})
cfg["skills"]["always_load"] = skills
# Note: we do NOT touch cfg["approvals"] — that's a security-sensitive
# setting (manual confirmation of tool calls). Workspace cwd is overridden
# per-task by `--workspace dir:<path>` on `hermes kanban create`, so we
# don't need to mutate cfg["terminal"]["cwd"] either.
with open(p, "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)
# Validate by re-reading the file and comparing against what we wrote.
with open(p) as f:
    after = yaml.safe_load(f)
errors = []
if after.get("toolsets") != toolsets:
    errors.append(f"toolsets mismatch: {after.get('toolsets')!r}")
if after.get("skills", {}).get("always_load") != skills:
    errors.append(f"skills.always_load mismatch: {after.get('skills', {}).get('always_load')!r}")
if errors:
    print(f" ✗ {profile}: " + "; ".join(errors), file=sys.stderr)
    sys.exit(1)
PY
  then
    echo " ✗ failed to configure ${profile}" >&2
    exit 1
  fi
  echo " ✓ ${profile}"
}
{{PROFILE_CONFIG_COMMANDS}}
# ─────────────────────────────────────────────────────────────────────
# 5. Write SOUL.md per profile
# ─────────────────────────────────────────────────────────────────────
echo "═══ Writing profile personalities ═══"
{{SOUL_WRITES}}
# ─────────────────────────────────────────────────────────────────────
# 6. Copy brief, TEAM.md, and any provided assets
# ─────────────────────────────────────────────────────────────────────
echo "═══ Writing brief + taste ═══"
cat > "$WORKSPACE/brief.md" <<'BRIEF_EOF'
{{BRIEF_CONTENTS}}
BRIEF_EOF
cat > "$WORKSPACE/TEAM.md" <<'TEAM_EOF'
{{TEAM_CONTENTS}}
TEAM_EOF
{{TASTE_WRITES}}
{{ASSET_COPIES}}
# ─────────────────────────────────────────────────────────────────────
# 7. Fire the initial kanban task
# ─────────────────────────────────────────────────────────────────────
echo "═══ Firing initial kanban task ═══"
hermes kanban create "Direct production of {{TITLE}}" \
--assignee director \
--workspace dir:"$WORKSPACE" \
--tenant "$TENANT" \
--priority 2 \
--max-runtime 4h \
--body "$(cat <<EOF
Read brief.md, TEAM.md, and taste/.
Decompose into the team graph defined in TEAM.md.
All child tasks MUST use:
workspace_kind="dir"
workspace_path="$WORKSPACE"
tenant="$TENANT"
Do not execute the work yourself — route every concrete subtask to the
appropriate profile via kanban_create.
EOF
)"
echo ""
echo "═══ Setup complete ═══"
echo ""
echo "Monitor with:"
echo " hermes kanban watch --tenant $TENANT"
echo " hermes kanban list --tenant $TENANT"
echo " hermes dashboard"
echo ""
echo "Workspace: $WORKSPACE"
@@ -0,0 +1,38 @@
# {{ROLE_NAME}}
You are the **{{ROLE_NAME}}** for this video production.
## Project context
- **Brief:** read `brief.md` in your CWD
- **Team graph:** read `TEAM.md` in your CWD
- **Style spec:** read `taste/brand-guide.md` and `taste/emotional-dna.md` in
your CWD
## What you do
{{ROLE_RESPONSIBILITIES}}
## Inputs you read
{{INPUTS_READ}}
## Outputs you produce
{{OUTPUTS_PRODUCED}}
## Tools and skills available
- **Toolsets:** {{TOOLSETS}}
- **Skills loaded:** {{SKILLS}}
- **External APIs / CLIs:** {{EXTERNAL_TOOLS}}
## Rules
{{ROLE_RULES}}
{{COMMON_RULES}}
## Common reference commands
{{COMMON_COMMANDS}}
@@ -0,0 +1,227 @@
# Worked Examples
Six concrete pipelines covering different video styles. Each shows the team
composition, task graph, and skill/tool choices the orchestrator would make
for that brief. **These are illustrative, not templates** — adapt to the
actual brief.
## Example 1 — Narrative short film (text-to-image → image-to-video → cut)
**Brief:** A 90-second noir-style short. A detective walks through a rainy
city. Voiceover narration. AI-generated visuals.
**Team:**
- `director` — vision, decomposition, approval
- `writer` — script + voiceover copy (loads `humanizer` for natural voice)
- `storyboarder` — beat-by-beat shot list (loads `excalidraw`)
- `image-generator` — generates each shot's still via local ComfyUI workflows
(loads `comfyui`)
- `image-to-video-generator` — animates each still (Runway/Kling, OR
ComfyUI's AnimateDiff/WAN workflows via `comfyui`)
- `voice-talent` — narration via ElevenLabs
- `audio-mixer` — VO + ambient pad
- `editor` — assembly + transitions
- `reviewer` — final QA
**Task graph:**
```
T0 director decompose
T1 writer script + voiceover.md (parent: T0)
T2 storyboarder shot list with framing per beat (parent: T1)
T3 image-generator one still per shot (~12 shots) (parent: T2)
T4 image-to-video animate each still (parent: T3)
T5 voice-talent generate narration audio (parent: T1)
T6 audio-mixer mix VO + ambient (parent: T5)
T7 editor cut + transitions + audio mux (parents: T4, T6)
T8 reviewer final QA (parent: T7)
```
**Key choices:**
- Local ComfyUI via `comfyui` skill is preferred over external API for
cost/control — but external APIs are fine if ComfyUI isn't installed
- `editor` profile is ffmpeg-only, no Hermes skill required beyond
`kanban-worker`
- Storyboarder produces `storyboard.excalidraw` alongside the markdown
## Example 2 — Product / marketing teaser
**Brief:** A 30-second product teaser for a developer tool. Shows code +
terminal + UI screen recordings, voiceover, CTA at end. Square 1:1.
**Team:**
- `director`
- `copywriter` — taglines, voiceover script, CTA (loads `humanizer`)
- `concept-artist` — style frames (loads `claude-design` for UI mockups)
- `renderer-motion-graphics` — animated UI sequences (Remotion CLI)
- `renderer-ascii` — terminal-style demo scenes (loads `ascii-video`)
- `voice-talent` — VO via ElevenLabs
- `editor` — assembly + brand-color treatment
- `audio-mixer` — VO + light music bed
- `captioner` — burned subtitles for muted-autoplay platforms
- `masterer` — produces 1:1 + 9:16 + 16:9 variants
**Task graph:**
```
T0 director decompose
T1 copywriter copy.md + cta + vo script (parent: T0)
T2 concept-artist visual-spec.md + style frames (parent: T1)
T3a renderer-motion-graphics scene 1: UI sequence (parent: T2)
T3b renderer-ascii scene 2: terminal demo (parent: T2)
T3c renderer-motion-graphics scene 3: feature highlight (parent: T2)
T3d renderer-motion-graphics scene 4: CTA card (parent: T2)
T4 voice-talent narration (parent: T1)
T5 audio-mixer VO + music bed (parent: T4)
T6 editor cut + transitions (parents: T3*, T5)
T7 captioner SRT + burned subtitles (parent: T6)
T8 masterer 1:1, 9:16, 16:9 variants (parent: T7)
```
**Key choices:**
- Multiple specialized renderers (motion-graphics + ASCII) coexist
- Captioner is included because muted autoplay is the norm on social
- `claude-design` skill for UI mockups maps directly to the product video idiom
## Example 3 — Music video (synced to provided track)
**Brief:** A 3-minute music video for a provided lo-fi hip-hop track. Visuals
should pulse with the beat. Generative + ASCII hybrid. Vertical 9:16.
**Team:**
- `director`
- `music-supervisor` — analyze track, emit `audio/beats.json` (loads `songsee`)
- `storyboarder` — beat-aligned shot list (loads `excalidraw`)
- `renderer-ascii` — ASCII scenes synced to bass kicks (loads `ascii-video`)
- `renderer-p5js` — generative particle scenes synced to highs (loads `p5js`)
- `editor` — beat-cut assembly using `beats.json`
- `reviewer` — sync QA
**Task graph:**
```
T0 director decompose
T1 music-supervisor analyze track → beats.json + spectrogram (parent: T0)
T2 storyboarder shot list aligned to beats (parents: T1, T0)
T3a renderer-ascii scene 1: bass-driven ASCII (parent: T2)
T3b renderer-p5js scene 2: high-end particle field (parent: T2)
... (more scenes)
T4 editor cut to beats + mux track (parents: T3*, T1)
T5 reviewer sync QA + final approval (parent: T4)
```
**Key choices:**
- `music-supervisor` runs FIRST — `beats.json` gates the renderers
- `editor` uses `beats.json` directly to align cuts to bass kicks
- No voice-talent — music is the audio
- Two specialized renderers (`ascii-video` + `p5js`) for visual variety
## Example 4 — Math/algorithm explainer
**Brief:** A 2-minute explainer of an algorithm. 3Blue1Brown-style. Animated
diagrams, equations, narration. Square 1:1.
**Team:**
- `director`
- `writer` — narration script (loads `humanizer`)
- `cinematographer` — visual spec (loads `manim-video`)
- `renderer-manim` — all animated scenes (loads `manim-video`)
- `voice-talent` — narration via ElevenLabs
- `editor` — assembly + audio mux
- `captioner` — burned subtitles
**Task graph:**
```
T0 director decompose
T1 writer script + narration (parent: T0)
T2 cinematographer visual spec for all scenes (parent: T1)
T3a-Tn renderer-manim scenes 1..N (parent: T2)
T4 voice-talent narration audio (parent: T1)
T5 editor cut + mux (parents: T3*, T4)
T6 captioner SRT + burn (parent: T5)
```
**Key choices:**
- `manim-video` skill drives both the cinematographer (visual language) and
the renderer (actual scene production)
- The `manim-video` skill's reference docs (animation-design-thinking,
scene-planning, equations) auto-load when needed via the renderer's pinned skill
## Example 5 — ASCII video, music-track-only
**Brief:** A 60-second pure-ASCII video reactive to an existing track. No
voiceover, no other tools. Square 1:1.
**Team:**
- `director`
- `music-supervisor` — track analysis (loads `songsee`)
- `renderer-ascii` — all visuals (loads `ascii-video`)
- `editor` — assembly + audio mux
**Task graph:**
```
T0 director decompose
T1 music-supervisor analyze track (parent: T0)
T2a renderer-ascii scene 1 (parents: T1, T0)
T2b renderer-ascii scene 2 (parents: T1, T0)
T2c renderer-ascii scene 3 (parents: T1, T0)
T3 editor stitch + mux audio (parents: T2*)
```
**Key choices:**
- Minimal team (4 profiles) for a focused single-tool project
- No reviewer — short experimental piece, director approves directly
- All scenes run through one `renderer-ascii` profile because the `ascii-video`
skill covers everything
This example illustrates the rule: **don't over-decompose**. Three scenes
through one renderer is fine. Don't spawn three renderer profiles.
## Example 6 — Real-time / installation art
**Brief:** A 2-minute audio-reactive visual for a gallery installation. Driven
by an audio input feed. TouchDesigner-based. 16:9 4K.
**Team:**
- `director`
- `cinematographer` — visual language spec (loads `touchdesigner-mcp`)
- `renderer-touchdesigner` — all visuals + record-to-disk
(loads `touchdesigner-mcp`)
- `audio-mixer` — final loudness pass on the captured audio (optional if
pre-mixed source)
- `editor` — assemble final clip from TouchDesigner recording
- `reviewer` — visual QA
**Task graph:**
```
T0 director decompose
T1 cinematographer TD operator graph spec (parent: T0)
T2 renderer-touchdesigner build TD network + record output (parent: T1)
T3 editor trim + audio mux (parent: T2)
T4 reviewer final QA (parent: T3)
```
**Key choices:**
- `touchdesigner-mcp` controls a running TouchDesigner instance — the
cinematographer designs the operator graph, renderer builds it
- Output is a recording from the running TD network, not a render-to-frames
process; editor mostly just trims
## Pattern recognition
When the user describes a video, look for these signals to map to an example:
- **Plot, characters, scripted dialogue** → Example 1 (narrative)
- **Specific product, CTA, brand colors, voiceover** → Example 2 (marketing)
- **Track file provided, "synced to music"** → Example 3 (music video)
- **"Explain how X works", math/algorithm/concept walkthrough** → Example 4 (manim explainer)
- **Terminal aesthetic, ASCII, retro pixel** → Example 5 (ASCII)
- **"Audio-reactive", "real-time", "installation"** → Example 6 (TouchDesigner)
- **Comic-style narrative** → use `renderer-comic` (`baoyu-comic` skill)
- **Retro game / pixel-art aesthetic** → use `renderer-pixel` (`pixel-art` skill)
- **3D scene, photoreal environment** → use `renderer-3d` (`blender-mcp`)
- **Generative art, particle system, shader** → use `renderer-p5js` (`p5js`)
- **AI-generated photoreal stills + animation** → use `renderer-comfyui`
(`comfyui`) for both stills and image-to-video
- **"video about how the system works", recursive demo** → composable from
any of the above; the recursion is a rendering technique, not a style
The actual team should be derived from the specific brief — these examples are
starting points, not endpoints.
@@ -0,0 +1,166 @@
# Intake — Discovery Question Banks
The discovery process is **adaptive**. Always start with three baseline
questions to identify the broad style category, then drill into a per-style
question bank. Ask 2-4 questions at a time, listen, then proceed. Make
reasonable assumptions whenever the user implies an answer.
## Tier 0 — Baseline (always ask)
1. **What is the video?** — One-sentence pitch
2. **How long?** — Approximate duration
3. **Aspect ratio + target platform?** — 16:9 / 9:16 / 1:1 / 4:5; X, IG, YouTube, internal, etc.
From these answers, classify the style category and pick the relevant Tier 1
follow-ups. **Do not** move on to Tier 1 until all three are answered.
## Style classification
Map the brief to one of these archetypes (or a hybrid):
| Archetype | Tells |
|-----------|-------|
| **Narrative film** | Plot, characters, scenes-with-events, dialogue, location |
| **Product / marketing** | A specific product or feature being shown / sold; CTA at end |
| **Music video** | A specific track exists; visuals sync to music |
| **Explainer / educational** | A concept being taught; voiceover-driven |
| **Tutorial / changelog** | Software demo, terminal-heavy, technical |
| **ASCII / terminal art** | Retro terminal aesthetic explicit, character-grid |
| **Abstract / loop** | Generative, no plot, often perfect-loop |
| **Documentary / interview cut** | Real footage, transcription-driven |
| **Real-time / installation** | Audio-reactive, gallery installation, VJ output |
If ambiguous, **ask** which category fits — don't guess. Hybrids are common
(e.g., a product video with a narrative arc); decompose into the dominant
mode + secondary modifiers.
**Recursive / meta** ("a video that shows its own production") is a
*rendering technique*, not a separate style — compose it from any of the
above by adding a two-pass render step where pass 2 uses pass 1's output as
texture inside the final scene.
## Tier 1 — Per-style follow-ups
### Narrative film
- **Setting / world?** — When and where the story takes place
- **Characters?** — How many, archetypes, who carries dialogue
- **Beat list or full script?** — Has the user written the story or do we draft it
- **Dialogue language?** — Spoken lines, on-screen subs only, silent
- **Visual generation approach?** — Text-to-image (FAL/Midjourney/Imagen) →
image-to-video (Runway/Kling), 3D animation (Blender), 2D animation,
procedural, or hybrid
- **Voice approach?** — TTS (which voice), recorded VO, no dialogue
- **Music / score?** — Commissioned (via `songwriting-and-ai-music` Suno
prompts, or local `heartmula`), licensed track provided, silent
### Product / marketing
- **Product?** — Name, what it does, key feature being shown
- **Target audience?** — Who's watching, what they care about
- **CTA?** — Visit URL, install, sign up, etc.
- **Tone?** — Serious, playful, technical, premium, edgy
- **Brand assets available?** — Logo files, color palette, fonts, existing footage
- **Animation style?** — Motion graphics (Remotion / AE-style), screen recording,
generative, illustrated
- **Voiceover?** — Yes (which voice / language) or text-only
- **Music?** — Track provided, license-free needed, custom-composed
### Music video
- **Track file?** — Path to the audio (essential — we'll analyze BPM + beats)
- **Track length to use?** — Full song or a section
- **Genre / energy?** — Tells what visual rhythm and density to use
- **Lyric / narrative content?** — Are there lyrics to render on screen,
or is it purely visual?
- **Visual reference style?** — Existing music videos / artists for reference
- **Performer footage?** — None, has clips, will provide
- **Visual generation approach?** — Per-beat generative, edit-driven cuts of stock
footage, illustrated, hybrid
### Explainer / educational
- **What concept is being taught?** — One-sentence concept, key takeaway
- **Audience expertise?** — Beginner / intermediate / expert
- **Diagram density?** — Heavy math / formulas / code / abstract concepts
- **Voiceover?** — TTS / recorded / on-screen text only
- **Tool preference?** — `manim-video` (math), `p5js` (generative),
Remotion (UI motion graphics), `comfyui` (AI-generated visuals),
`ascii-video` (technical/retro), hybrid
- **Pacing?** — Fast and dense (3Blue1Brown) or slow and contemplative
### Tutorial / changelog / software demo
- **Software being demonstrated?** — Name, what it does
- **Demo script?** — Sequence of commands / screens to show
- **Terminal-only or with GUI?**
- **Voiceover for narration?**
- **Diagram support needed?** — Often these benefit from a diagram skill
alongside the screen-capture/render step (`excalidraw`,
`architecture-diagram`, `concept-diagrams`)
### ASCII / terminal art
- **Source material?** — Generative / driven by audio / converting existing
video / static image starting point
- **Color palette?** — Brand-driven (gold/black/blue), Matrix green, full
rainbow, monochrome
- **Audio reactivity?** — None / loose mood / tight beat sync / FFT-driven
- **Character set?** — ASCII only / Unicode block-drawing / mystic glyphs
- **Loop or narrative?** — Perfect loop or one-shot
### Abstract / loop
- **Mood / emotion?** — One word that captures the feel
- **Motion type?** — Zoom-into-itself, particle drift, wave, geometric, organic
- **Loop required?** — Perfect loop (Droste-style) or just satisfying ending
- **Audio?** — Silent, ambient pad, beat-synced
### Documentary / interview cut
- **Source footage?** — Provided clips, length per clip
- **Transcript / subtitles?** — Provided or to be generated
- **Story structure?** — Chronological / thematic / arc
- **B-roll approach?** — Generated, stock library, none
### Real-time / installation
- **Output environment?** — Gallery wall, projector, screen, web embed
- **Audio source?** — Live audio input, pre-recorded track, both
- **Reactivity tightness?** — Mood-level (loose) vs. tight beat-sync vs. live
parameter control
- **Tool preference?** — `touchdesigner-mcp` for full TD operator graphs;
`p5js` for web-canvas; `comfyui` for generative-AI fed by audio features
## Tier 2 — Always ask near the end
- **Brand assets path?** — Where logo / color palette / fonts / music library lives
- **Output format requirements?** — Codec preference, target file size, accepted
alternates (vertical cut, GIF, audio-only)
- **Deadline?** — Affects task `max_runtime_seconds` and acceptable scope
- **Quality bar?** — Rough draft for review / polished final / archival
- **Existing footage / assets to reuse?** — Anything that should appear, not just inform
## Reasonable assumption defaults
When the user under-specifies, fill in these defaults rather than asking:
| Question | Default |
|----------|---------|
| Frame rate | 30 fps for X / IG; 60 fps for tutorials/explainers; 24 fps for narrative film |
| Resolution | 1080×1080 for square, 1920×1080 for 16:9, 1080×1920 for 9:16 |
| Codec | H.264 / yuv420p, CRF 18 |
| Audio codec | AAC 192 kbps |
| Voice | Provider's mid-range neutral voice unless brand calls for distinctive timbre |
| Music | Silent (require user to specify if music is wanted) |
| Captions | On for explainer/tutorial; off for narrative/abstract unless requested |
| Quality bar | Polished final unless user says draft |
State the assumption explicitly: *"Assuming 30fps and AAC audio unless you say otherwise — proceed?"*
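As a rough illustration, the defaults table could be encoded as a small lookup so that the stated assumption is generated rather than typed by hand. This is only a sketch: the names `RenderDefaults` and `fill_defaults`, the style keys, and the 30 fps fallback for unlisted styles are assumptions, not part of the skill. The table gives no default for 4:5, so the sketch refuses to guess one.

```python
from dataclasses import dataclass

# Values copied from the defaults table; the key names are illustrative only.
RESOLUTIONS = {"1:1": (1080, 1080), "16:9": (1920, 1080), "9:16": (1080, 1920)}
FRAME_RATES = {"social": 30, "tutorial": 60, "explainer": 60, "narrative": 24}


@dataclass
class RenderDefaults:
    width: int
    height: int
    fps: int
    captions: bool
    video_codec: str = "H.264 / yuv420p, CRF 18"
    audio_codec: str = "AAC 192 kbps"


def fill_defaults(aspect: str, style: str) -> RenderDefaults:
    """Return table defaults for an aspect ratio and style category."""
    if aspect not in RESOLUTIONS:
        # No default in the table (e.g. 4:5): ask the user instead of guessing.
        raise ValueError(f"no default resolution for aspect ratio {aspect!r}")
    width, height = RESOLUTIONS[aspect]
    fps = FRAME_RATES.get(style, 30)  # assumption: fall back to 30 fps
    captions = style in {"explainer", "tutorial"}
    return RenderDefaults(width, height, fps, captions)
```

Calling `fill_defaults("9:16", "narrative")` yields the 1080×1920, 24 fps, captions-off combination from the table, which can then be echoed back to the user for confirmation.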
## Anti-patterns
- **Asking 10 questions at once.** Maximum 4 per turn.
- **Asking for things the brief already implies.** If the user said "music video for my track," do not ask "is there a track?"
- **Failing to classify before drilling in.** Tier-1 questions depend on classification; mixing them up wastes turns.
- **Treating "make a video" as enough to proceed.** Always confirm the three baseline questions.
@@ -0,0 +1,276 @@
# Kanban Setup — Project Bootstrap & Profile Configuration
Once the brief is locked and the team is designed, the next step is producing
the actual `setup.sh` that creates the project workspace, configures Hermes
profiles, and fires the initial kanban task.
This file documents the patterns. The companion script
`scripts/bootstrap_pipeline.py` automates most of it from a structured input
JSON.
> **Credit:** the single-project-workspace layout, profile-config patching
> approach, SOUL.md-per-profile convention, and `--workspace dir:<path>` rule
> are adapted from alt-glitch's original multi-agent video pipeline:
> [NousResearch/kanban-video-pipeline](https://github.com/NousResearch/kanban-video-pipeline).
> This skill generalizes those patterns across video styles and replaces the
> string-replacement config patcher with a PyYAML-based one.
## Project workspace structure
Every video project gets one workspace under `~/projects/video-pipeline/<slug>/`:
```
~/projects/video-pipeline/<slug>/
├── brief.md ← the contract; all tasks reference
├── TEAM.md ← team composition + task graph (director reads this)
├── taste/
│ ├── brand-guide.md ← color, typography, motion rules
│ ├── emotional-dna.md ← what the piece should FEEL like
│ └── style-frames/ ← optional: visual references
├── audio/
│ ├── track.mp3 ← provided music (if any)
│ ├── voiceover/ ← per-line TTS clips
│ └── sfx/ ← sound effects
├── assets/
│ ├── logos/
│ ├── fonts/
│ └── existing-footage/ ← reusable provided clips
├── scenes/
│ ├── scene-01/
│ │ ├── VISUAL_SPEC.md ← cinematographer's per-scene spec
│ │ ├── render.py ← renderer's code (or sketch.html, etc.)
│ │ ├── checkpoints/ ← preview frames for QA
│ │ └── clip.mp4 ← the deliverable for this scene
│ ├── scene-02/...
│ └── ...
├── checkpoints/ ← global review frames
├── tools/ ← optional project-local helpers
└── output/
├── final.mp4 ← stitched + audio
├── final-noaudio.mp4
├── final-9x16.mp4 ← optional: vertical alternate
└── captions.srt ← optional: subtitle file
```
**The slug** is derived from the brief title: lowercase, hyphen-separated.
Example: `q3-product-teaser`, `ascii-mood-loop`, `interview-cut-2026-q1`.
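A minimal sketch of the slug rule, assuming that any run of characters other than ASCII letters and digits collapses to a single hyphen. The function names `slugify` and `workspace_for` are hypothetical; the skill's own bootstrap script may derive slugs differently, for example by transliterating non-ASCII titles.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def workspace_for(slug: str, home: Path) -> Path:
    """Workspace path following the layout documented above."""
    return home / "projects" / "video-pipeline" / slug
```

For instance, `slugify("Q3 Product Teaser")` gives `q3-product-teaser`, and `workspace_for(...)` with `Path.home()` places it under `~/projects/video-pipeline/`.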
## The setup.sh script
The setup script does seven things in order:
1. **Verify required API keys** — abort early if any are missing
2. **Create workspace tree** — all directories above
3. **Create profiles** — `hermes profile create <name> --clone`
4. **Configure profiles** — patch each profile's
`~/.hermes/profiles/<name>/config.yaml` to set toolsets and always_load skills
(`terminal.cwd` is left untouched)
5. **Write SOUL.md per profile** — the personality + role definition
6. **Write `brief.md`, `TEAM.md`, and `taste/`, and copy any provided assets**
7. **Fire the initial kanban task** — `hermes kanban create` assigned to the director
See `assets/setup.sh.tmpl` for the skeleton.
### Profile creation pattern
```bash
hermes profile create director --clone 2>/dev/null || true
```
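The `--clone` flag clones from the active profile (preserving model, base
config). The `2>/dev/null || true` makes the script idempotent: re-running
won't error if the profile already exists. It also hides genuine creation
failures, which then surface at the config-patching step as a missing
`config.yaml`.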
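For callers that prefer not to swallow errors, the same idempotency can be sketched with an explicit existence check. This assumes that a profile counts as existing once its `config.yaml` is present at the path given in the next section; the `ensure_profile` name and its `hermes_home` and `run` parameters are hypothetical and exist only to make the sketch testable. The only CLI call used is the `hermes profile create <name> --clone` form shown above.

```python
import subprocess
from pathlib import Path


def ensure_profile(name, hermes_home=None, run=subprocess.run):
    """Create the profile only if its config.yaml is absent.

    Returns True when a create command was issued, False when skipped.
    """
    home = Path(hermes_home) if hermes_home else Path.home() / ".hermes"
    config = home / "profiles" / name / "config.yaml"
    if config.exists():
        return False
    # check=True raises CalledProcessError instead of silently continuing.
    run(["hermes", "profile", "create", name, "--clone"], check=True)
    return True
```

With this shape a failed creation stops the bootstrap immediately, while a re-run over existing profiles is still a no-op.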
### Profile config patching
Each profile has a YAML config at `~/.hermes/profiles/<name>/config.yaml`. The
setup script edits exactly two keys:
1. `toolsets:` — replace the default with the role's required toolsets
2. `skills.always_load:` — list the role's must-load skills (may be empty)
**Do NOT** modify `approvals.mode` (controls user-confirmation of tool calls
— a security setting that must stay as the user configured it). **Do NOT**
modify `terminal.cwd` — the kanban dispatcher overrides cwd per-task via
`--workspace dir:<path>`, so the profile's cwd is irrelevant to the kanban
work and changing it could break the user's interactive use of the profile.
Use **PyYAML**, not string replacement, so the patch is robust against
default-config schema drift:
```bash
configure_profile() {
  local profile="$1"
  local toolsets_json="$2"  # JSON array, e.g. '["kanban","terminal","file"]'
  local skills_json="$3"    # JSON array, e.g. '["kanban-worker","ascii-video"]'
  python3 - "$profile" "$toolsets_json" "$skills_json" <<'PY'
import json, os, sys, yaml
profile, ts_json, sk_json = sys.argv[1:4]
p = os.path.expanduser(f"~/.hermes/profiles/{profile}/config.yaml")
with open(p) as f:
    cfg = yaml.safe_load(f) or {}
cfg["toolsets"] = json.loads(ts_json)
cfg.setdefault("skills", {})["always_load"] = json.loads(sk_json)
with open(p, "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)
PY
}
```
PyYAML must be installed in the user's Python (it ships with most Hermes
installs). If absent: `pip install pyyaml`.
The setup script should also **validate** the patch by re-reading the file
and comparing — see `assets/setup.sh.tmpl` for the validation pattern.
### SOUL.md per profile
Each profile gets a `SOUL.md` at `~/.hermes/profiles/<name>/SOUL.md` that
defines its role, voice, and rules. See `assets/soul.md.tmpl` for the
template. Customize per role and per project.
The director's SOUL.md should be the most opinionated — its voice flavors
the entire production. **Critical content for the director's SOUL.md:**
- **Anti-temptation rules:** "Do not execute the work yourself. For every
concrete task, create a kanban task and assign it. Decompose, route, comment,
approve — that's the whole job." (The `kanban-orchestrator` skill provides
the deeper playbook; load it.)
- **Decomposition steps:** Read `brief.md`, `TEAM.md`, `taste/`. Use the team
graph in `TEAM.md` to fan out tasks.
- **The workspace_path rule** (see below).
Other profiles' SOUL.md is briefer; mostly mechanical: who you are, what you
read, what you produce, what skills/tools to use, where to write outputs.
Most non-director profiles should `always_load: kanban-worker` for the
deeper-than-baseline kanban guidance.
### Initial kanban task
The final action of setup.sh is firing the kanban:
```bash
hermes kanban create "Direct production of <video title>" \
--assignee director \
--workspace dir:"$HOME/projects/video-pipeline/${PROJECT_SLUG}" \
--tenant "${PROJECT_SLUG}" \
--priority 2 \
--max-runtime 4h \
--body "$(cat <<EOF
Read brief.md, TEAM.md, and taste/.
Decompose into the team graph defined in TEAM.md.
All child tasks MUST use:
workspace_kind="dir"
workspace_path="$HOME/projects/video-pipeline/${PROJECT_SLUG}"
tenant="${PROJECT_SLUG}"
EOF
)"
```
The `--workspace dir:<path>` flag is **critical** — it tells the kanban that
all child tasks share this workspace. Omitting it or using `worktree` will
isolate profiles and break artifact sharing.
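One way to guard this rule is to lint each child task's arguments before they are submitted. The sketch below assumes the arguments are available as a plain dict using the field names quoted in the task body (`workspace_kind`, `workspace_path`, `tenant`); the `check_child_task` helper itself is hypothetical and not part of the kanban tooling.

```python
import os


def check_child_task(args: dict, workspace: str, tenant: str) -> list:
    """Return a list of rule violations; an empty list means the task is fine."""
    problems = []
    if args.get("workspace_kind") != "dir":
        problems.append('workspace_kind must be "dir"')
    path = args.get("workspace_path") or ""
    if not os.path.isabs(path):
        problems.append("workspace_path must be an absolute path")
    elif os.path.normpath(path) != os.path.normpath(workspace):
        problems.append("workspace_path must match the project workspace")
    if args.get("tenant") != tenant:
        problems.append("tenant must match the project slug")
    return problems
```

A task created with `workspace_kind="worktree"` or a relative path would be reported here instead of silently running in an isolated workspace.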
## The TEAM.md file
Alongside `brief.md`, write a `TEAM.md` that the director reads. It documents
the team composition + task graph the orchestrator should follow. This
removes ambiguity and prevents the director from inventing extra steps.
Example structure (for an ASCII video with a music supervisor and editor):
```markdown
# Team & Task Graph — <video title>
## Team
- `director` (this profile) — vision, decomposition, approval
- `cinematographer` — visual spec, quality review (loads `ascii-video`)
- `renderer-ascii` — ASCII scenes (loads `ascii-video`)
- `music-supervisor` — track analysis (loads `songsee`)
- `voice-talent` — narration (uses ElevenLabs API)
- `audio-mixer` — final mix (ffmpeg)
- `editor` — assembly (ffmpeg)
- `reviewer` — final QA gate
## Task Graph
T0: this task — decompose
├── T1: cinematographer "Design visual language" (parent: T0)
│   ├── T2a: renderer-ascii "Scene 1 — title card" (parent: T1)
│   ├── T2b: renderer-ascii "Scene 2 — main beat" (parent: T1)
│   └── T2c: renderer-ascii "Scene 3 — outro" (parent: T1)
├── T3: music-supervisor "Analyze track + emit beats.json" (parent: T0)
├── T4: voice-talent "Generate narration" (parent: T0)
├── T5: audio-mixer "Mix VO + bg music" (parents: T3, T4)
├── T6: editor "Assemble cut + mux audio" (parents: T2*, T5)
└── T7: reviewer "Final QA" (parent: T6)
```
The director turns this into actual `kanban_create` calls.
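To make the ordering concrete, the example graph can be expressed as a parent map and sorted topologically, which gives one plausible order for creating tasks so that every parent exists before its children. This is a sketch using Python's standard `graphlib`; the `GRAPH` dict is a hand transcription of the example (with `T2*` expanded), and grouping tasks into "waves" is an illustrative device rather than something the kanban requires.

```python
from graphlib import TopologicalSorter

# Task id -> parent ids, transcribed from the example TEAM.md graph.
GRAPH = {
    "T0": [],
    "T1": ["T0"],
    "T2a": ["T1"],
    "T2b": ["T1"],
    "T2c": ["T1"],
    "T3": ["T0"],
    "T4": ["T0"],
    "T5": ["T3", "T4"],
    "T6": ["T2a", "T2b", "T2c", "T5"],
    "T7": ["T6"],
}


def creation_waves(graph):
    """Group tasks into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For the example graph this yields five waves: `T0`, then `T1`/`T3`/`T4`, then the three scene renders plus `T5`, then `T6`, then `T7`. A cycle in a hand-written TEAM.md shows up as an error before any task is created.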
## API-key prerequisites check
Before firing the kanban, verify required keys are available. Check both
`~/.hermes/.env` and macOS Keychain (if on macOS):
```bash
check_key() {
local var="$1"
local kc_account="$2"
local kc_service="$3"
if grep -q "^${var}=" ~/.hermes/.env 2>/dev/null && \
[ -n "$(grep "^${var}=" ~/.hermes/.env | cut -d= -f2-)" ]; then
return 0
fi
if command -v security >/dev/null 2>&1 && \
security find-generic-password -a "${kc_account}" -s "${kc_service}" -w >/dev/null 2>&1; then
return 0
fi
echo "ERROR: ${var} not set in ~/.hermes/.env or Keychain (${kc_account}/${kc_service})"
return 1
}
check_key ELEVENLABS_API_KEY hermes ELEVENLABS_API_KEY || exit 1
check_key OPENROUTER_API_KEY hermes OPENROUTER_API_KEY || exit 1
# ...
```
If a key is missing, the script aborts with a clear message rather than
firing a kanban that will hit credential errors mid-execution.
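If the check needs to run from Python, for example inside a bootstrap script, the `.env` half of `check_key` can be mirrored as below. This is a sketch only: it covers the `~/.hermes/.env` lookup and deliberately leaves out the Keychain fallback, and the function names `env_has_key` and `missing_keys` are hypothetical.

```python
from pathlib import Path


def env_has_key(var: str, env_path: Path) -> bool:
    """True if env_path contains a line VAR=<non-empty value>."""
    try:
        lines = env_path.read_text().splitlines()
    except FileNotFoundError:
        return False
    prefix = f"{var}="
    return any(
        line.startswith(prefix) and line[len(prefix):] != "" for line in lines
    )


def missing_keys(required, env_path=None):
    """Return the required variable names that are absent or empty."""
    path = env_path or Path.home() / ".hermes" / ".env"
    return [var for var in required if not env_has_key(var, path)]
```

Reporting every missing key at once, rather than stopping at the first, lets the user fix the whole list in one pass.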
## Critical rules
1. **`workspace_kind="dir"` + `workspace_path="<absolute>"` on every kanban_create.** Otherwise profiles can't share artifacts.
2. **Tenant every task.** `--tenant <project-slug>` keeps the dashboard scoped
and prevents cross-pollination with other ongoing kanbans.
3. **Idempotency keys.** For tasks that should not duplicate on re-run (e.g.,
setup creating profiles), use the `idempotency_key` argument or check
existence first.
4. **`max_runtime_seconds` per task.** Renderers that get stuck eat compute.
Standard defaults:
- Renderer task: 1800s (30min)
- Editor task: 600s (10min)
- Voice-talent task: 300s (5min)
- Image-generator task: 600s (10min)
- Image-to-video-generator task: 900s (15min)
5. **Heartbeats for long renders.** Tasks expected to run >5min should emit
`kanban_heartbeat` periodically with progress. Renderers should report
frame counts; the editor should report assembly progress.
6. **The `audio/` and `taste/` dirs are populated BEFORE firing the kanban.**
Don't ask the director's pipeline to source these — copy at setup time.
7. **`brief.md` is read-only after setup.** If the brief changes during
execution, that's a significant pivot — re-fire the kanban rather than edit
live.
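Rules 4 and 5 can be kept in one place as a small lookup. The sketch below copies the caps from the list above; mapping any `renderer-*` profile to the renderer cap and using the cap as a proxy for "expected to run more than 5 minutes" are both assumptions, and the helper names are hypothetical.

```python
# Caps in seconds, copied from the standard defaults above.
DEFAULT_MAX_RUNTIME_SECONDS = {
    "renderer": 1800,
    "editor": 600,
    "voice-talent": 300,
    "image-generator": 600,
    "image-to-video-generator": 900,
}

HEARTBEAT_THRESHOLD_SECONDS = 300  # rule 5: tasks running longer than 5 minutes


def max_runtime_for(profile: str):
    """Return the default cap for a profile, or None if no default is listed."""
    if profile in DEFAULT_MAX_RUNTIME_SECONDS:
        return DEFAULT_MAX_RUNTIME_SECONDS[profile]
    if profile.startswith("renderer-"):
        # Assumption: renderer-ascii, renderer-manim, etc. share the renderer cap.
        return DEFAULT_MAX_RUNTIME_SECONDS["renderer"]
    return None


def needs_heartbeat(max_runtime_seconds: int) -> bool:
    """Assumption: a cap above 5 minutes implies the task should heartbeat."""
    return max_runtime_seconds > HEARTBEAT_THRESHOLD_SECONDS
```

Profiles without a listed default, such as `director`, return `None` so the caller has to choose a cap explicitly.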
@@ -0,0 +1,180 @@
# Monitoring — Watch the Pipeline + Intervene
After `setup.sh` fires the kanban, the work runs autonomously. The role of
this skill in the execution phase is to help the user (and the AI overseeing
the session) detect problems early and intervene effectively.
## Live monitoring commands
```bash
# Live event stream — task spawns, status changes, heartbeats, completions
hermes kanban watch --tenant <project-slug>
# Snapshot of the board
hermes kanban list --tenant <project-slug>
hermes kanban list --tenant <project-slug> --json # machine-readable
# Per-status counts + oldest-ready age
hermes kanban stats --tenant <project-slug>
# Visual dashboard (browser)
hermes dashboard
# Inspect a specific task (includes comments + events)
hermes kanban show <task-id>
# Follow a single task's event stream
hermes kanban tail <task-id>
```
Verify available subcommands with `hermes kanban --help` — the kanban CLI
ships with `init / create / list / show / assign / link / unlink / claim /
comment / complete / block / unblock / archive / tail / dispatch / watch /
stats / heartbeat / log / runs / context / gc`.
The companion `scripts/monitor.py` polls the kanban via the CLI and surfaces
common issues (stuck tasks, missing heartbeats, repeated retries, dependency
deadlocks).
## What to watch for
### Healthy pipeline indicators
- Tasks transition `READY → RUNNING → DONE` in roughly the expected order
- Renderers emit periodic `kanban_heartbeat` events with progress (e.g. "frame
240/720")
- Each task's runtime is well under its `max_runtime_seconds` cap
- No task accumulates more than 1 retry
- Dependency arrows resolve (children unblock as parents complete)
### Warning signs
| Symptom | Likely cause | Action |
|---------|--------------|--------|
| Task RUNNING but no heartbeat in 2+ min | Worker stuck, infinite loop, blocked on input | `hermes kanban show <id>` — read the worker's last events. The dispatcher SIGTERMs tasks that exceed their `max-runtime`; if you need to stop one earlier, `hermes kanban block <id>` then `hermes kanban archive <id>`, and create a re-run task. |
| Same task retried 2+ times | Reproducible failure (missing key, bad spec, broken tool) | `hermes kanban show <id>` to read failure events. Fix root cause before re-running. |
| RUNNING longer than max_runtime | Task is slow but progressing OR genuinely stuck | Check heartbeats with `hermes kanban tail <id>`. If progressing, the dispatcher will SIGTERM eventually anyway — raise `max-runtime` on a re-created task. |
| Child task READY but parents still RUNNING for >2× expected | Cascade slow, dependency miswired | Check the dependency graph. Inspect the parent: sometimes it completed but its handoff fields (summary, metadata) were empty so the child has nothing to consume. |
| New tasks not appearing | Director is hung in decomposition | Inspect the director task with `hermes kanban show <id>`. Often a malformed `kanban_create` call. |
| Specialist tasks completing instantly | Decomposition created tasks without bodies | Director didn't pass enough context. Re-create with explicit body content. |
| Tasks created but never picked up | Profile not running, or tenant mismatch, or dispatcher not running | Check `hermes profile list` (profile exists?), `hermes status` (gateway/dispatcher up?), and verify tenant. |
| Specific renderer task fails → review note → renderer redoes → fails again | Brief is asking for the impossible | Pivot the brief, not the renderer. |
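
Two of the symptoms in the table (a RUNNING task with no heartbeat for 2+ minutes, and a task retried 2+ times) are easy to check mechanically. The sketch below shows one way to do that over already-parsed task records. The field names `id`, `status`, `retries`, and `last_heartbeat_at` are assumptions for illustration; the real `hermes kanban list --json` output may use different keys.

```python
from datetime import datetime, timedelta


def find_warnings(tasks, now, heartbeat_gap=timedelta(minutes=2), max_retries=1):
    """Return (task_id, reason) pairs for tasks that match a warning sign.

    ASSUMPTION: each task is a dict with 'id', 'status', 'retries' and an
    ISO-8601 'last_heartbeat_at'. `now` and the parsed timestamps must be
    both naive or both timezone-aware.
    """
    warnings = []
    for task in tasks:
        if task.get("retries", 0) > max_retries:
            warnings.append((task["id"], "retried more than once"))
        if task.get("status") == "RUNNING":
            beat = task.get("last_heartbeat_at")
            if beat is None or now - datetime.fromisoformat(beat) > heartbeat_gap:
                warnings.append((task["id"], "no recent heartbeat"))
    return warnings
```

Keeping the check as a pure function over parsed records means it can be exercised with hand-written task dicts before pointing it at a live board.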
## Intervention recipes
### Rejecting bad output
When a renderer ships a clip that doesn't pass review:
```bash
# 1. Comment on the renderer's task with specific feedback
hermes kanban comment <renderer-task-id> "Scene 3 looks too sparse \
— increase visual density. Tighten color palette to brand spec."
# 2. Create a re-render task with the original as parent
hermes kanban create "Scene 3 — re-render with feedback" \
--assignee renderer-ascii \
--parent <renderer-task-id> \
--workspace dir:"$HOME/projects/video-pipeline/<slug>" \
--tenant <slug> \
--skill ascii-video \
--max-runtime 30m
```
### Adding a new dependency mid-flight
When the editor needs an asset that wasn't originally planned (e.g., a captions
file):
```bash
# 1. Create the new task and capture its id
NEW_TASK_ID=$(hermes kanban create "Generate SRT captions from voiceover" \
--assignee captioner \
--workspace dir:"$HOME/projects/video-pipeline/<slug>" \
--tenant <slug> \
--json | python3 -c "import json,sys;print(json.load(sys.stdin)['id'])")
# 2. Wire it as a parent of the editor's task with `kanban link`
hermes kanban link "$NEW_TASK_ID" <editor-task-id>
```
`kanban link` takes `parent_id child_id` (parent first). Use `kanban unlink`
to remove a dependency.
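
To reason about what a new link does to the board, it can help to model links as `(parent_id, child_id)` pairs in the same order `kanban link` takes them. The following sketch assumes a child unblocks only once all of its parents are `DONE`; the function name and the status strings are illustrative, not part of the kanban CLI.

```python
def ready_children(links, status):
    """Return sorted ids of children whose parents are all DONE.

    links:  iterable of (parent_id, child_id) pairs, parent first.
    status: dict mapping task id -> status string.
    ASSUMPTION: a child becomes ready only when every parent is DONE.
    """
    parents = {}
    for parent_id, child_id in links:
        parents.setdefault(child_id, set()).add(parent_id)
    return sorted(
        child
        for child, parent_ids in parents.items()
        if all(status.get(p) == "DONE" for p in parent_ids)
    )
```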
### Stopping a worker that's stuck
The kanban dispatcher will SIGTERM (then SIGKILL) any task that exceeds its
`--max-runtime` automatically. To stop one sooner:
```bash
# Mark blocked so the dispatcher leaves it alone, then archive
hermes kanban block <task-id>
hermes kanban archive <task-id>
# Diagnose what happened
hermes kanban show <task-id> # task body, comments, recent events
hermes kanban tail <task-id> # follow the live event stream
hermes kanban log <task-id> # worker process log
```
After stopping, decide: fix root cause + re-create the task, or skip and
adjust dependent tasks.
### Pivoting the brief
If during execution the user wants something fundamentally different:
1. Stop the active director task and all RUNNING children (`hermes kanban block <id>`, then `hermes kanban archive <id>`)
2. Edit `brief.md` and `TEAM.md`
3. Re-fire the initial `hermes kanban create` for the director
Don't try to "edit while running" — the kanban's audit trail makes a clean
pivot more legible than mid-stream changes.
## Periodic check-in script
A simple polling pattern for hands-off monitoring:
```bash
while true; do
  clear
  hermes kanban list --tenant <slug>
  echo "---"
  hermes kanban stats --tenant <slug>
  sleep 30
done
```
For a live event feed, run `hermes kanban watch --tenant <slug>` in a
separate terminal — it streams task lifecycle events as they happen.
For automated intervention (auto-restart stuck tasks, auto-create re-render on
review failure), see the `scripts/monitor.py` patterns.
## When to call it done
The pipeline is finished when:
1. All renderer tasks complete and pass review
2. The editor's `output/final.mp4` exists and `ffprobe` confirms expected
duration + streams
3. The reviewer (if present) has approved
4. Optional masterer variants exist
At this point, present the final.mp4 path to the user along with any review
notes. Do NOT delete the workspace — the user may want to iterate on a single
scene without re-running the whole pipeline.
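
The `ffprobe` check in step 2 can be scripted. This is a minimal sketch, assuming `ffprobe` is on `PATH`; the helper names, the 0.5s tolerance, and the requirement of both a video and an audio stream are illustrative choices rather than pipeline rules.

```python
import json
import subprocess


def probe(path):
    """Run ffprobe and return its JSON report (duration + stream types)."""
    out = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "format=duration:stream=codec_type",
         "-of", "json", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def final_cut_ok(report, expected_s, tolerance_s=0.5, need=("video", "audio")):
    """Check the duration is within tolerance and all required streams exist."""
    duration = float(report["format"]["duration"])  # ffprobe reports a string
    kinds = {s.get("codec_type") for s in report.get("streams", [])}
    return abs(duration - expected_s) <= tolerance_s and all(k in kinds for k in need)
```

A typical call would be `final_cut_ok(probe("output/final.mp4"), expected_s=30)`, with the expected duration taken from the brief.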
## Common gotchas
- **Tenant mismatches.** A task created with the wrong tenant won't appear in
monitoring. Always pass `--tenant <slug>` consistently.
- **Profile process not running.** Tasks queue indefinitely in READY if no
worker for that profile is online. Check `hermes profile list` and start
any missing profiles.
- **Workspace permissions.** All profiles need read+write to the workspace
directory. `chmod -R u+rw <workspace>` if any worker reports permission
errors.
- **Audio/visual sync.** The editor's clip stitching must match the
renderer's actual output durations. Don't hardcode scene durations in
the editor — read from the renderer's handoff metadata.
@@ -0,0 +1,298 @@
# Role Archetypes
The library of role archetypes for video production. **Compose a team from this
list, don't clone a fixed roster.** Most videos need 4-7 profiles. The director
is always present; everything else is conditional on the brief.
Each role's profile name is by convention `kebab-case` (e.g. `creative-director`,
`image-generator`). Multiple instances of the same role get descriptive suffixes
when they need different focus (e.g., `renderer-ascii`, `renderer-3d`).
For toolset + skill mapping per role, see [tool-matrix.md](tool-matrix.md).
## Always present
### director
The vision-holder. Reads the brief and brand guide, decomposes into a task
graph, comments to steer creative direction, approves the final cut.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-orchestrator`. The kanban plugin auto-injects baseline
orchestration guidance for free; `kanban-orchestrator` is the deeper
decomposition playbook. Add `creative-ideation` if the brief is wide-open
and needs framing help.
- **Personality:** Tied to the brand voice — see `assets/soul.md.tmpl`
The director has the same toolset as everyone else, but its `SOUL.md` rules
**forbid** execution. The "decompose, don't execute" discipline is enforced
by personality + the kanban-orchestrator skill, not by missing tools.
## Pre-production roles
Pick based on what the brief needs.
### writer / screenwriter
Writes scripts, dialogue, voiceover copy, narration. Use for any video with
spoken or written words beyond a tagline.
- **Toolsets:** kanban, file
- **Skills:** `kanban-worker`, `humanizer` (post-process to strip AI-tells)
- **Outputs:** `script.md`, `narration.md`, `dialogue/scene-NN.md`
### copywriter
Like `writer` but specifically for marketing copy: taglines, CTAs, voiceover
scripts for product videos.
- **Toolsets:** kanban, file
- **Skills:** `kanban-worker`, `humanizer`
- **Outputs:** `copy.md`
### concept-artist / visual-designer
Develops the visual identity: mood board, style frames, color palette
rationale, typography choices. Produces a `visual-spec.md` that all generators
follow. Often produces still reference frames using image-generation APIs or
local skills.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker` plus any project-specific design skill —
`claude-design` (UI/web), `sketch` (quick mockup variants),
`popular-web-designs` (matching known web aesthetic), `pixel-art` (retro),
`ascii-art` (terminal/retro), `excalidraw` (hand-drawn frames),
`design-md` (text-based design docs)
- **Outputs:** `visual-spec.md`, `taste/style-frames/*.png`
### storyboarder
Maps the brief to a beat-by-beat shot list with timing. Critical for narrative
film and music video. Often pairs with a diagramming tool.
- **Toolsets:** kanban, file
- **Skills:** `kanban-worker` plus a diagram skill — `excalidraw` (sketch),
`architecture-diagram` (technical/system), `concept-diagrams` (educational/
scientific)
- **Outputs:** `storyboard.md` with one row per scene/shot, optional
storyboard sketches
### cinematographer / dp
Designs the visual language: framing, color, motion, transitions. Reviews
generator output for visual consistency. Hands off per-scene `VISUAL_SPEC.md`.
- **Toolsets:** kanban, terminal, file, video, vision
- **Skills:** `kanban-worker` plus the visual skill that matches the project
(e.g., `ascii-video` for ASCII work, `manim-video` for explainers,
`touchdesigner-mcp` for real-time visuals, etc.)
- **Outputs:** `scenes/scene-NN/VISUAL_SPEC.md`, review comments on renderer
tasks
- **Reviews via:** `video_analyze` (sends full clip to multimodal LLM for
native review), `vision_analyze` for spot-checking frames, ffprobe summaries
## Production roles
### renderer (generic)
A worker that produces visual content for one or more scenes. Loaded with
whichever creative skill fits the scene's style. Multiple renderers can run in
parallel, each pinned to a different skill via `always_load` in their profile
or `--skill` on the task.
- **Toolsets:** kanban, terminal, file
- **Skills:** one creative skill (see specialized variants below)
- **Outputs:** `scenes/scene-NN/clip.mp4`
### Specialized renderer variants
When scenes need very different tools, create specialized renderer profiles
instead of overloading one. Each loads a different creative skill.
| Variant | Skill | Best for |
|---------|-------|----------|
| `renderer-ascii` | `ascii-video` | Terminal aesthetic, retro pixel, audio-reactive grid, video-to-ASCII conversion |
| `renderer-manim` | `manim-video` | Math, algorithms, 3Blue1Brown-style explainers, equation derivations |
| `renderer-p5js` | `p5js` | Generative art, particles, shaders, organic motion, web-canvas content |
| `renderer-comfyui` | `comfyui` | AI-generated stills + video using local ComfyUI workflows (img-to-img, img-to-video, etc.) |
| `renderer-touchdesigner` | `touchdesigner-mcp` | Real-time, audio-reactive, installation art, VJ-style content |
| `renderer-3d` | `blender-mcp` *(optional)* | 3D modeling, animation, photoreal environments, character animation |
| `renderer-pixel` | `pixel-art` | Retro game aesthetic with era-correct palettes |
| `renderer-comic` | `baoyu-comic` | Knowledge-comic style narrative scenes |
| `renderer-meme` | `meme-generation` *(optional)* | Meme-style stills for satirical/social content |
| `renderer-procedural` | (none — Python with PIL + ffmpeg directly) | Custom procedural content where no skill fits |
| `renderer-video` | (external image-to-video API: Runway / Kling / Luma) | Animating still images in narrative film |
| `renderer-motion-graphics` | (external — Remotion CLI) | Motion graphics, kinetic typography, UI animations |
For external-API renderers, the profile holds the API client logic; only
`kanban-worker` is loaded, plus the terminal toolset and the API key.
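
The variant table above can also be written as data, which makes it straightforward to derive the skills a renderer profile should load. The sketch below is illustrative: `RENDERER_SKILLS` and `always_load_for` are hypothetical names, and `None` marks variants that rely on an external tool or plain Python rather than a Hermes skill.

```python
# Variant -> creative skill, copied from the table above.
RENDERER_SKILLS = {
    "renderer-ascii": "ascii-video",
    "renderer-manim": "manim-video",
    "renderer-p5js": "p5js",
    "renderer-comfyui": "comfyui",
    "renderer-touchdesigner": "touchdesigner-mcp",
    "renderer-3d": "blender-mcp",
    "renderer-pixel": "pixel-art",
    "renderer-comic": "baoyu-comic",
    "renderer-meme": "meme-generation",
    "renderer-procedural": None,       # Python with PIL + ffmpeg directly
    "renderer-video": None,            # external image-to-video API
    "renderer-motion-graphics": None,  # external Remotion CLI
}


def always_load_for(variant):
    """Return the skills a renderer variant loads: kanban-worker plus at most one creative skill."""
    skill = RENDERER_SKILLS[variant]
    return ["kanban-worker"] + ([skill] if skill else [])
```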
### image-generator
Specifically for text-to-image generation. Often produces stills that go to
`renderer-video` for animation.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`, optionally `comfyui` (drives a local
ComfyUI install for image generation)
- **External APIs (alternative to local ComfyUI):** FAL, Replicate, OpenAI
Images, Midjourney
- **Outputs:** `scenes/scene-NN/stills/*.png`
### image-to-video-generator
Takes still images and animates them via Runway/Kling/Luma APIs, or via
ComfyUI's image-to-video workflows locally. Almost always follows
`image-generator` in narrative film pipelines.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`, optionally `comfyui` (for local image-to-video
workflows like AnimateDiff or WAN)
- **External APIs:** Runway, Kling, Luma, Pika
- **Outputs:** `scenes/scene-NN/clip.mp4`
### music-supervisor
Sources, analyzes, and prepares the music track. For music videos, also
produces a beat/BPM map and key-moment timestamps. Uses `songsee` for
spectrograms when the editor or renderer needs a visual reference of the
audio's energy.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`, `songsee` (audio visualization), plus one of:
- `songwriting-and-ai-music` — when commissioning lyrics + Suno prompts
- `heartmula` — when generating music with the open-source local model
- `spotify` — when sourcing existing tracks
- **Outputs:** `audio/track.mp3`, `audio/beats.json`, optional
`audio/track-spectrogram.png`
### voice-talent / narrator
Generates voiceover audio. Calls a TTS API directly; no Hermes skill required
beyond `kanban-worker`. The user can also supply pre-recorded VO instead of
generation.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **External APIs:** ElevenLabs, OpenAI TTS, etc.
- **Outputs:** `audio/voiceover/line-NN.mp3`, `audio/voiceover/timeline.mp3`
### foley / sfx-designer
Sound effects and ambient design. Often optional unless the brief calls for
sound design specifically.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`, `songsee` for audio-feature visualization when
designing to a track
- **Outputs:** `audio/sfx/*.mp3`
## Post-production roles
### editor
Assembles the final cut from clips. Uses ffmpeg for stitching, fades,
transitions. Reviews each clip for pacing and quality before assembly.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **External tools:** ffmpeg, ffprobe
- **Outputs:** `output/final.mp4`, `output/final-noaudio.mp4`
### colorist
Color grading. Usually optional — if the renderers already produce
brand-consistent output and the editor just stitches, the colorist is overkill.
Worth including for narrative film with hero shots.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **Outputs:** `output/final-graded.mp4`
### audio-mixer
Mixes voiceover + music + SFX into a final audio track. Sets levels, ducks
music under VO, normalizes loudness (LUFS).
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **External tools:** ffmpeg with `loudnorm` filter, optional `sox`
- **Outputs:** `audio/final-mix.mp3`
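
As an illustration of the `loudnorm` step, the sketch below builds an ffmpeg command line that mixes a voiceover with a music bed and normalizes the result. The function name and the target values (-16 LUFS integrated, -1.5 dBTP, LRA 11) are assumptions chosen for the example, not values the pipeline prescribes. Ducking is left out; it would need an extra filter stage such as `sidechaincompress`.

```python
def mix_command(voiceover, music, out, lufs=-16.0):
    """Build an ffmpeg argv that mixes two inputs and loudness-normalizes them.

    ASSUMPTION: the loudness targets are illustrative defaults.
    """
    graph = (
        "[0:a][1:a]amix=inputs=2:duration=longest,"
        f"loudnorm=I={lufs}:TP=-1.5:LRA=11[out]"
    )
    return ["ffmpeg", "-y", "-i", voiceover, "-i", music,
            "-filter_complex", graph, "-map", "[out]", out]
```

Returning the argv as a list keeps the command inspectable and lets the caller pass it straight to `subprocess.run` without shell quoting.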
### captioner
Burns subtitles into the video, generates SRT, handles accessibility. Can also
generate captions from audio via Whisper.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **External tools:** Whisper (CLI or API), ffmpeg subtitle filters
- **Outputs:** `output/captions.srt`, `output/final-captioned.mp4`
### masterer
Final encode + format variants. Produces deliverables for each platform target
(square for IG, vertical for TikTok, full HD for YouTube, etc.).
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **Outputs:** `output/final-1080.mp4`, `output/final-9x16.mp4`, etc.
## QA roles
### reviewer
A neutral quality gate. Reads the brief, watches the cut, comments
specifically on what's off (pacing, sync, brand alignment, technical
quality). Distinct from the cinematographer (who reviews visuals during
production) and the editor (who reviews for assembly).
- **Toolsets:** kanban, terminal, file, video, vision
- **Skills:** `kanban-worker`
- **Review tools:** `video_analyze` (native clip review via multimodal LLM),
`vision_analyze` (frame/thumbnail review), ffprobe
- **Outputs:** `review-notes.md`, comments on tasks
### brand-cop
Reviews specifically for brand compliance — colors, typography, voice. Use
when the brand guidelines are detailed and a generic reviewer might miss
violations.
- **Toolsets:** kanban, file
- **Skills:** `kanban-worker`
- **Outputs:** comments + `brand-review.md`
## Composing teams — heuristics
- **Always:** director + at least one renderer + editor.
- **Add writer** if scripted dialogue / narration / on-screen text exceeds a
tagline.
- **Add storyboarder** if the brief has more than 5 distinct beats and the
director hasn't already laid out a beat list.
- **Add cinematographer** if multiple renderer instances need consistent
visual language. (For a single-tool video, the renderer's own skill spec
is enough.)
- **Add image-generator + image-to-video-generator pair** for narrative film
with photorealistic visuals.
- **Add music-supervisor** when music is provided and rhythm matters
(music videos always; explainers sometimes).
- **Add voice-talent** for any voiceover / narrative dialogue.
- **Add audio-mixer** when there are 2+ audio sources (VO + music, music + SFX).
- **Add captioner** for accessibility-priority projects (explainer, tutorial,
any platform that defaults to muted playback).
- **Add reviewer** for high-stakes projects. Skip for quick experimental loops.
- **Add masterer** when multiple platform deliverables are needed.
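
The heuristics above can be read as a simple rule table. The following sketch encodes them as a function over a brief summary; every key on the `brief` dict (`has_script`, `beats`, `renderer_variants`, and so on) is a hypothetical flag invented for illustration, not part of any plan schema.

```python
def compose_team(brief):
    """Return an ordered list of roles for a brief, following the heuristics above.

    ASSUMPTION: `brief` is a dict of hypothetical flags; missing keys mean "no".
    """
    team = ["director", "renderer", "editor"]  # always present
    if brief.get("has_script"):
        team.append("writer")
    if brief.get("beats", 0) > 5 and not brief.get("director_beat_list"):
        team.append("storyboarder")
    if brief.get("renderer_variants", 1) > 1:
        team.append("cinematographer")
    if brief.get("photoreal_narrative"):
        team += ["image-generator", "image-to-video-generator"]
    if brief.get("music_driven"):
        team.append("music-supervisor")
    if brief.get("voiceover"):
        team.append("voice-talent")
    if brief.get("audio_sources", 0) >= 2:
        team.append("audio-mixer")
    if brief.get("accessibility"):
        team.append("captioner")
    if brief.get("high_stakes"):
        team.append("reviewer")
    if brief.get("platform_targets", 1) > 1:
        team.append("masterer")
    return team
```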
## Anti-patterns
- **One renderer doing everything.** If scenes use very different tools
(ASCII + 3D + motion graphics), use specialized renderer variants. The
renderer loads ONE creative skill at a time; mixing styles in a single
renderer causes thrashing.
- **A separate profile per scene.** No. Profiles are per-role, not per-scene.
Eight scenes use one or two renderer profiles, not eight.
- **A "general" profile that does everything.** Worse than no specialization.
The kanban routing breaks down if every task fits every profile.
- **No reviewer for important deliverables.** Skipping review saves about an
hour of pipeline time but risks shipping flaws.
@@ -0,0 +1,317 @@
# Tool Matrix — Skills + Toolsets per Role
Maps each role archetype to the Hermes skills it should `always_load` and the
toolsets it needs. Only references skills that ship in the public hermes-agent
repository (under `skills/` or `optional-skills/`). External APIs and CLIs are
called from the terminal toolset; they don't appear in `always_load`.
## Hermes skills relevant to video production
### Visual / rendering skills (`hermes-agent/skills/creative/`)
| Skill | What it does | Best fit for |
|-------|--------------|--------------|
| `ascii-video` | Production pipeline for ASCII art video — generative, audio-reactive, video-to-ASCII | Renderer for ASCII / terminal / retro pixel content; cinematographer for ASCII projects |
| `ascii-art` | Static ASCII art generation | Concept artist for ASCII style frames; secondary tool for ASCII renderer |
| `manim-video` | Manim CE animations — math, algorithms, 3Blue1Brown-style explainers | Renderer for math, algorithm walkthroughs, technical concept explainers |
| `p5js` | p5.js sketches — generative art, shaders, interactive, 3D | Renderer for generative art, particle systems, organic motion, web-canvas content |
| `comfyui` | Generate images, video, audio with ComfyUI workflows (image-to-image, image-to-video, etc.) | image-generator, image-to-video-generator, or general renderer for AI-generated content |
| `touchdesigner-mcp` | Control a running TouchDesigner instance — real-time visuals, audio-reactive installation art, VJ | Renderer for real-time/audio-reactive content; installation art; live performance |
| `blender-mcp` *(optional)* | Control Blender 4.3+ via MCP — 3D modeling, animation, rendering | Renderer for 3D scenes, photoreal environments, character animation |
| `pixel-art` | Pixel art with era palettes (NES, Game Boy, PICO-8) | Renderer for retro game aesthetic; concept artist for pixel-style frames |
| `baoyu-comic` | Knowledge-comic generation (educational, biography, tutorial) | Renderer for comic-style narrative; explainer in panel form |
| `baoyu-infographic` | Infographic generation | Renderer for data-driven explainer scenes |
| `meme-generation` *(optional)* | Generate meme images by overlaying text on templates | Generator for satirical/social content; meme-style stills |
### Design / pre-production skills (`hermes-agent/skills/creative/`)
| Skill | What it does | Best fit for |
|-------|--------------|--------------|
| `claude-design` | Design one-off HTML artifacts (landing, deck, prototype) | Concept artist for product video style frames; storyboarder for UI-heavy content |
| `design-md` | Design markdown docs | Concept artist documenting visual specs |
| `popular-web-designs` | Reference patterns for popular web designs | Concept artist; cinematographer when matching a known UI aesthetic |
| `sketch` | Throwaway HTML mockups (2-3 design variants to compare) | Concept artist exploring directions; storyboarder for UI flows |
| `excalidraw` | Excalidraw-style hand-drawn diagrams | Storyboarder; concept artist for sketch-style frames |
| `architecture-diagram` | Software architecture diagrams | Storyboarder for technical content; explainer scenes about systems |
| `concept-diagrams` *(optional)* | Flat, minimal SVG diagrams (educational visual language; physics, chemistry, math, anatomy, etc.) | Renderer / storyboarder for explainer scenes with clean educational diagrams |
| `pretext` | Mathematical/scientific content authoring | Writer / cinematographer for technical-explainer pretexts |
| `creative-ideation` | Constraint-driven project ideation | Director / cinematographer when the brief is wide-open and needs framing |
| `humanizer` | Strip AI-isms from text, add real voice | Writer / copywriter post-process to avoid AI-tells in scripts and VO copy |
### Audio / media skills (`hermes-agent/skills/creative/` + `skills/media/`)
| Skill | What it does | Best fit for |
|-------|--------------|--------------|
| `songwriting-and-ai-music` | Songwriting craft + Suno prompt patterns | Music supervisor when commissioning a track via Suno |
| `heartmula` | Open-source music generation (Apache-2.0, Suno-like) | Music supervisor generating bespoke tracks without external APIs |
| `songsee` | Spectrograms, mel/chroma/MFCC of audio files | Music supervisor analyzing tracks; foley-designer designing to a beat; editor visualizing a mix |
| `spotify` | Spotify control — play, search, queue, manage playlists | Music supervisor sourcing existing tracks; reference research |
| `youtube-content` | Fetch transcripts + transform to chapters/summaries/posts | Documentary cut, content adaptation, research for explainers |
| `gif-search` | Find existing GIFs | Editor / concept artist sourcing references |
| `gifs` | GIF tooling | Masterer producing GIF deliverables |
### Kanban infrastructure (`hermes-agent/skills/devops/`)
| Skill | What it does | When to load |
|-------|--------------|--------------|
| `kanban-orchestrator` | Decomposition playbook + anti-temptation rules for orchestrator profiles | Director only |
| `kanban-worker` | Pitfalls, examples, edge cases for kanban workers (deeper than auto-injected guidance) | Any profile — load when handling tricky multi-step workflows |
The kanban plugin auto-injects baseline orchestration guidance into every
worker's system prompt — the `kanban_create` fan-out pattern, claim/handoff
lifecycle, and the "decompose, don't execute" rule for orchestrators.
`kanban-orchestrator` and `kanban-worker` are deeper playbooks loaded when a
profile needs them.
## External tools (called from terminal toolset)
These are **not** Hermes skills but external CLIs / APIs that profiles invoke.
They don't appear in `always_load`; instead the role's terminal commands hit
them directly.
| Tool | What it does | Profile that uses it |
|------|--------------|----------------------|
| `ffmpeg` | Video / audio encode, splice, mux | renderer, editor, audio-mixer, masterer |
| `ffprobe` | Inspect media | All media-touching profiles |
| Whisper (CLI or API) | Speech-to-text for captions | captioner |
| Text-to-image API (FAL / Replicate / OpenAI / Midjourney) | Stills generation | image-generator (alternative to local `comfyui`) |
| Image-to-video API (Runway / Kling / Luma / Pika) | Animate stills | image-to-video-generator |
| Text-to-speech API (ElevenLabs / OpenAI TTS / etc.) | Voiceover generation | voice-talent |
| Suno API or web | Track composition (paired with `songwriting-and-ai-music`) | music-supervisor |
| Remotion CLI (`npx remotion render`) | React-based motion graphics | renderer-motion-graphics |
| Manim CE (`manim`) | Math animation render (driven by `manim-video` skill's recipes) | renderer-manim |
| Blender (`blender -b`) | 3D rendering (alternative to `blender-mcp`) | renderer-3d |
## Built-in Hermes tools for media review
These are native Hermes tools — not invoked via terminal but through their own
toolsets. Enable them per-profile by adding the toolset to the profile config.
| Tool | Toolset | What it does | Profile that uses it |
|------|---------|--------------|----------------------|
| `video_analyze` | `video` (opt-in — `hermes tools enable video`) | Native video understanding — sends full clip to a multimodal LLM (Gemini via OpenRouter) for review without frame extraction. Supports mp4, webm, mov, avi, mkv. 50 MB cap. Model: `AUXILIARY_VIDEO_MODEL` env → `AUXILIARY_VISION_MODEL` fallback. | reviewer, cinematographer, editor |
| `vision_analyze` | `vision` (core — enabled by default) | Image/frame analysis — review stills, thumbnails, exported frames. Already available to all profiles without opt-in. | reviewer, cinematographer, concept-artist |
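
Because `video_analyze` accepts only certain containers and has a 50 MB cap, a worker may want to check a clip before submitting it. This is a sketch of such a pre-check; the function name is hypothetical, and reading "50 MB" as 50 MiB is an assumption.

```python
import os

# Formats listed in the table above.
VIDEO_EXTENSIONS = {".mp4", ".webm", ".mov", ".avi", ".mkv"}
MAX_BYTES = 50 * 1024 * 1024  # ASSUMPTION: "50 MB" interpreted as 50 MiB


def video_analyze_precheck(path):
    """Return None if the clip looks acceptable, otherwise a short reason."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in VIDEO_EXTENSIONS:
        return f"unsupported extension {ext!r}"
    if os.path.getsize(path) > MAX_BYTES:
        return "file exceeds the 50 MB cap"
    return None
```

When the pre-check fails on size, one option is to re-encode a lower-bitrate review copy with ffmpeg and analyze that instead.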
## Standard toolset configurations per role
### director
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-orchestrator
```
The director keeps terminal access by convention, but its `SOUL.md` rules
forbid execution. Audit logs catch violations.
### writer / copywriter
```yaml
toolsets:
  - kanban
  - file
skills:
  always_load:
    - kanban-worker
    - humanizer  # post-process scripts to strip AI-tells
```
No terminal — writers don't need it.
### concept-artist
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
    # plus one or more (style-dependent):
    # - claude-design (UI / web product video)
    # - sketch (quick mockup variants)
    # - excalidraw (hand-drawn frames)
    # - ascii-art (ASCII style frames)
    # - pixel-art (retro/game aesthetic)
    # - popular-web-designs (matching known web aesthetic)
    # - design-md (text-based design docs)
```
### storyboarder
```yaml
toolsets:
  - kanban
  - file
skills:
  always_load:
    - kanban-worker
    # one of:
    # - excalidraw (sketch storyboards)
    # - architecture-diagram (technical/system content)
    # - concept-diagrams (educational / scientific content)
```
### cinematographer
```yaml
toolsets:
  - kanban
  - terminal
  - file
  - video   # video_analyze: review full clips natively
  - vision  # vision_analyze: review stills / exported frames
skills:
  always_load:
    - kanban-worker
    # the visual skill that matches the project, e.g.:
    # - ascii-video (ASCII projects)
    # - manim-video (math/explainer)
    # - p5js (generative)
    # - comfyui (AI-generated visuals)
    # - blender-mcp (3D)
    # - touchdesigner-mcp (real-time/installation)
```
### renderer (specialized variants)
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
    # ONE skill per renderer variant (or empty for external-API renderers):
    # - ascii-video (renderer-ascii)
    # - manim-video (renderer-manim)
    # - p5js (renderer-p5js)
    # - comfyui (renderer-comfyui: img/video AI gen)
    # - touchdesigner-mcp (renderer-touchdesigner)
    # - blender-mcp (renderer-3d)
    # - pixel-art (renderer-pixel)
    # - baoyu-comic (renderer-comic)
    # - meme-generation (renderer-meme)
```
For external-API renderers (image-to-video-generator using Runway, voice-talent
using ElevenLabs, renderer-motion-graphics using Remotion), `always_load` only
contains `kanban-worker` — the role's work is API-driven and the API key +
terminal commands suffice.
For multi-skill renderer setups (rare — usually one variant per skill is
cleaner) use `--skill <name>` on individual `kanban_create` calls to override
which skill loads for that specific task.
### image-generator / image-to-video-generator / voice-talent
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
    # for image-generator that drives ComfyUI locally:
    # - comfyui
env_required:
  # populate based on the chosen API:
  - FAL_KEY             # or REPLICATE_API_TOKEN, OPENAI_API_KEY for image-gen
  - RUNWAY_API_KEY      # or KLING_API_KEY, LUMA_API_KEY for image-to-video
  - ELEVENLABS_API_KEY  # or OPENAI_API_KEY for TTS
```
If the user's setup has ComfyUI installed locally, the `comfyui` skill can
replace the external image-gen API entirely (cheaper, more control, supports
custom workflows for image-to-video too).
### music-supervisor
```yaml
toolsets:
  - kanban
  - terminal
  - file
skills:
  always_load:
    - kanban-worker
    - songsee  # spectrograms / audio analysis
    # plus (depending on what the project needs):
    # - songwriting-and-ai-music (commissioning Suno tracks)
    # - heartmula (commissioning open-source local generation)
    # - spotify (sourcing existing tracks)
```
### editor / audio-mixer / captioner / masterer
```yaml
toolsets:
  - kanban
  - terminal
  - file
  - video   # video_analyze: editor reviews assembled cuts natively
  - vision  # vision_analyze: spot-check frames
skills:
  always_load:
    - kanban-worker
```
These are mostly ffmpeg-driven; no special skill needed beyond `kanban-worker`.
For the captioner, add Whisper invocation patterns to its `SOUL.md`.
### reviewer / brand-cop
```yaml
toolsets:
  - kanban
  - terminal  # for media inspection (ffprobe, etc.)
  - file
  - video     # video_analyze: review full clips natively
  - vision    # vision_analyze: review stills / exported frames
skills:
  always_load:
    - kanban-worker
```
## API key requirements
Track these in the project setup. The setup script should verify each required
key is present in `~/.hermes/.env` (or macOS Keychain) before firing the kanban.
| Service | Env var | Used by |
|---------|---------|---------|
| ElevenLabs | `ELEVENLABS_API_KEY` | voice-talent |
| OpenAI | `OPENAI_API_KEY` | image-generator (DALL-E), voice-talent (TTS) |
| OpenRouter | `OPENROUTER_API_KEY` | reviewer, cinematographer, editor (`video_analyze` routes through `AUXILIARY_VIDEO_MODEL` → OpenRouter) |
| FAL | `FAL_KEY` | image-generator (FAL flux models) |
| Replicate | `REPLICATE_API_TOKEN` | image-generator (alternate provider) |
| Runway | `RUNWAY_API_KEY` | image-to-video-generator |
| Kling | `KLING_API_KEY` | image-to-video-generator (alternate) |
| Luma | `LUMA_API_KEY` | image-to-video-generator (alternate) |
| Suno | `SUNO_API_KEY` | music-supervisor (paired with `songwriting-and-ai-music`) |
| Spotify | `SPOTIFY_CLIENT_ID` + `SPOTIFY_CLIENT_SECRET` | music-supervisor (paired with `spotify` skill) |
| Anthropic | `ANTHROPIC_API_KEY` | every Hermes profile (Claude) |
If a key is missing, prompt the user to add it. Storage methods, in order of
preference: macOS Keychain → `~/.hermes/.env` → environment variable.
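The presence check described above could be sketched in Python roughly as follows. This is a minimal illustration, not the setup script's actual logic: it covers only the `~/.hermes/.env` and process-environment lookups (the Keychain lookup is omitted), and the helper names `parse_env_file` and `missing_keys`, as well as the plain `KEY=value` parsing, are assumptions.

```python
import os
from pathlib import Path


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    if not path.exists():
        return values
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(required: list[str], env_file: Path) -> list[str]:
    """Return the required keys set in neither env_file nor os.environ."""
    file_values = parse_env_file(env_file)
    return [
        k for k in required
        if not file_values.get(k) and not os.environ.get(k)
    ]
```

A caller would pass the plan's required key names and `Path.home() / ".hermes" / ".env"`, then prompt the user for anything returned.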
## Skill version pinning
If a specific skill version is desired, pass it via the per-task
`--skill <name>=<version>` flag. The default is whatever's installed.
## Adding a new skill to the matrix
When a new Hermes-public video skill ships:
1. Add a row to the relevant table at the top of this file
2. If it warrants a specialized renderer variant, add to `role-archetypes.md`
3. Update relevant per-style examples in `examples.md`
@@ -0,0 +1,501 @@
#!/usr/bin/env python3
"""
Bootstrap a video production kanban from a structured plan JSON.
Reads a plan.json describing the team + brief, expands templates from
../assets/, and writes a setup.sh that creates Hermes profiles and fires the
initial kanban task.
Profile-config patching, SOUL.md-per-profile, TEAM.md task-graph convention,
and the `hermes kanban create --workspace dir:` initial-task pattern are
adapted from alt-glitch's NousResearch/kanban-video-pipeline.
Usage:
bootstrap_pipeline.py plan.json [--out setup.sh]
The plan.json schema is documented inline below; see the `validate_plan`
function. A minimal example:
{
"title": "Q3 Product Teaser",
"slug": "q3-product-teaser",
"tenant": "q3-product-teaser",
"duration_s": 30,
"aspect": "1:1",
"resolution": "1080x1080",
"fps": 30,
"team": [
{
"profile": "director",
"role": "director",
"toolsets": ["kanban", "terminal", "file"],
"skills": [],
"responsibilities": "...",
"inputs": "brief.md, TEAM.md, taste/",
"outputs": "kanban tasks for the team"
},
...
],
"scenes": [
{"n": 1, "time": "0:00-0:08", "content": "...", "tool": "renderer-ascii"},
...
],
"audio": {"approach": "voiceover + music bed", "vo": "ElevenLabs Lily",
"music": "license-free", "sfx": "n/a"},
"deliverables": [
{"format": "mp4", "resolution": "1080x1080", "notes": "primary"}
],
"api_keys_required": ["ELEVENLABS_API_KEY", "OPENROUTER_API_KEY"],
"brief_extra": {
"concept_one_liner": "...",
"emotional_north_star": "...",
"visual_refs": "...",
"tone": "...",
"brand_constraints": "..."
}
}
"""
from __future__ import annotations
import argparse
import json
import os
import re
import sys
from pathlib import Path
ASSETS_DIR = Path(__file__).resolve().parent.parent / "assets"
def load_template(name: str) -> str:
return (ASSETS_DIR / name).read_text()
PROFILE_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")
SLUG_RE = re.compile(r"^[a-z0-9][a-z0-9-]+$")
def validate_plan(plan: dict) -> list[str]:
"""Return a list of validation error strings; empty list = valid."""
errors = []
required_top = ["title", "slug", "tenant", "duration_s", "aspect",
"resolution", "fps", "team", "scenes", "audio",
"deliverables"]
for k in required_top:
if k not in plan:
errors.append(f"missing required key: {k}")
if "team" in plan:
if not isinstance(plan["team"], list) or not plan["team"]:
errors.append("team must be a non-empty list")
else:
roles = [t.get("role") for t in plan["team"]]
if "director" not in roles:
errors.append("team must include a director role")
seen_profiles = set()
for i, t in enumerate(plan["team"]):
for k in ["profile", "role", "toolsets", "skills",
"responsibilities"]:
if k not in t:
errors.append(f"team[{i}] missing {k}")
# Profile name must match Hermes's regex (lowercase
# alphanumeric + hyphens + underscores, up to 64 chars).
if "profile" in t:
if not PROFILE_NAME_RE.match(t["profile"]):
errors.append(
f"team[{i}].profile {t['profile']!r} must match "
f"[a-z0-9][a-z0-9_-]{{0,63}} per Hermes profile rules"
)
if t["profile"] in seen_profiles:
errors.append(
f"team[{i}].profile {t['profile']!r} is duplicated"
)
seen_profiles.add(t["profile"])
# Toolsets / skills must be lists, not strings.
if "toolsets" in t and not isinstance(t["toolsets"], list):
errors.append(
f"team[{i}].toolsets must be a list of strings"
)
if "skills" in t and not isinstance(t["skills"], list):
errors.append(
f"team[{i}].skills must be a list of strings"
)
if "slug" in plan:
if not SLUG_RE.match(plan["slug"]):
errors.append("slug must be lowercase, hyphenated, "
"starting with [a-z0-9]")
return errors
def render_brief(plan: dict) -> str:
"""Render brief.md from the plan."""
tmpl = load_template("brief.md.tmpl")
extra = plan.get("brief_extra", {})
# Scene table rows
scene_rows = []
for s in plan["scenes"]:
scene_rows.append(
f"| {s.get('n', '?')} | {s.get('time', '?')} | "
f"{s.get('content', '')} | {s.get('tool', '')} | "
f"{s.get('audio', '')} | {s.get('notes', '')} |"
)
scene_table = "\n".join(scene_rows) if scene_rows else "_(none yet)_"
# Deliverable rows
deliv_rows = []
for d in plan["deliverables"]:
deliv_rows.append(
f"| {d.get('format', '?')} | {d.get('resolution', '?')} | "
f"{d.get('notes', '')} |"
)
deliv_table = "\n".join(deliv_rows) if deliv_rows else "_(none)_"
# Replacements (single-pass)
replacements = {
"TITLE": plan["title"],
"SLUG": plan["slug"],
"TENANT": plan["tenant"],
"WORKSPACE": f"~/projects/video-pipeline/{plan['slug']}",
"ONE_LINE_PITCH": extra.get("concept_one_liner", "_(TBD)_"),
"EMOTIONAL_NORTH_STAR": extra.get("emotional_north_star", "_(TBD)_"),
"DURATION_S": str(plan["duration_s"]),
"ASPECT": plan["aspect"],
"RESOLUTION": plan["resolution"],
"FPS": str(plan["fps"]),
"PLATFORMS": extra.get("platforms", "_(TBD)_"),
"DEADLINE": extra.get("deadline", "_(none)_"),
"QUALITY_BAR": extra.get("quality_bar", "polished"),
"VISUAL_REFS": extra.get("visual_refs", "_(none)_"),
"TONE": extra.get("tone", "_(TBD)_"),
"BRAND_CONSTRAINTS": extra.get("brand_constraints", "_(none)_"),
"AESTHETIC_RULES": extra.get("aesthetic_rules", "_(TBD)_"),
"AUDIO_APPROACH": plan["audio"].get("approach", "_(TBD)_"),
"VO_DETAILS": plan["audio"].get("vo", "_(n/a)_"),
"MUSIC_DETAILS": plan["audio"].get("music", "_(n/a)_"),
"SFX_DETAILS": plan["audio"].get("sfx", "_(n/a)_"),
"PRIMARY_FORMAT": plan["deliverables"][0]["format"],
"PRIMARY_RES": plan["deliverables"][0]["resolution"],
"ALT_FORMAT_1": (plan["deliverables"][1]["format"]
if len(plan["deliverables"]) > 1 else "_(none)_"),
"ALT_RES_1": (plan["deliverables"][1]["resolution"]
if len(plan["deliverables"]) > 1 else ""),
"ALT_NOTES_1": (plan["deliverables"][1].get("notes", "")
if len(plan["deliverables"]) > 1 else ""),
"API_KEYS_REQUIRED": ", ".join(plan.get("api_keys_required", [])) or "none",
"EXT_DEPS": extra.get("ext_deps", "ffmpeg, Python 3.11+"),
"SOURCE_ASSETS": extra.get("source_assets", "_(none)_"),
}
out = tmpl
for k, v in replacements.items():
out = out.replace("{{" + k + "}}", str(v))
# Scene table: replace the placeholder rows in the template. Note that
# deliv_table is built above but is not substituted into the output here.
out = re.sub(
r"\|\s*1\s*\|\s*0:000:0X.+?\n\|\s*2\s*\|.+?\n",
scene_table + "\n",
out, flags=re.DOTALL,
)
return out
def render_team_md(plan: dict) -> str:
"""Render TEAM.md from the team list + scene → tool mapping."""
lines = [f"# Team & Task Graph — {plan['title']}", "", "## Team", ""]
for t in plan["team"]:
skills = (
f"loads `{', '.join(t['skills'])}`"
if t["skills"] else "no skills required"
)
lines.append(
f"- `{t['profile']}` — {t['responsibilities']} ({skills})"
)
lines.extend(["", "## Task Graph", "", "```"])
# Build a simple task graph based on conventions
profiles_by_role = {t["role"]: t["profile"] for t in plan["team"]}
director = profiles_by_role.get("director", "director")
lines.append(f"T0 {director} — decompose")
next_id = 1
parents_for_renderer: list[str] = ["T0"]
if "cinematographer" in profiles_by_role:
cid = f"T{next_id}"
lines.append(
f"{cid:5} {profiles_by_role['cinematographer']} — visual spec for all scenes (parent: T0)"
)
parents_for_renderer = [cid]
next_id += 1
if "music-supervisor" in profiles_by_role:
cid = f"T{next_id}"
lines.append(
f"{cid:5} {profiles_by_role['music-supervisor']} — track analysis + beats.json (parent: T0)"
)
next_id += 1
ms_id = cid
else:
ms_id = None
# Scenes
scene_ids = []
for s in plan["scenes"]:
cid = f"T{next_id}"
renderer_profile = s.get("tool") or "renderer"
# Lookup the actual profile name
for t in plan["team"]:
if t["role"] == renderer_profile or t["profile"] == renderer_profile:
renderer_profile = t["profile"]
break
parents = parents_for_renderer + ([ms_id] if ms_id else [])
parent_str = ", ".join(parents)
lines.append(
f"{cid:5} {renderer_profile} — scene {s.get('n', '?')}: "
f"{s.get('content', '')[:50]} (parents: {parent_str})"
)
scene_ids.append(cid)
next_id += 1
# VO + audio mix
if "voice-talent" in profiles_by_role:
vo_id = f"T{next_id}"
lines.append(f"{vo_id:5} {profiles_by_role['voice-talent']} — narration (parent: T0)")
next_id += 1
else:
vo_id = None
if "audio-mixer" in profiles_by_role:
am_id = f"T{next_id}"
am_parents = [p for p in [ms_id, vo_id] if p]
lines.append(
f"{am_id:5} {profiles_by_role['audio-mixer']} — mix audio (parents: {', '.join(am_parents)})"
)
next_id += 1
else:
am_id = None
# Editor
if "editor" in profiles_by_role:
ed_id = f"T{next_id}"
ed_parents = scene_ids + [p for p in [am_id, vo_id, ms_id] if p and p not in scene_ids]
lines.append(
f"{ed_id:5} {profiles_by_role['editor']} — assemble + mux (parents: {', '.join(ed_parents)})"
)
next_id += 1
else:
ed_id = None
# Captioner
if "captioner" in profiles_by_role and ed_id:
cap_id = f"T{next_id}"
lines.append(
f"{cap_id:5} {profiles_by_role['captioner']} — SRT + burn (parent: {ed_id})"
)
next_id += 1
last = cap_id
else:
last = ed_id
# Reviewer
if "reviewer" in profiles_by_role and last:
rv_id = f"T{next_id}"
lines.append(
f"{rv_id:5} {profiles_by_role['reviewer']} — final QA (parent: {last})"
)
lines.append("```")
lines.extend([
"",
"## Per-task workspace requirement",
"",
f"All `kanban_create` calls MUST pass:",
f"```",
f'workspace_kind="dir"',
f'workspace_path="$HOME/projects/video-pipeline/{plan["slug"]}"',
f'tenant="{plan["tenant"]}"',
f"```",
])
return "\n".join(lines)
def render_setup_sh(plan: dict, brief_md: str, team_md: str) -> str:
"""Render setup.sh from the plan."""
tmpl = load_template("setup.sh.tmpl")
# API key checks
key_checks = []
for key in plan.get("api_keys_required", []):
key_checks.append(f'check_key {key} hermes {key} || exit 1')
key_checks_str = "\n".join(key_checks) if key_checks else "# (no API keys required)"
# Scene dirs
scene_dir_lines = []
for s in plan["scenes"]:
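n = s.get("n", "?")
# Zero-pad only integer scene numbers; `{n:02d}` raises ValueError on the "?" fallback.
label = f"{n:02d}" if isinstance(n, int) else str(n)
scene_dir_lines.append(f'mkdir -p "$WORKSPACE/scenes/scene-{label}"/checkpoints')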
scene_dirs = "\n".join(scene_dir_lines) if scene_dir_lines else ""
# Profile create
profile_creates = []
for t in plan["team"]:
profile_creates.append(
f'hermes profile create {t["profile"]} --clone 2>/dev/null || true'
)
# Profile config — emit JSON arrays so the bash function can pass them
# safely through to the Python YAML patcher.
profile_configs = []
for t in plan["team"]:
ts_json = json.dumps(t["toolsets"])
sk_json = json.dumps(t["skills"])
# Use single-quoted bash strings; JSON only contains "/[/], no single
# quotes, so this is safe.
profile_configs.append(
f"configure_profile {t['profile']!r} {ts_json!r} {sk_json!r}"
)
# SOUL writes — uses heredocs per profile
soul_writes = []
for t in plan["team"]:
soul_writes.append(
f'cat > "$HOME/.hermes/profiles/{t["profile"]}/SOUL.md" <<\'SOUL_EOF\'\n'
f"{render_soul_md(t, plan)}\n"
f"SOUL_EOF\n"
f'echo " ✓ SOUL.md for {t["profile"]}"'
)
# Taste writes (placeholder; real content optional)
taste_writes = (
'cat > "$WORKSPACE/taste/brand-guide.md" <<\'TASTE_EOF\'\n'
'# Brand Guide\n\n'
'_(Populate with project-specific colors, typography, motion rules)_\n'
'TASTE_EOF\n'
'cat > "$WORKSPACE/taste/emotional-dna.md" <<\'DNA_EOF\'\n'
'# Emotional DNA\n\n'
'_(What this piece should FEEL like — populate from the brief.)_\n'
'DNA_EOF'
)
# Asset copies — leave empty by default; user fills in
asset_copies = "# Add cp/rsync commands here for any provided assets"
out = tmpl
out = out.replace("{{TITLE}}", plan["title"])
out = out.replace("{{SLUG}}", plan["slug"])
out = out.replace("{{TENANT}}", plan["tenant"])
out = out.replace("{{WORKSPACE}}", f"~/projects/video-pipeline/{plan['slug']}")
out = out.replace("{{KEY_CHECKS}}", key_checks_str)
out = out.replace("{{SCENE_DIRS}}", scene_dirs)
out = out.replace("{{PROFILE_CREATE_COMMANDS}}", "\n".join(profile_creates))
out = out.replace("{{PROFILE_CONFIG_COMMANDS}}", "\n".join(profile_configs))
out = out.replace("{{SOUL_WRITES}}", "\n".join(soul_writes))
out = out.replace("{{BRIEF_CONTENTS}}", brief_md)
out = out.replace("{{TEAM_CONTENTS}}", team_md)
out = out.replace("{{TASTE_WRITES}}", taste_writes)
out = out.replace("{{ASSET_COPIES}}", asset_copies)
return out
def render_soul_md(team_member: dict, plan: dict) -> str:
"""Render a profile's SOUL.md from a team member dict + plan context."""
tmpl = load_template("soul.md.tmpl")
role = team_member["role"]
common_rules = (
"- **Read the brief and team graph** before doing anything else.\n"
"- **Pass `workspace_kind=\"dir\"` and `workspace_path` on every "
"`kanban_create` call.** This keeps the team in one shared workspace.\n"
f"- **Use tenant `{plan['tenant']}`** on every kanban call.\n"
"- **Write outputs to predictable paths.** Other profiles depend on "
"your filename conventions.\n"
"- **Emit heartbeats** during long-running work. Renderers should "
"report frame counts; editors should report assembly progress.\n"
)
if role == "director":
common_rules += (
"- **Do not execute the work yourself.** For every concrete task, "
"create a kanban task and assign it to the appropriate profile.\n"
"- **Decompose, route, comment, approve — that's the whole job.**\n"
"- **Read TEAM.md** for the canonical task graph. Do not invent "
"new roles unless the brief truly demands it.\n"
"- **Load the `kanban-orchestrator` skill** for the deeper "
"decomposition playbook beyond the auto-injected baseline.\n"
)
common_commands = (
"```bash\n"
"# Inspect a clip\n"
"ffprobe -v quiet -show_entries format=duration -show_entries "
"stream=codec_name,width,height,r_frame_rate <file.mp4>\n"
"\n"
"# Extract a frame for QA\n"
"ffmpeg -y -i <input.mp4> -vf \"select='eq(n,30)'\" -vsync vfr <out.png>\n"
"```"
)
out = tmpl
out = out.replace("{{ROLE_NAME}}", role)
out = out.replace("{{ROLE_RESPONSIBILITIES}}", team_member["responsibilities"])
out = out.replace("{{INPUTS_READ}}", team_member.get("inputs", "_(see brief)_"))
out = out.replace("{{OUTPUTS_PRODUCED}}", team_member.get("outputs", "_(see brief)_"))
out = out.replace("{{TOOLSETS}}", ", ".join(team_member["toolsets"]))
out = out.replace(
"{{SKILLS}}",
", ".join(team_member["skills"]) if team_member["skills"] else "(none)"
)
out = out.replace(
"{{EXTERNAL_TOOLS}}",
team_member.get("external_tools", "ffmpeg, ffprobe (via terminal)")
)
out = out.replace(
"{{ROLE_RULES}}",
team_member.get("role_rules", "_(see TEAM.md and brief.md)_")
)
out = out.replace("{{COMMON_RULES}}", common_rules)
out = out.replace("{{COMMON_COMMANDS}}", common_commands)
return out
def main():
ap = argparse.ArgumentParser(description=__doc__,
formatter_class=argparse.RawDescriptionHelpFormatter)
ap.add_argument("plan_json", help="Path to plan.json")
ap.add_argument("--out", default="setup.sh",
help="Output path for setup.sh (default: ./setup.sh)")
ap.add_argument("--brief-out", default=None,
help="Write brief.md alongside (default: skipped)")
ap.add_argument("--team-out", default=None,
help="Write TEAM.md alongside (default: skipped)")
args = ap.parse_args()
plan = json.loads(Path(args.plan_json).read_text())
errors = validate_plan(plan)
if errors:
print("Plan validation failed:", file=sys.stderr)
for e in errors:
print(f" - {e}", file=sys.stderr)
sys.exit(2)
brief = render_brief(plan)
team = render_team_md(plan)
setup = render_setup_sh(plan, brief, team)
Path(args.out).write_text(setup)
os.chmod(args.out, 0o755)
print(f"Wrote {args.out}")
if args.brief_out:
Path(args.brief_out).write_text(brief)
print(f"Wrote {args.brief_out}")
if args.team_out:
Path(args.team_out).write_text(team)
print(f"Wrote {args.team_out}")
if __name__ == "__main__":
main()
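Note on the template filling in `bootstrap_pipeline.py` above: the `{{KEY}}` placeholders are filled with chained `str.replace` calls, so a substituted value that itself contains a placeholder-like string (for example a title containing `{{SLUG}}`, or brief contents containing `{{TEAM_CONTENTS}}`) can be expanded by a later replacement. A possible alternative, sketched below under the assumption that placeholder names are uppercase letters, digits, and underscores, substitutes everything in one regex pass. The name `fill_template` is hypothetical and not part of the script.

```python
import re

# Assumed placeholder grammar: {{UPPER_SNAKE_CASE}}
PLACEHOLDER_RE = re.compile(r"\{\{([A-Z0-9_]+)\}\}")


def fill_template(template: str, values: dict) -> str:
    """Substitute {{KEY}} placeholders in a single pass over the template.

    Replacement text is never rescanned, and unknown placeholders are left
    untouched so they remain visible in the rendered output.
    """
    def _sub(match: re.Match) -> str:
        key = match.group(1)
        return str(values[key]) if key in values else match.group(0)

    return PLACEHOLDER_RE.sub(_sub, template)
```

Because `re.sub` does not rescan the text it inserts, a value such as `"uses {{SLUG}}"` passes through literally instead of being expanded.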
@@ -0,0 +1,195 @@
#!/usr/bin/env python3
"""
Monitor a running video-production kanban. Polls `hermes kanban list` and
`events` for a tenant and surfaces issues (stuck tasks, missing heartbeats,
repeated retries, dependency deadlocks).
Usage:
monitor.py --tenant <project-slug> [--interval 30]
Outputs a periodic snapshot to stdout. Sends alerts via stderr when issues
are detected. Designed to run alongside the kanban; kill it with Ctrl-C when
you're satisfied (or scripted to stop on completion).
This is best-effort observability. It does not auto-restart tasks; intervention
decisions should remain human/AI-overseen.
"""
from __future__ import annotations
import argparse
import json
import shutil
import subprocess
import sys
import time
from collections import defaultdict
from datetime import datetime, timedelta
def hermes_available() -> bool:
return shutil.which("hermes") is not None
def kanban_list(tenant: str) -> list[dict]:
"""Returns parsed task rows. Falls back to plain stdout parsing if JSON
output isn't supported by the installed hermes CLI."""
try:
out = subprocess.run(
["hermes", "kanban", "list", "--tenant", tenant, "--json"],
capture_output=True, text=True, check=False,
)
if out.returncode == 0 and out.stdout.strip().startswith("["):
return json.loads(out.stdout)
except (FileNotFoundError, json.JSONDecodeError):
pass
# Fallback: textual parse of `hermes kanban list`
out = subprocess.run(
["hermes", "kanban", "list", "--tenant", tenant],
capture_output=True, text=True, check=False,
)
rows = []
for line in out.stdout.splitlines():
line = line.strip()
if not line or line.startswith("#") or "STATUS" in line.upper():
continue
parts = line.split()
if len(parts) >= 4 and parts[0].startswith("t_"):
rows.append({
"id": parts[0],
"status": parts[1] if len(parts) > 1 else "?",
"assignee": parts[2] if len(parts) > 2 else "?",
"title": " ".join(parts[3:]) if len(parts) > 3 else "",
"started_at": None,
"heartbeat_at": None,
"max_runtime_s": None,
})
return rows
def kanban_show(task_id: str) -> dict | None:
out = subprocess.run(
["hermes", "kanban", "show", task_id, "--json"],
capture_output=True, text=True, check=False,
)
if out.returncode != 0:
return None
try:
return json.loads(out.stdout)
except json.JSONDecodeError:
return None
def detect_issues(tasks: list[dict]) -> list[str]:
"""Return a list of issue strings, one per concern."""
now = datetime.now()
issues: list[str] = []
by_status = defaultdict(list)
for t in tasks:
by_status[t.get("status", "?")].append(t)
# Stuck tasks: RUNNING with no heartbeat in 2 min
for t in by_status.get("running", []) + by_status.get("RUNNING", []):
hb = t.get("heartbeat_at")
if not hb:
continue
try:
hb_dt = datetime.fromisoformat(str(hb).rstrip("Z"))
except ValueError:
continue
if now - hb_dt > timedelta(minutes=2):
issues.append(
f"STUCK: {t['id']} ({t.get('assignee', '?')}) — "
f"no heartbeat in {(now - hb_dt).total_seconds():.0f}s"
)
# Tasks exceeding max_runtime
for t in by_status.get("running", []) + by_status.get("RUNNING", []):
started = t.get("started_at")
max_rt = t.get("max_runtime_s")
if not started or not max_rt:
continue
try:
started_dt = datetime.fromisoformat(str(started).rstrip("Z"))
except ValueError:
continue
elapsed = (now - started_dt).total_seconds()
if elapsed > max_rt:
issues.append(
f"OVERTIME: {t['id']} ({t.get('assignee', '?')}) — "
f"running {elapsed:.0f}s, cap was {max_rt}s"
)
# Repeated retries
for t in tasks:
retries = t.get("retries", 0)
if retries and retries >= 2:
issues.append(
f"FLAPPING: {t['id']} ({t.get('assignee', '?')}) — "
f"retried {retries}× — fix root cause before next run"
)
return issues
def snapshot(tenant: str) -> tuple[list[dict], list[str]]:
tasks = kanban_list(tenant)
issues = detect_issues(tasks)
return tasks, issues
def print_snapshot(tasks: list[dict], issues: list[str]):
counts = defaultdict(int)
for t in tasks:
counts[str(t.get("status", "?")).lower()] += 1
print(f"\n[{datetime.now().strftime('%H:%M:%S')}] "
f"Total: {len(tasks)} | "
+ " | ".join(f"{k}: {v}" for k, v in sorted(counts.items())))
for t in tasks:
bar = "" if str(t.get("status", "")).lower() == "done" else \
"" if str(t.get("status", "")).lower() == "running" else \
"·" if str(t.get("status", "")).lower() == "ready" else \
"" if str(t.get("status", "")).lower() == "failed" else "?"
print(f" {bar} {t.get('id', '?'):14} {t.get('assignee', '?'):20} "
f"{t.get('title', '')[:60]}")
if issues:
print("\n ⚠ ISSUES:", file=sys.stderr)
for i in issues:
print(f" {i}", file=sys.stderr)
def main():
ap = argparse.ArgumentParser(description=__doc__,
formatter_class=argparse.RawDescriptionHelpFormatter)
ap.add_argument("--tenant", required=True,
help="Project tenant slug to monitor")
ap.add_argument("--interval", type=int, default=30,
help="Poll interval in seconds (default: 30)")
ap.add_argument("--once", action="store_true",
help="Print one snapshot and exit (no polling loop)")
args = ap.parse_args()
if not hermes_available():
print("ERROR: 'hermes' CLI not found in PATH", file=sys.stderr)
sys.exit(1)
if args.once:
tasks, issues = snapshot(args.tenant)
print_snapshot(tasks, issues)
sys.exit(0 if not issues else 2)
print(f"Monitoring tenant '{args.tenant}' every {args.interval}s. "
"Ctrl-C to exit.")
try:
while True:
tasks, issues = snapshot(args.tenant)
print_snapshot(tasks, issues)
time.sleep(args.interval)
except KeyboardInterrupt:
print("\nStopped.")
if __name__ == "__main__":
main()
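Note on the heartbeat check in `monitor.py` above: `detect_issues` strips a trailing `Z` and compares the result with a naive `datetime.now()`, which is local time. If the kanban emits UTC timestamps (an assumption; the timestamp format is not shown in this diff), the computed age is off by the local UTC offset. A timezone-aware variant might look like the sketch below, where `parse_utc` and `is_stale` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp into an aware datetime.

    Assumption: a trailing "Z" or a missing offset both mean UTC.
    """
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"
    dt = datetime.fromisoformat(ts)
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def is_stale(heartbeat_at: str, now: datetime,
             limit: timedelta = timedelta(minutes=2)) -> bool:
    """True when the heartbeat is older than `limit` relative to aware `now`."""
    return now - parse_utc(heartbeat_at) > limit
```

Callers would pass `datetime.now(timezone.utc)` as `now`; taking it as a parameter also makes the check easy to test with a fixed clock.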
+8 -1
@@ -43,7 +43,7 @@ class NodeServer:
def __init__(
self,
host: str = "0.0.0.0",
host: str = "127.0.0.1",
port: int = 18789,
token_path: Optional[Path] = None,
display_name: str = "hermes-meet-node",
@@ -76,6 +76,13 @@ class NodeServer:
json.dumps({"token": tok, "generated_at": time.time()}, indent=2),
encoding="utf-8",
)
# Restrict to owner-read-write only — the token grants full RPC
# access to the meet bot (start, transcribe, speak in meetings).
try:
tmp.chmod(0o600)
except (OSError, NotImplementedError):
# Best-effort on non-POSIX filesystems; mode is set on POSIX.
pass
tmp.replace(self.token_path)
self._token = tok
return tok
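Note on the token-file hunk above: it writes the temp file and then calls `chmod(0o600)`, which leaves a short window in which the file has default permissions. One alternative, sketched here and not taken from the codebase, is to create the temp file with mode `0o600` from the start via `os.open`. The function name `write_token_file` and the `.tmp` suffix are assumptions for illustration.

```python
import json
import os
import time
from pathlib import Path


def write_token_file(token_path: Path, token: str) -> None:
    """Atomically write the token JSON with owner-only permissions on POSIX."""
    tmp = token_path.with_suffix(token_path.suffix + ".tmp")
    payload = json.dumps({"token": token, "generated_at": time.time()}, indent=2)
    # Remove any stale temp file so O_EXCL creation (and its mode) applies.
    tmp.unlink(missing_ok=True)
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(payload)
    # Same-directory rename, so the final path never holds a partial file.
    os.replace(tmp, token_path)
```

On non-POSIX filesystems the mode argument is largely ignored, so this remains best-effort there, as in the original hunk.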
+9 -6
@@ -10300,7 +10300,10 @@ class AIAgent:
provider_preferences["order"] = self.providers_order
if self.provider_sort:
provider_preferences["sort"] = self.provider_sort
if provider_preferences:
if provider_preferences and (
(self.provider or "").strip().lower() == "openrouter"
or self._is_openrouter_url()
):
summary_extra_body["provider"] = provider_preferences
if summary_extra_body:
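Note on the hunk above: provider routing preferences are now attached only when the request targets OpenRouter. The gating can be illustrated in isolation as below. This is a sketch only: the body of `_is_openrouter_url()` is not shown in this diff, so the substring match on `openrouter.ai` and the function names `is_openrouter` and `build_extra_body` are assumptions.

```python
from typing import Optional


def is_openrouter(provider: Optional[str], base_url: Optional[str]) -> bool:
    """Hypothetical stand-in for the provider-name / URL check in the hunk."""
    if (provider or "").strip().lower() == "openrouter":
        return True
    return "openrouter.ai" in (base_url or "").lower()


def build_extra_body(provider: Optional[str], base_url: Optional[str],
                     preferences: dict) -> dict:
    """Attach routing preferences only for OpenRouter-bound requests."""
    extra: dict = {}
    if preferences and is_openrouter(provider, base_url):
        extra["provider"] = preferences
    return extra
```

With this shape, a request to any other provider gets an empty extra body even when preferences are configured.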
@@ -10623,11 +10626,11 @@ class AIAgent:
self.model,
f"{self.context_compressor.context_length:,}",
)
if not self.quiet_mode:
self._safe_print(
f"📦 Preflight compression: ~{_preflight_tokens:,} tokens "
f">= {self.context_compressor.threshold_tokens:,} threshold"
)
self._emit_status(
f"📦 Preflight compression: ~{_preflight_tokens:,} tokens "
f">= {self.context_compressor.threshold_tokens:,} threshold. "
"This may take a moment."
)
# May need multiple passes for very large sessions with small
# context windows (each pass summarises the middle N turns).
for _pass in range(3):
+37
@@ -67,7 +67,10 @@ AUTHOR_MAP = {
"nbot@liizfq.top": "liizfq",
"274096618+hermes-agent-dhabibi@users.noreply.github.com": "dhabibi",
"dejie.guo@gmail.com": "JayGwod",
"133716830+0xKingBack@users.noreply.github.com": "0xKingBack",
"daixin1204@gmail.com": "SimbaKingjoe",
"maxence@groine.fr": "MaxyMoos",
"61830395+leprincep35700@users.noreply.github.com": "leprincep35700",
# OpenViking viking_read salvage (April 2026)
"hitesh@gmail.com": "htsh",
"pty819@outlook.com": "pty819",
@@ -505,6 +508,10 @@ AUTHOR_MAP = {
"michel.belleau@malaiwah.com": "malaiwah",
"gnanasekaran.sekareee@gmail.com": "gnanam1990",
"jz.pentest@gmail.com": "0xyg3n",
"7093928+0xyg3n@users.noreply.github.com": "0xyg3n",
"nftpoetrist@gmail.com": "nftpoetrist", # PR #18982
"millerc79@users.noreply.github.com": "millerc79", # PR #19033
"hermes@example.com": "shellybotmoyer", # PR #18915 (bot-committed)
"hypnosis.mda@gmail.com": "Hypn0sis",
"ywt000818@gmail.com": "OwenYWT",
"dhandhalyabhavik@gmail.com": "v1k22",
@@ -616,6 +623,33 @@ AUTHOR_MAP = {
"2114364329@qq.com": "cuyua9",
"2557058999@qq.com": "Disaster-Terminator",
"cine.dreamer.one@gmail.com": "LeonSGP43",
"zyprothh@gmail.com": "Zyproth",
"amitgaur@gmail.com": "amitgaur",
"albuquerque.abner@gmail.com": "mrbob-git",
"kiala@users.noreply.github.com": "kiala9",
"alanxchen@gmail.com": "alanxchen85",
"clawbot@clawbots-Mac-mini.local": "John-tip",
"der@konsi.org": "konsisumer",
"cirwel@The-CIRWEL-Group.local": "CIRWEL",
"molvikar8@gmail.com": "molvikar",
"nftpoetrist@gmail.com": "nftpoetrist",
"dodofun@126.com": "colorcross",
"1615063567@qq.com": "zhao0112",
"ethanguo.2003@gmail.com": "EthanGuo-coder",
"dev0jsh@gmail.com": "tmdgusya",
"leavr@163.com": "leavrcn",
"17683456+wanazhar@users.noreply.github.com": "wanazhar",
"26782336+cixuuz@users.noreply.github.com": "cixuuz",
"aleksandr.pasevin@openzeppelin.com": "pasevin",
"ubuntu@localhost.localdomain": "holynn-q",
"holynn@placeholder.local": "holynn-q",
"agent@hermes.local": "jacdevos",
"sunsky.lau@gmail.com": "liuhao1024",
"qiuqfang98@qq.com": "keepcalmqqf",
"261867348+ai-ag2026@users.noreply.github.com": "ai-ag2026",
"yanzh.su@gmail.com": "YanzhongSu",
"wanderwang@users.noreply.github.com": "WanderWang",
"yueheime@gmail.com": "yuehei",
"leozeli@qq.com": "leozeli",
"linlehao@cuhk.edu.cn": "LehaoLin",
"liutong@isacas.ac.cn": "I3eg1nner",
@@ -673,6 +707,9 @@ AUTHOR_MAP = {
"web3blind@gmail.com": "web3blind",
"ztzheng@163.com": "chengoak", # PR #17467
"24110240104@m.fudan.edu.cn": "YuShu", # co-author only
"charliekerfoot@gmail.com": "CharlieKerfoot", # PR #18951
# Debug share upload-time redaction (May 2026)
"dhuysamen@gmail.com": "GodsBoy", # PR #19318
}
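Note on the `AUTHOR_MAP` hunks above: `"nftpoetrist@gmail.com"` appears twice. Python silently keeps the last value for a repeated key in a dict literal, and both entries map to the same username here, so the result is unchanged, but a duplicate with a different value would go unnoticed. A small check along these lines could flag such cases; `duplicate_dict_keys` is a hypothetical helper, not part of the repository.

```python
import ast
from collections import Counter


def duplicate_dict_keys(source: str) -> list[str]:
    """Return string keys that occur more than once in any dict literal."""
    dupes: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Dict):
            # node.keys holds None for `**mapping` entries; those are skipped.
            keys = [
                k.value for k in node.keys
                if isinstance(k, ast.Constant) and isinstance(k.value, str)
            ]
            dupes.extend(k for k, n in Counter(keys).items() if n > 1)
    return dupes
```

Running it over the file's source text returns each repeated key once per dict literal.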
+97 -119
@@ -25,15 +25,15 @@
}
},
"node_modules/@cacheable/memory": {
"version": "2.0.7",
"resolved": "https://registry.npmjs.org/@cacheable/memory/-/memory-2.0.7.tgz",
"integrity": "sha512-RbxnxAMf89Tp1dLhXMS7ceft/PGsDl1Ip7T20z5nZ+pwIAsQ1p2izPjVG69oCLv/jfQ7HDPHTWK0c9rcAWXN3A==",
"version": "2.0.8",
"resolved": "https://registry.npmjs.org/@cacheable/memory/-/memory-2.0.8.tgz",
"integrity": "sha512-FvEb29x5wVwu/Kf93IWwsOOEuhHh6dYCJF3vcKLzXc0KXIW181AOzv6ceT4ZpBHDvAfG60eqb+ekmrnLHIy+jw==",
"license": "MIT",
"dependencies": {
"@cacheable/utils": "^2.3.3",
"@keyv/bigmap": "^1.3.0",
"hookified": "^1.14.0",
"keyv": "^5.5.5"
"@cacheable/utils": "^2.4.0",
"@keyv/bigmap": "^1.3.1",
"hookified": "^1.15.1",
"keyv": "^5.6.0"
}
},
"node_modules/@cacheable/node-cache": {
@@ -51,19 +51,19 @@
}
},
"node_modules/@cacheable/utils": {
"version": "2.3.4",
"resolved": "https://registry.npmjs.org/@cacheable/utils/-/utils-2.3.4.tgz",
"integrity": "sha512-knwKUJEYgIfwShABS1BX6JyJJTglAFcEU7EXqzTdiGCXur4voqkiJkdgZIQtWNFhynzDWERcTYv/sETMu3uJWA==",
"version": "2.4.1",
"resolved": "https://registry.npmjs.org/@cacheable/utils/-/utils-2.4.1.tgz",
"integrity": "sha512-eiFgzCbIneyMlLOmNG4g9xzF7Hv3Mga4LjxjcSC/ues6VYq2+gUbQI8JqNuw/ZM8tJIeIaBGpswAsqV2V7ApgA==",
"license": "MIT",
"dependencies": {
"hashery": "^1.3.0",
"hashery": "^1.5.1",
"keyv": "^5.6.0"
}
},
"node_modules/@emnapi/runtime": {
"version": "1.8.1",
"resolved": "https://registry.npmjs.org/@emnapi/runtime/-/runtime-1.8.1.tgz",
"integrity": "sha512-mehfKSMWjjNol8659Z8KxEMrdSJDDot5SXMq00dM8BN4o+CLNXQ0xH2V7EchNHV4RmbZLmmPdEaXZc5H2FXmDg==",
"version": "1.10.0",
"resolved": "https://registry.npmjs.org/@emnapi/runtime/-/runtime-1.10.0.tgz",
"integrity": "sha512-ewvYlk86xUoGI0zQRNq/mC+16R1QeDlKQy21Ki3oSYXNgLb45GV1P6A0M+/s6nyCuNDqe5VpaY84BzXGwVbwFA==",
"license": "MIT",
"optional": true,
"peer": true,
@@ -87,9 +87,9 @@
"license": "BSD-3-Clause"
},
"node_modules/@img/colour": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/@img/colour/-/colour-1.0.0.tgz",
"integrity": "sha512-A5P/LfWGFSl6nsckYtjw9da+19jB8hkJ6ACTGcDfEJ0aE+l2n2El7dsVM7UVHZQ9s2lmYMWlrS21YLy2IR1LUw==",
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/@img/colour/-/colour-1.1.0.tgz",
"integrity": "sha512-Td76q7j57o/tLVdgS746cYARfSyxk8iEfRxewL9h4OMzYhbW4TAcppl0mT4eyqXddh6L/jwoM75mo7ixa/pCeQ==",
"license": "MIT",
"peer": true,
"engines": {
@@ -617,9 +617,9 @@
"license": "BSD-3-Clause"
},
"node_modules/@protobufjs/codegen": {
"version": "2.0.4",
"resolved": "https://registry.npmjs.org/@protobufjs/codegen/-/codegen-2.0.4.tgz",
"integrity": "sha512-YyFaikqM5sH0ziFZCN3xDC7zeGaB/d0IUb9CATugHWbd1FRFwWwt4ld4OYMPWu5a3Xe01mGAULCdqhMlPl29Jg==",
"version": "2.0.5",
"resolved": "https://registry.npmjs.org/@protobufjs/codegen/-/codegen-2.0.5.tgz",
"integrity": "sha512-zgXFLzW3Ap33e6d0Wlj4MGIm6Ce8O89n/apUaGNB/jx+hw+ruWEp7EwGUshdLKVRCxZW12fp9r40E1mQrf/34g==",
"license": "BSD-3-Clause"
},
"node_modules/@protobufjs/eventemitter": {
@@ -645,9 +645,9 @@
"license": "BSD-3-Clause"
},
"node_modules/@protobufjs/inquire": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/@protobufjs/inquire/-/inquire-1.1.0.tgz",
"integrity": "sha512-kdSefcPdruJiFMVSbn801t4vFK7KB/5gd2fYvrxhuJYg8ILrmn9SKSX2tZdV6V+ksulWqS7aXjBcRXl3wHoD9Q==",
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/@protobufjs/inquire/-/inquire-1.1.1.tgz",
"integrity": "sha512-mnzgDV26ueAvk7rsbt9L7bE0SuAoqyuys/sMMrmVcN5x9VsxpcG3rqAUSgDyLp0UZlmNfIbQ4fHfCtreVBk8Ew==",
"license": "BSD-3-Clause"
},
"node_modules/@protobufjs/path": {
@@ -663,9 +663,9 @@
"license": "BSD-3-Clause"
},
"node_modules/@protobufjs/utf8": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/@protobufjs/utf8/-/utf8-1.1.0.tgz",
"integrity": "sha512-Vvn3zZrhQZkkBE8LSuW3em98c0FwgO4nxzv6OdSxPKJIEKY2bGbHn+mhGIPerzI4twdxaP8/0+06HBpwf345Lw==",
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/@protobufjs/utf8/-/utf8-1.1.1.tgz",
"integrity": "sha512-oOAWABowe8EAbMyWKM0tYDKi8Yaox52D+HWZhAIJqQXbqe0xI/GV7FhLWqlEKreMkfDjshR5FKgi3mnle0h6Eg==",
"license": "BSD-3-Clause"
},
"node_modules/@tokenizer/inflate": {
@@ -714,25 +714,20 @@
"integrity": "sha512-OvjF+z51L3ov0OyAU0duzsYuvO01PH7x4t6DJx+guahgTnBHkhJdG7soQeTSFLWN3efnHyibZ4Z8l2EuWwJN3A==",
"license": "MIT"
},
"node_modules/@types/long": {
"version": "4.0.2",
"resolved": "https://registry.npmjs.org/@types/long/-/long-4.0.2.tgz",
"integrity": "sha512-MqTGEo5bj5t157U6fA/BiDynNkn0YknVdh48CMPkTSpFTVmvao5UQmm7uEF6xBEo7qIMAlY/JSleYaE6VOdpaA==",
"license": "MIT"
},
"node_modules/@types/node": {
"version": "25.3.1",
"resolved": "https://registry.npmjs.org/@types/node/-/node-25.3.1.tgz",
"integrity": "sha512-hj9YIJimBCipHVfHKRMnvmHg+wfhKc0o4mTtXh9pKBjC8TLJzz0nzGmLi5UJsYAUgSvXFHgb0V2oY10DUFtImw==",
"version": "25.6.0",
"resolved": "https://registry.npmjs.org/@types/node/-/node-25.6.0.tgz",
"integrity": "sha512-+qIYRKdNYJwY3vRCZMdJbPLJAtGjQBudzZzdzwQYkEPQd+PJGixUL5QfvCLDaULoLv+RhT3LDkwEfKaAkgSmNQ==",
"license": "MIT",
"dependencies": {
"undici-types": "~7.18.0"
"undici-types": "~7.19.0"
}
},
"node_modules/@whiskeysockets/baileys": {
"name": "baileys",
"version": "7.0.0-rc.9",
"resolved": "git+ssh://git@github.com/WhiskeySockets/Baileys.git#01047debd81beb20da7b7779b08edcb06aa03770",
"integrity": "sha512-letWyB96JHD6NdqpAiseOfaUBi13u8AhiRcKSRqcVjc5Vw5xoPTZGvVnw8K/NvGBFAvyLJkwim9Mjvwzhx/SlA==",
"hasInstallScript": true,
"license": "MIT",
"dependencies": {
@@ -807,9 +802,9 @@
}
},
"node_modules/body-parser": {
"version": "1.20.4",
"resolved": "https://registry.npmjs.org/body-parser/-/body-parser-1.20.4.tgz",
"integrity": "sha512-ZTgYYLMOXY9qKU/57FAo8F+HA2dGX7bqGc71txDRC1rS4frdFI5R7NhluHxH6M0YItAP0sHB4uqAOcYKxO6uGA==",
"version": "1.20.5",
"resolved": "https://registry.npmjs.org/body-parser/-/body-parser-1.20.5.tgz",
"integrity": "sha512-3grm+/2tUOvu2cjJkvsIxrv/wVpfXQW4PsQHYm7yk4vfpu7Ekl6nEsYBoJUL6qDwZUx8wUhQ8tR2qz+ad9c9OA==",
"license": "MIT",
"dependencies": {
"bytes": "~3.1.2",
@@ -820,7 +815,7 @@
"http-errors": "~2.0.1",
"iconv-lite": "~0.4.24",
"on-finished": "~2.4.1",
"qs": "~6.14.0",
"qs": "~6.15.1",
"raw-body": "~2.5.3",
"type-is": "~1.6.18",
"unpipe": "~1.0.0"
@@ -830,6 +825,21 @@
"npm": "1.2.8000 || >= 1.4.16"
}
},
"node_modules/body-parser/node_modules/qs": {
"version": "6.15.1",
"resolved": "https://registry.npmjs.org/qs/-/qs-6.15.1.tgz",
"integrity": "sha512-6YHEFRL9mfgcAvql/XhwTvf5jKcOiiupt2FiJxHkiX1z4j7WL8J/jRHYLluORvc1XxB5rV20KoeK00gVJamspg==",
"license": "BSD-3-Clause",
"dependencies": {
"side-channel": "^1.1.0"
},
"engines": {
"node": ">=0.6"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/bytes": {
"version": "3.1.2",
"resolved": "https://registry.npmjs.org/bytes/-/bytes-3.1.2.tgz",
@@ -840,16 +850,16 @@
}
},
"node_modules/cacheable": {
"version": "2.3.2",
"resolved": "https://registry.npmjs.org/cacheable/-/cacheable-2.3.2.tgz",
"integrity": "sha512-w+ZuRNmex9c1TR9RcsxbfTKCjSL0rh1WA5SABbrWprIHeNBdmyQLSYonlDy9gpD+63XT8DgZ/wNh1Smvc9WnJA==",
"version": "2.3.4",
"resolved": "https://registry.npmjs.org/cacheable/-/cacheable-2.3.4.tgz",
"integrity": "sha512-djgxybDbw9fL/ZWMI3+CE8ZilNxcwFkVtDc1gJ+IlOSSWkSMPQabhV/XCHTQ6pwwN6aivXPZ43omTooZiX06Ew==",
"license": "MIT",
"dependencies": {
"@cacheable/memory": "^2.0.7",
"@cacheable/utils": "^2.3.3",
"@cacheable/memory": "^2.0.8",
"@cacheable/utils": "^2.4.0",
"hookified": "^1.15.0",
"keyv": "^5.5.5",
"qified": "^0.6.0"
"keyv": "^5.6.0",
"qified": "^0.9.0"
}
},
"node_modules/call-bind-apply-helpers": {
@@ -1212,21 +1222,21 @@
}
},
"node_modules/hashery": {
"version": "1.5.0",
"resolved": "https://registry.npmjs.org/hashery/-/hashery-1.5.0.tgz",
"integrity": "sha512-nhQ6ExaOIqti2FDWoEMWARUqIKyjr2VcZzXShrI+A3zpeiuPWzx6iPftt44LhP74E5sW36B75N6VHbvRtpvO6Q==",
"version": "1.5.1",
"resolved": "https://registry.npmjs.org/hashery/-/hashery-1.5.1.tgz",
"integrity": "sha512-iZyKG96/JwPz1N55vj2Ie2vXbhu440zfUfJvSwEqEbeLluk7NnapfGqa7LH0mOsnDxTF85Mx8/dyR6HfqcbmbQ==",
"license": "MIT",
"dependencies": {
"hookified": "^1.14.0"
"hookified": "^1.15.0"
},
"engines": {
"node": ">=20"
}
},
"node_modules/hasown": {
"version": "2.0.2",
"resolved": "https://registry.npmjs.org/hasown/-/hasown-2.0.2.tgz",
"integrity": "sha512-0hJU9SCPvmMzIBdZFqNPXWa6dqh7WdH0cII9y+CyS8rG3nL48Bclra9HmKhVVUHyPWNH5Y7xDwAB7bfgSjkUMQ==",
"version": "2.0.3",
"resolved": "https://registry.npmjs.org/hasown/-/hasown-2.0.3.tgz",
"integrity": "sha512-ej4AhfhfL2Q2zpMmLo7U1Uv9+PyhIZpgQLGT1F9miIGmiCJIoCgSmczFdrc97mWT4kVY72KA+WnnhJ5pghSvSg==",
"license": "MIT",
"dependencies": {
"function-bind": "^1.1.2"
@@ -1327,44 +1337,6 @@
"protobufjs": "6.8.8"
}
},
"node_modules/libsignal/node_modules/@types/node": {
"version": "10.17.60",
"resolved": "https://registry.npmjs.org/@types/node/-/node-10.17.60.tgz",
"integrity": "sha512-F0KIgDJfy2nA3zMLmWGKxcH2ZVEtCZXHHdOQs2gSaQ27+lNeEfGxzkIw90aXswATX7AZ33tahPbzy6KAfUreVw==",
"license": "MIT"
},
"node_modules/libsignal/node_modules/long": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/long/-/long-4.0.0.tgz",
"integrity": "sha512-XsP+KhQif4bjX1kbuSiySJFNAehNxgLb6hPRGJ9QsUr8ajHkuXGdrHmFUTUUXhDwVX2R5bY4JNZEwbUiMhV+MA==",
"license": "Apache-2.0"
},
"node_modules/libsignal/node_modules/protobufjs": {
"version": "6.8.8",
"resolved": "https://registry.npmjs.org/protobufjs/-/protobufjs-6.8.8.tgz",
"integrity": "sha512-AAmHtD5pXgZfi7GMpllpO3q1Xw1OYldr+dMUlAnffGTAhqkg72WdmSY71uKBF/JuyiKs8psYbtKrhi0ASCD8qw==",
"hasInstallScript": true,
"license": "BSD-3-Clause",
"dependencies": {
"@protobufjs/aspromise": "^1.1.2",
"@protobufjs/base64": "^1.1.2",
"@protobufjs/codegen": "^2.0.4",
"@protobufjs/eventemitter": "^1.1.0",
"@protobufjs/fetch": "^1.1.0",
"@protobufjs/float": "^1.0.2",
"@protobufjs/inquire": "^1.1.0",
"@protobufjs/path": "^1.1.2",
"@protobufjs/pool": "^1.1.0",
"@protobufjs/utf8": "^1.1.0",
"@types/long": "^4.0.0",
"@types/node": "^10.1.0",
"long": "^4.0.0"
},
"bin": {
"pbjs": "bin/pbjs",
"pbts": "bin/pbts"
}
},
"node_modules/long": {
"version": "5.3.2",
"resolved": "https://registry.npmjs.org/long/-/long-5.3.2.tgz",
@@ -1372,9 +1344,9 @@
"license": "Apache-2.0"
},
"node_modules/lru-cache": {
"version": "11.2.6",
"resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-11.2.6.tgz",
"integrity": "sha512-ESL2CrkS/2wTPfuend7Zhkzo2u0daGJ/A2VucJOgQ/C48S/zB8MMeMHSGKYpXhIjbPxfuezITkaBH1wqv00DDQ==",
"version": "11.3.5",
"resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-11.3.5.tgz",
"integrity": "sha512-NxVFwLAnrd9i7KUBxC4DrUhmgjzOs+1Qm50D3oF1/oL+r1NpZ4gA7xvG0/zJ8evR7zIKn4vLf7qTNduWFtCrRw==",
"license": "BlueOak-1.0.0",
"engines": {
"node": "20 || >=22"
@@ -1552,12 +1524,12 @@
}
},
"node_modules/p-queue": {
"version": "9.1.0",
"resolved": "https://registry.npmjs.org/p-queue/-/p-queue-9.1.0.tgz",
"integrity": "sha512-O/ZPaXuQV29uSLbxWBGGZO1mCQXV2BLIwUr59JUU9SoH76mnYvtms7aafH/isNSNGwuEfP6W/4xD0/TJXxrizw==",
"version": "9.2.0",
"resolved": "https://registry.npmjs.org/p-queue/-/p-queue-9.2.0.tgz",
"integrity": "sha512-dWgLE8AH0HjQ9fe74pUkKkvzzYT18Inp4zra3lKHnnwqGvcfcUBrvF2EAVX+envufDNBOzpPq/IBUONDbI7+3g==",
"license": "MIT",
"dependencies": {
"eventemitter3": "^5.0.1",
"eventemitter3": "^5.0.4",
"p-timeout": "^7.0.0"
},
"engines": {
@@ -1648,22 +1620,22 @@
"license": "MIT"
},
"node_modules/protobufjs": {
"version": "7.5.4",
"resolved": "https://registry.npmjs.org/protobufjs/-/protobufjs-7.5.4.tgz",
"integrity": "sha512-CvexbZtbov6jW2eXAvLukXjXUW1TzFaivC46BpWc/3BpcCysb5Vffu+B3XHMm8lVEuy2Mm4XGex8hBSg1yapPg==",
"version": "7.5.6",
"resolved": "https://registry.npmjs.org/protobufjs/-/protobufjs-7.5.6.tgz",
"integrity": "sha512-M71sTMB146U3u0di3yup8iM+zv8yPRNQVr1KK4tyBitl3qFvEGucq/rGDRShD2rsJhtN02RJaJ7j5X5hmy8SJg==",
"hasInstallScript": true,
"license": "BSD-3-Clause",
"dependencies": {
"@protobufjs/aspromise": "^1.1.2",
"@protobufjs/base64": "^1.1.2",
"@protobufjs/codegen": "^2.0.4",
"@protobufjs/codegen": "^2.0.5",
"@protobufjs/eventemitter": "^1.1.0",
"@protobufjs/fetch": "^1.1.0",
"@protobufjs/float": "^1.0.2",
"@protobufjs/inquire": "^1.1.0",
"@protobufjs/inquire": "^1.1.1",
"@protobufjs/path": "^1.1.2",
"@protobufjs/pool": "^1.1.0",
"@protobufjs/utf8": "^1.1.0",
"@protobufjs/utf8": "^1.1.1",
"@types/node": ">=13.7.0",
"long": "^5.0.0"
},
@@ -1685,17 +1657,23 @@
}
},
"node_modules/qified": {
"version": "0.6.0",
"resolved": "https://registry.npmjs.org/qified/-/qified-0.6.0.tgz",
"integrity": "sha512-tsSGN1x3h569ZSU1u6diwhltLyfUWDp3YbFHedapTmpBl0B3P6U3+Qptg7xu+v+1io1EwhdPyyRHYbEw0KN2FA==",
"version": "0.9.1",
"resolved": "https://registry.npmjs.org/qified/-/qified-0.9.1.tgz",
"integrity": "sha512-n7mar4T0xQ+39dE2vGTAlbxUEpndwPANH0kDef1/MYsB8Bba9wshkybIRx74qgcvKQPEWErf9AqAdYjhzY2Ilg==",
"license": "MIT",
"dependencies": {
"hookified": "^1.14.0"
"hookified": "^2.1.1"
},
"engines": {
"node": ">=20"
}
},
"node_modules/qified/node_modules/hookified": {
"version": "2.2.0",
"resolved": "https://registry.npmjs.org/hookified/-/hookified-2.2.0.tgz",
"integrity": "sha512-p/LgFzRN5FeoD3DLS6bkUapeye6E4SI6yJs6KetENd18S+FBthqYq2amJUWpt5z0EQwwHemidjY5OqJGEKm5uA==",
"license": "MIT"
},
"node_modules/qrcode-terminal": {
"version": "0.12.0",
"resolved": "https://registry.npmjs.org/qrcode-terminal/-/qrcode-terminal-0.12.0.tgz",
@@ -1922,13 +1900,13 @@
}
},
"node_modules/side-channel-list": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/side-channel-list/-/side-channel-list-1.0.0.tgz",
"integrity": "sha512-FCLHtRD/gnpCiCHEiJLOwdmFP+wzCmDEkc9y7NsYxeF4u7Btsn1ZuwgwJGxImImHicJArLP4R0yX4c2KCrMrTA==",
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/side-channel-list/-/side-channel-list-1.0.1.tgz",
"integrity": "sha512-mjn/0bi/oUURjc5Xl7IaWi/OJJJumuoJFQJfDDyO46+hBWsfaVM65TBHq2eoZBhzl9EchxOijpkbRC8SVBQU0w==",
"license": "MIT",
"dependencies": {
"es-errors": "^1.3.0",
"object-inspect": "^1.13.3"
"object-inspect": "^1.13.4"
},
"engines": {
"node": ">= 0.4"
@@ -2094,9 +2072,9 @@
}
},
"node_modules/undici-types": {
"version": "7.18.2",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-7.18.2.tgz",
"integrity": "sha512-AsuCzffGHJybSaRrmr5eHr81mwJU3kjw6M+uprWvCXiNeN9SOGwQ3Jn8jb8m3Z6izVgknn1R0FTCEAP2QrLY/w==",
"version": "7.19.2",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-7.19.2.tgz",
"integrity": "sha512-qYVnV5OEm2AW8cJMCpdV20CDyaN3g0AjDlOGf1OW4iaDEx8MwdtChUp4zu4H0VP3nDRF/8RKWH+IPp9uW0YGZg==",
"license": "MIT"
},
"node_modules/unpipe": {
@@ -2139,9 +2117,9 @@
"license": "MIT"
},
"node_modules/ws": {
"version": "8.19.0",
"resolved": "https://registry.npmjs.org/ws/-/ws-8.19.0.tgz",
"integrity": "sha512-blAT2mjOEIi0ZzruJfIhb3nps74PRWTCz1IjglWEEpQl5XS/UNama6u2/rjFkDDouqr4L67ry+1aGIALViWjDg==",
"version": "8.20.0",
"resolved": "https://registry.npmjs.org/ws/-/ws-8.20.0.tgz",
"integrity": "sha512-sAt8BhgNbzCtgGbt2OxmpuryO63ZoDk/sqaB/znQm94T4fCEsy/yV+7CdC1kJhOU9lboAEU7R3kquuycDoibVA==",
"license": "MIT",
"engines": {
"node": ">=10.0.0"
+3
@@ -12,5 +12,8 @@
"express": "^4.21.0",
"qrcode-terminal": "^0.12.0",
"pino": "^9.0.0"
},
"overrides": {
"protobufjs": "^7.5.5"
}
}
+50
@@ -1893,3 +1893,53 @@ class TestOpenRouterExplicitApiKey:
assert call_kwargs["api_key"] == "env-fallback-key", (
f"Expected env fallback key to be used when explicit_api_key is None, got: {call_kwargs['api_key']}"
)
class TestAnthropicExplicitApiKey:
"""Test that explicit_api_key is correctly propagated to _try_anthropic().
Parity with the OpenRouter fix in #18768: resolve_provider_client() passes
explicit_api_key to _try_openrouter(), but the anthropic branch was not
updated: _try_anthropic() always fell back to resolve_anthropic_token()
even when an explicit key was supplied (e.g. from a fallback_model entry).
"""
def test_try_anthropic_uses_explicit_api_key_over_env(self):
"""_try_anthropic(explicit_api_key) must use the supplied key, not the env fallback."""
with patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="env-fallback-key"), \
patch("agent.anthropic_adapter.build_anthropic_client") as mock_build, \
patch("agent.auxiliary_client._select_pool_entry", return_value=(False, None)):
mock_build.return_value = MagicMock()
from agent.auxiliary_client import _try_anthropic
client, model = _try_anthropic("explicit-pool-key")
assert client is not None
assert mock_build.call_args.args[0] == "explicit-pool-key", (
f"Expected explicit_api_key to be passed, got: {mock_build.call_args.args[0]}"
)
assert mock_build.call_args.args[0] != "env-fallback-key"
def test_try_anthropic_without_explicit_key_falls_back_to_resolve(self):
"""Without explicit_api_key, _try_anthropic falls back to resolve_anthropic_token."""
with patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="env-fallback-key"), \
patch("agent.anthropic_adapter.build_anthropic_client") as mock_build, \
patch("agent.auxiliary_client._select_pool_entry", return_value=(False, None)):
mock_build.return_value = MagicMock()
from agent.auxiliary_client import _try_anthropic
client, model = _try_anthropic()
assert client is not None
assert mock_build.call_args.args[0] == "env-fallback-key"
def test_resolve_provider_client_passes_explicit_api_key_to_anthropic(self):
"""resolve_provider_client(provider='anthropic', explicit_api_key=...) must propagate the key."""
with patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="env-key"), \
patch("agent.anthropic_adapter.build_anthropic_client") as mock_build, \
patch("agent.auxiliary_client._select_pool_entry", return_value=(False, None)):
mock_build.return_value = MagicMock()
client, model = resolve_provider_client(
provider="anthropic",
explicit_api_key="explicit-fallback-key",
)
assert client is not None
assert mock_build.call_args.args[0] == "explicit-fallback-key", (
"resolve_provider_client must forward explicit_api_key to _try_anthropic()"
)
+80
@@ -645,6 +645,86 @@ def test_review_model_honors_auxiliary_curator_slot(curator_env):
)
def test_review_runtime_passes_auxiliary_curator_credentials(curator_env):
"""Per-slot api_key/base_url must ride into resolve_runtime_provider (not main-only creds)."""
curator = curator_env["curator"]
cfg = {
"model": {"provider": "openrouter", "default": "openai/gpt-5.5"},
"auxiliary": {
"curator": {
"provider": "custom",
"model": "local-mini",
"api_key": "sk-curator-only",
"base_url": "http://localhost:11434/v1",
},
},
}
binding = curator._resolve_review_runtime(cfg)
assert binding.provider == "custom"
assert binding.model == "local-mini"
assert binding.explicit_api_key == "sk-curator-only"
assert binding.explicit_base_url == "http://localhost:11434/v1"
def test_review_runtime_strips_blank_aux_credentials(curator_env):
curator = curator_env["curator"]
cfg = {
"model": {"provider": "openrouter", "default": "openai/gpt-5.5"},
"auxiliary": {
"curator": {
"provider": "openrouter",
"model": "x/y",
"api_key": " ",
"base_url": "",
},
},
}
binding = curator._resolve_review_runtime(cfg)
assert binding.explicit_api_key is None
assert binding.explicit_base_url is None
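The blank-credential tests above expect whitespace-only or empty `api_key` / `base_url` values to be treated as absent. A minimal sketch of that normalization follows; `clean_credential` is a hypothetical helper name used only for illustration and is not taken from the curator module.

```python
from typing import Any, Optional


def clean_credential(value: Any) -> Optional[str]:
    """Return a stripped credential string, or None when it is blank.

    Hypothetical helper: the real curator code may normalize differently.
    """
    if not isinstance(value, str):
        # Non-string values (None, numbers, ...) are treated as "not set".
        return None
    stripped = value.strip()
    # An empty string after stripping means nothing usable was configured.
    return stripped or None
```

Returning `None` rather than an empty string lets a caller distinguish "no explicit credential" from "explicit credential" with a simple `is None` check, which is what the assertions above rely on.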
def test_review_runtime_ignores_auxiliary_credentials_when_using_main(curator_env):
"""Falling through to main model must not pick up stray auxiliary.curator secrets."""
curator = curator_env["curator"]
cfg = {
"model": {"provider": "openrouter", "default": "openai/gpt-5.5"},
"auxiliary": {
"curator": {
"provider": "auto",
"model": "",
"api_key": "must-not-leak",
"base_url": "http://curator-slot-ignored/",
},
},
}
binding = curator._resolve_review_runtime(cfg)
assert (binding.provider, binding.model) == ("openrouter", "openai/gpt-5.5")
assert binding.explicit_api_key is None
assert binding.explicit_base_url is None
def test_review_runtime_legacy_auxiliary_carry_credentials(curator_env, caplog):
curator = curator_env["curator"]
cfg = {
"model": {"provider": "openrouter", "default": "openai/gpt-5.5"},
"curator": {
"auxiliary": {
"provider": "custom",
"model": "m",
"api_key": "legacy-key",
"base_url": "http://legacy/v1",
},
},
}
import logging
with caplog.at_level(logging.INFO, logger="agent.curator"):
binding = curator._resolve_review_runtime(cfg)
assert binding.explicit_api_key == "legacy-key"
assert binding.explicit_base_url == "http://legacy/v1"
assert any("deprecated curator.auxiliary" in rec.message for rec in caplog.records)
def test_review_model_auxiliary_curator_partial_override_falls_back(curator_env):
"""Only one of slot provider/model set → fall back to the main pair.
@@ -220,6 +220,81 @@ def test_classify_handles_malformed_arguments_string(curator_env):
assert len(result["pruned"]) == 1
def test_classify_no_false_positive_short_name_in_file_path(curator_env):
"""Short skill name that is a substring of another filename = pruned, not consolidated."""
# e.g. "api" should NOT match "references/api-design.md"
result = curator_env._classify_removed_skills(
removed=["api"],
added=[],
after_names={"conventions"},
tool_calls=[
{
"name": "skill_manage",
"arguments": json.dumps({
"action": "write_file",
"name": "conventions",
"file_path": "references/api-design.md",
"file_content": "# API Design\n...",
}),
},
],
)
assert result["consolidated"] == [], (
"Short name 'api' should NOT match file_path 'references/api-design.md'"
)
assert len(result["pruned"]) == 1
assert result["pruned"][0]["name"] == "api"
def test_classify_no_false_positive_short_name_in_content(curator_env):
"""Short skill name embedded in longer word in content = pruned, not consolidated."""
# e.g. "test" should NOT match content "running latest tests"
result = curator_env._classify_removed_skills(
removed=["test"],
added=[],
after_names={"umbrella"},
tool_calls=[
{
"name": "skill_manage",
"arguments": json.dumps({
"action": "patch",
"name": "umbrella",
"old_string": "old",
"new_string": "running latest tests with pytest",
}),
},
],
)
assert result["consolidated"] == [], (
"Short name 'test' should NOT match 'latest' via word boundary"
)
assert len(result["pruned"]) == 1
def test_classify_still_matches_exact_word_in_content(curator_env):
"""Word-boundary match still works for exact word occurrences."""
# "api" SHOULD match content "use the api gateway"
result = curator_env._classify_removed_skills(
removed=["api"],
added=[],
after_names={"gateway"},
tool_calls=[
{
"name": "skill_manage",
"arguments": json.dumps({
"action": "edit",
"name": "gateway",
"content": "# Gateway\n\nUse the api gateway for all requests.\n",
}),
},
],
)
assert len(result["consolidated"]) == 1, (
"'api' should match as a standalone word in content"
)
assert result["consolidated"][0]["into"] == "gateway"
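The three classification tests above pin down when a removed skill name counts as "mentioned" elsewhere. One way to get that behavior is sketched below. It is an assumption, not the curator's actual implementation: `name_in_content` and `name_in_file_path` are hypothetical names, and the path rule (whole segment or file stem only) is a guess that happens to satisfy the `references/api-design.md` case.

```python
import re
from pathlib import PurePosixPath


def name_in_content(name: str, text: str) -> bool:
    """True when `name` appears as a standalone token in free text.

    Hyphens are treated as part of a token, so "api" does not match
    "api-design". This is an illustrative rule, not the project's.
    """
    pattern = rf"(?<![\w-]){re.escape(name)}(?![\w-])"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def name_in_file_path(name: str, file_path: str) -> bool:
    """True only when `name` is a whole path segment or the file stem."""
    path = PurePosixPath(file_path)
    return name in path.parts or path.stem == name
```

A plain `\b` boundary would not be enough for file paths, because `/` and `-` are non-word characters and `\bapi\b` matches inside `references/api-design.md`. That is why the sketch compares whole path components instead.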
def test_report_md_splits_consolidated_and_pruned_sections(curator_env):
"""End-to-end: REPORT.md shows both sections distinctly."""
curator = curator_env
@@ -126,6 +126,20 @@ class TestCodexBuildKwargs:
)
assert kw.get("extra_headers", {}).get("x-grok-conv-id") == "conv-123"
def test_xai_headers_preserve_request_override_headers(self, transport):
messages = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(
model="grok-3", messages=messages, tools=[],
session_id="conv-123",
is_xai_responses=True,
request_overrides={"extra_headers": {"X-Test": "1", "X-Trace": "abc"}},
)
assert kw.get("extra_headers") == {
"X-Test": "1",
"X-Trace": "abc",
"x-grok-conv-id": "conv-123",
}
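The test above expects caller-supplied `extra_headers` to survive alongside the `x-grok-conv-id` header. A minimal sketch of a merge with that property is shown here; `merge_extra_headers` is a hypothetical function for illustration, and the real `build_kwargs` may assemble headers differently.

```python
from typing import Optional


def merge_extra_headers(
    request_overrides: Optional[dict],
    session_id: Optional[str],
) -> dict:
    """Combine override headers with the conversation-id header.

    Hypothetical sketch: copies the caller's headers first so they are
    preserved, then adds the conversation id on top.
    """
    overrides = request_overrides or {}
    # Copy so the caller's dict is never mutated.
    headers = dict(overrides.get("extra_headers") or {})
    if session_id:
        headers["x-grok-conv-id"] = session_id
    return headers
```

Copying before adding the conversation id avoids the failure mode the test guards against, where assigning a fresh `{"x-grok-conv-id": ...}` dict would silently drop the override headers.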
def test_minimal_effort_clamped(self, transport):
messages = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(
+80 -86
View File
@@ -1,107 +1,101 @@
"""Tests that load_cli_config() guards against lazy-import TERMINAL_CWD clobbering.
"""Tests for CLI/TUI CWD resolution in load_cli_config().
When the gateway resolves TERMINAL_CWD at startup and cli.py is later
imported lazily (via delegate_tool CLI_CONFIG), load_cli_config() must
not overwrite the already-resolved value with os.getcwd().
config.yaml terminal.cwd is the canonical source of truth.
.env TERMINAL_CWD and MESSAGING_CWD are deprecated.
See issue #10817.
Rules:
- Local backend CLI/TUI: always os.getcwd(), ignoring config and inherited env.
- Non-local with placeholder: pop cwd for backend default.
- Non-local with explicit path: keep as-is.
"""
import os
import pytest
# The sentinel values that mean "resolve at runtime"
_CWD_PLACEHOLDERS = (".", "auto", "cwd")
def _resolve_terminal_cwd(terminal_config: dict, defaults: dict, env: dict):
"""Simulate the CWD resolution logic from load_cli_config().
def _resolve_cwd(terminal_config: dict, defaults: dict, env: dict):
"""Mirror the CWD resolution logic from cli.py load_cli_config()."""
effective_backend = terminal_config.get("env_type", "local")
This mirrors the code in cli.py that checks for a pre-resolved
TERMINAL_CWD before falling back to os.getcwd().
"""
if terminal_config.get("cwd") in _CWD_PLACEHOLDERS:
_existing_cwd = env.get("TERMINAL_CWD", "")
if _existing_cwd and _existing_cwd not in _CWD_PLACEHOLDERS and os.path.isabs(_existing_cwd):
terminal_config["cwd"] = _existing_cwd
defaults["terminal"]["cwd"] = _existing_cwd
else:
effective_backend = terminal_config.get("env_type", "local")
if effective_backend == "local":
terminal_config["cwd"] = "/fake/getcwd" # stand-in for os.getcwd()
defaults["terminal"]["cwd"] = terminal_config["cwd"]
else:
terminal_config.pop("cwd", None)
if effective_backend == "local":
terminal_config["cwd"] = "/fake/getcwd"
defaults["terminal"]["cwd"] = terminal_config["cwd"]
elif terminal_config.get("cwd") in _CWD_PLACEHOLDERS:
terminal_config.pop("cwd", None)
# Simulate the bridging loop: write terminal_config["cwd"] to env
_file_has_terminal = defaults.get("_file_has_terminal", False)
# Bridge: TERMINAL_CWD always exported in CLI, skipped in gateway
_is_gateway = env.get("_HERMES_GATEWAY") == "1"
if "cwd" in terminal_config:
if _file_has_terminal or "TERMINAL_CWD" not in env:
if _is_gateway:
pass # don't touch env
else:
env["TERMINAL_CWD"] = str(terminal_config["cwd"])
return env.get("TERMINAL_CWD", "")
class TestLazyImportGuard:
"""TERMINAL_CWD resolved by gateway must survive a lazy cli.py import."""
class TestLocalBackendCli:
"""Local backend always uses os.getcwd()."""
def test_gateway_resolved_cwd_survives(self):
"""Gateway set TERMINAL_CWD → lazy cli import must not clobber."""
env = {"TERMINAL_CWD": "/home/user/workspace"}
terminal_config = {"cwd": ".", "env_type": "local"}
defaults = {"terminal": {"cwd": "."}, "_file_has_terminal": False}
result = _resolve_terminal_cwd(terminal_config, defaults, env)
assert result == "/home/user/workspace"
def test_gateway_resolved_cwd_survives_with_file_terminal(self):
"""Even when config.yaml has a terminal: section, resolved CWD survives."""
env = {"TERMINAL_CWD": "/home/user/workspace"}
terminal_config = {"cwd": ".", "env_type": "local"}
defaults = {"terminal": {"cwd": "."}, "_file_has_terminal": True}
result = _resolve_terminal_cwd(terminal_config, defaults, env)
assert result == "/home/user/workspace"
class TestConfigCwdResolution:
"""config.yaml terminal.cwd is the canonical source of truth."""
def test_explicit_config_cwd_wins(self):
"""terminal.cwd: /explicit/path always wins."""
env = {"TERMINAL_CWD": "/old/gateway/value"}
terminal_config = {"cwd": "/explicit/path"}
defaults = {"terminal": {"cwd": "/explicit/path"}, "_file_has_terminal": True}
result = _resolve_terminal_cwd(terminal_config, defaults, env)
assert result == "/explicit/path"
def test_dot_cwd_resolves_to_getcwd_when_no_prior(self):
"""With no pre-set TERMINAL_CWD, "." resolves to os.getcwd()."""
def test_explicit_config_ignored(self):
env = {}
terminal_config = {"cwd": "."}
defaults = {"terminal": {"cwd": "."}, "_file_has_terminal": False}
tc = {"cwd": "/explicit/path", "env_type": "local"}
d = {"terminal": {"cwd": "/explicit/path"}}
assert _resolve_cwd(tc, d, env) == "/fake/getcwd"
result = _resolve_terminal_cwd(terminal_config, defaults, env)
def test_inherited_env_overwritten(self):
env = {"TERMINAL_CWD": "/parent/hermes"}
tc = {"cwd": "/home/user", "env_type": "local"}
d = {"terminal": {"cwd": "/home/user"}}
assert _resolve_cwd(tc, d, env) == "/fake/getcwd"
def test_placeholder_resolved(self):
env = {}
tc = {"cwd": "."}
d = {"terminal": {"cwd": "."}}
assert _resolve_cwd(tc, d, env) == "/fake/getcwd"
def test_env_and_no_config_file(self):
env = {"TERMINAL_CWD": "/stale/value"}
tc = {"cwd": ".", "env_type": "local"}
d = {"terminal": {"cwd": "."}}
assert _resolve_cwd(tc, d, env) == "/fake/getcwd"
class TestNonLocalBackends:
"""Non-local backends use config or per-backend defaults."""
def test_placeholder_popped(self):
env = {}
tc = {"cwd": ".", "env_type": "docker"}
d = {"terminal": {"cwd": "."}}
assert _resolve_cwd(tc, d, env) == ""
def test_explicit_path_kept(self):
env = {}
tc = {"cwd": "/srv/app", "env_type": "ssh"}
d = {"terminal": {"cwd": "/srv/app"}}
assert _resolve_cwd(tc, d, env) == "/srv/app"
def test_auto_placeholder_popped(self):
env = {}
tc = {"cwd": "auto", "env_type": "modal"}
d = {"terminal": {"cwd": "auto"}}
assert _resolve_cwd(tc, d, env) == ""
class TestGatewayLazyImport:
"""Gateway lazy import of cli.py must not clobber TERMINAL_CWD."""
def test_gateway_cwd_preserved(self):
env = {"_HERMES_GATEWAY": "1", "TERMINAL_CWD": "/home/user/project"}
tc = {"cwd": "/home/user", "env_type": "local"}
d = {"terminal": {"cwd": "/home/user"}}
result = _resolve_cwd(tc, d, env)
assert result == "/home/user/project"
def test_cli_overwrites_stale_env(self):
env = {"TERMINAL_CWD": "/stale/from/dotenv"}
tc = {"cwd": "/home/user", "env_type": "local"}
d = {"terminal": {"cwd": "/home/user"}}
result = _resolve_cwd(tc, d, env)
assert result == "/fake/getcwd"
def test_remote_backend_pops_cwd(self):
"""Remote backend + placeholder cwd → popped for backend default."""
env = {}
terminal_config = {"cwd": ".", "env_type": "docker"}
defaults = {"terminal": {"cwd": "."}, "_file_has_terminal": False}
result = _resolve_terminal_cwd(terminal_config, defaults, env)
assert result == "" # cwd popped, no env var set
def test_remote_backend_with_prior_cwd_preserves(self):
"""Remote backend + pre-resolved TERMINAL_CWD → adopted."""
env = {"TERMINAL_CWD": "/project"}
terminal_config = {"cwd": ".", "env_type": "docker"}
defaults = {"terminal": {"cwd": "."}, "_file_has_terminal": False}
result = _resolve_terminal_cwd(terminal_config, defaults, env)
assert result == "/project"
+68
@@ -647,6 +647,74 @@ class TestGetDueJobs:
assert get_due_jobs() == []
assert get_job("oneshot-stale")["next_run_at"] is None
def test_broken_cron_without_next_run_is_recovered(self, tmp_cron_dir, monkeypatch):
now = datetime(2026, 3, 18, 10, 0, 0, tzinfo=timezone.utc)
monkeypatch.setattr("cron.jobs._hermes_now", lambda: now)
save_jobs(
[{
"id": "cron-recover",
"name": "AI Daily Digest",
"prompt": "...",
"schedule": {"kind": "cron", "expr": "0 12 * * *", "display": "0 12 * * *"},
"schedule_display": "0 12 * * *",
"repeat": {"times": None, "completed": 0},
"enabled": True,
"state": "scheduled",
"paused_at": None,
"paused_reason": None,
"created_at": "2026-03-18T09:00:00+00:00",
"next_run_at": None,
"last_run_at": None,
"last_status": None,
"last_error": None,
"deliver": "local",
"origin": None,
}]
)
assert get_due_jobs() == []
recovered = get_job("cron-recover")["next_run_at"]
assert recovered is not None
recovered_dt = datetime.fromisoformat(recovered)
if recovered_dt.tzinfo is None:
recovered_dt = recovered_dt.replace(tzinfo=timezone.utc)
assert recovered_dt > now
def test_broken_interval_without_next_run_is_recovered(self, tmp_cron_dir, monkeypatch):
now = datetime(2026, 3, 18, 10, 0, 0, tzinfo=timezone.utc)
monkeypatch.setattr("cron.jobs._hermes_now", lambda: now)
save_jobs(
[{
"id": "interval-recover",
"name": "Hourly heartbeat",
"prompt": "...",
"schedule": {"kind": "interval", "minutes": 60, "display": "every 60m"},
"schedule_display": "every 1h",
"repeat": {"times": None, "completed": 0},
"enabled": True,
"state": "scheduled",
"paused_at": None,
"paused_reason": None,
"created_at": "2026-03-18T09:00:00+00:00",
"next_run_at": None,
"last_run_at": None,
"last_status": None,
"last_error": None,
"deliver": "local",
"origin": None,
}]
)
assert get_due_jobs() == []
recovered = get_job("interval-recover")["next_run_at"]
assert recovered is not None
recovered_dt = datetime.fromisoformat(recovered)
if recovered_dt.tzinfo is None:
recovered_dt = recovered_dt.replace(tzinfo=timezone.utc)
assert recovered_dt > now
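The two recovery tests above require that a recurring job with `next_run_at: None` gets a future run time instead of staying stuck. The sketch below illustrates the idea for interval schedules only, under the assumption that recovery means "one interval from now"; `recover_next_run` is a hypothetical name, and cron expressions are left out because they need a cron parser that this sketch does not assume.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def recover_next_run(job: dict, now: datetime) -> Optional[str]:
    """Return an ISO timestamp for the next run of an interval job.

    Hypothetical sketch: keeps an existing next_run_at, and otherwise
    schedules the job one full interval after `now`.
    """
    if job.get("next_run_at") is not None:
        return job["next_run_at"]
    schedule = job.get("schedule") or {}
    if schedule.get("kind") != "interval":
        # Cron and one-shot schedules are out of scope for this sketch.
        return None
    minutes = schedule.get("minutes")
    if not isinstance(minutes, (int, float)) or minutes <= 0:
        return None
    return (now + timedelta(minutes=minutes)).isoformat()
```

Storing a timezone-aware ISO string keeps the later `datetime.fromisoformat` comparison against an aware `now` valid without a fallback `replace(tzinfo=...)`.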
class TestEnabledToolsets:
def test_enabled_toolsets_stored(self, tmp_cron_dir):
+81
@@ -46,6 +46,29 @@ class TestResolveOrigin:
job = {"origin": {}}
assert _resolve_origin(job) is None
@pytest.mark.parametrize(
"non_dict_origin",
[
"combined-digest-replaces-x-and-y-20260503",
123,
["telegram", "12345"],
("platform", "chat_id"),
42.0,
],
)
def test_non_dict_origin_returns_none_instead_of_crashing(self, non_dict_origin):
"""Non-dict origins (provenance strings from hand-edited or migrated
jobs.json) must be treated as missing instead of crashing the
scheduler tick on ``origin.get('platform')`` with
``'str' object has no attribute 'get'`` (#18722).
Before this guard a job in this state crashed every fire attempt
forever; ``mark_job_run`` recorded the error but the next tick
re-loaded the poisoned origin and crashed identically.
"""
job = {"origin": non_dict_origin}
assert _resolve_origin(job) is None
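The parametrized test above describes a type guard: any `origin` that is not a dict is treated as missing rather than dereferenced. A minimal sketch of such a guard follows. `resolve_origin_sketch` is a hypothetical stand-in, and the requirement that both `platform` and `chat_id` be present is an assumption, since the real `_resolve_origin` return shape is not shown here.

```python
from typing import Optional


def resolve_origin_sketch(job: dict) -> Optional[dict]:
    """Return the job's origin dict, or None when it is unusable.

    Hypothetical sketch of the guard; not the scheduler's actual code.
    """
    origin = job.get("origin")
    if not isinstance(origin, dict):
        # Strings, numbers, lists and tuples have no .get(); treat as missing.
        return None
    if not origin.get("platform") or not origin.get("chat_id"):
        # Assumed rule: an origin without both fields cannot be delivered to.
        return None
    return origin
```

Checking the type before the first `.get()` call is what turns a crash on every scheduler tick into a quiet fallback.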
class TestResolveDeliveryTarget:
def test_origin_delivery_preserves_thread_id(self):
@@ -118,6 +141,16 @@ class TestResolveDeliveryTarget:
"thread_id": None,
}
def test_bare_platform_delivery_preserves_home_thread_id(self, monkeypatch):
monkeypatch.setenv("DISCORD_HOME_CHANNEL", "parent-42")
monkeypatch.setenv("DISCORD_HOME_CHANNEL_THREAD_ID", "topic-7")
assert _resolve_delivery_target({"deliver": "discord"}) == {
"platform": "discord",
"chat_id": "parent-42",
"thread_id": "topic-7",
}
def test_explicit_telegram_topic_target_with_thread_id(self):
"""deliver: 'telegram:chat_id:thread_id' parses correctly."""
job = {
@@ -1824,6 +1857,54 @@ class TestBuildJobPromptMissingSkill:
assert "go" in result
class TestBuildJobPromptBumpUse:
"""Verify that cron jobs bump skill usage counters so the curator sees them as active."""
def test_bump_use_called_for_loaded_skill(self):
"""bump_use is called for each successfully loaded skill."""
def _skill_view(name: str) -> str:
return json.dumps({"success": True, "content": f"Content for {name}."})
with patch("tools.skills_tool.skill_view", side_effect=_skill_view), \
patch("tools.skill_usage.bump_use") as mock_bump:
_build_job_prompt({"skills": ["alpha", "beta"], "prompt": "go"})
assert mock_bump.call_count == 2
calls = [c[0][0] for c in mock_bump.call_args_list]
assert "alpha" in calls
assert "beta" in calls
def test_bump_use_not_called_for_missing_skill(self):
"""bump_use is NOT called when a skill fails to load."""
def _missing_view(name: str) -> str:
return json.dumps({"success": False, "error": "not found"})
with patch("tools.skills_tool.skill_view", side_effect=_missing_view), \
patch("tools.skill_usage.bump_use") as mock_bump:
_build_job_prompt({"skills": ["ghost"], "prompt": "go"})
assert mock_bump.call_count == 0
def test_bump_failure_does_not_break_prompt(self, caplog):
"""If bump_use raises, the prompt still builds — error is logged at DEBUG."""
def _skill_view(name: str) -> str:
return json.dumps({"success": True, "content": "Works."})
with patch("tools.skills_tool.skill_view", side_effect=_skill_view), \
patch("tools.skill_usage.bump_use", side_effect=RuntimeError("boom")), \
caplog.at_level(logging.DEBUG, logger="cron.scheduler"):
result = _build_job_prompt({"skills": ["good-skill"], "prompt": "go"})
# Prompt should still contain the skill content and original instruction
assert "Works." in result
assert "go" in result
# The error should be logged at DEBUG level, not crash
assert any("failed to bump" in r.message for r in caplog.records)
class TestSendMediaViaAdapter:
"""Unit tests for _send_media_via_adapter — routes files to typed adapter methods."""
+23
@@ -138,6 +138,29 @@ class TestSlashCommands:
response_text = send.call_args[1].get("content") or send.call_args[0][1]
assert "compress" in response_text.lower() or "context" in response_text.lower()
@pytest.mark.asyncio
async def test_quick_command_alias_targets_builtin_command_with_args(
self, adapter, runner, platform
):
"""Alias targets with args must reach the built-in command handler."""
runner.config.quick_commands = {
"s": {"type": "alias", "target": "/status extra-arg"}
}
async def _handle_status(event):
assert event.get_command_args() == "extra-arg"
return "status via alias"
runner._handle_status_command = AsyncMock(side_effect=_handle_status)
send = await send_and_capture(adapter, "/s", platform)
send.assert_called_once()
response_text = send.call_args[1].get("content") or send.call_args[0][1]
assert response_text == "status via alias"
runner._handle_status_command.assert_awaited_once()
runner._handle_message_with_agent.assert_not_awaited()
class TestSessionLifecycle:
"""Verify session state changes across command sequences."""
+17 -1
@@ -12,6 +12,7 @@ class RestartTestAdapter(BasePlatformAdapter):
def __init__(self):
super().__init__(PlatformConfig(enabled=True, token="***"), Platform.TELEGRAM)
self.sent: list[str] = []
self.sent_calls: list[tuple[str, str, object]] = []
async def connect(self):
return True
@@ -21,6 +22,7 @@ class RestartTestAdapter(BasePlatformAdapter):
async def send(self, chat_id, content, reply_to=None, metadata=None):
self.sent.append(content)
self.sent_calls.append((chat_id, content, metadata))
return SendResult(success=True, message_id="1")
async def send_typing(self, chat_id, metadata=None):
@@ -30,12 +32,17 @@ class RestartTestAdapter(BasePlatformAdapter):
return {"id": chat_id}
def make_restart_source(chat_id: str = "123456", chat_type: str = "dm") -> SessionSource:
def make_restart_source(
chat_id: str = "123456",
chat_type: str = "dm",
thread_id: str | None = None,
) -> SessionSource:
return SessionSource(
platform=Platform.TELEGRAM,
chat_id=chat_id,
chat_type=chat_type,
user_id="u1",
thread_id=thread_id,
)
@@ -81,6 +88,15 @@ def make_restart_runner(
runner._handle_restart_command = GatewayRunner._handle_restart_command.__get__(
runner, GatewayRunner
)
runner._handle_set_home_command = GatewayRunner._handle_set_home_command.__get__(
runner, GatewayRunner
)
runner._send_restart_notification = GatewayRunner._send_restart_notification.__get__(
runner, GatewayRunner
)
runner._send_home_channel_startup_notifications = (
GatewayRunner._send_home_channel_startup_notifications.__get__(runner, GatewayRunner)
)
runner._status_action_label = GatewayRunner._status_action_label.__get__(
runner, GatewayRunner
)
+42
@@ -240,6 +240,48 @@ class TestAdapterInit:
"http://127.0.0.1:3000",
)
def test_invalid_port_from_env_falls_back_to_default(self, monkeypatch):
monkeypatch.setenv("API_SERVER_PORT", "not-a-port")
config = PlatformConfig(enabled=True)
adapter = APIServerAdapter(config)
assert adapter._port == 8642
def test_create_agent_forwards_config_reasoning_effort(self, monkeypatch):
captured = {}
class FakeAgent:
def __init__(self, **kwargs):
captured.update(kwargs)
monkeypatch.setattr("run_agent.AIAgent", FakeAgent)
monkeypatch.setattr(
"gateway.run._resolve_runtime_agent_kwargs",
lambda: {
"provider": "openai-codex",
"base_url": "https://example.test/v1",
"api_mode": "codex_responses",
},
)
monkeypatch.setattr("gateway.run._resolve_gateway_model", lambda: "gpt-5.5")
monkeypatch.setattr(
"gateway.run._load_gateway_config",
lambda: {"agent": {"reasoning_effort": "xhigh"}},
)
monkeypatch.setattr(
"gateway.run.GatewayRunner._load_reasoning_config",
staticmethod(lambda: {"enabled": True, "effort": "xhigh"}),
)
monkeypatch.setattr("gateway.run.GatewayRunner._load_fallback_model", staticmethod(lambda: None))
monkeypatch.setattr("hermes_cli.tools_config._get_platform_tools", lambda *_: set())
adapter = APIServerAdapter(PlatformConfig(enabled=True))
monkeypatch.setattr(adapter, "_ensure_session_db", lambda: None)
agent = adapter._create_agent(session_id="api-session")
assert isinstance(agent, FakeAgent)
assert captured["reasoning_config"] == {"enabled": True, "effort": "xhigh"}
# ---------------------------------------------------------------------------
# Auth checking
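The invalid-port test in the hunk above pins a fallback to 8642 when API_SERVER_PORT is not a number. A minimal sketch of that behavior follows. It assumes a hypothetical helper named `resolve_port`; the real APIServerAdapter parsing code is not shown in this diff, and the range check is an added assumption.

```python
import os

DEFAULT_PORT = 8642  # the default asserted by the test above


def resolve_port(env_var: str = "API_SERVER_PORT", default: int = DEFAULT_PORT) -> int:
    """Hypothetical helper: read a port from the environment, else use the default."""
    raw = os.environ.get(env_var)
    if raw is None:
        return default
    try:
        port = int(raw)
    except ValueError:
        # Non-numeric values such as "not-a-port" fall back to the default.
        return default
    # Assumption: out-of-range numbers are also treated as invalid.
    return port if 0 < port < 65536 else default
```

Falling back instead of raising keeps a typo in the environment from preventing adapter construction, which is what the test appears to guard.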
+12 -10
@@ -49,9 +49,10 @@ class TestSuspendRecentlyActive:
count = store.suspend_recently_active()
assert count == 1
# Re-fetch — should be suspended now
# Re-fetch — should be resume_pending (preserved, not wiped)
refreshed = store.get_or_create_session(source)
assert refreshed.was_auto_reset
assert refreshed.resume_pending
assert refreshed.session_id == entry.session_id # same session preserved
def test_does_not_suspend_old_sessions(self, tmp_path):
store = _make_store(tmp_path)
@@ -66,21 +67,22 @@ class TestSuspendRecentlyActive:
count = store.suspend_recently_active(max_age_seconds=120)
assert count == 0
def test_already_suspended_not_double_counted(self, tmp_path):
def test_already_resume_pending_not_double_counted(self, tmp_path):
store = _make_store(tmp_path)
source = _make_source()
entry = store.get_or_create_session(source)
# Suspend once
# Mark resume_pending once
count1 = store.suspend_recently_active()
assert count1 == 1
# Create a new session (the old one got reset on next access)
# Re-fetch returns the SAME session (preserved, not reset)
entry2 = store.get_or_create_session(source)
assert entry2.session_id == entry.session_id
# Suspend again — the new session is recent but not yet suspended
# Second call skips already-resume_pending entries
count2 = store.suspend_recently_active()
assert count2 == 1
assert count2 == 0
# ---------------------------------------------------------------------------
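The tests in this hunk describe a store that flags recently active sessions as resume_pending, keeps the same session id, and does not count an already-flagged entry twice. The sketch below illustrates that contract with a toy in-memory store. `TinyStore`, `Entry`, `updated_at`, and the 120-second default are all hypothetical; the real session store is not part of this diff.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Entry:
    session_id: str
    updated_at: float = field(default_factory=time.time)
    resume_pending: bool = False


class TinyStore:
    """Toy stand-in for the session store (names are illustrative only)."""

    def __init__(self):
        self._entries: dict[str, Entry] = {}

    def get_or_create(self, key: str) -> Entry:
        # Re-fetching returns the existing entry, so history is preserved.
        return self._entries.setdefault(key, Entry(session_id=f"sess-{key}"))

    def suspend_recently_active(self, max_age_seconds: float = 120.0) -> int:
        now = time.time()
        count = 0
        for entry in self._entries.values():
            if entry.resume_pending:
                continue  # already flagged: skip so it is not double counted
            if now - entry.updated_at <= max_age_seconds:
                entry.resume_pending = True
                count += 1
        return count
```

Calling `suspend_recently_active()` twice in a row returns 1 and then 0 for a single fresh session, matching the updated `count2 == 0` assertion.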
@@ -180,11 +182,11 @@ class TestCleanShutdownMarker:
else:
store.suspend_recently_active()
# Session SHOULD be suspended (crash recovery)
# Session SHOULD be resume_pending (crash recovery preserves history)
with store._lock:
store._ensure_loaded_locked()
suspended_count = sum(1 for e in store._entries.values() if e.suspended)
assert suspended_count == 1, "Session should be suspended after crash (no marker)"
resume_count = sum(1 for e in store._entries.values() if e.resume_pending)
assert resume_count == 1, "Session should be resume_pending after crash (no marker)"
def test_marker_written_on_restart_stop(self, tmp_path, monkeypatch):
"""stop(restart=True) should also write the marker."""
@@ -0,0 +1,230 @@
"""Security regression tests: Discord component views honor role allowlists.
The four interactive component views (ExecApprovalView, SlashConfirmView,
UpdatePromptView, ModelPickerView) historically accepted only
``allowed_user_ids``. Deployments that configure DISCORD_ALLOWED_ROLES
without DISCORD_ALLOWED_USERS therefore had a wide-open component
surface: any guild member who could see the prompt could approve exec
commands, cancel slash confirmations, or switch the model -- even when
the same user would be rejected at the slash and on_message gates.
These tests pin the user-or-role OR semantics and the fail-closed
behavior on missing role data so the parity cannot regress.
"""
from types import SimpleNamespace
import pytest
# Trigger the shared discord mock from tests/gateway/conftest.py before
# importing the production module.
from gateway.platforms.discord import ( # noqa: E402
ExecApprovalView,
ModelPickerView,
SlashConfirmView,
UpdatePromptView,
_component_check_auth,
)
# ---------------------------------------------------------------------------
# Direct helper coverage -- the four views all delegate to this helper, so
# pinning the helper's contract pins all four call sites.
# ---------------------------------------------------------------------------
def _interaction(user_id, role_ids=None, *, drop_user=False, drop_roles=False):
"""Build a mock interaction with the requested user/role shape.
drop_user simulates a payload whose .user attribute is None.
drop_roles simulates a payload where .user has no .roles attribute
at all (DM-context Member, raw User payload).
"""
if drop_user:
return SimpleNamespace(user=None)
user_kwargs = {"id": user_id}
if not drop_roles:
user_kwargs["roles"] = [SimpleNamespace(id=r) for r in (role_ids or [])]
return SimpleNamespace(user=SimpleNamespace(**user_kwargs))
# ── back-compat: empty allowlists -> allow everyone ────────────────────────
def test_component_check_empty_allowlists_allows_everyone():
"""SECURITY-CRITICAL backwards-compat: deployments without any
DISCORD_ALLOWED_* env vars set must continue to allow component
interactions from anyone (no regression for unconfigured setups)."""
interaction = _interaction(11111)
assert _component_check_auth(interaction, set(), set()) is True
assert _component_check_auth(interaction, None, None) is True
# ── user allowlist ─────────────────────────────────────────────────────────
def test_component_check_user_in_user_allowlist_passes():
interaction = _interaction(11111)
assert _component_check_auth(interaction, {"11111"}, set()) is True
def test_component_check_user_not_in_user_allowlist_rejected():
interaction = _interaction(99999)
assert _component_check_auth(interaction, {"11111"}, set()) is False
# ── role allowlist OR semantics ────────────────────────────────────────────
def test_component_check_role_only_user_with_matching_role_passes():
"""Role-only deployment (DISCORD_ALLOWED_ROLES set, DISCORD_ALLOWED_USERS
empty) where the user is not in the empty user list but DOES carry a
matching role: must pass. This is the regression that prompted the
fix -- previously _check_auth allowed everyone when the user set was
empty, ignoring the role allowlist."""
interaction = _interaction(99999, role_ids=[42])
assert _component_check_auth(interaction, set(), {42}) is True
def test_component_check_role_only_user_without_matching_role_rejected():
"""Role-only deployment where the user has no matching role: reject.
Previously this allowed everyone because allowed_user_ids was empty."""
interaction = _interaction(99999, role_ids=[7, 8])
assert _component_check_auth(interaction, set(), {42}) is False
def test_component_check_user_or_role_user_match():
"""Both allowlists set; user matches user allowlist: pass."""
interaction = _interaction(11111, role_ids=[7])
assert _component_check_auth(interaction, {"11111"}, {42}) is True
def test_component_check_user_or_role_role_match():
"""Both allowlists set; user not in user list but in role list: pass."""
interaction = _interaction(99999, role_ids=[42])
assert _component_check_auth(interaction, {"11111"}, {42}) is True
def test_component_check_user_or_role_neither_match():
"""Both allowlists set; user matches neither: reject."""
interaction = _interaction(99999, role_ids=[7])
assert _component_check_auth(interaction, {"11111"}, {42}) is False
# ── fail-closed on missing role data ───────────────────────────────────────
def test_component_check_role_policy_with_no_roles_attr_rejects():
"""Role allowlist configured but interaction.user has no .roles
attribute (DM-context Member, raw User payload): must reject. A user
without resolvable roles cannot satisfy a role allowlist."""
interaction = _interaction(11111, drop_roles=True)
assert _component_check_auth(interaction, set(), {42}) is False
def test_component_check_missing_user_with_allowlist_rejects():
"""interaction.user is None with any allowlist configured: fail
closed without raising AttributeError."""
interaction = _interaction(0, drop_user=True)
assert _component_check_auth(interaction, {"11111"}, set()) is False
assert _component_check_auth(interaction, set(), {42}) is False
# ---------------------------------------------------------------------------
# View construction: every view must accept allowed_role_ids and route
# through the shared helper. Default value preserves prior call-sites.
# ---------------------------------------------------------------------------
def test_exec_approval_view_accepts_role_allowlist():
view = ExecApprovalView(
session_key="sess-1",
allowed_user_ids={"11111"},
allowed_role_ids={42},
)
# Role-only user passes
assert view._check_auth(_interaction(99999, role_ids=[42])) is True
# Neither user nor role match: reject
assert view._check_auth(_interaction(99999, role_ids=[7])) is False
def test_exec_approval_view_role_default_is_empty_set():
"""Existing call sites that pass only allowed_user_ids must continue
working with the legacy semantics (no role gate)."""
view = ExecApprovalView(session_key="sess-1", allowed_user_ids={"11111"})
assert view.allowed_role_ids == set()
assert view._check_auth(_interaction(11111)) is True
assert view._check_auth(_interaction(99999)) is False
def test_slash_confirm_view_accepts_role_allowlist():
view = SlashConfirmView(
session_key="sess-1",
confirm_id="c1",
allowed_user_ids=set(),
allowed_role_ids={42},
)
assert view._check_auth(_interaction(99999, role_ids=[42])) is True
assert view._check_auth(_interaction(99999, role_ids=[7])) is False
def test_update_prompt_view_accepts_role_allowlist():
view = UpdatePromptView(
session_key="sess-1",
allowed_user_ids=set(),
allowed_role_ids={42},
)
assert view._check_auth(_interaction(99999, role_ids=[42])) is True
assert view._check_auth(_interaction(99999, role_ids=[7])) is False
def test_model_picker_view_accepts_role_allowlist():
async def _noop(*_a, **_k):
return ""
view = ModelPickerView(
providers=[],
current_model="m",
current_provider="p",
session_key="sess-1",
on_model_selected=_noop,
allowed_user_ids=set(),
allowed_role_ids={42},
)
assert view._check_auth(_interaction(99999, role_ids=[42])) is True
assert view._check_auth(_interaction(99999, role_ids=[7])) is False
# ---------------------------------------------------------------------------
# Empty allowlists across views: legacy "allow everyone" must hold.
# ---------------------------------------------------------------------------
@pytest.mark.parametrize(
"view_factory",
[
lambda: ExecApprovalView(session_key="s", allowed_user_ids=set()),
lambda: SlashConfirmView(session_key="s", confirm_id="c", allowed_user_ids=set()),
lambda: UpdatePromptView(session_key="s", allowed_user_ids=set()),
],
)
def test_views_empty_allowlists_allow_everyone(view_factory):
view = view_factory()
assert view._check_auth(_interaction(99999)) is True
def test_model_picker_view_empty_allowlists_allow_everyone():
async def _noop(*_a, **_k):
return ""
view = ModelPickerView(
providers=[],
current_model="m",
current_provider="p",
session_key="s",
on_model_selected=_noop,
allowed_user_ids=set(),
)
assert view.allowed_role_ids == set()
assert view._check_auth(_interaction(99999)) is True
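The contract these tests pin (empty allowlists allow everyone, user OR role match passes, missing user or role data fails closed) can be sketched as a single function. This is an illustration inferred from the assertions above, not the production `_component_check_auth`; the name `check_component_auth` is hypothetical.

```python
from types import SimpleNamespace


def check_component_auth(interaction, allowed_user_ids=None, allowed_role_ids=None) -> bool:
    """Illustrative user-or-role gate; fails closed when data is missing."""
    users = set(allowed_user_ids or ())
    roles = set(allowed_role_ids or ())
    if not users and not roles:
        return True  # back-compat: no policy configured, allow everyone
    user = getattr(interaction, "user", None)
    if user is None:
        return False  # a policy exists but there is no user to check
    if users and str(getattr(user, "id", "")) in users:
        return True
    if roles:
        member_roles = getattr(user, "roles", None)
        if member_roles is None:
            return False  # roles cannot be resolved, so the role gate rejects
        return any(getattr(role, "id", None) in roles for role in member_roles)
    return False
```

User ids are compared as strings and role ids as integers, following the shapes the tests pass in (`{"11111"}` and `{42}`).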
+737
@@ -0,0 +1,737 @@
"""Security regression tests: slash commands honor on_message authorization gates.
Slash invocations (``_run_simple_slash``, ``_handle_thread_create_slash``)
historically bypassed every gate ``on_message`` enforces: DISCORD_ALLOWED_USERS,
DISCORD_ALLOWED_ROLES, DISCORD_ALLOWED_CHANNELS, and DISCORD_IGNORED_CHANNELS.
Any guild member could invoke ``/background``, ``/restart``, etc. as the
operator. ``_check_slash_authorization`` mirrors all four gates one-for-one.
These tests pin the security-correct behavior so the bypass cannot regress.
"""
import asyncio
import logging
import sys
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock
import pytest
from gateway.config import PlatformConfig
# ---------------------------------------------------------------------------
# Discord module mock — borrowed from test_discord_slash_commands.py so this
# file runs on machines without discord.py installed.
# ---------------------------------------------------------------------------
def _ensure_discord_mock():
if "discord" in sys.modules and hasattr(sys.modules["discord"], "__file__"):
return # real discord installed
if sys.modules.get("discord") is None:
discord_mod = MagicMock()
discord_mod.Intents.default.return_value = MagicMock()
discord_mod.DMChannel = type("DMChannel", (), {})
discord_mod.Thread = type("Thread", (), {})
discord_mod.ForumChannel = type("ForumChannel", (), {})
discord_mod.Interaction = object
class _FakePermissions:
def __init__(self, value=0, **_):
self.value = value
discord_mod.Permissions = _FakePermissions
class _FakeGroup:
def __init__(self, *, name, description, parent=None):
self.name = name
self.description = description
self.parent = parent
self._children: dict[str, object] = {}
if parent is not None:
parent.add_command(self)
def add_command(self, cmd):
self._children[cmd.name] = cmd
class _FakeCommand:
def __init__(self, *, name, description, callback, parent=None):
self.name = name
self.description = description
self.callback = callback
self.parent = parent
self.default_permissions = None
discord_mod.app_commands = SimpleNamespace(
describe=lambda **kwargs: (lambda fn: fn),
choices=lambda **kwargs: (lambda fn: fn),
autocomplete=lambda **kwargs: (lambda fn: fn),
Choice=lambda **kwargs: SimpleNamespace(**kwargs),
Group=_FakeGroup,
Command=_FakeCommand,
)
ext_mod = MagicMock()
commands_mod = MagicMock()
commands_mod.Bot = MagicMock
ext_mod.commands = commands_mod
sys.modules["discord"] = discord_mod
sys.modules.setdefault("discord.ext", ext_mod)
sys.modules.setdefault("discord.ext.commands", commands_mod)
_ensure_discord_mock()
from gateway.platforms.discord import DiscordAdapter # noqa: E402
@pytest.fixture(autouse=True)
def _isolate_discord_env(monkeypatch):
for var in (
"DISCORD_ALLOWED_USERS",
"DISCORD_ALLOWED_ROLES",
"DISCORD_ALLOWED_CHANNELS",
"DISCORD_IGNORED_CHANNELS",
"DISCORD_HIDE_SLASH_COMMANDS",
"DISCORD_ALLOW_BOTS",
):
monkeypatch.delenv(var, raising=False)
@pytest.fixture(autouse=True)
def _stub_discord_permissions(monkeypatch):
"""Pin discord.Permissions to a plain stand-in so tests can assert the
bitfield value regardless of whether real discord.py or a sibling test
module's MagicMock is loaded."""
import discord
class _Perm:
def __init__(self, value=0, **_):
self.value = value
monkeypatch.setattr(discord, "Permissions", _Perm)
@pytest.fixture
def adapter():
config = PlatformConfig(enabled=True, token="***")
a = DiscordAdapter(config)
a._client = SimpleNamespace(user=SimpleNamespace(id=99999, name="HermesBot"), guilds=[])
return a
_SENTINEL = object()
def _make_interaction(
user_id, *, channel_id=12345, guild_id=42, in_dm=False, in_thread=False,
parent_channel_id=None, user=_SENTINEL,
):
"""Build a mock Discord Interaction with a still-unresponded response.
``channel_id`` may be set to ``None`` to simulate a guild interaction
payload missing a resolvable channel id (fail-closed exercise).
Pass ``user=None`` to simulate a payload missing the user object.
"""
import discord
response = SimpleNamespace(send_message=AsyncMock(), defer=AsyncMock())
if in_dm:
channel = discord.DMChannel()
elif in_thread:
channel = discord.Thread()
channel.id = channel_id
channel.parent_id = parent_channel_id
elif channel_id is None:
channel = None
else:
channel = SimpleNamespace(id=channel_id)
if user is _SENTINEL:
user_obj = SimpleNamespace(id=int(user_id), name=f"user_{user_id}")
else:
user_obj = user
return SimpleNamespace(
user=user_obj,
guild=SimpleNamespace(owner_id=999),
guild_id=guild_id,
channel_id=channel_id,
channel=channel,
response=response,
)
# ---------------------------------------------------------------------------
# Backwards-compat: empty allowlist → everything passes (matches on_message)
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_no_allowlist_allows_everyone(adapter):
"""SECURITY-CRITICAL backwards-compat: deployments without any allowlist
env vars set must see ZERO behavior change. on_message lets everyone
through in this case (returns True at line 1890); slash must do the same.
"""
interaction = _make_interaction("999999999")
assert await adapter._check_slash_authorization(interaction, "/help") is True
interaction.response.send_message.assert_not_awaited()
@pytest.mark.asyncio
async def test_no_allowlist_dm_also_allowed(adapter):
"""Same for DMs — no allowlist means no restriction, matching on_message."""
interaction = _make_interaction("999999999", in_dm=True)
assert await adapter._check_slash_authorization(interaction, "/help") is True
# ---------------------------------------------------------------------------
# User allowlist (DISCORD_ALLOWED_USERS) parity
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_allowed_user_passes(adapter):
adapter._allowed_user_ids = {"100200300"}
interaction = _make_interaction("100200300")
assert await adapter._check_slash_authorization(interaction, "/background hi") is True
interaction.response.send_message.assert_not_awaited()
@pytest.mark.asyncio
async def test_disallowed_user_rejected_with_ephemeral(adapter, caplog):
adapter._allowed_user_ids = {"100200300"}
interaction = _make_interaction("999999999")
with caplog.at_level(logging.WARNING):
assert await adapter._check_slash_authorization(interaction, "/background hi") is False
interaction.response.send_message.assert_awaited_once()
args, kwargs = interaction.response.send_message.call_args
assert kwargs.get("ephemeral") is True
assert "not authorized" in (args[0] if args else kwargs.get("content", "")).lower()
assert any("Unauthorized slash attempt" in r.message for r in caplog.records)
assert any("DISCORD_ALLOWED_USERS" in r.message for r in caplog.records)
# ---------------------------------------------------------------------------
# Role allowlist (DISCORD_ALLOWED_ROLES) parity
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_role_member_passes(adapter):
"""A user whose Member.roles includes an allowed role passes the gate."""
adapter._allowed_role_ids = {1234}
interaction = _make_interaction("999999999")
interaction.user.roles = [SimpleNamespace(id=1234)]
assert await adapter._check_slash_authorization(interaction, "/help") is True
@pytest.mark.asyncio
async def test_role_non_member_rejected(adapter):
"""A user without any matching role is rejected even if no user allowlist."""
adapter._allowed_role_ids = {1234}
interaction = _make_interaction("999999999")
interaction.user.roles = [SimpleNamespace(id=9999)] # different role
assert await adapter._check_slash_authorization(interaction, "/help") is False
# ---------------------------------------------------------------------------
# Channel allowlist (DISCORD_ALLOWED_CHANNELS) parity — the gate prajer used
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_channel_not_in_allowlist_rejected(adapter, monkeypatch, caplog):
"""on_message blocks messages in channels not in DISCORD_ALLOWED_CHANNELS;
slash must do the same. This is the EXACT bypass prajer exploited.
"""
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "1111,2222")
interaction = _make_interaction("100200300", channel_id=9999)
with caplog.at_level(logging.WARNING):
assert await adapter._check_slash_authorization(interaction, "/background hi") is False
assert any("DISCORD_ALLOWED_CHANNELS" in r.message for r in caplog.records)
@pytest.mark.asyncio
async def test_channel_in_allowlist_passes(adapter, monkeypatch):
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "1111,2222")
interaction = _make_interaction("100200300", channel_id=1111)
assert await adapter._check_slash_authorization(interaction, "/help") is True
@pytest.mark.asyncio
async def test_channel_allowlist_wildcard_passes(adapter, monkeypatch):
"""``*`` in DISCORD_ALLOWED_CHANNELS = allow any channel, matching on_message."""
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "*")
interaction = _make_interaction("100200300", channel_id=9999)
assert await adapter._check_slash_authorization(interaction, "/help") is True
@pytest.mark.asyncio
async def test_channel_allowlist_does_not_apply_to_dms(adapter, monkeypatch):
"""DMs aren't channel-gated — they go through on_message's DM lockdown."""
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "1111")
interaction = _make_interaction("100200300", in_dm=True)
assert await adapter._check_slash_authorization(interaction, "/help") is True
# ---------------------------------------------------------------------------
# Channel blocklist (DISCORD_IGNORED_CHANNELS) parity
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_ignored_channel_rejected(adapter, monkeypatch, caplog):
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "9999")
interaction = _make_interaction("100200300", channel_id=9999)
with caplog.at_level(logging.WARNING):
assert await adapter._check_slash_authorization(interaction, "/help") is False
assert any("DISCORD_IGNORED_CHANNELS" in r.message for r in caplog.records)
@pytest.mark.asyncio
async def test_ignored_channel_wildcard_blocks_all(adapter, monkeypatch):
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "*")
interaction = _make_interaction("100200300", channel_id=9999)
assert await adapter._check_slash_authorization(interaction, "/help") is False
# ---------------------------------------------------------------------------
# Cross-platform admin notification
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_unauthorized_attempt_notifies_telegram(adapter):
from gateway.session import Platform
telegram_adapter = SimpleNamespace(send=AsyncMock())
home = SimpleNamespace(chat_id="987654321")
runner = SimpleNamespace(
adapters={Platform.TELEGRAM: telegram_adapter},
config=SimpleNamespace(get_home_channel=lambda p: home if p is Platform.TELEGRAM else None),
)
adapter.gateway_runner = runner
adapter._allowed_user_ids = {"100200300"}
interaction = _make_interaction("999999999")
await adapter._check_slash_authorization(interaction, "/background hi")
# Notify is fire-and-forget — let the scheduled task run.
await asyncio.sleep(0)
await asyncio.sleep(0)
telegram_adapter.send.assert_awaited_once()
chat_id, msg = telegram_adapter.send.call_args.args
assert chat_id == "987654321"
assert "Unauthorized" in msg
assert "999999999" in msg
assert "/background hi" in msg
assert "DISCORD_ALLOWED_USERS" in msg
@pytest.mark.asyncio
async def test_notify_silently_no_ops_without_runner(adapter):
adapter.gateway_runner = None
await adapter._notify_unauthorized_slash("u", "1", 2, 3, "/x", "reason") # must not raise
@pytest.mark.asyncio
async def test_notify_falls_back_to_slack_if_no_telegram(adapter):
from gateway.session import Platform
slack_adapter = SimpleNamespace(send=AsyncMock())
home_slack = SimpleNamespace(chat_id="C12345")
runner = SimpleNamespace(
adapters={Platform.SLACK: slack_adapter},
config=SimpleNamespace(
get_home_channel=lambda p: home_slack if p is Platform.SLACK else None,
),
)
adapter.gateway_runner = runner
await adapter._notify_unauthorized_slash("u", "1", 2, 3, "/x", "reason")
slack_adapter.send.assert_awaited_once()
# ---------------------------------------------------------------------------
# Opt-in visibility hide
# ---------------------------------------------------------------------------
def test_visibility_hide_off_by_default_is_noop(adapter, monkeypatch):
"""DISCORD_HIDE_SLASH_COMMANDS unset → don't touch any command's permissions."""
cmd = SimpleNamespace(name="x", default_permissions="UNCHANGED")
tree = SimpleNamespace(get_commands=lambda: [cmd])
# There is no clean way to simulate the env-gated branch of
# _register_slash_commands here, so this test only confirms that the
# env var is unset and that the helper itself can be called directly.
assert os.environ.get("DISCORD_HIDE_SLASH_COMMANDS") is None
# Helper should still work when called directly:
adapter._apply_owner_only_visibility(tree)
# When called directly the helper applies — env gating is at the call site,
# which we exercise in an integration-style test below.
def test_visibility_hide_helper_zeroes_perms(adapter):
cmd_a = SimpleNamespace(name="a", default_permissions=None)
cmd_b = SimpleNamespace(name="b", default_permissions=None)
tree = SimpleNamespace(get_commands=lambda: [cmd_a, cmd_b])
adapter._apply_owner_only_visibility(tree)
assert cmd_a.default_permissions is not None
assert cmd_b.default_permissions is not None
assert cmd_a.default_permissions.value == 0
assert cmd_b.default_permissions.value == 0
def test_visibility_hide_tolerates_unsetable_command(adapter, caplog):
class _Frozen:
__slots__ = ("name",)
def __init__(self, name):
self.name = name
cmd_ok = SimpleNamespace(name="ok", default_permissions=None)
cmd_bad = _Frozen("bad")
tree = SimpleNamespace(get_commands=lambda: [cmd_bad, cmd_ok])
with caplog.at_level(logging.DEBUG):
adapter._apply_owner_only_visibility(tree)
assert cmd_ok.default_permissions.value == 0
# os import for test_visibility_hide_off_by_default_is_noop
import os # noqa: E402
# ---------------------------------------------------------------------------
# Fail-closed parity on malformed slash auth context
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_missing_channel_id_rejected_when_channel_policy_configured(
adapter, monkeypatch,
):
"""A guild interaction without a resolvable channel id must fail
closed when DISCORD_ALLOWED_CHANNELS is configured. Without this
guard the entire channel-policy block silently fell through."""
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "1111,2222")
interaction = _make_interaction("100200300", channel_id=None)
assert await adapter._check_slash_authorization(interaction, "/help") is False
interaction.response.send_message.assert_awaited_once()
@pytest.mark.asyncio
async def test_missing_channel_id_allowed_when_no_channel_policy(adapter):
"""No DISCORD_ALLOWED_CHANNELS configured + missing channel id: still
pass through the channel block (matches no-allowlist default)."""
interaction = _make_interaction("100200300", channel_id=None)
assert await adapter._check_slash_authorization(interaction, "/help") is True
@pytest.mark.asyncio
async def test_missing_user_rejected_when_allowlist_configured(adapter):
"""interaction.user is None with a user/role allowlist active:
fail closed without raising AttributeError."""
adapter._allowed_user_ids = {"100200300"}
interaction = _make_interaction("100200300", user=None)
# Must not raise — must return False with an ephemeral rejection
assert await adapter._check_slash_authorization(interaction, "/help") is False
interaction.response.send_message.assert_awaited_once()
@pytest.mark.asyncio
async def test_missing_user_allowed_when_no_allowlist_configured(adapter):
"""interaction.user is None but no allowlist configured: allow
(preserves no-allowlist back-compat -- anyone is allowed when no
policy is in effect)."""
interaction = _make_interaction("100200300", user=None)
assert await adapter._check_slash_authorization(interaction, "/help") is True
# ---------------------------------------------------------------------------
# Thread parent channel allowlist parity
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_thread_parent_in_allowlist_passes(adapter, monkeypatch):
"""Thread whose parent channel is on DISCORD_ALLOWED_CHANNELS passes
even though the thread id itself isn't on the list."""
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "5555")
interaction = _make_interaction(
"100200300", channel_id=9999, in_thread=True, parent_channel_id=5555,
)
assert await adapter._check_slash_authorization(interaction, "/help") is True
@pytest.mark.asyncio
async def test_thread_parent_in_ignorelist_rejects(adapter, monkeypatch):
"""Thread whose parent channel is on DISCORD_IGNORED_CHANNELS rejects
even when the thread id itself isn't ignored."""
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "5555")
interaction = _make_interaction(
"100200300", channel_id=9999, in_thread=True, parent_channel_id=5555,
)
assert await adapter._check_slash_authorization(interaction, "/help") is False
@pytest.mark.asyncio
async def test_ignored_beats_allowed(adapter, monkeypatch):
"""Channel listed in BOTH allowed and ignored: the ignored entry wins.
Anything else would be a foot-gun where adding to ignored does nothing
if the channel is also explicitly allowed."""
monkeypatch.setenv("DISCORD_ALLOWED_CHANNELS", "1111")
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "1111")
interaction = _make_interaction("100200300", channel_id=1111)
assert await adapter._check_slash_authorization(interaction, "/help") is False
# ---------------------------------------------------------------------------
# Admin notify soft-fail fallback
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_notify_falls_back_to_slack_on_telegram_soft_fail(adapter):
"""adapter.send returning SendResult(success=False) must NOT short-
circuit the fallback chain. Treating a soft failure as delivered
means a Telegram outage swallows alerts silently."""
from gateway.session import Platform
soft_fail = SimpleNamespace(success=False, error="rate limited")
telegram_adapter = SimpleNamespace(send=AsyncMock(return_value=soft_fail))
slack_adapter = SimpleNamespace(send=AsyncMock())
home_tg = SimpleNamespace(chat_id="987654321")
home_sl = SimpleNamespace(chat_id="C12345")
homes = {Platform.TELEGRAM: home_tg, Platform.SLACK: home_sl}
runner = SimpleNamespace(
adapters={
Platform.TELEGRAM: telegram_adapter,
Platform.SLACK: slack_adapter,
},
config=SimpleNamespace(get_home_channel=lambda p: homes.get(p)),
)
adapter.gateway_runner = runner
await adapter._notify_unauthorized_slash("u", "1", 2, 3, "/x", "reason")
telegram_adapter.send.assert_awaited_once()
slack_adapter.send.assert_awaited_once()
@pytest.mark.asyncio
async def test_notify_returns_on_telegram_truthy_success(adapter):
"""adapter.send returning SendResult(success=True) -- or any object
without a falsy success attribute -- should still short-circuit at
Telegram. (This guards against the soft-fail patch over-correcting.)"""
from gateway.session import Platform
ok = SimpleNamespace(success=True, message_id="m1")
telegram_adapter = SimpleNamespace(send=AsyncMock(return_value=ok))
slack_adapter = SimpleNamespace(send=AsyncMock())
home_tg = SimpleNamespace(chat_id="987654321")
home_sl = SimpleNamespace(chat_id="C12345")
homes = {Platform.TELEGRAM: home_tg, Platform.SLACK: home_sl}
runner = SimpleNamespace(
adapters={
Platform.TELEGRAM: telegram_adapter,
Platform.SLACK: slack_adapter,
},
config=SimpleNamespace(get_home_channel=lambda p: homes.get(p)),
)
adapter.gateway_runner = runner
await adapter._notify_unauthorized_slash("u", "1", 2, 3, "/x", "reason")
telegram_adapter.send.assert_awaited_once()
slack_adapter.send.assert_not_awaited()
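The two tests above pin down one rule: a result whose `success` is falsy counts as undelivered, and anything else short-circuits the chain. A minimal sketch of that rule follows. The body of `_notify_unauthorized_slash` is not shown in this diff, so `notify_with_fallback` and `fake_adapter` are hypothetical names and the platform keys are plain strings rather than `Platform` members.

```python
import asyncio
from types import SimpleNamespace


async def notify_with_fallback(adapters, homes, order, text):
    """Hypothetical helper: return the first platform that confirms delivery."""
    for platform in order:
        adapter = adapters.get(platform)
        home = homes.get(platform)
        if adapter is None or home is None:
            continue  # platform not connected, or no home channel configured
        try:
            result = await adapter.send(home.chat_id, text)
        except Exception:
            continue  # hard failure: fall through to the next platform
        # A soft failure is an explicit falsy ``success``. A result with no
        # ``success`` attribute is treated as delivered (assumption taken
        # from the second test's docstring).
        if getattr(result, "success", True):
            return platform
    return None


def fake_adapter(result, calls, name):
    """Stand-in adapter that records each send and returns ``result``."""
    async def send(chat_id, text):
        calls.append((name, chat_id))
        return result
    return SimpleNamespace(send=send)
```

With a soft-failing first adapter the loop moves on to the second one; with a succeeding first adapter the second is never called.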
# ---------------------------------------------------------------------------
# /skill autocomplete + callback gating
# ---------------------------------------------------------------------------
def _capture_skill_registration(adapter, monkeypatch, entries):
"""Run ``_register_skill_group`` against a stubbed skill catalog and
return ``(handler_callback, autocomplete_callback)``.
The autocomplete callback is captured by monkeypatching
``discord.app_commands.autocomplete``. The production decorator is
a no-op stub in this test file's discord mock, so replacing it is the
most direct way to get hold of the callback in tests.
"""
import discord
captured: dict = {}
def fake_categories(reserved_names):
# Match discord_skill_commands_by_category's tuple shape:
# (categories_dict, uncategorized_list, hidden_count)
return ({}, list(entries), 0)
import hermes_cli.commands as _hc
monkeypatch.setattr(
_hc, "discord_skill_commands_by_category", fake_categories,
)
def capture_autocomplete(**kwargs):
# Only one autocomplete in /skill registration: name=...
captured["autocomplete"] = kwargs.get("name")
def _passthrough(fn):
return fn
return _passthrough
monkeypatch.setattr(
discord.app_commands, "autocomplete", capture_autocomplete,
raising=False,
)
registered: list = []
class _Tree:
def get_commands(self):
return []
def add_command(self, cmd):
registered.append(cmd)
adapter._register_skill_group(_Tree())
assert registered, "_register_skill_group did not register a command"
return registered[0].callback, captured["autocomplete"]
@pytest.mark.asyncio
async def test_skill_autocomplete_returns_empty_for_unauthorized(
adapter, monkeypatch,
):
"""Autocomplete must not leak the installed skill catalog to users
who can't run /skill. With DISCORD_ALLOWED_USERS configured and the
interaction user outside it, the autocomplete callback returns []."""
adapter._allowed_user_ids = {"100200300"}
entries = [
("alpha", "First skill", "/alpha"),
("beta", "Second skill", "/beta"),
]
_handler, autocomplete = _capture_skill_registration(
adapter, monkeypatch, entries,
)
interaction = _make_interaction("999999999")
result = await autocomplete(interaction, "")
assert result == []
@pytest.mark.asyncio
async def test_skill_autocomplete_returns_choices_for_authorized(
adapter, monkeypatch,
):
"""Sanity: an authorized user still gets the autocomplete suggestions."""
adapter._allowed_user_ids = {"100200300"}
entries = [
("alpha", "First skill", "/alpha"),
("beta", "Second skill", "/beta"),
]
_handler, autocomplete = _capture_skill_registration(
adapter, monkeypatch, entries,
)
interaction = _make_interaction("100200300")
result = await autocomplete(interaction, "")
assert len(result) == 2
assert {choice.value for choice in result} == {"alpha", "beta"}
@pytest.mark.asyncio
async def test_skill_handler_rejects_before_dispatch_for_unauthorized(
adapter, monkeypatch,
):
"""The /skill handler must call _check_slash_authorization BEFORE
skill_lookup. Otherwise unknown vs known names produce divergent
responses ("Unknown skill: foo" vs auth rejection) which is a
catalog-probing oracle."""
adapter._allowed_user_ids = {"100200300"}
entries = [("alpha", "First skill", "/alpha")]
handler, _autocomplete = _capture_skill_registration(
adapter, monkeypatch, entries,
)
# Patch _run_simple_slash so we can detect any leak through it.
dispatched: list = []
async def fake_dispatch(_interaction, text):
dispatched.append(text)
adapter._run_simple_slash = fake_dispatch # type: ignore[assignment]
interaction = _make_interaction("999999999")
await handler(interaction, "alpha", "")
interaction.response.send_message.assert_awaited_once()
args, kwargs = interaction.response.send_message.call_args
assert kwargs.get("ephemeral") is True
assert "not authorized" in (
args[0] if args else kwargs.get("content", "")
).lower()
# Critically: nothing was dispatched, and the auth message did NOT
# mention the skill name "alpha" (no catalog leak).
assert dispatched == []
@pytest.mark.asyncio
async def test_skill_handler_known_and_unknown_produce_same_rejection(
adapter, monkeypatch,
):
"""An unauthorized user probing for valid skill names must see the
same rejection text regardless of whether the name they tried is
in the registered catalog."""
adapter._allowed_user_ids = {"100200300"}
entries = [("alpha", "First skill", "/alpha")]
handler, _ = _capture_skill_registration(adapter, monkeypatch, entries)
adapter._run_simple_slash = AsyncMock() # type: ignore[assignment]
known_interaction = _make_interaction("999999999")
unknown_interaction = _make_interaction("999999999")
await handler(known_interaction, "alpha", "")
await handler(unknown_interaction, "definitely-not-a-skill", "")
known_interaction.response.send_message.assert_awaited_once()
unknown_interaction.response.send_message.assert_awaited_once()
known_args, known_kwargs = known_interaction.response.send_message.call_args
unknown_args, unknown_kwargs = (
unknown_interaction.response.send_message.call_args
)
assert known_args == unknown_args
assert known_kwargs == unknown_kwargs
@pytest.mark.asyncio
async def test_skill_handler_dispatches_for_authorized(
adapter, monkeypatch,
):
"""Sanity: an authorized user reaches _run_simple_slash with the
resolved cmd_key and arguments."""
adapter._allowed_user_ids = {"100200300"}
entries = [("alpha", "First skill", "/alpha")]
handler, _ = _capture_skill_registration(adapter, monkeypatch, entries)
dispatched: list = []
async def fake_dispatch(_interaction, text):
dispatched.append(text)
adapter._run_simple_slash = fake_dispatch # type: ignore[assignment]
interaction = _make_interaction("100200300")
await handler(interaction, "alpha", "extra args")
assert dispatched == ["/alpha extra args"]
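The /skill tests above rely on one ordering rule: authorize first, look the skill up second, so an unauthorized caller cannot tell known names from unknown ones. A minimal sketch of that ordering, assuming hypothetical helpers `skill_choices` and `handle_skill`, a plain dict as the catalog, and an illustrative rejection string (the adapter's real message is not shown here):

```python
NOT_AUTHORIZED = "You are not authorized to use this command."  # illustrative text


def _is_allowed(user_id, allowed_user_ids):
    # Assumption: an empty allowlist means no restriction is configured.
    return not allowed_user_ids or user_id in allowed_user_ids


def skill_choices(user_id, allowed_user_ids, catalog, current=""):
    """Autocomplete: unauthorized users get no suggestions at all."""
    if not _is_allowed(user_id, allowed_user_ids):
        return []
    return [name for name in catalog if name.startswith(current)]


def handle_skill(user_id, allowed_user_ids, catalog, name, args=""):
    """Handler: check authorization BEFORE touching the catalog."""
    if not _is_allowed(user_id, allowed_user_ids):
        return NOT_AUTHORIZED  # identical for known and unknown names
    cmd_key = catalog.get(name)
    if cmd_key is None:
        return f"Unknown skill: {name}"
    return f"{cmd_key} {args}".strip()
```

If the lookup ran first, "Unknown skill: foo" versus the rejection text would reveal which names exist, which is the probing oracle the tests guard against.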
+16 -1
@@ -107,6 +107,10 @@ def adapter():
user=SimpleNamespace(id=99999, name="HermesBot"),
)
adapter._text_batch_delay_seconds = 0 # disable batching for tests
# Slash auth is exercised in test_discord_slash_auth.py — bypass it here
# so registration / dispatch / thread behavior tests don't have to
# construct a full auth context (allowlist / channel scope).
adapter._check_slash_authorization = AsyncMock(return_value=True)
return adapter
@@ -117,6 +121,10 @@ def adapter():
@pytest.mark.asyncio
async def test_registers_native_thread_slash_command(adapter):
# The /thread slash closure now delegates ALL the work — including
# defer() — to _handle_thread_create_slash so the auth gate can send
# an ephemeral rejection on the still-unresponded interaction. The
# closure should just forward.
adapter._handle_thread_create_slash = AsyncMock()
adapter._register_slash_commands()
@@ -127,7 +135,9 @@ async def test_registers_native_thread_slash_command(adapter):
await command(interaction, name="Planning", message="", auto_archive_duration=1440)
-interaction.response.defer.assert_awaited_once_with(ephemeral=True)
+# defer is now performed inside _handle_thread_create_slash, AFTER the
+# auth check passes — not by the closure.
+interaction.response.defer.assert_not_awaited()
adapter._handle_thread_create_slash.assert_awaited_once_with(interaction, "Planning", "", 1440)
@@ -298,6 +308,7 @@ async def test_handle_thread_create_slash_reports_success(adapter):
user=SimpleNamespace(display_name="Jezza", id=42),
guild=SimpleNamespace(name="TestGuild"),
followup=SimpleNamespace(send=AsyncMock()),
response=SimpleNamespace(defer=AsyncMock()),
)
await adapter._handle_thread_create_slash(interaction, "Planning", "Kickoff", 1440)
@@ -326,6 +337,7 @@ async def test_handle_thread_create_slash_dispatches_session_when_message_provid
user=SimpleNamespace(display_name="Jezza", id=42),
guild=SimpleNamespace(name="TestGuild"),
followup=SimpleNamespace(send=AsyncMock()),
response=SimpleNamespace(defer=AsyncMock()),
)
adapter._dispatch_thread_session = AsyncMock()
@@ -348,6 +360,7 @@ async def test_handle_thread_create_slash_no_dispatch_without_message(adapter):
user=SimpleNamespace(display_name="Jezza", id=42),
guild=SimpleNamespace(name="TestGuild"),
followup=SimpleNamespace(send=AsyncMock()),
response=SimpleNamespace(defer=AsyncMock()),
)
adapter._dispatch_thread_session = AsyncMock()
@@ -371,6 +384,7 @@ async def test_handle_thread_create_slash_falls_back_to_seed_message(adapter):
user=SimpleNamespace(display_name="Jezza", id=42),
guild=SimpleNamespace(name="TestGuild"),
followup=SimpleNamespace(send=AsyncMock()),
response=SimpleNamespace(defer=AsyncMock()),
)
await adapter._handle_thread_create_slash(interaction, "Planning", "Kickoff", 1440)
@@ -395,6 +409,7 @@ async def test_handle_thread_create_slash_reports_failure(adapter):
channel_id=123,
user=SimpleNamespace(display_name="Jezza", id=42),
followup=SimpleNamespace(send=AsyncMock()),
response=SimpleNamespace(defer=AsyncMock()),
)
await adapter._handle_thread_create_slash(interaction, "Planning", "", 1440)
@@ -0,0 +1,78 @@
"""Gateway command help rendering tests."""

import pytest

from gateway.config import Platform
from gateway.platforms.base import MessageEvent
from gateway.session import SessionSource


def _make_event(text: str, platform: Platform) -> MessageEvent:
    return MessageEvent(
        text=text,
        source=SessionSource(
            platform=platform,
            chat_id="chat-1",
            user_id="user-1",
            user_name="tester",
            chat_type="dm",
        ),
    )


def _make_runner():
    from gateway.run import GatewayRunner

    return object.__new__(GatewayRunner)


@pytest.mark.asyncio
async def test_help_sanitizes_slash_command_mentions_for_telegram(monkeypatch):
    """Telegram help output must not expose invalid uppercase/hyphenated slashes."""
    monkeypatch.setattr(
        "agent.skill_commands.get_skill_commands",
        lambda: {
            "/Linear": {"description": "Open Linear"},
            "/Custom-Thing": {"description": "Run a custom thing"},
        },
    )

    result = await _make_runner()._handle_help_command(
        _make_event("/help", Platform.TELEGRAM)
    )

    assert "`/linear`" in result
    assert "`/custom_thing`" in result
    assert "`/Linear`" not in result
    assert "`/Custom-Thing`" not in result


@pytest.mark.asyncio
async def test_commands_sanitizes_slash_command_mentions_for_telegram(monkeypatch):
    """Paginated Telegram /commands output uses Telegram-valid slash mentions."""
    monkeypatch.setattr(
        "agent.skill_commands.get_skill_commands",
        lambda: {"/Linear": {"description": "Open Linear"}},
    )

    result = await _make_runner()._handle_commands_command(
        _make_event("/commands 999", Platform.TELEGRAM)
    )

    assert "`/linear`" in result
    assert "`/Linear`" not in result


@pytest.mark.asyncio
async def test_help_keeps_non_telegram_slash_command_mentions_unchanged(monkeypatch):
    """Only Telegram needs slash mentions rewritten to Telegram command names."""
    monkeypatch.setattr(
        "agent.skill_commands.get_skill_commands",
        lambda: {"/Linear": {"description": "Open Linear"}},
    )

    result = await _make_runner()._handle_help_command(
        _make_event("/help", Platform.DISCORD)
    )

    assert "`/Linear`" in result
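Telegram bot commands may contain only lowercase English letters, digits and underscores, up to 32 characters, which is why `/Linear` and `/Custom-Thing` have to be rewritten in Telegram help output. The sketch below shows one way such a rewrite could work. `telegram_command_name` and `sanitize_mentions` are hypothetical names, not the gateway's actual helpers, and the platform is a plain string here.

```python
import re

# Backticked slash mentions such as `/Custom-Thing` inside help text.
_MENTION = re.compile(r"`/([A-Za-z0-9_-]+)`")


def telegram_command_name(name):
    """Hypothetical helper: map a skill command to a Telegram-valid name."""
    cleaned = re.sub(r"[^a-z0-9_]", "_", name.lstrip("/").lower())
    return "/" + cleaned[:32]


def sanitize_mentions(text, platform):
    """Rewrite backticked slash mentions only when rendering for Telegram."""
    if platform != "telegram":
        return text  # other platforms keep the original spelling
    return _MENTION.sub(
        lambda m: "`" + telegram_command_name(m.group(1)) + "`", text
    )
```

Keeping the rewrite platform-gated matches the third test: Discord output still shows `/Linear` unchanged.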
+217
@@ -0,0 +1,217 @@
"""Tests for gateway /goal verdict-message delivery.
The judge verdict message ("✓ Goal achieved", "⏸ budget exhausted", etc.)
must reach the user after each turn. Before this fix the code checked
``hasattr(adapter, "send_message")`` but adapters expose ``send()``,
never ``send_message``, so the check always evaluated False and users
never saw verdicts. This test locks in the fix.
"""
from __future__ import annotations
import asyncio
from datetime import datetime
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
from gateway.config import GatewayConfig, Platform, PlatformConfig
from gateway.session import SessionEntry, SessionSource, build_session_key
@pytest.fixture()
def hermes_home(tmp_path, monkeypatch):
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(home))
from hermes_cli import goals
goals._DB_CACHE.clear()
yield home
goals._DB_CACHE.clear()
def _make_source() -> SessionSource:
return SessionSource(
platform=Platform.TELEGRAM,
user_id="u1",
chat_id="c1",
user_name="tester",
chat_type="dm",
)
class _RecordingAdapter:
"""Minimal adapter that records send() invocations."""
def __init__(self) -> None:
self._pending_messages: dict = {}
self.sends: list[dict] = []
async def send(self, chat_id: str, content: str, reply_to=None, metadata=None):
self.sends.append({"chat_id": chat_id, "content": content, "metadata": metadata})
class _R:
success = True
message_id = "mock-msg"
return _R()
def _make_runner_with_adapter():
from gateway.run import GatewayRunner
runner = object.__new__(GatewayRunner)
runner.config = GatewayConfig(
platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="***")},
)
runner.adapters = {}
runner._running_agents = {}
runner._running_agents_ts = {}
runner._queued_events = {}
src = _make_source()
session_entry = SessionEntry(
session_key=build_session_key(src),
session_id="goal-sess-1",
created_at=datetime.now(),
updated_at=datetime.now(),
platform=Platform.TELEGRAM,
chat_type="dm",
)
runner.session_store = MagicMock()
runner.session_store.get_or_create_session.return_value = session_entry
runner.session_store._generate_session_key.return_value = build_session_key(src)
adapter = _RecordingAdapter()
runner.adapters[Platform.TELEGRAM] = adapter
return runner, adapter, session_entry, src
@pytest.mark.asyncio
async def test_goal_verdict_done_sent_via_adapter_send(hermes_home):
"""When the judge says done, the '✓ Goal achieved' message must reach
the user through the adapter's ``send()`` method."""
runner, adapter, session_entry, src = _make_runner_with_adapter()
from hermes_cli.goals import GoalManager
mgr = GoalManager(session_entry.session_id)
mgr.set("ship the feature")
with patch("hermes_cli.goals.judge_goal", return_value=("done", "the feature shipped")):
runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="I shipped the feature.",
)
# fire-and-forget create_task — give the loop a tick
await asyncio.sleep(0.05)
assert len(adapter.sends) == 1, f"expected 1 send, got {len(adapter.sends)}: {adapter.sends}"
msg = adapter.sends[0]
assert msg["chat_id"] == "c1"
assert "Goal achieved" in msg["content"]
assert "the feature shipped" in msg["content"]
@pytest.mark.asyncio
async def test_goal_verdict_continue_enqueues_continuation(hermes_home):
"""When the judge says continue, both the 'continuing' status and the
continuation-prompt event must be delivered. The continuation prompt is
routed through the adapter's pending-messages FIFO so the goal loop
proceeds on the next turn."""
runner, adapter, session_entry, src = _make_runner_with_adapter()
from hermes_cli.goals import GoalManager
mgr = GoalManager(session_entry.session_id)
mgr.set("polish the docs")
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "still needs work")):
runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="here's a partial edit",
)
await asyncio.sleep(0.05)
# Status line sent back
assert len(adapter.sends) == 1
assert "Continuing toward goal" in adapter.sends[0]["content"]
# Continuation prompt enqueued for next turn
assert adapter._pending_messages, "continuation prompt must be enqueued in pending_messages"
@pytest.mark.asyncio
async def test_goal_verdict_budget_exhausted_sends_pause(hermes_home):
"""When the budget is exhausted, a '⏸ Goal paused' message must be sent
and no further continuation enqueued."""
runner, adapter, session_entry, src = _make_runner_with_adapter()
from hermes_cli.goals import GoalManager, save_goal
mgr = GoalManager(session_entry.session_id, default_max_turns=2)
state = mgr.set("tiny goal", max_turns=2)
state.turns_used = 2
save_goal(session_entry.session_id, state)
with patch("hermes_cli.goals.judge_goal", return_value=("continue", "keep going")):
runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="still partial",
)
await asyncio.sleep(0.05)
assert len(adapter.sends) == 1
content = adapter.sends[0]["content"]
assert "paused" in content.lower()
assert "turns used" in content.lower()
# No continuation enqueued when budget is exhausted
assert not adapter._pending_messages
@pytest.mark.asyncio
async def test_goal_verdict_skipped_when_no_active_goal(hermes_home):
"""No goal set → the hook is a no-op. Nothing is sent, nothing enqueued."""
runner, adapter, session_entry, src = _make_runner_with_adapter()
runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="anything",
)
await asyncio.sleep(0.05)
assert adapter.sends == []
assert adapter._pending_messages == {}
@pytest.mark.asyncio
async def test_goal_verdict_survives_adapter_without_send(hermes_home):
"""Bad adapter (no ``send`` attribute) must not crash the judge hook."""
runner, _adapter, session_entry, src = _make_runner_with_adapter()
from hermes_cli.goals import GoalManager
GoalManager(session_entry.session_id).set("survive missing send")
class _NoSendAdapter:
def __init__(self):
self._pending_messages: dict = {}
runner.adapters[Platform.TELEGRAM] = _NoSendAdapter()
with patch("hermes_cli.goals.judge_goal", return_value=("done", "ok")):
# must not raise
runner._post_turn_goal_continuation(
session_entry=session_entry,
source=src,
final_response="whatever",
)
await asyncio.sleep(0.05)
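The module docstring describes the bug as a `hasattr(adapter, "send_message")` check that could never be true, because adapters expose `send()`. A minimal sketch of a tolerant delivery path follows, assuming a hypothetical `deliver_verdict` helper. The real `_post_turn_goal_continuation` is not shown in this diff and schedules the send as a task rather than awaiting it directly.

```python
import asyncio


async def deliver_verdict(adapter, chat_id, text):
    """Hypothetical helper: send via ``send()``, tolerating adapters without it."""
    send = getattr(adapter, "send", None)
    if not callable(send):
        return False  # bad adapter: skip quietly instead of raising
    try:
        result = await send(chat_id, text)
    except Exception:
        return False
    # Treat a result with no ``success`` attribute as delivered (assumption).
    return bool(getattr(result, "success", True))


class RecordingAdapter:
    """Records send() calls, like the test double above."""

    def __init__(self):
        self.sends = []

    async def send(self, chat_id, content, reply_to=None, metadata=None):
        self.sends.append((chat_id, content))


class NoSendAdapter:
    """Adapter with no ``send`` attribute at all."""
```

Looking the method up with `getattr` and checking `callable` covers both the fixed case and the bad-adapter case exercised by the last test.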
+7 -1
@@ -8,7 +8,7 @@ to env vars nothing read on startup — the home channel appeared to set
successfully but was lost on every new gateway session.
"""
-from gateway.run import _home_target_env_var
+from gateway.run import _home_target_env_var, _home_thread_env_var
def test_matrix_home_target_env_var_uses_home_room():
@@ -34,3 +34,9 @@ def test_unknown_platform_home_target_env_var_falls_back_to_home_channel():
def test_case_insensitive_platform_name():
assert _home_target_env_var("MATRIX") == "MATRIX_HOME_ROOM"
assert _home_target_env_var("Email") == "EMAIL_HOME_ADDRESS"
def test_home_thread_env_var_uses_home_target_name_plus_thread_id():
assert _home_thread_env_var("discord") == "DISCORD_HOME_CHANNEL_THREAD_ID"
assert _home_thread_env_var("matrix") == "MATRIX_HOME_ROOM_THREAD_ID"
assert _home_thread_env_var("email") == "EMAIL_HOME_ADDRESS_THREAD_ID"
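The naming rule these assertions encode can be sketched as follows. This is an illustration built only from the cases visible in the tests (Matrix, Email, and the `HOME_CHANNEL` fallback); the real `_home_target_env_var` may carry more overrides, and the un-prefixed function names are hypothetical.

```python
# Overrides visible in the tests above; any other entries are unknown here.
_HOME_TARGET_SUFFIX = {"matrix": "HOME_ROOM", "email": "HOME_ADDRESS"}


def home_target_env_var(platform):
    """Env var holding a platform's home target; lookup is case-insensitive."""
    key = platform.lower()
    suffix = _HOME_TARGET_SUFFIX.get(key, "HOME_CHANNEL")
    return f"{key.upper()}_{suffix}"


def home_thread_env_var(platform):
    """Thread variant: the home-target name plus ``_THREAD_ID``."""
    return f"{home_target_env_var(platform)}_THREAD_ID"
```

Deriving the thread name from the target name keeps the two variables from drifting apart when a platform uses a non-default suffix.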
+206 -7
@@ -8,8 +8,8 @@ from unittest.mock import AsyncMock, MagicMock
import pytest
import gateway.run as gateway_run
-from gateway.config import Platform
-from gateway.platforms.base import MessageEvent, MessageType
+from gateway.config import HomeChannel, Platform
+from gateway.platforms.base import MessageEvent, MessageType, SendResult
from gateway.session import build_session_key
from tests.gateway.restart_test_helpers import (
make_restart_runner,
@@ -17,6 +17,22 @@ from tests.gateway.restart_test_helpers import (
)
# ── restart marker helpers ───────────────────────────────────────────────
def test_restart_notification_pending_false_without_marker(tmp_path, monkeypatch):
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
assert gateway_run._restart_notification_pending() is False
def test_restart_notification_pending_true_with_marker(tmp_path, monkeypatch):
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
(tmp_path / ".restart_notify.json").write_text("{}")
assert gateway_run._restart_notification_pending() is True
# ── _handle_restart_command writes .restart_notify.json ──────────────────
@@ -143,6 +159,184 @@ async def test_restart_command_uses_atomic_json_writes_for_marker_files(tmp_path
assert calls[1][1]["platform"] == "telegram"
@pytest.mark.asyncio
async def test_sethome_updates_running_config_for_same_process_restart(tmp_path, monkeypatch):
"""/sethome persists to env and updates in-memory config before restart."""
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
saved = {}
def _fake_save_env_value(key, value):
saved[key] = value
monkeypatch.setattr("hermes_cli.config.save_env_value", _fake_save_env_value)
runner, _adapter = make_restart_runner()
source = make_restart_source(chat_id="home-42")
source.chat_name = "Ops Home"
event = MessageEvent(
text="/sethome",
message_type=MessageType.TEXT,
source=source,
message_id="m-home",
)
result = await runner._handle_set_home_command(event)
home = runner.config.get_home_channel(Platform.TELEGRAM)
assert "Home channel set" in result
assert saved["TELEGRAM_HOME_CHANNEL"] == "home-42"
assert home is not None
assert home.chat_id == "home-42"
assert home.name == "Ops Home"
@pytest.mark.asyncio
async def test_sethome_preserves_thread_target_for_same_process_restart(tmp_path, monkeypatch):
"""/sethome from a topic/thread stores the thread-aware home target."""
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
saved = {}
def _fake_save_env_value(key, value):
saved[key] = value
monkeypatch.setattr("hermes_cli.config.save_env_value", _fake_save_env_value)
runner, _adapter = make_restart_runner()
source = make_restart_source(chat_id="parent-42", thread_id="topic-7")
source.chat_name = "Ops Topic"
event = MessageEvent(
text="/sethome",
message_type=MessageType.TEXT,
source=source,
message_id="m-home-thread",
)
result = await runner._handle_set_home_command(event)
home = runner.config.get_home_channel(Platform.TELEGRAM)
assert "Home channel set" in result
assert saved["TELEGRAM_HOME_CHANNEL"] == "parent-42"
assert saved["TELEGRAM_HOME_CHANNEL_THREAD_ID"] == "topic-7"
assert home is not None
assert home.chat_id == "parent-42"
assert home.thread_id == "topic-7"
# ── home-channel startup notifications ─────────────────────────────────────
@pytest.mark.asyncio
async def test_send_home_channel_startup_notification_to_configured_home(tmp_path, monkeypatch):
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
runner, adapter = make_restart_runner()
runner.config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
platform=Platform.TELEGRAM,
chat_id="home-42",
name="Ops Home",
)
adapter.send = AsyncMock()
delivered = await runner._send_home_channel_startup_notifications()
assert delivered == {("telegram", "home-42", None)}
adapter.send.assert_called_once_with(
"home-42",
"♻️ Gateway online — Hermes is back and ready.",
)
@pytest.mark.asyncio
async def test_send_home_channel_startup_notification_preserves_thread_metadata(
tmp_path, monkeypatch
):
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
runner, adapter = make_restart_runner()
runner.config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
platform=Platform.TELEGRAM,
chat_id="parent-42",
name="Ops Topic",
thread_id="topic-7",
)
adapter.send = AsyncMock(return_value=SendResult(success=True, message_id="home"))
delivered = await runner._send_home_channel_startup_notifications()
assert delivered == {("telegram", "parent-42", "topic-7")}
adapter.send.assert_called_once_with(
"parent-42",
"♻️ Gateway online — Hermes is back and ready.",
metadata={"thread_id": "topic-7"},
)
@pytest.mark.asyncio
async def test_send_home_channel_startup_notification_skips_restart_target(
tmp_path, monkeypatch
):
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
runner, adapter = make_restart_runner()
runner.config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
platform=Platform.TELEGRAM,
chat_id="42",
name="Ops Home",
)
adapter.send = AsyncMock()
delivered = await runner._send_home_channel_startup_notifications(
skip_targets={("telegram", "42", None)}
)
assert delivered == set()
adapter.send.assert_not_called()
@pytest.mark.asyncio
async def test_send_home_channel_startup_notification_does_not_skip_different_thread(
tmp_path, monkeypatch
):
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
runner, adapter = make_restart_runner()
runner.config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
platform=Platform.TELEGRAM,
chat_id="42",
name="Ops Home",
)
adapter.send = AsyncMock(return_value=SendResult(success=True, message_id="home"))
delivered = await runner._send_home_channel_startup_notifications(
skip_targets={("telegram", "42", "topic-7")}
)
assert delivered == {("telegram", "42", None)}
adapter.send.assert_called_once()
@pytest.mark.asyncio
async def test_send_home_channel_startup_notification_ignores_false_send_result(
tmp_path, monkeypatch
):
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
runner, adapter = make_restart_runner()
runner.config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
platform=Platform.TELEGRAM,
chat_id="home-42",
name="Ops Home",
)
adapter.send = AsyncMock(return_value=SendResult(success=False, error="network down"))
delivered = await runner._send_home_channel_startup_notifications()
assert delivered == set()
adapter.send.assert_called_once()
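Taken together, the startup-notification tests above describe a small contract: targets are `(platform, chat_id, thread_id)` tuples, `skip_targets` must match on all three fields, thread metadata is passed only for threaded homes, and a `success=False` result is not counted as delivered. A minimal sketch of that contract, assuming a hypothetical `send_startup_notifications` function, string platform keys, and placeholder message text:

```python
import asyncio
from types import SimpleNamespace

ONLINE_TEXT = "Gateway online"  # placeholder text for this sketch


async def send_startup_notifications(homes, adapters, skip_targets=frozenset()):
    """Hypothetical helper: return the set of targets that confirmed delivery."""
    delivered = set()
    for platform, home in homes.items():
        target = (platform, home.chat_id, home.thread_id)
        adapter = adapters.get(platform)
        if adapter is None or target in skip_targets:
            continue  # already notified elsewhere, or platform not connected
        # Pass metadata only for threaded homes so plain sends stay unchanged.
        kwargs = {"metadata": {"thread_id": home.thread_id}} if home.thread_id else {}
        try:
            result = await adapter.send(home.chat_id, ONLINE_TEXT, **kwargs)
        except Exception:
            continue
        if getattr(result, "success", True):
            delivered.add(target)
    return delivered
```

Because the skip check compares the whole tuple, a restart notice sent to a topic inside chat `42` does not suppress the home-channel notice sent to chat `42` itself.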
# ── _send_restart_notification ───────────────────────────────────────────
@@ -160,8 +354,9 @@ async def test_send_restart_notification_delivers_and_cleans_up(tmp_path, monkey
runner, adapter = make_restart_runner()
adapter.send = AsyncMock()
-await runner._send_restart_notification()
+delivered_target = await runner._send_restart_notification()
+assert delivered_target == ("telegram", "42", None)
adapter.send.assert_called_once()
call_args = adapter.send.call_args
assert call_args[0][0] == "42" # chat_id
@@ -185,8 +380,9 @@ async def test_send_restart_notification_with_thread(tmp_path, monkeypatch):
runner, adapter = make_restart_runner()
adapter.send = AsyncMock()
-await runner._send_restart_notification()
+delivered_target = await runner._send_restart_notification()
+assert delivered_target == ("telegram", "99", "topic_7")
call_args = adapter.send.call_args
assert call_args[1]["metadata"] == {"thread_id": "topic_7"}
assert not notify_path.exists()
@@ -240,9 +436,10 @@ async def test_send_restart_notification_cleans_up_on_send_failure(
runner, adapter = make_restart_runner()
adapter.send = AsyncMock(side_effect=RuntimeError("network down"))
-await runner._send_restart_notification()
+delivered_target = await runner._send_restart_notification()
 # File cleaned up even though send raised.
+assert delivered_target is None
assert not notify_path.exists()
@@ -274,7 +471,7 @@ async def test_send_restart_notification_logs_warning_on_sendresult_failure(
)
with caplog.at_level("DEBUG", logger="gateway.run"):
-await runner._send_restart_notification()
+delivered_target = await runner._send_restart_notification()
success_lines = [
r for r in caplog.records
@@ -286,6 +483,7 @@ async def test_send_restart_notification_logs_warning_on_sendresult_failure(
and "was not delivered" in r.getMessage()
and "Chat not found" in r.getMessage()
]
assert delivered_target is None
assert not success_lines, (
"Expected no INFO 'Sent restart notification' line when send failed, "
f"got: {[r.getMessage() for r in success_lines]}"
@@ -317,12 +515,13 @@ async def test_send_restart_notification_logs_info_on_sendresult_success(
adapter.send = AsyncMock(return_value=SendResult(success=True, message_id="m-1"))
with caplog.at_level("DEBUG", logger="gateway.run"):
-await runner._send_restart_notification()
+delivered_target = await runner._send_restart_notification()
success_lines = [
r for r in caplog.records
if r.levelname == "INFO" and "Sent restart notification" in r.getMessage()
]
assert delivered_target == ("telegram", "42", None)
assert success_lines, (
"Expected INFO 'Sent restart notification' when send succeeded; "
f"got records: {[(r.levelname, r.getMessage()) for r in caplog.records]}"
+85 -4
@@ -32,7 +32,8 @@ from unittest.mock import AsyncMock, MagicMock, patch
import pytest
-from gateway.config import GatewayConfig, Platform, PlatformConfig
+from gateway.config import GatewayConfig, HomeChannel, Platform, PlatformConfig
+from gateway.platforms.base import SendResult
from gateway.run import (
_auto_continue_freshness_window,
_coerce_gateway_timestamp,
@@ -376,8 +377,8 @@ class TestSuspendRecentlyActiveSkipsResumePending:
assert e.suspended is False
assert e.resume_pending is True
-def test_non_resume_pending_still_suspended(self, tmp_path):
-"""Non-resume sessions still get the old crash-recovery suspension."""
+def test_non_resume_pending_gets_resume_pending(self, tmp_path):
+"""Non-resume sessions are now marked resume_pending (not suspended)."""
store = _make_store(tmp_path)
source_a = _make_source(chat_id="a")
source_b = _make_source(chat_id="b")
@@ -386,9 +387,11 @@ class TestSuspendRecentlyActiveSkipsResumePending:
store.mark_resume_pending(entry_a.session_key)
count = store.suspend_recently_active()
+# entry_a is already resume_pending → skipped. entry_b gets marked.
 assert count == 1
 assert store._entries[entry_a.session_key].suspended is False
-assert store._entries[entry_b.session_key].suspended is True
+assert store._entries[entry_b.session_key].resume_pending is True
+assert store._entries[entry_b.session_key].suspended is False
# ---------------------------------------------------------------------------
@@ -929,6 +932,84 @@ async def test_restart_banner_uses_try_to_resume_wording():
assert "try to resume" in msg
@pytest.mark.asyncio
async def test_restart_notifies_home_channel_even_without_active_sessions():
runner, adapter = make_restart_runner()
runner._restart_requested = True
runner.config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
platform=Platform.TELEGRAM,
chat_id="home-42",
name="Ops Home",
)
await runner._notify_active_sessions_of_shutdown()
assert adapter.sent == [
"⚠️ Gateway restarting — Your current task will be interrupted. "
"Send any message after restart and I'll try to resume where you left off."
]
@pytest.mark.asyncio
async def test_restart_home_channel_notification_dedupes_active_chat():
runner, adapter = make_restart_runner()
runner._restart_requested = True
runner._running_agents["agent:main:telegram:dm:999"] = MagicMock()
runner.config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
platform=Platform.TELEGRAM,
chat_id="999",
name="Ops Home",
)
await runner._notify_active_sessions_of_shutdown()
assert len(adapter.sent) == 1
@pytest.mark.asyncio
async def test_restart_home_channel_notification_not_deduped_across_threads():
runner, adapter = make_restart_runner()
runner._restart_requested = True
session_key = "agent:main:telegram:group:999"
runner.session_store._entries[session_key] = MagicMock(
origin=SessionSource(
platform=Platform.TELEGRAM,
chat_id="999",
chat_type="group",
user_id="u1",
thread_id="topic-7",
)
)
runner._running_agents[session_key] = MagicMock()
runner.config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
platform=Platform.TELEGRAM,
chat_id="999",
name="Ops Home",
)
await runner._notify_active_sessions_of_shutdown()
assert len(adapter.sent) == 2
assert adapter.sent_calls[0][2] == {"thread_id": "topic-7"}
assert adapter.sent_calls[1][2] is None
@pytest.mark.asyncio
async def test_restart_home_channel_notification_ignores_false_send_result():
runner, adapter = make_restart_runner()
runner._restart_requested = True
runner.config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
platform=Platform.TELEGRAM,
chat_id="home-42",
name="Ops Home",
)
adapter.send = AsyncMock(return_value=SendResult(success=False, error="network down"))
await runner._notify_active_sessions_of_shutdown()
adapter.send.assert_called_once()
# ---------------------------------------------------------------------------
# Stuck-loop escalation integration
# ---------------------------------------------------------------------------
@@ -124,6 +124,10 @@ async def test_resume_clears_session_scoped_approval_and_yolo_state():
runner, session_key = _make_resume_runner()
other_key = "agent:main:telegram:dm:other-chat"
runner._pending_skills_reload_notes = {
session_key: "[USER INITIATED SKILLS RELOAD: target]",
other_key: "[USER INITIATED SKILLS RELOAD: other]",
}
approve_session(session_key, "recursive delete")
approve_session(other_key, "recursive delete")
enable_session_yolo(session_key)
@@ -140,10 +144,12 @@ async def test_resume_clears_session_scoped_approval_and_yolo_state():
assert is_session_yolo_enabled(session_key) is False
assert session_key not in runner._pending_approvals
assert session_key not in runner._update_prompt_pending
assert session_key not in runner._pending_skills_reload_notes
assert is_approved(other_key, "recursive delete") is True
assert is_session_yolo_enabled(other_key) is True
assert other_key in runner._pending_approvals
assert other_key in runner._update_prompt_pending
assert other_key in runner._pending_skills_reload_notes
@pytest.mark.asyncio
@@ -151,6 +157,10 @@ async def test_branch_clears_session_scoped_approval_and_yolo_state():
runner, session_key = _make_branch_runner()
other_key = "agent:main:telegram:dm:other-chat"
runner._pending_skills_reload_notes = {
session_key: "[USER INITIATED SKILLS RELOAD: target]",
other_key: "[USER INITIATED SKILLS RELOAD: other]",
}
approve_session(session_key, "recursive delete")
approve_session(other_key, "recursive delete")
enable_session_yolo(session_key)
@@ -167,10 +177,12 @@ async def test_branch_clears_session_scoped_approval_and_yolo_state():
assert is_session_yolo_enabled(session_key) is False
assert session_key not in runner._pending_approvals
assert session_key not in runner._update_prompt_pending
assert session_key not in runner._pending_skills_reload_notes
assert is_approved(other_key, "recursive delete") is True
assert is_session_yolo_enabled(other_key) is True
assert other_key in runner._pending_approvals
assert other_key in runner._update_prompt_pending
assert other_key in runner._pending_skills_reload_notes
@pytest.mark.asyncio
@@ -216,6 +228,7 @@ def test_clear_session_boundary_security_state_is_scoped():
runner = object.__new__(GatewayRunner)
runner._pending_approvals = {}
runner._update_prompt_pending = {}
runner._pending_skills_reload_notes = {}
source = _make_source()
session_key = build_session_key(source)
@@ -229,6 +242,12 @@ def test_clear_session_boundary_security_state_is_scoped():
runner._pending_approvals[other_key] = {"command": "rm -rf /tmp/other"}
runner._update_prompt_pending[session_key] = True
runner._update_prompt_pending[other_key] = True
runner._pending_skills_reload_notes[session_key] = (
"[USER INITIATED SKILLS RELOAD: target]"
)
runner._pending_skills_reload_notes[other_key] = (
"[USER INITIATED SKILLS RELOAD: other]"
)
runner._clear_session_boundary_security_state(session_key)
@@ -237,16 +256,19 @@ def test_clear_session_boundary_security_state_is_scoped():
assert is_session_yolo_enabled(session_key) is False
assert session_key not in runner._pending_approvals
assert session_key not in runner._update_prompt_pending
assert session_key not in runner._pending_skills_reload_notes
# Other session untouched
assert is_approved(other_key, "recursive delete") is True
assert is_session_yolo_enabled(other_key) is True
assert other_key in runner._pending_approvals
assert other_key in runner._update_prompt_pending
assert other_key in runner._pending_skills_reload_notes
# Empty session_key is a no-op
runner._clear_session_boundary_security_state("")
assert is_approved(other_key, "recursive delete") is True
assert other_key in runner._update_prompt_pending
assert other_key in runner._pending_skills_reload_notes
def test_clear_session_boundary_security_state_wakes_blocked_approvals():
@@ -231,6 +231,55 @@ class TestSlackConnectCleanup:
mock_release.assert_called_once_with("slack-app-token", "xapp-fake")
assert adapter._platform_lock_identity is None
@pytest.mark.asyncio
async def test_reconnect_closes_previous_handler_to_prevent_zombie_socket(self):
"""Regression for #18980: calling connect() on an adapter that already has
a live handler (e.g. during a gateway restart) must close the old
AsyncSocketModeHandler before creating a new one. Without this guard,
the old Socket Mode websocket stays alive and both connections dispatch
every Slack event, producing double responses, the same bug that
affected DiscordAdapter (#18187).
"""
config = PlatformConfig(enabled=True, token="xoxb-fake")
adapter = SlackAdapter(config)
# Simulate state left over from a prior connect() call.
first_handler = AsyncMock()
first_handler.close_async = AsyncMock()
adapter._handler = first_handler
mock_app = MagicMock()
def _noop_decorator(event_type):
def decorator(fn): return fn
return decorator
mock_app.event = _noop_decorator
mock_app.command = _noop_decorator
mock_app.action = _noop_decorator
mock_app.client = AsyncMock()
mock_web_client = AsyncMock()
mock_web_client.auth_test = AsyncMock(return_value={
"user_id": "U_BOT",
"user": "testbot",
"team_id": "T_FAKE",
"team": "FakeTeam",
})
second_handler = MagicMock()
with patch.object(_slack_mod, "AsyncApp", return_value=mock_app), \
patch.object(_slack_mod, "AsyncWebClient", return_value=mock_web_client), \
patch.object(_slack_mod, "AsyncSocketModeHandler", return_value=second_handler), \
patch.dict(os.environ, {"SLACK_APP_TOKEN": "xapp-fake"}), \
patch("gateway.status.acquire_scoped_lock", return_value=(True, None)), \
patch("gateway.status.release_scoped_lock"), \
patch("asyncio.create_task"):
result = await adapter.connect()
assert result is True
first_handler.close_async.assert_awaited_once_with()
assert adapter._handler is second_handler
# ---------------------------------------------------------------------------
# TestSlackProxyBehavior
@@ -261,6 +261,57 @@ def test_group_allow_from_is_enforced_by_gateway_authorization_not_trigger_gate(
assert adapter._should_process_message(_group_message("hello", from_user_id=333)) is True
def test_top_level_require_mention_bridges_to_telegram(monkeypatch, tmp_path):
"""require_mention at the config.yaml top level (alongside group_sessions_per_user)
must behave identically to telegram.require_mention: true (#3979).
"""
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
# Intentionally no "telegram:" section — keys are at the top level.
(hermes_home / "config.yaml").write_text(
"require_mention: true\n"
"group_sessions_per_user: true\n",
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.delenv("TELEGRAM_REQUIRE_MENTION", raising=False)
config = load_gateway_config()
assert config is not None
assert __import__("os").environ.get("TELEGRAM_REQUIRE_MENTION") == "true"
# The adapter's extra dict must also carry the setting so that
# _telegram_require_mention() works even without the env var.
tg_cfg = config.platforms.get(__import__("gateway.config", fromlist=["Platform"]).Platform.TELEGRAM)
if tg_cfg is not None:
assert tg_cfg.extra.get("require_mention") is True
def test_top_level_require_mention_does_not_override_telegram_section(monkeypatch, tmp_path):
"""When telegram.require_mention is explicitly set, top-level require_mention
must not override it (platform-specific config takes precedence).
"""
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
(hermes_home / "config.yaml").write_text(
"require_mention: true\n"
"telegram:\n"
" require_mention: false\n",
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.delenv("TELEGRAM_REQUIRE_MENTION", raising=False)
config = load_gateway_config()
assert config is not None
# The telegram-specific "false" must win over the top-level "true".
assert __import__("os").environ.get("TELEGRAM_REQUIRE_MENTION") == "false"
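The precedence these two tests pin down can be sketched as follows. This is a minimal illustration, not the gateway's loader: `resolve_require_mention` is a hypothetical helper, the config is assumed to be an already-parsed dict, and the `False` default when neither key is present is an assumption.

```python
def resolve_require_mention(config: dict) -> bool:
    """Hypothetical helper: platform-specific value wins over the top level."""
    telegram = config.get("telegram") or {}
    # An explicit telegram.require_mention (even False) takes precedence.
    if "require_mention" in telegram:
        return bool(telegram["require_mention"])
    # Otherwise fall back to the top-level key (default assumed to be False).
    return bool(config.get("require_mention", False))
```

Checking key presence rather than truthiness is what lets an explicit `false` in the `telegram:` section beat a top-level `true`.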
def test_config_bridges_telegram_ignored_threads(monkeypatch, tmp_path):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
@@ -954,6 +954,46 @@ class TestVoiceChannelCommands:
assert "Test transcript" in msg
assert "42" in msg # user_id in mention
@pytest.mark.asyncio
async def test_input_suppresses_duplicate_transcript(self, runner):
"""Near-immediate duplicate STT output should not dispatch twice."""
from gateway.config import Platform
mock_adapter = AsyncMock()
mock_adapter._voice_text_channels = {111: 123}
mock_adapter._voice_sources = {}
mock_channel = AsyncMock()
mock_adapter._client = MagicMock()
mock_adapter._client.get_channel = MagicMock(return_value=mock_channel)
mock_adapter.handle_message = AsyncMock()
runner.adapters[Platform.DISCORD] = mock_adapter
await runner._handle_voice_channel_input(111, 42, "Hello from VC")
await runner._handle_voice_channel_input(111, 42, "Hello from VC")
mock_adapter.handle_message.assert_called_once()
mock_channel.send.assert_called_once()
@pytest.mark.asyncio
async def test_input_suppresses_near_duplicate_transcript(self, runner):
"""Small STT wording drift should still be treated as the same utterance."""
from gateway.config import Platform
mock_adapter = AsyncMock()
mock_adapter._voice_text_channels = {111: 123}
mock_adapter._voice_sources = {}
mock_channel = AsyncMock()
mock_adapter._client = MagicMock()
mock_adapter._client.get_channel = MagicMock(return_value=mock_channel)
mock_adapter.handle_message = AsyncMock()
runner.adapters[Platform.DISCORD] = mock_adapter
await runner._handle_voice_channel_input(111, 42, "This is a test of the voice system")
await runner._handle_voice_channel_input(111, 42, "This is a test for the voice system")
mock_adapter.handle_message.assert_called_once()
mock_channel.send.assert_called_once()
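One way the suppression exercised above could work is a per-speaker similarity check inside a short time window. The sketch below is an assumption, not the runner's actual logic: `TranscriptDeduper`, the 3-second window, and the 0.85 similarity threshold are all hypothetical, and `difflib.SequenceMatcher` is a swapped-in stand-in for whatever comparison the gateway uses.

```python
import difflib
import time


class TranscriptDeduper:
    """Hypothetical helper: drop repeated STT output from the same speaker."""

    def __init__(self, window_seconds: float = 3.0, min_ratio: float = 0.85):
        self.window_seconds = window_seconds  # assumed window
        self.min_ratio = min_ratio            # assumed similarity threshold
        self._last = {}  # (channel_id, user_id) -> (timestamp, normalized text)

    def is_duplicate(self, channel_id, user_id, text, now=None):
        now = time.monotonic() if now is None else now
        # Normalize case and whitespace so trivial differences do not matter.
        normalized = " ".join(text.lower().split())
        key = (channel_id, user_id)
        previous = self._last.get(key)
        if previous is not None:
            seen_at, seen_text = previous
            if now - seen_at <= self.window_seconds:
                ratio = difflib.SequenceMatcher(None, seen_text, normalized).ratio()
                if ratio >= self.min_ratio:
                    return True
        # Not a duplicate: remember it for the next comparison.
        self._last[key] = (now, normalized)
        return False
```

Passing `now` explicitly keeps the helper deterministic in tests; in production the monotonic clock default avoids wall-clock jumps.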
# -- _get_guild_id --
def test_get_guild_id_from_guild(self, runner):
@@ -236,6 +236,13 @@ class TestTelegramBotCommands:
tg_name = cmd.name.replace("-", "_")
assert tg_name not in names
def test_excludes_commands_with_required_args(self):
names = {name for name, _ in telegram_bot_commands()}
assert "background" not in names
assert "queue" not in names
assert "steer" not in names
assert "background" in GATEWAY_KNOWN_COMMANDS
class TestSlackSubcommandMap:
def test_returns_dict(self):
@@ -822,6 +829,103 @@ class TestClampTelegramNames:
assert result[0] == ("foo", "d1")
class TestClampCommandNamesTriples:
"""Tests for _clamp_command_names with 3-tuples (name, desc, cmd_key).
Skill entries pass through _clamp_command_names as 3-tuples so the
original cmd_key survives name truncation. Before the fix in PR #18951,
the code stripped cmd_key into a side-dict keyed by the *original*
(name, desc) pair; after truncation the lookup key no longer matched,
silently losing the cmd_key.
"""
def test_short_triple_preserved(self):
entries = [("skill", "A skill", "/skill")]
result = _clamp_command_names(entries, set())
assert result == [("skill", "A skill", "/skill")]
def test_long_name_preserves_cmd_key(self):
long = "a" * 50
cmd_key = f"/{long}"
result = _clamp_command_names([(long, "desc", cmd_key)], set())
assert len(result) == 1
name, desc, key = result[0]
assert len(name) == _CMD_NAME_LIMIT
assert key == cmd_key, "cmd_key must survive name clamping"
def test_collision_preserves_cmd_key(self):
prefix = "x" * _CMD_NAME_LIMIT
long = "x" * 50
result = _clamp_command_names(
[(long, "desc", "/long-skill")], reserved={prefix},
)
assert len(result) == 1
name, _desc, key = result[0]
assert name == "x" * (_CMD_NAME_LIMIT - 1) + "0"
assert key == "/long-skill"
def test_multiple_long_names_preserve_respective_keys(self):
base = "y" * 40
entries = [
(base + "_alpha", "d1", "/alpha-skill"),
(base + "_beta", "d2", "/beta-skill"),
]
result = _clamp_command_names(entries, set())
assert len(result) == 2
assert result[0][2] == "/alpha-skill"
assert result[1][2] == "/beta-skill"
def test_backward_compat_with_pairs(self):
"""Legacy 2-tuple callers (Telegram) must still work."""
entries = [("help", "Show help"), ("status", "Show status")]
result = _clamp_command_names(entries, set())
assert result == entries
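The behaviour these tests describe (clamp only the name, carry every other tuple field through untouched) can be sketched like this. It is an illustration under assumptions, not the code in `hermes_cli/commands.py`: `clamp_names` and `CMD_NAME_LIMIT` are hypothetical stand-ins for `_clamp_command_names` and `_CMD_NAME_LIMIT`, and dropping an entry when all ten digit suffixes are taken is an assumed policy.

```python
CMD_NAME_LIMIT = 32  # assumed value; the tests refer to _CMD_NAME_LIMIT


def clamp_names(entries, reserved):
    """Hypothetical sketch: clamp entry[0], keep trailing fields (e.g. cmd_key)."""
    used = set(reserved)
    result = []
    for entry in entries:
        name, rest = entry[0], tuple(entry[1:])
        if len(name) > CMD_NAME_LIMIT:
            candidate = name[:CMD_NAME_LIMIT]
            if candidate in used:
                # Collision: replace the last character with a digit.
                for digit in range(10):
                    candidate = name[:CMD_NAME_LIMIT - 1] + str(digit)
                    if candidate not in used:
                        break
                else:
                    continue  # assumed policy: no free suffix, drop the entry
            name = candidate
        used.add(name)
        # Rebuild the tuple so 2-tuples stay 2-tuples and 3-tuples keep cmd_key.
        result.append((name, *rest))
    return result
```

Because the extra fields travel inside the tuple, no side lookup keyed by the pre-truncation name is needed, which is the failure mode the class docstring describes.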
class TestDiscordSkillCmdKeyDispatch:
"""Integration: discord_skill_commands preserves cmd_key for long names.
This tests the full pipeline: skill_commands -> _collect_gateway_skill_entries
-> _clamp_command_names -> returned triples, verifying that skills with names
exceeding Discord's 32-char limit still have their original cmd_key for
dispatch.
"""
def test_long_skill_name_retains_cmd_key(self, tmp_path, monkeypatch):
from unittest.mock import patch
long_name = "this-is-a-very-long-skill-name-that-exceeds-limit"
cmd_key = f"/{long_name}"
fake_skills_dir = tmp_path / "skills"
fake_skills_dir.mkdir(exist_ok=True)
# Use resolved path — macOS /var → /private/var symlink
# causes SKILLS_DIR.resolve() to differ from tmp_path.
resolved_dir = str(fake_skills_dir.resolve())
fake_cmds = {
cmd_key: {
"name": long_name,
"description": "A skill with a long name",
"skill_md_path": f"{resolved_dir}/{long_name}/SKILL.md",
"skill_dir": f"{resolved_dir}/{long_name}",
},
}
with patch("agent.skill_commands.get_skill_commands", return_value=fake_cmds), \
patch("tools.skills_tool.SKILLS_DIR", fake_skills_dir), \
patch("agent.skill_utils.get_external_skills_dirs", return_value=[]):
entries, hidden = discord_skill_commands(
max_slots=100, reserved_names=set(),
)
assert len(entries) == 1
name, desc, key = entries[0]
assert len(name) <= _CMD_NAME_LIMIT, "Name should be clamped to 32 chars"
assert key == cmd_key, (
f"cmd_key must be the original /{long_name}, got {key!r}"
)
class TestTelegramMenuCommands:
"""Integration: telegram_menu_commands enforces the 32-char limit."""
@@ -1564,6 +1668,19 @@ class TestPluginCommandEnumeration:
names = {name for name, _desc in telegram_bot_commands()}
assert "metricas" in names
def test_plugin_command_with_required_args_excluded_from_telegram_menu(self, monkeypatch):
"""Telegram BotCommand selections cannot supply required arguments."""
self._patch_plugin_commands(monkeypatch, {
"background-job": {
"handler": lambda _a: "ok",
"description": "Run a background job",
"args_hint": "<prompt>",
"plugin": "jobs-plugin",
}
})
names = {name for name, _desc in telegram_bot_commands()}
assert "background_job" not in names
def test_plugin_command_appears_in_slack_subcommand_map(self, monkeypatch):
"""/hermes metricas must route through the Slack subcommand map."""
self._patch_plugin_commands(monkeypatch, {
@@ -273,6 +273,101 @@ class TestCaptureLogSnapshot:
assert "rotated agent data" in snap.full_text
# ---------------------------------------------------------------------------
# Capture log redaction (force=True applies regardless of HERMES_REDACT_SECRETS)
# ---------------------------------------------------------------------------
# A vendor-prefixed token used across redaction tests. Long enough to clear
# the redactor's `floor` parameter so it actually masks rather than fully blanks.
_REDACT_FIXTURE_TOKEN = "sk-proj-A1B2C3D4E5F6G7H8I9J0aA"
class TestCaptureLogSnapshotRedaction:
"""Pin upload-time redaction at the _capture_log_snapshot boundary."""
@pytest.fixture
def hermes_home_with_secret(self, tmp_path, monkeypatch):
"""Isolated HERMES_HOME whose agent.log contains a vendor-prefixed token."""
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setenv("HERMES_HOME", str(home))
# Critical: ensure the user has NOT opted in to redaction. The whole
# point of this PR is that share-time redaction works for users who
# never set this env var.
monkeypatch.delenv("HERMES_REDACT_SECRETS", raising=False)
logs_dir = home / "logs"
logs_dir.mkdir()
(logs_dir / "agent.log").write_text(
f"2026-04-12 17:00:00 INFO config: api_key={_REDACT_FIXTURE_TOKEN} loaded\n"
)
(logs_dir / "errors.log").write_text("")
(logs_dir / "gateway.log").write_text("")
return home
def test_default_redacts_tail_and_full_text(self, hermes_home_with_secret):
from hermes_cli.debug import _capture_log_snapshot
snap = _capture_log_snapshot("agent", tail_lines=10)
# Both views the upload uses must be sanitized.
assert _REDACT_FIXTURE_TOKEN not in snap.tail_text
assert snap.full_text is not None
assert _REDACT_FIXTURE_TOKEN not in snap.full_text
def test_redact_false_passes_through(self, hermes_home_with_secret):
from hermes_cli.debug import _capture_log_snapshot
snap = _capture_log_snapshot("agent", tail_lines=10, redact=False)
# Original token survives when the caller opts out.
assert _REDACT_FIXTURE_TOKEN in snap.tail_text
assert _REDACT_FIXTURE_TOKEN in (snap.full_text or "")
def test_force_true_overrides_unset_env_var(self, hermes_home_with_secret):
"""Regression test: redact_sensitive_text short-circuits without force=True.
If a future refactor drops `force=True` from `_redact_log_text`, this
test fails immediately. Without `force=True`, the redactor returns the
input unchanged when HERMES_REDACT_SECRETS is unset, and the feature
ships silently broken for its target audience.
"""
import os
from hermes_cli.debug import _capture_log_snapshot
# Belt-and-suspenders: confirm the env var is genuinely unset for this
# test so we know we're exercising the force=True path.
assert os.environ.get("HERMES_REDACT_SECRETS", "") == ""
snap = _capture_log_snapshot("agent", tail_lines=10)
assert _REDACT_FIXTURE_TOKEN not in snap.tail_text
assert snap.full_text is not None
assert _REDACT_FIXTURE_TOKEN not in snap.full_text
def test_capture_default_log_snapshots_threads_redact(
self, hermes_home_with_secret
):
from hermes_cli.debug import _capture_default_log_snapshots
snaps = _capture_default_log_snapshots(50)
# Default threads redact=True to all three captured logs.
assert _REDACT_FIXTURE_TOKEN not in snaps["agent"].tail_text
assert _REDACT_FIXTURE_TOKEN not in (snaps["agent"].full_text or "")
def test_capture_default_log_snapshots_no_redact_passes_through(
self, hermes_home_with_secret
):
from hermes_cli.debug import _capture_default_log_snapshots
snaps = _capture_default_log_snapshots(50, redact=False)
assert _REDACT_FIXTURE_TOKEN in snaps["agent"].tail_text
assert _REDACT_FIXTURE_TOKEN in (snaps["agent"].full_text or "")
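The `force=True` contract that the regression test guards can be illustrated with a small stand-in. Everything here is hypothetical: `redact` is not `redact_sensitive_text`, the `sk-` pattern and the mask format are simplifications, and the set of values that enable `HERMES_REDACT_SECRETS` is an assumption.

```python
import os
import re

# Simplified pattern for vendor-prefixed tokens such as "sk-proj-..." (assumption).
_TOKEN_RE = re.compile(r"\bsk-[A-Za-z0-9_-]{8,}")


def redact(text, force=False, env=None):
    """Hypothetical redactor: opt-in via env var unless the caller forces it."""
    env = os.environ if env is None else env
    enabled = env.get("HERMES_REDACT_SECRETS", "").strip().lower() in {"1", "true", "yes"}
    if not (force or enabled):
        # Short-circuit: without force, an unset env var means no redaction.
        return text
    # Keep a short prefix so the reader can tell which kind of key was masked.
    return _TOKEN_RE.sub(lambda m: m.group(0)[:6] + "***", text)
```

A share or upload path would call `redact(text, force=True)` so that users who never set the environment variable are still protected.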
# ---------------------------------------------------------------------------
# Debug report collection
# ---------------------------------------------------------------------------
@@ -556,6 +651,124 @@ class TestRunDebugShare:
assert "all failed" in out.err
# ---------------------------------------------------------------------------
# Share-time redaction wiring + visible banner
# ---------------------------------------------------------------------------
class TestRunDebugShareRedaction:
"""End-to-end: --no-redact flag, banner injection, default behavior."""
@pytest.fixture
def hermes_home_with_secret(self, tmp_path, monkeypatch):
"""Isolated HERMES_HOME whose agent.log contains a vendor-prefixed token."""
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setenv("HERMES_HOME", str(home))
monkeypatch.delenv("HERMES_REDACT_SECRETS", raising=False)
logs_dir = home / "logs"
logs_dir.mkdir()
(logs_dir / "agent.log").write_text(
f"2026-04-12 17:00:00 INFO config: api_key={_REDACT_FIXTURE_TOKEN} loaded\n"
)
(logs_dir / "errors.log").write_text("")
(logs_dir / "gateway.log").write_text(
f"2026-04-12 17:00:01 INFO gateway.run: token {_REDACT_FIXTURE_TOKEN}\n"
)
return home
def test_default_share_redacts_uploaded_content(
self, hermes_home_with_secret, capsys
):
"""The uploaded report and full-log pastes do not contain the raw token."""
from hermes_cli.debug import run_debug_share
args = MagicMock()
args.lines = 50
args.expire = 7
args.local = False
args.no_redact = False
captured: list[str] = []
def fake_upload(content, expiry_days=7):
captured.append(content)
return f"https://paste.rs/{len(captured)}"
with patch("hermes_cli.dump.run_dump"), \
patch("hermes_cli.debug._sweep_expired_pastes", return_value=(0, 0)), \
patch("hermes_cli.debug.upload_to_pastebin", side_effect=fake_upload):
run_debug_share(args)
# At least the report plus one full log paste reached the upload path.
assert len(captured) >= 2
for content in captured:
assert _REDACT_FIXTURE_TOKEN not in content, (
"raw token leaked into upload-bound content"
)
def test_default_share_includes_redaction_banner(
self, hermes_home_with_secret, capsys
):
"""Each upload-bound paste carries the visible redaction banner."""
from hermes_cli.debug import run_debug_share
args = MagicMock()
args.lines = 50
args.expire = 7
args.local = False
args.no_redact = False
captured: list[str] = []
def fake_upload(content, expiry_days=7):
captured.append(content)
return f"https://paste.rs/{len(captured)}"
with patch("hermes_cli.dump.run_dump"), \
patch("hermes_cli.debug._sweep_expired_pastes", return_value=(0, 0)), \
patch("hermes_cli.debug.upload_to_pastebin", side_effect=fake_upload):
run_debug_share(args)
for content in captured:
assert "redacted at upload time" in content, (
"redaction banner missing from upload-bound content"
)
def test_no_redact_flag_disables_redaction_and_banner(
self, hermes_home_with_secret, capsys
):
"""--no-redact preserves original log content and omits the banner."""
from hermes_cli.debug import run_debug_share
args = MagicMock()
args.lines = 50
args.expire = 7
args.local = False
args.no_redact = True
captured: list[str] = []
def fake_upload(content, expiry_days=7):
captured.append(content)
return f"https://paste.rs/{len(captured)}"
with patch("hermes_cli.dump.run_dump"), \
patch("hermes_cli.debug._sweep_expired_pastes", return_value=(0, 0)), \
patch("hermes_cli.debug.upload_to_pastebin", side_effect=fake_upload):
run_debug_share(args)
# The agent.log paste should now contain the raw token.
assert any(_REDACT_FIXTURE_TOKEN in c for c in captured), (
"expected raw token in --no-redact upload"
)
# No banner anywhere when redaction is disabled.
for content in captured:
assert "redacted at upload time" not in content, (
"banner present with --no-redact"
)
# ---------------------------------------------------------------------------
# run_debug router
# ---------------------------------------------------------------------------
@@ -169,3 +169,78 @@ def test_no_collision_no_warning(tmp_path: Path, caplog) -> None:
and ("clamp" in r.getMessage() or "reserved" in r.getMessage())
]
assert clamp_warnings == []
def test_long_skill_name_preserves_cmd_key_through_by_category(
tmp_path: Path,
) -> None:
"""Skills with names > 32 chars must keep their original cmd_key.
``discord_skill_commands_by_category`` clamps the display name to 32
chars but the third tuple element (cmd_key) must stay as the original
``/full-skill-name`` so that ``_skill_handler`` dispatches via
``_run_simple_slash`` with the full command, not the truncated one.
This is the actual runtime path used by the Discord adapter via
``_refresh_skill_catalog_state``.
"""
from hermes_cli.commands import discord_skill_commands_by_category
skills_dir = tmp_path / "skills"
skills_dir.mkdir()
resolved = str(skills_dir.resolve())
long_name = "generate-ascii-art-from-text-description-detailed"
cmd_key = f"/{long_name}"
fake_cmds = {
cmd_key: {
"name": long_name,
"description": "Generate ASCII art from a text description",
"skill_md_path": f"{resolved}/creative/{long_name}/SKILL.md",
"skill_dir": f"{resolved}/creative/{long_name}",
},
"/short-skill": {
"name": "short-skill",
"description": "A short skill",
"skill_md_path": f"{resolved}/creative/short-skill/SKILL.md",
"skill_dir": f"{resolved}/creative/short-skill",
},
}
with patch("agent.skill_commands.get_skill_commands", return_value=fake_cmds), \
patch("tools.skills_tool.SKILLS_DIR", skills_dir):
categories, uncategorized, hidden = discord_skill_commands_by_category(
reserved_names=set(),
)
# Flatten (same as _refresh_skill_catalog_state does)
entries = list(uncategorized)
for cat_skills in categories.values():
entries.extend(cat_skills)
# Build lookup (same as _refresh_skill_catalog_state does)
skill_lookup = {n: (d, k) for n, d, k in entries}
# Find the long skill
long_entry = [e for e in entries if e[2] == cmd_key]
assert len(long_entry) == 1, f"Long skill should appear once, got: {long_entry}"
display_name, desc, key = long_entry[0]
assert len(display_name) <= 32, (
f"Display name should be clamped to 32 chars, got {len(display_name)}"
)
assert key == cmd_key, (
f"cmd_key must be the original /{long_name}, got {key!r}"
)
# Verify lookup works: clamped display name -> original cmd_key
assert display_name in skill_lookup
_desc, looked_up_key = skill_lookup[display_name]
assert looked_up_key == cmd_key, (
f"Lookup must map clamped name to original cmd_key, got {looked_up_key!r}"
)
# Short skill should also be present and correct
short_entry = [e for e in entries if e[2] == "/short-skill"]
assert len(short_entry) == 1
assert short_entry[0][0] == "short-skill"
@@ -310,6 +310,10 @@ def test_find_gateway_pids_falls_back_to_pid_file_when_process_scan_fails(monkey
def fake_run(cmd, **kwargs):
if cmd[:4] == ["ps", "-A", "eww", "-o"]:
return SimpleNamespace(returncode=1, stdout="", stderr="ps failed")
if cmd[:3] == ["ps", "-o", "ppid="]:
# _get_ancestor_pids() walks up the tree; return "no parent" so
# the loop terminates cleanly.
return SimpleNamespace(returncode=1, stdout="", stderr="")
raise AssertionError(f"Unexpected command: {cmd}")
monkeypatch.setattr(gateway.subprocess, "run", fake_run)
@@ -902,12 +902,13 @@ def test_list_profiles_on_disk(tmp_path, monkeypatch):
"""list_profiles_on_disk returns directories under ~/.hermes/profiles/
that contain a config.yaml."""
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.delenv("HERMES_HOME", raising=False)
profiles = tmp_path / ".hermes" / "profiles"
profiles.mkdir(parents=True)
(profiles / "researcher").mkdir()
(profiles / "researcher" / "config.yaml").write_text("model: {}\n")
(profiles / "writer").mkdir()
(profiles / "writer" / "config.yaml").write_text("model: {}\n")
for name in ("researcher", "writer"):
d = profiles / name
d.mkdir()
(d / "config.yaml").write_text("model: {}\n")
(profiles / "empty_dir").mkdir()
# A stray file; should be ignored.
(profiles / "stray.txt").write_text("noise")
@@ -916,6 +917,20 @@ def test_list_profiles_on_disk(tmp_path, monkeypatch):
assert names == ["researcher", "writer"]
def test_list_profiles_on_disk_custom_root(tmp_path, monkeypatch):
"""list_profiles_on_disk respects a custom HERMES_HOME root."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
profiles = tmp_path / "profiles"
profiles.mkdir(parents=True)
for name in ("researcher", "writer"):
d = profiles / name
d.mkdir()
(d / "config.yaml").write_text("model: {}\n")
names = kb.list_profiles_on_disk()
assert names == ["researcher", "writer"]
def test_known_assignees_merges_disk_and_board(tmp_path, monkeypatch):
"""known_assignees unions profiles on disk with currently-assigned
names, and reports per-status counts."""
@@ -436,3 +436,279 @@ def test_tenant_propagates_to_events(kanban_home):
# The "created" event should have tenant in its payload.
created = [e for e in events if e.kind == "created"]
assert created and created[0].payload.get("tenant") == "biz-a"
# ---------------------------------------------------------------------------
# Shared-board path resolution (issue #19348)
#
# The kanban board is a cross-profile coordination primitive: a worker
# spawned with `hermes -p <profile>` must read/write the same kanban.db
# as the dispatcher that claimed the task. These tests exercise the
# path-resolution layer directly and would have caught the regression
# where `kanban_db_path()` resolved to the active profile's HERMES_HOME.
# ---------------------------------------------------------------------------
class TestSharedBoardPaths:
"""`kanban_home`/`kanban_db_path`/`workspaces_root`/`worker_log_path`
must anchor at the **shared root**, not the active profile's HERMES_HOME."""
def _set_home(self, monkeypatch, tmp_path, hermes_home):
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.delenv("HERMES_KANBAN_HOME", raising=False)
def test_default_install_anchors_at_home_dot_hermes(
self, tmp_path, monkeypatch
):
# Standard install: HERMES_HOME == ~/.hermes, no profile active.
default_home = tmp_path / ".hermes"
default_home.mkdir()
self._set_home(monkeypatch, tmp_path, default_home)
assert kb.kanban_home() == default_home
assert kb.kanban_db_path() == default_home / "kanban.db"
assert kb.workspaces_root() == default_home / "kanban" / "workspaces"
assert (
kb.worker_log_path("t_demo")
== default_home / "kanban" / "logs" / "t_demo.log"
)
def test_profile_worker_resolves_to_shared_root(
self, tmp_path, monkeypatch
):
# Reproduces the bug: dispatcher uses ~/.hermes/kanban.db,
# worker spawned with -p <profile> previously resolved to
# ~/.hermes/profiles/<profile>/kanban.db. After the fix both
# converge on ~/.hermes/kanban.db.
default_home = tmp_path / ".hermes"
default_home.mkdir()
profile_home = default_home / "profiles" / "nehemiahkanban"
profile_home.mkdir(parents=True)
self._set_home(monkeypatch, tmp_path, profile_home)
# All four resolvers must anchor at the shared root, not the
# profile-local HERMES_HOME.
assert kb.kanban_home() == default_home
assert kb.kanban_db_path() == default_home / "kanban.db"
assert kb.workspaces_root() == default_home / "kanban" / "workspaces"
assert (
kb.worker_log_path("t_0d214f19")
== default_home / "kanban" / "logs" / "t_0d214f19.log"
)
# Sanity: the profile-local path that used to be returned is
# explicitly NOT what we resolve to anymore.
assert kb.kanban_db_path() != profile_home / "kanban.db"
def test_dispatcher_and_profile_worker_converge(
self, tmp_path, monkeypatch
):
# End-to-end convergence: resolve the path under each side's
# HERMES_HOME and confirm equality. This is the property the
# dispatcher/worker handoff actually depends on.
default_home = tmp_path / ".hermes"
default_home.mkdir()
profile_home = default_home / "profiles" / "coder"
profile_home.mkdir(parents=True)
# Dispatcher's perspective.
self._set_home(monkeypatch, tmp_path, default_home)
dispatcher_db = kb.kanban_db_path()
dispatcher_ws = kb.workspaces_root()
dispatcher_log = kb.worker_log_path("t_handoff")
# Worker's perspective (profile activated by `hermes -p coder`).
monkeypatch.setenv("HERMES_HOME", str(profile_home))
worker_db = kb.kanban_db_path()
worker_ws = kb.workspaces_root()
worker_log = kb.worker_log_path("t_handoff")
assert dispatcher_db == worker_db
assert dispatcher_ws == worker_ws
assert dispatcher_log == worker_log
def test_docker_custom_hermes_home_uses_env_path_directly(
self, tmp_path, monkeypatch
):
# Docker / custom deployment: HERMES_HOME points outside ~/.hermes.
# `get_default_hermes_root()` returns env_home directly when it
# is not a `<root>/profiles/<name>` shape and not under
# `Path.home() / ".hermes"`.
custom_root = tmp_path / "opt" / "hermes"
custom_root.mkdir(parents=True)
self._set_home(monkeypatch, tmp_path, custom_root)
assert kb.kanban_home() == custom_root
assert kb.kanban_db_path() == custom_root / "kanban.db"
def test_docker_profile_layout_uses_grandparent(
self, tmp_path, monkeypatch
):
# Docker profile shape: HERMES_HOME=/opt/hermes/profiles/coder;
# `get_default_hermes_root()` walks up to /opt/hermes because
# the immediate parent dir is named "profiles".
custom_root = tmp_path / "opt" / "hermes"
profile = custom_root / "profiles" / "coder"
profile.mkdir(parents=True)
self._set_home(monkeypatch, tmp_path, profile)
assert kb.kanban_home() == custom_root
assert kb.kanban_db_path() == custom_root / "kanban.db"
def test_explicit_override_via_hermes_kanban_home(
self, tmp_path, monkeypatch
):
# Explicit override: HERMES_KANBAN_HOME beats every other
# resolution rule.
default_home = tmp_path / ".hermes"
profile_home = default_home / "profiles" / "any"
profile_home.mkdir(parents=True)
override = tmp_path / "shared-board"
override.mkdir()
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(profile_home))
monkeypatch.setenv("HERMES_KANBAN_HOME", str(override))
assert kb.kanban_home() == override
assert kb.kanban_db_path() == override / "kanban.db"
assert kb.workspaces_root() == override / "kanban" / "workspaces"
def test_empty_override_falls_through(self, tmp_path, monkeypatch):
# Empty/whitespace override is treated as unset.
default_home = tmp_path / ".hermes"
default_home.mkdir()
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(default_home))
monkeypatch.setenv("HERMES_KANBAN_HOME", " ")
assert kb.kanban_home() == default_home
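The resolution rules the tests above assert can be summarised in one function. This is a sketch reconstructed from the test comments, not the body of `kanban_home()` or `get_default_hermes_root()`: `shared_root` is a hypothetical name, and `env` and `home` are passed in explicitly (an assumption) so the logic can be exercised without touching the real environment.

```python
from pathlib import Path


def shared_root(env: dict, home: Path) -> Path:
    """Hypothetical sketch of the shared-board root resolution order."""
    # 1. Explicit override wins; empty or whitespace-only counts as unset.
    override = env.get("HERMES_KANBAN_HOME", "").strip()
    if override:
        return Path(override)
    default_root = home / ".hermes"
    raw = env.get("HERMES_HOME", "").strip()
    if not raw:
        return default_root
    hermes_home = Path(raw)
    # 2. Anything at or under ~/.hermes (including profiles) maps to ~/.hermes.
    if hermes_home == default_root or default_root in hermes_home.parents:
        return default_root
    # 3. Custom <root>/profiles/<name> layout: walk up to <root>.
    if hermes_home.parent.name == "profiles":
        return hermes_home.parent.parent
    # 4. Any other custom location is used directly.
    return hermes_home
```

With this order a dispatcher running under `~/.hermes` and a worker running under `~/.hermes/profiles/<name>` resolve to the same directory, which is the convergence property the handoff depends on.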
def test_dispatcher_and_worker_share_a_real_database(
self, tmp_path, monkeypatch
):
# Belt-and-suspenders: round-trip a task across the two
# HERMES_HOME perspectives via a real SQLite file. Without the
# fix the worker would open a different file and see no rows.
default_home = tmp_path / ".hermes"
default_home.mkdir()
profile_home = default_home / "profiles" / "nehemiahkanban"
profile_home.mkdir(parents=True)
# Dispatcher creates the board and a task.
self._set_home(monkeypatch, tmp_path, default_home)
kb.init_db()
with kb.connect() as conn:
task_id = kb.create_task(conn, title="cross-profile")
# Worker switches to the profile HERMES_HOME and reads.
monkeypatch.setenv("HERMES_HOME", str(profile_home))
with kb.connect() as conn:
task = kb.get_task(conn, task_id)
assert task is not None
assert task.title == "cross-profile"
def test_hermes_kanban_db_pin_beats_kanban_home(
self, tmp_path, monkeypatch
):
# HERMES_KANBAN_DB pins the file path directly and beats both
# HERMES_KANBAN_HOME and the `get_default_hermes_root()` path.
# This is the env the dispatcher injects into workers.
default_home = tmp_path / ".hermes"
default_home.mkdir()
umbrella = tmp_path / "umbrella"
umbrella.mkdir()
pinned_db = tmp_path / "pinned" / "board.db"
pinned_db.parent.mkdir()
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(default_home))
monkeypatch.setenv("HERMES_KANBAN_HOME", str(umbrella))
monkeypatch.setenv("HERMES_KANBAN_DB", str(pinned_db))
assert kb.kanban_db_path() == pinned_db
# workspaces_root still follows HERMES_KANBAN_HOME -- the pins
# are independent.
assert kb.workspaces_root() == umbrella / "kanban" / "workspaces"
def test_hermes_kanban_workspaces_root_pin_beats_kanban_home(
self, tmp_path, monkeypatch
):
# HERMES_KANBAN_WORKSPACES_ROOT pins the workspaces root directly.
default_home = tmp_path / ".hermes"
default_home.mkdir()
umbrella = tmp_path / "umbrella"
umbrella.mkdir()
pinned_ws = tmp_path / "pinned-workspaces"
pinned_ws.mkdir()
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(default_home))
monkeypatch.setenv("HERMES_KANBAN_HOME", str(umbrella))
monkeypatch.setenv("HERMES_KANBAN_WORKSPACES_ROOT", str(pinned_ws))
assert kb.workspaces_root() == pinned_ws
# kanban_db_path still follows HERMES_KANBAN_HOME.
assert kb.kanban_db_path() == umbrella / "kanban.db"
def test_empty_per_path_overrides_fall_through(
self, tmp_path, monkeypatch
):
# Empty/whitespace pins are treated as unset, same as
# HERMES_KANBAN_HOME.
default_home = tmp_path / ".hermes"
default_home.mkdir()
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(default_home))
monkeypatch.setenv("HERMES_KANBAN_DB", " ")
monkeypatch.setenv("HERMES_KANBAN_WORKSPACES_ROOT", "")
assert kb.kanban_db_path() == default_home / "kanban.db"
assert kb.workspaces_root() == default_home / "kanban" / "workspaces"
def test_dispatcher_spawn_injects_kanban_db_and_workspaces_root(
self, tmp_path, monkeypatch
):
# The dispatcher's `_default_spawn` must inject HERMES_KANBAN_DB
# and HERMES_KANBAN_WORKSPACES_ROOT into the worker env so the
# worker converges on the dispatcher's paths even when the
# `-p <profile>` flag rewrites HERMES_HOME.
default_home = tmp_path / ".hermes"
default_home.mkdir()
self._set_home(monkeypatch, tmp_path, default_home)
captured = {}
class _FakePopen:
def __init__(self, cmd, **kwargs):
captured["cmd"] = cmd
captured["env"] = kwargs.get("env", {})
self.pid = 4242
monkeypatch.setattr("subprocess.Popen", _FakePopen)
task = kb.Task(
id="t_dispatch_env",
title="x",
body=None,
assignee="coder",
status="ready",
priority=0,
created_by=None,
created_at=0,
started_at=None,
completed_at=None,
workspace_kind="scratch",
workspace_path=None,
claim_lock=None,
claim_expires=None,
tenant=None,
)
kb._default_spawn(task, str(tmp_path / "ws"))
env = captured["env"]
assert env["HERMES_KANBAN_DB"] == str(default_home / "kanban.db")
assert env["HERMES_KANBAN_WORKSPACES_ROOT"] == str(
default_home / "kanban" / "workspaces"
)
assert env["HERMES_KANBAN_TASK"] == "t_dispatch_env"
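The precedence these tests pin down can be summarized in a minimal sketch. The names `resolve_kanban_paths`, `default_root`, and `KanbanPaths` are hypothetical and are not the `kb` module's actual API; only the environment variable names and the expected paths come from the tests above.

```python
from pathlib import Path
from typing import Mapping, NamedTuple, Optional


class KanbanPaths(NamedTuple):
    home: Path
    db: Path
    workspaces: Path


def _pin(env: Mapping[str, str], key: str) -> Optional[Path]:
    """Return the env value as a Path; empty or whitespace-only counts as unset."""
    raw = env.get(key, "").strip()
    return Path(raw) if raw else None


def default_root(hermes_home: Path) -> Path:
    """Hypothetical stand-in for the root lookup the tests describe."""
    # A profile home looks like <root>/profiles/<name>; walk up to <root>.
    if hermes_home.parent.name == "profiles":
        return hermes_home.parent.parent
    return hermes_home


def resolve_kanban_paths(env: Mapping[str, str], hermes_home: Path) -> KanbanPaths:
    # HERMES_KANBAN_HOME overrides the derived root.
    home = _pin(env, "HERMES_KANBAN_HOME") or default_root(hermes_home)
    # The per-path pins are independent and each beats HERMES_KANBAN_HOME.
    db = _pin(env, "HERMES_KANBAN_DB") or home / "kanban.db"
    workspaces = _pin(env, "HERMES_KANBAN_WORKSPACES_ROOT") or home / "kanban" / "workspaces"
    return KanbanPaths(home, db, workspaces)
```

Under this reading, a dispatcher that exports `HERMES_KANBAN_DB` and `HERMES_KANBAN_WORKSPACES_ROOT` to its workers makes them land on the same files regardless of which `HERMES_HOME` the worker sees.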
+9 -1
@@ -508,7 +508,7 @@ class TestPromptPluginEnvVars:
class TestCursesRadiolist:
"""Test the curses_radiolist function (non-TTY fallback path)."""
"""Test the curses_radiolist function."""
def test_non_tty_returns_default(self):
from hermes_cli.curses_ui import curses_radiolist
@@ -524,6 +524,14 @@ class TestCursesRadiolist:
result = curses_radiolist("Pick", ["x", "y"], selected=0, cancel_returns=1)
assert result == 1
def test_keyboard_interrupt_returns_cancel_value(self):
from hermes_cli.curses_ui import curses_radiolist
with patch("sys.stdin") as mock_stdin, patch("curses.wrapper", side_effect=KeyboardInterrupt):
mock_stdin.isatty.return_value = True
result = curses_radiolist("Pick", ["x", "y"], selected=0, cancel_returns=-1)
assert result == -1
# ── Provider discovery helpers ───────────────────────────────────────────
+387
@@ -0,0 +1,387 @@
"""Tests for the ``hermes send`` CLI subcommand.
Covers the argument parsing / stdin / file / list behavior of
``hermes_cli.send_cmd``. The underlying ``send_message_tool`` is stubbed so
no network I/O or gateway is required.
"""
from __future__ import annotations
import io
import json
from pathlib import Path
import pytest
from hermes_cli import send_cmd
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _parse(argv):
"""Build the top-level parser and return the parsed args for ``argv``."""
import argparse
parser = argparse.ArgumentParser(prog="hermes")
subparsers = parser.add_subparsers(dest="command")
send_cmd.register_send_subparser(subparsers)
return parser.parse_args(["send", *argv])
class _FakeTool:
"""Replacement for ``tools.send_message_tool.send_message_tool``."""
def __init__(self, payload):
self.payload = payload
self.calls = []
def __call__(self, args, **_kw):
self.calls.append(dict(args))
return json.dumps(self.payload)
@pytest.fixture
def fake_tool(monkeypatch):
"""Install a fake send_message_tool and return the stub for inspection."""
import sys
import types
fake = _FakeTool({"success": True, "message_id": "m123"})
mod = types.ModuleType("tools.send_message_tool")
mod.send_message_tool = fake
# Register the stub in ``sys.modules`` so that
# ``from tools.send_message_tool import ...`` inside cmd_send resolves to
# our fake instead of the real tool.
monkeypatch.setitem(sys.modules, "tools.send_message_tool", mod)
return fake
# ---------------------------------------------------------------------------
# Happy path
# ---------------------------------------------------------------------------
def test_positional_message_success(fake_tool, capsys):
args = _parse(["--to", "telegram", "hello world"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 0
assert fake_tool.calls == [
{"action": "send", "target": "telegram", "message": "hello world"}
]
out = capsys.readouterr()
assert "sent" in out.out or out.out == "" # "sent" is the default success banner
def test_stdin_message(fake_tool, monkeypatch, capsys):
# Piped stdin (not a tty) should be consumed as the message body.
monkeypatch.setattr("sys.stdin", io.StringIO("piped body\n"))
# Force isatty to return False so the CLI reads from stdin.
monkeypatch.setattr("sys.stdin.isatty", lambda: False)
args = _parse(["--to", "discord:#ops"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 0
assert fake_tool.calls[0]["message"] == "piped body\n"
assert fake_tool.calls[0]["target"] == "discord:#ops"
def test_file_message(fake_tool, tmp_path):
body = tmp_path / "msg.txt"
body.write_text("from a file\n")
args = _parse(["--to", "slack:#eng", "--file", str(body)])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 0
assert fake_tool.calls[0]["message"] == "from a file\n"
def test_file_dash_means_stdin(fake_tool, monkeypatch):
monkeypatch.setattr("sys.stdin", io.StringIO("dash body"))
args = _parse(["--to", "telegram", "--file", "-"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 0
assert fake_tool.calls[0]["message"] == "dash body"
def test_subject_prepends_header(fake_tool):
args = _parse(["--to", "telegram", "--subject", "[CI]", "body text"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 0
assert fake_tool.calls[0]["message"] == "[CI]\n\nbody text"
def test_json_mode_emits_payload(fake_tool, capsys):
args = _parse(["--to", "telegram", "--json", "hi"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 0
out = capsys.readouterr().out
payload = json.loads(out)
assert payload.get("success") is True
assert payload.get("message_id") == "m123"
def test_quiet_suppresses_stdout(fake_tool, capsys):
args = _parse(["--to", "telegram", "--quiet", "shh"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 0
out = capsys.readouterr()
assert out.out == ""
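The happy-path tests above imply a body-resolution order that can be sketched as follows. This is an illustration only: `resolve_body` is a hypothetical helper, not a function in `send_cmd`, and the choice to let a positional message win over `--file` is an assumption the tests do not confirm.

```python
import io
from typing import Optional, TextIO


def resolve_body(
    message: Optional[str],
    file: Optional[str],
    stdin: TextIO,
    stdin_is_tty: bool,
    subject: Optional[str] = None,
) -> Optional[str]:
    """Pick the message body; return None when no source is available."""
    if message is not None:
        body = message                      # positional argument (assumed to win)
    elif file == "-":
        body = stdin.read()                 # "--file -" means read stdin
    elif file is not None:
        with open(file, encoding="utf-8") as fh:
            body = fh.read()
    elif not stdin_is_tty:
        body = stdin.read()                 # piped input
    else:
        return None                         # caller reports a usage error
    if subject:
        # The subject is prepended as a header followed by a blank line.
        body = f"{subject}\n\n{body}"
    return body


if __name__ == "__main__":
    print(resolve_body("body text", None, io.StringIO(""), True, subject="[CI]"))
```

Passing the stream and the TTY flag as parameters keeps the helper testable without patching `sys.stdin`.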
# ---------------------------------------------------------------------------
# Error paths
# ---------------------------------------------------------------------------
def test_missing_target(fake_tool, capsys, monkeypatch):
# Ensure stdin is a tty so the CLI does not try to consume it as a body.
monkeypatch.setattr("sys.stdin.isatty", lambda: True)
args = _parse(["hello"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 2
err = capsys.readouterr().err
assert "--to" in err
def test_missing_message(fake_tool, capsys, monkeypatch):
monkeypatch.setattr("sys.stdin.isatty", lambda: True)
args = _parse(["--to", "telegram"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 2
err = capsys.readouterr().err
assert "no message" in err.lower()
def test_file_not_found_is_usage_error(fake_tool, capsys, monkeypatch):
monkeypatch.setattr("sys.stdin.isatty", lambda: True)
args = _parse(["--to", "telegram", "--file", "/nonexistent/does-not-exist.txt"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 2
err = capsys.readouterr().err
assert "cannot read" in err.lower()
def test_tool_error_returns_failure_exit(monkeypatch, capsys):
import sys as _sys
import types as _types
fake_mod = _types.ModuleType("tools.send_message_tool")
def _bad_tool(args, **_kw):
return json.dumps({"error": "platform blew up"})
fake_mod.send_message_tool = _bad_tool
monkeypatch.setitem(_sys.modules, "tools.send_message_tool", fake_mod)
args = _parse(["--to", "telegram", "nope"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 1
err = capsys.readouterr().err
assert "platform blew up" in err
def test_skipped_result_is_success(monkeypatch):
import sys as _sys
import types as _types
fake_mod = _types.ModuleType("tools.send_message_tool")
fake_mod.send_message_tool = lambda args, **_kw: json.dumps(
{"success": True, "skipped": True, "reason": "duplicate"}
)
monkeypatch.setitem(_sys.modules, "tools.send_message_tool", fake_mod)
args = _parse(["--to", "telegram", "dup"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 0
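The exit-code contract exercised above (0 success, 1 delivery failure, 2 usage error) might be implemented along these lines. `exit_code_for` and the constant names are hypothetical, and treating an unparsable result as a failure is an assumption rather than documented behavior.

```python
import json

EXIT_OK = 0        # delivered, or skipped as a duplicate
EXIT_FAILURE = 1   # the tool reported an error
EXIT_USAGE = 2     # bad arguments; decided before the tool is called


def exit_code_for(raw_result: str) -> int:
    """Translate the tool's JSON result string into a process exit code."""
    try:
        payload = json.loads(raw_result)
    except (TypeError, ValueError):
        return EXIT_FAILURE                 # assumption: garbage counts as failure
    if not isinstance(payload, dict) or payload.get("error"):
        return EXIT_FAILURE
    return EXIT_OK
```

A skipped result carries no `error` key, so it maps to success, which matches `test_skipped_result_is_success`.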
# ---------------------------------------------------------------------------
# --list
# ---------------------------------------------------------------------------
def test_list_human_output(monkeypatch, capsys):
import sys as _sys
import types as _types
fake_dir = _types.ModuleType("gateway.channel_directory")
fake_dir.format_directory_for_display = lambda: "Available messaging targets:\n\nTelegram:\n telegram:-100123\n"
fake_dir.load_directory = lambda: {
"platforms": {"telegram": [{"id": "-100123", "name": "Test Group"}]}
}
monkeypatch.setitem(_sys.modules, "gateway.channel_directory", fake_dir)
args = _parse(["--list"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 0
out = capsys.readouterr().out
assert "Telegram" in out
def test_list_json(monkeypatch, capsys):
import sys as _sys
import types as _types
fake_dir = _types.ModuleType("gateway.channel_directory")
fake_dir.format_directory_for_display = lambda: "(ignored in json mode)"
fake_dir.load_directory = lambda: {
"platforms": {"telegram": [{"id": "-100123", "name": "Test Group"}]}
}
monkeypatch.setitem(_sys.modules, "gateway.channel_directory", fake_dir)
args = _parse(["--list", "--json"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 0
out = capsys.readouterr().out
payload = json.loads(out)
assert payload["platforms"]["telegram"][0]["name"] == "Test Group"
def test_list_filter_platform(monkeypatch, capsys):
import sys as _sys
import types as _types
fake_dir = _types.ModuleType("gateway.channel_directory")
fake_dir.format_directory_for_display = lambda: "(should not be called when filter set)"
fake_dir.load_directory = lambda: {
"platforms": {
"telegram": [{"id": "-100123", "name": "TG Chat"}],
"discord": [{"id": "555", "name": "bot-home"}],
}
}
monkeypatch.setitem(_sys.modules, "gateway.channel_directory", fake_dir)
# When --list is set, argparse puts the optional bareword in the
# `message` positional slot (where the send-mode body would go).
args = _parse(["--list", "telegram"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 0
out = capsys.readouterr().out
assert "telegram" in out.lower()
assert "discord" not in out.lower()
def test_list_unknown_platform_fails(monkeypatch, capsys):
import sys as _sys
import types as _types
fake_dir = _types.ModuleType("gateway.channel_directory")
fake_dir.format_directory_for_display = lambda: ""
fake_dir.load_directory = lambda: {"platforms": {"telegram": []}}
monkeypatch.setitem(_sys.modules, "gateway.channel_directory", fake_dir)
args = _parse(["--list", "pigeon-post"])
with pytest.raises(SystemExit) as exc:
send_cmd.cmd_send(args)
assert exc.value.code == 1
err = capsys.readouterr().err
assert "pigeon-post" in err
# ---------------------------------------------------------------------------
# Parser registration contract
# ---------------------------------------------------------------------------
def test_register_send_subparser_is_reusable():
"""Sanity check: the registrar returns a parser and wires ``cmd_send``."""
import argparse
parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers(dest="command")
send_parser = send_cmd.register_send_subparser(subparsers)
assert send_parser is not None
args = parser.parse_args(["send", "--to", "telegram", "hi"])
assert args.func is send_cmd.cmd_send
assert args.to == "telegram"
assert args.message == "hi"
# ---------------------------------------------------------------------------
# Env loader
# ---------------------------------------------------------------------------
def test_load_hermes_env_bridges_config_yaml_scalars(tmp_path, monkeypatch):
"""Top-level config.yaml scalars should be bridged into os.environ.
This mirrors the gateway/run.py bootstrap behavior: without this, running
``hermes send`` from a fresh shell cannot resolve the home channel
because ``TELEGRAM_HOME_CHANNEL`` (saved by ``hermes config set``) lives
in config.yaml, not in .env, and the gateway's config loader reads it via
``os.getenv(...)``.
"""
import os
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
(hermes_home / ".env").write_text("SOME_TOKEN=abc123\n")
(hermes_home / "config.yaml").write_text(
"TELEGRAM_HOME_CHANNEL: '5550001111'\nnested:\n ignored: true\n"
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.delenv("TELEGRAM_HOME_CHANNEL", raising=False)
monkeypatch.delenv("SOME_TOKEN", raising=False)
# Force get_hermes_home() to re-resolve under the patched env.
from importlib import reload
import hermes_cli.config as _hc_config
reload(_hc_config)
send_cmd._load_hermes_env()
assert os.environ.get("SOME_TOKEN") == "abc123"
assert os.environ.get("TELEGRAM_HOME_CHANNEL") == "5550001111"
def test_load_hermes_env_does_not_override_existing(tmp_path, monkeypatch):
"""Existing env vars must not be clobbered by config.yaml values."""
import os
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
(hermes_home / "config.yaml").write_text("TELEGRAM_HOME_CHANNEL: yaml_value\n")
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.setenv("TELEGRAM_HOME_CHANNEL", "env_value")
from importlib import reload
import hermes_cli.config as _hc_config
reload(_hc_config)
send_cmd._load_hermes_env()
assert os.environ.get("TELEGRAM_HOME_CHANNEL") == "env_value"
def test_load_hermes_env_handles_missing_files(tmp_path, monkeypatch):
"""No .env or config.yaml should be a silent no-op, not an exception."""
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
from importlib import reload
import hermes_cli.config as _hc_config
reload(_hc_config)
# Should not raise.
send_cmd._load_hermes_env()
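The bridging rule these three tests describe (top-level scalars only, existing variables win) can be sketched without any YAML dependency by operating on an already-parsed mapping. `bridge_scalars` is a hypothetical name, and skipping booleans is an assumption; the real `_load_hermes_env` is not shown in this diff.

```python
from typing import Any, Mapping, MutableMapping


def bridge_scalars(config: Mapping[str, Any], environ: MutableMapping[str, str]) -> None:
    """Copy top-level scalar config values into environ without overriding."""
    for key, value in config.items():
        # Skip nested mappings, lists and None. Booleans are skipped too
        # (assumption), since their string form is ambiguous.
        if isinstance(value, bool) or not isinstance(value, (str, int, float)):
            continue
        # setdefault leaves a value that is already present untouched.
        environ.setdefault(key, str(value))
```

In real use the second argument would be `os.environ`; passing a plain dict keeps the example side-effect free.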
+32
@@ -613,3 +613,35 @@ def test_offer_launch_chat_falls_back_to_module(monkeypatch):
setup_mod._offer_launch_chat()
assert exec_calls == [(sys.executable, [sys.executable, "-m", "hermes_cli.main", "chat"])]
def test_setup_slack_saves_home_channel(monkeypatch):
"""_setup_slack() saves SLACK_HOME_CHANNEL when the user provides one."""
saved = {}
prompts = iter(["xoxb-test-token", "xapp-test-token", "", "C01ABC2DE3F"])
monkeypatch.setattr(setup_mod, "get_env_value", lambda key: "")
monkeypatch.setattr(setup_mod, "save_env_value", lambda k, v: saved.update({k: v}))
monkeypatch.setattr(setup_mod, "prompt", lambda *_a, **_kw: next(prompts))
monkeypatch.setattr(setup_mod, "prompt_yes_no", lambda *_a, **_kw: False)
monkeypatch.setattr(setup_mod, "_write_slack_manifest_and_instruct", lambda: None)
setup_mod._setup_slack()
assert saved.get("SLACK_HOME_CHANNEL") == "C01ABC2DE3F"
def test_setup_slack_home_channel_empty_not_saved(monkeypatch):
"""_setup_slack() does not save SLACK_HOME_CHANNEL when left blank."""
saved = {}
prompts = iter(["xoxb-test-token", "xapp-test-token", "", ""])
monkeypatch.setattr(setup_mod, "get_env_value", lambda key: "")
monkeypatch.setattr(setup_mod, "save_env_value", lambda k, v: saved.update({k: v}))
monkeypatch.setattr(setup_mod, "prompt", lambda *_a, **_kw: next(prompts))
monkeypatch.setattr(setup_mod, "prompt_yes_no", lambda *_a, **_kw: False)
monkeypatch.setattr(setup_mod, "_write_slack_manifest_and_instruct", lambda: None)
setup_mod._setup_slack()
assert "SLACK_HOME_CHANNEL" not in saved
+33
@@ -69,6 +69,39 @@ def test_no_install_when_only_optional_peer_package_missing_from_hidden_lock(tmp
assert main_mod._tui_need_npm_install(tmp_path) is False
def test_no_install_when_only_peer_annotation_differs(tmp_path: Path, main_mod) -> None:
"""npm 9 drops the ``peer`` flag from the hidden lock on dev-deps that are
*also* declared as peers. That's a cosmetic difference — the package is
installed at the requested version so it must not trigger a reinstall.
Regression for the TUI-in-Docker failure where 16 such mismatches caused
`Installing TUI dependencies` EACCES on every launch.
"""
_touch_ink(tmp_path)
(tmp_path / "package-lock.json").write_text(
'{"packages":{'
'"node_modules/foo":{"version":"1.0.0","dev":true,"peer":true,"resolved":"https://x/foo.tgz"}'
'}}'
)
(tmp_path / "node_modules" / ".package-lock.json").write_text(
'{"packages":{'
'"node_modules/foo":{"version":"1.0.0","dev":true,"resolved":"https://x/foo.tgz"}'
'}}'
)
assert main_mod._tui_need_npm_install(tmp_path) is False
def test_install_when_version_differs_even_with_peer_drop(tmp_path: Path, main_mod) -> None:
"""The peer-drop tolerance must not mask a real version skew."""
_touch_ink(tmp_path)
(tmp_path / "package-lock.json").write_text(
'{"packages":{"node_modules/foo":{"version":"2.0.0","dev":true,"peer":true}}}'
)
(tmp_path / "node_modules" / ".package-lock.json").write_text(
'{"packages":{"node_modules/foo":{"version":"1.0.0","dev":true}}}'
)
assert main_mod._tui_need_npm_install(tmp_path) is True
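One way to get the tolerance these two tests require is to compare lock entries after dropping the cosmetic `peer` flag. The helper below is a hypothetical sketch of that single comparison; it is not `_tui_need_npm_install`, and it does not model the missing-package handling covered by the neighbouring tests.

```python
from typing import Any, Iterable, Mapping


def entries_match(
    declared: Mapping[str, Any],
    installed: Mapping[str, Any],
    ignored: Iterable[str] = ("peer",),
) -> bool:
    """Compare two lock entries while ignoring cosmetic annotation keys."""
    skip = set(ignored)

    def strip(entry: Mapping[str, Any]) -> dict:
        return {k: v for k, v in entry.items() if k not in skip}

    # Anything else, including "version", must still agree exactly.
    return strip(declared) == strip(installed)
```

Because only the listed keys are ignored, a real version skew still registers as a mismatch.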
def test_no_install_when_lock_older_than_marker(tmp_path: Path, main_mod) -> None:
_touch_ink(tmp_path)
(tmp_path / "package-lock.json").write_text("{}")
+6
@@ -432,6 +432,8 @@ class TestPreflightCompression:
ok_resp = _mock_response(content="After preflight", finish_reason="stop")
agent.client.chat.completions.create.side_effect = [ok_resp]
status_messages = []
agent.status_callback = lambda ev, msg: status_messages.append((ev, msg))
with (
patch.object(agent, "_compress_context") as mock_compress,
@@ -460,6 +462,10 @@ class TestPreflightCompression:
)
assert result["completed"] is True
assert result["final_response"] == "After preflight"
assert any(
ev == "lifecycle" and "Preflight compression" in msg
for ev, msg in status_messages
)
def test_no_preflight_when_under_threshold(self, agent):
"""When history fits within context, no preflight compression needed."""
+77
@@ -2181,6 +2181,83 @@ class TestHandleMaxIterations:
kwargs = agent.client.chat.completions.create.call_args.kwargs
assert "reasoning" not in kwargs.get("extra_body", {})
def test_summary_request_removes_orphan_tool_result(self, agent):
"""Regression: max-iterations summary request must NOT contain
orphan tool results (tool_call_id with no matching assistant tool_call)."""
resp = _mock_response(content="Summary of work done.")
agent.client.chat.completions.create.return_value = resp
agent._cached_system_prompt = "You are helpful."
messages = [
{"role": "user", "content": "Analyze finance-data-router"},
{"role": "assistant", "content": "[Session Arc Summary] ..."},
{"role": "tool", "tool_call_id": "call_cfedFhJjGmu1RvRc1OUC38j8", "content": "file content here"},
{"role": "assistant", "tool_calls": [{"id": "call_8fXBXsT592Vpvm7wnW4obPEu", "function": {"name": "patch", "arguments": "{}"}}]},
{"role": "tool", "tool_call_id": "call_8fXBXsT592Vpvm7wnW4obPEu", "content": "patch result"},
{"role": "assistant", "content": "Done."},
]
result = agent._handle_max_iterations(messages, 120)
assert result == "Summary of work done."
kwargs = agent.client.chat.completions.create.call_args.kwargs
sent_msgs = kwargs.get("messages", [])
orphan_ids = [
m.get("tool_call_id") for m in sent_msgs
if m.get("role") == "tool" and m.get("tool_call_id") == "call_cfedFhJjGmu1RvRc1OUC38j8"
]
assert len(orphan_ids) == 0, f"Orphan tool result still present: {orphan_ids}"
def test_summary_request_inserts_stub_for_missing_tool_result(self, agent):
"""If an assistant tool_call has no matching tool result in the
summary request, a stub must be inserted to satisfy the API contract."""
resp = _mock_response(content="Summary")
agent.client.chat.completions.create.return_value = resp
agent._cached_system_prompt = "You are helpful."
messages = [
{"role": "user", "content": "do stuff"},
{"role": "assistant", "tool_calls": [{"id": "call_no_result", "function": {"name": "terminal", "arguments": "{}"}}]},
{"role": "assistant", "content": "Continuing..."},
]
result = agent._handle_max_iterations(messages, 60)
assert result == "Summary"
kwargs = agent.client.chat.completions.create.call_args.kwargs
sent_msgs = kwargs.get("messages", [])
stub_ids = [
m.get("tool_call_id") for m in sent_msgs
if m.get("role") == "tool" and m.get("tool_call_id") == "call_no_result"
]
assert len(stub_ids) >= 1, f"No stub result for assistant tool_call: {stub_ids}"
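The two regressions above amount to restoring a one-to-one pairing between assistant tool calls and tool results. A minimal sketch of such a pass follows; `sanitize_tool_pairs` and the stub text are hypothetical, and the agent's actual sanitizer may differ in ordering and wording.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def sanitize_tool_pairs(
    messages: List[Message], stub_text: str = "[tool result unavailable]"
) -> List[Message]:
    """Drop orphan tool results and add stubs for unanswered tool calls."""
    # IDs issued by assistant tool calls.
    called = {
        tc["id"]
        for m in messages
        if m.get("role") == "assistant"
        for tc in m.get("tool_calls") or []
    }
    # IDs that already have a matching tool result.
    answered = {
        m.get("tool_call_id")
        for m in messages
        if m.get("role") == "tool" and m.get("tool_call_id") in called
    }
    cleaned: List[Message] = []
    for m in messages:
        if m.get("role") == "tool" and m.get("tool_call_id") not in called:
            continue  # orphan: no assistant tool_call carries this ID
        cleaned.append(m)
        if m.get("role") == "assistant":
            for tc in m.get("tool_calls") or []:
                if tc["id"] not in answered:
                    # Stub goes right after the call that lacks a result.
                    cleaned.append(
                        {"role": "tool", "tool_call_id": tc["id"], "content": stub_text}
                    )
    return cleaned
```

The function returns a new list and leaves the input untouched, so the full history stays available to the caller.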
def test_summary_omits_provider_preferences_for_non_openrouter(self, agent):
agent.base_url = "https://api.openai.com/v1"
agent._base_url_lower = agent.base_url.lower()
agent.provider = "openai"
agent.providers_allowed = ["Anthropic"]
agent.client.chat.completions.create.return_value = _mock_response(content="Summary")
agent._cached_system_prompt = "You are helpful."
result = agent._handle_max_iterations([{"role": "user", "content": "do stuff"}], 60)
assert result == "Summary"
kwargs = agent.client.chat.completions.create.call_args.kwargs
assert "provider" not in kwargs.get("extra_body", {})
def test_summary_keeps_provider_preferences_for_openrouter(self, agent):
agent.base_url = "https://openrouter.ai/api/v1"
agent._base_url_lower = agent.base_url.lower()
agent.provider = "openrouter"
agent.providers_allowed = ["Anthropic"]
agent.client.chat.completions.create.return_value = _mock_response(content="Summary")
agent._cached_system_prompt = "You are helpful."
result = agent._handle_max_iterations([{"role": "user", "content": "do stuff"}], 60)
assert result == "Summary"
kwargs = agent.client.chat.completions.create.call_args.kwargs
assert kwargs["extra_body"]["provider"]["only"] == ["Anthropic"]
def test_codex_summary_sanitizes_orphan_tool_results(self, agent):
agent.api_mode = "codex_responses"
agent.provider = "openai-codex"
+47
@@ -2403,5 +2403,52 @@ class TestSubagentApprovalCallback(unittest.TestCase):
self.assertIsNone(_get_approval_callback())
class TestFallbackModelInheritance(unittest.TestCase):
"""Subagents must inherit the parent's fallback provider chain."""
def test_child_inherits_fallback_chain(self):
"""_build_child_agent passes parent._fallback_chain as fallback_model."""
parent = _make_mock_parent(depth=0)
fallback_entry = {"provider": "openrouter", "model": "gpt-4o-mini", "api_key": "sk-or-x"}
parent._fallback_chain = [fallback_entry]
with patch("run_agent.AIAgent") as MockAgent:
MockAgent.return_value = MagicMock()
_build_child_agent(
task_index=0,
goal="test fallback inheritance",
context=None,
toolsets=None,
model=None,
max_iterations=10,
parent_agent=parent,
task_count=1,
)
_, kwargs = MockAgent.call_args
self.assertEqual(kwargs["fallback_model"], [fallback_entry])
def test_child_gets_no_fallback_when_parent_chain_empty(self):
"""When parent._fallback_chain is empty, fallback_model is None."""
parent = _make_mock_parent(depth=0)
parent._fallback_chain = []
with patch("run_agent.AIAgent") as MockAgent:
MockAgent.return_value = MagicMock()
_build_child_agent(
task_index=0,
goal="test no fallback",
context=None,
toolsets=None,
model=None,
max_iterations=10,
parent_agent=parent,
task_count=1,
)
_, kwargs = MockAgent.call_args
self.assertIsNone(kwargs["fallback_model"])
if __name__ == "__main__":
unittest.main()
+52
@@ -271,6 +271,58 @@ class TestShellFileOpsHelpers:
ops = ShellFileOperations(env)
assert ops.cwd == "/"
def test_read_file_strips_leaked_terminal_fence_markers(self, mock_env):
leaked = (
"'\x07__HERMES_FENCE_a9f7b3__\x1b]0;cat "
"'/tmp/test/a.py' 2> /dev/null\x07\n"
"print('ok')\n"
"__HERMES_FENCE_a9f7b3__\x07'\n"
)
def side_effect(command, **kwargs):
if command.startswith("wc -c"):
return {"output": "12\n", "returncode": 0}
if command.startswith("head -c"):
return {"output": "print('ok')\n", "returncode": 0}
if command.startswith("sed -n"):
return {"output": leaked, "returncode": 0}
if command.startswith("wc -l"):
return {"output": "1\n", "returncode": 0}
return {"output": "", "returncode": 0}
mock_env.execute.side_effect = side_effect
ops = ShellFileOperations(mock_env)
result = ops.read_file("/tmp/test/a.py")
assert result.error is None
assert "HERMES_FENCE" not in result.content
assert "\x1b]" not in result.content
assert "\x07" not in result.content
assert " 1|print('ok')" in result.content
def test_read_file_raw_strips_leaked_terminal_fence_markers(self, mock_env):
leaked = (
"__HERMES_FENCE_a9f7b3__\x07'\n"
"alpha\n"
"\x1b]0;cat '/tmp/test/a.txt'\x07__HERMES_FENCE_a9f7b3__\n"
)
def side_effect(command, **kwargs):
if command.startswith("wc -c"):
return {"output": "6\n", "returncode": 0}
if command.startswith("head -c"):
return {"output": "alpha\n", "returncode": 0}
if command.startswith("cat "):
return {"output": leaked, "returncode": 0}
return {"output": "", "returncode": 0}
mock_env.execute.side_effect = side_effect
ops = ShellFileOperations(mock_env)
result = ops.read_file_raw("/tmp/test/a.txt")
assert result.error is None
assert result.content == "alpha\n"
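For the two leaked-marker fixtures above, one cleanup strategy that produces the expected content is to remove OSC title sequences first and then drop every line that still carries a fence marker. This is a sketch under that assumption: `strip_leaked_markers` and both regular expressions are hypothetical, and `ShellFileOperations` may clean its output differently.

```python
import re

# OSC sequence: ESC ] ... terminated by BEL (e.g. terminal title updates).
_OSC = re.compile(r"\x1b\][^\x07]*\x07")
# Fence marker shape taken from the fixtures; the ID alphabet is an assumption.
_FENCE = re.compile(r"__HERMES_FENCE_[0-9A-Za-z]+__")


def strip_leaked_markers(text: str) -> str:
    """Remove OSC sequences, fence-marker lines and stray BEL characters."""
    text = _OSC.sub("", text)
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not _FENCE.search(line)
    ]
    return "".join(kept).replace("\x07", "")
```

Dropping whole lines is a blunt choice: a file whose real content contains the marker text would lose those lines, so a production version would likely match the exact per-call fence ID instead.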
class TestSearchPathValidation:
"""Test that search() returns an error for non-existent paths."""
+38
@@ -104,6 +104,44 @@ class TestWriteFileHandler:
assert result["error"] == "boom"
assert any("write_file error" in r.getMessage() for r in caplog.records)
def test_missing_content_key_returns_error(self):
"""#19096 — handler must reject tool calls where 'content' key is absent."""
from tools.file_tools import _handle_write_file
result = json.loads(_handle_write_file({"path": "/tmp/oops.md"}))
assert "error" in result
assert "content" in result["error"]
def test_missing_path_key_returns_error(self):
"""#19096 — handler must reject tool calls where 'path' key is absent."""
from tools.file_tools import _handle_write_file
result = json.loads(_handle_write_file({"content": "hello"}))
assert "error" in result
def test_explicit_empty_content_is_allowed(self):
"""#19096 — explicit empty string content (file truncation) must still work."""
from tools.file_tools import _handle_write_file
with patch("tools.file_tools._get_file_ops") as mock_get:
mock_ops = MagicMock()
result_obj = MagicMock()
result_obj.to_dict.return_value = {"status": "ok", "path": "/tmp/empty.txt", "bytes": 0}
mock_ops.write_file.return_value = result_obj
mock_get.return_value = mock_ops
result = json.loads(_handle_write_file({"path": "/tmp/empty.txt", "content": ""}))
assert result["status"] == "ok"
def test_non_string_content_returns_error(self):
"""#19096 — content must be a string, not a dict or list."""
from tools.file_tools import _handle_write_file
result = json.loads(_handle_write_file({"path": "/tmp/x.txt", "content": {"nested": "dict"}}))
assert "error" in result
assert "string" in result["error"].lower() or "content" in result["error"].lower()
class TestPatchHandler:
@patch("tools.file_tools._get_file_ops")
+2 -2
@@ -467,8 +467,8 @@ def test_kanban_guidance_in_worker_prompt(monkeypatch, tmp_path):
skip_memory=True,
)
prompt = a._build_system_prompt()
-# Header phrase
-assert "You are a Kanban worker" in prompt
# Header phrase (identity-free — SOUL.md owns identity, layer 3 is protocol)
assert "Kanban task execution protocol" in prompt
# Lifecycle signals
assert "kanban_show()" in prompt
assert "kanban_complete" in prompt
@@ -46,6 +46,13 @@ def test_is_session_expired_detects_session_not_found():
assert _is_session_expired_error(RuntimeError("Unknown session: abc123")) is True
def test_is_session_expired_detects_session_terminated():
"""Remote Playwright MCP reports transport loss as ``Session terminated``."""
from tools.mcp_tool import _is_session_expired_error
assert _is_session_expired_error(RuntimeError("Session terminated")) is True
def test_is_session_expired_is_case_insensitive():
"""Match uses lower-cased comparison so servers that emit the
message in different cases (SDK formatter quirks) still trigger."""
+63
@@ -901,6 +901,69 @@ class TestCheckForSkillUpdates:
assert bundle_content_hash(bundle) == content_hash(skill_dir)
def test_bundle_content_hash_accepts_binary_files(self):
bundle = SkillBundle(
name="demo-binary-skill",
files={
"SKILL.md": "# Demo\n",
"assets/logo.png": b"\x89PNG\r\n\x1a\nbinary",
},
source="github",
identifier="owner/repo/demo-binary-skill",
trust_level="community",
)
digest = bundle_content_hash(bundle)
assert digest.startswith("sha256:")
def test_bundle_content_hash_bytes_matches_str_equivalent(self):
"""Bytes content must hash identically to its str-decoded form."""
text_bundle = SkillBundle(
name="demo-skill",
files={
"SKILL.md": "same content",
"references/checklist.md": "- [ ] security\n",
},
source="github",
identifier="owner/repo/demo-skill",
trust_level="community",
)
bytes_bundle = SkillBundle(
name="demo-skill",
files={
"SKILL.md": b"same content",
"references/checklist.md": b"- [ ] security\n",
},
source="github",
identifier="owner/repo/demo-skill",
trust_level="community",
)
assert bundle_content_hash(bytes_bundle) == bundle_content_hash(text_bundle)
def test_bundle_content_hash_mixed_matches_on_disk(self, tmp_path):
"""In-memory bundle hash must equal on-disk content_hash for mixed bytes+str."""
from tools.skills_guard import content_hash
bundle = SkillBundle(
name="demo-skill",
files={
"SKILL.md": b"# Demo Skill\n",
"references/checklist.md": "- [ ] security\n",
},
source="github",
identifier="owner/repo/demo-skill",
trust_level="community",
)
skill_dir = tmp_path / "demo-skill"
skill_dir.mkdir()
(skill_dir / "SKILL.md").write_bytes(b"# Demo Skill\n")
(skill_dir / "references").mkdir()
(skill_dir / "references" / "checklist.md").write_text("- [ ] security\n")
assert bundle_content_hash(bundle) == content_hash(skill_dir)
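The three tests above pin down one property: a bundle's hash must not depend on whether a file's content arrives as `str` or `bytes`. A minimal sketch of that normalization follows, assuming UTF-8 encoding. `sketch_bundle_hash` is a hypothetical helper; the real `bundle_content_hash` is not shown in this diff and may combine paths and contents differently.

```python
import hashlib
from typing import Dict, Union


def sketch_bundle_hash(files: Dict[str, Union[str, bytes]]) -> str:
    """Hypothetical stand-in for bundle_content_hash; not the project's code."""
    digest = hashlib.sha256()
    # Sort paths so the result does not depend on dict insertion order.
    for rel_path in sorted(files):
        content = files[rel_path]
        # Normalize str to bytes (UTF-8 assumed) so both forms hash the same.
        data = content.encode("utf-8") if isinstance(content, str) else content
        digest.update(rel_path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(data)
        digest.update(b"\0")
    return "sha256:" + digest.hexdigest()


text_files = {"SKILL.md": "same content", "references/checklist.md": "- [ ] security\n"}
bytes_files = {"SKILL.md": b"same content", "references/checklist.md": b"- [ ] security\n"}
hashes_match = sketch_bundle_hash(text_files) == sketch_bundle_hash(bytes_files)
```

Normalizing at the hashing boundary keeps callers free to pass whichever form the fetcher produced, including binary assets that cannot be decoded as text.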
def test_reports_update_when_remote_hash_differs(self):
lock = MagicMock()
lock.list_installed.return_value = [{
+337
@@ -0,0 +1,337 @@
"""Tests for video_analyze tool in tools/vision_tools.py."""
import asyncio
import json
import os
from pathlib import Path
from typing import Awaitable
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from tools.vision_tools import (
_detect_video_mime_type,
_video_to_base64_data_url,
_handle_video_analyze,
_MAX_VIDEO_BASE64_BYTES,
_VIDEO_MIME_TYPES,
_VIDEO_SIZE_WARN_BYTES,
video_analyze_tool,
VIDEO_ANALYZE_SCHEMA,
)
# ---------------------------------------------------------------------------
# _detect_video_mime_type
# ---------------------------------------------------------------------------
class TestDetectVideoMimeType:
"""Extension-based MIME detection for video files."""
def test_mp4(self, tmp_path):
p = tmp_path / "clip.mp4"
p.write_bytes(b"\x00" * 10)
assert _detect_video_mime_type(p) == "video/mp4"
def test_webm(self, tmp_path):
p = tmp_path / "clip.webm"
p.write_bytes(b"\x00" * 10)
assert _detect_video_mime_type(p) == "video/webm"
def test_mov(self, tmp_path):
p = tmp_path / "clip.mov"
p.write_bytes(b"\x00" * 10)
assert _detect_video_mime_type(p) == "video/mov"
def test_avi_fallback_mp4(self, tmp_path):
p = tmp_path / "clip.avi"
p.write_bytes(b"\x00" * 10)
assert _detect_video_mime_type(p) == "video/mp4"
def test_mkv_fallback_mp4(self, tmp_path):
p = tmp_path / "clip.mkv"
p.write_bytes(b"\x00" * 10)
assert _detect_video_mime_type(p) == "video/mp4"
def test_mpeg(self, tmp_path):
p = tmp_path / "clip.mpeg"
p.write_bytes(b"\x00" * 10)
assert _detect_video_mime_type(p) == "video/mpeg"
def test_mpg(self, tmp_path):
p = tmp_path / "clip.mpg"
p.write_bytes(b"\x00" * 10)
assert _detect_video_mime_type(p) == "video/mpeg"
def test_unsupported_extension(self, tmp_path):
p = tmp_path / "clip.flv"
p.write_bytes(b"\x00" * 10)
assert _detect_video_mime_type(p) is None
def test_case_insensitive(self, tmp_path):
p = tmp_path / "clip.MP4"
p.write_bytes(b"\x00" * 10)
assert _detect_video_mime_type(p) == "video/mp4"
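The expectations in `TestDetectVideoMimeType` imply a case-insensitive lookup keyed on the file extension. The sketch below reconstructs that behavior from the assertions alone. The table and function names are hypothetical; the real mapping lives in `_VIDEO_MIME_TYPES`, which this diff does not show.

```python
from pathlib import Path
from typing import Optional

# Mapping inferred from the test expectations above (assumption, not the
# project's actual table).
SKETCH_VIDEO_MIME_TYPES = {
    ".mp4": "video/mp4",
    ".webm": "video/webm",
    ".mov": "video/mov",
    ".avi": "video/mp4",   # the tests expect avi to fall back to mp4
    ".mkv": "video/mp4",   # the tests expect mkv to fall back to mp4
    ".mpeg": "video/mpeg",
    ".mpg": "video/mpeg",
}


def sketch_detect_video_mime_type(path: Path) -> Optional[str]:
    """Return the MIME type for a known video extension, else None."""
    # Lower-case the suffix so "clip.MP4" and "clip.mp4" behave the same.
    return SKETCH_VIDEO_MIME_TYPES.get(path.suffix.lower())
```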
# ---------------------------------------------------------------------------
# _video_to_base64_data_url
# ---------------------------------------------------------------------------
class TestVideoToBase64DataUrl:
"""Base64 encoding of video files."""
def test_produces_data_url(self, tmp_path):
p = tmp_path / "test.mp4"
p.write_bytes(b"\x00\x01\x02\x03")
result = _video_to_base64_data_url(p)
assert result.startswith("data:video/mp4;base64,")
def test_custom_mime_type(self, tmp_path):
p = tmp_path / "test.webm"
p.write_bytes(b"\x00\x01\x02\x03")
result = _video_to_base64_data_url(p, mime_type="video/webm")
assert result.startswith("data:video/webm;base64,")
def test_default_mime_for_unknown_ext(self, tmp_path):
p = tmp_path / "test.xyz"
p.write_bytes(b"\x00\x01\x02\x03")
result = _video_to_base64_data_url(p)
# Falls back to video/mp4
assert result.startswith("data:video/mp4;base64,")
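For reference, a data URL of the shape these tests check can be built with the standard library alone. This is a sketch under the assumption that the default MIME type is `video/mp4` whenever none is passed; `sketch_video_to_data_url` is a hypothetical name, and the real helper may also inspect the extension.

```python
import base64
import tempfile
from pathlib import Path
from typing import Optional


def sketch_video_to_data_url(path: Path, mime_type: Optional[str] = None) -> str:
    """Encode a file as a base64 data URL (hypothetical stand-in)."""
    mime = mime_type or "video/mp4"  # assumed default, per the tests above
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


with tempfile.TemporaryDirectory() as tmp:
    clip = Path(tmp) / "test.mp4"
    clip.write_bytes(b"\x00\x01\x02\x03")
    default_url = sketch_video_to_data_url(clip)
    webm_url = sketch_video_to_data_url(clip, mime_type="video/webm")
```

Base64 inflates the payload by roughly a third, which is why the size limit in the later tests is applied to the encoded string rather than the file on disk.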
# ---------------------------------------------------------------------------
# Schema validation
# ---------------------------------------------------------------------------
class TestVideoAnalyzeSchema:
"""Schema structure is correct."""
def test_schema_name(self):
assert VIDEO_ANALYZE_SCHEMA["name"] == "video_analyze"
def test_schema_has_required_fields(self):
params = VIDEO_ANALYZE_SCHEMA["parameters"]
assert "video_url" in params["properties"]
assert "question" in params["properties"]
assert params["required"] == ["video_url", "question"]
def test_schema_description_mentions_video(self):
assert "video" in VIDEO_ANALYZE_SCHEMA["description"].lower()
# ---------------------------------------------------------------------------
# _handle_video_analyze handler
# ---------------------------------------------------------------------------
class TestHandleVideoAnalyze:
"""Tests for the registry handler wrapper."""
def test_returns_awaitable(self, tmp_path, monkeypatch):
video_file = tmp_path / "test.mp4"
video_file.write_bytes(b"\x00" * 100)
monkeypatch.setenv("AUXILIARY_VIDEO_MODEL", "")
monkeypatch.setenv("AUXILIARY_VISION_MODEL", "")
with patch("tools.vision_tools.video_analyze_tool", new_callable=AsyncMock) as mock_tool:
mock_tool.return_value = json.dumps({"success": True, "analysis": "test"})
result = _handle_video_analyze({"video_url": str(video_file), "question": "what is this?"})
# Should return an awaitable (coroutine)
assert asyncio.iscoroutine(result)
# Clean up the unawaited coroutine
result.close()
def test_uses_auxiliary_video_model_env(self, tmp_path, monkeypatch):
monkeypatch.setenv("AUXILIARY_VIDEO_MODEL", "google/gemini-2.5-flash")
monkeypatch.setenv("AUXILIARY_VISION_MODEL", "other-model")
with patch("tools.vision_tools.video_analyze_tool", new_callable=AsyncMock) as mock_tool:
mock_tool.return_value = json.dumps({"success": True, "analysis": "ok"})
asyncio.get_event_loop().run_until_complete(
_handle_video_analyze({"video_url": "/tmp/test.mp4", "question": "test"})
)
args = mock_tool.call_args[0]
assert args[2] == "google/gemini-2.5-flash"
def test_falls_back_to_vision_model_env(self, tmp_path, monkeypatch):
monkeypatch.setenv("AUXILIARY_VIDEO_MODEL", "")
monkeypatch.setenv("AUXILIARY_VISION_MODEL", "google/gemini-flash")
with patch("tools.vision_tools.video_analyze_tool", new_callable=AsyncMock) as mock_tool:
mock_tool.return_value = json.dumps({"success": True, "analysis": "ok"})
asyncio.get_event_loop().run_until_complete(
_handle_video_analyze({"video_url": "/tmp/test.mp4", "question": "test"})
)
args = mock_tool.call_args[0]
assert args[2] == "google/gemini-flash"
# ---------------------------------------------------------------------------
# video_analyze_tool — integration-style tests with mocked LLM
# ---------------------------------------------------------------------------
class TestVideoAnalyzeTool:
"""Core video analysis function tests."""
def _run(self, coro):
return asyncio.get_event_loop().run_until_complete(coro)
def test_local_file_success(self, tmp_path, monkeypatch):
"""Analyze a local video file — happy path."""
video = tmp_path / "demo.mp4"
video.write_bytes(b"\x00" * 1024)
mock_response = MagicMock()
mock_response.choices = [MagicMock()]
mock_response.choices[0].message.content = "A short video showing a demo."
with patch("tools.vision_tools.async_call_llm", new_callable=AsyncMock, return_value=mock_response):
with patch("tools.vision_tools.extract_content_or_reasoning", return_value="A short video showing a demo."):
result = self._run(video_analyze_tool(str(video), "What is this?"))
data = json.loads(result)
assert data["success"] is True
assert "demo" in data["analysis"].lower()
def test_local_file_not_found(self, tmp_path):
"""Non-existent file returns a failure result (no exception is raised)."""
result = self._run(video_analyze_tool("/nonexistent/video.mp4", "What?"))
data = json.loads(result)
assert data["success"] is False
assert "invalid video source" in data["analysis"].lower()
def test_unsupported_format(self, tmp_path):
"""Unsupported extension returns a failure result."""
video = tmp_path / "clip.flv"
video.write_bytes(b"\x00" * 100)
result = self._run(video_analyze_tool(str(video), "What is this?"))
data = json.loads(result)
assert data["success"] is False
assert "unsupported video format" in data["analysis"].lower()
def test_video_too_large(self, tmp_path, monkeypatch):
"""Video exceeding max size is rejected."""
video = tmp_path / "huge.mp4"
# Don't actually write 50MB; a small placeholder file is enough
video.write_bytes(b"\x00" * 100)
# Patch the base64 encoding to return something huge
with patch("tools.vision_tools._video_to_base64_data_url") as mock_encode:
mock_encode.return_value = "data:video/mp4;base64," + "A" * (_MAX_VIDEO_BASE64_BYTES + 1)
result = self._run(video_analyze_tool(str(video), "What?"))
data = json.loads(result)
assert data["success"] is False
assert "too large" in data["analysis"].lower()
def test_interrupt_check(self, tmp_path):
"""Tool respects interrupt flag."""
video = tmp_path / "test.mp4"
video.write_bytes(b"\x00" * 100)
with patch("tools.interrupt.is_interrupted", return_value=True):
result = self._run(video_analyze_tool(str(video), "What?"))
data = json.loads(result)
assert data["success"] is False
def test_empty_response_retries(self, tmp_path):
"""Retries once on empty model response."""
video = tmp_path / "test.mp4"
video.write_bytes(b"\x00" * 100)
call_count = 0
mock_response = MagicMock()
mock_response.choices = [MagicMock()]
mock_response.choices[0].message.content = "Video analysis result."
async def fake_llm(**kwargs):
nonlocal call_count
call_count += 1
return mock_response
with patch("tools.vision_tools.async_call_llm", side_effect=fake_llm):
with patch("tools.vision_tools.extract_content_or_reasoning", side_effect=["", "Video analysis result."]):
result = self._run(video_analyze_tool(str(video), "What?"))
data = json.loads(result)
assert data["success"] is True
assert call_count == 2 # Initial call + retry
def test_file_scheme_stripped(self, tmp_path):
"""file:// prefix is stripped correctly."""
video = tmp_path / "test.mp4"
video.write_bytes(b"\x00" * 100)
mock_response = MagicMock()
mock_response.choices = [MagicMock()]
mock_response.choices[0].message.content = "OK"
with patch("tools.vision_tools.async_call_llm", new_callable=AsyncMock, return_value=mock_response):
with patch("tools.vision_tools.extract_content_or_reasoning", return_value="OK"):
result = self._run(video_analyze_tool(f"file://{video}", "What?"))
data = json.loads(result)
assert data["success"] is True
def test_api_message_format(self, tmp_path):
"""Verify the message sent to LLM uses video_url content type."""
video = tmp_path / "test.mp4"
video.write_bytes(b"\x00" * 100)
captured_kwargs = {}
async def capture_llm(**kwargs):
captured_kwargs.update(kwargs)
mock_response = MagicMock()
mock_response.choices = [MagicMock()]
mock_response.choices[0].message.content = "OK"
return mock_response
with patch("tools.vision_tools.async_call_llm", side_effect=capture_llm):
with patch("tools.vision_tools.extract_content_or_reasoning", return_value="OK"):
self._run(video_analyze_tool(str(video), "Describe this"))
messages = captured_kwargs["messages"]
assert len(messages) == 1
content = messages[0]["content"]
assert len(content) == 2
assert content[0]["type"] == "text"
assert content[1]["type"] == "video_url"
assert "video_url" in content[1]
assert content[1]["video_url"]["url"].startswith("data:video/mp4;base64,")
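`test_api_message_format` fixes the shape of the request: one message whose content is a text part followed by a `video_url` part. A sketch of a builder that satisfies those assertions is below. The `"role": "user"` field and the `"text"` key are assumptions, since the test does not check them, and `sketch_build_video_messages` is a hypothetical name.

```python
from typing import Any, Dict, List


def sketch_build_video_messages(question: str, data_url: str) -> List[Dict[str, Any]]:
    """Build a single multimodal message (hypothetical stand-in)."""
    return [
        {
            "role": "user",  # assumed; not asserted by the test
            "content": [
                {"type": "text", "text": question},  # "text" key is assumed
                {"type": "video_url", "video_url": {"url": data_url}},
            ],
        }
    ]


messages = sketch_build_video_messages("Describe this", "data:video/mp4;base64,AAAA")
```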
# ---------------------------------------------------------------------------
# Toolset registration
# ---------------------------------------------------------------------------
class TestVideoToolsetRegistration:
"""Verify the tool is registered correctly."""
def test_registered_in_video_toolset(self):
from tools.registry import registry
entry = registry.get_entry("video_analyze")
assert entry is not None
assert entry.toolset == "video"
assert entry.is_async is True
assert entry.emoji == "🎬"
def test_not_in_core_tools(self):
"""video_analyze should NOT be in _HERMES_CORE_TOOLS (default disabled)."""
from toolsets import _HERMES_CORE_TOOLS
assert "video_analyze" not in _HERMES_CORE_TOOLS
def test_in_video_toolset_definition(self):
"""Toolset 'video' should contain video_analyze."""
from toolsets import TOOLSETS
assert "video" in TOOLSETS
assert "video_analyze" in TOOLSETS["video"]["tools"]
+19
@@ -1040,6 +1040,25 @@ class TestDisableVoiceModeReal:
class TestVoiceSpeakResponseReal:
"""Tests _voice_speak_response with real CLI instance."""
def test_async_scheduling_clears_done_before_thread_start(self):
cli = _make_voice_cli(_voice_tts=True)
starts = []
class FakeThread:
def __init__(self, target=None, args=(), daemon=None):
self.target = target
self.args = args
self.daemon = daemon
def start(self):
starts.append(cli._voice_tts_done.is_set())
with patch("cli.threading.Thread", FakeThread):
cli._voice_speak_response_async("Hello")
assert starts == [False]
assert not cli._voice_tts_done.is_set()
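The ordering this test guards (clear the "done" event first, start the worker second) can be shown in isolation. The class below is a hypothetical stand-in for the CLI's TTS scheduling, not the real implementation; only the clear-before-start ordering is taken from the test.

```python
import threading
from typing import Optional


class SketchSpeaker:
    """Hypothetical stand-in for the CLI's TTS scheduling; not the real class."""

    def __init__(self) -> None:
        self.tts_done = threading.Event()
        self.tts_done.set()  # idle: nothing is being spoken
        self.seen_at_start: Optional[bool] = None

    def _speak(self, text: str) -> None:
        # Record what the worker observes the moment it begins running.
        self.seen_at_start = self.tts_done.is_set()
        try:
            pass  # real code would synthesize and play `text` here
        finally:
            self.tts_done.set()

    def speak_async(self, text: str) -> threading.Thread:
        # Clear the flag BEFORE start(), so a waiter can never observe a
        # stale "done" state between scheduling and the worker's first step.
        self.tts_done.clear()
        worker = threading.Thread(target=self._speak, args=(text,), daemon=True)
        worker.start()
        return worker


speaker = SketchSpeaker()
speaker.speak_async("Hello").join()
```

If the clear happened inside the worker instead, code that waits on the event right after scheduling could return immediately and, for example, reopen the microphone while speech is about to start.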
@patch("cli._cprint")
def test_early_return_when_tts_off(self, _cp):
cli = _make_voice_cli(_voice_tts=False)
+196
@@ -0,0 +1,196 @@
"""Tests for /goal handling in tui_gateway.
The TUI routes ``/goal`` through ``command.dispatch`` (not ``slash.exec``)
because the CLI's ``_handle_goal_command`` queues the kickoff message onto
``_pending_input``, which the slash-worker subprocess has no reader for.
Instead we handle ``/goal`` directly in the server and return a
``{"type": "send", "notice": ..., "message": ...}`` payload the TUI client
uses to render a system line and fire the kickoff prompt.
"""
from __future__ import annotations
import importlib
import threading
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
@pytest.fixture()
def hermes_home(tmp_path, monkeypatch):
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(home))
# Bust the goal-module DB cache so it re-resolves HERMES_HOME.
from hermes_cli import goals
goals._DB_CACHE.clear()
yield home
goals._DB_CACHE.clear()
@pytest.fixture()
def server(hermes_home):
with patch.dict(
"sys.modules",
{
"hermes_cli.env_loader": MagicMock(),
"hermes_cli.banner": MagicMock(),
},
):
mod = importlib.import_module("tui_gateway.server")
yield mod
mod._sessions.clear()
mod._pending.clear()
mod._answers.clear()
mod._methods.clear()
importlib.reload(mod)
@pytest.fixture()
def session(server):
sid = "sid-test"
session_key = "tui-goal-session-1"
s = {
"session_key": session_key,
"history": [],
"history_lock": threading.Lock(),
"history_version": 0,
"running": False,
"attached_images": [],
"cols": 120,
}
server._sessions[sid] = s
return sid, session_key, s
def _call(server, method, **params):
handler = server._methods[method]
return handler(1, params)
# ── command.dispatch /goal ────────────────────────────────────────────
def test_goal_bare_shows_status_when_none_set(server, session):
sid, _, _ = session
r = _call(server, "command.dispatch", name="goal", arg="", session_id=sid)
assert r["result"]["type"] == "exec"
assert "No active goal" in r["result"]["output"]
def test_goal_whitespace_only_shows_status(server, session):
sid, _, _ = session
r = _call(server, "command.dispatch", name="goal", arg=" ", session_id=sid)
assert r["result"]["type"] == "exec"
assert "No active goal" in r["result"]["output"]
def test_goal_status_alias_shows_status(server, session):
sid, _, _ = session
r = _call(server, "command.dispatch", name="goal", arg="status", session_id=sid)
assert r["result"]["type"] == "exec"
assert "No active goal" in r["result"]["output"]
def test_goal_set_returns_send_with_notice(server, session):
sid, session_key, _ = session
r = _call(server, "command.dispatch", name="goal", arg="build a rocket", session_id=sid)
result = r["result"]
assert result["type"] == "send"
assert result["message"] == "build a rocket"
assert "notice" in result
assert "Goal set" in result["notice"]
assert "20-turn budget" in result["notice"]
# Persisted in SessionDB
from hermes_cli.goals import GoalManager
mgr = GoalManager(session_key)
assert mgr.state is not None
assert mgr.state.goal == "build a rocket"
assert mgr.state.status == "active"
def test_goal_pause_after_set(server, session):
sid, session_key, _ = session
_call(server, "command.dispatch", name="goal", arg="write a story", session_id=sid)
r = _call(server, "command.dispatch", name="goal", arg="pause", session_id=sid)
assert r["result"]["type"] == "exec"
assert "paused" in r["result"]["output"].lower()
from hermes_cli.goals import GoalManager
assert GoalManager(session_key).state.status == "paused"
def test_goal_resume_reactivates(server, session):
sid, session_key, _ = session
_call(server, "command.dispatch", name="goal", arg="write a story", session_id=sid)
_call(server, "command.dispatch", name="goal", arg="pause", session_id=sid)
r = _call(server, "command.dispatch", name="goal", arg="resume", session_id=sid)
assert r["result"]["type"] == "exec"
assert "resumed" in r["result"]["output"].lower()
from hermes_cli.goals import GoalManager
assert GoalManager(session_key).state.status == "active"
def test_goal_clear_removes_active_goal(server, session):
sid, session_key, _ = session
_call(server, "command.dispatch", name="goal", arg="write a story", session_id=sid)
r = _call(server, "command.dispatch", name="goal", arg="clear", session_id=sid)
assert r["result"]["type"] == "exec"
assert "cleared" in r["result"]["output"].lower()
from hermes_cli.goals import GoalManager
# After clear the row is marked status=cleared (kept for audit);
# ``has_goal()`` / ``is_active()`` return False so the goal loop
# stays off and ``status`` reports "No active goal".
mgr = GoalManager(session_key)
assert not mgr.has_goal()
assert not mgr.is_active()
assert "No active goal" in mgr.status_line()
def test_goal_stop_and_done_are_clear_aliases(server, session):
sid, _, _ = session
_call(server, "command.dispatch", name="goal", arg="first goal", session_id=sid)
r = _call(server, "command.dispatch", name="goal", arg="stop", session_id=sid)
assert "cleared" in r["result"]["output"].lower()
_call(server, "command.dispatch", name="goal", arg="second goal", session_id=sid)
r = _call(server, "command.dispatch", name="goal", arg="done", session_id=sid)
assert "cleared" in r["result"]["output"].lower()
def test_goal_requires_session(server):
r = _call(server, "command.dispatch", name="goal", arg="nope", session_id="unknown")
assert "error" in r
assert r["error"]["code"] == 4001
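Taken together, the `command.dispatch` tests describe a small argument grammar: empty, whitespace-only, or `status` shows status; `pause` and `resume` toggle; `clear`, `stop`, and `done` all clear; anything else sets a new goal. A sketch of that classification follows. `sketch_parse_goal_arg` is hypothetical, and treating keywords case-insensitively is an assumption the tests do not cover.

```python
from typing import Optional, Tuple


def sketch_parse_goal_arg(arg: str) -> Tuple[str, Optional[str]]:
    """Classify a /goal argument into (action, goal_text). Hypothetical."""
    text = arg.strip()
    keyword = text.lower()  # case-insensitivity is an assumption
    if keyword in ("", "status"):
        return ("status", None)
    if keyword in ("pause", "resume"):
        return (keyword, None)
    if keyword in ("clear", "stop", "done"):
        return ("clear", None)
    # Anything else is the goal text itself.
    return ("set", text)
```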
# ── slash.exec /goal routing ──────────────────────────────────────────
def test_slash_exec_rejects_goal_routes_to_command_dispatch(server, session):
"""slash.exec must reject /goal with 4018 so the TUI client falls through
to command.dispatch. Without this, the HermesCLI slash-worker subprocess
would set the goal but silently drop the kickoff, because the queue is in-proc."""
sid, _, _ = session
r = _call(server, "slash.exec", command="goal status", session_id=sid)
assert "error" in r
assert r["error"]["code"] == 4018
assert "command.dispatch" in r["error"]["message"]
def test_pending_input_commands_includes_goal(server):
"""Guard: _PENDING_INPUT_COMMANDS must list 'goal' — removing it would
silently re-break the TUI."""
assert "goal" in server._PENDING_INPUT_COMMANDS
+11 -1
@@ -94,10 +94,20 @@ _HERMES_ENV_PATH = (
 )
 _PROJECT_ENV_PATH = r'(?:(?:/|\.{1,2}/)?(?:[^\s/"\'`]+/)*\.env(?:\.[^/\s"\'`]+)*)'
 _PROJECT_CONFIG_PATH = r'(?:(?:/|\.{1,2}/)?(?:[^\s/"\'`]+/)*config\.yaml)'
+_SHELL_RC_FILES = (
+    r'(?:~|\$home|\$\{home\})/\.'
+    r'(?:bashrc|zshrc|profile|bash_profile|zprofile)\b'
+)
+_CREDENTIAL_FILES = (
+    r'(?:~|\$home|\$\{home\})/\.'
+    r'(?:netrc|pgpass|npmrc|pypirc)\b'
+)
 _SENSITIVE_WRITE_TARGET = (
     r'(?:/etc/|/dev/sd|'
     rf'{_SSH_SENSITIVE_PATH}|'
-    rf'{_HERMES_ENV_PATH})'
+    rf'{_HERMES_ENV_PATH}|'
+    rf'{_SHELL_RC_FILES}|'
+    rf'{_CREDENTIAL_FILES})'
 )
 _PROJECT_SENSITIVE_WRITE_TARGET = rf'(?:{_PROJECT_ENV_PATH}|{_PROJECT_CONFIG_PATH})'
 _COMMAND_TAIL = r'(?:\s*(?:&&|\|\||;).*)?$'
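To see what the two new alternatives match, the fragments can be compiled on their own. The patterns below are copied from the hunk above; compiling them with `re.IGNORECASE` is an assumption made here because the fragments spell `$home` in lower case, and the surrounding module may instead lower-case the command before matching.

```python
import re

# Copied from the diff above.
_SHELL_RC_FILES = (
    r'(?:~|\$home|\$\{home\})/\.'
    r'(?:bashrc|zshrc|profile|bash_profile|zprofile)\b'
)
_CREDENTIAL_FILES = (
    r'(?:~|\$home|\$\{home\})/\.'
    r'(?:netrc|pgpass|npmrc|pypirc)\b'
)

# Assumption: case-insensitive matching, so "$HOME" matches "\$home".
sketch_pattern = re.compile(
    rf'(?:{_SHELL_RC_FILES}|{_CREDENTIAL_FILES})', re.IGNORECASE
)

hits_bashrc = bool(sketch_pattern.search("echo 'export X=1' >> ~/.bashrc"))
hits_netrc = bool(sketch_pattern.search("cat creds > $HOME/.netrc"))
misses_backup = sketch_pattern.search("cp x ~/.bashrc_backup") is None
```

The trailing `\b` is what keeps `~/.bashrc_backup` from matching: an underscore is a word character, so there is no word boundary after `bashrc` in that path.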
+28 -5
@@ -2757,17 +2757,40 @@ def _chromium_search_roots() -> List[str]:
 def _chromium_installed() -> bool:
     """Return True when a usable Chromium (or headless-shell) build is on disk.
+    Checks, in order:
+    1. ``AGENT_BROWSER_EXECUTABLE_PATH`` env var, the official way to point
+       agent-browser at a pre-installed Chrome/Chromium.
+    2. System Chrome/Chromium in PATH (``google-chrome``, ``chromium-browser``,
+       ``chrome``).
+    3. Playwright's browser cache (current logic) — directories containing
+       ``chromium-*`` or ``chromium_headless_shell-*``.
     agent-browser (0.26+) downloads Playwright's chromium / headless-shell
-    builds into ``PLAYWRIGHT_BROWSERS_PATH`` and won't start without them.
-    When the CLI is present but no browser build is, the first browser tool
-    call hangs for the full command timeout (often ~30s each) before
-    surfacing a useless error. Guarding the tool behind this check prevents
-    advertising a capability that will fail at runtime.
+    builds into ``PLAYWRIGHT_BROWSERS_PATH`` and won't start without at least
+    one of the three above being present. Without a browser binary the CLI
+    hangs on first use until the command timeout fires (often ~30s). Guarding
+    the tool behind this check prevents advertising a capability that will
+    fail at runtime.
     """
     global _cached_chromium_installed
     if _cached_chromium_installed is not None:
         return _cached_chromium_installed
+    # 1. AGENT_BROWSER_EXECUTABLE_PATH — explicit user-configured browser
+    ab_path = os.environ.get("AGENT_BROWSER_EXECUTABLE_PATH", "").strip()
+    if ab_path:
+        if os.path.isfile(ab_path) or shutil.which(ab_path):
+            _cached_chromium_installed = True
+            return True
+    # 2. System Chrome/Chromium in PATH (common names)
+    system_chrome = shutil.which("google-chrome") or shutil.which("chromium-browser") or shutil.which("chrome")
+    if system_chrome:
+        _cached_chromium_installed = True
+        return True
+    # 3. Playwright browser cache (legacy — chromium-* / chromium_headless_shell-* dirs)
     for root in _chromium_search_roots():
         if not root or not os.path.isdir(root):
             continue
+7
@@ -1026,6 +1026,12 @@ def _build_child_agent(
except Exception as exc:
logger.debug("Could not load delegation reasoning_effort: %s", exc)
# Inherit the parent's fallback provider chain so subagents can recover
# from rate-limits and credential exhaustion exactly like the top-level
# agent does. _fallback_chain is a list accepted by AIAgent's
# fallback_model parameter (which handles both list and dict forms).
parent_fallback = getattr(parent_agent, "_fallback_chain", None) or None
child = AIAgent(
base_url=effective_base_url,
api_key=effective_api_key,
@@ -1038,6 +1044,7 @@ def _build_child_agent(
max_tokens=getattr(parent_agent, "max_tokens", None),
reasoning_config=child_reasoning,
prefill_messages=getattr(parent_agent, "prefill_messages", None),
fallback_model=parent_fallback,
enabled_toolsets=child_toolsets,
quiet_mode=True,
ephemeral_system_prompt=child_prompt,
+37 -7
@@ -53,6 +53,27 @@ WRITE_DENIED_PATHS = build_write_denied_paths(_HOME)
 WRITE_DENIED_PREFIXES = build_write_denied_prefixes(_HOME)
+_OSC_SEQUENCE_RE = re.compile(r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)")
+_FENCE_MARKER_RE = re.compile(r"'?\x07?__HERMES_FENCE_[A-Za-z0-9]+__\x07?'?")
+def _strip_terminal_fence_leaks(text: str) -> str:
+    """Strip leaked terminal fence wrappers from file read output."""
+    if not text:
+        return text
+    cleaned_lines: List[str] = []
+    for line in text.splitlines(keepends=True):
+        had_terminal_wrapper = "__HERMES_FENCE_" in line or "\x1b]" in line
+        cleaned = _OSC_SEQUENCE_RE.sub("", line)
+        cleaned = _FENCE_MARKER_RE.sub("", cleaned)
+        cleaned = cleaned.replace("\x07", "")
+        if had_terminal_wrapper and cleaned.strip("'\r\n\t ") == "":
+            continue
+        cleaned_lines.append(cleaned)
+    return "".join(cleaned_lines)
def _get_safe_write_root() -> Optional[str]:
"""Return the resolved HERMES_WRITE_SAFE_ROOT path, or None if unset.
@@ -511,8 +532,9 @@ class ShellFileOperations(FileOperations):
             # File not found - try to suggest similar files
             return self._suggest_similar_files(path)
+        stat_output = _strip_terminal_fence_leaks(stat_result.stdout)
         try:
-            file_size = int(stat_result.stdout.strip())
+            file_size = int(stat_output.strip())
         except ValueError:
             file_size = 0
@@ -536,8 +558,9 @@ class ShellFileOperations(FileOperations):
         # Read a sample to check for binary content
         sample_cmd = f"head -c 1000 {self._escape_shell_arg(path)} 2>/dev/null"
         sample_result = self._exec(sample_cmd)
+        sample_output = _strip_terminal_fence_leaks(sample_result.stdout)
-        if self._is_likely_binary(path, sample_result.stdout):
+        if self._is_likely_binary(path, sample_output):
             return ReadResult(
                 is_binary=True,
                 file_size=file_size,
@@ -551,12 +574,14 @@ class ShellFileOperations(FileOperations):
         if read_result.exit_code != 0:
             return ReadResult(error=f"Failed to read file: {read_result.stdout}")
+        read_output = _strip_terminal_fence_leaks(read_result.stdout)
         # Get total line count
         wc_cmd = f"wc -l < {self._escape_shell_arg(path)}"
         wc_result = self._exec(wc_cmd)
+        wc_output = _strip_terminal_fence_leaks(wc_result.stdout)
         try:
-            total_lines = int(wc_result.stdout.strip())
+            total_lines = int(wc_output.strip())
         except ValueError:
             total_lines = 0
@@ -567,7 +592,7 @@ class ShellFileOperations(FileOperations):
             hint = f"Use offset={end_line + 1} to continue reading (showing {offset}-{end_line} of {total_lines} lines)"
         return ReadResult(
-            content=self._add_line_numbers(read_result.stdout, offset),
+            content=self._add_line_numbers(read_output, offset),
             total_lines=total_lines,
             file_size=file_size,
             truncated=truncated,
@@ -637,14 +662,16 @@ class ShellFileOperations(FileOperations):
         stat_result = self._exec(stat_cmd)
         if stat_result.exit_code != 0:
             return self._suggest_similar_files(path)
+        stat_output = _strip_terminal_fence_leaks(stat_result.stdout)
         try:
-            file_size = int(stat_result.stdout.strip())
+            file_size = int(stat_output.strip())
         except ValueError:
             file_size = 0
         if self._is_image(path):
             return ReadResult(is_image=True, is_binary=True, file_size=file_size)
         sample_result = self._exec(f"head -c 1000 {self._escape_shell_arg(path)} 2>/dev/null")
-        if self._is_likely_binary(path, sample_result.stdout):
+        sample_output = _strip_terminal_fence_leaks(sample_result.stdout)
+        if self._is_likely_binary(path, sample_output):
             return ReadResult(
                 is_binary=True, file_size=file_size,
                 error="Binary file — cannot display as text."
@@ -652,7 +679,10 @@ class ShellFileOperations(FileOperations):
         cat_result = self._exec(f"cat {self._escape_shell_arg(path)}")
         if cat_result.exit_code != 0:
             return ReadResult(error=f"Failed to read file: {cat_result.stdout}")
-        return ReadResult(content=cat_result.stdout, file_size=file_size)
+        return ReadResult(
+            content=_strip_terminal_fence_leaks(cat_result.stdout),
+            file_size=file_size,
+        )
     def delete_file(self, path: str) -> WriteResult:
         """Delete a file via rm."""
+19 -1
@@ -1097,7 +1097,25 @@ def _handle_read_file(args, **kw):
 def _handle_write_file(args, **kw):
     tid = kw.get("task_id") or "default"
-    return write_file_tool(path=args.get("path", ""), content=args.get("content", ""), task_id=tid)
+    if not args.get("path") or not isinstance(args.get("path"), str):
+        return tool_error(
+            "write_file: missing required field 'path'. Re-emit the tool call with "
+            "both 'path' and 'content' set."
+        )
+    if "content" not in args:
+        return tool_error(
+            "write_file: missing required field 'content'. The tool call included a "
+            "path but no content argument — this is almost always a dropped-arg bug "
+            "under context pressure. Re-emit the tool call with the full content "
+            "payload, or use execute_code with hermes_tools.write_file() for very "
+            "large files."
+        )
+    if not isinstance(args["content"], str):
+        return tool_error(
+            f"write_file: 'content' must be a string, got "
+            f"{type(args['content']).__name__}."
+        )
+    return write_file_tool(path=args["path"], content=args["content"], task_id=tid)
 def _handle_patch(args, **kw):
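The new checks in `_handle_write_file` can be exercised without the rest of the tool stack. The sketch below mirrors the three validation branches with shortened messages. `sketch_tool_error` and `sketch_validate_write_args` are hypothetical names, and the JSON error shape is an assumption, since the real `tool_error` is not shown in this diff.

```python
import json
from typing import Any, Dict, Optional


def sketch_tool_error(message: str) -> str:
    """Hypothetical stand-in for tool_error(); the real return shape may differ."""
    return json.dumps({"error": message})


def sketch_validate_write_args(args: Dict[str, Any]) -> Optional[str]:
    """Return an error payload for a malformed call, or None when it is valid."""
    path = args.get("path")
    if not path or not isinstance(path, str):
        return sketch_tool_error("write_file: missing required field 'path'.")
    # Membership test, not truthiness: an empty string is valid content.
    if "content" not in args:
        return sketch_tool_error("write_file: missing required field 'content'.")
    if not isinstance(args["content"], str):
        kind = type(args["content"]).__name__
        return sketch_tool_error(f"write_file: 'content' must be a string, got {kind}.")
    return None
```

The behavioral change is that a call with no `content` key used to write an empty file via the `args.get("content", "")` default, whereas it now returns an error. An explicit empty string still passes, so intentionally empty files remain possible.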

Some files were not shown because too many files have changed in this diff