Compare commits

157 Commits

Author SHA1 Message Date
teknium1 c8582fc4a2 fix(discord): persist thread participation across gateway restarts
_bot_participated_threads was an in-memory set — lost on every restart.
After restart, the bot forgot which threads it was active in, requiring
fresh @mentions and potentially creating duplicate threads instead of
continuing existing conversations.

Changes:
- Persist thread IDs to ~/.hermes/discord_threads.json
- Load on adapter init, save on every new thread participation
- _track_thread() replaces direct .add() calls for atomic persist
- Cap at 500 tracked threads to prevent unbounded growth
- /thread slash command also tracks participation
- 7 new tests covering persistence, restart survival, corruption
  recovery, cap enforcement
2026-03-17 02:26:34 -07:00
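The persistence behaviour described in c8582fc4a2 can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: the ThreadTracker class and its method names are hypothetical, an ordered list stands in for the in-memory set so the oldest entries can be evicted, and the write-to-temp-then-rename step is an assumption about how the save is made safe. Only the JSON file, the 500-entry cap, and recovery from a corrupted file come from the commit message.

```python
import json
from pathlib import Path

MAX_TRACKED_THREADS = 500  # cap taken from the commit message


class ThreadTracker:
    """Hypothetical stand-in for the adapter's thread-tracking state."""

    def __init__(self, path: Path):
        self.path = path
        self.thread_ids: list[str] = self._load()

    def _load(self) -> list[str]:
        try:
            data = json.loads(self.path.read_text())
        except (OSError, ValueError):
            return []  # missing or corrupted file: start with no threads
        if not isinstance(data, list):
            return []
        return [str(t) for t in data][-MAX_TRACKED_THREADS:]

    def track(self, thread_id: str) -> None:
        """Record participation and persist immediately."""
        if thread_id in self.thread_ids:
            return
        self.thread_ids.append(thread_id)
        # Keep only the newest entries once the cap is exceeded.
        self.thread_ids = self.thread_ids[-MAX_TRACKED_THREADS:]
        self._save()

    def _save(self) -> None:
        # Assumption: write a temp file, then rename it over the target,
        # so an interrupted write never leaves a half-written JSON file.
        self.path.parent.mkdir(parents=True, exist_ok=True)
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(self.thread_ids))
        tmp.replace(self.path)
```

Constructing a second ThreadTracker on the same path plays the role of a gateway restart: it reloads whatever the first instance saved.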
teknium1 0351e4fa90 fix: add metadata param to base send_image and forward in send_animation
_send_response_parts() calls send_image(metadata=_thread_metadata) but
the base class signature didn't accept metadata, crashing platforms that
don't override send_image. send_animation already had the param but
wasn't forwarding it.

Credit: @0xbyt4 (PR #1077)
2026-03-17 02:02:28 -07:00
Teknium 1b2d6c424c fix: add --yes flag to bypass confirmation in /skills install and uninstall (#1647)
Fixes hanging when using /skills install or /skills uninstall from the
TUI — bare input() calls hang inside prompt_toolkit's event loop.

Changes:
- Add skip_confirm parameter to do_install() and do_uninstall()
- Separate --yes/-y (confirmation bypass) from --force (scan override)
  in both argparse and slash command handlers
- Update usage hint for /skills uninstall to show [--yes]

The original PR (#1595) accidentally deleted the install_from_quarantine()
call, which would have broken all installs. That bug is not present here.

Based on PR #1595 by 333Alden333.

Co-authored-by: 333Alden333 <333Alden333@users.noreply.github.com>
2026-03-17 01:59:07 -07:00
Teknium 28c35d045d Merge pull request #1537 from aydnOktay/improve/skill-manager-error-logging
Improve error logging in skill manager tool
2026-03-17 01:53:58 -07:00
Teknium 1f6a1f0028 fix(tools): chunk long messages in send_message_tool before platform dispatch
* add base support

* fix: correct skill author attribution to youssefea

* fix(tools): chunk long messages in send_message_tool before platform dispatch

  - Convert BasePlatformAdapter.truncate_message() to @staticmethod
  - Apply truncate_message() in _send_to_platform() with per-platform
    max lengths
  - Remove naive character split in _send_discord()
  - Attach media files to last chunk only for Telegram
  - Add regression tests for chunking and media placement

---------

Co-authored-by: youssefea <youcefea99@gmail.com>
Co-authored-by: llbn <46884939+llbn@users.noreply.github.com>
2026-03-17 01:52:51 -07:00
Teknium d7029489d6 fix: show custom endpoint models in /model via live API probe (#1645)
Add 'custom' to the provider order so custom OpenAI-compatible
endpoints appear in /model list. Probes the endpoint's /models API
to dynamically discover available models.

Changes:
- Add 'custom' to _PROVIDER_ORDER in list_available_providers()
- Add _get_custom_base_url() helper to read model.base_url from config
- Add custom branch in provider_model_ids() using fetch_api_models()
- Custom endpoint detection via base_url presence for has_creds check

Based on PR #1612 by @aashizpoudel.

Co-authored-by: Aashish Poudel <aashizpoudel@users.noreply.github.com>
2026-03-17 01:52:46 -07:00
Teknium 12afccd9ca fix(tools): chunk long messages in send_message_tool before dispatch (#1552)
* fix: prevent infinite 400 failure loop on context overflow (#1630)

When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.

* fix(skills): detect prompt injection patterns and block cache file reads

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.

* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)

Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.

- Apply truncate_message() chunking in _send_to_platform() before
  dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
  in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement

Cherry-picked from PR #1557 by llbn.

---------

Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
2026-03-17 01:52:43 -07:00
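The chunking idea in 12afccd9ca (split before dispatch, attach media to the last chunk only) might look something like the sketch below. The function names chunk_message and plan_sends are hypothetical and this is not the project's truncate_message(); the preference for newline, then space, then a hard cut is an assumption about what "smart splitting" means here.

```python
def chunk_message(text: str, max_length: int) -> list[str]:
    """Split text into chunks of at most max_length characters,
    preferring to break at a newline, then at a space."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    chunks = []
    remaining = text.strip()
    while len(remaining) > max_length:
        window = remaining[:max_length]
        cut = window.rfind("\n")
        if cut <= 0:
            cut = window.rfind(" ")
        if cut <= 0:
            cut = max_length  # no whitespace in the window: hard split
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks


def plan_sends(text: str, media_files: list, max_length: int) -> list:
    """Pair each chunk with its attachments; media rides on the last chunk only."""
    chunks = chunk_message(text, max_length) or [""]
    last = len(chunks) - 1
    return [(chunk, media_files if i == last else []) for i, chunk in enumerate(chunks)]
```

A caller would pass the platform's own limit as max_length, for example 2000 for the Discord split mentioned in the commit.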
Teknium 81f76111b0 Merge pull request #1560 from eren-karakus0/fix/singularity-preflight-check
fix(terminal): add Singularity/Apptainer preflight availability check
2026-03-17 01:52:03 -07:00
Teknium 96dac22194 fix: prevent infinite 400 loop on context overflow + block prompt injection via cache files (#1630, #1558)
* fix: prevent infinite 400 failure loop on context overflow (#1630)

When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.

* fix(skills): detect prompt injection patterns and block cache file reads

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.

---------

Co-authored-by: buray <ygd58@users.noreply.github.com>
2026-03-17 01:50:59 -07:00
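The fallback heuristic from the first layer of 96dac22194 can be expressed as a small predicate. This is a sketch under stated assumptions: the function name, its parameters, and the 20-character threshold for a "very short generic message" are hypothetical; the 40% and 80-message thresholds are the ones quoted in the commit message.

```python
def is_probable_context_overflow(
    status_code: int,
    error_message: str,
    used_tokens: int,
    context_window: int,
    message_count: int,
    max_generic_len: int = 20,  # assumption: what counts as "very short"
) -> bool:
    """Guess whether a generic 400 is really a context overflow."""
    if status_code != 400:
        return False
    is_generic = len(error_message.strip()) <= max_generic_len
    is_large = used_tokens > 0.4 * context_window or message_count > 80
    return is_generic and is_large
```

When the predicate returns True, the caller would trigger compression rather than abort, and would skip persisting the failed user message so the session does not grow on each retry.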
Teknium 2d36819503 feat: add Base blockchain optional skill
* add base support

* fix: correct skill author attribution to youssefea

---------

Co-authored-by: youssefea <youcefea99@gmail.com>
2026-03-17 01:50:03 -07:00
Teknium 8e20a7e035 fix(gateway): strip MEDIA: and [[audio_as_voice]] tags from message body
* fix(gateway): strip MEDIA: and [[audio_as_voice]] tags from message body

Closes #1561

* fix: remove redundant re import, use existing import

---------

Co-authored-by: mettin4 <coktinmetin@gmail.com>
2026-03-17 01:47:35 -07:00
Teknium 4920c5940f feat: auto-detect local file paths in gateway responses for native media delivery (#1640)
Small models (7B-14B) can't reliably use MEDIA: or IMAGE: syntax. This
adds extract_local_files() to BasePlatformAdapter that regex-detects
bare local file paths ending in image/video extensions, validates them
with os.path.isfile(), and delivers them as native platform attachments.

Hardened over the original PR:
- Code-block exclusion: paths inside fenced blocks and inline code are
  skipped so code samples are never mutilated
- URL rejection: negative lookbehind prevents matching path segments
  inside HTTP URLs
- Relative path rejection: ./foo.png no longer matches
- Tilde path cleanup: raw ~/... form is removed from response text
- Deduplication by expanded path
- Added .webm to _VIDEO_EXTS
- Fallback to send_document for unrecognized media extensions

Based on PR #1636 by sudoingX.

Co-authored-by: sudoingX <sudoingX@users.noreply.github.com>
2026-03-17 01:47:34 -07:00
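The detection rules listed in 4920c5940f (skip code, reject URLs and relative paths, validate with os.path.isfile(), deduplicate by expanded path) can be approximated as below. The regular expressions here are written for illustration and are not the ones in the repository; the extension list is a guess apart from .webm, which the commit names. The sketch assumes POSIX-style paths.

```python
import os
import re

# Fenced blocks first, then inline code spans.
_CODE_RE = re.compile(r"```.*?```|`[^`\n]*`", re.DOTALL)

# An absolute or ~/ path ending in a media extension. The lookbehind rejects
# path segments inside URLs (preceded by ':', '/', or a word character)
# and relative paths such as ./foo.png (preceded by '.').
_PATH_RE = re.compile(
    r"(?<![\w/:.~])(~?/[^\s`\"'<>]+\.(?:png|jpe?g|gif|webp|mp4|mov|webm))\b",
    re.IGNORECASE,
)


def extract_local_files(text: str) -> list[str]:
    """Return existing local media files mentioned as bare paths in text."""
    # Blank out code so paths inside code samples are never picked up.
    scan_text = _CODE_RE.sub(lambda m: " " * len(m.group()), text)
    found: list[str] = []
    for match in _PATH_RE.finditer(scan_text):
        expanded = os.path.expanduser(match.group(1))
        if os.path.isfile(expanded) and expanded not in found:
            found.append(expanded)
    return found
```

The returned paths would then be handed to the platform's native attachment call; removing the matched text from the response body is left out of this sketch.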
Teknium 3744118311 feat(cli): two-stage /model autocomplete with ghost text suggestions (#1641)
* feat(cli): two-stage /model autocomplete with ghost text suggestions

- SlashCommandCompleter: Tab-complete providers first (anthropic:, openrouter:, etc.)
  then models within the selected provider
- SlashCommandAutoSuggest: inline ghost text for slash commands, subcommands,
  and /model provider:model two-stage suggestions
- Custom Tab key binding: accepts provider completion and immediately
  re-triggers completions to show that provider's models
- COMMANDS_BY_CATEGORY: structured format with explicit subcommands for
  tab completion and ghost text (prompt, reasoning, voice, skills, cron, browser)
- SUBCOMMANDS dict auto-extracted from command definitions
- Model/provider info cached 60s for responsive completions

* fix: repair test regression and restore gold color from PR #1622

- Fix test_unknown_command_still_shows_error: patch _cprint instead of
  console.print to match the _cprint switch in process_command()
- Restore gold color on 'Type /help' hint using _DIM + _GOLD constants
  instead of bare \033[2m (was losing the #B8860B gold)
- Use _GOLD constant for ambiguous command message for consistency
- Add clarifying comment on SUBCOMMANDS regex fallback

---------

Co-authored-by: Lars van der Zande <lmvanderzande@gmail.com>
2026-03-17 01:47:32 -07:00
Teknium 5ada0b95e9 Merge pull request #1609 from 0xbyt4/fix/context-counter-cache-tokens
fix: context counter shows cached token count in status bar
2026-03-17 01:45:12 -07:00
teknium1 19eaf5d956 test: fix telegram mock to include ParseMode constant
The MarkdownV2 formatting change imports telegram.constants.ParseMode,
which the test mock didn't provide. Add ParseMode to the mock so
existing tests continue working.
2026-03-17 01:44:11 -07:00
Alex Ferrari 365d175100 fix: apply MarkdownV2 formatting in _send_telegram for proper rendering
The _send_telegram() function was sending raw markdown text without
parse_mode, causing bold, links, and headers to render as plain text.
This fix reuses the gateway adapter's format_message() to convert
markdown to Telegram's MarkdownV2 format, with a fallback to plain
text if parsing fails.
2026-03-17 01:44:11 -07:00
Teknium c3ca68d25b Merge pull request #1614 from PeterFile/fix/launchd-service-recovery
fix(gateway): recover stale launchd service state
2026-03-17 01:43:07 -07:00
Teknium eaa9ceeb43 Merge pull request #1621 from Death-Incarnate/main
fix: isolate test_anthropic_adapter from local credentials
2026-03-17 01:40:39 -07:00
Teknium 949fac192f fix(tools): remove unnecessary crontab requirement from cronjob tool (#1638)
* fix(tools): remove unnecessary crontab requirement from cronjob tool

The hermes cron system is internal — it uses a JSON-based scheduler
ticked by the gateway (cron/scheduler.py), not system crontab.

The check for shutil.which('crontab') was preventing the cronjob tool
from being available in environments without crontab installed (e.g.
minimal Ubuntu containers).

Changes:
- Remove shutil.which('crontab') check from check_cronjob_requirements()
- Remove unused shutil import
- Update docstring to clarify internal scheduler is used
- Update tests to reflect new behavior and add coverage for all
  session modes (interactive, gateway, exec_ask)

Fixes #1589

* test: add HERMES_EXEC_ASK coverage for cronjob requirements

Adds missing test for the exec_ask session mode, complementing
the cherry-picked fix from PR #1633.

---------

Co-authored-by: Bartok9 <bartokmagic@proton.me>
2026-03-17 01:40:02 -07:00
Teknium 4b96d10bc3 fix(cli): invalidate update-check cache after hermes update
Signed-off-by: nidhi-singh02 <nidhi2894@gmail.com>
Co-authored-by: nidhi-singh02 <nidhi2894@gmail.com>
2026-03-17 01:38:11 -07:00
teknium1 c16870277c test: add regression test for stale PID in gateway_state.json (#1631)
Verifies that write_runtime_status() overwrites pid and start_time
from a previous process rather than preserving them via setdefault().
Covers the fix from PR #1632.
2026-03-17 01:35:02 -07:00
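The difference that the regression test in c16870277c guards against is the one between dict.setdefault() and plain assignment. The two functions below are hypothetical illustrations of that difference and do not reproduce the real write_runtime_status() signature.

```python
import os
import time


def refresh_with_setdefault(state: dict, pid: int, now: float) -> dict:
    """Buggy pattern: setdefault() keeps values left by a previous process."""
    updated = dict(state)
    updated.setdefault("pid", pid)
    updated.setdefault("start_time", now)
    return updated


def refresh_runtime_status(state: dict, pid=None, now=None) -> dict:
    """Fixed pattern: always overwrite pid and start_time."""
    updated = dict(state)
    updated["pid"] = os.getpid() if pid is None else pid
    updated["start_time"] = time.time() if now is None else now
    return updated
```

With a state file left behind by an earlier process, the first function returns the old PID and the second returns the new one.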
Teknium 247e3c1470 Merge pull request #1632 from nidhi-singh02/fix/stale-pid-gateway-state
fix(gateway): overwrite stale PID in gateway_state.json on restart
2026-03-17 01:34:24 -07:00
Teknium 2af4af6390 Merge pull request #1635 from NousResearch/hermes/hermes-a86162db
fix: sanitize corrupted .env files on read and during migration
2026-03-17 01:33:36 -07:00
Teknium 749e9977a0 Merge pull request #1629 from NousResearch/hermes/hermes-6891ac11
feat(browser): multi-provider cloud browser support + Browser Use integration
2026-03-17 01:32:38 -07:00
teknium1 1c61ab6bd9 fix: unconditionally clear ANTHROPIC_TOKEN on v8→v9 migration
No conditional checks — just clear it. The new auth flow doesn't use
this env var. Anyone upgrading gets it wiped once, then it's done.
2026-03-17 01:31:20 -07:00
teknium1 e9f1a8e39b fix: gate ANTHROPIC_TOKEN cleanup to config version 8→9 migration
- Bump _config_version 8 → 9
- Move stale ANTHROPIC_TOKEN clearing into 'if current_ver < 9' block
  so it only runs once during the upgrade, not on every migrate_config()
- ANTHROPIC_TOKEN is still a valid auth path (OAuth flow), so we don't
  want to clear it repeatedly — only during the one-time migration from
  old setups that left it stale
- Add test_skips_on_version_9_or_later to verify one-time behavior
- All tests set config version 8 to trigger migration
2026-03-17 01:28:38 -07:00
teknium1 b6a51c955e fix: clear stale ANTHROPIC_TOKEN during migration, remove false *** detection
- Remove *** placeholder detection from _sanitize_env_lines (was based on
  confusing terminal redaction with literal file content)
- Add migrate_config() logic to clear stale ANTHROPIC_TOKEN when better
  credentials exist (ANTHROPIC_API_KEY or Claude Code auto-discovery)
- Old ANTHROPIC_TOKEN values shadow Claude Code credential fallthrough,
  breaking auth for users who updated without re-running setup
- Preserves ANTHROPIC_TOKEN when it's the only auth method available
- 3 new migration tests, updated existing tests
2026-03-17 01:26:23 -07:00
teknium1 634c1f6752 fix: sanitize corrupted .env files on read and during migration
Fixes two corruption patterns that break API keys during updates:

1. Concatenated KEY=VALUE pairs on a single line due to missing newlines
   (e.g. ANTHROPIC_API_KEY=sk-...OPENAI_BASE_URL=https://...). Uses a
   known-keys set to safely detect and split concatenated entries without
   false-splitting values that contain uppercase text.

2. Stale KEY=*** placeholder entries left by incomplete setup runs that
   never get updated and shadow real credentials.

Changes:
- Add _sanitize_env_lines() that splits concatenated known keys and drops
  *** placeholders
- Add sanitize_env_file() public API for explicit repair
- Call sanitization in save_env_value() on every read (self-healing)
- Call sanitize_env_file() at the start of migrate_config() so existing
  corrupted files are repaired on update
- 12 new tests covering splits, placeholders, edge cases, and integration
2026-03-17 01:13:34 -07:00
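The first corruption pattern in 634c1f6752 (two KEY=VALUE pairs fused on one line) can be split with a known-keys set along these lines. This is a sketch, not the project's _sanitize_env_lines(): the function name, the three-key set, and the rule of only splitting after the first '=' are assumptions made for illustration.

```python
import re

# Illustrative subset; the real known-keys set is not shown in the commit.
KNOWN_KEYS = frozenset({"ANTHROPIC_API_KEY", "OPENAI_API_KEY", "OPENAI_BASE_URL"})


def split_concatenated(line: str, known_keys=KNOWN_KEYS) -> list[str]:
    """Split 'A=1B=2' into ['A=1', 'B=2'] when B is a known key."""
    stripped = line.strip()
    if not stripped:
        return []
    if stripped.startswith("#") or "=" not in stripped:
        return [stripped]
    names = sorted(map(re.escape, known_keys), key=len, reverse=True)
    key_re = re.compile("(?:" + "|".join(names) + ")=")
    # Only split inside the value, so a known key that is merely a
    # suffix of the line's own key name is left alone.
    first_eq = stripped.index("=")
    cuts = [m.start() for m in key_re.finditer(stripped) if m.start() > first_eq]
    bounds = [0, *cuts, len(stripped)]
    return [stripped[a:b] for a, b in zip(bounds, bounds[1:])]
```

Because only names in the known-keys set trigger a split, a value that happens to contain uppercase text is left intact.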
Teknium 6ebb816e56 Merge pull request #1634 from NousResearch/hermes/hermes-a86162db
chore: release v0.3.0 (v2026.3.17)
2026-03-17 00:55:51 -07:00
teknium1 37862f74fa chore: release v0.3.0 (v2026.3.17)
- Bump version 0.2.0 → 0.3.0
- Add comprehensive changelog (248 merged PRs, 15 contributors)
- CalVer tag: v2026.3.17
2026-03-17 00:38:48 -07:00
nidhi-singh02 67546746d4 fix(gateway): overwrite stale PID in gateway_state.json on restart
Signed-off-by: nidhi-singh02 <nidhi2894@gmail.com>
2026-03-17 13:01:55 +05:30
ShawnPana d44b6b7f1b feat(browser): multi-provider cloud browser support + Browser Use integration
Introduce a cloud browser provider abstraction so users can switch
between Local Browser, Browserbase, and Browser Use (or future providers)
via hermes tools / hermes setup.

Cloud browser providers are behind an ABC (tools/browser_providers/base.py)
so adding a new provider is a single-file addition with no changes to
browser_tool.py internals.

Changes:
- tools/browser_providers/ package with ABC, Browserbase extraction,
  and Browser Use provider
- browser_tool.py refactored to use _PROVIDER_REGISTRY + _get_cloud_provider()
  (cached) instead of hardcoded _is_local_mode() / _create_browserbase_session()
- tools_config.py: generic _is_provider_active() / _detect_active_provider_index()
  replace TTS-only logic; Browser Use added as third browser option
- config.py: BROWSER_USE_API_KEY added to OPTIONAL_ENV_VARS + show_config + allowlist
- subprocess pipe hang fix: agent-browser daemon inherits pipe fds,
  communicate() blocks. Replaced with Popen + temp files.

Original PR: #1208
Co-authored-by: ShawnPana <shawnpana@users.noreply.github.com>
2026-03-17 00:16:34 -07:00
Teknium 3576f44a57 feat: add Vercel AI Gateway provider (#1628)
* feat: add Vercel AI Gateway as a first-class provider

Adds AI Gateway (ai-gateway.vercel.sh) as a new inference provider
with AI_GATEWAY_API_KEY authentication, live model discovery, and
reasoning support via extra_body.reasoning.

Based on PR #1492 by jerilynzheng.

* feat: add AI Gateway to setup wizard, doctor, and fallback providers

* test: add AI Gateway to api_key_providers test suite

* feat: add AI Gateway to hermes model CLI and model metadata

Wire AI Gateway into the interactive model selection menu and add
context lengths for AI Gateway model IDs in model_metadata.py.

* feat: use claude-haiku-4.5 as AI Gateway auxiliary model

* revert: use gemini-3-flash as AI Gateway auxiliary model

* fix: move AI Gateway below established providers in selection order

---------

Co-authored-by: jerilynzheng <jerilynzheng@users.noreply.github.com>
Co-authored-by: jerilynzheng <zheng.jerilyn@gmail.com>
2026-03-17 00:12:16 -07:00
teknium1 4768ea624d fix: skip stale cron jobs on gateway restart instead of firing immediately
When the gateway restarts after being down past a scheduled run time,
recurring jobs (cron/interval) were firing immediately because their
next_run_at was in the past. Now jobs more than 2 minutes late are
fast-forwarded to the next future occurrence instead.

- get_due_jobs() checks staleness for cron/interval jobs
- Stale jobs get next_run_at recomputed and saved
- Jobs within 2 minutes of their schedule still fire normally
- One-shot (once) jobs are unaffected — they fire if missed

Fixes the 'cron jobs run on every gateway restart' issue.
2026-03-16 23:48:14 -07:00
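For an interval job, the staleness rule in 4768ea624d reduces to a little date arithmetic. The sketch below is hypothetical (resolve_next_run is not a function named in the commit, and cron-expression jobs would need a cron parser instead of a fixed interval); the two-minute grace period is the one stated in the commit message.

```python
from datetime import datetime, timedelta, timezone

STALE_GRACE = timedelta(minutes=2)  # from the commit message


def resolve_next_run(next_run_at: datetime, interval: timedelta, now: datetime):
    """Return (fire_now, new_next_run_at) for a recurring interval job."""
    if next_run_at > now:
        return False, next_run_at  # not due yet
    late_by = now - next_run_at
    if late_by <= STALE_GRACE:
        return True, next_run_at + interval  # close enough: fire normally
    # Stale: skip the missed runs and jump to the first future occurrence.
    missed = late_by // interval + 1
    return False, next_run_at + missed * interval
```

A job that was due three hours ago on a one-hour interval is not fired; its next run is moved to one hour from now.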
Teknium e3f9894caf fix: send_animation metadata, MarkdownV2 inline code splitting, tirith cosign-free install (#1626)
* fix: Anthropic OAuth compatibility — Claude Code identity fingerprinting

Anthropic routes OAuth/subscription requests based on Claude Code's
identity markers. Without them, requests get intermittent 500 errors
(~25% failure rate observed). This matches what pi-ai (clawdbot) and
OpenCode both implement for OAuth compatibility.

Changes (OAuth tokens only — API key users unaffected):

1. Headers: user-agent 'claude-cli/2.1.2 (external, cli)' + x-app 'cli'
2. System prompt: prepend 'You are Claude Code, Anthropic's official CLI'
3. System prompt sanitization: replace Hermes/Nous references
4. Tool names: prefix with 'mcp_' (Claude Code convention for non-native tools)
5. Tool name stripping: remove 'mcp_' prefix from response tool calls

Before: 9/12 OK, 1 hard fail, 4 needed retries (~25% error rate)
After: 16/16 OK, 0 failures, 0 retries (0% error rate)

* fix: three gateway issues from user error logs

1. send_animation missing metadata kwarg (base.py)
   - Base class send_animation lacked the metadata parameter that the
     call site in base.py line 917 passes. Telegram's override accepted
     it, but any platform without an override (Discord, Slack, etc.)
     hit TypeError. Added metadata to base class signature.

2. MarkdownV2 split-inside-inline-code (base.py truncate_message)
   - truncate_message could split at a space inside an inline code span
     (e.g. `function(arg1, arg2)`), leaving an unpaired backtick and
     unescaped parentheses in the chunk. Telegram rejects with
     'character ( is reserved'. Added inline code awareness to the
     split-point finder — detects odd backtick counts and moves the
     split before the code span.

3. tirith auto-install without cosign (tirith_security.py)
   - Previously required cosign on PATH for auto-install, blocking
     install entirely with a warning if missing. Now proceeds with
     SHA-256 checksum verification only when cosign is unavailable.
     Cosign is still used for full supply chain verification when
     present. If cosign IS present but verification explicitly fails,
     install is still aborted (tampered release).
2026-03-16 23:39:41 -07:00
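The inline-code awareness described in item 2 of e3f9894caf (detect an odd backtick count and move the split before the code span) can be sketched as follows. The function name is hypothetical and the sketch deliberately ignores fenced blocks, which the real split-point finder presumably handles separately.

```python
def find_split_point(text: str, max_length: int) -> int:
    """Pick a split index <= max_length that is not inside an inline code span."""
    cut = text.rfind(" ", 0, max_length)
    if cut <= 0:
        cut = max_length  # no space available: hard split
    # An odd number of backticks before the cut means a code span is still open.
    if text.count("`", 0, cut) % 2 == 1:
        span_start = text.rfind("`", 0, cut)
        if span_start > 0:
            cut = span_start  # split just before the span opens
    return cut
```

For the example from the commit message, a cut that would land on the space after `arg1,` is moved back to just before the opening backtick, so the chunk carries no unpaired backtick or unescaped parenthesis.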
teknium1 19c8ad3d3d fix: add Claude Code user-agent to OAuth token exchange/refresh requests
Anthropic's token endpoint is behind Cloudflare which blocks Python's
default urllib user-agent (Python-urllib/3.x). Without a proper
user-agent, the token exchange returns 403 (Cloudflare error 1010).

Adds 'claude-cli/2.1.2 (external, cli)' user-agent to all three
OAuth HTTP requests:
- Initial token exchange (authorization_code grant)
- Hermes token refresh (refresh_token grant)
- Claude Code credential refresh (refresh_token grant)

Verified: full OAuth PKCE flow now works end-to-end.
2026-03-16 23:26:43 -07:00
teknium1 bd3b0c712b fix: make OAuth login URL prominent for SSH/headless users
The URL is now the primary element — displayed in a bordered box
before the browser auto-open attempt. Works for users who SSH into
remote servers where webbrowser.open() silently fails.
2026-03-16 23:21:30 -07:00
Teknium 46176c8029 refactor: centralize slash command registry (#1603)
* refactor: centralize slash command registry

Replace 7+ scattered command definition sites with a single
CommandDef registry in hermes_cli/commands.py. All downstream
consumers now derive from this registry:

- CLI process_command() resolves aliases via resolve_command()
- Gateway _known_commands uses GATEWAY_KNOWN_COMMANDS frozenset
- Gateway help text generated by gateway_help_lines()
- Telegram BotCommands generated by telegram_bot_commands()
- Slack subcommand map generated by slack_subcommand_map()

Adding a command or alias is now a one-line change to
COMMAND_REGISTRY instead of touching 6+ files.

Bugfixes included:
- Telegram now registers /rollback, /background (were missing)
- Slack now has /voice, /update, /reload-mcp (were missing)
- Gateway duplicate 'reasoning' dispatch (dead code) removed
- Gateway help text can no longer drift from CLI help

Backwards-compatible: COMMANDS and COMMANDS_BY_CATEGORY dicts are
rebuilt from the registry, so existing imports work unchanged.

* docs: update developer docs for centralized command registry

Update AGENTS.md with full 'Slash Command Registry' and 'Adding a
Slash Command' sections covering CommandDef fields, registry helpers,
and the one-line alias workflow.

Also update:
- CONTRIBUTING.md: commands.py description
- website/docs/reference/slash-commands.md: reference central registry
- docs/plans/centralize-command-registry.md: mark COMPLETED
- plans/checkpoint-rollback.md: reference new pattern
- hermes-agent-dev skill: architecture table

* chore: remove stale plan docs
2026-03-16 23:21:03 -07:00
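The single-registry design in 46176c8029 can be pictured with a small sketch. CommandDef, COMMAND_REGISTRY, and resolve_command are names from the commit message, but the fields, descriptions, and signatures below are assumptions for illustration only and do not reproduce hermes_cli/commands.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandDef:
    name: str
    description: str  # illustrative text, not the project's wording
    aliases: tuple = ()


COMMAND_REGISTRY = (
    CommandDef("background", "Run a prompt in the background", aliases=("bg",)),
    CommandDef("rollback", "Restore an earlier checkpoint"),
    CommandDef("model", "Show or switch the active model"),
)

# Every name and alias maps to its canonical command name.
_LOOKUP = {
    key: cmd.name
    for cmd in COMMAND_REGISTRY
    for key in (cmd.name, *cmd.aliases)
}


def resolve_command(name: str):
    """Return the canonical command name, or None if unknown."""
    return _LOOKUP.get(name.lstrip("/").lower())


def help_lines() -> list[str]:
    """Help text derived from the registry, so it cannot drift."""
    return [f"/{cmd.name} - {cmd.description}" for cmd in COMMAND_REGISTRY]
```

Adding an alias is then a one-line change to the registry entry, and every consumer that derives from the registry picks it up.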
teknium1 b798062501 fix: improve OAuth login UX for headless/SSH users
Put the authorization URL front and center instead of treating it as
a fallback. Most Hermes users run on remote servers via SSH where
webbrowser.open() silently fails.
2026-03-16 23:17:29 -07:00
teknium1 63e88326a8 feat: Hermes-native PKCE OAuth flow for Claude Pro/Max subscriptions
Adds our own OAuth login and token refresh flow, independent of Claude
Code CLI. Mirrors the PKCE flow used by pi-ai (clawdbot) and OpenCode:

- run_hermes_oauth_login(): full PKCE authorization code flow
  - Opens browser to claude.ai/oauth/authorize
  - User pastes code#state back
  - Exchanges for access + refresh tokens
  - Stores in ~/.hermes/.anthropic_oauth.json (our own file)
  - Also writes to ~/.claude/.credentials.json for backward compat

- refresh_hermes_oauth_token(): automatic token refresh
  - POST to console.anthropic.com/v1/oauth/token with refresh_token
  - Updates both credential files on success

- Credential resolution priority updated:
  1. ANTHROPIC_TOKEN env var
  2. CLAUDE_CODE_OAUTH_TOKEN env var
  3. Hermes OAuth credentials (~/.hermes/.anthropic_oauth.json) ← NEW
  4. Claude Code credentials (~/.claude/.credentials.json)
  5. ANTHROPIC_API_KEY env var

Uses same CLIENT_ID, endpoints, scopes, and PKCE parameters as
Claude Code / OpenCode / pi-ai. Token refresh happens automatically
before each API call via _try_refresh_anthropic_client_credentials.
2026-03-16 23:15:56 -07:00
Teknium 474301adc6 fix: improve execute_code error logging and harden cleanup (#1623)
* fix(tools): improve error logging in code_execution_tool

* fix: harden execute_code cleanup and reduce logging noise

Follow-up to cherry-picked PR #1588 (aydnOktay):
- Initialize server_sock = None before try block to prevent NameError
  if exception occurs before socket creation (line 413 is inside the try)
- Guard server_sock.close() with None check
- Narrow cleanup exception handlers to OSError (the actual error type)
- Remove exc_info=True from cleanup debug logs — benign teardown
  failures don't need stack traces, the message is sufficient
- Remove redundant try/except around shutil.rmtree(ignore_errors=True)
- Silence sock_path unlink with pass — expected when already cleaned up

---------

Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
2026-03-16 23:13:26 -07:00
DeadMan 285300528b fix: isolate test_anthropic_adapter from local credentials
Two tests lacked filesystem isolation causing them to pick up real
~/.claude/.credentials.json tokens on machines with Claude Code installed.

- test_prefers_oauth_token_over_api_key: add tmp_path, mock Path.home,
  clear CLAUDE_CODE_OAUTH_TOKEN env
- test_falls_back_to_token: same isolation

Also commit run_agent.py generic-400 retry fix.
2026-03-16 22:53:32 -07:00
Verne 673f132151 fix(gateway): Recover stale service state
Repair stale launchd/systemd definitions during install and
teach launchd start to reload unloaded jobs before retrying.

Stop masking service restart failures by falling back to a
foreground gateway when a configured service manager is still
broken.

Refs: #1613
2026-03-17 11:05:28 +08:00
0xbyt4 8d0a96a8bf fix: context counter shows cached token count in status bar
Anthropic prompt caching splits input into cache_read_input_tokens,
cache_creation_input_tokens, and non-cached input_tokens. The context
counter only read input_tokens (non-cached portion), showing ~3 tokens
instead of the real ~18K total. Now includes cached portions for
Anthropic native provider only — other providers (OpenAI, OpenRouter,
Codex) already include cached tokens in their prompt_tokens field.

Before: 3/200K | 0%
After: 17.7K/200K | 9%
2026-03-17 05:06:11 +03:00
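The counting fix in 8d0a96a8bf comes down to which usage fields are summed. The function below is a hypothetical sketch operating on a plain dict; the field names are the ones quoted in the commit message, and the provider string "anthropic" is an assumed identifier.

```python
def context_tokens(usage: dict, provider: str) -> int:
    """Total prompt-side tokens to show in the context counter."""
    if provider == "anthropic":
        # Anthropic reports cached and non-cached input separately.
        return (
            (usage.get("input_tokens") or 0)
            + (usage.get("cache_read_input_tokens") or 0)
            + (usage.get("cache_creation_input_tokens") or 0)
        )
    # Other providers already fold cached tokens into prompt_tokens.
    return usage.get("prompt_tokens") or 0
```

With caching active, input_tokens alone can be single digits while the sum reflects the real context size.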
SHL0MS cfa87e77a9 Merge pull request #1598 from NousResearch/shloms/ascii-video-v3
Refactor ascii-video skill: creative-first SKILL.md, consolidate references
2026-03-16 20:46:12 -04:00
Teknium 60e38e82ec fix: auto-detect D-Bus session bus for systemctl --user on headless servers (#1601)
* fix: Anthropic OAuth compatibility — Claude Code identity fingerprinting

Anthropic routes OAuth/subscription requests based on Claude Code's
identity markers. Without them, requests get intermittent 500 errors
(~25% failure rate observed). This matches what pi-ai (clawdbot) and
OpenCode both implement for OAuth compatibility.

Changes (OAuth tokens only — API key users unaffected):

1. Headers: user-agent 'claude-cli/2.1.2 (external, cli)' + x-app 'cli'
2. System prompt: prepend 'You are Claude Code, Anthropic's official CLI'
3. System prompt sanitization: replace Hermes/Nous references
4. Tool names: prefix with 'mcp_' (Claude Code convention for non-native tools)
5. Tool name stripping: remove 'mcp_' prefix from response tool calls

Before: 9/12 OK, 1 hard fail, 4 needed retries (~25% error rate)
After: 16/16 OK, 0 failures, 0 retries (0% error rate)

* fix: auto-detect DBUS_SESSION_BUS_ADDRESS for systemctl --user on headless servers

On SSH sessions to headless servers, DBUS_SESSION_BUS_ADDRESS and
XDG_RUNTIME_DIR may not be set even when the user's systemd instance
is running via linger. This causes 'systemctl --user' to fail with
'Failed to connect to bus: No medium found', breaking gateway
restart/start/stop as a service and falling back to foreground mode.

Add _ensure_user_systemd_env() that detects the standard D-Bus socket
at /run/user/<UID>/bus and sets the env vars before any systemctl --user
call. Called from _systemctl_cmd() so all existing call sites benefit
automatically with zero changes.

Fixes: gateway restart falling back to foreground on headless servers

* fix: show linger guidance when gateway restart fails during 'hermes update' or 'hermes gateway restart'

When systemctl --user restart fails during 'hermes update' or
'hermes gateway restart', check linger status and tell the user
exactly what to run (sudo -S -p '' loginctl enable-linger) instead of
silently falling back to foreground mode.

Also applies _ensure_user_systemd_env() to the raw systemctl calls
in cmd_update so they work properly on SSH sessions where D-Bus
env vars are missing.
2026-03-16 17:45:48 -07:00
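The D-Bus detection described in 60e38e82ec can be sketched as below. This is not the project's _ensure_user_systemd_env(): the parameters exist only to make the sketch testable, and using setdefault() so that values already present in the environment are preserved is an assumption. The socket location /run/user/<UID>/bus is the one given in the commit message.

```python
import os
from pathlib import Path


def ensure_user_systemd_env(env=None, uid=None, runtime_root=Path("/run/user")) -> bool:
    """Point systemctl --user at the user's D-Bus socket if it exists.

    Returns True when the socket was found and the variables are set.
    """
    env = os.environ if env is None else env
    uid = os.getuid() if uid is None else uid
    runtime_dir = Path(runtime_root) / str(uid)
    bus = runtime_dir / "bus"
    if not bus.exists():
        return False  # no user systemd instance (linger may be disabled)
    # Assumption: keep any values the session already provides.
    env.setdefault("XDG_RUNTIME_DIR", str(runtime_dir))
    env.setdefault("DBUS_SESSION_BUS_ADDRESS", f"unix:path={bus}")
    return True
```

Calling this once before building any systemctl --user command means existing call sites need no changes; a False result is the point at which linger guidance would be shown.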
Teknium ce430fed4c installer: clarify why sudo is needed at every prompt (#1602)
* installer: clarify why sudo is needed at every prompt

Every sudo prompt now explicitly states what packages are being installed
and that Hermes Agent itself does not require or retain root access.
Covers system packages, build tools, and Playwright browser deps.
2026-03-16 17:43:48 -07:00
Teknium 6794e79bb4 feat: add /bg as alias for /background slash command (#1590)
* feat: add optional smart model routing

Add a conservative cheap-vs-strong routing option that can send very short/simple turns to a cheaper model across providers while keeping the primary model for complex work. Wire it through CLI, gateway, and cron, and document the config.yaml workflow.

* fix(gateway): remove recursive ExecStop from systemd units, extend TimeoutStopSec to 60s

* fix(gateway): avoid recursive ExecStop in user systemd unit

* fix: extend ExecStop removal and TimeoutStopSec=60 to system unit

The cherry-picked PR #1448 fix only covered the user systemd unit.
The system unit had the same TimeoutStopSec=15 and could benefit
from the same 60s timeout for clean shutdown. Also adds a regression
test for the system unit.

---------

Co-authored-by: Ninja <ninja@local>

* feat(skills): add blender-mcp optional skill for 3D modeling

Control a running Blender instance from Hermes via socket connection
to the blender-mcp addon (port 9876). Supports creating 3D objects,
materials, animations, and running arbitrary bpy code.

Placed in optional-skills/ since it requires Blender 4.3+ desktop
with a third-party addon manually started each session.

* feat(acp): support slash commands in ACP adapter (#1532)

Adds /help, /model, /tools, /context, /reset, /compact, /version
to the ACP adapter (VS Code, Zed, JetBrains). Commands are handled
directly in the server without instantiating the TUI — each command
queries agent/session state and returns plain text.

Unrecognized /commands fall through to the LLM as normal messages.

/model uses detect_provider_for_model() for auto-detection when
switching models, matching the CLI and gateway behavior.

Fixes #1402

* fix(logging): improve error logging in session search tool (#1533)

* fix(gateway): restart on retryable startup failures (#1517)

* feat(email): add skip_attachments option via config.yaml

Adds a config.yaml-driven option to skip email attachments in the
gateway email adapter. Useful for malware protection and bandwidth
savings.

Configure in config.yaml:
  platforms:
    email:
      skip_attachments: true

Based on PR #1521 by @an420eth, changed from env var to config.yaml
(via PlatformConfig.extra) to match the project's config-first pattern.

* docs: document skip_attachments option for email adapter

* fix(telegram): retry on transient TLS failures during connect and send

Add exponential-backoff retry (3 attempts) around initialize() to
handle transient TLS resets during gateway startup. Also catches
TimedOut and OSError in addition to NetworkError.

Add exponential-backoff retry (3 attempts) around send_message() for
NetworkError during message delivery, wrapping the existing Markdown
fallback logic.

Both imports are guarded with try/except ImportError for test
environments where telegram is mocked.

Based on PR #1527 by cmd8. Closes #1526.
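
A rough sketch of the retry pattern described above, assuming a synchronous callable for brevity (the real adapter is presumably async). The helper name, the delay values, and the injectable sleep are illustrative assumptions, not the adapter's actual code.

```python
import time


def retry_with_backoff(func, attempts=3, base_delay=1.0,
                       retry_on=(OSError,), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff.

    Hypothetical helper: waits base_delay, then 2x, 4x, ... between attempts
    and re-raises the last error once the attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * (2 ** attempt))
```

Passing sleep as a parameter is only there to make the backoff schedule testable without real delays.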

* feat: permissive block_anchor thresholds and unicode normalization (#1539)

Salvaged from PR #1528 by an420eth. Closes #517.

Improves _strategy_block_anchor in fuzzy_match.py:
- Add unicode normalization (smart quotes, em/en-dashes, ellipsis,
  non-breaking spaces → ASCII) so LLM-produced unicode artifacts
  don't break anchor line matching
- Lower thresholds: 0.10 for unique matches (was 0.70), 0.30 for
  multiple candidates — if first/last lines match exactly, the
  block is almost certainly correct
- Use original (non-normalized) content for offset calculation to
  preserve correct character positions

Tested: 3 new scenarios fixed (em-dash anchors, non-breaking space
anchors, very-low-similarity unique matches), zero regressions on
all 9 existing fuzzy match tests.
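
The unicode normalization step might look something like the sketch below. The mapping table and function name are assumptions for illustration; the actual table in fuzzy_match.py may cover more characters.

```python
# Hypothetical mapping of common LLM-produced unicode artifacts to ASCII.
UNICODE_TO_ASCII = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis (expands to three characters)
    "\u00a0": " ",    # non-breaking space
}
_TRANSLATION = str.maketrans(UNICODE_TO_ASCII)


def normalize_unicode(text):
    """Return text with smart punctuation replaced by ASCII equivalents."""
    return text.translate(_TRANSLATION)
```

Because the ellipsis expands from one character to three, offsets computed on the normalized string no longer line up with the original, which is consistent with the note above about using the original content for offset calculation.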

Co-authored-by: an420eth <an420eth@users.noreply.github.com>

* feat(cli): add file path autocomplete in the input prompt (#1545)

When typing a path-like token (./  ../  ~/  /  or containing /),
the CLI now shows filesystem completions in the dropdown menu.
Directories show a trailing slash and 'dir' label; files show
their size. Completions are case-insensitive and capped at 30
entries.

Triggered by tokens like:
  edit ./src/ma     → shows ./src/main.py, ./src/manifest.json, ...
  check ~/doc       → shows ~/docs/, ~/documents/, ...
  read /etc/hos     → shows /etc/hosts, /etc/hostname, ...
  open tools/reg    → shows tools/registry.py

Slash command autocomplete (/help, /model, etc.) is unaffected —
it still triggers when the input starts with /.

Inspired by OpenCode PR #145 (file path completion menu).

Implementation:
- hermes_cli/commands.py: _extract_path_word() detects path-like
  tokens, _path_completions() yields filesystem Completions with
  size labels, get_completions() routes to paths vs slash commands
- tests/hermes_cli/test_path_completion.py: 26 tests covering
  path extraction, prefix filtering, directory markers, home
  expansion, case-insensitivity, integration with slash commands
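
As a rough illustration of the routing described above, the sketch below classifies the current input as a slash command, a path-like token, or neither. The function name and return shape are assumptions; the real _extract_path_word() and get_completions() may apply different rules.

```python
def classify_input(text):
    """Return ("slash", text), ("path", word), or (None, None).

    Hypothetical sketch: input that starts with "/" is treated as a slash
    command; otherwise the last token counts as path-like if it contains "/"
    (which also covers the ./, ../ and ~/ prefixes).
    """
    if text.startswith("/"):
        return ("slash", text)
    if not text or text[-1].isspace():
        return (None, None)  # cursor is not inside a token
    word = text.split()[-1]
    if "/" in word:
        return ("path", word)
    return (None, None)
```

For example, classify_input("read /etc/hos") yields a path token because the slash is not at the start of the input, while "/mod" is routed to slash command completion.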

* feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled

Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:

- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)

Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.
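
A minimal sketch of deterministic ID hashing along these lines. The truncation length and the "platform:numeric" chat ID layout are assumptions made for the example; the commit message does not specify either.

```python
import hashlib


def hash_user_id(user_id, length=12):
    """Map a user ID or phone number to a stable pseudonym like user_<hex>."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).hexdigest()
    return f"user_{digest[:length]}"  # truncation length is an assumption


def hash_chat_id(chat_id, length=12):
    """Hash the trailing ID portion, keeping any 'platform:' prefix intact.

    Assumes chat IDs look like 'telegram:123456'; IDs without a prefix are
    hashed whole.
    """
    prefix, sep, numeric = str(chat_id).rpartition(":")
    digest = hashlib.sha256(numeric.encode("utf-8")).hexdigest()[:length]
    return f"{prefix}{sep}{digest}"
```

The same input always produces the same pseudonym, so the model can still tell participants apart in a group chat without seeing the raw identifier.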

Inspired by OpenClaw PR #47959.

* fix(privacy): skip PII redaction on Discord/Slack (mentions need real IDs)

Discord uses <@user_id> for mentions and Slack uses <@U12345> — the LLM
needs the real ID to tag users. Redaction now only applies to WhatsApp,
Signal, and Telegram where IDs are pure routing metadata.

Add 4 platform-specific tests covering Discord, WhatsApp, Signal, Slack.

* feat: smart approvals + /stop command (inspired by OpenAI Codex)

* feat: smart approvals — LLM-based risk assessment for dangerous commands

Adds a 'smart' approval mode that uses the auxiliary LLM to assess
whether a flagged command is genuinely dangerous or a false positive,
auto-approving low-risk commands without prompting the user.

Inspired by OpenAI Codex's Smart Approvals guardian subagent
(openai/codex#13860).

Config (config.yaml):
  approvals:
    mode: manual   # manual (default), smart, off

Modes:
- manual — current behavior, always prompt the user
- smart  — aux LLM evaluates risk: APPROVE (auto-allow), DENY (block),
           or ESCALATE (fall through to manual prompt)
- off    — skip all approval prompts (equivalent to --yolo)

When smart mode auto-approves, the pattern gets session-level approval
so subsequent uses of the same pattern don't trigger another LLM call.
When it denies, the command is blocked without user prompt. When
uncertain, it escalates to the normal manual approval flow.

The LLM prompt is carefully scoped: it sees only the command text and
the flagged reason, assesses actual risk vs false positive, and returns
a single-word verdict.
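
The three modes and the verdict handling could be sketched as below. The function names and the returned action strings are hypothetical; the one design choice worth noting is that anything unparseable falls back to ESCALATE, so a malformed LLM reply can never auto-approve a command.

```python
VERDICTS = {"APPROVE", "DENY", "ESCALATE"}


def parse_verdict(reply):
    """Extract a single-word verdict from the LLM reply, defaulting to ESCALATE."""
    words = reply.strip().split()
    first = words[0].strip(".,:;!\"'`*").upper() if words else ""
    return first if first in VERDICTS else "ESCALATE"


def decide(mode, reply=None):
    """Return 'allow', 'block', or 'prompt' for a flagged command.

    Hypothetical sketch of the manual / smart / off modes described above.
    """
    if mode == "off":
        return "allow"
    if mode != "smart":
        return "prompt"  # manual (and any unknown mode) always asks the user
    verdict = parse_verdict(reply or "")
    return {"APPROVE": "allow", "DENY": "block", "ESCALATE": "prompt"}[verdict]
```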

* feat: make smart approval model configurable via config.yaml

Adds auxiliary.approval section to config.yaml with the same
provider/model/base_url/api_key pattern as other aux tasks (vision,
web_extract, compression, etc.).

Config:
  auxiliary:
    approval:
      provider: auto
      model: ''        # fast/cheap model recommended
      base_url: ''
      api_key: ''

Bridged to env vars in both CLI and gateway paths so the aux client
picks them up automatically.

* feat: add /stop command to kill all background processes

Adds a /stop slash command that kills all running background processes
at once. Currently users have to call process(list) and then process(kill)
for each one individually.

Inspired by OpenAI Codex's separation of interrupt (Ctrl+C stops current
turn) from /stop (cleans up background processes). See openai/codex#14602.

Ctrl+C continues to only interrupt the active agent turn — background
dev servers, watchers, etc. are preserved. /stop is the explicit way
to clean them all up.

* feat: first-class plugin architecture + hide status bar cost by default (#1544)

The persistent status bar now shows context %, token counts, and
duration but NOT $ cost by default. Cost display is opt-in via:

  display:
    show_cost: true

in config.yaml, or: hermes config set display.show_cost true

The /usage command still shows full cost breakdown since the user
explicitly asked for it — this only affects the always-visible bar.

Status bar without cost:
  ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ 15m

Status bar with show_cost: true:
  ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ $0.06 │ 15m

* feat: improve memory prioritization + aggressive skill updates (inspired by OpenAI Codex)

* feat: improve memory prioritization — user preferences over procedural knowledge

Inspired by OpenAI Codex's memory prompt improvements (openai/codex#14493)
which focus memory writes on user preferences and recurring patterns
rather than procedural task details.

Key insight: 'Optimize for reducing future user steering — the most
valuable memory prevents the user from having to repeat themselves.'

Changes:
- MEMORY_GUIDANCE (prompt_builder.py): added prioritization hierarchy
  and the core principle about reducing user steering
- MEMORY_SCHEMA (memory_tool.py): reordered WHEN TO SAVE list to put
  corrections first, added explicit PRIORITY guidance
- Memory nudge (run_agent.py): now asks specifically about preferences,
  corrections, and workflow patterns instead of generic 'anything'
- Memory flush (run_agent.py): now instructs to prioritize user
  preferences and corrections over task-specific details

* feat: more aggressive skill creation and update prompting

Press harder on skill updates — the agent should proactively patch
skills when it encounters issues during use, not wait to be asked.

Changes:
- SKILLS_GUIDANCE: 'consider saving' → 'save'; added explicit instruction
  to patch skills immediately when found outdated/wrong
- Skills header: added instruction to update loaded skills before finishing
  if they had missing steps or wrong commands
- Skill nudge: more assertive ('save the approach' not 'consider saving'),
  now also prompts for updating existing skills used in the task
- Skill nudge interval: lowered default from 15 to 10 iterations
- skill_manage schema: added 'patch it immediately' to update triggers

* feat: first-class plugin architecture (#1555)

Plugin system for extending Hermes with custom tools, hooks, and
integrations — no source code changes required.

Core system (hermes_cli/plugins.py):
  - Plugin discovery from ~/.hermes/plugins/, .hermes/plugins/, and
    pip entry_points (hermes_agent.plugins group)
  - PluginContext with register_tool() and register_hook()
  - 6 lifecycle hooks: pre/post tool_call, pre/post llm_call,
    on_session_start/end
  - Namespace package handling for relative imports in plugins
  - Graceful error isolation — broken plugins never crash the agent

Integration (model_tools.py):
  - Plugin discovery runs after built-in + MCP tools
  - Plugin tools bypass toolset filter via get_plugin_tool_names()
  - Pre/post tool call hooks fire in handle_function_call()

CLI:
  - /plugins command shows loaded plugins, tool counts, status
  - Added to COMMANDS dict for autocomplete

Docs:
  - Getting started guide (build-a-hermes-plugin.md) — full tutorial
    building a calculator plugin step by step
  - Reference page (features/plugins.md) — quick overview + tables
  - Covers: file structure, schemas, handlers, hooks, data files,
    bundled skills, env var gating, pip distribution, common mistakes

Tests: 16 tests covering discovery, loading, hooks, tool visibility.
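
To make the registration and error-isolation ideas concrete, here is a toy model. The PluginContext class below is a stand-in written for this example; the real class in hermes_cli/plugins.py has the register_tool() and register_hook() methods named above, but their signatures are assumed here, as are load_plugins() and fire_hook().

```python
class PluginContext:
    """Toy stand-in for the real PluginContext (signatures are assumptions)."""

    def __init__(self):
        self.tools = {}
        self.hooks = {}

    def register_tool(self, name, handler):
        self.tools[name] = handler

    def register_hook(self, event, callback):
        self.hooks.setdefault(event, []).append(callback)


def load_plugins(register_funcs, ctx):
    """Run each plugin's register(ctx); collect failures instead of raising."""
    failed = []
    for name, register in register_funcs.items():
        try:
            register(ctx)
        except Exception as exc:  # a broken plugin must not crash the agent
            failed.append((name, repr(exc)))
    return failed


def fire_hook(ctx, event, **kwargs):
    """Invoke every callback for an event, ignoring callbacks that raise."""
    for callback in ctx.hooks.get(event, []):
        try:
            callback(**kwargs)
        except Exception:
            pass
```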

* feat: add /bg as alias for /background slash command

Adds /bg alias across CLI, gateway, and Slack platform adapter.
Updates help text, autocomplete, known_commands set, and dispatch
logic. Includes tests for the new alias.

* docs: add plan for centralized slash command registry

Scopes a refactor to replace 7+ scattered command definition sites
with a single CommandDef registry in hermes_cli/commands.py. Includes
derived helper functions for gateway help text, Telegram BotCommands,
Slack subcommand maps, and alias resolution.

Documents current drift (Telegram missing /rollback + /background,
Slack missing /voice + /update, gateway dead code) that the refactor
fixes for free.
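
A single-registry design of the kind the plan describes might look like this sketch. CommandDef is the name used above, but its fields, the sample entries, and the helper functions are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CommandDef:
    name: str
    description: str
    aliases: Tuple[str, ...] = ()


# Hypothetical registry entries; the descriptions are made up for the example.
COMMAND_REGISTRY = (
    CommandDef("background", "Run a prompt in the background", aliases=("bg",)),
    CommandDef("help", "Show available commands"),
)


def build_lookup(registry):
    """Map every name and alias to its CommandDef, rejecting collisions."""
    lookup = {}
    for cmd in registry:
        for key in (cmd.name, *cmd.aliases):
            if key in lookup:
                raise ValueError(f"duplicate command or alias: {key}")
            lookup[key] = cmd
    return lookup


def resolve(text, lookup):
    """Return the canonical command name for '/cmd args', or None."""
    parts = text.strip().split()
    if not parts or not parts[0].startswith("/"):
        return None
    cmd = lookup.get(parts[0][1:].lower())
    return cmd.name if cmd else None


def help_lines(registry):
    """Derive help text from the registry so adapters cannot drift apart."""
    lines = []
    for cmd in registry:
        names = ", ".join("/" + n for n in (cmd.name, *cmd.aliases))
        lines.append(f"{names}: {cmd.description}")
    return lines
```

Deriving help text and alias maps from one tuple is what would remove the per-platform drift listed above, since a platform adapter can no longer omit a command by accident.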

---------

Co-authored-by: Ninja <ninja@local>
Co-authored-by: alireza78a <alireza78a@users.noreply.github.com>
Co-authored-by: Oktay Aydin <113846926+aydnOktay@users.noreply.github.com>
Co-authored-by: JP Lew <polydegen@protonmail.com>
Co-authored-by: an420eth <an420eth@users.noreply.github.com>
2026-03-16 17:27:02 -07:00
Teknium 181077b785 fix: hide Honcho session line on CLI load when no API key configured (#1582)
HonchoClientConfig.from_env() set enabled=True unconditionally,
even when HONCHO_API_KEY was not set. When ~/.honcho/config.json
didn't exist, from_global_config() fell back to from_env() and
returned enabled=True with a null api_key, causing the Honcho
session indicator to display on every CLI launch.

Fix: from_env() now sets enabled=bool(api_key), matching the
auto-enable logic already used in from_global_config().
Also added api_key guard to the CLI display as defense-in-depth.
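
The fix boils down to deriving the enabled flag from the presence of a key. A simplified sketch follows; the ClientConfig class is a stand-in for HonchoClientConfig, which has more fields than shown here.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientConfig:
    """Simplified stand-in for HonchoClientConfig (illustrative only)."""

    api_key: Optional[str]
    enabled: bool

    @classmethod
    def from_env(cls, environ=None):
        environ = os.environ if environ is None else environ
        # Treat an unset or empty variable the same way: no key, not enabled.
        api_key = environ.get("HONCHO_API_KEY") or None
        return cls(api_key=api_key, enabled=bool(api_key))
```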
2026-03-16 17:22:52 -07:00
SHL0MS 63635744bf Refactor ascii-video skill: creative-first SKILL.md, consolidate reference files 2026-03-16 20:11:12 -04:00
Teknium 2158c44efd fix: Anthropic OAuth compatibility — Claude Code identity fingerprinting (#1597)
Anthropic routes OAuth/subscription requests based on Claude Code's
identity markers. Without them, requests get intermittent 500 errors
(~25% failure rate observed). This matches what pi-ai (clawdbot) and
OpenCode both implement for OAuth compatibility.

Changes (OAuth tokens only — API key users unaffected):

1. Headers: user-agent 'claude-cli/2.1.2 (external, cli)' + x-app 'cli'
2. System prompt: prepend 'You are Claude Code, Anthropic's official CLI'
3. System prompt sanitization: replace Hermes/Nous references
4. Tool names: prefix with 'mcp_' (Claude Code convention for non-native tools)
5. Tool name stripping: remove 'mcp_' prefix from response tool calls

Before: 9/12 OK, 1 hard fail, 4 needed retries (~25% error rate)
After: 16/16 OK, 0 failures, 0 retries (0% error rate)
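
Items 4 and 5 form a round trip: names are prefixed on the way out and stripped on the way back. A minimal sketch, assuming tool definitions are dicts with a "name" key; the helper names are hypothetical.

```python
MCP_PREFIX = "mcp_"


def prefix_tool_names(tools):
    """Return copies of the tool definitions with 'mcp_' prepended to each name."""
    return [{**tool, "name": MCP_PREFIX + tool["name"]} for tool in tools]


def strip_tool_prefix(name):
    """Undo the prefix on a tool name returned in a response tool call."""
    return name[len(MCP_PREFIX):] if name.startswith(MCP_PREFIX) else name
```

One caveat with this naive version: a tool whose own name already starts with mcp_ would lose that prefix if strip_tool_prefix were applied to a name that never went through prefix_tool_names.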
2026-03-16 17:08:22 -07:00
Teknium e6cf1c94a8 Merge pull request #1585 from 0xbyt4/fix/anthropic-error-handling
fix(anthropic): retry 429/529 errors and surface error details to users
2026-03-16 15:46:06 -07:00
0xbyt4 d998cac319 fix(anthropic): retry 429/529 errors and surface error details to users
- 429 rate limit and 529 overloaded were incorrectly treated as
  non-retryable client errors, causing immediate failure instead of
  exponential backoff retry. Users hitting Anthropic rate limits got
  silent failures or no response at all.
- Generic "Sorry, I encountered an unexpected error" now includes
  error type, details, and status-specific hints (auth, rate limit,
  overloaded).
- Failed agent with final_response=None now surfaces the actual
  error message instead of returning an empty response.
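
The classification change can be sketched as follows. Only the handling of 429 and 529 comes from the description above; treating other 5xx codes as retryable and the wording of the hints are assumptions made for the example.

```python
def is_retryable(status):
    """Decide whether an HTTP status should trigger a backoff retry."""
    if status in (429, 529):  # rate limited / overloaded
        return True
    # Assumption: other server errors are retried, other 4xx are not.
    return 500 <= status < 600


def status_hint(status):
    """Return a short, status-specific hint for the user (wording is illustrative)."""
    if status in (401, 403):
        return "Check your API key or OAuth login."
    if status == 429:
        return "Rate limit reached; retrying with backoff."
    if status == 529:
        return "The provider is overloaded; retrying with backoff."
    return ""
```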
2026-03-17 01:07:11 +03:00
Teknium 6c84e26e70 Merge pull request #1538 from NousResearch/hermes/hermes-a098c323
feat: unified streaming infrastructure — real-time token delivery for CLI + gateway
2026-03-16 14:22:57 -07:00
teknium1 f4d61c168b merge: resolve conflicts with main (show_cost, turn routing, docker docs) 2026-03-16 14:22:38 -07:00
teknium1 8feb9e4656 docs: add streaming section to configuration guide 2026-03-16 12:53:49 -07:00
teknium1 25a1f1867f fix(gateway): prevent message flooding on adapters without edit support
When the stream consumer's first edit_message() call fails (Signal,
Email, HomeAssistant don't support editing), it now disables editing
for the rest of the stream instead of falling back to sending a new
message every 0.3 seconds. The final response is delivered by the
normal send path since already_sent stays false.

Without this fix, enabling gateway streaming on Signal/Email/HA would
flood the chat with dozens of partial messages.
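
One way to picture the behavior is the simplified consumer below. It is a sketch under assumptions: the class layout, the adapter methods' signatures, and the exact point at which already_sent is set are guesses, and the real consumer is presumably async.

```python
class StreamConsumer:
    """Simplified model of a stream consumer that stops editing after a failure."""

    def __init__(self, adapter, chat_id):
        self.adapter = adapter
        self.chat_id = chat_id
        self.message_id = None
        self.edit_supported = True
        self.already_sent = False

    def push(self, partial_text):
        """Called on each flush interval with the text accumulated so far."""
        if not self.edit_supported:
            return  # editing disabled: stay quiet until finish()
        try:
            if self.message_id is None:
                self.message_id = self.adapter.send(self.chat_id, partial_text)
            else:
                self.adapter.edit_message(self.chat_id, self.message_id, partial_text)
                self.already_sent = True
        except Exception:
            # First failure disables streaming instead of sending new messages.
            self.edit_supported = False

    def finish(self, final_text):
        """Deliver the final response by edit if streaming worked, else by send."""
        if self.already_sent:
            self.adapter.edit_message(self.chat_id, self.message_id, final_text)
        else:
            self.adapter.send(self.chat_id, final_text)
```

With an adapter that cannot edit, this model produces two messages in total (the first partial and the final response) rather than one per flush interval.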
2026-03-16 12:41:28 -07:00
Teknium 5e5c92663d fix: hermes update causes dual gateways on macOS (launchd) (#1567)
* fix: hermes update causes dual gateways on macOS (launchd)

Three bugs worked together to create the dual-gateway problem:

1. cmd_update only checked systemd for gateway restart, completely
   ignoring launchd on macOS. After killing the PID it would print
   'Restart it with: hermes gateway run' even when launchd was about
   to auto-respawn the process.

2. launchd's KeepAlive.SuccessfulExit=false respawns the gateway
   after SIGTERM (non-zero exit), so the user's manual restart
   created a second instance.

3. The launchd plist lacked --replace (systemd had it), so the
   respawned gateway didn't kill stale instances on startup.

Fixes:
- Add --replace to launchd ProgramArguments (matches systemd)
- Add launchd detection to cmd_update's auto-restart logic
- Print 'auto-restart via launchd' instead of manual restart hint

* fix: add launchd plist auto-refresh + explicit restart in cmd_update

Two integration issues with the initial fix:

1. Existing macOS users with old plist (no --replace) would never
   get the fix until manual uninstall/reinstall. Added
   refresh_launchd_plist_if_needed() — mirrors the existing
   refresh_systemd_unit_if_needed(). Called from launchd_start(),
   launchd_restart(), and cmd_update.

2. cmd_update relied on KeepAlive respawn after SIGTERM rather than
   explicit launchctl stop/start. This caused races: launchd would
   respawn the old process before the PID file was cleaned up.
   Now does explicit stop+start (matching how systemd gets an
   explicit systemctl restart), with plist refresh first so the
   new --replace flag is picked up.

---------

Co-authored-by: Ninja <ninja@local>
Co-authored-by: alireza78a <alireza78a@users.noreply.github.com>
Co-authored-by: Oktay Aydin <113846926+aydnOktay@users.noreply.github.com>
Co-authored-by: JP Lew <polydegen@protonmail.com>
Co-authored-by: an420eth <an420eth@users.noreply.github.com>
2026-03-16 12:36:29 -07:00
teknium1 942950f5b9 feat(cli): live reasoning token streaming — dim box above response
When both display.streaming and display.show_reasoning are enabled,
reasoning tokens stream in real-time into a dim bordered box. When
content tokens start arriving, the reasoning box closes and the
response box opens — smooth visual transition.

- _stream_reasoning_delta(): line-buffered rendering in dim text
- _close_reasoning_box(): flush + close, called on first content token
- Reasoning callback routes to streaming version when both flags set
- Skips static post-response reasoning display when streamed live
- State reset per turn via _reset_stream_state()

Works with reasoning_content deltas (OpenRouter reasoning mode) and
thinking_delta events (Anthropic extended thinking).
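
Line-buffered rendering of token deltas can be sketched like this. The LineBuffer class and its emit callback are illustrative assumptions, not the CLI's actual _stream_reasoning_delta() implementation.

```python
class LineBuffer:
    """Accumulate streamed deltas and emit only complete lines."""

    def __init__(self, emit):
        self._emit = emit      # callback that renders one finished line
        self._pending = ""

    def feed(self, delta):
        """Add a token delta; emit every line completed by it."""
        self._pending += delta
        while "\n" in self._pending:
            line, self._pending = self._pending.split("\n", 1)
            self._emit(line)

    def close(self):
        """Flush any trailing partial line, e.g. when content tokens start."""
        if self._pending:
            self._emit(self._pending)
            self._pending = ""
```

Buffering by line avoids redrawing a half-written line on every token, and close() corresponds to the flush that happens when the reasoning box is closed.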
2026-03-16 10:29:55 -07:00
teknium1 d3687d3e81 docs: document planned live reasoning token display as future enhancement
The streaming infrastructure already fires reasoning deltas via
_fire_reasoning_delta() during streaming. The remaining work is the
CLI display layer: a dim reasoning box that opens on first reasoning
token, streams live, then transitions to the response box.

Reference: PR #1214 (raulvidis) for gateway reasoning visibility.
2026-03-16 10:22:44 -07:00
Muhammet Eren Karakuş 43b8ecd172 fix(tests): use case-insensitive regex in singularity preflight tests
pytest.raises(match=...) is case-sensitive by default. The error
message starts with "Neither" (capital N) but the regex used lowercase
"neither", causing CI failures on Linux.
2026-03-16 19:01:39 +03:00
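
The case-sensitivity problem fixed in 43b8ecd172 can be reproduced without pytest, since pytest.raises(match=...) applies re.search to the exception text. The sketch below uses a hypothetical error message (only its leading "Neither" comes from the commit) and shows that the inline (?i) flag is one way to make the pattern case-insensitive.

```python
import re

# Hypothetical wording; the commit only states that the message starts with "Neither".
message = "Neither apptainer nor singularity was found"


def matches(pattern: str, text: str) -> bool:
    """Mimic the check pytest.raises(match=...) performs on the exception text."""
    return re.search(pattern, text) is not None


lowercase_hit = matches("neither", message)          # case-sensitive: no match
insensitive_hit = matches("(?i)neither", message)    # inline flag: match
```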
Muhammet Eren Karakuş 606f57a3ab fix(terminal): add Singularity/Apptainer preflight availability check
When neither apptainer nor singularity is installed, the Singularity
backend silently defaults to "singularity" and fails with a cryptic
FileNotFoundError inside _start_instance().  Add a preflight check
that resolves the executable and verifies it responds, raising a
clear RuntimeError with install instructions on failure.
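
A preflight check of this kind might look like the sketch below. It is an illustration under assumptions: the function name find_container_runtime, the candidate order, the use of --version as the "responds" probe, and the error wording are all hypothetical rather than taken from the actual backend.

```python
import shutil
import subprocess


def find_container_runtime(candidates=("apptainer", "singularity")) -> str:
    """Return the path of the first candidate that exists and responds."""
    for name in candidates:
        path = shutil.which(name)
        if path is None:
            continue  # not installed, try the next candidate
        try:
            result = subprocess.run(
                [path, "--version"], capture_output=True, timeout=10
            )
        except (OSError, subprocess.TimeoutExpired):
            continue  # present on PATH but not usable
        if result.returncode == 0:
            return path
    raise RuntimeError(
        "Neither apptainer nor singularity was found on PATH. "
        "Install Apptainer or Singularity and try again."
    )
```

Failing early with a RuntimeError gives the user an actionable message instead of a FileNotFoundError raised deep inside instance startup.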

Closes #1511
2026-03-16 18:25:20 +03:00
teknium1 23b9d88a76 docs: add streaming config to cli-config.yaml.example and defaults
Documents the new streaming options in the example config:
- display.streaming for CLI (under display section)
- streaming.enabled + transport/interval/threshold/cursor for gateway
- Added streaming: false to load_cli_config() defaults dict
2026-03-16 07:53:08 -07:00
teknium1 c0b88018eb feat: ship streaming disabled by default — opt-in via config
Streaming is now off by default for both CLI and gateway. Users opt in:

CLI (config.yaml):
  display:
    streaming: true

Gateway (config.yaml):
  streaming:
    enabled: true

This lets early adopters test streaming while existing users see zero
change. Once we have enough field validation, we flip the default to
true in a subsequent release.
2026-03-16 07:44:42 -07:00
teknium1 fc4080c58a fix(cli): add <THINKING> to streaming tag suppression list
Anthropic native models emit <THINKING> tags in text content (separate
from the SDK's thinking_delta events). Without suppression, these tags
leak into the streamed CLI output. Found during live provider testing.
2026-03-16 07:34:29 -07:00
Teknium 91b9495b04 feat(browser): /browser connect — attach browser tools to live Chrome via CDP (#1549)
2026-03-16 07:32:07 -07:00
teknium1 c2769dffe0 merge: resolve conflicts with main (plugins + stop commands) 2026-03-16 07:32:00 -07:00
teknium1 71e35311f5 fix(browser): model waits for user instruction after /browser connect
Updated the injected context message to tell the model to await the
user's instruction before operating the browser. Typical flow is:
user opens Chrome → logs into sites → /browser connect → tells the
agent what to do.
2026-03-16 07:20:43 -07:00
Teknium 97990e7ad5 feat: first-class plugin architecture (#1555)
Plugin system for extending Hermes with custom tools, hooks, and
integrations — no source code changes required.

Core system (hermes_cli/plugins.py):
  - Plugin discovery from ~/.hermes/plugins/, .hermes/plugins/, and
    pip entry_points (hermes_agent.plugins group)
  - PluginContext with register_tool() and register_hook()
  - 6 lifecycle hooks: pre/post tool_call, pre/post llm_call,
    on_session_start/end
  - Namespace package handling for relative imports in plugins
  - Graceful error isolation — broken plugins never crash the agent

Integration (model_tools.py):
  - Plugin discovery runs after built-in + MCP tools
  - Plugin tools bypass toolset filter via get_plugin_tool_names()
  - Pre/post tool call hooks fire in handle_function_call()

CLI:
  - /plugins command shows loaded plugins, tool counts, status
  - Added to COMMANDS dict for autocomplete

Docs:
  - Getting started guide (build-a-hermes-plugin.md) — full tutorial
    building a calculator plugin step by step
  - Reference page (features/plugins.md) — quick overview + tables
  - Covers: file structure, schemas, handlers, hooks, data files,
    bundled skills, env var gating, pip distribution, common mistakes

Tests: 16 tests covering discovery, loading, hooks, tool visibility.
2026-03-16 07:17:36 -07:00
teknium1 73f39a7761 feat(browser): auto-launch Chrome when /browser connect finds no debugger
When /browser connect detects that port 9222 isn't open, it now:
1. Finds Chrome/Chromium/Brave/Edge on the system (macOS app bundles
   or Linux PATH lookup)
2. Launches it with --remote-debugging-port=9222 (detached)
3. Waits up to 5 seconds for the port to come up
4. Falls back to manual instructions if auto-launch fails

This means GUI-only users can just type /browser connect without
needing to know about terminal flags or Chrome launch commands.
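
Step 3 above (waiting for the debugging port to come up) can be sketched with the standard library alone. The helper name wait_for_port and the polling interval are assumptions; only the port number 9222 and the 5 second budget come from the commit message.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 5.0,
                  interval: float = 0.1) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True   # something is listening
        except OSError:
            time.sleep(interval)  # not up yet, try again shortly
    return False
```

A caller would launch the browser detached, then call wait_for_port("localhost", 9222) and fall back to manual instructions when it returns False.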
2026-03-16 07:05:48 -07:00
Teknium 1ecfe68675 feat: improve memory prioritization + aggressive skill updates (inspired by OpenAI Codex)
* feat: improve memory prioritization — user preferences over procedural knowledge

Inspired by OpenAI Codex's memory prompt improvements (openai/codex#14493)
which focus memory writes on user preferences and recurring patterns
rather than procedural task details.

Key insight: 'Optimize for reducing future user steering — the most
valuable memory prevents the user from having to repeat themselves.'

Changes:
- MEMORY_GUIDANCE (prompt_builder.py): added prioritization hierarchy
  and the core principle about reducing user steering
- MEMORY_SCHEMA (memory_tool.py): reordered WHEN TO SAVE list to put
  corrections first, added explicit PRIORITY guidance
- Memory nudge (run_agent.py): now asks specifically about preferences,
  corrections, and workflow patterns instead of generic 'anything'
- Memory flush (run_agent.py): now instructs to prioritize user
  preferences and corrections over task-specific details

* feat: more aggressive skill creation and update prompting

Press harder on skill updates — the agent should proactively patch
skills when it encounters issues during use, not wait to be asked.

Changes:
- SKILLS_GUIDANCE: 'consider saving' → 'save'; added explicit instruction
  to patch skills immediately when found outdated/wrong
- Skills header: added instruction to update loaded skills before finishing
  if they had missing steps or wrong commands
- Skill nudge: more assertive ('save the approach' not 'consider saving'),
  now also prompts for updating existing skills used in the task
- Skill nudge interval: lowered default from 15 to 10 iterations
- skill_manage schema: added 'patch it immediately' to update triggers
2026-03-16 06:52:32 -07:00
Teknium 447594be28 feat: first-class plugin architecture + hide status bar cost by default (#1544)
The persistent status bar now shows context %, token counts, and
duration but NOT $ cost by default. Cost display is opt-in via:

  display:
    show_cost: true

in config.yaml, or: hermes config set display.show_cost true

The /usage command still shows full cost breakdown since the user
explicitly asked for it — this only affects the always-visible bar.

Status bar without cost:
  ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ 15m

Status bar with show_cost: true:
  ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ $0.06 │ 15m
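
The two layouts above can be reproduced with a small formatter. This is a sketch only: format_status_bar and its parameters are hypothetical names, and the real status bar in cli.py is built from prompt_toolkit fragments rather than a plain string.

```python
def format_status_bar(model, used_tokens, context_tokens, minutes,
                      cost=None, show_cost=False):
    """Build the status line; the cost segment appears only when opted in."""

    def short(n):
        # 12000 -> "12K"; small values are shown as-is
        return f"{n // 1000}K" if n >= 1000 else str(n)

    percent = round(100 * used_tokens / context_tokens)
    parts = [
        f"⚕ {model}",
        f"{short(used_tokens)}/{short(context_tokens)}",
        f"{percent}%",
    ]
    if show_cost and cost is not None:
        parts.append(f"${cost:.2f}")
    parts.append(f"{minutes}m")
    return " │ ".join(parts)
```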
2026-03-16 06:43:57 -07:00
teknium1 9d1483c7e6 feat(browser): /browser connect — attach browser tools to live Chrome via CDP
Add /browser slash command for connecting browser tools to the user's
live Chrome instance via Chrome DevTools Protocol:

  /browser connect       — connect to Chrome on localhost:9222
  /browser connect ws://host:port  — custom CDP endpoint
  /browser disconnect    — revert to default (headless/Browserbase)
  /browser status        — show current browser mode + connectivity

When connected:
- All browser tools (navigate, snapshot, click, etc.) control the
  user's real Chrome — logged-in sessions, cookies, open tabs
- Platform-specific Chrome launch instructions are shown
- Port connectivity is tested immediately
- A context message is injected so the model knows it's controlling
  a live browser and should be mindful of user's open tabs

Implementation:
- BROWSER_CDP_URL env var drives the backend selection in browser_tool.py
- New _create_cdp_session() creates sessions using the CDP override
- _get_cdp_override() checked before local/Browserbase selection
- Existing agent-browser --cdp flag handles the actual CDP connection

Inspired by OpenClaw's browser profile system.
2026-03-16 06:38:20 -07:00
teknium1 8e07f9ca56 fix: audit fixes — 5 bugs found and resolved
Thorough code review found 5 issues across run_agent.py, cli.py, and gateway/:

1. CRITICAL — Gateway stream consumer task never started: stream_consumer_holder
   was checked BEFORE run_sync populated it. Fixed with async polling pattern
   (same as track_agent).

2. MEDIUM-HIGH — Streaming fallback after partial delivery caused double-response:
   if streaming failed after some tokens were delivered, the fallback would
   re-deliver the full response. Now tracks deltas_were_sent and only falls
   back when no tokens reached consumers yet.

3. MEDIUM — Codex mode lost on_first_delta spinner callback: _run_codex_stream
   now accepts on_first_delta parameter, fires it on first text delta. Passed
   through from _interruptible_streaming_api_call via _codex_on_first_delta
   instance attribute.

4. MEDIUM — CLI close-tag after-text bypassed tag filtering: text after a
   reasoning close tag was sent directly to _emit_stream_text, skipping
   open-tag detection. Now routes through _stream_delta for full filtering.

5. LOW — Removed 140 lines of dead code: old _streaming_api_call method
   (superseded by _interruptible_streaming_api_call). Updated 13 tests in
   test_run_agent.py and test_openai_client_lifecycle.py to use the new
   method name and signature.

4573 tests passing.
2026-03-16 06:35:46 -07:00
Teknium 57be18c026 feat: smart approvals + /stop command (inspired by OpenAI Codex)
* feat: smart approvals — LLM-based risk assessment for dangerous commands

Adds a 'smart' approval mode that uses the auxiliary LLM to assess
whether a flagged command is genuinely dangerous or a false positive,
auto-approving low-risk commands without prompting the user.

Inspired by OpenAI Codex's Smart Approvals guardian subagent
(openai/codex#13860).

Config (config.yaml):
  approvals:
    mode: manual   # manual (default), smart, off

Modes:
- manual — current behavior, always prompt the user
- smart  — aux LLM evaluates risk: APPROVE (auto-allow), DENY (block),
           or ESCALATE (fall through to manual prompt)
- off    — skip all approval prompts (equivalent to --yolo)

When smart mode auto-approves, the pattern gets session-level approval
so subsequent uses of the same pattern don't trigger another LLM call.
When it denies, the command is blocked without user prompt. When
uncertain, it escalates to the normal manual approval flow.

The LLM prompt is carefully scoped: it sees only the command text and
the flagged reason, assesses actual risk vs false positive, and returns
a single-word verdict.
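
The verdict flow described above can be sketched as follows. The names SmartApprover, parse_verdict and the ask_llm callable are hypothetical; in particular, treating any unrecognized reply as ESCALATE is an assumption made here so that a malformed answer never auto-approves a command.

```python
from typing import Callable, Set

VERDICTS = {"APPROVE", "DENY", "ESCALATE"}


def parse_verdict(raw: str) -> str:
    """Normalize the model's one-word reply; anything unexpected escalates."""
    word = raw.strip().upper()
    return word if word in VERDICTS else "ESCALATE"


class SmartApprover:
    def __init__(self, ask_llm: Callable[[str, str], str]):
        self._ask_llm = ask_llm                  # (command, reason) -> raw reply
        self._approved_patterns: Set[str] = set()  # session-level approvals

    def decide(self, command: str, pattern: str, reason: str) -> str:
        # A pattern approved earlier in the session skips the LLM call.
        if pattern in self._approved_patterns:
            return "APPROVE"
        verdict = parse_verdict(self._ask_llm(command, reason))
        if verdict == "APPROVE":
            self._approved_patterns.add(pattern)
        return verdict
```

An ESCALATE result would be handed to the normal manual prompt, and a DENY result would block the command without asking the user.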

* feat: make smart approval model configurable via config.yaml

Adds auxiliary.approval section to config.yaml with the same
provider/model/base_url/api_key pattern as other aux tasks (vision,
web_extract, compression, etc.).

Config:
  auxiliary:
    approval:
      provider: auto
      model: ''        # fast/cheap model recommended
      base_url: ''
      api_key: ''

Bridged to env vars in both CLI and gateway paths so the aux client
picks them up automatically.

* feat: add /stop command to kill all background processes

Adds a /stop slash command that kills all running background processes
at once. Currently users have to call process(list) and then
process(kill) for each one individually.

Inspired by OpenAI Codex's separation of interrupt (Ctrl+C stops current
turn) from /stop (cleans up background processes). See openai/codex#14602.

Ctrl+C continues to only interrupt the active agent turn — background
dev servers, watchers, etc. are preserved. /stop is the explicit way
to clean them all up.
2026-03-16 06:20:11 -07:00
teknium1 99369b926c fix: always fall back to non-streaming on ANY streaming error
Previously the fallback only triggered on specific error keywords like
'streaming is not supported'. Many third-party providers have partial
or broken streaming — rejecting stream=True, crashing on stream_options,
dropping connections mid-stream, returning malformed chunks, etc.

Now: any exception during the streaming API call triggers an automatic
fallback to the standard non-streaming request path. The error is logged
at INFO level for diagnostics but never surfaces to the user. If the
fallback also fails, THAT error propagates normally.

This ensures streaming is additive — it improves UX when it works but
never breaks providers that don't support it.
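
The any-error fallback can be sketched as a thin wrapper. The function and parameter names here are hypothetical; the two callables stand in for the streaming and non-streaming request paths in run_agent.py.

```python
import logging

logger = logging.getLogger(__name__)


def call_with_streaming_fallback(stream_call, plain_call):
    """Try the streaming path; on any exception, retry without streaming."""
    try:
        return stream_call()
    except Exception as exc:
        # Logged for diagnostics only; the user never sees this error.
        logger.info("Streaming failed (%s); retrying without streaming", exc)
        # If this call also fails, its exception propagates to the caller.
        return plain_call()
```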

Tests: 2 new (any-error fallback, double-failure propagation), 15 total.
2026-03-16 06:15:09 -07:00
Teknium 2633272ea9 feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled (#1542)
2026-03-16 06:08:17 -07:00
Teknium 2ba219fa4b feat(cli): add file path autocomplete in the input prompt (#1545)
When typing a path-like token (./  ../  ~/  /  or containing /),
the CLI now shows filesystem completions in the dropdown menu.
Directories show a trailing slash and 'dir' label; files show
their size. Completions are case-insensitive and capped at 30
entries.

Triggered by tokens like:
  edit ./src/ma     → shows ./src/main.py, ./src/manifest.json, ...
  check ~/doc       → shows ~/docs/, ~/documents/, ...
  read /etc/hos     → shows /etc/hosts, /etc/hostname, ...
  open tools/reg    → shows tools/registry.py

Slash command autocomplete (/help, /model, etc.) is unaffected —
it still triggers when the input starts with /.

Inspired by OpenCode PR #145 (file path completion menu).

Implementation:
- hermes_cli/commands.py: _extract_path_word() detects path-like
  tokens, _path_completions() yields filesystem Completions with
  size labels, get_completions() routes to paths vs slash commands
- tests/hermes_cli/test_path_completion.py: 26 tests covering
  path extraction, prefix filtering, directory markers, home
  expansion, case-insensitivity, integration with slash commands
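
A rough sketch of the path-token detection, assuming the rule is "the last whitespace-separated word contains a slash, unless the whole input starts with a slash". The function below is a hypothetical stand-in for _extract_path_word() and does not reproduce its actual logic.

```python
from typing import Optional


def extract_path_word(text: str) -> Optional[str]:
    """Return the trailing path-like token, or None if there is none."""
    if not text or text[-1].isspace():
        return None          # nothing is being typed right now
    if text.startswith("/"):
        return None          # input starting with "/" is a slash command
    word = text.split()[-1]
    # "./", "../", "~/", absolute paths and "dir/file" all contain a slash.
    return word if "/" in word else None
```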
2026-03-16 06:07:45 -07:00
teknium1 9a423c3487 fix(privacy): skip PII redaction on Discord/Slack (mentions need real IDs)
Discord uses <@user_id> for mentions and Slack uses <@U12345> — the LLM
needs the real ID to tag users. Redaction now only applies to WhatsApp,
Signal, and Telegram where IDs are pure routing metadata.

Add 4 platform-specific tests covering Discord, WhatsApp, Signal, Slack.
2026-03-16 05:58:34 -07:00
teknium1 5479bb0e0c feat(gateway): streaming token delivery — StreamingConfig, GatewayStreamConsumer, already_sent
Stage 3 of streaming support. Gateway now streams tokens to messaging platforms:

- StreamingConfig dataclass (enabled, transport, edit_interval, buffer_threshold, cursor)
  on GatewayConfig with from_dict/to_dict serialization
- GatewayStreamConsumer: async queue-based consumer that progressively edits
  a single message on the target platform (edit transport)
- on_delta() → queue → run() async task → send_or_edit() with rate limiting
- already_sent propagation: when streaming delivered the response, handler
  returns None so base adapter skips duplicate send()
- stream_delta_callback wired into AIAgent constructor in _run_agent
- Consumer lifecycle: started as asyncio task, awaited with timeout in finally

Config (config.yaml):
  streaming:
    enabled: true
    transport: edit      # progressive editMessageText
    edit_interval: 0.3   # seconds between edits
    buffer_threshold: 40 # chars before forcing flush
    cursor: ' ▉'
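
A dataclass matching the fields above might look like this sketch. The field names come from the commit message; the default values (other than streaming being off by default) are assumptions copied from the example config, and ignoring unknown keys in from_dict is a design choice made here, not a confirmed behavior.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class StreamingConfig:
    enabled: bool = False        # streaming ships disabled by default
    transport: str = "edit"      # progressive message edits
    edit_interval: float = 0.3   # seconds between edits (assumed default)
    buffer_threshold: int = 40   # chars before forcing a flush (assumed default)
    cursor: str = " ▉"

    @classmethod
    def from_dict(cls, data: dict) -> "StreamingConfig":
        """Build a config from a parsed YAML mapping, ignoring unknown keys."""
        known = {f.name: data[f.name] for f in fields(cls) if f.name in data}
        return cls(**known)

    def to_dict(self) -> dict:
        return asdict(self)
```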

Credit: jobless0x (#774, #1312), OutThisLife (#798), clicksingh (#697).
2026-03-16 05:52:42 -07:00
teknium1 c51e7b4af7 feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled
Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:

- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)

Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.
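
Deterministic hashing of this kind can be sketched with hashlib. The function name redact_user_id and the 12-character truncation are assumptions; the commit only states that IDs are hashed with SHA-256 into a user_<sha256> form.

```python
import hashlib


def redact_user_id(user_id: str, digest_chars: int = 12) -> str:
    """Map an ID to a stable pseudonym: same input, same output."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    # Truncation length is an illustrative choice, not the project's value.
    return f"user_{digest[:digest_chars]}"
```

Because the mapping is stable, two messages from the same phone number still appear to the model as the same participant, while the number itself never enters the prompt.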

Inspired by OpenClaw PR #47959.
2026-03-16 05:48:45 -07:00
Teknium 7d2c786acc Merge pull request #1534 from NousResearch/fix/1445-docker-cwd-optin
fix(docker): make cwd workspace mount explicit opt-in
2026-03-16 05:42:21 -07:00
teknium1 b72f522e30 test: fake minisweagent for docker cwd mount regressions
Make the new Docker cwd-mount tests pass in CI environments that do not have the minisweagent package installed by injecting a fake module instead of monkeypatching an import path that may not exist.
2026-03-16 05:40:05 -07:00
Teknium 352980311b feat: permissive block_anchor thresholds and unicode normalization (#1539)
Salvaged from PR #1528 by an420eth. Closes #517.

Improves _strategy_block_anchor in fuzzy_match.py:
- Add unicode normalization (smart quotes, em/en-dashes, ellipsis,
  non-breaking spaces → ASCII) so LLM-produced unicode artifacts
  don't break anchor line matching
- Lower thresholds: 0.10 for unique matches (was 0.70), 0.30 for
  multiple candidates — if first/last lines match exactly, the
  block is almost certainly correct
- Use original (non-normalized) content for offset calculation to
  preserve correct character positions

Tested: 3 new scenarios fixed (em-dash anchors, non-breaking space
anchors, very-low-similarity unique matches), zero regressions on
all 9 existing fuzzy match tests.
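
The normalization step can be sketched with str.translate. The mapping below is an assumption about which characters are covered and what they map to (for example, both dash types becoming a single hyphen); fuzzy_match.py may differ.

```python
# Hypothetical mapping of common LLM-produced unicode artifacts to ASCII.
_UNICODE_TO_ASCII = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # single smart quotes
    "\u201c": '"', "\u201d": '"',   # double smart quotes
    "\u2013": "-", "\u2014": "-",   # en dash, em dash
    "\u2026": "...",                # ellipsis (changes the string length)
    "\u00a0": " ",                  # non-breaking space
})


def normalize_unicode(text: str) -> str:
    """Return an ASCII-friendly copy of text for anchor comparison only."""
    return text.translate(_UNICODE_TO_ASCII)
```

Since a replacement such as the ellipsis changes the length of the string, character offsets computed on the normalized copy would not line up with the file, which is consistent with the commit's note about using the original content for offsets.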

Co-authored-by: an420eth <an420eth@users.noreply.github.com>
2026-03-16 05:29:25 -07:00
Teknium b411b979cb fix(telegram): retry on transient TLS failures during connect and send (#1535)
2026-03-16 05:28:11 -07:00
teknium1 ac739e485f fix(cli): reasoning tag suppression during streaming + fix fallback detection
Fixes two issues found during live testing:

1. Reasoning tag suppression: close tags like </REASONING_SCRATCHPAD>
   that arrive split across stream tokens (e.g. '</REASONING_SCRATCH' +
   'PAD>\n\nHello') were being lost because the buffer was discarded.
   Fix: keep a sliding window of the tail (max close tag length) so
   partial tags survive across tokens.

2. Streaming fallback detection was too broad — 'stream' matched any
   error containing that word (including 'stream_options' rejections).
   Narrowed to specific phrases: 'streaming is not', 'streaming not
   support', 'does not support stream', 'not available'.

Verified with real API calls: streaming works end-to-end with
reasoning block suppression, response box framing, and proper
fallback to Rich Panel when streaming isn't active.
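
The sliding-window idea from item 1 can be sketched as a small stateful filter. ReasoningBlockFilter is a hypothetical class, not the CLI's implementation, and for brevity it only protects the close tag against being split across tokens; a split open tag is not handled here.

```python
class ReasoningBlockFilter:
    """Drops text between an open and close tag, token by token."""

    def __init__(self, open_tag="<REASONING_SCRATCHPAD>",
                 close_tag="</REASONING_SCRATCHPAD>"):
        self.open_tag = open_tag
        self.close_tag = close_tag
        self.inside = False
        self.tail = ""   # last chars seen inside the block (possible partial close tag)

    def feed(self, delta: str) -> str:
        """Return the part of this delta that should be shown to the user."""
        out = []
        data = self.tail + delta
        self.tail = ""
        while data:
            if self.inside:
                idx = data.find(self.close_tag)
                if idx == -1:
                    # Keep only a window one char shorter than the close tag,
                    # so a tag split across tokens can still be completed.
                    self.tail = data[-(len(self.close_tag) - 1):]
                    data = ""
                else:
                    data = data[idx + len(self.close_tag):]
                    self.inside = False
            else:
                idx = data.find(self.open_tag)
                if idx == -1:
                    out.append(data)
                    data = ""
                else:
                    out.append(data[:idx])
                    data = data[idx + len(self.open_tag):]
                    self.inside = True
        return "".join(out)
```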
2026-03-16 05:28:10 -07:00
Teknium 8758e2e8d7 feat(email): add skip_attachments option via config.yaml
* feat(email): add skip_attachments option via config.yaml

Adds a config.yaml-driven option to skip email attachments in the
gateway email adapter. Useful for malware protection and bandwidth
savings.

Configure in config.yaml:
  platforms:
    email:
      skip_attachments: true

Based on PR #1521 by @an420eth, but changed from an env var to config.yaml
(via PlatformConfig.extra) to match the project's config-first pattern.

* docs: document skip_attachments option for email adapter
2026-03-16 05:27:54 -07:00
JP Lew 17e87478d2 fix(gateway): restart on retryable startup failures (#1517) 2026-03-16 05:26:31 -07:00
aydnOktay a5359e61e7 fix(tools): improve error logging in skill_manager_tool 2026-03-16 15:25:30 +03:00
teknium1 25b0ae7979 fix(telegram): retry on transient TLS failures during connect and send
Add exponential-backoff retry (3 attempts) around initialize() to
handle transient TLS resets during gateway startup. Also catches
TimedOut and OSError in addition to NetworkError.

Add exponential-backoff retry (3 attempts) around send_message() for
NetworkError during message delivery, wrapping the existing Markdown
fallback logic.
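
A generic exponential-backoff retry can be sketched as below. The helper name, the 1 second base delay and the injectable sleep parameter are assumptions for illustration; the adapter retries on the telegram library's NetworkError and TimedOut plus OSError, while this standalone sketch defaults to OSError only.

```python
import time


def retry_with_backoff(func, retry_on=(OSError,), attempts=3,
                       base_delay=1.0, sleep=time.sleep):
    """Call func(); on a retryable error wait 1x, 2x, 4x... base_delay and retry."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
```

Passing sleep as a parameter keeps tests fast, since they can record the requested delays instead of actually waiting.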

Both imports are guarded with try/except ImportError for test
environments where telegram is mocked.

Based on PR #1527 by cmd8. Closes #1526.
2026-03-16 05:23:32 -07:00
Oktay Aydin dfe72b9d97 fix(logging): improve error logging in session search tool (#1533) 2026-03-16 05:22:00 -07:00
teknium1 780ddd102b fix(docker): gate cwd workspace mount behind config
Keep Docker sandboxes isolated by default. Add an explicit terminal.docker_mount_cwd_to_workspace opt-in, thread it through terminal/file environment creation, and document the security tradeoff and config.yaml workflow clearly.
2026-03-16 05:20:56 -07:00
Bartok9 8cdbbcaaa2 fix(docker): auto-mount host CWD to /workspace
Fixes #1445 — When using Docker backend, the user's current working
directory is now automatically bind-mounted to /workspace inside the
container. This allows users to run `cd my-project && hermes` and have
their project files accessible to the agent without manual volume config.

Changes:
- Add host_cwd and auto_mount_cwd parameters to DockerEnvironment
- Capture original host CWD in _get_env_config() before container fallback
- Pass host_cwd through _create_environment() to Docker backend
- Add TERMINAL_DOCKER_NO_AUTO_MOUNT env var to disable if needed
- Skip auto-mount when /workspace is already explicitly mounted
- Add tests for auto-mount behavior
- Add documentation for the new feature

The auto-mount is skipped when:
1. TERMINAL_DOCKER_NO_AUTO_MOUNT=true is set
2. User configured docker_volumes with :/workspace
3. persistent_filesystem=true (persistent sandbox mode)

This makes the Docker backend behave more intuitively — the agent
operates on the user's actual project directory by default.
2026-03-16 05:20:21 -07:00
Teknium a2f0d14f29 feat(acp): support slash commands in ACP adapter (#1532)
Adds /help, /model, /tools, /context, /reset, /compact, /version
to the ACP adapter (VS Code, Zed, JetBrains). Commands are handled
directly in the server without instantiating the TUI — each command
queries agent/session state and returns plain text.

Unrecognized /commands fall through to the LLM as normal messages.

/model uses detect_provider_for_model() for auto-detection when
switching models, matching the CLI and gateway behavior.

Fixes #1402
2026-03-16 05:19:36 -07:00
teknium1 2219695d92 test: 14-test streaming suite — accumulator, callbacks, fallback, reasoning, Codex
Tests cover:
- Text/tool-call/mixed response accumulation into correct shape
- Delta callback ordering and on_first_delta firing once
- Tool-call suppression (no callbacks during tool turns)
- Provider fallback on 'not supported' errors
- Reasoning content accumulation and callback
- _has_stream_consumers() detection
- Codex stream delta callback firing
2026-03-16 05:12:38 -07:00
teknium1 d23e9a9bed feat(cli): streaming token display — line-buffered rendering with response box framing
Stage 2 of streaming support. CLI now streams tokens in real-time:

- _stream_delta(): line-buffered rendering via _cprint (prompt_toolkit safe)
- _flush_stream(): emits remaining buffer and closes response box
- Response box opens on first token, closes on flush
- Skip Rich Panel when streaming already displayed content
- Reset streaming state before each agent turn
- Compatible with existing TTS streaming (both can fire simultaneously)
- Uses skin engine for response label branding

Credit: OutThisLife (#798 CLI streaming concept).
2026-03-16 05:10:15 -07:00
Teknium add945e53c feat(skills): add blender-mcp optional skill for 3D modeling (#1531)
2026-03-16 05:05:56 -07:00
teknium1 c1ac32737d feat: unified streaming infrastructure — core delta callbacks for all providers
Stage 1 of streaming support. Adds:

- stream_delta_callback parameter on AIAgent.__init__ for real-time token delivery
- _interruptible_streaming_api_call() handling chat_completions + anthropic_messages
- Enhanced _run_codex_stream() to fire delta callbacks during Codex streaming
- _fire_stream_delta() fires both display and TTS callbacks
- _fire_reasoning_delta() for reasoning content streaming
- Tool-call suppression: callbacks only fire on text-only responses
- on_first_delta callback for spinner control on first token
- Provider fallback: graceful degradation to non-streaming
- _has_stream_consumers() unifies stream_delta_callback and _stream_callback checks
- Anthropic streaming returns native Message for downstream compatibility

Drawing from PRs #922 (unified streaming), #1312 (gateway consumer),
#774 (Telegram streaming), #798 (CLI streaming), #1214 (reasoning modes).
Credit: jobless0x, OutThisLife, clicksingh, raulvidis.
2026-03-16 05:05:45 -07:00
alireza78a 14b049d658 feat(skills): add blender-mcp optional skill for 3D modeling
Control a running Blender instance from Hermes via socket connection
to the blender-mcp addon (port 9876). Supports creating 3D objects,
materials, animations, and running arbitrary bpy code.

Placed in optional-skills/ since it requires Blender 4.3+ desktop
with a third-party addon manually started each session.
2026-03-16 05:03:19 -07:00
Teknium 002c459981 fix(gateway): remove recursive ExecStop from systemd units, extend TimeoutStopSec to 60s
* fix(gateway): avoid recursive ExecStop in user systemd unit

* fix: extend ExecStop removal and TimeoutStopSec=60 to system unit

The cherry-picked PR #1448 fix only covered the user systemd unit.
The system unit had the same TimeoutStopSec=15 and could benefit
from the same 60s timeout for clean shutdown. Also adds a regression
test for the system unit.

---------

Co-authored-by: Ninja <ninja@local>
2026-03-16 05:03:11 -07:00
Teknium ce660a4413 fix(gateway): remove app-specific Athabasca references from vision enrichment (#1529)
Salvaged from PR #1428 by jplew.

Removes Athabasca-specific persistence guidance accidentally merged
in PR #1422:
- Drop Athabasca docstring and injected note from _enrich_message_with_vision
- Delete tests/gateway/test_image_enrichment.py (asserted app-specific behavior)

Co-authored-by: jplew <jplew@users.noreply.github.com>
2026-03-16 05:02:58 -07:00
Teknium ee579af566 docs: add CLI status bar docs and update /usage reference (#1523)
- Add Status Bar section to user-guide/cli.md with layout example,
  element descriptions, responsive width behavior, and color-coded
  context threshold table
- Update /usage description in slash-commands reference to mention
  cost breakdown and session duration
2026-03-16 04:58:28 -07:00
Teknium caa944e752 fix(setup+gateway): defer config write, PID-based gateway kill, scoped systemd service names (#1499)
2026-03-16 04:58:12 -07:00
Teknium 00110fb3c3 docs: update checkpoint/rollback docs for new features
- Reflect that checkpoints are now enabled by default
- Document /rollback diff <N> for previewing changes
- Document /rollback <N> <file> for single-file restore
- Document automatic conversation undo on rollback
- Document terminal command checkpoint coverage
- Update listing example to show change stats
- Fix config path (checkpoints.enabled, not agent.checkpoints_enabled)
- Consolidate features/checkpoints.md to brief summary with link
2026-03-16 04:56:22 -07:00
Bartok9 3543b755af fix(docker): auto-mount host CWD to /workspace
Fixes #1445 — When using Docker backend, the user's current working
directory is now automatically bind-mounted to /workspace inside the
container. This allows users to run `cd my-project && hermes` and have
their project files accessible to the agent without manual volume config.

Changes:
- Add host_cwd and auto_mount_cwd parameters to DockerEnvironment
- Capture original host CWD in _get_env_config() before container fallback
- Pass host_cwd through _create_environment() to Docker backend
- Add TERMINAL_DOCKER_NO_AUTO_MOUNT env var to disable if needed
- Skip auto-mount when /workspace is already explicitly mounted
- Add tests for auto-mount behavior
- Add documentation for the new feature

The auto-mount is skipped when:
1. TERMINAL_DOCKER_NO_AUTO_MOUNT=true is set
2. User configured docker_volumes with :/workspace
3. persistent_filesystem=true (persistent sandbox mode)

This makes the Docker backend behave more intuitively — the agent
operates on the user's actual project directory by default.
2026-03-16 04:53:24 -07:00
teknium1 51185354dd docs: document scoped systemd service names for multi-install
- Update messaging guide to use 'hermes gateway' CLI commands instead
  of raw systemctl (auto-resolves the correct service name)
- Add info callout explaining multi-install service name scoping
- Update HERMES_HOME env var docs to mention PID + service name scoping
2026-03-16 04:44:53 -07:00
Teknium 9e845a6e53 feat: major /rollback improvements — enabled by default, diff preview, file-level restore, conversation undo, terminal checkpoints
Checkpoint & rollback upgrades:

1. Enabled by default — checkpoints are now on for all new sessions.
   Zero cost when no file-mutating tools fire. Disable with
   checkpoints.enabled: false in config.yaml.

2. Diff preview — /rollback diff <N> shows a git diff between the
   checkpoint and current working tree before committing to a restore.

3. File-level restore — /rollback <N> <file> restores a single file
   from a checkpoint instead of the entire directory.

4. Conversation undo on rollback — when restoring files, the last
   chat turn is automatically undone so the agent's context matches
   the restored filesystem state.

5. Terminal command checkpoints — destructive terminal commands (rm,
   mv, sed -i, truncate, git reset/clean, output redirects) now
   trigger automatic checkpoints before execution. Previously only
   write_file and patch were covered.

6. Change summary in listing — /rollback now shows file count and
   +insertions/-deletions for each checkpoint.

7. Fixed dead code — removed duplicate _run_git call in
   list_checkpoints with nonsensical --all if False condition.

8. Updated help text — /rollback with no args now shows available
   subcommands (diff, file-level restore).
2026-03-16 04:43:37 -07:00
Teknium 00a0c56598 feat: add persistent CLI status bar and usage details (#1522)
Salvaged from PR #1104 by kshitijk4poor. Closes #683.

Adds a persistent status bar to the CLI showing model name, context
window usage with visual bar, estimated cost, and session duration.
Responsive layout degrades gracefully for narrow terminals.

Changes:
- agent/usage_pricing.py: shared pricing table, cost estimation with
  Decimal arithmetic, duration/token formatting helpers
- agent/insights.py: refactored to reuse usage_pricing (eliminates
  duplicate pricing table and formatting logic)
- cli.py: status bar with FormattedTextControl fragments, color-coded
  context thresholds (green/yellow/orange/red), enhanced /usage with
  cost breakdown, 1Hz idle refresh for status bar updates
- tests/test_cli_status_bar.py: status bar snapshot, width collapsing,
  usage report with/without pricing, zero-priced model handling
- tests/test_insights.py: verify zero-priced providers show as unknown

Salvage fixes:
- Resolved conflict with voice status bar (both coexist in layout)
- Import _format_context_length from hermes_cli.banner (moved since PR)

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-16 04:42:48 -07:00
teknium1 30da22e1c1 feat(gateway): scope systemd service name to HERMES_HOME
Multiple Hermes installations on the same machine now get unique
systemd service names:
- Default ~/.hermes → hermes-gateway (backward compatible)
- Custom HERMES_HOME → hermes-gateway-<8-char-hash>

Changes:
- Add get_service_name() in hermes_cli/gateway.py that derives a
  deterministic service name from HERMES_HOME via SHA256
- Replace all hardcoded 'hermes-gateway' systemd references with
  get_service_name() across gateway.py, main.py, status.py, uninstall.py
- Add HERMES_HOME env var to both user and system systemd unit templates
  so the gateway process uses the correct installation
- Update tests to use get_service_name() in assertions
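
The naming scheme above can be sketched as follows. This is a minimal sketch, not the code in `hermes_cli/gateway.py`: the function name `service_name_for`, its parameters, and the choice to hash the resolved path string are assumptions.

```python
import hashlib
from pathlib import Path

DEFAULT_HOME = Path.home() / ".hermes"  # default installation directory


def service_name_for(hermes_home: Path, default_home: Path = DEFAULT_HOME) -> str:
    """Hypothetical helper: map a HERMES_HOME path to a systemd unit name."""
    home = hermes_home.expanduser().resolve()
    if home == default_home.expanduser().resolve():
        return "hermes-gateway"  # backward-compatible name for the default home
    # Assumption: the hash input is the resolved path encoded as UTF-8.
    digest = hashlib.sha256(str(home).encode("utf-8")).hexdigest()
    return f"hermes-gateway-{digest[:8]}"
```

Because the suffix depends only on the path, every command that needs the unit name (install, status, uninstall) arrives at the same value without storing it anywhere.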
2026-03-16 04:42:46 -07:00
teknium1 e7d3f1f3ba fix(update): kill gateway via PID file before restart
cmd_update only ran 'systemctl --user restart hermes-gateway', which
left manually-started gateway processes alive, causing duplicates.

Now uses get_running_pid() from gateway/status.py (scoped to
HERMES_HOME) to find and SIGTERM this installation's gateway before
restarting. Safe with multiple Hermes installations since each
HERMES_HOME has its own PID file.

If no systemd service exists, informs the user to restart manually.

Based on PR #1131 by teknium1. Dropped the cli.py Rich from_ansi
changes (already on main).
2026-03-16 04:35:34 -07:00
Teknium c1da1fdcd5 feat: auto-detect provider when switching models via /model (#1506)
When typing /model deepseek-chat while on a different provider, the
model name now auto-resolves to the correct provider instead of
silently staying on the wrong one and causing API errors.

Detection priority:
1. Direct provider with credentials (e.g. DEEPSEEK_API_KEY set)
2. OpenRouter catalog match with proper slug remapping
3. Direct provider without creds (clear error beats silent failure)

Also adds DeepSeek as a first-class API-key provider — just set
DEEPSEEK_API_KEY and /model deepseek-chat routes directly.

Bare model names get remapped to proper OpenRouter slugs:
  /model gpt-5.4 → openai/gpt-5.4
  /model claude-opus-4.6 → anthropic/claude-opus-4.6
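
The detection priority can be sketched with two small lookup tables. Everything here is illustrative: the table contents are limited to the examples in this commit message, and `detect_provider` is a hypothetical name rather than the function the commit adds.

```python
from typing import Mapping, Optional, Tuple

# model name -> (provider, credential env var); illustrative subset only
DIRECT_PROVIDERS = {"deepseek-chat": ("deepseek", "DEEPSEEK_API_KEY")}
# bare model name -> OpenRouter slug; illustrative subset only
OPENROUTER_SLUGS = {
    "gpt-5.4": "openai/gpt-5.4",
    "claude-opus-4.6": "anthropic/claude-opus-4.6",
}


def detect_provider(model: str, environ: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (provider, model_id) following the three-step priority, or None."""
    direct = DIRECT_PROVIDERS.get(model)
    # 1. Direct provider whose credentials are present.
    if direct and environ.get(direct[1]):
        return direct[0], model
    # 2. OpenRouter catalog match, remapped to the full slug.
    slug = OPENROUTER_SLUGS.get(model)
    if slug:
        return "openrouter", slug
    # 3. Direct provider without credentials: the caller can then report
    #    a clear "missing key" error instead of failing silently.
    if direct:
        return direct[0], model
    return None
```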

Salvages the concept from PR #1177 by @virtaava with credential
awareness and OpenRouter slug mapping added.

Co-authored-by: virtaava <virtaava@users.noreply.github.com>
2026-03-16 04:34:45 -07:00
teknium1 f7c5d8a749 Merge remote-tracking branch 'origin/main' into hermes/hermes-6360cdf9 2026-03-16 00:29:49 -07:00
Teknium 9cf7e2f0af Merge pull request #1495 from NousResearch/fix/814-group-session-isolation
fix(gateway): default group sessions to per-user isolation
2026-03-16 00:25:43 -07:00
Teknium dd7921d514 fix(honcho): isolate session routing for multi-user gateway (#1500)
Salvaged from PR #1470 by adavyas.

Core fix: Honcho tool calls in a multi-session gateway could route to
the wrong session because honcho_tools.py relied on process-global
state. Now threads session context through the call chain:
  AIAgent._invoke_tool() → handle_function_call() → registry.dispatch()
  → handler **kw → _resolve_session_context()

Changes:
- Add _resolve_session_context() to prefer per-call context over globals
- Plumb honcho_manager + honcho_session_key through handle_function_call
- Add sync_honcho=False to run_conversation() for synthetic flush turns
- Pass honcho_session_key through gateway memory flush lifecycle
- Harden gateway PID detection when /proc cmdline is unreadable
- Make interrupt test scripts import-safe for pytest-xdist
- Wrap BibTeX examples in Jekyll raw blocks for docs build
- Fix thread-order-dependent assertion in client lifecycle test
- Expand Honcho docs: session isolation, lifecycle, routing internals

Dropped from original PR:
- Indentation change in _create_request_openai_client that would move
  client creation inside the lock (causes unnecessary contention)

Co-authored-by: adavyas <adavyas@users.noreply.github.com>
2026-03-16 00:23:47 -07:00
Teknium eb4f0348e1 fix: persist CLI token counts to session DB for /insights
Token usage was tracked in-memory during CLI sessions (session_prompt_tokens,
session_completion_tokens) but never written to the SQLite session DB. The
gateway persisted tokens via session_store.update_session(), but CLI sessions
always showed 0 tokens in /insights.

Now run_agent.py persists token deltas to the DB after each API call for CLI
sessions. Gateway sessions continue to use their existing persist path to
avoid double-counting.
2026-03-16 00:23:13 -07:00
teknium1 38b4fd3737 fix(gateway): make group session isolation configurable
Default group and channel sessions to per-user isolation, allow opting back into shared room sessions via config.yaml, and document Discord gateway routing and session behavior.
2026-03-16 00:22:23 -07:00
ygd58 36dd7a3e8d fix(setup): defer config.yaml write until after model selection
_update_config_for_provider() was called immediately after provider
selection for zai, kimi-coding, minimax, minimax-cn, and anthropic —
before model selection happened. Since the gateway re-reads config.yaml
per-message, this created a race where the gateway would pick up the
new provider but still use the old (incompatible) model name.

Capture selected_base_url in each provider block, then call
_update_config_for_provider() once, after model selection completes,
right before save_config(). The in-memory _set_model_provider() calls
stay in place so the config object remains consistent during setup.

Closes #1182
2026-03-16 00:18:30 -07:00
Teknium dd698f6d5d fix(gateway): SSL certificate auto-detection for NixOS and non-standard systems (#1494)
2026-03-16 00:14:13 -07:00
teknium1 06a7d19f98 fix(gateway): isolate group sessions per user
Include participant identifiers in non-DM session keys when available so group and channel conversations no longer share one transcript across every active user in the chat.
2026-03-15 23:08:56 -07:00
teknium1 3801532bd3 fix(gateway): SSL certificate auto-detection for NixOS and non-standard systems
Add _ensure_ssl_certs() that discovers CA certificate bundles before any
HTTP library is imported.  Resolution order:
1. Python's ssl.get_default_verify_paths()
2. certifi (if installed)
3. Common distro/macOS paths

Only sets SSL_CERT_FILE if not already present in the environment.
Wrapped in a function (called immediately) to avoid polluting module
namespace.
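
The resolution order can be sketched as below. This is a sketch under stated assumptions, not the body of `_ensure_ssl_certs()`: the function names are hypothetical, and the entries in `COMMON_CA_PATHS` are typical bundle locations chosen for illustration, since the commit message does not list the paths it checks.

```python
import os
import ssl
from typing import MutableMapping, Optional

# Assumed examples of common bundle locations (not taken from the commit).
COMMON_CA_PATHS = (
    "/etc/ssl/certs/ca-certificates.crt",  # Debian/Ubuntu
    "/etc/pki/tls/certs/ca-bundle.crt",    # Fedora/RHEL
    "/etc/ssl/cert.pem",                   # macOS, Alpine
)


def find_ca_bundle() -> Optional[str]:
    """Return the first CA bundle found, following the three-step order."""
    # 1. Whatever Python's ssl module already considers the default.
    paths = ssl.get_default_verify_paths()
    if paths.cafile and os.path.exists(paths.cafile):
        return paths.cafile
    # 2. certifi, if it happens to be installed.
    try:
        import certifi
    except ImportError:
        pass
    else:
        candidate = certifi.where()
        if os.path.exists(candidate):
            return candidate
    # 3. Well-known filesystem locations.
    for path in COMMON_CA_PATHS:
        if os.path.exists(path):
            return path
    return None


def ensure_ssl_cert_file(environ: MutableMapping[str, str] = os.environ) -> None:
    """Set SSL_CERT_FILE only when the user has not already set it."""
    if "SSL_CERT_FILE" in environ:
        return
    bundle = find_ca_bundle()
    if bundle:
        environ["SSL_CERT_FILE"] = bundle
```

Running this before importing any HTTP library matters because such libraries may read `SSL_CERT_FILE` once at import or first use.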

Based on PR #1151 by sylvesterroos.
2026-03-15 23:04:34 -07:00
Teknium aaacab7de7 docs: explain checkpoints, /rollback, and git worktrees
* docs: explain checkpoints, rollback, and git worktrees

* fix: correct hermes -w description — auto-creates worktree, takes no path arg

---------

Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
2026-03-15 23:04:07 -07:00
Teknium 4298c6fd9a fix: route background process watcher notifications to Telegram forum topics (#1481)
Salvaged from PR #1146 by spanishflu-est1918.

Background process progress/completion messages were sent with only
chat_id, landing in the general topic instead of the originating forum
topic. Thread the thread_id from HERMES_SESSION_THREAD_ID through the
watcher payload and pass it as metadata to adapter.send() so Telegram
routes notifications to the correct topic.

The env var export (HERMES_SESSION_THREAD_ID in _set_session_env /
_clear_session_env) already existed on main — this commit adds the
missing watcher plumbing.

Co-authored-by: spanishflu-est1918 <spanishflu-est1918@users.noreply.github.com>
2026-03-15 23:01:57 -07:00
Teknium c30505dddd feat: add OSS Security Forensics skill (Skills Hub) (#1482)
* feat: add OSS Security Forensics skill (Skills Hub)

Salvaged from PR #1066 by zagiscoming. Adds a 7-phase multi-agent
investigation framework for GitHub supply chain attack forensics.

Skill contents (optional-skills/security/oss-forensics/):
- SKILL.md: 420-line investigation framework with 8 anti-hallucination
  guardrails, 5 specialist investigators, ethical use guidelines,
  and API rate limiting guidance
- evidence-store.py: CLI evidence manager with add/list/verify/query/
  export/summary + SHA-256 integrity + chain of custody
- references/: evidence types, GH Archive BigQuery guide (expanded with
  12 event types and 6 query templates), recovery techniques (4 methods),
  investigation templates (5 attack patterns)
- templates/: forensic report template (151 lines), malicious package
  report template

Changes from original PR:
- Dropped unrelated core tool changes (delegate_tool.py role parameter,
  AGENTS.md, README.md modifications)
- Removed duplicate skills/security/oss-forensics/ placement
- Fixed github-archive-guide.md (missing from optional-skills/, expanded
  from 33 to 160+ lines with all 12 event types and query templates)
- Added ethical use guidelines and API rate limiting sections
- Rewrote tests to match the v2 evidence store API (12 tests, all pass)

Closes #384

* fix: use python3 and SKILL_DIR paths throughout oss-forensics skill

- Replace all 'python' invocations with 'python3' for portability
  (Ubuntu doesn't ship 'python' by default)
- Replace relative '../scripts/' and '../templates/' paths with
  SKILL_DIR/scripts/ and SKILL_DIR/templates/ convention
- Add path convention note before Phase 0 explaining SKILL_DIR
- Fix double --- separator (cosmetic)
- Applies to SKILL.md, evidence-store.py docstring,
  recovery-techniques.md, and forensic-report.md template

---------

Co-authored-by: zagiscoming <zagiscoming@users.noreply.github.com>
2026-03-15 21:59:53 -07:00
Teknium 70e24d77a1 Merge pull request #1490 from NousResearch/fix/1033-telegram-voice-fallback
fix: restore local STT fallback for gateway voice notes
2026-03-15 21:58:32 -07:00
Teknium fa3db2671a docs(readme): add CLI vs messaging quick reference
Co-authored-by: Frank <97429702+tsubasakong@users.noreply.github.com>
2026-03-15 21:58:11 -07:00
Teknium 6fd9f2a0c5 fix(gateway): null-coalesce mode in SessionResetPolicy.from_dict (#1488)
2026-03-15 21:57:31 -07:00
teknium1 1f72ce71b7 fix: restore local STT fallback for gateway voice notes
Restore local STT command fallback for voice transcription, detect whisper and ffmpeg in common local install paths, and avoid bogus no-provider messaging when only a backend-specific key is missing.
2026-03-15 21:51:40 -07:00
teknium1 102a255575 fix(gateway): null-coalesce mode in SessionResetPolicy.from_dict
Complete the YAML null handling for all three SessionResetPolicy fields.
at_hour and idle_minutes already had null coalescing; mode was still
using data.get('mode', 'both') which returns None when the key exists
with an explicit null value.
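
The pitfall is easy to reproduce in isolation. The two helpers below are hypothetical and exist only to contrast the behaviors; the actual fix in `SessionResetPolicy.from_dict` may be written differently (for example with an explicit `is None` check).

```python
from typing import Optional


def mode_with_default(data: dict) -> Optional[str]:
    # dict.get only falls back to the default when the key is absent.
    return data.get("mode", "both")


def mode_null_coalesced(data: dict) -> str:
    # "or" also covers a key that is present with a None value.
    return data.get("mode") or "both"


yaml_like = {"mode": None}  # what a YAML "mode: null" entry parses to
```

Note that `or` treats an empty string the same as `None`; whether that is acceptable depends on which values the field allows.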

Add regression test covering all-null input.

Based on PR #1120 by stablegenius49.
2026-03-15 21:40:22 -07:00
Teknium 5beb681c70 fix(cli): prefer curses over simple_term_menu in setup.py (#1487) 2026-03-15 21:16:21 -07:00
Teknium c9a9db318e feat(tools): persistent shell mode for local and SSH backends (#1483)
2026-03-15 21:14:01 -07:00
teknium1 01e62c067b merge: resolve conflicts with origin/main (SSH preflight check) 2026-03-15 21:13:40 -07:00
Teknium ceb970c559 fix(terminal): add SSH preflight check (#1486) 2026-03-15 21:09:07 -07:00
teknium1 6894358fe1 docs: add persistent shell section to configuration and env-vars reference
Documents terminal.persistent_shell config option, per-backend env var
overrides, precedence table, and what state persists across commands.
2026-03-15 21:01:50 -07:00
Teknium 3f0f4a04a9 fix(agent): skip reasoning extra_body for unsupported OpenRouter models (#1485)
* fix(agent): skip reasoning extra_body for models that don't support it

Sending reasoning config to models like MiniMax or Nvidia via OpenRouter
causes a 400 BadRequestError. Previously, reasoning extra_body was sent
to all OpenRouter and Nous models unconditionally.

Fix: only send reasoning extra_body when the model slug starts with a
known reasoning-capable prefix (deepseek/, anthropic/, openai/, x-ai/,
google/gemini-2, qwen/qwen3) or when using Nous Portal directly.

Applies to both the main API call path (_build_api_kwargs) and the
conversation summary path.
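
A minimal sketch of the prefix gate described above. The prefix list is copied from this commit message; the function name, the `using_nous_portal` flag, and the lowercasing step are assumptions made for the example.

```python
REASONING_PREFIXES = (
    "deepseek/",
    "anthropic/",
    "openai/",
    "x-ai/",
    "google/gemini-2",
    "qwen/qwen3",
)


def supports_reasoning_extra_body(model_slug: str, using_nous_portal: bool = False) -> bool:
    """Decide whether to attach reasoning config to the request."""
    if using_nous_portal:
        return True  # Nous Portal used directly: always send it
    # str.startswith accepts a tuple and matches any of its entries.
    return model_slug.lower().startswith(REASONING_PREFIXES)
```

An allowlist fails closed: an unknown model simply gets no reasoning config, which avoids the 400 error at the cost of not enabling reasoning on new models until their prefix is added.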

Fixes #1083

* test(agent): cover reasoning extra_body gating

---------

Co-authored-by: ygd58 <buraysandro9@gmail.com>
2026-03-15 20:42:07 -07:00
Teknium c564e1c3dc feat(tools): centralize tool emoji metadata in registry + skin integration (#1484)
2026-03-15 20:35:24 -07:00
teknium1 210d5ade1e feat(tools): centralize tool emoji metadata in registry + skin integration
- Add 'emoji' field to ToolEntry and 'get_emoji()' to ToolRegistry
- Add emoji= to all 50+ registry.register() calls across tool files
- Add get_tool_emoji() helper in agent/display.py with 3-tier resolution:
  skin override → registry default → hardcoded fallback
- Replace hardcoded emoji maps in run_agent.py, delegate_tool.py, and
  gateway/run.py with centralized get_tool_emoji() calls
- Add 'tool_emojis' field to SkinConfig so skins can override per-tool
  emojis (e.g. ares skin could use swords instead of wrenches)
- Add 11 tests (5 registry emoji, 6 display/skin integration)
- Update AGENTS.md skin docs table
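
The three-tier resolution can be sketched with plain dictionaries standing in for the skin and the registry. The function name, parameters, and the fallback glyph are hypothetical; the real `get_tool_emoji()` in `agent/display.py` reads from `SkinConfig` and `ToolRegistry`.

```python
from typing import Mapping


def resolve_tool_emoji(
    tool_name: str,
    skin_overrides: Mapping[str, str],
    registry_defaults: Mapping[str, str],
    fallback: str = "🔧",
) -> str:
    """Skin override first, then the registry default, then a fixed fallback."""
    return (
        skin_overrides.get(tool_name)
        or registry_defaults.get(tool_name)
        or fallback
    )


registry = {"terminal": "💻"}
ares_skin = {"terminal": "⚔️"}  # a skin replacing one tool's emoji
```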

Based on the approach from PR #1061 by ForgingAlex (emoji centralization
in registry). This salvage fixes several issues from the original:
- Does NOT split the cronjob tool (which would crash on missing schemas)
- Does NOT change image_generate toolset/requires_env/is_async
- Does NOT delete existing tests
- Completes the centralization (gateway/run.py was missed)
- Hooks into the skin system for full customizability
2026-03-15 20:21:21 -07:00
teknium1 33ebedc76d feat: enable persistent shell by default for SSH, add config option
SSH persistent shell now defaults to true — non-local backends benefit
most from state persistence across execute() calls. Local backend
remains opt-in via TERMINAL_LOCAL_PERSISTENT env var.

New config.yaml option: terminal.persistent_shell (default: true)
Controls the default for non-local backends. Users can disable with:
  hermes config set terminal.persistent_shell false

Precedence: per-backend env var > TERMINAL_PERSISTENT_SHELL > default.

Wired through cli.py, gateway/run.py, and hermes_cli/config.py so the
config.yaml value reaches terminal_tool via env var bridge.
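
The stated precedence can be sketched as a single lookup function. The env var names come from these commits; the function name, the accepted truthy strings, and the assumption that the general setting does not apply to the local backend (the commit says it controls non-local backends) are this example's own.

```python
from typing import Mapping


def _parse_bool(value: str) -> bool:
    # Assumed set of truthy spellings; the real parser may differ.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def persistent_shell_enabled(backend: str, environ: Mapping[str, str]) -> bool:
    """Per-backend env var > TERMINAL_PERSISTENT_SHELL > built-in default."""
    per_backend = environ.get(f"TERMINAL_{backend.upper()}_PERSISTENT")
    if per_backend is not None:
        return _parse_bool(per_backend)
    if backend == "local":
        return False  # local stays opt-in via TERMINAL_LOCAL_PERSISTENT
    general = environ.get("TERMINAL_PERSISTENT_SHELL")
    if general is not None:
        return _parse_bool(general)
    return True  # default for non-local backends such as SSH
```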
2026-03-15 20:17:13 -07:00
teknium1 5b80654198 feat(tools): add persistent shell mode to local and SSH backends
Cherry-picked from PR #1067 by alt-glitch.
Adds PersistentShellMixin with file-based IPC protocol for long-lived
bash shells. LocalEnvironment and SSHEnvironment gain persistent=True
option. Controlled via TERMINAL_LOCAL_PERSISTENT / TERMINAL_SSH_PERSISTENT
env vars. Fixes latent stderr pipe buffer deadlock.

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
2026-03-15 20:13:02 -07:00
Teknium 25e53f3c1a fix(custom-endpoint): verify /models and suggest working /v1 base URL (#1480) 2026-03-15 20:09:50 -07:00
Teknium 103f7b1ebc fix: verbose mode shows full untruncated output
* fix(cli): silence tirith prefetch install warnings at startup

* fix: verbose mode now shows full untruncated tool args, results, content, and think blocks

When tool progress is set to 'verbose' (via /verbose or config), the display
was still truncating tool arguments to 100 chars, tool results to 100-200 chars,
assistant content to 100 chars, and think blocks to 5 lines. This defeated the
purpose of verbose mode.

Changes:
- Tool args: show full JSON args (not truncated to log_prefix_chars)
- Tool results: show full result content in both display and debug logs
- Assistant content: show full content during tool-call loops
- Think blocks: show full reasoning text (not truncated to 5 lines/100 chars)
- Auto-enable reasoning display when verbose mode is active
- Fix initial agent creation to respect verbose config (was always quiet_mode=True)
- Updated verbose label to mention think blocks
2026-03-15 20:03:37 -07:00
Teknium a56937735e fix(telegram): escape chunk indicators in MarkdownV2 (#1478) 2026-03-15 19:27:15 -07:00
Teknium 7148534401 fix(gateway): make /status report live state and tokens (#1476) 2026-03-15 19:18:58 -07:00
Teknium 4e91b0240b fix(honcho): correct seed_ai_identity to use session.add_messages() (#1475)
The seed_ai_identity method was calling assistant_peer.add_message() which
doesn't exist on the Honcho SDK's Peer class. Fixed to use the correct
pattern: session.add_messages([peer.message(content)]), matching the
existing message sync code at line 294.

Discovered and fixed by Yuqi (Hermes Agent), Angello's AI companion.

Co-authored-by: Angello Picasso <angello.picasso@devsu.com>
2026-03-15 19:07:57 -07:00
Teknium 5e92a4ce5a fix: auto-reload MCP tools when mcp_servers config changes without restart (#1474)
Fixes #1036

After adding an MCP server to config.yaml, users had to restart Hermes
before the new tools became visible — even though /reload-mcp existed.

Add _check_config_mcp_changes() called from process_loop every 5s:
- stat() config.yaml for mtime changes (fast path, no YAML parse)
- On mtime change, parse and compare mcp_servers section
- If mcp_servers changed, auto-trigger _reload_mcp() and notify user
- Skip check while agent is running to avoid interrupting tool calls
- Throttled to CONFIG_WATCH_INTERVAL=5s to avoid busy-polling

/reload-mcp still works for manual force-reload.
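
The two-stage check (cheap mtime comparison first, parse and compare only when the mtime moved) can be sketched like this. The class and its loader callback are hypothetical; the 5-second throttle and the "skip while the agent is running" guard described above are left out to keep the example short.

```python
import os
from typing import Any, Callable


class McpConfigWatcher:
    """Detect changes to the mcp_servers section of a config file."""

    def __init__(self, path: str, load_mcp_servers: Callable[[str], Any]) -> None:
        self.path = path
        self._load = load_mcp_servers  # parses the file, returns the section
        self._mtime_ns = os.stat(path).st_mtime_ns
        self._servers = load_mcp_servers(path)

    def changed(self) -> bool:
        """Return True only when the parsed mcp_servers section differs."""
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns == self._mtime_ns:
            return False  # fast path: file untouched, no parse needed
        self._mtime_ns = mtime_ns
        servers = self._load(self.path)
        if servers == self._servers:
            return False  # file changed, but not the mcp_servers section
        self._servers = servers
        return True
```

A caller would poll `changed()` from its loop and trigger the reload when it returns True; edits to unrelated config keys cost one parse and no reload.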

Tests: 6 new tests in TestMCPConfigWatch, all passed

Co-authored-by: teyrebaz33 <hakanerten02@hotmail.com>
2026-03-15 19:03:34 -07:00
Teknium 471c663fdf fix(cli): silence tirith prefetch install warnings at startup (#1452) 2026-03-15 18:07:03 -07:00
Teknium 64d333204b Merge pull request #1242 from NousResearch/fix/file-tool-log-noise
fix: reduce file tool log noise
2026-03-15 11:11:18 -07:00
Teknium c44af43840 Merge pull request #1401 from NousResearch/hermes/hermes-eca4a640
test: protect atomic temp cleanup on interrupts
2026-03-15 11:10:41 -07:00
alt-glitch 4511322f56 Merge origin/main into sid/persistent-backend
Resolve conflict in local.py: keep refactored _make_run_env helper
over inline _sanitize_subprocess_env logic.
2026-03-15 21:08:11 +05:30
teknium1 b117bbc125 test: cover atomic temp cleanup on interrupts
- add regression coverage for BaseException cleanup in atomic_json_write
- add dedicated atomic_yaml_write tests, including interrupt cleanup
- document why BaseException is intentional in both helpers
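
To show why `BaseException` is the deliberate choice, here is a minimal sketch of an atomic JSON write with temp-file cleanup. It is not the project's `atomic_json_write`; the name, signature, and the injectable `dump` parameter (used only to simulate an interrupt) are assumptions.

```python
import json
import os
import tempfile
from typing import Any, Callable


def atomic_json_write_sketch(
    path: str,
    data: Any,
    dump: Callable[[Any, Any], None] = json.dump,
) -> None:
    """Write JSON to a temp file, then atomically replace the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            dump(data, handle)
        os.replace(tmp_path, path)  # atomic within one filesystem
    except BaseException:
        # KeyboardInterrupt and SystemExit do not derive from Exception,
        # so "except Exception" would leave the temp file behind.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

The handler re-raises, so an interrupt still propagates; the only added behavior is removing the orphaned temp file first.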
2026-03-14 22:31:51 -07:00
alt-glitch e266530c7d add different polling intervals for ssh and local backends. ssh has a
longer roundtrip
2026-03-15 02:54:32 +05:30
alt-glitch 879b7d3fbf fix(tests): update mock stdout in env blocklist tests
The fake_popen mock used iter([]) for proc.stdout which doesn't
support .close(). Use MagicMock with __iter__ instead, since
_drain_stdout now calls proc.stdout.close() in its finally block.
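
The difference between the two mocks can be shown in a few lines. `make_fake_stdout` is a hypothetical helper name; the `MagicMock` behavior it relies on (configurable `__iter__`, auto-created `close`) is standard `unittest.mock`.

```python
from unittest.mock import MagicMock


def make_fake_stdout(lines):
    """Build a stdout stand-in that is iterable and also has .close()."""
    stdout = MagicMock()
    # MagicMock supports configuring magic methods; a list keeps it re-iterable.
    stdout.__iter__.return_value = list(lines)
    return stdout


plain_iterator = iter([])         # the old mock: no .close() attribute
fake_stdout = make_fake_stdout(["line 1\n", "line 2\n"])
collected = list(fake_stdout)     # iterates like a real pipe
fake_stdout.close()               # recorded by the mock instead of raising
```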

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 02:48:05 +05:30
alt-glitch 9f36483bf4 refactor: deduplicate execute/cleanup, merge init, clean up helpers
- Merge _init_persistent_shell + _start_persistent_shell into single method
- Move execute() dispatcher and cleanup() into PersistentShellMixin
  so LocalEnvironment and SSHEnvironment inherit them
- Remove broad except Exception wrappers from _execute_oneshot in both backends
- Replace try/except with os.path.exists checks in local _read_temp_files
  and _cleanup_temp_files
- Remove redundant bash -c from SSH oneshot (SSH already runs in a shell)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 02:39:56 +05:30
alt-glitch 7be314c456 pass configs to file_tools for r+w over ssh.
pass TERM env.
default to ~ in local and ssh backends.
ssh backend.
2026-03-15 02:26:39 +05:30
balyan.sid@gmail.com 9001b34146 simplify docstrings, fix some bugs 2026-03-15 01:20:42 +05:30
balyan.sid@gmail.com 861202b56c wip: add persistent shell to ssh and local terminal backends 2026-03-15 01:20:42 +05:30
balyan.sid@gmail.com 9d63dcc3f9 add persistent ssh backend 2026-03-15 01:19:38 +05:30
teknium1 b59da08730 fix: reduce file tool log noise
- treat git diff --cached --quiet rc=1 as an expected checkpoint state
  instead of logging it as an error
- downgrade expected write PermissionError/EROFS/EACCES failures out of
  error logging while keeping unexpected exceptions at error level
- add regression tests for both logging behaviors
2026-03-13 22:14:00 -07:00
204 changed files with 20163 additions and 3349 deletions
+42 -5
@@ -129,14 +129,50 @@ Messages follow OpenAI format: `{"role": "system/user/assistant/tool", ...}`. Re
- **KawaiiSpinner** (`agent/display.py`) — animated faces during API calls, `┊` activity feed for tool results
- `load_cli_config()` in cli.py merges hardcoded defaults + user config YAML
- **Skin engine** (`hermes_cli/skin_engine.py`) — data-driven CLI theming; initialized from `display.skin` config key at startup; skins customize banner colors, spinner faces/verbs/wings, tool prefix, response box, branding text
- `process_command()` is a method on `HermesCLI` (not in commands.py)
- `process_command()` is a method on `HermesCLI` — dispatches on canonical command name resolved via `resolve_command()` from the central registry
- Skill slash commands: `agent/skill_commands.py` scans `~/.hermes/skills/`, injects as **user message** (not system prompt) to preserve prompt caching
### Adding CLI Commands
### Slash Command Registry (`hermes_cli/commands.py`)
1. Add to `COMMANDS` dict in `hermes_cli/commands.py`
2. Add handler in `HermesCLI.process_command()` in `cli.py`
3. For persistent settings, use `save_config_value()` in `cli.py`
All slash commands are defined in a central `COMMAND_REGISTRY` list of `CommandDef` objects. Every downstream consumer derives from this registry automatically:
- **CLI** — `process_command()` resolves aliases via `resolve_command()`, dispatches on canonical name
- **Gateway** — `GATEWAY_KNOWN_COMMANDS` frozenset for hook emission, `resolve_command()` for dispatch
- **Gateway help** — `gateway_help_lines()` generates `/help` output
- **Telegram** — `telegram_bot_commands()` generates the BotCommand menu
- **Slack** — `slack_subcommand_map()` generates `/hermes` subcommand routing
- **Autocomplete** — `COMMANDS` flat dict feeds `SlashCommandCompleter`
- **CLI help** — `COMMANDS_BY_CATEGORY` dict feeds `show_help()`
### Adding a Slash Command
1. Add a `CommandDef` entry to `COMMAND_REGISTRY` in `hermes_cli/commands.py`:
```python
CommandDef("mycommand", "Description of what it does", "Session",
aliases=("mc",), args_hint="[arg]"),
```
2. Add handler in `HermesCLI.process_command()` in `cli.py`:
```python
elif canonical == "mycommand":
self._handle_mycommand(cmd_original)
```
3. If the command is available in the gateway, add a handler in `gateway/run.py`:
```python
if canonical == "mycommand":
return await self._handle_mycommand(event)
```
4. For persistent settings, use `save_config_value()` in `cli.py`
**CommandDef fields:**
- `name` — canonical name without slash (e.g. `"background"`)
- `description` — human-readable description
- `category` — one of `"Session"`, `"Configuration"`, `"Tools & Skills"`, `"Info"`, `"Exit"`
- `aliases` — tuple of alternative names (e.g. `("bg",)`)
- `args_hint` — argument placeholder shown in help (e.g. `"<prompt>"`, `"[name]"`)
- `cli_only` — only available in the interactive CLI
- `gateway_only` — only available in messaging platforms
**Adding an alias** requires only adding it to the `aliases` tuple on the existing `CommandDef`. No other file changes needed — dispatch, help text, Telegram menu, Slack mapping, and autocomplete all update automatically.
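
To illustrate why an alias needs no other changes, here is a minimal sketch of a registry with a derived lookup table. The field names and positional order follow the `CommandDef` description above; the dataclass defaults, the two sample entries' descriptions, the `_LOOKUP` table, and the return type of `resolve_command` are assumptions rather than the contents of `hermes_cli/commands.py`.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class CommandDef:
    name: str
    description: str
    category: str
    aliases: Tuple[str, ...] = ()
    args_hint: str = ""
    cli_only: bool = False
    gateway_only: bool = False


COMMAND_REGISTRY = [
    CommandDef("background", "Run a prompt in the background", "Session",
               aliases=("bg",), args_hint="<prompt>"),
    CommandDef("mycommand", "Description of what it does", "Session",
               aliases=("mc",), args_hint="[arg]"),
]

# Derived once from the registry: canonical names and aliases share one table.
_LOOKUP: Dict[str, CommandDef] = {}
for _cmd in COMMAND_REGISTRY:
    _LOOKUP[_cmd.name] = _cmd
    for _alias in _cmd.aliases:
        _LOOKUP[_alias] = _cmd


def resolve_command(text: str) -> Optional[CommandDef]:
    """Map '/bg' or 'bg' to the CommandDef for 'background', else None."""
    return _LOOKUP.get(text.lstrip("/").lower())
```

Since every consumer goes through the derived table, adding `"bg2"` to an `aliases` tuple is enough for dispatch to pick it up.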
---
@@ -235,6 +271,7 @@ hermes_cli/skin_engine.py # SkinConfig dataclass, built-in skins, YAML loader
| Spinner verbs | `spinner.thinking_verbs` | `display.py` |
| Spinner wings (optional) | `spinner.wings` | `display.py` |
| Tool output prefix | `tool_prefix` | `display.py` |
| Per-tool emojis | `tool_emojis` | `display.py` → `get_tool_emoji()` |
| Agent name | `branding.agent_name` | `banner.py`, `cli.py` |
| Welcome message | `branding.welcome` | `cli.py` |
| Response box label | `branding.response_label` | `cli.py` |
+1 -1
@@ -136,7 +136,7 @@ hermes-agent/
│ ├── auth.py # Provider resolution, OAuth, Nous Portal
│ ├── models.py # OpenRouter model selection lists
│ ├── banner.py # Welcome banner, ASCII art
│ ├── commands.py # Slash command definitions + autocomplete
│ ├── commands.py # Central slash command registry (CommandDef), autocomplete, gateway helpers
│ ├── callbacks.py # Interactive callbacks (clarify, sudo, approval)
│ ├── doctor.py # Diagnostics
│ ├── skills_hub.py # Skills Hub CLI + /skills slash command
+18
@@ -62,6 +62,24 @@ hermes doctor # Diagnose any issues
📖 **[Full documentation →](https://hermes-agent.nousresearch.com/docs/)**
## CLI vs Messaging Quick Reference
Hermes has two entry points: start the terminal UI with `hermes`, or run the gateway and talk to it from Telegram, Discord, Slack, WhatsApp, Signal, or Email. Once you're in a conversation, many slash commands are shared across both interfaces.
| Action | CLI | Messaging platforms |
|---------|-----|---------------------|
| Start chatting | `hermes` | Run `hermes gateway setup` + `hermes gateway start`, then send the bot a message |
| Start fresh conversation | `/new` or `/reset` | `/new` or `/reset` |
| Change model | `/model [provider:model]` | `/model [provider:model]` |
| Set a personality | `/personality [name]` | `/personality [name]` |
| Retry or undo the last turn | `/retry`, `/undo` | `/retry`, `/undo` |
| Compress context / check usage | `/compress`, `/usage`, `/insights [--days N]` | `/compress`, `/usage`, `/insights [days]` |
| Browse skills | `/skills` or `/<skill-name>` | `/skills` or `/<skill-name>` |
| Interrupt current work | `Ctrl+C` or send a new message | `/stop` or send a new message |
| Platform-specific status | `/platforms` | `/status`, `/sethome` |
For the full command lists, see the [CLI guide](https://hermes-agent.nousresearch.com/docs/user-guide/cli) and the [Messaging Gateway guide](https://hermes-agent.nousresearch.com/docs/user-guide/messaging).
---
## Documentation
+377
@@ -0,0 +1,377 @@
# Hermes Agent v0.3.0 (v2026.3.17)
**Release Date:** March 17, 2026
> The streaming, plugins, and provider release — unified real-time token delivery, first-class plugin architecture, rebuilt provider system with Vercel AI Gateway, native Anthropic provider, smart approvals, live Chrome CDP browser connect, ACP IDE integration, Honcho memory, voice mode, persistent shell, and 50+ bug fixes across every platform.
---
## ✨ Highlights
- **Unified Streaming Infrastructure** — Real-time token-by-token delivery in CLI and all gateway platforms. Responses stream as they're generated instead of arriving as a block. ([#1538](https://github.com/NousResearch/hermes-agent/pull/1538))
- **First-Class Plugin Architecture** — Drop Python files into `~/.hermes/plugins/` to extend Hermes with custom tools, commands, and hooks. No forking required. ([#1544](https://github.com/NousResearch/hermes-agent/pull/1544), [#1555](https://github.com/NousResearch/hermes-agent/pull/1555))
- **Native Anthropic Provider** — Direct Anthropic API calls with Claude Code credential auto-discovery, OAuth PKCE flows, and native prompt caching. No OpenRouter middleman needed. ([#1097](https://github.com/NousResearch/hermes-agent/pull/1097))
- **Smart Approvals + /stop Command** — Codex-inspired approval system that learns which commands are safe and remembers your preferences. `/stop` kills the current agent run immediately. ([#1543](https://github.com/NousResearch/hermes-agent/pull/1543))
- **Honcho Memory Integration** — Async memory writes, configurable recall modes, session title integration, and multi-user isolation in gateway mode. By @erosika. ([#736](https://github.com/NousResearch/hermes-agent/pull/736))
- **Voice Mode** — Push-to-talk in CLI, voice notes in Telegram/Discord, Discord voice channel support, and local Whisper transcription via faster-whisper. ([#1299](https://github.com/NousResearch/hermes-agent/pull/1299), [#1185](https://github.com/NousResearch/hermes-agent/pull/1185), [#1429](https://github.com/NousResearch/hermes-agent/pull/1429))
- **Concurrent Tool Execution** — Multiple independent tool calls now run in parallel via ThreadPoolExecutor, significantly reducing latency for multi-tool turns. ([#1152](https://github.com/NousResearch/hermes-agent/pull/1152))
- **PII Redaction** — When `privacy.redact_pii` is enabled, personally identifiable information is automatically scrubbed before sending context to LLM providers. ([#1542](https://github.com/NousResearch/hermes-agent/pull/1542))
- **`/browser connect` via CDP** — Attach browser tools to a live Chrome instance through Chrome DevTools Protocol. Debug, inspect, and interact with pages you already have open. ([#1549](https://github.com/NousResearch/hermes-agent/pull/1549))
- **Vercel AI Gateway Provider** — Route Hermes through Vercel's AI Gateway for access to their model catalog and infrastructure. ([#1628](https://github.com/NousResearch/hermes-agent/pull/1628))
- **Centralized Provider Router** — Rebuilt provider system with `call_llm` API, unified `/model` command, auto-detect provider on model switch, and direct endpoint overrides for auxiliary/delegation clients. ([#1003](https://github.com/NousResearch/hermes-agent/pull/1003), [#1506](https://github.com/NousResearch/hermes-agent/pull/1506), [#1375](https://github.com/NousResearch/hermes-agent/pull/1375))
- **ACP Server (IDE Integration)** — VS Code, Zed, and JetBrains can now connect to Hermes as an agent backend, with full slash command support. ([#1254](https://github.com/NousResearch/hermes-agent/pull/1254), [#1532](https://github.com/NousResearch/hermes-agent/pull/1532))
- **Persistent Shell Mode** — Local and SSH terminal backends can maintain shell state across tool calls — cd, env vars, and aliases persist. By @alt-glitch. ([#1067](https://github.com/NousResearch/hermes-agent/pull/1067), [#1483](https://github.com/NousResearch/hermes-agent/pull/1483))
- **Agentic On-Policy Distillation (OPD)** — New RL training environment for distilling agent policies, expanding the Atropos training ecosystem. ([#1149](https://github.com/NousResearch/hermes-agent/pull/1149))
---
## 🏗️ Core Agent & Architecture
### Provider & Model Support
- **Centralized provider router** with `call_llm` API and unified `/model` command — switch models and providers seamlessly ([#1003](https://github.com/NousResearch/hermes-agent/pull/1003))
- **Vercel AI Gateway** provider support ([#1628](https://github.com/NousResearch/hermes-agent/pull/1628))
- **Auto-detect provider** when switching models via `/model` ([#1506](https://github.com/NousResearch/hermes-agent/pull/1506))
- **Direct endpoint overrides** for auxiliary and delegation clients — point vision/subagent calls at specific endpoints ([#1375](https://github.com/NousResearch/hermes-agent/pull/1375))
- **Native Anthropic auxiliary vision** — use Claude's native vision API instead of routing through OpenAI-compatible endpoints ([#1377](https://github.com/NousResearch/hermes-agent/pull/1377))
- Anthropic OAuth flow improvements — auto-run `claude setup-token`, reauthentication, PKCE state persistence, identity fingerprinting ([#1132](https://github.com/NousResearch/hermes-agent/pull/1132), [#1360](https://github.com/NousResearch/hermes-agent/pull/1360), [#1396](https://github.com/NousResearch/hermes-agent/pull/1396), [#1597](https://github.com/NousResearch/hermes-agent/pull/1597))
- Fix adaptive thinking without `budget_tokens` for Claude 4.6 models — by @ASRagab ([#1128](https://github.com/NousResearch/hermes-agent/pull/1128))
- Fix Anthropic cache markers through adapter — by @brandtcormorant ([#1216](https://github.com/NousResearch/hermes-agent/pull/1216))
- Retry Anthropic 429/529 errors and surface details to users — by @0xbyt4 ([#1585](https://github.com/NousResearch/hermes-agent/pull/1585))
- Fix Anthropic adapter `max_tokens`, fallback crash, and proxy `base_url` — by @0xbyt4 ([#1121](https://github.com/NousResearch/hermes-agent/pull/1121))
- Fix DeepSeek V3 parser dropping multiple parallel tool calls — by @mr-emmett-one ([#1365](https://github.com/NousResearch/hermes-agent/pull/1365), [#1300](https://github.com/NousResearch/hermes-agent/pull/1300))
- Accept unlisted models with warning instead of rejecting ([#1047](https://github.com/NousResearch/hermes-agent/pull/1047), [#1102](https://github.com/NousResearch/hermes-agent/pull/1102))
- Skip reasoning params for unsupported OpenRouter models ([#1485](https://github.com/NousResearch/hermes-agent/pull/1485))
- MiniMax Anthropic API compatibility fix ([#1623](https://github.com/NousResearch/hermes-agent/pull/1623))
- Custom endpoint `/models` verification and `/v1` base URL suggestion ([#1480](https://github.com/NousResearch/hermes-agent/pull/1480))
- Resolve delegation providers from `custom_providers` config ([#1328](https://github.com/NousResearch/hermes-agent/pull/1328))
- Kimi model additions and User-Agent fix ([#1039](https://github.com/NousResearch/hermes-agent/pull/1039))
- Strip `call_id`/`response_item_id` for Mistral compatibility ([#1058](https://github.com/NousResearch/hermes-agent/pull/1058))
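The retry behaviour for Anthropic 429/529 responses listed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the adapter's actual code: `ProviderError`, `call_with_retry`, the attempt count, and the backoff delays are all hypothetical names and values chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# 429 = rate limited, 529 = overloaded (the two statuses named in the release note).
RETRYABLE_STATUS = {429, 529}


class ProviderError(Exception):
    """Hypothetical error type that carries an HTTP status code."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(message or f"HTTP {status}")
        self.status = status


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying retryable statuses with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except ProviderError as err:
            # Give up on non-retryable errors or when attempts are exhausted,
            # re-raising so the caller can surface the details to the user.
            if err.status not in RETRYABLE_STATUS or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real delays.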
### Agent Loop & Conversation
- **Anthropic Context Editing API** support ([#1147](https://github.com/NousResearch/hermes-agent/pull/1147))
- Improved context compaction handoff summaries — compressor now preserves more actionable state ([#1273](https://github.com/NousResearch/hermes-agent/pull/1273))
- Sync `session_id` after mid-run context compression ([#1160](https://github.com/NousResearch/hermes-agent/pull/1160))
- Session hygiene threshold tuned to 50% for more proactive compression ([#1096](https://github.com/NousResearch/hermes-agent/pull/1096), [#1161](https://github.com/NousResearch/hermes-agent/pull/1161))
- Include session ID in system prompt via `--pass-session-id` flag ([#1040](https://github.com/NousResearch/hermes-agent/pull/1040))
- Prevent closed OpenAI client reuse across retries ([#1391](https://github.com/NousResearch/hermes-agent/pull/1391))
- Sanitize chat payloads and provider precedence ([#1253](https://github.com/NousResearch/hermes-agent/pull/1253))
- Handle dict tool call arguments from Codex and local backends ([#1393](https://github.com/NousResearch/hermes-agent/pull/1393), [#1440](https://github.com/NousResearch/hermes-agent/pull/1440))
### Memory & Sessions
- **Improve memory prioritization** — user preferences and corrections weighted above procedural knowledge ([#1548](https://github.com/NousResearch/hermes-agent/pull/1548))
- Tighter memory and session recall guidance in system prompts ([#1329](https://github.com/NousResearch/hermes-agent/pull/1329))
- Persist CLI token counts to session DB for `/insights` ([#1498](https://github.com/NousResearch/hermes-agent/pull/1498))
- Keep Honcho recall out of the cached system prefix ([#1201](https://github.com/NousResearch/hermes-agent/pull/1201))
- Correct `seed_ai_identity` to use `session.add_messages()` ([#1475](https://github.com/NousResearch/hermes-agent/pull/1475))
- Isolate Honcho session routing for multi-user gateway ([#1500](https://github.com/NousResearch/hermes-agent/pull/1500))
---
## 📱 Messaging Platforms (Gateway)
### Gateway Core
- **System gateway service mode** — run as a system-level systemd service, not just user-level ([#1371](https://github.com/NousResearch/hermes-agent/pull/1371))
- **Gateway install scope prompts** — choose user vs system scope during setup ([#1374](https://github.com/NousResearch/hermes-agent/pull/1374))
- **Reasoning hot reload** — change reasoning settings without restarting the gateway ([#1275](https://github.com/NousResearch/hermes-agent/pull/1275))
- Default group sessions to per-user isolation — no more shared state across users in group chats ([#1495](https://github.com/NousResearch/hermes-agent/pull/1495), [#1417](https://github.com/NousResearch/hermes-agent/pull/1417))
- Harden gateway restart recovery ([#1310](https://github.com/NousResearch/hermes-agent/pull/1310))
- Cancel active runs during shutdown ([#1427](https://github.com/NousResearch/hermes-agent/pull/1427))
- SSL certificate auto-detection for NixOS and non-standard systems ([#1494](https://github.com/NousResearch/hermes-agent/pull/1494))
- Auto-detect D-Bus session bus for `systemctl --user` on headless servers ([#1601](https://github.com/NousResearch/hermes-agent/pull/1601))
- Auto-enable systemd linger during gateway install on headless servers ([#1334](https://github.com/NousResearch/hermes-agent/pull/1334))
- Fall back to module entrypoint when `hermes` is not on PATH ([#1355](https://github.com/NousResearch/hermes-agent/pull/1355))
- Fix dual gateways on macOS launchd after `hermes update` ([#1567](https://github.com/NousResearch/hermes-agent/pull/1567))
- Remove recursive ExecStop from systemd units ([#1530](https://github.com/NousResearch/hermes-agent/pull/1530))
- Prevent logging handler accumulation in gateway mode ([#1251](https://github.com/NousResearch/hermes-agent/pull/1251))
- Restart on retryable startup failures — by @jplew ([#1517](https://github.com/NousResearch/hermes-agent/pull/1517))
- Backfill model on gateway sessions after agent runs ([#1306](https://github.com/NousResearch/hermes-agent/pull/1306))
- PID-based gateway kill and deferred config write ([#1499](https://github.com/NousResearch/hermes-agent/pull/1499))
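To illustrate what per-user isolation of group sessions means in practice, here is a minimal sketch. It assumes sessions are looked up by a string key; the `session_key` function and the key layout are hypothetical and are not taken from the gateway code.

```python
from typing import Optional


def session_key(
    platform: str,
    chat_id: str,
    user_id: Optional[str],
    per_user: bool = True,
) -> str:
    """Build a lookup key for a conversation session (hypothetical layout).

    With per_user=True, two people talking in the same group chat get
    different keys and therefore separate conversation state.
    """
    if per_user and user_id is not None:
        return f"{platform}:{chat_id}:{user_id}"
    return f"{platform}:{chat_id}"
```

With `per_user=False` every participant in a chat maps to the same key, which corresponds to the shared-state behaviour that is no longer the default.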
### Telegram
- Buffer media groups to prevent self-interruption from photo bursts ([#1341](https://github.com/NousResearch/hermes-agent/pull/1341), [#1422](https://github.com/NousResearch/hermes-agent/pull/1422))
- Retry on transient TLS failures during connect and send ([#1535](https://github.com/NousResearch/hermes-agent/pull/1535))
- Harden polling conflict handling ([#1339](https://github.com/NousResearch/hermes-agent/pull/1339))
- Escape chunk indicators and inline code in MarkdownV2 ([#1478](https://github.com/NousResearch/hermes-agent/pull/1478), [#1626](https://github.com/NousResearch/hermes-agent/pull/1626))
- Check updater/app state before disconnect ([#1389](https://github.com/NousResearch/hermes-agent/pull/1389))
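The media-group buffering fix can be pictured as collecting every item that shares a group identifier and handing the group over only once no new item has arrived for a short quiet period. The sketch below assumes that approach; `MediaGroupBuffer`, its methods, and the one-second window are hypothetical and not taken from the adapter.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class MediaGroupBuffer:
    """Collect items per media group and release them after a quiet period."""

    def __init__(self, quiet_seconds: float = 1.0) -> None:
        self.quiet_seconds = quiet_seconds
        self._items: Dict[str, List[str]] = defaultdict(list)
        self._last_seen: Dict[str, float] = {}

    def add(self, group_id: str, file_id: str, now: float) -> None:
        """Record one item of a group and remember when it arrived."""
        self._items[group_id].append(file_id)
        self._last_seen[group_id] = now

    def pop_ready(self, now: float) -> List[Tuple[str, List[str]]]:
        """Return and forget every group that has been quiet long enough."""
        ready = [
            group_id
            for group_id, seen in self._last_seen.items()
            if now - seen >= self.quiet_seconds
        ]
        result = []
        for group_id in ready:
            result.append((group_id, self._items.pop(group_id)))
            del self._last_seen[group_id]
        return result
```

Because a burst of photos is released as one batch, the agent sees a single incoming message instead of several that interrupt each other.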
### Discord
- `/thread` command with `auto_thread` config and media metadata fixes ([#1178](https://github.com/NousResearch/hermes-agent/pull/1178))
- Auto-thread on @mention, skip mention text in bot threads ([#1438](https://github.com/NousResearch/hermes-agent/pull/1438))
- Retry without reply reference for system messages ([#1385](https://github.com/NousResearch/hermes-agent/pull/1385))
- Preserve native document and video attachment support ([#1392](https://github.com/NousResearch/hermes-agent/pull/1392))
- Defer discord adapter annotations to avoid optional import crashes ([#1314](https://github.com/NousResearch/hermes-agent/pull/1314))
### Slack
- Thread handling overhaul — progress messages, responses, and session isolation all respect threads ([#1103](https://github.com/NousResearch/hermes-agent/pull/1103))
- Formatting, reactions, user resolution, and command improvements ([#1106](https://github.com/NousResearch/hermes-agent/pull/1106))
- Fix `MAX_MESSAGE_LENGTH`: 3900 → 39000 ([#1117](https://github.com/NousResearch/hermes-agent/pull/1117))
- File upload fallback preserves thread context — by @0xbyt4 ([#1122](https://github.com/NousResearch/hermes-agent/pull/1122))
- Improve setup guidance ([#1387](https://github.com/NousResearch/hermes-agent/pull/1387))
### Email
- Fix IMAP UID tracking and SMTP TLS verification ([#1305](https://github.com/NousResearch/hermes-agent/pull/1305))
- Add `skip_attachments` option via config.yaml ([#1536](https://github.com/NousResearch/hermes-agent/pull/1536))
### Home Assistant
- Event filtering closed by default ([#1169](https://github.com/NousResearch/hermes-agent/pull/1169))
---
## 🖥️ CLI & User Experience
### Interactive CLI
- **Persistent CLI status bar** — always-visible model, provider, and token counts ([#1522](https://github.com/NousResearch/hermes-agent/pull/1522))
- **File path autocomplete** in the input prompt ([#1545](https://github.com/NousResearch/hermes-agent/pull/1545))
- **`/plan` command** — generate implementation plans from specs ([#1372](https://github.com/NousResearch/hermes-agent/pull/1372), [#1381](https://github.com/NousResearch/hermes-agent/pull/1381))
- **Major `/rollback` improvements** — richer checkpoint history, clearer UX ([#1505](https://github.com/NousResearch/hermes-agent/pull/1505))
- **Preload CLI skills on launch** — skills are ready before the first prompt ([#1359](https://github.com/NousResearch/hermes-agent/pull/1359))
- **Centralized slash command registry** — all commands defined once, consumed everywhere ([#1603](https://github.com/NousResearch/hermes-agent/pull/1603))
- `/bg` alias for `/background` ([#1590](https://github.com/NousResearch/hermes-agent/pull/1590))
- Prefix matching for slash commands — `/mod` resolves to `/model` ([#1320](https://github.com/NousResearch/hermes-agent/pull/1320))
- `/new`, `/reset`, `/clear` now start genuinely fresh sessions ([#1237](https://github.com/NousResearch/hermes-agent/pull/1237))
- Accept session ID prefixes for session actions ([#1425](https://github.com/NousResearch/hermes-agent/pull/1425))
- TUI prompt and accent output now respect active skin ([#1282](https://github.com/NousResearch/hermes-agent/pull/1282))
- Centralize tool emoji metadata in registry + skin integration ([#1484](https://github.com/NousResearch/hermes-agent/pull/1484))
- "View full command" option added to dangerous command approval — by @teknium1, based on a community design ([#887](https://github.com/NousResearch/hermes-agent/pull/887))
- Non-blocking startup update check and banner deduplication ([#1386](https://github.com/NousResearch/hermes-agent/pull/1386))
- `/reasoning` command output ordering and inline think extraction fixes ([#1031](https://github.com/NousResearch/hermes-agent/pull/1031))
- Verbose mode shows full untruncated output ([#1472](https://github.com/NousResearch/hermes-agent/pull/1472))
- Fix `/status` to report live state and tokens ([#1476](https://github.com/NousResearch/hermes-agent/pull/1476))
- Seed a default global SOUL.md ([#1311](https://github.com/NousResearch/hermes-agent/pull/1311))
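Prefix matching for slash commands, as in `/mod` resolving to `/model`, can be sketched as follows. This is an illustration under assumptions: `resolve_command` is a hypothetical helper, and treating an ambiguous prefix as "no match" is a design choice made here, not necessarily what the registry does.

```python
from typing import Optional, Sequence


def resolve_command(typed: str, commands: Sequence[str]) -> Optional[str]:
    """Resolve user input such as '/mod' to a full command name.

    An exact match always wins. Otherwise the input must be a prefix of
    exactly one known command; ambiguous or unknown input returns None.
    """
    name = typed.lstrip("/").lower()
    if not name:
        return None
    if name in commands:
        return name
    matches = [cmd for cmd in commands if cmd.startswith(name)]
    return matches[0] if len(matches) == 1 else None
```

Checking for an exact match first keeps short aliases such as `bg` usable even when they are also a prefix of a longer command.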
### Setup & Configuration
- **OpenClaw migration** during first-time setup — by @kshitijk4poor ([#981](https://github.com/NousResearch/hermes-agent/pull/981))
- `hermes claw migrate` command + migration docs ([#1059](https://github.com/NousResearch/hermes-agent/pull/1059))
- Smart vision setup that respects the user's chosen provider ([#1323](https://github.com/NousResearch/hermes-agent/pull/1323))
- Handle headless setup flows end-to-end ([#1274](https://github.com/NousResearch/hermes-agent/pull/1274))
- Prefer curses over `simple_term_menu` in setup.py ([#1487](https://github.com/NousResearch/hermes-agent/pull/1487))
- Show effective model and provider in `/status` ([#1284](https://github.com/NousResearch/hermes-agent/pull/1284))
- Config set examples use placeholder syntax ([#1322](https://github.com/NousResearch/hermes-agent/pull/1322))
- Reload .env over stale shell overrides ([#1434](https://github.com/NousResearch/hermes-agent/pull/1434))
- Fix `is_coding_plan` `NameError` crash — by @0xbyt4 ([#1123](https://github.com/NousResearch/hermes-agent/pull/1123))
- Add missing packages to setuptools config — by @alt-glitch ([#912](https://github.com/NousResearch/hermes-agent/pull/912))
- Installer: clarify why sudo is needed at every prompt ([#1602](https://github.com/NousResearch/hermes-agent/pull/1602))
---
## 🔧 Tool System
### Terminal & Execution
- **Persistent shell mode** for local and SSH backends — maintain shell state across tool calls — by @alt-glitch ([#1067](https://github.com/NousResearch/hermes-agent/pull/1067), [#1483](https://github.com/NousResearch/hermes-agent/pull/1483))
- **Tirith pre-exec command scanning** — security layer that analyzes commands before execution ([#1256](https://github.com/NousResearch/hermes-agent/pull/1256))
- Strip Hermes provider env vars from all subprocess environments ([#1157](https://github.com/NousResearch/hermes-agent/pull/1157), [#1172](https://github.com/NousResearch/hermes-agent/pull/1172), [#1399](https://github.com/NousResearch/hermes-agent/pull/1399), [#1419](https://github.com/NousResearch/hermes-agent/pull/1419)) — initial fix by @eren-karakus0
- SSH preflight check ([#1486](https://github.com/NousResearch/hermes-agent/pull/1486))
- Docker backend: make cwd workspace mount explicit opt-in ([#1534](https://github.com/NousResearch/hermes-agent/pull/1534))
- Add project root to `PYTHONPATH` in `execute_code` sandbox ([#1383](https://github.com/NousResearch/hermes-agent/pull/1383))
- Eliminate `execute_code` progress spam on gateway platforms ([#1098](https://github.com/NousResearch/hermes-agent/pull/1098))
- Clearer docker backend preflight errors ([#1276](https://github.com/NousResearch/hermes-agent/pull/1276))
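Stripping provider variables from subprocess environments amounts to building a filtered copy of the environment before spawning a child process. The sketch below assumes a simple blocklist; `BLOCKED_ENV_VARS`, `sanitized_env`, and `run_tool` are hypothetical names, and only `OPENAI_BASE_URL` is a variable named in these notes (the second entry is an assumption for illustration).

```python
import os
import subprocess
from typing import Dict, Iterable, Mapping, Optional, Sequence

# Hypothetical blocklist. OPENAI_BASE_URL appears in the bug list of these
# notes; OPENAI_API_KEY is an assumed example of a similar variable.
BLOCKED_ENV_VARS = ("OPENAI_BASE_URL", "OPENAI_API_KEY")


def sanitized_env(
    base: Optional[Mapping[str, str]] = None,
    blocked: Iterable[str] = BLOCKED_ENV_VARS,
) -> Dict[str, str]:
    """Return a copy of the environment without the blocked variables."""
    source = os.environ if base is None else base
    blocked_set = set(blocked)
    return {key: value for key, value in source.items() if key not in blocked_set}


def run_tool(argv: Sequence[str]) -> subprocess.CompletedProcess:
    """Run an external tool with the filtered environment."""
    return subprocess.run(
        list(argv),
        env=sanitized_env(),
        capture_output=True,
        text=True,
        check=False,
    )
```

The parent process keeps its own settings; only the copy handed to the child is filtered, so external tools fall back to their own defaults.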
### Browser
- **`/browser connect`** — attach browser tools to a live Chrome instance via CDP ([#1549](https://github.com/NousResearch/hermes-agent/pull/1549))
- Improve browser cleanup, local browser PATH setup, and screenshot recovery ([#1333](https://github.com/NousResearch/hermes-agent/pull/1333))
### MCP
- **Selective tool loading** with utility policies — filter which MCP tools are available ([#1302](https://github.com/NousResearch/hermes-agent/pull/1302))
- Auto-reload MCP tools when `mcp_servers` config changes without restart ([#1474](https://github.com/NousResearch/hermes-agent/pull/1474))
- Resolve npx stdio connection failures ([#1291](https://github.com/NousResearch/hermes-agent/pull/1291))
- Preserve MCP toolsets when saving platform tool config ([#1421](https://github.com/NousResearch/hermes-agent/pull/1421))
### Vision
- Unify vision backend gating ([#1367](https://github.com/NousResearch/hermes-agent/pull/1367))
- Surface actual error reason instead of generic message ([#1338](https://github.com/NousResearch/hermes-agent/pull/1338))
- Make Claude image handling work end-to-end ([#1408](https://github.com/NousResearch/hermes-agent/pull/1408))
### Cron
- **Compress cron management into one tool** — single `cronjob` tool replaces multiple commands ([#1343](https://github.com/NousResearch/hermes-agent/pull/1343))
- Suppress duplicate cron sends to auto-delivery targets ([#1357](https://github.com/NousResearch/hermes-agent/pull/1357))
- Persist cron sessions to SQLite ([#1255](https://github.com/NousResearch/hermes-agent/pull/1255))
- Per-job runtime overrides (provider, model, base_url) ([#1398](https://github.com/NousResearch/hermes-agent/pull/1398))
- Atomic write in `save_job_output` to prevent data loss on crash ([#1173](https://github.com/NousResearch/hermes-agent/pull/1173))
- Preserve thread context for `deliver=origin` ([#1437](https://github.com/NousResearch/hermes-agent/pull/1437))
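An atomic write of the kind mentioned for `save_job_output` is commonly done by writing to a temporary file in the same directory and then renaming it over the target. The sketch below shows that general pattern; `atomic_write_json` is a hypothetical helper and is not the project's implementation.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def atomic_write_json(path: Path, data: Any) -> None:
    """Write JSON so readers see either the old file or the complete new one."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so the final rename stays
    # on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # atomic swap into place
    except BaseException:
        # Clean up the partial temp file on any failure, including interrupts.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

If the process crashes before `os.replace`, the previous output file is left untouched and only a stray temporary file may remain.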
### Patch Tool
- Avoid corrupting pipe chars in V4A patch apply ([#1286](https://github.com/NousResearch/hermes-agent/pull/1286))
- Permissive `block_anchor` thresholds and unicode normalization ([#1539](https://github.com/NousResearch/hermes-agent/pull/1539))
### Delegation
- Add observability metadata to subagent results (model, tokens, duration, tool trace) ([#1175](https://github.com/NousResearch/hermes-agent/pull/1175))
---
## 🧩 Skills Ecosystem
### Skills System
- **Integrate skills.sh** as a hub source alongside ClawHub ([#1303](https://github.com/NousResearch/hermes-agent/pull/1303))
- Secure skill env setup on load ([#1153](https://github.com/NousResearch/hermes-agent/pull/1153))
- Honor policy table for dangerous verdicts ([#1330](https://github.com/NousResearch/hermes-agent/pull/1330))
- Harden ClawHub skill search exact matches ([#1400](https://github.com/NousResearch/hermes-agent/pull/1400))
- Fix ClawHub skill install — use `/download` ZIP endpoint ([#1060](https://github.com/NousResearch/hermes-agent/pull/1060))
- Avoid mislabeling local skills as builtin — by @arceus77-7 ([#862](https://github.com/NousResearch/hermes-agent/pull/862))
### New Skills
- **Linear** project management ([#1230](https://github.com/NousResearch/hermes-agent/pull/1230))
- **X/Twitter** via x-cli ([#1285](https://github.com/NousResearch/hermes-agent/pull/1285))
- **Telephony** — Twilio, SMS, and AI calls ([#1289](https://github.com/NousResearch/hermes-agent/pull/1289))
- **1Password** — by @arceus77-7 ([#883](https://github.com/NousResearch/hermes-agent/pull/883), [#1179](https://github.com/NousResearch/hermes-agent/pull/1179))
- **NeuroSkill BCI** integration ([#1135](https://github.com/NousResearch/hermes-agent/pull/1135))
- **Blender MCP** for 3D modeling ([#1531](https://github.com/NousResearch/hermes-agent/pull/1531))
- **OSS Security Forensics** ([#1482](https://github.com/NousResearch/hermes-agent/pull/1482))
- **Parallel CLI** research skill ([#1301](https://github.com/NousResearch/hermes-agent/pull/1301))
- **OpenCode** CLI skill ([#1174](https://github.com/NousResearch/hermes-agent/pull/1174))
- **ASCII Video** skill refactored — by @SHL0MS ([#1213](https://github.com/NousResearch/hermes-agent/pull/1213), [#1598](https://github.com/NousResearch/hermes-agent/pull/1598))
---
## 🎙️ Voice Mode
- Voice mode foundation — push-to-talk CLI, Telegram/Discord voice notes ([#1299](https://github.com/NousResearch/hermes-agent/pull/1299))
- Free local Whisper transcription via faster-whisper ([#1185](https://github.com/NousResearch/hermes-agent/pull/1185))
- Discord voice channel reliability fixes ([#1429](https://github.com/NousResearch/hermes-agent/pull/1429))
- Restore local STT fallback for gateway voice notes ([#1490](https://github.com/NousResearch/hermes-agent/pull/1490))
- Honor `stt.enabled: false` across gateway transcription ([#1394](https://github.com/NousResearch/hermes-agent/pull/1394))
- Fix bogus incapability message on Telegram voice notes (Issue [#1033](https://github.com/NousResearch/hermes-agent/issues/1033))
---
## 🔌 ACP (IDE Integration)
- Restore ACP server implementation ([#1254](https://github.com/NousResearch/hermes-agent/pull/1254))
- Support slash commands in ACP adapter ([#1532](https://github.com/NousResearch/hermes-agent/pull/1532))
---
## 🧪 RL Training
- **Agentic On-Policy Distillation (OPD)** environment — new RL training environment for agent policy distillation ([#1149](https://github.com/NousResearch/hermes-agent/pull/1149))
- Make tinker-atropos RL training fully optional ([#1062](https://github.com/NousResearch/hermes-agent/pull/1062))
---
## 🔒 Security & Reliability
### Security Hardening
- **Tirith pre-exec command scanning** — static analysis of terminal commands before execution ([#1256](https://github.com/NousResearch/hermes-agent/pull/1256))
- **PII redaction** when `privacy.redact_pii` is enabled ([#1542](https://github.com/NousResearch/hermes-agent/pull/1542))
- Strip Hermes provider/gateway/tool env vars from all subprocess environments ([#1157](https://github.com/NousResearch/hermes-agent/pull/1157), [#1172](https://github.com/NousResearch/hermes-agent/pull/1172), [#1399](https://github.com/NousResearch/hermes-agent/pull/1399), [#1419](https://github.com/NousResearch/hermes-agent/pull/1419))
- Docker cwd workspace mount now explicit opt-in — never auto-mount host directories ([#1534](https://github.com/NousResearch/hermes-agent/pull/1534))
- Escape parens and braces in fork bomb regex pattern ([#1397](https://github.com/NousResearch/hermes-agent/pull/1397))
- Harden `.worktreeinclude` path containment ([#1388](https://github.com/NousResearch/hermes-agent/pull/1388))
- Use description as `pattern_key` to prevent approval collisions ([#1395](https://github.com/NousResearch/hermes-agent/pull/1395))
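The fork bomb fix concerns regex metacharacters: parentheses, braces, and the pipe in the classic `:(){ :|:& };:` string all have special meaning in a pattern unless escaped. The sketch below assumes a literal-match detector built with `re.escape`; the names are hypothetical and the project's real pattern may be more permissive about whitespace.

```python
import re

# The classic shell fork bomb, kept as a plain string.
FORK_BOMB = ":(){ :|:& };:"

# re.escape turns every metacharacter into a literal, so the compiled
# pattern matches this exact text only.
FORK_BOMB_RE = re.compile(re.escape(FORK_BOMB))


def looks_like_fork_bomb(command: str) -> bool:
    """Return True when the command contains the literal fork bomb."""
    return FORK_BOMB_RE.search(command) is not None
```

Without escaping, the `|` would split the pattern into two alternatives and the parentheses would form a group, so the pattern would no longer describe the intended string.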
### Reliability
- Guard init-time stdio writes ([#1271](https://github.com/NousResearch/hermes-agent/pull/1271))
- Session log writes reuse shared atomic JSON helper ([#1280](https://github.com/NousResearch/hermes-agent/pull/1280))
- Atomic temp cleanup protected on interrupts ([#1401](https://github.com/NousResearch/hermes-agent/pull/1401))
---
## 🐛 Notable Bug Fixes
- **`/status` always showing 0 tokens** — now reports live state (Issue [#1465](https://github.com/NousResearch/hermes-agent/issues/1465), [#1476](https://github.com/NousResearch/hermes-agent/pull/1476))
- **Custom model endpoints not working** — restored config-saved endpoint resolution (Issue [#1460](https://github.com/NousResearch/hermes-agent/issues/1460), [#1373](https://github.com/NousResearch/hermes-agent/pull/1373))
- **MCP tools not visible until restart** — auto-reload on config change (Issue [#1036](https://github.com/NousResearch/hermes-agent/issues/1036), [#1474](https://github.com/NousResearch/hermes-agent/pull/1474))
- **`hermes tools` removing MCP tools** — preserve MCP toolsets when saving (Issue [#1247](https://github.com/NousResearch/hermes-agent/issues/1247), [#1421](https://github.com/NousResearch/hermes-agent/pull/1421))
- **Terminal subprocesses inheriting `OPENAI_BASE_URL`** breaking external tools (Issue [#1002](https://github.com/NousResearch/hermes-agent/issues/1002), [#1399](https://github.com/NousResearch/hermes-agent/pull/1399))
- **Background process lost on gateway restart** — improved recovery (Issue [#1144](https://github.com/NousResearch/hermes-agent/issues/1144))
- **Cron jobs not persisting state** — now stored in SQLite (Issue [#1416](https://github.com/NousResearch/hermes-agent/issues/1416), [#1255](https://github.com/NousResearch/hermes-agent/pull/1255))
- **Cronjob `deliver: origin` not preserving thread context** (Issue [#1219](https://github.com/NousResearch/hermes-agent/issues/1219), [#1437](https://github.com/NousResearch/hermes-agent/pull/1437))
- **Gateway systemd service failing to auto-restart** when browser processes orphaned (Issue [#1617](https://github.com/NousResearch/hermes-agent/issues/1617))
- **`/background` completion report cut off in Telegram** (Issue [#1443](https://github.com/NousResearch/hermes-agent/issues/1443))
- **Model switching not taking effect** (Issue [#1244](https://github.com/NousResearch/hermes-agent/issues/1244), [#1183](https://github.com/NousResearch/hermes-agent/pull/1183))
- **`hermes doctor` reporting cronjob as unavailable** (Issue [#878](https://github.com/NousResearch/hermes-agent/issues/878), [#1180](https://github.com/NousResearch/hermes-agent/pull/1180))
- **WhatsApp bridge messages not received** from mobile (Issue [#1142](https://github.com/NousResearch/hermes-agent/issues/1142))
- **Setup wizard hanging on headless SSH** (Issue [#905](https://github.com/NousResearch/hermes-agent/issues/905), [#1274](https://github.com/NousResearch/hermes-agent/pull/1274))
- **Log handler accumulation** degrading gateway performance (Issue [#990](https://github.com/NousResearch/hermes-agent/issues/990), [#1251](https://github.com/NousResearch/hermes-agent/pull/1251))
- **Gateway NULL model in DB** (Issue [#987](https://github.com/NousResearch/hermes-agent/issues/987), [#1306](https://github.com/NousResearch/hermes-agent/pull/1306))
- **Strict endpoints rejecting replayed tool_calls** (Issue [#893](https://github.com/NousResearch/hermes-agent/issues/893))
- **Remaining hardcoded `~/.hermes` paths** — all now respect `HERMES_HOME` (Issue [#892](https://github.com/NousResearch/hermes-agent/issues/892), [#1233](https://github.com/NousResearch/hermes-agent/pull/1233))
- **Delegate tool not working with custom inference providers** (Issue [#1011](https://github.com/NousResearch/hermes-agent/issues/1011), [#1328](https://github.com/NousResearch/hermes-agent/pull/1328))
- **Skills Guard blocking official skills** (Issue [#1006](https://github.com/NousResearch/hermes-agent/issues/1006), [#1330](https://github.com/NousResearch/hermes-agent/pull/1330))
- **Setup writing provider before model selection** (Issue [#1182](https://github.com/NousResearch/hermes-agent/issues/1182))
- **`GatewayConfig.get()` AttributeError** crashing all message handling (Issue [#1158](https://github.com/NousResearch/hermes-agent/issues/1158), [#1287](https://github.com/NousResearch/hermes-agent/pull/1287))
- **`/update` hard-failing with "command not found"** (Issue [#1049](https://github.com/NousResearch/hermes-agent/issues/1049))
- **Image analysis failing silently** (Issue [#1034](https://github.com/NousResearch/hermes-agent/issues/1034), [#1338](https://github.com/NousResearch/hermes-agent/pull/1338))
- **API `BadRequestError`: `'dict' object has no attribute 'strip'`** (Issue [#1071](https://github.com/NousResearch/hermes-agent/issues/1071))
- **Slash commands requiring exact full name** — now uses prefix matching (Issue [#928](https://github.com/NousResearch/hermes-agent/issues/928), [#1320](https://github.com/NousResearch/hermes-agent/pull/1320))
- **Gateway stops responding when terminal is closed on headless servers** (Issue [#1005](https://github.com/NousResearch/hermes-agent/issues/1005))
---
## 🧪 Testing
- Cover empty cached Anthropic tool-call turns ([#1222](https://github.com/NousResearch/hermes-agent/pull/1222))
- Fix stale CI assumptions in parser and quick-command coverage ([#1236](https://github.com/NousResearch/hermes-agent/pull/1236))
- Fix gateway async tests without implicit event loop ([#1278](https://github.com/NousResearch/hermes-agent/pull/1278))
- Make gateway async tests xdist-safe ([#1281](https://github.com/NousResearch/hermes-agent/pull/1281))
- Cross-timezone naive timestamp regression for cron ([#1319](https://github.com/NousResearch/hermes-agent/pull/1319))
- Isolate codex provider tests from local env ([#1335](https://github.com/NousResearch/hermes-agent/pull/1335))
- Lock retry replacement semantics ([#1379](https://github.com/NousResearch/hermes-agent/pull/1379))
- Improve error logging in session search tool — by @aydnOktay ([#1533](https://github.com/NousResearch/hermes-agent/pull/1533))
---
## 📚 Documentation
- Comprehensive SOUL.md guide ([#1315](https://github.com/NousResearch/hermes-agent/pull/1315))
- Voice mode documentation ([#1316](https://github.com/NousResearch/hermes-agent/pull/1316), [#1362](https://github.com/NousResearch/hermes-agent/pull/1362))
- Provider contribution guide ([#1361](https://github.com/NousResearch/hermes-agent/pull/1361))
- ACP and internal systems implementation guides ([#1259](https://github.com/NousResearch/hermes-agent/pull/1259))
- Expand Docusaurus coverage across CLI, tools, skills, and skins ([#1232](https://github.com/NousResearch/hermes-agent/pull/1232))
- Terminal backend and Windows troubleshooting ([#1297](https://github.com/NousResearch/hermes-agent/pull/1297))
- Skills hub reference section ([#1317](https://github.com/NousResearch/hermes-agent/pull/1317))
- Checkpoint, `/rollback`, and git worktrees guide ([#1493](https://github.com/NousResearch/hermes-agent/pull/1493), [#1524](https://github.com/NousResearch/hermes-agent/pull/1524))
- CLI status bar and `/usage` reference ([#1523](https://github.com/NousResearch/hermes-agent/pull/1523))
- Fallback providers + `/background` command docs ([#1430](https://github.com/NousResearch/hermes-agent/pull/1430))
- Gateway service scopes docs ([#1378](https://github.com/NousResearch/hermes-agent/pull/1378))
- Slack thread reply behavior docs ([#1407](https://github.com/NousResearch/hermes-agent/pull/1407))
- Redesigned landing page with Nous blue palette — by @austinpickett ([#974](https://github.com/NousResearch/hermes-agent/pull/974))
- Fix several documentation typos — by @JackTheGit ([#953](https://github.com/NousResearch/hermes-agent/pull/953))
- Stabilize website diagrams ([#1405](https://github.com/NousResearch/hermes-agent/pull/1405))
- CLI vs messaging quick reference in README ([#1491](https://github.com/NousResearch/hermes-agent/pull/1491))
- Add search to Docusaurus ([#1053](https://github.com/NousResearch/hermes-agent/pull/1053))
- Home Assistant integration docs ([#1170](https://github.com/NousResearch/hermes-agent/pull/1170))
---
## 👥 Contributors
### Core
- **@teknium1** — 220+ PRs spanning every area of the codebase
### Top Community Contributors
- **@0xbyt4** (4 PRs) — Anthropic adapter fixes (max_tokens, fallback crash, 429/529 retry), Slack file upload thread context, setup NameError fix
- **@erosika** (1 PR) — Honcho memory integration: async writes, memory modes, session title integration
- **@SHL0MS** (2 PRs) — ASCII video skill design patterns and refactoring
- **@alt-glitch** (2 PRs) — Persistent shell mode for local/SSH backends, setuptools packaging fix
- **@arceus77-7** (2 PRs) — 1Password skill, fix skills list mislabeling
- **@kshitijk4poor** (1 PR) — OpenClaw migration during setup wizard
- **@ASRagab** (1 PR) — Fix adaptive thinking for Claude 4.6 models
- **@eren-karakus0** (1 PR) — Strip Hermes provider env vars from subprocess environment
- **@mr-emmett-one** (1 PR) — Fix DeepSeek V3 parser multi-tool call support
- **@jplew** (1 PR) — Gateway restart on retryable startup failures
- **@brandtcormorant** (1 PR) — Fix Anthropic cache control for empty text blocks
- **@aydnOktay** (1 PR) — Improve error logging in session search tool
- **@austinpickett** (1 PR) — Landing page redesign with Nous blue palette
- **@JackTheGit** (1 PR) — Documentation typo fixes
### All Contributors
@0xbyt4, @alt-glitch, @arceus77-7, @ASRagab, @austinpickett, @aydnOktay, @brandtcormorant, @eren-karakus0, @erosika, @JackTheGit, @jplew, @kshitijk4poor, @mr-emmett-one, @SHL0MS, @teknium1
---
**Full Changelog**: [v2026.3.12...v2026.3.17](https://github.com/NousResearch/hermes-agent/compare/v2026.3.12...v2026.3.17)
+151 -5
@@ -42,7 +42,7 @@ from acp_adapter.events import (
     make_tool_progress_cb,
 )
 from acp_adapter.permissions import make_approval_callback
-from acp_adapter.session import SessionManager
+from acp_adapter.session import SessionManager, SessionState
 
 logger = logging.getLogger(__name__)
@@ -226,10 +226,19 @@ class HermesACPAgent(acp.Agent):
             logger.error("prompt: session %s not found", session_id)
             return PromptResponse(stop_reason="refusal")
 
-        user_text = _extract_text(prompt)
-        if not user_text.strip():
+        user_text = _extract_text(prompt).strip()
+        if not user_text:
             return PromptResponse(stop_reason="end_turn")
 
+        # Intercept slash commands — handle locally without calling the LLM
+        if user_text.startswith("/"):
+            response_text = self._handle_slash_command(user_text, state)
+            if response_text is not None:
+                if self._conn:
+                    update = acp.update_agent_message_text(response_text)
+                    await self._conn.session_update(session_id, update)
+                return PromptResponse(stop_reason="end_turn")
+
         logger.info("Prompt on session %s: %s", session_id, user_text[:100])
 
         conn = self._conn
@@ -315,12 +324,149 @@ class HermesACPAgent(acp.Agent):
         stop_reason = "cancelled" if state.cancel_event and state.cancel_event.is_set() else "end_turn"
         return PromptResponse(stop_reason=stop_reason, usage=usage)
 
-    # ---- Model switching ----------------------------------------------------
+    # ---- Slash commands (headless) -------------------------------------------
+
+    _SLASH_COMMANDS = {
+        "help": "Show available commands",
+        "model": "Show or change current model",
+        "tools": "List available tools",
+        "context": "Show conversation context info",
+        "reset": "Clear conversation history",
+        "compact": "Compress conversation context",
+        "version": "Show Hermes version",
+    }
+
+    def _handle_slash_command(self, text: str, state: SessionState) -> str | None:
+        """Dispatch a slash command and return the response text.
+
+        Returns ``None`` for unrecognized commands so they fall through
+        to the LLM (the user may have typed ``/something`` as prose).
+        """
+        parts = text.split(maxsplit=1)
+        cmd = parts[0].lstrip("/").lower()
+        args = parts[1].strip() if len(parts) > 1 else ""
+
+        handler = {
+            "help": self._cmd_help,
+            "model": self._cmd_model,
+            "tools": self._cmd_tools,
+            "context": self._cmd_context,
+            "reset": self._cmd_reset,
+            "compact": self._cmd_compact,
+            "version": self._cmd_version,
+        }.get(cmd)
+
+        if handler is None:
+            return None  # not a known command — let the LLM handle it
+
+        try:
+            return handler(args, state)
+        except Exception as e:
+            logger.error("Slash command /%s error: %s", cmd, e, exc_info=True)
+            return f"Error executing /{cmd}: {e}"
+
+    def _cmd_help(self, args: str, state: SessionState) -> str:
+        lines = ["Available commands:", ""]
+        for cmd, desc in self._SLASH_COMMANDS.items():
+            lines.append(f" /{cmd:10s} {desc}")
+        lines.append("")
+        lines.append("Unrecognized /commands are sent to the model as normal messages.")
+        return "\n".join(lines)
+
+    def _cmd_model(self, args: str, state: SessionState) -> str:
+        if not args:
+            model = state.model or getattr(state.agent, "model", "unknown")
+            provider = getattr(state.agent, "provider", None) or "auto"
+            return f"Current model: {model}\nProvider: {provider}"
+
+        new_model = args.strip()
+        target_provider = None
+
+        # Auto-detect provider for the requested model
+        try:
+            from hermes_cli.models import parse_model_input, detect_provider_for_model
+            current_provider = getattr(state.agent, "provider", None) or "openrouter"
+            target_provider, new_model = parse_model_input(new_model, current_provider)
+            if target_provider == current_provider:
+                detected = detect_provider_for_model(new_model, current_provider)
+                if detected:
+                    target_provider, new_model = detected
+        except Exception:
+            logger.debug("Provider detection failed, using model as-is", exc_info=True)
+
+        state.model = new_model
+        state.agent = self.session_manager._make_agent(
+            session_id=state.session_id,
+            cwd=state.cwd,
+            model=new_model,
+        )
+        provider_label = target_provider or getattr(state.agent, "provider", "auto")
+        logger.info("Session %s: model switched to %s", state.session_id, new_model)
+        return f"Model switched to: {new_model}\nProvider: {provider_label}"
+
+    def _cmd_tools(self, args: str, state: SessionState) -> str:
+        try:
+            from model_tools import get_tool_definitions
+            toolsets = getattr(state.agent, "enabled_toolsets", None) or ["hermes-acp"]
+            tools = get_tool_definitions(enabled_toolsets=toolsets, quiet_mode=True)
+            if not tools:
+                return "No tools available."
+            lines = [f"Available tools ({len(tools)}):"]
+            for t in tools:
+                name = t.get("function", {}).get("name", "?")
desc = t.get("function", {}).get("description", "")
# Truncate long descriptions
if len(desc) > 80:
desc = desc[:77] + "..."
lines.append(f" {name}: {desc}")
return "\n".join(lines)
except Exception as e:
return f"Could not list tools: {e}"
def _cmd_context(self, args: str, state: SessionState) -> str:
n_messages = len(state.history)
if n_messages == 0:
return "Conversation is empty (no messages yet)."
# Count by role
roles: dict[str, int] = {}
for msg in state.history:
role = msg.get("role", "unknown")
roles[role] = roles.get(role, 0) + 1
lines = [
f"Conversation: {n_messages} messages",
f" user: {roles.get('user', 0)}, assistant: {roles.get('assistant', 0)}, "
f"tool: {roles.get('tool', 0)}, system: {roles.get('system', 0)}",
]
model = state.model or getattr(state.agent, "model", "")
if model:
lines.append(f"Model: {model}")
return "\n".join(lines)
def _cmd_reset(self, args: str, state: SessionState) -> str:
state.history.clear()
return "Conversation history cleared."
def _cmd_compact(self, args: str, state: SessionState) -> str:
if not state.history:
return "Nothing to compress — conversation is empty."
try:
agent = state.agent
if hasattr(agent, "compress_context"):
agent.compress_context(state.history)
return f"Context compressed. Messages: {len(state.history)}"
return "Context compression not available for this agent."
except Exception as e:
return f"Compression failed: {e}"
def _cmd_version(self, args: str, state: SessionState) -> str:
return f"Hermes Agent v{HERMES_VERSION}"
# ---- Model switching (ACP protocol method) -------------------------------
async def set_session_model(
self, model_id: str, session_id: str, **kwargs: Any
):
"""Switch the model for a session."""
"""Switch the model for a session (called by ACP protocol)."""
state = self.session_manager.get_session(session_id)
if state:
state.model = model_id
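Note on the slash-command hunk above: the interception amounts to a lookup in a dispatch table, with `None` meaning "not a command, forward to the model". The block below is a minimal sketch of that pattern, not the code from the diff. `dispatch_slash_command`, `HANDLERS`, and the two demo handlers are hypothetical names, and session state is reduced to a plain dict. It assumes the caller has already checked that the text starts with `/`.

```python
from typing import Callable, Dict, Optional


# Hypothetical demo handlers; the handlers in the diff take a SessionState.
def cmd_version(args: str, state: dict) -> str:
    return "demo v0.0"


def cmd_reset(args: str, state: dict) -> str:
    state["history"].clear()
    return "Conversation history cleared."


HANDLERS: Dict[str, Callable[[str, dict], str]] = {
    "version": cmd_version,
    "reset": cmd_reset,
}


def dispatch_slash_command(text: str, state: dict) -> Optional[str]:
    """Return the command's reply, or None when the command is unknown."""
    parts = text.split(maxsplit=1)
    cmd = parts[0].lstrip("/").lower()
    args = parts[1].strip() if len(parts) > 1 else ""
    handler = HANDLERS.get(cmd)
    if handler is None:
        return None  # unknown command: the caller sends the text to the model
    try:
        return handler(args, state)
    except Exception as exc:  # surface handler failures as reply text
        return f"Error executing /{cmd}: {exc}"
```

Returning `None` for unknown names keeps prose such as `/usr/bin/python is missing` flowing to the model instead of producing an "unknown command" error.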
+297 -10
@@ -45,14 +45,19 @@ _COMMON_BETAS = [
"fine-grained-tool-streaming-2025-05-14",
]
# Additional beta headers required for OAuth/subscription auth
# Both clawdbot and OpenCode include claude-code-20250219 alongside oauth-2025-04-20.
# Without claude-code-20250219, Anthropic's API rejects OAuth tokens with 401.
# Additional beta headers required for OAuth/subscription auth.
# Matches what Claude Code (and pi-ai / OpenCode) send.
_OAUTH_ONLY_BETAS = [
"claude-code-20250219",
"oauth-2025-04-20",
]
# Claude Code identity — required for OAuth requests to be routed correctly.
# Without these, Anthropic's infrastructure intermittently 500s OAuth traffic.
_CLAUDE_CODE_VERSION = "2.1.2"
_CLAUDE_CODE_SYSTEM_PREFIX = "You are Claude Code, Anthropic's official CLI for Claude."
_MCP_TOOL_PREFIX = "mcp_"
def _is_oauth_token(key: str) -> bool:
"""Check if the key is an OAuth/setup token (not a regular Console API key).
@@ -88,10 +93,16 @@ def build_anthropic_client(api_key: str, base_url: str = None):
kwargs["base_url"] = base_url
if _is_oauth_token(api_key):
# OAuth access token / setup-token → Bearer auth + beta headers
# OAuth access token / setup-token → Bearer auth + Claude Code identity.
# Anthropic routes OAuth requests based on user-agent and headers;
# without Claude Code's fingerprint, requests get intermittent 500s.
all_betas = _COMMON_BETAS + _OAUTH_ONLY_BETAS
kwargs["auth_token"] = api_key
kwargs["default_headers"] = {"anthropic-beta": ",".join(all_betas)}
kwargs["default_headers"] = {
"anthropic-beta": ",".join(all_betas),
"user-agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
"x-app": "cli",
}
else:
# Regular API key → x-api-key header + common betas
kwargs["api_key"] = api_key
@@ -189,7 +200,10 @@ def _refresh_oauth_token(creds: Dict[str, Any]) -> Optional[str]:
req = urllib.request.Request(
"https://console.anthropic.com/v1/oauth/token",
data=data,
headers={"Content-Type": "application/x-www-form-urlencoded"},
headers={
"Content-Type": "application/x-www-form-urlencoded",
"User-Agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
},
method="POST",
)
@@ -332,12 +346,24 @@ def resolve_anthropic_token() -> Optional[str]:
return preferred
return cc_token
# 3. Claude Code credential file
# 3. Hermes-managed OAuth credentials (~/.hermes/.anthropic_oauth.json)
hermes_creds = read_hermes_oauth_credentials()
if hermes_creds:
if is_claude_code_token_valid(hermes_creds):
logger.debug("Using Hermes-managed OAuth credentials")
return hermes_creds["accessToken"]
# Expired — try refresh
logger.debug("Hermes OAuth token expired — attempting refresh")
refreshed = refresh_hermes_oauth_token()
if refreshed:
return refreshed
# 4. Claude Code credential file
resolved_claude_token = _resolve_claude_code_token_from_credentials(creds)
if resolved_claude_token:
return resolved_claude_token
# 4. Regular API key, or a legacy OAuth token saved in ANTHROPIC_API_KEY.
# 5. Regular API key, or a legacy OAuth token saved in ANTHROPIC_API_KEY.
# This remains as a compatibility fallback for pre-migration Hermes configs.
api_key = os.getenv("ANTHROPIC_API_KEY", "").strip()
if api_key:
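Note on the resolution order in this hunk: the numbered steps form a "first source that yields a token wins" chain. A minimal sketch of that idea follows. `first_available` and the source names are hypothetical and stand in for the real credential readers, which this sketch does not call.

```python
from typing import Callable, Iterable, Optional, Tuple


def first_available(
    sources: Iterable[Tuple[str, Callable[[], Optional[str]]]]
) -> Optional[Tuple[str, str]]:
    """Try each (name, resolver) pair in order; return the first non-empty token."""
    for name, resolver in sources:
        token = resolver()
        if token:
            return name, token
    return None


# Hypothetical sources, ordered by priority.
chain = [
    ("hermes_oauth", lambda: None),          # e.g. credentials file missing
    ("claude_code_file", lambda: ""),        # e.g. file present but empty
    ("env_api_key", lambda: "sk-demo-key"),  # compatibility fallback
]
result = first_available(chain)
```

Keeping the order in one list makes it easier to see where a new source, such as the Hermes-managed file added here, slots in.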
@@ -386,6 +412,215 @@ def run_oauth_setup_token() -> Optional[str]:
return None
# ── Hermes-native PKCE OAuth flow ────────────────────────────────────────
# Mirrors the flow used by Claude Code, pi-ai, and OpenCode.
# Stores credentials in ~/.hermes/.anthropic_oauth.json (our own file).
_OAUTH_CLIENT_ID = "9d1c250a-e61b-44d9-88ed-5944d1962f5e"
_OAUTH_TOKEN_URL = "https://console.anthropic.com/v1/oauth/token"
_OAUTH_REDIRECT_URI = "https://console.anthropic.com/oauth/code/callback"
_OAUTH_SCOPES = "org:create_api_key user:profile user:inference"
_HERMES_OAUTH_FILE = Path(os.getenv("HERMES_HOME", str(Path.home() / ".hermes"))) / ".anthropic_oauth.json"
def _generate_pkce() -> tuple:
"""Generate PKCE code_verifier and code_challenge (S256)."""
import base64
import hashlib
import secrets
verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
challenge = base64.urlsafe_b64encode(
hashlib.sha256(verifier.encode()).digest()
).rstrip(b"=").decode()
return verifier, challenge
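Note on `_generate_pkce`: with the S256 method the challenge is the unpadded base64url encoding of the SHA-256 digest of the verifier, and the token endpoint recomputes it from the `code_verifier` sent during the exchange. The sketch below shows both sides under that assumption. `b64url_nopad`, `make_pkce_pair`, and `challenge_matches` are illustrative names, not functions from the diff.

```python
import base64
import hashlib
import secrets
from typing import Tuple


def b64url_nopad(raw: bytes) -> str:
    """Base64url-encode and drop the trailing '=' padding."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> Tuple[str, str]:
    """Client side: random verifier plus its S256 challenge."""
    verifier = b64url_nopad(secrets.token_bytes(32))
    challenge = b64url_nopad(hashlib.sha256(verifier.encode("ascii")).digest())
    return verifier, challenge


def challenge_matches(verifier: str, challenge: str) -> bool:
    """Server side: recompute the challenge and compare in constant time."""
    expected = b64url_nopad(hashlib.sha256(verifier.encode("ascii")).digest())
    return secrets.compare_digest(expected, challenge)
```

Thirty-two random bytes encode to a 43-character verifier, which is the minimum length the PKCE specification allows.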
def run_hermes_oauth_login() -> Optional[str]:
"""Run Hermes-native OAuth PKCE flow for Claude Pro/Max subscription.
Opens a browser to claude.ai for authorization, prompts for the code,
exchanges it for tokens, and stores them in ~/.hermes/.anthropic_oauth.json.
Returns the access token on success, None on failure.
"""
import time
import webbrowser
verifier, challenge = _generate_pkce()
# Build authorization URL
params = {
"code": "true",
"client_id": _OAUTH_CLIENT_ID,
"response_type": "code",
"redirect_uri": _OAUTH_REDIRECT_URI,
"scope": _OAUTH_SCOPES,
"code_challenge": challenge,
"code_challenge_method": "S256",
"state": verifier,
}
from urllib.parse import urlencode
auth_url = f"https://claude.ai/oauth/authorize?{urlencode(params)}"
print()
print("Authorize Hermes with your Claude Pro/Max subscription.")
print()
print("╭─ Claude Pro/Max Authorization ────────────────────╮")
print("│ │")
print("│ Open this link in your browser: │")
print("╰───────────────────────────────────────────────────╯")
print()
print(f" {auth_url}")
print()
# Try to open browser automatically (works on desktop, silently fails on headless/SSH)
try:
webbrowser.open(auth_url)
print(" (Browser opened automatically)")
except Exception:
pass
print()
print("After authorizing, you'll see a code. Paste it below.")
print()
try:
auth_code = input("Authorization code: ").strip()
except (KeyboardInterrupt, EOFError):
return None
if not auth_code:
print("No code entered.")
return None
# Split code#state format
splits = auth_code.split("#")
code = splits[0]
state = splits[1] if len(splits) > 1 else ""
# Exchange code for tokens
try:
import urllib.request
exchange_data = json.dumps({
"grant_type": "authorization_code",
"client_id": _OAUTH_CLIENT_ID,
"code": code,
"state": state,
"redirect_uri": _OAUTH_REDIRECT_URI,
"code_verifier": verifier,
}).encode()
req = urllib.request.Request(
_OAUTH_TOKEN_URL,
data=exchange_data,
headers={
"Content-Type": "application/json",
"User-Agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
},
method="POST",
)
with urllib.request.urlopen(req, timeout=15) as resp:
result = json.loads(resp.read().decode())
except Exception as e:
print(f"Token exchange failed: {e}")
return None
access_token = result.get("access_token", "")
refresh_token = result.get("refresh_token", "")
expires_in = result.get("expires_in", 3600)
if not access_token:
print("No access token in response.")
return None
# Store credentials
expires_at_ms = int(time.time() * 1000) + (expires_in * 1000)
_save_hermes_oauth_credentials(access_token, refresh_token, expires_at_ms)
# Also write to Claude Code's credential file for backward compat
_write_claude_code_credentials(access_token, refresh_token, expires_at_ms)
print("Authentication successful!")
return access_token
def _save_hermes_oauth_credentials(access_token: str, refresh_token: str, expires_at_ms: int) -> None:
"""Save OAuth credentials to ~/.hermes/.anthropic_oauth.json."""
data = {
"accessToken": access_token,
"refreshToken": refresh_token,
"expiresAt": expires_at_ms,
}
try:
_HERMES_OAUTH_FILE.parent.mkdir(parents=True, exist_ok=True)
_HERMES_OAUTH_FILE.write_text(json.dumps(data, indent=2), encoding="utf-8")
_HERMES_OAUTH_FILE.chmod(0o600)
except (OSError, IOError) as e:
logger.debug("Failed to save Hermes OAuth credentials: %s", e)
def read_hermes_oauth_credentials() -> Optional[Dict[str, Any]]:
"""Read Hermes-managed OAuth credentials from ~/.hermes/.anthropic_oauth.json."""
if _HERMES_OAUTH_FILE.exists():
try:
data = json.loads(_HERMES_OAUTH_FILE.read_text(encoding="utf-8"))
if data.get("accessToken"):
return data
except (json.JSONDecodeError, OSError, IOError) as e:
logger.debug("Failed to read Hermes OAuth credentials: %s", e)
return None
def refresh_hermes_oauth_token() -> Optional[str]:
"""Refresh the Hermes-managed OAuth token using the stored refresh token.
Returns the new access token, or None if refresh fails.
"""
import time
import urllib.request
creds = read_hermes_oauth_credentials()
if not creds or not creds.get("refreshToken"):
return None
try:
data = json.dumps({
"grant_type": "refresh_token",
"refresh_token": creds["refreshToken"],
"client_id": _OAUTH_CLIENT_ID,
}).encode()
req = urllib.request.Request(
_OAUTH_TOKEN_URL,
data=data,
headers={
"Content-Type": "application/json",
"User-Agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
},
method="POST",
)
with urllib.request.urlopen(req, timeout=10) as resp:
result = json.loads(resp.read().decode())
new_access = result.get("access_token", "")
new_refresh = result.get("refresh_token", creds["refreshToken"])
expires_in = result.get("expires_in", 3600)
if new_access:
new_expires_ms = int(time.time() * 1000) + (expires_in * 1000)
_save_hermes_oauth_credentials(new_access, new_refresh, new_expires_ms)
# Also update Claude Code's credential file
_write_claude_code_credentials(new_access, new_refresh, new_expires_ms)
logger.debug("Successfully refreshed Hermes OAuth token")
return new_access
except Exception as e:
logger.debug("Failed to refresh Hermes OAuth token: %s", e)
return None
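Note on credential storage: `_save_hermes_oauth_credentials` writes the file and then tightens its mode, so the file briefly exists with default permissions. One possible alternative, sketched below, creates a temporary file that is owner-only from the start and renames it into place. `save_credentials` and `is_expired` are hypothetical helpers, and the 60-second skew is an assumption, not something taken from `is_claude_code_token_valid`.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_credentials(path: Path, access_token: str, refresh_token: str,
                     expires_in_s: int, now_ms: int) -> Dict[str, Any]:
    """Write credentials atomically; the file is never world-readable."""
    data = {
        "accessToken": access_token,
        "refreshToken": refresh_token,
        "expiresAt": now_ms + expires_in_s * 1000,  # absolute expiry in ms
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    # mkstemp creates the file readable and writable by the owner only.
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, indent=2)
        os.replace(tmp_name, path)  # atomic rename within one filesystem
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
    return data


def is_expired(creds: Dict[str, Any], now_ms: int, skew_ms: int = 60_000) -> bool:
    """Treat a token as expired slightly early to allow for clock skew."""
    return now_ms >= int(creds.get("expiresAt", 0)) - skew_ms
```

The rename also means a crash mid-write leaves the previous credentials file intact rather than a truncated one.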
# ---------------------------------------------------------------------------
# Message / tool / response format conversion
# ---------------------------------------------------------------------------
@@ -714,14 +949,59 @@ def build_anthropic_kwargs(
max_tokens: Optional[int],
reasoning_config: Optional[Dict[str, Any]],
tool_choice: Optional[str] = None,
is_oauth: bool = False,
) -> Dict[str, Any]:
"""Build kwargs for anthropic.messages.create()."""
"""Build kwargs for anthropic.messages.create().
When *is_oauth* is True, applies Claude Code compatibility transforms:
system prompt prefix, tool name prefixing, and prompt sanitization.
"""
system, anthropic_messages = convert_messages_to_anthropic(messages)
anthropic_tools = convert_tools_to_anthropic(tools) if tools else []
model = normalize_model_name(model)
effective_max_tokens = max_tokens or 16384
# ── OAuth: Claude Code identity ──────────────────────────────────
if is_oauth:
# 1. Prepend Claude Code system prompt identity
cc_block = {"type": "text", "text": _CLAUDE_CODE_SYSTEM_PREFIX}
if isinstance(system, list):
system = [cc_block] + system
elif isinstance(system, str) and system:
system = [cc_block, {"type": "text", "text": system}]
else:
system = [cc_block]
# 2. Sanitize system prompt — replace product name references
# to avoid Anthropic's server-side content filters.
for block in system:
if isinstance(block, dict) and block.get("type") == "text":
text = block.get("text", "")
text = text.replace("Hermes Agent", "Claude Code")
text = text.replace("Hermes agent", "Claude Code")
text = text.replace("hermes-agent", "claude-code")
text = text.replace("Nous Research", "Anthropic")
block["text"] = text
# 3. Prefix tool names with mcp_ (Claude Code convention)
if anthropic_tools:
for tool in anthropic_tools:
if "name" in tool:
tool["name"] = _MCP_TOOL_PREFIX + tool["name"]
# 4. Prefix tool names in message history (tool_use and tool_result blocks)
for msg in anthropic_messages:
content = msg.get("content")
if isinstance(content, list):
for block in content:
if isinstance(block, dict):
if block.get("type") == "tool_use" and "name" in block:
if not block["name"].startswith(_MCP_TOOL_PREFIX):
block["name"] = _MCP_TOOL_PREFIX + block["name"]
elif block.get("type") == "tool_result" and "tool_use_id" in block:
pass # tool_result uses ID, not name
kwargs: Dict[str, Any] = {
"model": model,
"messages": anthropic_messages,
@@ -768,11 +1048,15 @@ def build_anthropic_kwargs(
def normalize_anthropic_response(
response,
strip_tool_prefix: bool = False,
) -> Tuple[SimpleNamespace, str]:
"""Normalize Anthropic response to match the shape expected by AIAgent.
Returns (assistant_message, finish_reason) where assistant_message has
.content, .tool_calls, and .reasoning attributes.
When *strip_tool_prefix* is True, removes the ``mcp_`` prefix that was
added to tool names for OAuth Claude Code compatibility.
"""
text_parts = []
reasoning_parts = []
@@ -784,12 +1068,15 @@ def normalize_anthropic_response(
elif block.type == "thinking":
reasoning_parts.append(block.thinking)
elif block.type == "tool_use":
name = block.name
if strip_tool_prefix and name.startswith(_MCP_TOOL_PREFIX):
name = name[len(_MCP_TOOL_PREFIX):]
tool_calls.append(
SimpleNamespace(
id=block.id,
type="function",
function=SimpleNamespace(
name=block.name,
name=name,
arguments=json.dumps(block.input),
),
)
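Note on the `mcp_` prefix handling: the request path adds the prefix to tool names and `normalize_anthropic_response` strips it again, so the two operations need to be inverses for every tool name. A minimal sketch of that round trip, using hypothetical helper names:

```python
MCP_TOOL_PREFIX = "mcp_"


def add_tool_prefix(name: str) -> str:
    """Outbound: always prepend the prefix, even if one is already present."""
    return MCP_TOOL_PREFIX + name


def strip_tool_prefix(name: str) -> str:
    """Inbound: remove exactly one leading prefix, if any."""
    if name.startswith(MCP_TOOL_PREFIX):
        return name[len(MCP_TOOL_PREFIX):]
    return name


original_names = ["read_file", "mcp_server_search"]
round_tripped = [strip_tool_prefix(add_tool_prefix(n)) for n in original_names]
```

Adding the prefix unconditionally is what makes a name that already starts with `mcp_` survive the round trip. A conditional add, as used for `tool_use` blocks in message history in this diff, would send such a name unchanged and then lose its prefix on the way back, so that case may be worth a test.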
+1
@@ -57,6 +57,7 @@ _API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
"minimax": "MiniMax-M2.5-highspeed",
"minimax-cn": "MiniMax-M2.5-highspeed",
"anthropic": "claude-haiku-4-5-20251001",
"ai-gateway": "google/gemini-3-flash",
}
# OpenRouter app attribution headers
+26
@@ -59,6 +59,32 @@ def get_skin_tool_prefix() -> str:
return ""
def get_tool_emoji(tool_name: str, default: str = "") -> str:
"""Get the display emoji for a tool.
Resolution order:
1. Active skin's ``tool_emojis`` overrides (if a skin is loaded)
2. Tool registry's per-tool ``emoji`` field
3. *default* fallback
"""
# 1. Skin override
skin = _get_skin()
if skin and skin.tool_emojis:
override = skin.tool_emojis.get(tool_name)
if override:
return override
# 2. Registry default
try:
from tools.registry import registry
emoji = registry.get_emoji(tool_name, default="")
if emoji:
return emoji
except Exception:
pass
# 3. Caller-supplied default
return default
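Note on `get_tool_emoji`: the lookup is a three-level fallback (skin override, registry value, caller default). The sketch below reproduces that order with plain dicts so it can be run in isolation. `resolve_emoji` and both mappings are hypothetical stand-ins for the skin object and the tool registry.

```python
from typing import Mapping, Optional


def resolve_emoji(
    tool_name: str,
    skin_overrides: Optional[Mapping[str, str]],
    registry_emojis: Mapping[str, str],
    default: str = "",
) -> str:
    """Return the first non-empty value: skin override, registry, default."""
    override = (skin_overrides or {}).get(tool_name)
    if override:
        return override
    emoji = registry_emojis.get(tool_name, "")
    if emoji:
        return emoji
    return default


registry = {"terminal": "[T]", "web_search": "[W]"}
skin = {"terminal": "[>_]"}
```

With these inputs, `resolve_emoji("terminal", skin, registry)` picks the skin value, `web_search` falls back to the registry, and an unknown tool gets the default.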
# =========================================================================
# Tool preview (one-line summary of a tool call's primary argument)
# =========================================================================
+7 -106
@@ -20,65 +20,16 @@ import json
import time
from collections import Counter, defaultdict
from datetime import datetime
from typing import Any, Dict, List, Optional
from typing import Any, Dict, List
# =========================================================================
# Model pricing (USD per million tokens) — approximate as of early 2026
# =========================================================================
MODEL_PRICING = {
# OpenAI
"gpt-4o": {"input": 2.50, "output": 10.00},
"gpt-4o-mini": {"input": 0.15, "output": 0.60},
"gpt-4.1": {"input": 2.00, "output": 8.00},
"gpt-4.1-mini": {"input": 0.40, "output": 1.60},
"gpt-4.1-nano": {"input": 0.10, "output": 0.40},
"gpt-4.5-preview": {"input": 75.00, "output": 150.00},
"gpt-5": {"input": 10.00, "output": 30.00},
"gpt-5.4": {"input": 10.00, "output": 30.00},
"o3": {"input": 10.00, "output": 40.00},
"o3-mini": {"input": 1.10, "output": 4.40},
"o4-mini": {"input": 1.10, "output": 4.40},
# Anthropic
"claude-opus-4-20250514": {"input": 15.00, "output": 75.00},
"claude-sonnet-4-20250514": {"input": 3.00, "output": 15.00},
"claude-3-5-sonnet-20241022": {"input": 3.00, "output": 15.00},
"claude-3-5-haiku-20241022": {"input": 0.80, "output": 4.00},
"claude-3-opus-20240229": {"input": 15.00, "output": 75.00},
"claude-3-haiku-20240307": {"input": 0.25, "output": 1.25},
# DeepSeek
"deepseek-chat": {"input": 0.14, "output": 0.28},
"deepseek-reasoner": {"input": 0.55, "output": 2.19},
# Google
"gemini-2.5-pro": {"input": 1.25, "output": 10.00},
"gemini-2.5-flash": {"input": 0.15, "output": 0.60},
"gemini-2.0-flash": {"input": 0.10, "output": 0.40},
# Meta (via providers)
"llama-4-maverick": {"input": 0.50, "output": 0.70},
"llama-4-scout": {"input": 0.20, "output": 0.30},
# Z.AI / GLM (direct provider — pricing not published externally, treat as local)
"glm-5": {"input": 0.0, "output": 0.0},
"glm-4.7": {"input": 0.0, "output": 0.0},
"glm-4.5": {"input": 0.0, "output": 0.0},
"glm-4.5-flash": {"input": 0.0, "output": 0.0},
# Kimi / Moonshot (direct provider — pricing not published externally, treat as local)
"kimi-k2.5": {"input": 0.0, "output": 0.0},
"kimi-k2-thinking": {"input": 0.0, "output": 0.0},
"kimi-k2-turbo-preview": {"input": 0.0, "output": 0.0},
"kimi-k2-0905-preview": {"input": 0.0, "output": 0.0},
# MiniMax (direct provider — pricing not published externally, treat as local)
"MiniMax-M2.5": {"input": 0.0, "output": 0.0},
"MiniMax-M2.5-highspeed": {"input": 0.0, "output": 0.0},
"MiniMax-M2.1": {"input": 0.0, "output": 0.0},
}
from agent.usage_pricing import DEFAULT_PRICING, estimate_cost_usd, format_duration_compact, get_pricing, has_known_pricing
# Fallback: unknown/custom models get zero cost (we can't assume pricing
# for self-hosted models, custom OAI endpoints, local inference, etc.)
_DEFAULT_PRICING = {"input": 0.0, "output": 0.0}
_DEFAULT_PRICING = DEFAULT_PRICING
def _has_known_pricing(model_name: str) -> bool:
"""Check if a model has known pricing (vs unknown/custom endpoint)."""
return _get_pricing(model_name) is not _DEFAULT_PRICING
return has_known_pricing(model_name)
def _get_pricing(model_name: str) -> Dict[str, float]:
@@ -87,67 +38,17 @@ def _get_pricing(model_name: str) -> Dict[str, float]:
Returns _DEFAULT_PRICING (zero cost) for unknown/custom models —
we can't assume costs for self-hosted endpoints, local inference, etc.
"""
if not model_name:
return _DEFAULT_PRICING
# Strip provider prefix (e.g., "anthropic/claude-..." -> "claude-...")
bare = model_name.split("/")[-1].lower()
# Exact match first
if bare in MODEL_PRICING:
return MODEL_PRICING[bare]
# Fuzzy prefix match — prefer the LONGEST matching key to avoid
# e.g. "gpt-4o" matching before "gpt-4o-mini" for "gpt-4o-mini-2024-07-18"
best_match = None
best_len = 0
for key, price in MODEL_PRICING.items():
if bare.startswith(key) and len(key) > best_len:
best_match = price
best_len = len(key)
if best_match:
return best_match
# Keyword heuristics (checked in most-specific-first order)
if "opus" in bare:
return {"input": 15.00, "output": 75.00}
if "sonnet" in bare:
return {"input": 3.00, "output": 15.00}
if "haiku" in bare:
return {"input": 0.80, "output": 4.00}
if "gpt-4o-mini" in bare:
return {"input": 0.15, "output": 0.60}
if "gpt-4o" in bare:
return {"input": 2.50, "output": 10.00}
if "gpt-5" in bare:
return {"input": 10.00, "output": 30.00}
if "deepseek" in bare:
return {"input": 0.14, "output": 0.28}
if "gemini" in bare:
return {"input": 0.15, "output": 0.60}
return _DEFAULT_PRICING
return get_pricing(model_name)
def _estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
"""Estimate the USD cost for a given model and token counts."""
pricing = _get_pricing(model)
return (input_tokens * pricing["input"] + output_tokens * pricing["output"]) / 1_000_000
return estimate_cost_usd(model, input_tokens, output_tokens)
def _format_duration(seconds: float) -> str:
"""Format seconds into a human-readable duration string."""
if seconds < 60:
return f"{seconds:.0f}s"
minutes = seconds / 60
if minutes < 60:
return f"{minutes:.0f}m"
hours = minutes / 60
if hours < 24:
remaining_min = int(minutes % 60)
return f"{int(hours)}h {remaining_min}m" if remaining_min else f"{int(hours)}h"
days = hours / 24
return f"{days:.1f}d"
return format_duration_compact(seconds)
def _bar_chart(values: List[int], max_width: int = 20) -> List[str]:
+9
@@ -40,6 +40,8 @@ DEFAULT_CONTEXT_LENGTHS = {
"anthropic/claude-opus-4.6": 200000,
"anthropic/claude-sonnet-4": 200000,
"anthropic/claude-sonnet-4-20250514": 200000,
"anthropic/claude-sonnet-4.5": 200000,
"anthropic/claude-sonnet-4.6": 200000,
"anthropic/claude-haiku-4.5": 200000,
# Bare Anthropic model IDs (for native API provider)
"claude-opus-4-6": 200000,
@@ -50,11 +52,18 @@ DEFAULT_CONTEXT_LENGTHS = {
"claude-opus-4-20250514": 200000,
"claude-sonnet-4-20250514": 200000,
"claude-haiku-4-5-20251001": 200000,
"openai/gpt-5": 128000,
"openai/gpt-4.1": 1047576,
"openai/gpt-4.1-mini": 1047576,
"openai/gpt-4o": 128000,
"openai/gpt-4-turbo": 128000,
"openai/gpt-4o-mini": 128000,
"google/gemini-3-pro-preview": 1048576,
"google/gemini-3-flash": 1048576,
"google/gemini-2.5-flash": 1048576,
"google/gemini-2.0-flash": 1048576,
"google/gemini-2.5-pro": 1048576,
"deepseek/deepseek-v3.2": 65536,
"meta-llama/llama-3.3-70b-instruct": 131072,
"deepseek/deepseek-chat-v3": 65536,
"qwen/qwen-2.5-72b-instruct": 32768,
+17 -5
@@ -73,9 +73,15 @@ DEFAULT_AGENT_IDENTITY = (
MEMORY_GUIDANCE = (
"You have persistent memory across sessions. Save durable facts using the memory "
"tool: user preferences, environment details, tool quirks, and stable conventions. "
"Memory is injected into every turn, so keep it compact. Do NOT save task progress, "
"session outcomes, or completed-work logs to memory; use session_search to recall "
"those from past transcripts."
"Memory is injected into every turn, so keep it compact and focused on facts that "
"will still matter later.\n"
"Prioritize what reduces future user steering — the most valuable memory is one "
"that prevents the user from having to correct or remind you again. "
"User preferences and recurring corrections matter more than procedural task details.\n"
"Do NOT save task progress, session outcomes, completed-work logs, or temporary TODO "
"state to memory; use session_search to recall those from past transcripts. "
"If you've discovered a new way to do something, solved a problem that could be "
"necessary later, save it as a skill with the skill tool."
)
SESSION_SEARCH_GUIDANCE = (
@@ -86,8 +92,11 @@ SESSION_SEARCH_GUIDANCE = (
SKILLS_GUIDANCE = (
"After completing a complex task (5+ tool calls), fixing a tricky error, "
"or discovering a non-trivial workflow, consider saving the approach as a "
"skill with skill_manage so you can reuse it next time."
"or discovering a non-trivial workflow, save the approach as a "
"skill with skill_manage so you can reuse it next time.\n"
"When using a skill and finding it outdated, incomplete, or wrong, "
"patch it immediately with skill_manage(action='patch') — don't wait to be asked. "
"Skills that aren't maintained become liabilities."
)
PLATFORM_HINTS = {
@@ -326,6 +335,9 @@ def build_skills_system_prompt(
"Before replying, scan the skills below. If one clearly matches your task, "
"load it with skill_view(name) and follow its instructions. "
"If a skill has issues, fix it with skill_manage(action='patch').\n"
"After difficult/iterative tasks, offer to save as a skill. "
"If a skill you loaded was missing steps, had wrong commands, or needed "
"pitfalls you discovered, update it before finishing.\n"
"\n"
"<available_skills>\n"
+ "\n".join(index_lines) + "\n"
+184
@@ -0,0 +1,184 @@
"""Helpers for optional cheap-vs-strong model routing."""
from __future__ import annotations
import os
import re
from typing import Any, Dict, Optional
_COMPLEX_KEYWORDS = {
"debug",
"debugging",
"implement",
"implementation",
"refactor",
"patch",
"traceback",
"stacktrace",
"exception",
"error",
"analyze",
"analysis",
"investigate",
"architecture",
"design",
"compare",
"benchmark",
"optimize",
"optimise",
"review",
"terminal",
"shell",
"tool",
"tools",
"pytest",
"test",
"tests",
"plan",
"planning",
"delegate",
"subagent",
"cron",
"docker",
"kubernetes",
}
_URL_RE = re.compile(r"https?://|www\.", re.IGNORECASE)
def _coerce_bool(value: Any, default: bool = False) -> bool:
if value is None:
return default
if isinstance(value, bool):
return value
if isinstance(value, str):
return value.strip().lower() in {"1", "true", "yes", "on"}
return bool(value)
def _coerce_int(value: Any, default: int) -> int:
try:
return int(value)
except (TypeError, ValueError):
return default
def choose_cheap_model_route(user_message: str, routing_config: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
"""Return the configured cheap-model route when a message looks simple.
Conservative by design: if the message has signs of code/tool/debugging/
long-form work, keep the primary model.
"""
cfg = routing_config or {}
if not _coerce_bool(cfg.get("enabled"), False):
return None
cheap_model = cfg.get("cheap_model") or {}
if not isinstance(cheap_model, dict):
return None
provider = str(cheap_model.get("provider") or "").strip().lower()
model = str(cheap_model.get("model") or "").strip()
if not provider or not model:
return None
text = (user_message or "").strip()
if not text:
return None
max_chars = _coerce_int(cfg.get("max_simple_chars"), 160)
max_words = _coerce_int(cfg.get("max_simple_words"), 28)
if len(text) > max_chars:
return None
if len(text.split()) > max_words:
return None
if text.count("\n") > 1:
return None
if "```" in text or "`" in text:
return None
if _URL_RE.search(text):
return None
lowered = text.lower()
words = {token.strip(".,:;!?()[]{}\"'`") for token in lowered.split()}
if words & _COMPLEX_KEYWORDS:
return None
route = dict(cheap_model)
route["provider"] = provider
route["model"] = model
route["routing_reason"] = "simple_turn"
return route
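Note on `choose_cheap_model_route`: the "looks simple" decision is a chain of cheap rejections (length, word count, newlines, backticks, URLs, keywords). The sketch below isolates that chain as a boolean predicate so its behavior can be checked on sample messages. `looks_simple` is a hypothetical name and the keyword set is a small subset of `_COMPLEX_KEYWORDS`. The thresholds mirror the defaults in the diff.

```python
import re

URL_RE = re.compile(r"https?://|www\.", re.IGNORECASE)
COMPLEX_KEYWORDS = {"debug", "traceback", "refactor", "docker", "test"}


def looks_simple(message: str, max_chars: int = 160, max_words: int = 28) -> bool:
    """Return True only when no signal of code, tool, or long-form work is found."""
    text = (message or "").strip()
    if not text:
        return False
    if len(text) > max_chars or len(text.split()) > max_words:
        return False
    if text.count("\n") > 1 or "`" in text:
        return False
    if URL_RE.search(text):
        return False
    # Strip surrounding punctuation so "debug?" still matches "debug".
    words = {tok.strip(".,:;!?()[]{}\"'`") for tok in text.lower().split()}
    return not (words & COMPLEX_KEYWORDS)
```

Because every check can only reject, adding a new signal can never route more traffic to the cheap model by accident, which matches the "conservative by design" note in the docstring.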
def resolve_turn_route(user_message: str, routing_config: Optional[Dict[str, Any]], primary: Dict[str, Any]) -> Dict[str, Any]:
"""Resolve the effective model/runtime for one turn.
Returns a dict with model/runtime/signature/label fields.
"""
route = choose_cheap_model_route(user_message, routing_config)
if not route:
return {
"model": primary.get("model"),
"runtime": {
"api_key": primary.get("api_key"),
"base_url": primary.get("base_url"),
"provider": primary.get("provider"),
"api_mode": primary.get("api_mode"),
},
"label": None,
"signature": (
primary.get("model"),
primary.get("provider"),
primary.get("base_url"),
primary.get("api_mode"),
),
}
from hermes_cli.runtime_provider import resolve_runtime_provider
explicit_api_key = None
api_key_env = str(route.get("api_key_env") or "").strip()
if api_key_env:
explicit_api_key = os.getenv(api_key_env) or None
try:
runtime = resolve_runtime_provider(
requested=route.get("provider"),
explicit_api_key=explicit_api_key,
explicit_base_url=route.get("base_url"),
)
except Exception:
return {
"model": primary.get("model"),
"runtime": {
"api_key": primary.get("api_key"),
"base_url": primary.get("base_url"),
"provider": primary.get("provider"),
"api_mode": primary.get("api_mode"),
},
"label": None,
"signature": (
primary.get("model"),
primary.get("provider"),
primary.get("base_url"),
primary.get("api_mode"),
),
}
return {
"model": route.get("model"),
"runtime": {
"api_key": runtime.get("api_key"),
"base_url": runtime.get("base_url"),
"provider": runtime.get("provider"),
"api_mode": runtime.get("api_mode"),
},
"label": f"smart route → {route.get('model')} ({runtime.get('provider')})",
"signature": (
route.get("model"),
runtime.get("provider"),
runtime.get("base_url"),
runtime.get("api_mode"),
),
}
+134
@@ -0,0 +1,134 @@
from __future__ import annotations
from decimal import Decimal
from typing import Dict
MODEL_PRICING = {
"gpt-4o": {"input": 2.50, "output": 10.00},
"gpt-4o-mini": {"input": 0.15, "output": 0.60},
"gpt-4.1": {"input": 2.00, "output": 8.00},
"gpt-4.1-mini": {"input": 0.40, "output": 1.60},
"gpt-4.1-nano": {"input": 0.10, "output": 0.40},
"gpt-4.5-preview": {"input": 75.00, "output": 150.00},
"gpt-5": {"input": 10.00, "output": 30.00},
"gpt-5.4": {"input": 10.00, "output": 30.00},
"o3": {"input": 10.00, "output": 40.00},
"o3-mini": {"input": 1.10, "output": 4.40},
"o4-mini": {"input": 1.10, "output": 4.40},
"claude-opus-4-20250514": {"input": 15.00, "output": 75.00},
"claude-sonnet-4-20250514": {"input": 3.00, "output": 15.00},
"claude-3-5-sonnet-20241022": {"input": 3.00, "output": 15.00},
"claude-3-5-haiku-20241022": {"input": 0.80, "output": 4.00},
"claude-3-opus-20240229": {"input": 15.00, "output": 75.00},
"claude-3-haiku-20240307": {"input": 0.25, "output": 1.25},
"deepseek-chat": {"input": 0.14, "output": 0.28},
"deepseek-reasoner": {"input": 0.55, "output": 2.19},
"gemini-2.5-pro": {"input": 1.25, "output": 10.00},
"gemini-2.5-flash": {"input": 0.15, "output": 0.60},
"gemini-2.0-flash": {"input": 0.10, "output": 0.40},
"llama-4-maverick": {"input": 0.50, "output": 0.70},
"llama-4-scout": {"input": 0.20, "output": 0.30},
"glm-5": {"input": 0.0, "output": 0.0},
"glm-4.7": {"input": 0.0, "output": 0.0},
"glm-4.5": {"input": 0.0, "output": 0.0},
"glm-4.5-flash": {"input": 0.0, "output": 0.0},
"kimi-k2.5": {"input": 0.0, "output": 0.0},
"kimi-k2-thinking": {"input": 0.0, "output": 0.0},
"kimi-k2-turbo-preview": {"input": 0.0, "output": 0.0},
"kimi-k2-0905-preview": {"input": 0.0, "output": 0.0},
"MiniMax-M2.5": {"input": 0.0, "output": 0.0},
"MiniMax-M2.5-highspeed": {"input": 0.0, "output": 0.0},
"MiniMax-M2.1": {"input": 0.0, "output": 0.0},
}
DEFAULT_PRICING = {"input": 0.0, "output": 0.0}
def get_pricing(model_name: str) -> Dict[str, float]:
    if not model_name:
        return DEFAULT_PRICING
    # Drop any provider prefix ("provider/model") and normalize case.
    bare = model_name.split("/")[-1].lower()
    if bare in MODEL_PRICING:
        return MODEL_PRICING[bare]
    # Longest-prefix match, compared case-insensitively so mixed-case keys
    # such as "MiniMax-M2.5" can match the lowercased name.
    best_match = None
    best_len = 0
    for key, price in MODEL_PRICING.items():
        if bare.startswith(key.lower()) and len(key) > best_len:
            best_match = price
            best_len = len(key)
    if best_match is not None:
        return best_match
    # Fall back to family-level estimates for unrecognized model names.
    if "opus" in bare:
        return {"input": 15.00, "output": 75.00}
    if "sonnet" in bare:
        return {"input": 3.00, "output": 15.00}
    if "haiku" in bare:
        return {"input": 0.80, "output": 4.00}
    if "gpt-4o-mini" in bare:
        return {"input": 0.15, "output": 0.60}
    if "gpt-4o" in bare:
        return {"input": 2.50, "output": 10.00}
    if "gpt-5" in bare:
        return {"input": 10.00, "output": 30.00}
    if "deepseek" in bare:
        return {"input": 0.14, "output": 0.28}
    if "gemini" in bare:
        return {"input": 0.15, "output": 0.60}
    return DEFAULT_PRICING
def has_known_pricing(model_name: str) -> bool:
    pricing = get_pricing(model_name)
    return pricing is not DEFAULT_PRICING and any(
        float(value) > 0 for value in pricing.values()
    )


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    pricing = get_pricing(model)
    total = (
        Decimal(input_tokens) * Decimal(str(pricing["input"]))
        + Decimal(output_tokens) * Decimal(str(pricing["output"]))
    ) / Decimal("1000000")
    return float(total)


def format_duration_compact(seconds: float) -> str:
    if seconds < 60:
        return f"{seconds:.0f}s"
    minutes = seconds / 60
    if minutes < 60:
        return f"{minutes:.0f}m"
    hours = minutes / 60
    if hours < 24:
        remaining_min = int(minutes % 60)
        return f"{int(hours)}h {remaining_min}m" if remaining_min else f"{int(hours)}h"
    days = hours / 24
    return f"{days:.1f}d"
def format_token_count_compact(value: int) -> str:
    abs_value = abs(int(value))
    if abs_value < 1_000:
        return str(int(value))
    sign = "-" if value < 0 else ""
    units = ((1_000_000_000, "B"), (1_000_000, "M"), (1_000, "K"))
    for threshold, suffix in units:
        if abs_value >= threshold:
            scaled = abs_value / threshold
            if scaled < 10:
                text = f"{scaled:.2f}"
            elif scaled < 100:
                text = f"{scaled:.1f}"
            else:
                text = f"{scaled:.0f}"
            # Strip trailing zeros only after a decimal point; without this
            # guard "100" would be cut down to "1" (100K shown as 1K).
            if "." in text:
                text = text.rstrip("0").rstrip(".")
            return f"{sign}{text}{suffix}"
    return f"{value:,}"
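The lookup and cost arithmetic in this file can be tried in isolation. The following is a minimal sketch, assuming a two-entry price table; `PRICES`, `lookup`, and `estimate` are illustrative names and not part of the module. Prices are USD per one million tokens, as in `MODEL_PRICING`.

```python
from decimal import Decimal

# Illustrative subset of a price table: USD per one million tokens.
PRICES = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}


def lookup(model_name: str) -> dict:
    """Pick the entry whose key is the longest prefix of the bare model name."""
    bare = model_name.split("/")[-1].lower()
    best_key = ""
    for key in PRICES:
        if bare.startswith(key) and len(key) > len(best_key):
            best_key = key
    # Unknown models fall back to zero pricing, mirroring DEFAULT_PRICING.
    return PRICES.get(best_key, {"input": 0.0, "output": 0.0})


def estimate(model_name: str, input_tokens: int, output_tokens: int) -> float:
    """tokens * price-per-million / 1_000_000, summed over input and output."""
    price = lookup(model_name)
    total = (
        Decimal(input_tokens) * Decimal(str(price["input"]))
        + Decimal(output_tokens) * Decimal(str(price["output"]))
    ) / Decimal("1000000")
    return float(total)
```

A dated variant such as `openai/gpt-4o-mini-2024-07-18` resolves to the `gpt-4o-mini` entry rather than `gpt-4o`, because the longer key wins.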
+53 -1
@@ -51,6 +51,20 @@ model:
# # Data policy: "allow" (default) or "deny" to exclude providers that may store data
# # data_collection: "deny"
# =============================================================================
# Smart Model Routing (optional)
# =============================================================================
# Use a cheaper model for short/simple turns while keeping your main model for
# more complex requests. Disabled by default.
#
# smart_model_routing:
# enabled: true
# max_simple_chars: 160
# max_simple_words: 28
# cheap_model:
# provider: openrouter
# model: google/gemini-2.5-flash
# =============================================================================
# Git Worktree Isolation
# =============================================================================
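The `max_simple_chars` and `max_simple_words` settings in the smart routing block suggest a length-based test for "simple" turns. The sketch below is only a guess at how such thresholds might be applied; `is_simple_turn` is a hypothetical helper, and the real criteria live in `agent/smart_model_routing.py`, which this hunk does not show.

```python
def is_simple_turn(
    text: str,
    max_simple_chars: int = 160,
    max_simple_words: int = 28,
) -> bool:
    """Hypothetical check: a turn is 'simple' only if it is short on both measures."""
    stripped = text.strip()
    if not stripped:
        return False  # nothing to route
    return (
        len(stripped) <= max_simple_chars
        and len(stripped.split()) <= max_simple_words
    )
```

Under this assumption, a turn that fails either threshold would stay on the main model.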
@@ -76,8 +90,9 @@ model:
# - Messaging (Telegram/Discord): Uses MESSAGING_CWD from .env (default: home)
terminal:
backend: "local"
cwd: "." # For local backend: "." = current directory. Ignored for remote backends.
cwd: "." # For local backend: "." = current directory. Ignored for remote backends unless a backend documents otherwise.
timeout: 180
docker_mount_cwd_to_workspace: false # SECURITY: off by default. Opt in to mount the launch cwd into Docker /workspace.
lifetime_seconds: 300
# sudo_password: "" # Enable sudo commands (pipes via sudo -S) - SECURITY WARNING: plaintext!
@@ -107,6 +122,7 @@ terminal:
# timeout: 180
# lifetime_seconds: 300
# docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
# docker_mount_cwd_to_workspace: true # Explicit opt-in: mount your launch cwd into /workspace
# -----------------------------------------------------------------------------
# OPTION 4: Singularity/Apptainer container
@@ -333,6 +349,25 @@ session_reset:
idle_minutes: 1440 # Inactivity timeout in minutes (default: 1440 = 24 hours)
at_hour: 4 # Daily reset hour, 0-23 local time (default: 4 AM)
# When true, group/channel chats use one session per participant when the platform
# provides a user ID. This is the secure default and prevents users in the same
# room from sharing context, interrupts, and token costs. Set false only if you
# explicitly want one shared "room brain" per group/channel.
group_sessions_per_user: true
# ─────────────────────────────────────────────────────────────────────────────
# Gateway Streaming
# ─────────────────────────────────────────────────────────────────────────────
# Stream tokens to messaging platforms in real-time. The bot sends a message
# on first token, then progressively edits it as more tokens arrive.
# Disabled by default — enable to try the streaming UX on Telegram/Discord/Slack.
streaming:
enabled: false
# transport: edit # "edit" = progressive editMessageText
# edit_interval: 0.3 # seconds between message edits
# buffer_threshold: 40 # chars before forcing an edit flush
# cursor: " ▉" # cursor shown during streaming
# =============================================================================
# Skills Configuration
# =============================================================================
@@ -694,6 +729,12 @@ display:
# Toggle at runtime with /reasoning show or /reasoning hide.
show_reasoning: false
# Stream tokens to the terminal as they arrive instead of waiting for the
# full response. The response box opens on first token and text appears
# line-by-line. Tool calls are still captured silently.
# Disabled by default — enable to try the streaming UX.
streaming: false
# ───────────────────────────────────────────────────────────────────────────
# Skin / Theme
# ───────────────────────────────────────────────────────────────────────────
@@ -734,3 +775,14 @@ display:
# tool_prefix: "╎" # Tool output line prefix (default: ┊)
#
skin: default
# =============================================================================
# Privacy
# =============================================================================
# privacy:
# # Redact PII from the LLM context prompt.
# # When true, phone numbers are stripped and user/chat IDs are replaced
# # with deterministic hashes before being sent to the model.
# # Names and usernames are NOT affected (user-chosen, publicly visible).
# # Routing/delivery still uses the original values internally.
# redact_pii: false
+1038 -97
File diff suppressed because it is too large.
+41 -1
@@ -6,6 +6,7 @@ Output is saved to ~/.hermes/cron/output/{job_id}/{timestamp}.md
"""
import json
import logging
import tempfile
import os
import re
@@ -14,6 +15,8 @@ from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional, Dict, List, Any
logger = logging.getLogger(__name__)
from hermes_time import now as _hermes_now
try:
@@ -528,10 +531,18 @@ def mark_job_run(job_id: str, success: bool, error: Optional[str] = None):
def get_due_jobs() -> List[Dict[str, Any]]:
"""Get all jobs that are due to run now."""
"""Get all jobs that are due to run now.
For recurring jobs (cron/interval), if the scheduled time is stale
(more than 120 seconds in the past, e.g. because the gateway was down),
the job is fast-forwarded to the next future run instead of firing
immediately. This prevents a burst of missed jobs on gateway restart.
"""
now = _hermes_now()
jobs = [_apply_skill_fields(j) for j in load_jobs()]
raw_jobs = load_jobs() # For saving updates
due = []
needs_save = False
for job in jobs:
if not job.get("enabled", True):
@@ -543,8 +554,37 @@ def get_due_jobs() -> List[Dict[str, Any]]:
next_run_dt = _ensure_aware(datetime.fromisoformat(next_run))
if next_run_dt <= now:
schedule = job.get("schedule", {})
kind = schedule.get("kind")
# For recurring jobs, check if the scheduled time is stale
# (gateway was down and missed the window). Fast-forward to
# the next future occurrence instead of firing a stale run.
if kind in ("cron", "interval") and (now - next_run_dt).total_seconds() > 120:
# More than 2 minutes late — this is a missed run, not a current one.
# Recompute next_run_at to the next future occurrence.
new_next = compute_next_run(schedule, now.isoformat())
if new_next:
logger.info(
"Job '%s' missed its scheduled time (%s). "
"Fast-forwarding to next run: %s",
job.get("name", job["id"]),
next_run,
new_next,
)
# Update the job in storage
for rj in raw_jobs:
if rj["id"] == job["id"]:
rj["next_run_at"] = new_next
needs_save = True
break
continue # Skip this run
due.append(job)
if needs_save:
save_jobs(raw_jobs)
return due
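The fast-forward rule in `get_due_jobs()` can be reduced to a three-way decision. This is a minimal sketch for interval jobs only, assuming that `compute_next_run` for an interval schedule amounts to "now plus one period" (that function's body is not shown in this diff); `resolve_interval_run` and `GRACE_SECONDS` are illustrative names.

```python
from datetime import datetime, timedelta, timezone

GRACE_SECONDS = 120  # same threshold as the diff: later than this counts as missed


def resolve_interval_run(
    next_run: datetime, now: datetime, every: timedelta
) -> tuple[bool, datetime]:
    """Return (fire_now, next_run_at) for an interval job."""
    if next_run > now:
        return False, next_run      # not due yet
    if (now - next_run).total_seconds() > GRACE_SECONDS:
        return False, now + every   # missed window: skip and reschedule from now
    return True, next_run           # due and within the grace period
```

A job that is 30 seconds late still fires, while one that is five hours late is rescheduled without running, which is what prevents a burst of stale runs after a restart.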
+19 -5
@@ -315,6 +315,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
# Provider routing
pr = _cfg.get("provider_routing", {})
smart_routing = _cfg.get("smart_model_routing", {}) or {}
from hermes_cli.runtime_provider import (
resolve_runtime_provider,
@@ -331,12 +332,25 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
message = format_runtime_provider_error(exc)
raise RuntimeError(message) from exc
from agent.smart_model_routing import resolve_turn_route
turn_route = resolve_turn_route(
prompt,
smart_routing,
{
"model": model,
"api_key": runtime.get("api_key"),
"base_url": runtime.get("base_url"),
"provider": runtime.get("provider"),
"api_mode": runtime.get("api_mode"),
},
)
agent = AIAgent(
model=model,
api_key=runtime.get("api_key"),
base_url=runtime.get("base_url"),
provider=runtime.get("provider"),
api_mode=runtime.get("api_mode"),
model=turn_route["model"],
api_key=turn_route["runtime"].get("api_key"),
base_url=turn_route["runtime"].get("base_url"),
provider=turn_route["runtime"].get("provider"),
api_mode=turn_route["runtime"].get("api_mode"),
max_iterations=max_iterations,
reasoning_config=reasoning_config,
prefill_messages=prefill_messages,
+54 -2
@@ -97,10 +97,11 @@ class SessionResetPolicy:
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "SessionResetPolicy":
# Handle both missing keys and explicit null values (YAML null → None)
mode = data.get("mode")
at_hour = data.get("at_hour")
idle_minutes = data.get("idle_minutes")
return cls(
mode=data.get("mode", "both"),
mode=mode if mode is not None else "both",
at_hour=at_hour if at_hour is not None else 4,
idle_minutes=idle_minutes if idle_minutes is not None else 1440,
)
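The change to `mode` matters because `dict.get(key, default)` only falls back when the key is absent, not when it is present with a `None` value (which is what an explicit YAML `null` parses to). A small sketch of the difference; `get_or_default` is an illustrative helper, not a function from the codebase.

```python
# What a config with "mode: null" and "at_hour: 6" looks like after parsing.
data = {"mode": None, "at_hour": 6}


def get_or_default(d: dict, key: str, default):
    """Treat both a missing key and an explicit None as 'use the default'."""
    value = d.get(key)
    return default if value is None else value


# The key exists, so dict.get ignores the default and returns None.
plain = data.get("mode", "both")
# The helper falls back to the default for explicit nulls as well.
fixed = get_or_default(data, "mode", "both")
```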
@@ -145,6 +146,37 @@ class PlatformConfig:
)
@dataclass
class StreamingConfig:
"""Configuration for real-time token streaming to messaging platforms."""
enabled: bool = False
transport: str = "edit" # "edit" (progressive editMessageText) or "off"
edit_interval: float = 0.3 # Seconds between message edits
buffer_threshold: int = 40 # Chars before forcing an edit
cursor: str = "" # Cursor shown during streaming
def to_dict(self) -> Dict[str, Any]:
return {
"enabled": self.enabled,
"transport": self.transport,
"edit_interval": self.edit_interval,
"buffer_threshold": self.buffer_threshold,
"cursor": self.cursor,
}
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "StreamingConfig":
if not data:
return cls()
return cls(
enabled=data.get("enabled", False),
transport=data.get("transport", "edit"),
edit_interval=float(data.get("edit_interval", 0.3)),
buffer_threshold=int(data.get("buffer_threshold", 40)),
cursor=data.get("cursor", ""),
)
@dataclass
class GatewayConfig:
"""
@@ -174,7 +206,13 @@ class GatewayConfig:
# STT settings
stt_enabled: bool = True # Whether to auto-transcribe inbound voice messages
# Session isolation in shared chats
group_sessions_per_user: bool = True # Isolate group/channel sessions per participant when user IDs are available
# Streaming configuration
streaming: StreamingConfig = field(default_factory=StreamingConfig)
def get_connected_platforms(self) -> List[Platform]:
"""Return list of platforms that are enabled and configured."""
connected = []
@@ -239,6 +277,8 @@ class GatewayConfig:
"sessions_dir": str(self.sessions_dir),
"always_log_local": self.always_log_local,
"stt_enabled": self.stt_enabled,
"group_sessions_per_user": self.group_sessions_per_user,
"streaming": self.streaming.to_dict(),
}
@classmethod
@@ -279,6 +319,8 @@ class GatewayConfig:
if stt_enabled is None:
stt_enabled = data.get("stt", {}).get("enabled") if isinstance(data.get("stt"), dict) else None
group_sessions_per_user = data.get("group_sessions_per_user")
return cls(
platforms=platforms,
default_reset_policy=default_policy,
@@ -289,6 +331,8 @@ class GatewayConfig:
sessions_dir=sessions_dir,
always_log_local=data.get("always_log_local", True),
stt_enabled=_coerce_bool(stt_enabled, True),
group_sessions_per_user=_coerce_bool(group_sessions_per_user, True),
streaming=StreamingConfig.from_dict(data.get("streaming", {})),
)
@@ -344,6 +388,14 @@ def load_gateway_config() -> GatewayConfig:
if isinstance(stt_cfg, dict) and "enabled" in stt_cfg:
config.stt_enabled = _coerce_bool(stt_cfg.get("enabled"), True)
# Bridge group session isolation from config.yaml into gateway runtime.
# Secure default is per-user isolation in shared chats.
if "group_sessions_per_user" in yaml_cfg:
config.group_sessions_per_user = _coerce_bool(
yaml_cfg.get("group_sessions_per_user"),
True,
)
# Bridge discord settings from config.yaml to env vars
# (env vars take precedence — only set if not already defined)
discord_cfg = yaml_cfg.get("discord", {})
+137 -6
@@ -510,6 +510,7 @@ class BasePlatformAdapter(ABC):
image_url: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""
Send an image natively via the platform API.
@@ -528,6 +529,7 @@ class BasePlatformAdapter(ABC):
animation_url: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""
Send an animated GIF natively via the platform API.
@@ -536,7 +538,7 @@ class BasePlatformAdapter(ABC):
(e.g., Telegram send_animation) so they auto-play inline.
Default falls back to send_image.
"""
return await self.send_image(chat_id=chat_id, image_url=animation_url, caption=caption, reply_to=reply_to)
return await self.send_image(chat_id=chat_id, image_url=animation_url, caption=caption, reply_to=reply_to, metadata=metadata)
@staticmethod
def _is_animation_url(url: str) -> bool:
@@ -726,7 +728,75 @@ class BasePlatformAdapter(ABC):
cleaned = re.sub(r'\n{3,}', '\n\n', cleaned).strip()
return media, cleaned
@staticmethod
def extract_local_files(content: str) -> Tuple[List[str], str]:
"""
Detect bare local file paths in response text for native media delivery.
Matches absolute paths (/...) and tilde paths (~/) ending in common
image or video extensions. Validates each candidate with
``os.path.isfile()`` to avoid false positives from URLs or
non-existent paths.
Paths inside fenced code blocks (``` ... ```) and inline code
(`...`) are ignored so that code samples are never mutilated.
Returns:
Tuple of (list of expanded file paths, cleaned text with the
raw path strings removed).
"""
_LOCAL_MEDIA_EXTS = (
'.png', '.jpg', '.jpeg', '.gif', '.webp',
'.mp4', '.mov', '.avi', '.mkv', '.webm',
)
ext_part = '|'.join(e.lstrip('.') for e in _LOCAL_MEDIA_EXTS)
# (?<![/:\w.]) prevents matching inside URLs (e.g. https://…/img.png)
# and relative paths (./foo.png)
# (?:~/|/) anchors to absolute or home-relative paths
path_re = re.compile(
r'(?<![/:\w.])(?:~/|/)(?:[\w.\-]+/)*[\w.\-]+\.(?:' + ext_part + r')\b',
re.IGNORECASE,
)
# Build spans covered by fenced code blocks and inline code
code_spans: list = []
for m in re.finditer(r'```[^\n]*\n.*?```', content, re.DOTALL):
code_spans.append((m.start(), m.end()))
for m in re.finditer(r'`[^`\n]+`', content):
code_spans.append((m.start(), m.end()))
def _in_code(pos: int) -> bool:
return any(s <= pos < e for s, e in code_spans)
found: list = [] # (raw_match_text, expanded_path)
for match in path_re.finditer(content):
if _in_code(match.start()):
continue
raw = match.group(0)
expanded = os.path.expanduser(raw)
if os.path.isfile(expanded):
found.append((raw, expanded))
# Deduplicate by expanded path, preserving discovery order
seen: set = set()
unique: list = []
for raw, expanded in found:
if expanded not in seen:
seen.add(expanded)
unique.append((raw, expanded))
paths = [expanded for _, expanded in unique]
cleaned = content
if unique:
for raw, _exp in unique:
cleaned = cleaned.replace(raw, '')
cleaned = re.sub(r'\n{3,}', '\n\n', cleaned).strip()
return paths, cleaned
async def _keep_typing(self, chat_id: str, interval: float = 2.0, metadata=None) -> None:
"""
Continuously send typing indicator until cancelled.
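The path pattern used by `extract_local_files` can be checked on its own. The sketch below reuses the same regular expression but, to stay deterministic, leaves out the `os.path.isfile()` validation and the code-span filtering that the diff applies; `find_media_paths` is an illustrative name.

```python
import re

MEDIA_EXTS = ("png", "jpg", "jpeg", "gif", "webp", "mp4", "mov", "avi", "mkv", "webm")

# The lookbehind rejects a slash preceded by "/", ":", a word character, or ".",
# which excludes URLs (https://host/img.png) and relative paths (./img.png).
PATH_RE = re.compile(
    r"(?<![/:\w.])(?:~/|/)(?:[\w.\-]+/)*[\w.\-]+\.(?:" + "|".join(MEDIA_EXTS) + r")\b",
    re.IGNORECASE,
)


def find_media_paths(text: str) -> list:
    """Return absolute or home-relative media paths found in text (no file checks)."""
    return [m.group(0) for m in PATH_RE.finditer(text)]


sample = (
    "Saved /tmp/out/chart.png, also https://example.com/img.png "
    "and ./local.png plus ~/pics/cat.JPG"
)
paths = find_media_paths(sample)
```

Only the absolute path and the tilde path are candidates here; the URL and the relative path are rejected by the lookbehind before any filesystem check would run.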
@@ -752,7 +822,10 @@ class BasePlatformAdapter(ABC):
if not self._message_handler:
return
session_key = build_session_key(event.source)
session_key = build_session_key(
event.source,
group_sessions_per_user=self.config.extra.get("group_sessions_per_user", True),
)
# Check if there's already an active handler for this session
if session_key in self._active_sessions:
@@ -836,8 +909,17 @@ class BasePlatformAdapter(ABC):
# Extract image URLs and send them as native platform attachments
images, text_content = self.extract_images(response)
# Strip any remaining internal directives from message body (fixes #1561)
text_content = text_content.replace("[[audio_as_voice]]", "").strip()
text_content = re.sub(r"MEDIA:\s*\S+", "", text_content).strip()
if images:
logger.info("[%s] extract_images found %d image(s) in response (%d chars)", self.name, len(images), len(response))
# Auto-detect bare local file paths for native media delivery
# (helps small models that don't use MEDIA: syntax)
local_files, text_content = self.extract_local_files(text_content)
if local_files:
logger.info("[%s] extract_local_files found %d file(s) in response", self.name, len(local_files))
# Auto-TTS: if voice message, generate audio FIRST (before sending text)
# Skipped when the chat has voice mode disabled (/voice off)
@@ -931,7 +1013,7 @@ class BasePlatformAdapter(ABC):
# Send extracted media files — route by file type
_AUDIO_EXTS = {'.ogg', '.opus', '.mp3', '.wav', '.m4a'}
_VIDEO_EXTS = {'.mp4', '.mov', '.avi', '.mkv', '.3gp'}
_VIDEO_EXTS = {'.mp4', '.mov', '.avi', '.mkv', '.webm', '.3gp'}
_IMAGE_EXTS = {'.jpg', '.jpeg', '.png', '.webp', '.gif'}
for media_path, is_voice in media_files:
@@ -968,7 +1050,34 @@ class BasePlatformAdapter(ABC):
print(f"[{self.name}] Failed to send media ({ext}): {media_result.error}")
except Exception as media_err:
print(f"[{self.name}] Error sending media: {media_err}")
# Send auto-detected local files as native attachments
for file_path in local_files:
if human_delay > 0:
await asyncio.sleep(human_delay)
try:
ext = Path(file_path).suffix.lower()
if ext in _IMAGE_EXTS:
await self.send_image_file(
chat_id=event.source.chat_id,
image_path=file_path,
metadata=_thread_metadata,
)
elif ext in _VIDEO_EXTS:
await self.send_video(
chat_id=event.source.chat_id,
video_path=file_path,
metadata=_thread_metadata,
)
else:
await self.send_document(
chat_id=event.source.chat_id,
file_path=file_path,
metadata=_thread_metadata,
)
except Exception as file_err:
logger.error("[%s] Error sending local file %s: %s", self.name, file_path, file_err)
# Check if there's a pending message that was queued during our processing
if session_key in self._pending_messages:
pending_event = self._pending_messages.pop(session_key)
@@ -1074,7 +1183,8 @@ class BasePlatformAdapter(ABC):
"""
return content
def truncate_message(self, content: str, max_length: int = 4096) -> List[str]:
@staticmethod
def truncate_message(content: str, max_length: int = 4096) -> List[str]:
"""
Split a long message into chunks, preserving code block boundaries.
@@ -1126,6 +1236,27 @@ class BasePlatformAdapter(ABC):
if split_at < 1:
split_at = headroom
# Avoid splitting inside an inline code span (`...`).
# If the text before split_at has an odd number of unescaped
# backticks, the split falls inside inline code — the resulting
# chunk would have an unpaired backtick and any special characters
# (like parentheses) inside the broken span would be unescaped,
# causing MarkdownV2 parse errors on Telegram.
candidate = remaining[:split_at]
backtick_count = candidate.count("`") - candidate.count("\\`")
if backtick_count % 2 == 1:
# Find the last unescaped backtick and split before it
last_bt = candidate.rfind("`")
while last_bt > 0 and candidate[last_bt - 1] == "\\":
last_bt = candidate.rfind("`", 0, last_bt)
if last_bt > 0:
# Try to find a space or newline just before the backtick
safe_split = candidate.rfind(" ", 0, last_bt)
nl_split = candidate.rfind("\n", 0, last_bt)
safe_split = max(safe_split, nl_split)
if safe_split > headroom // 4:
split_at = safe_split
chunk_body = remaining[:split_at]
remaining = remaining[split_at:].lstrip()
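The inline-code guard above relies on backtick parity: an odd number of unescaped backticks before the split point means the cut would land inside a code span. A minimal sketch of that test alone, with `inside_inline_code` as an illustrative name:

```python
def inside_inline_code(text: str, split_at: int) -> bool:
    """True if cutting text at split_at would land inside an inline code span."""
    candidate = text[:split_at]
    # Every "\`" also contains one "`", so subtracting leaves the unescaped count.
    unescaped = candidate.count("`") - candidate.count("\\`")
    return unescaped % 2 == 1
```

For `"run `foo(bar)` now"`, a split at index 8 falls after the opening backtick and is flagged, while a split at index 14 falls after the closing one and is not.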
+56 -4
@@ -10,6 +10,7 @@ Uses discord.py library for:
"""
import asyncio
import json
import logging
import os
import struct
@@ -18,6 +19,7 @@ import tempfile
import threading
import time
from collections import defaultdict
from pathlib import Path
from typing import Callable, Dict, List, Optional, Any
logger = logging.getLogger(__name__)
@@ -434,8 +436,11 @@ class DiscordAdapter(BasePlatformAdapter):
self._voice_input_callback: Optional[Callable] = None # set by run.py
self._on_voice_disconnect: Optional[Callable] = None # set by run.py
# Track threads where the bot has participated so follow-up messages
# in those threads don't require @mention.
self._bot_participated_threads: set = set()
# in those threads don't require @mention. Persisted to disk so the
# set survives gateway restarts.
self._bot_participated_threads: set = self._load_participated_threads()
# Cap to prevent unbounded growth (Discord threads get archived).
self._MAX_TRACKED_THREADS = 500
async def connect(self) -> bool:
"""Connect to Discord and start receiving events."""
@@ -1573,6 +1578,10 @@ class DiscordAdapter(BasePlatformAdapter):
link = f"<#{thread_id}>" if thread_id else f"**{thread_name}**"
await interaction.followup.send(f"Created thread {link}", ephemeral=True)
# Track thread participation so follow-ups don't require @mention
if thread_id:
self._track_thread(thread_id)
# If a message was provided, kick off a new Hermes session in the thread
starter = (message or "").strip()
if starter and thread_id:
@@ -1798,6 +1807,49 @@ class DiscordAdapter(BasePlatformAdapter):
return f"{parent_name} / {thread_name}"
return thread_name
# ------------------------------------------------------------------
# Thread participation persistence
# ------------------------------------------------------------------
@staticmethod
def _thread_state_path() -> Path:
"""Path to the persisted thread participation set."""
from hermes_cli.config import get_hermes_home
return get_hermes_home() / "discord_threads.json"
@classmethod
def _load_participated_threads(cls) -> set:
"""Load persisted thread IDs from disk."""
path = cls._thread_state_path()
try:
if path.exists():
data = json.loads(path.read_text(encoding="utf-8"))
if isinstance(data, list):
return set(data)
except Exception as e:
logger.debug("Could not load discord thread state: %s", e)
return set()
def _save_participated_threads(self) -> None:
"""Persist the current thread set to disk (best-effort)."""
path = self._thread_state_path()
try:
# Trim to the cap if over it. Sets are unordered, so the entries kept
# are arbitrary, not necessarily the most recent.
thread_list = list(self._bot_participated_threads)
if len(thread_list) > self._MAX_TRACKED_THREADS:
thread_list = thread_list[-self._MAX_TRACKED_THREADS:]
self._bot_participated_threads = set(thread_list)
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(thread_list), encoding="utf-8")
except Exception as e:
logger.debug("Could not save discord thread state: %s", e)
def _track_thread(self, thread_id: str) -> None:
"""Add a thread to the participation set and persist."""
if thread_id not in self._bot_participated_threads:
self._bot_participated_threads.add(thread_id)
self._save_participated_threads()
async def _handle_message(self, message: DiscordMessage) -> None:
"""Handle incoming Discord messages."""
# In server channels (not DMs), require the bot to be @mentioned
@@ -1850,7 +1902,7 @@ class DiscordAdapter(BasePlatformAdapter):
is_thread = True
thread_id = str(thread.id)
auto_threaded_channel = thread
self._bot_participated_threads.add(thread_id)
self._track_thread(thread_id)
# Determine message type
msg_type = MessageType.TEXT
@@ -1954,7 +2006,7 @@ class DiscordAdapter(BasePlatformAdapter):
# Track thread participation so the bot won't require @mention for
# follow-up messages in threads it has already engaged in.
if thread_id:
self._bot_participated_threads.add(thread_id)
self._track_thread(thread_id)
await self.handle_message(event)
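Because a Python `set` has no ordering, trimming it cannot reliably keep the newest threads. One possible alternative, which is not what this commit does, is to track IDs in an insertion-ordered `dict` and write the file atomically. A sketch under those assumptions; `load_threads`, `track_thread`, and `MAX_TRACKED` are illustrative names.

```python
import json
import os
import tempfile
from pathlib import Path

MAX_TRACKED = 500  # same cap as the diff


def load_threads(path: Path) -> dict:
    """Load thread IDs as an insertion-ordered dict (oldest first)."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
        return dict.fromkeys(data) if isinstance(data, list) else {}
    except (OSError, ValueError, TypeError):
        return {}  # missing or corrupt file: start empty


def track_thread(threads: dict, thread_id: str, path: Path) -> None:
    """Record a thread, drop the oldest entries beyond the cap, and persist."""
    threads.pop(thread_id, None)  # re-adding moves the ID to the newest slot
    threads[thread_id] = None
    while len(threads) > MAX_TRACKED:
        threads.pop(next(iter(threads)))  # first key is the oldest
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target,
    # so a crash mid-write cannot leave a truncated JSON file behind.
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(list(threads), fh)
    os.replace(tmp, path)
```

The trade-off is a slightly larger in-memory structure in exchange for a well-defined eviction order.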
+19 -3
@@ -135,14 +135,23 @@ def _extract_email_address(raw: str) -> str:
return raw.strip().lower()
def _extract_attachments(msg: email_lib.message.Message) -> List[Dict[str, Any]]:
"""Extract attachment metadata and cache files locally."""
def _extract_attachments(
msg: email_lib.message.Message,
skip_attachments: bool = False,
) -> List[Dict[str, Any]]:
"""Extract attachment metadata and cache files locally.
When *skip_attachments* is True, all attachment/inline parts are ignored
(useful for malware protection or bandwidth savings).
"""
attachments = []
if not msg.is_multipart():
return attachments
for part in msg.walk():
disposition = str(part.get("Content-Disposition", ""))
if skip_attachments and ("attachment" in disposition or "inline" in disposition):
continue
if "attachment" not in disposition and "inline" not in disposition:
continue
# Skip text/plain and text/html body parts
@@ -196,6 +205,13 @@ class EmailAdapter(BasePlatformAdapter):
self._smtp_port = int(os.getenv("EMAIL_SMTP_PORT", "587"))
self._poll_interval = int(os.getenv("EMAIL_POLL_INTERVAL", "15"))
# Skip attachments — configured via config.yaml:
# platforms:
# email:
# skip_attachments: true
extra = config.extra or {}
self._skip_attachments = extra.get("skip_attachments", False)
# Track message IDs we've already processed to avoid duplicates
self._seen_uids: set = set()
self._poll_task: Optional[asyncio.Task] = None
@@ -306,7 +322,7 @@ class EmailAdapter(BasePlatformAdapter):
message_id = msg.get("Message-ID", "")
in_reply_to = msg.get("In-Reply-To", "")
body = _extract_text_body(msg)
attachments = _extract_attachments(msg)
attachments = _extract_attachments(msg, skip_attachments=self._skip_attachments)
results.append({
"uid": uid,
+5 -17
@@ -789,23 +789,11 @@ class SlackAdapter(BasePlatformAdapter):
user_id = command.get("user_id", "")
channel_id = command.get("channel_id", "")
# Map subcommands to gateway commands
subcommand_map = {
"new": "/reset", "reset": "/reset",
"status": "/status", "stop": "/stop",
"help": "/help",
"model": "/model", "personality": "/personality",
"retry": "/retry", "undo": "/undo",
"compact": "/compress", "compress": "/compress",
"resume": "/resume",
"background": "/background",
"usage": "/usage",
"insights": "/insights",
"title": "/title",
"reasoning": "/reasoning",
"provider": "/provider",
"rollback": "/rollback",
}
# Map subcommands to gateway commands — derived from central registry.
# Also keep "compact" as a Slack-specific alias for /compress.
from hermes_cli.commands import slack_subcommand_map
subcommand_map = slack_subcommand_map()
subcommand_map["compact"] = "/compress"
first_word = text.split()[0] if text else ""
if first_word in subcommand_map:
# Preserve arguments after the subcommand
+78 -48
@@ -202,8 +202,26 @@ class TelegramAdapter(BasePlatformAdapter):
self._handle_media_message
))
# Start polling in background
await self._app.initialize()
# Start polling — retry initialize() for transient TLS resets
try:
from telegram.error import NetworkError, TimedOut
except ImportError:
NetworkError = TimedOut = OSError # type: ignore[misc,assignment]
_max_connect = 3
for _attempt in range(_max_connect):
try:
await self._app.initialize()
break
except (NetworkError, TimedOut, OSError) as init_err:
if _attempt < _max_connect - 1:
wait = 2 ** _attempt
logger.warning(
"[%s] Connect attempt %d/%d failed: %s — retrying in %ds",
self.name, _attempt + 1, _max_connect, init_err, wait,
)
await asyncio.sleep(wait)
else:
raise
await self._app.start()
loop = asyncio.get_running_loop()
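The connect retry above follows a common exponential backoff shape: try, wait 1s, 2s, and so on, then give up by re-raising. A generic sketch of that pattern, assuming only `asyncio` and built-in exceptions; `retry_async` and its parameters are illustrative and not part of the adapter.

```python
import asyncio


async def retry_async(op, attempts: int = 3, base_delay: float = 1.0, retry_on=(OSError,)):
    """Await op() up to `attempts` times, sleeping base_delay * 2**n between tries."""
    for attempt in range(attempts):
        try:
            return await op()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)
```

Restricting `retry_on` to transient network errors keeps genuine configuration failures (for example a bad token) from being retried pointlessly.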
@@ -222,29 +240,13 @@ class TelegramAdapter(BasePlatformAdapter):
)
# Register bot commands so Telegram shows a hint menu when users type /
# List is derived from the central COMMAND_REGISTRY — adding a new
# gateway command there automatically adds it to the Telegram menu.
try:
from telegram import BotCommand
from hermes_cli.commands import telegram_bot_commands
await self._bot.set_my_commands([
BotCommand("new", "Start a new conversation"),
BotCommand("reset", "Reset conversation history"),
BotCommand("model", "Show or change the model"),
BotCommand("reasoning", "Show or change reasoning effort"),
BotCommand("personality", "Set a personality"),
BotCommand("retry", "Retry your last message"),
BotCommand("undo", "Remove the last exchange"),
BotCommand("status", "Show session info"),
BotCommand("stop", "Stop the running agent"),
BotCommand("sethome", "Set this chat as the home channel"),
BotCommand("compress", "Compress conversation context"),
BotCommand("title", "Set or show the session title"),
BotCommand("resume", "Resume a previously-named session"),
BotCommand("usage", "Show token usage for this session"),
BotCommand("provider", "Show available providers"),
BotCommand("insights", "Show usage insights and analytics"),
BotCommand("update", "Update Hermes to the latest version"),
BotCommand("reload_mcp", "Reload MCP servers from config"),
BotCommand("voice", "Toggle voice reply mode"),
BotCommand("help", "Show available commands"),
BotCommand(name, desc) for name, desc in telegram_bot_commands()
])
except Exception as e:
logger.warning(
@@ -265,6 +267,8 @@ class TelegramAdapter(BasePlatformAdapter):
release_scoped_lock("telegram-bot-token", self._token_lock_identity)
except Exception:
pass
message = f"Telegram startup failed: {e}"
self._set_fatal_error("telegram_connect_error", message, retryable=True)
logger.error("[%s] Failed to connect to Telegram: %s", self.name, e, exc_info=True)
return False
@@ -322,36 +326,59 @@ class TelegramAdapter(BasePlatformAdapter):
# Format and split message if needed
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
if len(chunks) > 1:
# truncate_message appends a raw " (1/2)" suffix. Escape the
# MarkdownV2-special parentheses so Telegram doesn't reject the
# chunk and fall back to plain text.
chunks = [
re.sub(r" \((\d+)/(\d+)\)$", r" \\(\1/\2\\)", chunk)
for chunk in chunks
]
message_ids = []
thread_id = metadata.get("thread_id") if metadata else None
try:
from telegram.error import NetworkError as _NetErr
except ImportError:
_NetErr = OSError # type: ignore[misc,assignment]
for i, chunk in enumerate(chunks):
# Try Markdown first, fall back to plain text if it fails
try:
msg = await self._bot.send_message(
chat_id=int(chat_id),
text=chunk,
parse_mode=ParseMode.MARKDOWN_V2,
reply_to_message_id=int(reply_to) if reply_to and i == 0 else None,
message_thread_id=int(thread_id) if thread_id else None,
)
except Exception as md_error:
# Markdown parsing failed, try plain text
if "parse" in str(md_error).lower() or "markdown" in str(md_error).lower():
logger.warning("[%s] MarkdownV2 parse failed, falling back to plain text: %s", self.name, md_error)
# Strip MDV2 escape backslashes so the user doesn't
# see raw backslashes littered through the message.
plain_chunk = _strip_mdv2(chunk)
msg = await self._bot.send_message(
chat_id=int(chat_id),
text=plain_chunk,
parse_mode=None, # Plain text
reply_to_message_id=int(reply_to) if reply_to and i == 0 else None,
message_thread_id=int(thread_id) if thread_id else None,
)
else:
raise # Re-raise if not a parse error
msg = None
for _send_attempt in range(3):
try:
# Try Markdown first, fall back to plain text if it fails
try:
msg = await self._bot.send_message(
chat_id=int(chat_id),
text=chunk,
parse_mode=ParseMode.MARKDOWN_V2,
reply_to_message_id=int(reply_to) if reply_to and i == 0 else None,
message_thread_id=int(thread_id) if thread_id else None,
)
except Exception as md_error:
# Markdown parsing failed, try plain text
if "parse" in str(md_error).lower() or "markdown" in str(md_error).lower():
logger.warning("[%s] MarkdownV2 parse failed, falling back to plain text: %s", self.name, md_error)
plain_chunk = _strip_mdv2(chunk)
msg = await self._bot.send_message(
chat_id=int(chat_id),
text=plain_chunk,
parse_mode=None,
reply_to_message_id=int(reply_to) if reply_to and i == 0 else None,
message_thread_id=int(thread_id) if thread_id else None,
)
else:
raise
break # success
except _NetErr as send_err:
if _send_attempt < 2:
wait = 2 ** _send_attempt
logger.warning("[%s] Network error on send (attempt %d/3), retrying in %ds: %s",
self.name, _send_attempt + 1, wait, send_err)
await asyncio.sleep(wait)
else:
raise
message_ids.append(str(msg.message_id))
return SendResult(
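Review note: the retry loop in this hunk is a bounded exponential backoff around one send. Below is a minimal standalone sketch of the same pattern. `send_with_retry`, `flaky_send`, and `base_delay` are hypothetical names for illustration, and `OSError` stands in for the Telegram network error class, mirroring the ImportError fallback in the hunk.

```python
import asyncio


async def send_with_retry(send, attempts=3, base_delay=1.0, retry_on=(OSError,)):
    """Await the `send` callable, retrying transient errors with backoff."""
    for attempt in range(attempts):
        try:
            return await send()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles each time: base_delay * 1, then * 2, then * 4, ...
            await asyncio.sleep(base_delay * 2 ** attempt)


calls = []


async def flaky_send():
    """Hypothetical sender that fails twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise OSError("simulated network error")
    return "sent"


result = asyncio.run(send_with_retry(flaky_send, base_delay=0.01))
```

Only the exception types listed in `retry_on` are retried; anything else propagates immediately, which matches the hunk's behaviour of re-raising non-network errors.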
@@ -821,7 +848,10 @@ class TelegramAdapter(BasePlatformAdapter):
def _photo_batch_key(self, event: MessageEvent, msg: Message) -> str:
"""Return a batching key for Telegram photos/albums."""
from gateway.session import build_session_key
session_key = build_session_key(event.source)
session_key = build_session_key(
event.source,
group_sessions_per_user=self.config.extra.get("group_sessions_per_user", True),
)
media_group_id = getattr(msg, "media_group_id", None)
if media_group_id:
return f"{session_key}:album:{media_group_id}"
+410 -162
@@ -29,6 +29,49 @@ from pathlib import Path
from datetime import datetime
from typing import Dict, Optional, Any, List
# ---------------------------------------------------------------------------
# SSL certificate auto-detection for NixOS and other non-standard systems.
# Must run BEFORE any HTTP library (discord, aiohttp, etc.) is imported.
# ---------------------------------------------------------------------------
def _ensure_ssl_certs() -> None:
"""Set SSL_CERT_FILE if the system doesn't expose CA certs to Python."""
if "SSL_CERT_FILE" in os.environ:
return # user already configured it
import ssl
# 1. Python's compiled-in defaults
paths = ssl.get_default_verify_paths()
for candidate in (paths.cafile, paths.openssl_cafile):
if candidate and os.path.exists(candidate):
os.environ["SSL_CERT_FILE"] = candidate
return
# 2. certifi (ships its own Mozilla bundle)
try:
import certifi
os.environ["SSL_CERT_FILE"] = certifi.where()
return
except ImportError:
pass
# 3. Common distro / macOS locations
for candidate in (
"/etc/ssl/certs/ca-certificates.crt", # Debian/Ubuntu/Gentoo
"/etc/pki/tls/certs/ca-bundle.crt", # RHEL/CentOS 7
"/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem", # RHEL/CentOS 8+
"/etc/ssl/ca-bundle.pem", # SUSE/OpenSUSE
"/etc/ssl/cert.pem", # Alpine / macOS
"/etc/pki/tls/cert.pem", # Fedora
"/usr/local/etc/openssl@1.1/cert.pem", # macOS Homebrew Intel
"/opt/homebrew/etc/openssl@1.1/cert.pem", # macOS Homebrew ARM
):
if os.path.exists(candidate):
os.environ["SSL_CERT_FILE"] = candidate
return
_ensure_ssl_certs()
# Add parent directory to path
sys.path.insert(0, str(Path(__file__).parent.parent))
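Review note: `_ensure_ssl_certs()` follows a "first hit wins" lookup: an explicit setting is respected, otherwise candidate paths are probed in order. A small sketch of that ordering, assuming hypothetical helpers `first_existing` and `resolve_ca_bundle` and a plain dict in place of `os.environ` so it can run without touching the real environment:

```python
import os
import tempfile


def first_existing(candidates):
    """Return the first candidate path that exists on disk, else None."""
    for candidate in candidates:
        if candidate and os.path.exists(candidate):
            return candidate
    return None


def resolve_ca_bundle(env, candidates):
    """An explicit SSL_CERT_FILE wins; otherwise probe the candidates in order."""
    if "SSL_CERT_FILE" in env:
        return env["SSL_CERT_FILE"]
    return first_existing(candidates)


with tempfile.NamedTemporaryFile(suffix=".pem") as bundle:
    missing = bundle.name + ".missing"
    # None and non-existent paths are skipped; the real file is picked up.
    probed = resolve_ca_bundle({}, [None, missing, bundle.name])
    found_bundle = probed == bundle.name
    # A user-provided value is returned untouched, even if other files exist.
    explicit = resolve_ca_bundle({"SSL_CERT_FILE": "/custom/ca.pem"}, [bundle.name])
```

The sketch leaves out the certifi step from the hunk, which sits between the compiled-in defaults and the distro paths.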
@@ -77,6 +120,7 @@ if _config_path.exists():
"container_persistent": "TERMINAL_CONTAINER_PERSISTENT",
"docker_volumes": "TERMINAL_DOCKER_VOLUMES",
"sandbox_dir": "TERMINAL_SANDBOX_DIR",
"persistent_shell": "TERMINAL_PERSISTENT_SHELL",
}
for _cfg_key, _env_var in _terminal_env_map.items():
if _cfg_key in _terminal_cfg:
@@ -113,6 +157,12 @@ if _config_path.exists():
"base_url": "AUXILIARY_WEB_EXTRACT_BASE_URL",
"api_key": "AUXILIARY_WEB_EXTRACT_API_KEY",
},
"approval": {
"provider": "AUXILIARY_APPROVAL_PROVIDER",
"model": "AUXILIARY_APPROVAL_MODEL",
"base_url": "AUXILIARY_APPROVAL_BASE_URL",
"api_key": "AUXILIARY_APPROVAL_API_KEY",
},
}
for _task_key, _env_map in _aux_task_env.items():
_task_cfg = _auxiliary_cfg.get(_task_key, {})
@@ -274,6 +324,7 @@ class GatewayRunner:
self._show_reasoning = self._load_show_reasoning()
self._provider_routing = self._load_provider_routing()
self._fallback_model = self._load_fallback_model()
self._smart_model_routing = self._load_smart_model_routing()
# Wire process registry into session store for reset protection
from tools.process_registry import process_registry
@@ -305,7 +356,7 @@ class GatewayRunner:
# Ensure tirith security scanner is available (downloads if needed)
try:
from tools.tirith_security import ensure_installed
ensure_installed()
ensure_installed(log_failures=False)
except Exception:
pass # Non-fatal — fail-open at scan time if unavailable
@@ -434,7 +485,11 @@ class GatewayRunner:
# -----------------------------------------------------------------
def _flush_memories_for_session(self, old_session_id: str):
def _flush_memories_for_session(
self,
old_session_id: str,
honcho_session_key: Optional[str] = None,
):
"""Prompt the agent to save memories/skills before context is lost.
Synchronous worker meant to be called via run_in_executor from
@@ -462,6 +517,7 @@ class GatewayRunner:
quiet_mode=True,
enabled_toolsets=["memory", "skills"],
session_id=old_session_id,
honcho_session_key=honcho_session_key,
)
# Build conversation history from transcript
@@ -489,6 +545,7 @@ class GatewayRunner:
tmp_agent.run_conversation(
user_message=flush_prompt,
conversation_history=msgs,
sync_honcho=False,
)
logger.info("Pre-reset memory flush completed for session %s", old_session_id)
# Flush any queued Honcho writes before the session is dropped
@@ -500,10 +557,19 @@ class GatewayRunner:
except Exception as e:
logger.debug("Pre-reset memory flush failed for session %s: %s", old_session_id, e)
async def _async_flush_memories(self, old_session_id: str):
async def _async_flush_memories(
self,
old_session_id: str,
honcho_session_key: Optional[str] = None,
):
"""Run the sync memory flush in a thread pool so it won't block the event loop."""
loop = asyncio.get_event_loop()
await loop.run_in_executor(None, self._flush_memories_for_session, old_session_id)
await loop.run_in_executor(
None,
self._flush_memories_for_session,
old_session_id,
honcho_session_key,
)
@property
def should_exit_cleanly(self) -> bool:
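Review note: the new `honcho_session_key` argument is forwarded positionally because `loop.run_in_executor()` accepts only positional arguments after the callable. A minimal sketch of that hand-off, with hypothetical `flush_worker` and `flush_async` names standing in for the two methods in this hunk:

```python
import asyncio


def flush_worker(session_id, honcho_session_key=None):
    """Blocking worker; here it just echoes what it received."""
    return (session_id, honcho_session_key)


async def flush_async(session_id, honcho_session_key=None):
    loop = asyncio.get_running_loop()
    # Extra positional arguments after the callable are passed through to it.
    return await loop.run_in_executor(
        None, flush_worker, session_id, honcho_session_key
    )


flushed = asyncio.run(flush_async("sess-1", "telegram:42"))
```

If a keyword argument were needed instead, wrapping the worker in `functools.partial` would be the usual route.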
@@ -513,6 +579,33 @@ class GatewayRunner:
def exit_reason(self) -> Optional[str]:
return self._exit_reason
def _session_key_for_source(self, source: SessionSource) -> str:
"""Resolve the current session key for a source, honoring gateway config when available."""
if hasattr(self, "session_store") and self.session_store is not None:
try:
session_key = self.session_store._generate_session_key(source)
if isinstance(session_key, str) and session_key:
return session_key
except Exception:
pass
config = getattr(self, "config", None)
return build_session_key(
source,
group_sessions_per_user=getattr(config, "group_sessions_per_user", True),
)
def _resolve_turn_agent_config(self, user_message: str, model: str, runtime_kwargs: dict) -> dict:
from agent.smart_model_routing import resolve_turn_route
primary = {
"model": model,
"api_key": runtime_kwargs.get("api_key"),
"base_url": runtime_kwargs.get("base_url"),
"provider": runtime_kwargs.get("provider"),
"api_mode": runtime_kwargs.get("api_mode"),
}
return resolve_turn_route(user_message, getattr(self, "_smart_model_routing", {}), primary)
async def _handle_adapter_fatal_error(self, adapter: BasePlatformAdapter) -> None:
"""React to a non-retryable adapter failure after startup."""
logger.error(
@@ -715,6 +808,20 @@ class GatewayRunner:
pass
return None
@staticmethod
def _load_smart_model_routing() -> dict:
"""Load optional smart cheap-vs-strong model routing config."""
try:
import yaml as _y
cfg_path = _hermes_home / "config.yaml"
if cfg_path.exists():
with open(cfg_path, encoding="utf-8") as _f:
cfg = _y.safe_load(_f) or {}
return cfg.get("smart_model_routing", {}) or {}
except Exception:
pass
return {}
async def start(self) -> bool:
"""
Start the gateway and all configured platform adapters.
@@ -757,12 +864,15 @@ class GatewayRunner:
logger.warning("Process checkpoint recovery: %s", e)
connected_count = 0
enabled_platform_count = 0
startup_nonretryable_errors: list[str] = []
startup_retryable_errors: list[str] = []
# Initialize and connect each configured platform
for platform, platform_config in self.config.platforms.items():
if not platform_config.enabled:
continue
enabled_platform_count += 1
adapter = self._create_adapter(platform, platform_config)
if not adapter:
@@ -784,12 +894,22 @@ class GatewayRunner:
logger.info("%s connected", platform.value)
else:
logger.warning("%s failed to connect", platform.value)
if adapter.has_fatal_error and not adapter.fatal_error_retryable:
startup_nonretryable_errors.append(
if adapter.has_fatal_error:
target = (
startup_retryable_errors
if adapter.fatal_error_retryable
else startup_nonretryable_errors
)
target.append(
f"{platform.value}: {adapter.fatal_error_message}"
)
else:
startup_retryable_errors.append(
f"{platform.value}: failed to connect"
)
except Exception as e:
logger.error("%s error: %s", platform.value, e)
startup_retryable_errors.append(f"{platform.value}: {e}")
if connected_count == 0:
if startup_nonretryable_errors:
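Review note: after this hunk, every failed platform lands in exactly one of two lists. A sketch of that classification, assuming a hypothetical `AdapterState` dataclass that carries only the three attributes the hunk reads from the adapter:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdapterState:
    """Hypothetical stand-in for the adapter attributes used at startup."""
    has_fatal_error: bool = False
    fatal_error_retryable: bool = True
    fatal_error_message: Optional[str] = None


def record_failure(platform, adapter, retryable, nonretryable):
    """Append a failure description to the matching list."""
    if adapter.has_fatal_error:
        target = retryable if adapter.fatal_error_retryable else nonretryable
        target.append(f"{platform}: {adapter.fatal_error_message}")
    else:
        # No fatal error recorded: treat a plain connect failure as retryable.
        retryable.append(f"{platform}: failed to connect")


retryable, nonretryable = [], []
record_failure("telegram", AdapterState(True, True, "Telegram startup failed: timeout"),
               retryable, nonretryable)
record_failure("discord", AdapterState(True, False, "invalid token"),
               retryable, nonretryable)
record_failure("slack", AdapterState(), retryable, nonretryable)
```

Joining the retryable list with "; " then gives the `exit_reason` string written to the runtime status when nothing connects.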
@@ -802,7 +922,16 @@ class GatewayRunner:
pass
self._request_clean_exit(reason)
return True
logger.warning("No messaging platforms connected.")
if enabled_platform_count > 0:
reason = "; ".join(startup_retryable_errors) or "all configured messaging platforms failed to connect"
logger.error("Gateway failed to connect any configured messaging platform: %s", reason)
try:
from gateway.status import write_runtime_status
write_runtime_status(gateway_state="startup_failed", exit_reason=reason)
except Exception:
pass
return False
logger.warning("No messaging platforms enabled.")
logger.info("Gateway will continue running for cron job execution.")
# Update delivery router with adapters
@@ -879,7 +1008,7 @@ class GatewayRunner:
entry.session_id, key,
)
try:
await self._async_flush_memories(entry.session_id)
await self._async_flush_memories(entry.session_id, key)
self._shutdown_gateway_honcho(key)
self.session_store._pre_flushed_sessions.add(entry.session_id)
except Exception as e:
@@ -941,6 +1070,12 @@ class GatewayRunner:
config: Any
) -> Optional[BasePlatformAdapter]:
"""Create the appropriate adapter for a platform."""
if hasattr(config, "extra") and isinstance(config.extra, dict):
config.extra.setdefault(
"group_sessions_per_user",
self.config.group_sessions_per_user,
)
if platform == Platform.TELEGRAM:
from gateway.platforms.telegram import TelegramAdapter, check_telegram_requirements
if not check_telegram_requirements():
@@ -1112,8 +1247,11 @@ class GatewayRunner:
# Special case: Telegram/photo bursts often arrive as multiple near-
# simultaneous updates. Do NOT interrupt for photo-only follow-ups here;
# let the adapter-level batching/queueing logic absorb them.
_quick_key = build_session_key(source)
_quick_key = self._session_key_for_source(source)
if _quick_key in self._running_agents:
if event.get_command() == "status":
return await self._handle_status_command(event)
if event.message_type == MessageType.PHOTO:
logger.debug("PRIORITY photo follow-up for session %s — queueing without interrupt", _quick_key[:20])
adapter = self.adapters.get(source.platform)
@@ -1147,45 +1285,47 @@ class GatewayRunner:
# Check for commands
command = event.get_command()
# Emit command:* hook for any recognized slash command
_known_commands = {"new", "reset", "help", "status", "stop", "model", "reasoning",
"personality", "plan", "retry", "undo", "sethome", "set-home",
"compress", "usage", "insights", "reload-mcp", "reload_mcp",
"update", "title", "resume", "provider", "rollback",
"background", "reasoning", "voice"}
if command and command in _known_commands:
# Emit command:* hook for any recognized slash command.
# GATEWAY_KNOWN_COMMANDS is derived from the central COMMAND_REGISTRY
# in hermes_cli/commands.py — no hardcoded set to maintain here.
from hermes_cli.commands import GATEWAY_KNOWN_COMMANDS, resolve_command as _resolve_cmd
if command and command in GATEWAY_KNOWN_COMMANDS:
await self.hooks.emit(f"command:{command}", {
"platform": source.platform.value if source.platform else "",
"user_id": source.user_id,
"command": command,
"args": event.get_command_args().strip(),
})
if command in ["new", "reset"]:
# Resolve aliases to canonical name so dispatch only checks canonicals.
_cmd_def = _resolve_cmd(command) if command else None
canonical = _cmd_def.name if _cmd_def else command
if canonical == "new":
return await self._handle_reset_command(event)
if command == "help":
if canonical == "help":
return await self._handle_help_command(event)
if command == "status":
if canonical == "status":
return await self._handle_status_command(event)
if command == "stop":
if canonical == "stop":
return await self._handle_stop_command(event)
if command == "model":
if canonical == "model":
return await self._handle_model_command(event)
if command == "reasoning":
if canonical == "reasoning":
return await self._handle_reasoning_command(event)
if command == "provider":
if canonical == "provider":
return await self._handle_provider_command(event)
if command == "personality":
if canonical == "personality":
return await self._handle_personality_command(event)
if command == "plan":
if canonical == "plan":
try:
from agent.skill_commands import build_plan_path, build_skill_invocation_message
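Review note: the dispatch now resolves aliases to a canonical name before comparing. The real registry lives in hermes_cli/commands.py and is not shown in this diff, so the sketch below is an assumption about its shape: `CommandDef`, `REGISTRY`, and `canonical_name` are hypothetical, and only the alias pairs are taken from the old `if command in [...]` checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandDef:
    """Hypothetical registry entry: one canonical name plus its aliases."""
    name: str
    aliases: tuple = ()


REGISTRY = [
    CommandDef("new", aliases=("reset",)),
    CommandDef("sethome", aliases=("set-home",)),
    CommandDef("reload-mcp", aliases=("reload_mcp",)),
    CommandDef("help"),
]

# Map every accepted spelling (canonical or alias) to its definition.
_LOOKUP = {}
for cmd in REGISTRY:
    _LOOKUP[cmd.name] = cmd
    for alias in cmd.aliases:
        _LOOKUP[alias] = cmd

KNOWN_COMMANDS = frozenset(_LOOKUP)


def resolve_command(name):
    """Return the CommandDef for a name or alias, or None if unknown."""
    return _LOOKUP.get(name)


def canonical_name(command):
    """Same fallback as the hunk: unknown commands pass through unchanged."""
    cmd_def = resolve_command(command) if command else None
    return cmd_def.name if cmd_def else command
```

With this shape, the handler chain only needs one comparison per command, and adding an alias is a registry change rather than an edit to the dispatch code.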
@@ -1202,51 +1342,48 @@ class GatewayRunner:
)
if not event.text:
return "Failed to load the bundled /plan skill."
command = None
canonical = None
except Exception as e:
logger.exception("Failed to prepare /plan command")
return f"Failed to enter plan mode: {e}"
if command == "retry":
if canonical == "retry":
return await self._handle_retry_command(event)
if command == "undo":
if canonical == "undo":
return await self._handle_undo_command(event)
if command in ["sethome", "set-home"]:
if canonical == "sethome":
return await self._handle_set_home_command(event)
if command == "compress":
if canonical == "compress":
return await self._handle_compress_command(event)
if command == "usage":
if canonical == "usage":
return await self._handle_usage_command(event)
if command == "insights":
if canonical == "insights":
return await self._handle_insights_command(event)
if command in ("reload-mcp", "reload_mcp"):
if canonical == "reload-mcp":
return await self._handle_reload_mcp_command(event)
if command == "update":
if canonical == "update":
return await self._handle_update_command(event)
if command == "title":
if canonical == "title":
return await self._handle_title_command(event)
if command == "resume":
if canonical == "resume":
return await self._handle_resume_command(event)
if command == "rollback":
if canonical == "rollback":
return await self._handle_rollback_command(event)
if command == "background":
if canonical == "background":
return await self._handle_background_command(event)
if command == "reasoning":
return await self._handle_reasoning_command(event)
if command == "voice":
if canonical == "voice":
return await self._handle_voice_command(event)
# User-defined quick commands (bypass agent loop, no LLM call)
@@ -1298,7 +1435,7 @@ class GatewayRunner:
logger.debug("Skill command check failed (non-fatal): %s", e)
# Check for pending exec approval responses
session_key_preview = build_session_key(source)
session_key_preview = self._session_key_for_source(source)
if session_key_preview in self._pending_approvals:
user_text = event.text.strip().lower()
if user_text in ("yes", "y", "approve", "ok", "go", "do it"):
@@ -1347,8 +1484,17 @@ class GatewayRunner:
# Set environment variables for tools
self._set_session_env(context)
# Read privacy.redact_pii from config (re-read per message)
_redact_pii = False
try:
with open(_config_path, encoding="utf-8") as _pf:
_pcfg = yaml.safe_load(_pf) or {}
_redact_pii = bool((_pcfg.get("privacy") or {}).get("redact_pii", False))
except Exception:
pass
# Build the context prompt to inject
context_prompt = build_session_context_prompt(context)
context_prompt = build_session_context_prompt(context, redact_pii=_redact_pii)
# If the previous session expired and was auto-reset, prepend a notice
# so the agent knows this is a fresh conversation (not an intentional /reset).
@@ -1717,9 +1863,37 @@ class GatewayRunner:
session_key=session_key
)
response = agent_result.get("final_response", "")
response = agent_result.get("final_response") or ""
agent_messages = agent_result.get("messages", [])
# Surface error details when the agent failed silently (final_response=None)
if not response and agent_result.get("failed"):
error_detail = agent_result.get("error", "unknown error")
error_str = str(error_detail).lower()
# Detect context-overflow failures and give specific guidance.
# Generic 400 "Error" from Anthropic with large sessions is the
# most common cause of this (#1630).
_is_ctx_fail = any(p in error_str for p in (
"context", "token", "too large", "too long",
"exceed", "payload",
)) or (
"400" in error_str
and len(history) > 50
)
if _is_ctx_fail:
response = (
"⚠️ Session too large for the model's context window.\n"
"Use /compress to compress the conversation, or "
"/reset to start fresh."
)
else:
response = (
f"The request failed: {str(error_detail)[:300]}\n"
"Try again or use /reset to start a fresh session."
)
# If the agent's session_id changed during compression, update
# session_entry so transcript writes below go to the right session.
if agent_result.get("session_id") and agent_result["session_id"] != session_entry.session_id:
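Review note: the context-overflow check in this hunk is a substring heuristic plus a "400 on a long session" fallback. A standalone sketch of the same predicate, where `looks_like_context_overflow` and `long_session` are hypothetical names and the hint strings and the threshold of 50 are copied from the hunk:

```python
CONTEXT_HINTS = ("context", "token", "too large", "too long", "exceed", "payload")


def looks_like_context_overflow(error, history_len, long_session=50):
    """Guess whether a failed request was caused by an oversized session."""
    text = str(error).lower()
    if any(hint in text for hint in CONTEXT_HINTS):
        return True
    # A bare 400 with no useful message only counts on a long session.
    return "400" in text and history_len > long_session


checks = [
    looks_like_context_overflow("prompt is too long", 3),
    looks_like_context_overflow("Error code: 400", 80),
    looks_like_context_overflow("Error code: 400", 10),
    looks_like_context_overflow("connection reset", 80),
]
```

Because the match is on substrings, an unrelated error that happens to mention "token" (for example an authentication failure) would also be classified as an overflow, so the hint list may be worth tightening.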
@@ -1766,12 +1940,30 @@ class GatewayRunner:
# This preserves the complete agent loop (tool_calls, tool results,
# intermediate reasoning) so sessions can be resumed with full context
# and transcripts are useful for debugging and training data.
#
# IMPORTANT: When the agent failed before producing any response
# (e.g. context-overflow 400), do NOT persist the user's message.
# Persisting it would make the session even larger, causing the
# same failure on the next attempt — an infinite loop. (#1630)
agent_failed_early = (
agent_result.get("failed")
and not agent_result.get("final_response")
)
if agent_failed_early:
logger.info(
"Skipping transcript persistence for failed request in "
"session %s to prevent session growth loop.",
session_entry.session_id,
)
ts = datetime.now().isoformat()
# If this is a fresh session (no history), write the full tool
# definitions as the first entry so the transcript is self-describing
# -- the same list of dicts sent as tools=[...] in the API request.
if not history:
if agent_failed_early:
pass # Skip all transcript writes — don't grow a broken session
elif not history:
tool_defs = agent_result.get("tools", [])
self.session_store.append_to_transcript(
session_entry.session_id,
@@ -1788,40 +1980,43 @@ class GatewayRunner:
# Use the filtered history length (history_offset) that was actually
# passed to the agent, not len(history) which includes session_meta
# entries that were stripped before the agent saw them.
history_len = agent_result.get("history_offset", len(history))
new_messages = agent_messages[history_len:] if len(agent_messages) > history_len else []
# If no new messages found (edge case), fall back to simple user/assistant
if not new_messages:
self.session_store.append_to_transcript(
session_entry.session_id,
{"role": "user", "content": message_text, "timestamp": ts}
)
if response:
if not agent_failed_early:
history_len = agent_result.get("history_offset", len(history))
new_messages = agent_messages[history_len:] if len(agent_messages) > history_len else []
# If no new messages found (edge case), fall back to simple user/assistant
if not new_messages:
self.session_store.append_to_transcript(
session_entry.session_id,
{"role": "assistant", "content": response, "timestamp": ts}
)
else:
# The agent already persisted these messages to SQLite via
# _flush_messages_to_session_db(), so skip the DB write here
# to prevent the duplicate-write bug (#860). We still write
# to JSONL for backward compatibility and as a backup.
agent_persisted = self._session_db is not None
for msg in new_messages:
# Skip system messages (they're rebuilt each run)
if msg.get("role") == "system":
continue
# Add timestamp to each message for debugging
entry = {**msg, "timestamp": ts}
self.session_store.append_to_transcript(
session_entry.session_id, entry,
skip_db=agent_persisted,
{"role": "user", "content": message_text, "timestamp": ts}
)
if response:
self.session_store.append_to_transcript(
session_entry.session_id,
{"role": "assistant", "content": response, "timestamp": ts}
)
else:
# The agent already persisted these messages to SQLite via
# _flush_messages_to_session_db(), so skip the DB write here
# to prevent the duplicate-write bug (#860). We still write
# to JSONL for backward compatibility and as a backup.
agent_persisted = self._session_db is not None
for msg in new_messages:
# Skip system messages (they're rebuilt each run)
if msg.get("role") == "system":
continue
# Add timestamp to each message for debugging
entry = {**msg, "timestamp": ts}
self.session_store.append_to_transcript(
session_entry.session_id, entry,
skip_db=agent_persisted,
)
# Update session with actual prompt token count and model from the agent
self.session_store.update_session(
session_entry.session_key,
input_tokens=agent_result.get("input_tokens", 0),
output_tokens=agent_result.get("output_tokens", 0),
last_prompt_tokens=agent_result.get("last_prompt_tokens", 0),
model=agent_result.get("model"),
)
@@ -1830,13 +2025,41 @@ class GatewayRunner:
if self._should_send_voice_reply(event, response, agent_messages):
await self._send_voice_reply(event, response)
# If streaming already delivered the response, return None so
# _process_message_background doesn't send it again.
if agent_result.get("already_sent"):
return None
return response
except Exception as e:
logger.exception("Agent error in session %s", session_key)
error_type = type(e).__name__
error_detail = str(e)[:300] if str(e) else "no details available"
status_hint = ""
status_code = getattr(e, "status_code", None)
if status_code == 401:
status_hint = " Check your API key or run `claude /login` to refresh OAuth credentials."
elif status_code == 429:
status_hint = " You are being rate-limited. Please wait a moment and try again."
elif status_code == 529:
status_hint = " The API is temporarily overloaded. Please try again shortly."
elif status_code == 400:
# 400 with a large session is almost always a context overflow.
# Give specific guidance instead of a generic error. (#1630)
_hist_len = len(history) if 'history' in locals() else 0
if _hist_len > 50:
return (
"⚠️ Session too large for the model's context window.\n"
"Use /compress to compress the conversation, or "
"/reset to start fresh."
)
else:
status_hint = " The request was rejected by the API."
return (
"Sorry, I encountered an unexpected error. "
"The details have been logged for debugging. "
f"Sorry, I encountered an error ({error_type}).\n"
f"{error_detail}\n"
f"{status_hint}"
"Try again or use /reset to start a fresh session."
)
finally:
@@ -1848,14 +2071,16 @@ class GatewayRunner:
source = event.source
# Get existing session key
session_key = self.session_store._generate_session_key(source)
session_key = self._session_key_for_source(source)
# Flush memories in the background (fire-and-forget) so the user
# gets the "Session reset!" response immediately.
try:
old_entry = self.session_store._entries.get(session_key)
if old_entry:
asyncio.create_task(self._async_flush_memories(old_entry.session_id))
asyncio.create_task(
self._async_flush_memories(old_entry.session_id, session_key)
)
except Exception as e:
logger.debug("Gateway memory flush on reset failed: %s", e)
@@ -1918,30 +2143,10 @@ class GatewayRunner:
async def _handle_help_command(self, event: MessageEvent) -> str:
"""Handle /help command - list available commands."""
from hermes_cli.commands import gateway_help_lines
lines = [
"📖 **Hermes Commands**\n",
"`/new` — Start a new conversation",
"`/reset` — Reset conversation history",
"`/status` — Show session info",
"`/stop` — Interrupt the running agent",
"`/model [provider:model]` — Show/change model (or switch provider)",
"`/provider` — Show available providers and auth status",
"`/personality [name]` — Set a personality",
"`/retry` — Retry your last message",
"`/undo` — Remove the last exchange",
"`/sethome` — Set this chat as the home channel",
"`/compress` — Compress conversation context",
"`/title [name]` — Set or show the session title",
"`/resume [name]` — Resume a previously-named session",
"`/usage` — Show token usage for this session",
"`/insights [days]` — Show usage insights and analytics",
"`/reasoning [level|show|hide]` — Set reasoning effort or toggle display",
"`/rollback [number]` — List or restore filesystem checkpoints",
"`/background <prompt>` — Run a prompt in a separate background session",
"`/voice [on|off|tts|status]` — Toggle voice reply mode",
"`/reload-mcp` — Reload MCP servers from config",
"`/update` — Update Hermes Agent to the latest version",
"`/help` — Show this message",
*gateway_help_lines(),
]
try:
from agent.skill_commands import get_skill_commands
@@ -2019,6 +2224,12 @@ class GatewayRunner:
# Parse provider:model syntax
target_provider, new_model = parse_model_input(args, current_provider)
# Auto-detect provider when no explicit provider:model syntax was used
if target_provider == current_provider:
from hermes_cli.models import detect_provider_for_model
detected = detect_provider_for_model(new_model, current_provider)
if detected:
target_provider, new_model = detected
provider_changed = target_provider != current_provider
# Resolve credentials for the target provider (for API probe)
@@ -2801,11 +3012,12 @@ class GatewayRunner:
max_iterations = int(os.getenv("HERMES_MAX_ITERATIONS", "90"))
reasoning_config = self._load_reasoning_config()
self._reasoning_config = reasoning_config
turn_route = self._resolve_turn_agent_config(prompt, model, runtime_kwargs)
def run_sync():
agent = AIAgent(
model=model,
**runtime_kwargs,
model=turn_route["model"],
**turn_route["runtime"],
max_iterations=max_iterations,
quiet_mode=True,
verbose_logging=False,
@@ -3078,7 +3290,7 @@ class GatewayRunner:
return "Session database not available."
source = event.source
session_key = build_session_key(source)
session_key = self._session_key_for_source(source)
name = event.get_command_args().strip()
if not name:
@@ -3122,7 +3334,9 @@ class GatewayRunner:
# Flush memories for current session before switching
try:
asyncio.create_task(self._async_flush_memories(current_entry.session_id))
asyncio.create_task(
self._async_flush_memories(current_entry.session_id, session_key)
)
except Exception as e:
logger.debug("Memory flush on resume failed: %s", e)
@@ -3150,7 +3364,7 @@ class GatewayRunner:
async def _handle_usage_command(self, event: MessageEvent) -> str:
"""Handle /usage command -- show token usage for the session's last agent run."""
source = event.source
session_key = build_session_key(source)
session_key = self._session_key_for_source(source)
agent = self._running_agents.get(session_key)
if agent and hasattr(agent, "session_total_tokens") and agent.session_api_calls > 0:
@@ -3525,13 +3739,9 @@ class GatewayRunner:
1. Immediately understand what the user sent (no extra tool call).
2. Re-examine the image with vision_analyze if it needs more detail.
Athabasca persistence should happen through Athabasca's own POST
/api/uploads flow, using the returned asset.publicUrl rather than local
cache paths.
Args:
user_text: The user's original caption / message text.
image_paths: List of local file paths to cached images.
user_text: The user's original caption / message text.
image_paths: List of local file paths to cached images.
Returns:
The enriched message string with vision descriptions prepended.
@@ -3556,16 +3766,10 @@ class GatewayRunner:
result = _json.loads(result_json)
if result.get("success"):
description = result.get("analysis", "")
athabasca_note = (
"\n[If this image needs to persist in Athabasca state, upload the cached file "
"through Athabasca POST /api/uploads and use the returned asset.publicUrl. "
"Do not store the local cache path as the canonical imageUrl.]"
)
enriched_parts.append(
f"[The user sent an image~ Here's what I can see:\n{description}]\n"
f"[If you need a closer look, use vision_analyze with "
f"image_url: {path} ~]"
f"{athabasca_note}"
)
else:
enriched_parts.append(
@@ -3629,7 +3833,10 @@ class GatewayRunner:
)
else:
error = result.get("error", "unknown error")
if "No STT provider" in error or "not set" in error:
if (
"No STT provider" in error
or error.startswith("Neither VOICE_TOOLS_OPENAI_KEY nor OPENAI_API_KEY is set")
):
enriched_parts.append(
"[The user sent a voice message but I can't listen "
"to it right now~ No STT provider is configured "
@@ -3674,6 +3881,7 @@ class GatewayRunner:
session_key = watcher.get("session_key", "")
platform_name = watcher.get("platform", "")
chat_id = watcher.get("chat_id", "")
thread_id = watcher.get("thread_id", "")
notify_mode = self._load_background_notifications_mode()
logger.debug("Process watcher started: %s (every %ss, notify=%s)",
@@ -3721,7 +3929,8 @@ class GatewayRunner:
break
if adapter and chat_id:
try:
await adapter.send(chat_id, message_text)
send_meta = {"thread_id": thread_id} if thread_id else None
await adapter.send(chat_id, message_text, metadata=send_meta)
except Exception as e:
logger.error("Watcher delivery error: %s", e)
break
@@ -3740,7 +3949,8 @@ class GatewayRunner:
break
if adapter and chat_id:
try:
await adapter.send(chat_id, message_text)
send_meta = {"thread_id": thread_id} if thread_id else None
await adapter.send(chat_id, message_text, metadata=send_meta)
except Exception as e:
logger.error("Watcher delivery error: %s", e)
@@ -3851,45 +4061,8 @@ class GatewayRunner:
last_tool[0] = tool_name
# Build progress message with primary argument preview
tool_emojis = {
"terminal": "💻",
"process": "⚙️",
"web_search": "🔍",
"web_extract": "📄",
"read_file": "📖",
"write_file": "✍️",
"patch": "🔧",
"search": "🔎",
"search_files": "🔎",
"list_directory": "📂",
"image_generate": "🎨",
"text_to_speech": "🔊",
"browser_navigate": "🌐",
"browser_click": "👆",
"browser_type": "⌨️",
"browser_snapshot": "📸",
"browser_scroll": "📜",
"browser_back": "◀️",
"browser_press": "⌨️",
"browser_close": "🚪",
"browser_get_images": "🖼️",
"browser_vision": "👁️",
"moa_query": "🧠",
"mixture_of_agents": "🧠",
"vision_analyze": "👁️",
"skill_view": "📚",
"skills_list": "📋",
"todo": "📋",
"memory": "🧠",
"session_search": "🔍",
"send_message": "📨",
"cronjob": "",
"execute_code": "🐍",
"delegate_task": "🔀",
"clarify": "",
"skill_manage": "📝",
}
emoji = tool_emojis.get(tool_name, "⚙️")
from agent.display import get_tool_emoji
emoji = get_tool_emoji(tool_name, default="⚙️")
# Verbose mode: show detailed arguments
if progress_mode == "verbose" and args:
@@ -4016,6 +4189,7 @@ class GatewayRunner:
agent_holder = [None] # Mutable container for the agent instance
result_holder = [None] # Mutable container for the result
tools_holder = [None] # Mutable container for the tool definitions
stream_consumer_holder = [None] # Mutable container for stream consumer
# Bridge sync step_callback → async hooks.emit for agent:step events
_loop_for_step = asyncio.get_event_loop()
@@ -4078,9 +4252,39 @@ class GatewayRunner:
honcho_manager, honcho_config = self._get_or_create_gateway_honcho(session_key)
reasoning_config = self._load_reasoning_config()
self._reasoning_config = reasoning_config
# Set up streaming consumer if enabled
_stream_consumer = None
_stream_delta_cb = None
_scfg = getattr(getattr(self, 'config', None), 'streaming', None)
if _scfg is None:
from gateway.config import StreamingConfig
_scfg = StreamingConfig()
if _scfg.enabled and _scfg.transport != "off":
try:
from gateway.stream_consumer import GatewayStreamConsumer, StreamConsumerConfig
_adapter = self.adapters.get(source.platform)
if _adapter:
_consumer_cfg = StreamConsumerConfig(
edit_interval=_scfg.edit_interval,
buffer_threshold=_scfg.buffer_threshold,
cursor=_scfg.cursor,
)
_stream_consumer = GatewayStreamConsumer(
adapter=_adapter,
chat_id=source.chat_id,
config=_consumer_cfg,
metadata={"thread_id": source.thread_id} if source.thread_id else None,
)
_stream_delta_cb = _stream_consumer.on_delta
stream_consumer_holder[0] = _stream_consumer
except Exception as _sc_err:
logger.debug("Could not set up stream consumer: %s", _sc_err)
turn_route = self._resolve_turn_agent_config(message, model, runtime_kwargs)
agent = AIAgent(
model=model,
**runtime_kwargs,
model=turn_route["model"],
**turn_route["runtime"],
max_iterations=max_iterations,
quiet_mode=True,
verbose_logging=False,
@@ -4097,6 +4301,7 @@ class GatewayRunner:
session_id=session_id,
tool_progress_callback=progress_callback if tool_progress_enabled else None,
step_callback=_step_callback_sync if _hooks_ref.loaded_hooks else None,
stream_delta_callback=_stream_delta_cb,
platform=platform_key,
honcho_session_key=session_key,
honcho_manager=honcho_manager,
@@ -4167,15 +4372,23 @@ class GatewayRunner:
result = agent.run_conversation(message, conversation_history=agent_history, task_id=session_id)
result_holder[0] = result
# Signal the stream consumer that the agent is done
if _stream_consumer is not None:
_stream_consumer.finish()
# Return final response, or a message if something went wrong
final_response = result.get("final_response")
# Extract last actual prompt token count from the agent's compressor
# Extract actual token counts from the agent instance used for this run
_last_prompt_toks = 0
_input_toks = 0
_output_toks = 0
_agent = agent_holder[0]
if _agent and hasattr(_agent, "context_compressor"):
_last_prompt_toks = getattr(_agent.context_compressor, "last_prompt_tokens", 0)
_input_toks = getattr(_agent, "session_prompt_tokens", 0)
_output_toks = getattr(_agent, "session_completion_tokens", 0)
_resolved_model = getattr(_agent, "model", None) if _agent else None
if not final_response:
@@ -4187,6 +4400,8 @@ class GatewayRunner:
"tools": tools_holder[0] or [],
"history_offset": len(agent_history),
"last_prompt_tokens": _last_prompt_toks,
"input_tokens": _input_toks,
"output_tokens": _output_toks,
"model": _resolved_model,
}
@@ -4250,6 +4465,8 @@ class GatewayRunner:
"tools": tools_holder[0] or [],
"history_offset": len(agent_history),
"last_prompt_tokens": _last_prompt_toks,
"input_tokens": _input_toks,
"output_tokens": _output_toks,
"model": _resolved_model,
"session_id": effective_session_id,
}
@@ -4258,6 +4475,20 @@ class GatewayRunner:
progress_task = None
if tool_progress_enabled:
progress_task = asyncio.create_task(send_progress_messages())
# Start stream consumer task — polls for consumer creation since it
# happens inside run_sync (thread pool) after the agent is constructed.
stream_task = None
async def _start_stream_consumer():
"""Wait for the stream consumer to be created, then run it."""
for _ in range(200): # Up to 10s wait
if stream_consumer_holder[0] is not None:
await stream_consumer_holder[0].run()
return
await asyncio.sleep(0.05)
stream_task = asyncio.create_task(_start_stream_consumer())
# Track this agent as running for this session (for interrupt support)
# We do this in a callback after the agent is created
@@ -4340,6 +4571,17 @@ class GatewayRunner:
if progress_task:
progress_task.cancel()
interrupt_monitor.cancel()
# Wait for stream consumer to finish its final edit
if stream_task:
try:
await asyncio.wait_for(stream_task, timeout=5.0)
except (asyncio.TimeoutError, asyncio.CancelledError):
stream_task.cancel()
try:
await stream_task
except asyncio.CancelledError:
pass
# Clean up tracking
tracking_task.cancel()
@@ -4353,6 +4595,12 @@ class GatewayRunner:
await task
except asyncio.CancelledError:
pass
# If streaming already delivered the response, mark it so the
# caller's send() is skipped (avoiding duplicate messages).
_sc = stream_consumer_holder[0]
if _sc and _sc.already_sent and isinstance(response, dict):
response["already_sent"] = True
return response
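
The cleanup in the hunk above gives the stream consumer task a bounded window to flush its final edit, then cancels it and waits for the cancellation to settle. A minimal sketch of that pattern follows. The helper name `drain_with_deadline` and the demo tasks are illustrative assumptions and do not appear in the diff.

```python
import asyncio


async def drain_with_deadline(task: asyncio.Task, timeout: float) -> str:
    """Give *task* up to *timeout* seconds to finish, then cancel it.

    Hypothetical helper; it mirrors the wait_for / cancel / await
    sequence applied to ``stream_task`` in the diff.
    """
    try:
        await asyncio.wait_for(task, timeout=timeout)
        return "finished"
    except (asyncio.TimeoutError, asyncio.CancelledError):
        # On timeout, wait_for has already cancelled the task, so this
        # cancel() is a no-op in that case.
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass
        return "cancelled"


async def _demo() -> tuple:
    fast = asyncio.create_task(asyncio.sleep(0))    # completes immediately
    slow = asyncio.create_task(asyncio.sleep(10))   # will exceed the deadline
    return (
        await drain_with_deadline(fast, timeout=1.0),
        await drain_with_deadline(slow, timeout=0.05),
    )


def run_demo() -> tuple:
    return asyncio.run(_demo())
```

Awaiting the task again after `cancel()` lets any cleanup inside the task run before the caller moves on.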
+108 -12
@@ -8,9 +8,11 @@ Handles:
- Dynamic system prompt injection (agent knows its context)
"""
import hashlib
import logging
import os
import json
import re
import uuid
from pathlib import Path
from datetime import datetime, timedelta
@@ -19,6 +21,41 @@ from typing import Dict, List, Optional, Any
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# PII redaction helpers
# ---------------------------------------------------------------------------
_PHONE_RE = re.compile(r"^\+?\d[\d\-\s]{6,}$")
def _hash_id(value: str) -> str:
"""Deterministic 12-char hex hash of an identifier."""
return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
def _hash_sender_id(value: str) -> str:
"""Hash a sender ID to ``user_<12hex>``."""
return f"user_{_hash_id(value)}"
def _hash_chat_id(value: str) -> str:
"""Hash the numeric portion of a chat ID, preserving platform prefix.
``telegram:12345`` -> ``telegram:<hash>``
``12345`` -> ``<hash>``
"""
colon = value.find(":")
if colon > 0:
prefix = value[:colon]
return f"{prefix}:{_hash_id(value[colon + 1:])}"
return _hash_id(value)
def _looks_like_phone(value: str) -> bool:
"""Return True if *value* looks like a phone number (E.164 or similar)."""
return bool(_PHONE_RE.match(value.strip()))
from .config import (
Platform,
GatewayConfig,
@@ -146,7 +183,21 @@ class SessionContext:
}
def build_session_context_prompt(context: SessionContext) -> str:
_PII_SAFE_PLATFORMS = frozenset({
Platform.WHATSAPP,
Platform.SIGNAL,
Platform.TELEGRAM,
})
"""Platforms where user IDs can be safely redacted (no in-message mention system
that requires raw IDs). Discord is excluded because mentions use ``<@user_id>``
and the LLM needs the real ID to tag users."""
def build_session_context_prompt(
context: SessionContext,
*,
redact_pii: bool = False,
) -> str:
"""
Build the dynamic system prompt section that tells the agent about its context.
@@ -154,7 +205,15 @@ def build_session_context_prompt(context: SessionContext) -> str:
- Where messages are coming from
- What platforms are connected
- Where it can deliver scheduled task outputs
When *redact_pii* is True **and** the source platform is in
``_PII_SAFE_PLATFORMS``, phone numbers are stripped and user/chat IDs
are replaced with deterministic hashes before being sent to the LLM.
Platforms like Discord are excluded because mentions need real IDs.
Routing still uses the original values (they stay in SessionSource).
"""
# Only apply redaction on platforms where IDs aren't needed for mentions
redact_pii = redact_pii and context.source.platform in _PII_SAFE_PLATFORMS
lines = [
"## Current Session Context",
"",
@@ -165,7 +224,25 @@ def build_session_context_prompt(context: SessionContext) -> str:
if context.source.platform == Platform.LOCAL:
lines.append(f"**Source:** {platform_name} (the machine running this agent)")
else:
lines.append(f"**Source:** {platform_name} ({context.source.description})")
# Build a description that respects PII redaction
src = context.source
if redact_pii:
# Build a safe description without raw IDs
_uname = src.user_name or (
_hash_sender_id(src.user_id) if src.user_id else "user"
)
_cname = src.chat_name or _hash_chat_id(src.chat_id)
if src.chat_type == "dm":
desc = f"DM with {_uname}"
elif src.chat_type == "group":
desc = f"group: {_cname}"
elif src.chat_type == "channel":
desc = f"channel: {_cname}"
else:
desc = _cname
else:
desc = src.description
lines.append(f"**Source:** {platform_name} ({desc})")
# Channel topic (if available - provides context about the channel's purpose)
if context.source.chat_topic:
@@ -175,7 +252,10 @@ def build_session_context_prompt(context: SessionContext) -> str:
if context.source.user_name:
lines.append(f"**User:** {context.source.user_name}")
elif context.source.user_id:
lines.append(f"**User ID:** {context.source.user_id}")
uid = context.source.user_id
if redact_pii:
uid = _hash_sender_id(uid)
lines.append(f"**User ID:** {uid}")
# Platform-specific behavioral notes
if context.source.platform == Platform.SLACK:
@@ -210,7 +290,8 @@ def build_session_context_prompt(context: SessionContext) -> str:
lines.append("")
lines.append("**Home Channels (default destinations):**")
for platform, home in context.home_channels.items():
lines.append(f" - {platform.value}: {home.name} (ID: {home.chat_id})")
hc_id = _hash_chat_id(home.chat_id) if redact_pii else home.chat_id
lines.append(f" - {platform.value}: {home.name} (ID: {hc_id})")
# Delivery options for scheduled tasks
lines.append("")
@@ -220,7 +301,10 @@ def build_session_context_prompt(context: SessionContext) -> str:
if context.source.platform == Platform.LOCAL:
lines.append("- `\"origin\"` → Local output (saved to files)")
else:
lines.append(f"- `\"origin\"` → Back to this chat ({context.source.chat_name or context.source.chat_id})")
_origin_label = context.source.chat_name or (
_hash_chat_id(context.source.chat_id) if redact_pii else context.source.chat_id
)
lines.append(f"- `\"origin\"` → Back to this chat ({_origin_label})")
# Local always available
lines.append("- `\"local\"` → Save to local files only (~/.hermes/cron/output/)")
@@ -315,7 +399,7 @@ class SessionEntry:
)
def build_session_key(source: SessionSource) -> str:
def build_session_key(source: SessionSource, group_sessions_per_user: bool = True) -> str:
"""Build a deterministic session key from a message source.
This is the single source of truth for session key construction.
@@ -328,7 +412,11 @@ def build_session_key(source: SessionSource) -> str:
Group/channel rules:
- chat_id identifies the parent group/channel.
- user_id/user_id_alt isolates participants within that parent chat when available and
``group_sessions_per_user`` is enabled.
- thread_id differentiates threads within that parent chat.
- Without participant identifiers, or when isolation is disabled, messages fall back to one
shared session per chat.
- Without identifiers, messages fall back to one session per platform/chat_type.
"""
platform = source.platform.value
@@ -340,13 +428,18 @@ def build_session_key(source: SessionSource) -> str:
if source.thread_id:
return f"agent:main:{platform}:dm:{source.thread_id}"
return f"agent:main:{platform}:dm"
participant_id = source.user_id_alt or source.user_id
key_parts = ["agent:main", platform, source.chat_type]
if source.chat_id:
if source.thread_id:
return f"agent:main:{platform}:{source.chat_type}:{source.chat_id}:{source.thread_id}"
return f"agent:main:{platform}:{source.chat_type}:{source.chat_id}"
key_parts.append(source.chat_id)
if source.thread_id:
return f"agent:main:{platform}:{source.chat_type}:{source.thread_id}"
return f"agent:main:{platform}:{source.chat_type}"
key_parts.append(source.thread_id)
if group_sessions_per_user and participant_id:
key_parts.append(str(participant_id))
return ":".join(key_parts)
class SessionStore:
@@ -425,7 +518,10 @@ class SessionStore:
def _generate_session_key(self, source: SessionSource) -> str:
"""Generate a session key from a source."""
return build_session_key(source)
return build_session_key(
source,
group_sessions_per_user=getattr(self.config, "group_sessions_per_user", True),
)
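
The changes to this file can be sketched in isolation: chat IDs are hashed while keeping any platform prefix, and group/channel session keys gain an optional per-participant suffix. The block below is a rough sketch under stated assumptions. `Source` is a hypothetical stand-in for `SessionSource`, `platform` is a plain string rather than the `Platform` enum, the DM branch is omitted, and the nesting of the `if` statements is inferred because the diff view lost its indentation.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def hash_chat_id(value: str) -> str:
    """Hash the ID portion of a chat ID, keeping any ``platform:`` prefix."""
    def _h(s: str) -> str:
        # Deterministic 12-char hex digest, as in _hash_id from the diff.
        return hashlib.sha256(s.encode("utf-8")).hexdigest()[:12]

    colon = value.find(":")
    if colon > 0:
        return f"{value[:colon]}:{_h(value[colon + 1:])}"
    return _h(value)


@dataclass
class Source:
    """Hypothetical stand-in for SessionSource (only the fields used here)."""
    platform: str
    chat_type: str
    chat_id: Optional[str] = None
    thread_id: Optional[str] = None
    user_id: Optional[str] = None
    user_id_alt: Optional[str] = None


def group_session_key(source: Source, group_sessions_per_user: bool = True) -> str:
    """Build the group/channel session key (DM handling is not shown)."""
    participant_id = source.user_id_alt or source.user_id
    parts = ["agent:main", source.platform, source.chat_type]
    if source.chat_id:
        parts.append(source.chat_id)
    if source.thread_id:
        parts.append(source.thread_id)
    # The participant suffix is added only when isolation is enabled
    # and an identifier is available.
    if group_sessions_per_user and participant_id:
        parts.append(str(participant_id))
    return ":".join(parts)
```

With isolation disabled, or with no participant identifier, every member of a chat maps to the same key, which matches the shared-session fallback described in the docstring.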
def _is_session_expired(self, entry: SessionEntry) -> bool:
"""Check if a session has expired based on its reset policy.
+24 -6
@@ -83,8 +83,7 @@ def _looks_like_gateway_process(pid: int) -> bool:
"""Return True when the live PID still looks like the Hermes gateway."""
cmdline = _read_process_cmdline(pid)
if not cmdline:
# If we cannot inspect the process, fall back to the liveness check.
return True
return False
patterns = (
"hermes_cli.main gateway",
@@ -94,6 +93,24 @@ def _looks_like_gateway_process(pid: int) -> bool:
return any(pattern in cmdline for pattern in patterns)
def _record_looks_like_gateway(record: dict[str, Any]) -> bool:
"""Validate gateway identity from PID-file metadata when cmdline is unavailable."""
if record.get("kind") != _GATEWAY_KIND:
return False
argv = record.get("argv")
if not isinstance(argv, list) or not argv:
return False
cmdline = " ".join(str(part) for part in argv)
patterns = (
"hermes_cli.main gateway",
"hermes gateway",
"gateway/run.py",
)
return any(pattern in cmdline for pattern in patterns)
def _build_pid_record() -> dict:
return {
"pid": os.getpid(),
@@ -178,8 +195,8 @@ def write_runtime_status(
payload = _read_json_file(path) or _build_runtime_status_record()
payload.setdefault("platforms", {})
payload.setdefault("kind", _GATEWAY_KIND)
payload.setdefault("pid", os.getpid())
payload.setdefault("start_time", _get_process_start_time(os.getpid()))
payload["pid"] = os.getpid()
payload["start_time"] = _get_process_start_time(os.getpid())
payload["updated_at"] = _utc_now_iso()
if gateway_state is not None:
@@ -325,8 +342,9 @@ def get_running_pid() -> Optional[int]:
return None
if not _looks_like_gateway_process(pid):
remove_pid_file()
return None
if not _record_looks_like_gateway(record):
remove_pid_file()
return None
return pid
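
To illustrate which PID-file records the metadata check accepts, here is a small sketch. The value of `GATEWAY_KIND` is an assumption (the diff references `_GATEWAY_KIND` without showing it), and the function is a standalone copy for demonstration rather than the module's own code.

```python
from typing import Any

GATEWAY_KIND = "hermes-gateway"  # assumed value; not shown in the diff
GATEWAY_PATTERNS = (
    "hermes_cli.main gateway",
    "hermes gateway",
    "gateway/run.py",
)


def record_looks_like_gateway(record: dict[str, Any]) -> bool:
    """Return True when the record's kind and argv both identify the gateway."""
    if record.get("kind") != GATEWAY_KIND:
        return False
    argv = record.get("argv")
    if not isinstance(argv, list) or not argv:
        return False
    # Join argv so patterns that span two arguments can match.
    cmdline = " ".join(str(part) for part in argv)
    return any(pattern in cmdline for pattern in GATEWAY_PATTERNS)


good = {"kind": GATEWAY_KIND, "argv": ["python", "-m", "hermes_cli.main", "gateway"]}
stale = {"kind": GATEWAY_KIND, "argv": ["sleep", "60"]}
```

Here `good` passes both checks, while `stale` carries the right kind but an unrelated command line and is rejected.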
+177
@@ -0,0 +1,177 @@
"""Gateway streaming consumer — bridges sync agent callbacks to async platform delivery.
The agent fires stream_delta_callback(text) synchronously from its worker thread.
GatewayStreamConsumer:
1. Receives deltas via on_delta() (thread-safe, sync)
2. Queues them to an asyncio task via queue.Queue
3. The async run() task buffers, rate-limits, and progressively edits
a single message on the target platform
Design: Uses the edit transport (send initial message, then editMessageText).
This is universally supported across Telegram, Discord, and Slack.
Credit: jobless0x (#774, #1312), OutThisLife (#798), clicksingh (#697).
"""
from __future__ import annotations
import asyncio
import logging
import queue
import time
from dataclasses import dataclass
from typing import Any, Optional
logger = logging.getLogger("gateway.stream_consumer")
# Sentinel to signal the stream is complete
_DONE = object()
@dataclass
class StreamConsumerConfig:
"""Runtime config for a single stream consumer instance."""
edit_interval: float = 0.3
buffer_threshold: int = 40
cursor: str = ""
class GatewayStreamConsumer:
"""Async consumer that progressively edits a platform message with streamed tokens.
Usage::
consumer = GatewayStreamConsumer(adapter, chat_id, config, metadata=metadata)
# Pass consumer.on_delta as stream_delta_callback to AIAgent
agent = AIAgent(..., stream_delta_callback=consumer.on_delta)
# Start the consumer as an asyncio task
task = asyncio.create_task(consumer.run())
# ... run agent in thread pool ...
consumer.finish() # signal completion
await task # wait for final edit
"""
def __init__(
self,
adapter: Any,
chat_id: str,
config: Optional[StreamConsumerConfig] = None,
metadata: Optional[dict] = None,
):
self.adapter = adapter
self.chat_id = chat_id
self.cfg = config or StreamConsumerConfig()
self.metadata = metadata
self._queue: queue.Queue = queue.Queue()
self._accumulated = ""
self._message_id: Optional[str] = None
self._already_sent = False
self._edit_supported = True # Disabled on first edit failure (Signal/Email/HA)
self._last_edit_time = 0.0
@property
def already_sent(self) -> bool:
"""True if at least one message was sent/edited — signals the base
adapter to skip re-sending the final response."""
return self._already_sent
def on_delta(self, text: str) -> None:
"""Thread-safe callback — called from the agent's worker thread."""
if text:
self._queue.put(text)
def finish(self) -> None:
"""Signal that the stream is complete."""
self._queue.put(_DONE)
async def run(self) -> None:
"""Async task that drains the queue and edits the platform message."""
try:
while True:
# Drain all available items from the queue
got_done = False
while True:
try:
item = self._queue.get_nowait()
if item is _DONE:
got_done = True
break
self._accumulated += item
except queue.Empty:
break
# Decide whether to flush an edit
now = time.monotonic()
elapsed = now - self._last_edit_time
should_edit = (
got_done
or (elapsed >= self.cfg.edit_interval
and len(self._accumulated) > 0)
or len(self._accumulated) >= self.cfg.buffer_threshold
)
if should_edit and self._accumulated:
display_text = self._accumulated
if not got_done:
display_text += self.cfg.cursor
await self._send_or_edit(display_text)
self._last_edit_time = time.monotonic()
if got_done:
# Final edit without cursor
if self._accumulated and self._message_id:
await self._send_or_edit(self._accumulated)
return
await asyncio.sleep(0.05) # Small yield to not busy-loop
except asyncio.CancelledError:
# Best-effort final edit on cancellation
if self._accumulated and self._message_id:
try:
await self._send_or_edit(self._accumulated)
except Exception:
pass
except Exception as e:
logger.error("Stream consumer error: %s", e)
async def _send_or_edit(self, text: str) -> None:
"""Send or edit the streaming message."""
try:
if self._message_id is not None:
if self._edit_supported:
# Edit existing message
result = await self.adapter.edit_message(
chat_id=self.chat_id,
message_id=self._message_id,
content=text,
)
if result.success:
self._already_sent = True
else:
# Edit not supported by this adapter — stop streaming,
# let the normal send path handle the final response.
# Without this guard, adapters like Signal/Email would
# flood the chat with a new message every edit_interval.
logger.debug("Edit failed, disabling streaming for this adapter")
self._edit_supported = False
else:
# Editing not supported — skip intermediate updates.
# The final response will be sent by the normal path.
pass
else:
# First message — send new
result = await self.adapter.send(
chat_id=self.chat_id,
content=text,
metadata=self.metadata,
)
if result.success and result.message_id:
self._message_id = result.message_id
self._already_sent = True
else:
# Initial send failed — disable streaming for this session
self._edit_supported = False
except Exception as e:
logger.error("Stream send/edit error: %s", e)
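
The core of this file is a bridge from a synchronous producer thread to an asyncio consumer through `queue.Queue`. The sketch below reduces it to that mechanism. `MiniStreamConsumer` and the `sink` callback are hypothetical simplifications: there is no platform adapter, rate limiting, cursor, or edit fallback.

```python
import asyncio
import queue

_DONE = object()  # sentinel marking the end of the stream


class MiniStreamConsumer:
    """Hypothetical, stripped-down version of the consumer in the diff."""

    def __init__(self, sink, poll_interval: float = 0.01):
        self._queue: queue.Queue = queue.Queue()
        self._sink = sink            # async callable taking the text so far
        self._poll = poll_interval
        self.accumulated = ""

    def on_delta(self, text: str) -> None:
        """Thread-safe: called from the producer thread."""
        if text:
            self._queue.put(text)

    def finish(self) -> None:
        self._queue.put(_DONE)

    async def run(self) -> None:
        while True:
            got_done = False
            changed = False
            # Drain everything currently queued without blocking the loop.
            while True:
                try:
                    item = self._queue.get_nowait()
                except queue.Empty:
                    break
                if item is _DONE:
                    got_done = True
                    break
                self.accumulated += item
                changed = True
            if changed or (got_done and self.accumulated):
                await self._sink(self.accumulated)
            if got_done:
                return
            await asyncio.sleep(self._poll)


def demo():
    snapshots = []

    async def sink(text: str) -> None:
        snapshots.append(text)

    async def main() -> str:
        consumer = MiniStreamConsumer(sink)
        task = asyncio.create_task(consumer.run())

        def worker() -> None:  # stands in for the agent's worker thread
            for token in ["Hel", "lo", ", ", "world"]:
                consumer.on_delta(token)
            consumer.finish()

        await asyncio.to_thread(worker)
        await task
        return consumer.accumulated

    return asyncio.run(main()), snapshots
```

Polling with `get_nowait()` plus a short sleep keeps the event loop responsive. A blocking `Queue.get()` inside the coroutine would stall every other task on the loop.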
+2 -2
@@ -11,5 +11,5 @@ Provides subcommands for:
- hermes cron - Manage cron jobs
"""
__version__ = "0.2.0"
__release_date__ = "2026.3.12"
__version__ = "0.3.0"
__release_date__ = "2026.3.17"
+17
@@ -147,6 +147,22 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
api_key_env_vars=("MINIMAX_CN_API_KEY",),
base_url_env_var="MINIMAX_CN_BASE_URL",
),
"deepseek": ProviderConfig(
id="deepseek",
name="DeepSeek",
auth_type="api_key",
inference_base_url="https://api.deepseek.com/v1",
api_key_env_vars=("DEEPSEEK_API_KEY",),
base_url_env_var="DEEPSEEK_BASE_URL",
),
"ai-gateway": ProviderConfig(
id="ai-gateway",
name="AI Gateway",
auth_type="api_key",
inference_base_url="https://ai-gateway.vercel.sh/v1",
api_key_env_vars=("AI_GATEWAY_API_KEY",),
base_url_env_var="AI_GATEWAY_BASE_URL",
),
}
@@ -524,6 +540,7 @@ def resolve_provider(
"kimi": "kimi-coding", "moonshot": "kimi-coding",
"minimax-china": "minimax-cn", "minimax_cn": "minimax-cn",
"claude": "anthropic", "claude-code": "anthropic",
"aigateway": "ai-gateway", "vercel": "ai-gateway", "vercel-ai-gateway": "ai-gateway",
}
normalized = _PROVIDER_ALIASES.get(normalized, normalized)
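
The alias table resolves user-facing spellings to a canonical provider ID before the registry lookup. A minimal sketch, assuming the input is lower-cased and stripped first (that step is not visible in this hunk) and using only a subset of the aliases:

```python
PROVIDER_ALIASES = {
    "aigateway": "ai-gateway",
    "vercel": "ai-gateway",
    "vercel-ai-gateway": "ai-gateway",
    "claude": "anthropic",
}


def normalize_provider(name: str) -> str:
    """Map an alias to its canonical provider ID; unknown names pass through."""
    normalized = name.strip().lower()  # assumed pre-normalization step
    return PROVIDER_ALIASES.get(normalized, normalized)
```

Names that are already canonical, such as `deepseek`, are returned unchanged because `dict.get` falls back to its second argument.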
+513 -52
@@ -1,77 +1,296 @@
"""Slash command definitions and autocomplete for the Hermes CLI.
Contains the shared built-in ``COMMANDS`` dict and ``SlashCommandCompleter``.
The completer can optionally include dynamic skill slash commands supplied by the
interactive CLI.
Central registry for all slash commands. Every consumer -- CLI help, gateway
dispatch, Telegram BotCommands, Slack subcommand mapping, autocomplete --
derives its data from ``COMMAND_REGISTRY``.
To add a command: add a ``CommandDef`` entry to ``COMMAND_REGISTRY``.
To add an alias: set ``aliases=("short",)`` on the existing ``CommandDef``.
"""
from __future__ import annotations
import os
import re
from collections.abc import Callable, Mapping
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any
from prompt_toolkit.auto_suggest import AutoSuggest, Suggestion
from prompt_toolkit.completion import Completer, Completion
# Commands organized by category for better help display
COMMANDS_BY_CATEGORY = {
"Session": {
"/new": "Start a new session (fresh session ID + history)",
"/reset": "Start a new session (alias for /new)",
"/clear": "Clear screen and start a new session",
"/history": "Show conversation history",
"/save": "Save the current conversation",
"/retry": "Retry the last message (resend to agent)",
"/undo": "Remove the last user/assistant exchange",
"/title": "Set a title for the current session (usage: /title My Session Name)",
"/compress": "Manually compress conversation context (flush memories + summarize)",
"/rollback": "List or restore filesystem checkpoints (usage: /rollback [number])",
"/background": "Run a prompt in the background (usage: /background <prompt>)",
},
"Configuration": {
"/config": "Show current configuration",
"/model": "Show or change the current model",
"/provider": "Show available providers and current provider",
"/prompt": "View/set custom system prompt",
"/personality": "Set a predefined personality",
"/verbose": "Cycle tool progress display: off → new → all → verbose",
"/reasoning": "Manage reasoning effort and display (usage: /reasoning [level|show|hide])",
"/skin": "Show or change the display skin/theme",
"/voice": "Toggle voice mode (Ctrl+B to record). Usage: /voice [on|off|tts|status]",
},
"Tools & Skills": {
"/tools": "List available tools",
"/toolsets": "List available toolsets",
"/skills": "Search, install, inspect, or manage skills from online registries",
"/cron": "Manage scheduled tasks (list, add/create, edit, pause, resume, run, remove)",
"/reload-mcp": "Reload MCP servers from config.yaml",
},
"Info": {
"/help": "Show this help message",
"/usage": "Show token usage for the current session",
"/insights": "Show usage insights and analytics (last 30 days)",
"/platforms": "Show gateway/messaging platform status",
"/paste": "Check clipboard for an image and attach it",
},
"Exit": {
"/quit": "Exit the CLI (also: /exit, /q)",
},
}
# ---------------------------------------------------------------------------
# CommandDef dataclass
# ---------------------------------------------------------------------------
# Flat dict for backwards compatibility and autocomplete
COMMANDS = {}
for category_commands in COMMANDS_BY_CATEGORY.values():
COMMANDS.update(category_commands)
@dataclass(frozen=True)
class CommandDef:
"""Definition of a single slash command."""
name: str # canonical name without slash: "background"
description: str # human-readable description
category: str # "Session", "Configuration", etc.
aliases: tuple[str, ...] = () # alternative names: ("bg",)
args_hint: str = "" # argument placeholder: "<prompt>", "[name]"
subcommands: tuple[str, ...] = () # tab-completable subcommands
cli_only: bool = False # only available in CLI
gateway_only: bool = False # only available in gateway/messaging
# ---------------------------------------------------------------------------
# Central registry -- single source of truth
# ---------------------------------------------------------------------------
COMMAND_REGISTRY: list[CommandDef] = [
# Session
CommandDef("new", "Start a new session (fresh session ID + history)", "Session",
aliases=("reset",)),
CommandDef("clear", "Clear screen and start a new session", "Session",
cli_only=True),
CommandDef("history", "Show conversation history", "Session",
cli_only=True),
CommandDef("save", "Save the current conversation", "Session",
cli_only=True),
CommandDef("retry", "Retry the last message (resend to agent)", "Session"),
CommandDef("undo", "Remove the last user/assistant exchange", "Session"),
CommandDef("title", "Set a title for the current session", "Session",
args_hint="[name]"),
CommandDef("compress", "Manually compress conversation context", "Session"),
CommandDef("rollback", "List or restore filesystem checkpoints", "Session",
args_hint="[number]"),
CommandDef("stop", "Kill all running background processes", "Session"),
CommandDef("background", "Run a prompt in the background", "Session",
aliases=("bg",), args_hint="<prompt>"),
CommandDef("status", "Show session info", "Session",
gateway_only=True),
CommandDef("sethome", "Set this chat as the home channel", "Session",
gateway_only=True, aliases=("set-home",)),
CommandDef("resume", "Resume a previously-named session", "Session",
args_hint="[name]"),
# Configuration
CommandDef("config", "Show current configuration", "Configuration",
cli_only=True),
CommandDef("model", "Show or change the current model", "Configuration",
args_hint="[name]"),
CommandDef("provider", "Show available providers and current provider",
"Configuration"),
CommandDef("prompt", "View/set custom system prompt", "Configuration",
cli_only=True, args_hint="[text]", subcommands=("clear",)),
CommandDef("personality", "Set a predefined personality", "Configuration",
args_hint="[name]"),
CommandDef("verbose", "Cycle tool progress display: off -> new -> all -> verbose",
"Configuration", cli_only=True),
CommandDef("reasoning", "Manage reasoning effort and display", "Configuration",
args_hint="[level|show|hide]",
subcommands=("none", "low", "minimal", "medium", "high", "xhigh", "show", "hide", "on", "off")),
CommandDef("skin", "Show or change the display skin/theme", "Configuration",
cli_only=True, args_hint="[name]"),
CommandDef("voice", "Toggle voice mode", "Configuration",
args_hint="[on|off|tts|status]", subcommands=("on", "off", "tts", "status")),
# Tools & Skills
CommandDef("tools", "List available tools", "Tools & Skills",
cli_only=True),
CommandDef("toolsets", "List available toolsets", "Tools & Skills",
cli_only=True),
CommandDef("skills", "Search, install, inspect, or manage skills",
"Tools & Skills", cli_only=True,
subcommands=("search", "browse", "inspect", "install")),
CommandDef("cron", "Manage scheduled tasks", "Tools & Skills",
cli_only=True, args_hint="[subcommand]",
subcommands=("list", "add", "create", "edit", "pause", "resume", "run", "remove")),
CommandDef("reload-mcp", "Reload MCP servers from config", "Tools & Skills",
aliases=("reload_mcp",)),
CommandDef("plugins", "List installed plugins and their status",
"Tools & Skills", cli_only=True),
# Info
CommandDef("help", "Show available commands", "Info"),
CommandDef("usage", "Show token usage for the current session", "Info"),
CommandDef("insights", "Show usage insights and analytics", "Info",
args_hint="[days]"),
CommandDef("platforms", "Show gateway/messaging platform status", "Info",
cli_only=True, aliases=("gateway",)),
CommandDef("paste", "Check clipboard for an image and attach it", "Info",
cli_only=True),
CommandDef("update", "Update Hermes Agent to the latest version", "Info",
gateway_only=True),
# Exit
CommandDef("quit", "Exit the CLI", "Exit",
cli_only=True, aliases=("exit", "q")),
]
# ---------------------------------------------------------------------------
# Derived lookups -- rebuilt once at import time
# ---------------------------------------------------------------------------
def _build_command_lookup() -> dict[str, CommandDef]:
"""Map every name and alias to its CommandDef."""
lookup: dict[str, CommandDef] = {}
for cmd in COMMAND_REGISTRY:
lookup[cmd.name] = cmd
for alias in cmd.aliases:
lookup[alias] = cmd
return lookup
_COMMAND_LOOKUP: dict[str, CommandDef] = _build_command_lookup()
def resolve_command(name: str) -> CommandDef | None:
"""Resolve a command name or alias to its CommandDef.
Accepts names with or without the leading slash.
"""
return _COMMAND_LOOKUP.get(name.lower().lstrip("/"))
def _build_description(cmd: CommandDef) -> str:
"""Build a CLI-facing description string including usage hint."""
if cmd.args_hint:
return f"{cmd.description} (usage: /{cmd.name} {cmd.args_hint})"
return cmd.description
# Backwards-compatible flat dict: "/command" -> description
COMMANDS: dict[str, str] = {}
for _cmd in COMMAND_REGISTRY:
if not _cmd.gateway_only:
COMMANDS[f"/{_cmd.name}"] = _build_description(_cmd)
for _alias in _cmd.aliases:
COMMANDS[f"/{_alias}"] = f"{_cmd.description} (alias for /{_cmd.name})"
# Backwards-compatible categorized dict
COMMANDS_BY_CATEGORY: dict[str, dict[str, str]] = {}
for _cmd in COMMAND_REGISTRY:
if not _cmd.gateway_only:
_cat = COMMANDS_BY_CATEGORY.setdefault(_cmd.category, {})
_cat[f"/{_cmd.name}"] = COMMANDS[f"/{_cmd.name}"]
for _alias in _cmd.aliases:
_cat[f"/{_alias}"] = COMMANDS[f"/{_alias}"]
# Subcommands lookup: "/cmd" -> ["sub1", "sub2", ...]
SUBCOMMANDS: dict[str, list[str]] = {}
for _cmd in COMMAND_REGISTRY:
if _cmd.subcommands:
SUBCOMMANDS[f"/{_cmd.name}"] = list(_cmd.subcommands)
# Also extract subcommands hinted in args_hint via pipe-separated patterns
# e.g. args_hint="[on|off|tts|status]" for commands that don't have explicit subcommands.
# NOTE: If a command already has explicit subcommands, this fallback is skipped.
# Use the `subcommands` field on CommandDef for intentional tab-completable args.
_PIPE_SUBS_RE = re.compile(r"[a-z]+(?:\|[a-z]+)+")
for _cmd in COMMAND_REGISTRY:
key = f"/{_cmd.name}"
if key in SUBCOMMANDS or not _cmd.args_hint:
continue
m = _PIPE_SUBS_RE.search(_cmd.args_hint)
if m:
SUBCOMMANDS[key] = m.group(0).split("|")
# ---------------------------------------------------------------------------
# Gateway helpers
# ---------------------------------------------------------------------------
# Set of all command names + aliases recognized by the gateway
GATEWAY_KNOWN_COMMANDS: frozenset[str] = frozenset(
name
for cmd in COMMAND_REGISTRY
if not cmd.cli_only
for name in (cmd.name, *cmd.aliases)
)
def gateway_help_lines() -> list[str]:
"""Generate gateway help text lines from the registry."""
lines: list[str] = []
for cmd in COMMAND_REGISTRY:
if cmd.cli_only:
continue
args = f" {cmd.args_hint}" if cmd.args_hint else ""
alias_parts: list[str] = []
for a in cmd.aliases:
# Skip internal aliases like reload_mcp (underscore variant)
if a.replace("-", "_") == cmd.name.replace("-", "_") and a != cmd.name:
continue
alias_parts.append(f"`/{a}`")
alias_note = f" (alias: {', '.join(alias_parts)})" if alias_parts else ""
lines.append(f"`/{cmd.name}{args}` -- {cmd.description}{alias_note}")
return lines
def telegram_bot_commands() -> list[tuple[str, str]]:
"""Return (command_name, description) pairs for Telegram setMyCommands.
Telegram command names cannot contain hyphens, so they are replaced with
underscores. Aliases are skipped -- Telegram shows one menu entry per
canonical command.
"""
result: list[tuple[str, str]] = []
for cmd in COMMAND_REGISTRY:
if cmd.cli_only:
continue
tg_name = cmd.name.replace("-", "_")
result.append((tg_name, cmd.description))
return result
def slack_subcommand_map() -> dict[str, str]:
"""Return subcommand -> /command mapping for Slack /hermes handler.
Maps both canonical names and aliases so /hermes bg do stuff works
the same as /hermes background do stuff.
"""
mapping: dict[str, str] = {}
for cmd in COMMAND_REGISTRY:
if cmd.cli_only:
continue
mapping[cmd.name] = f"/{cmd.name}"
for alias in cmd.aliases:
mapping[alias] = f"/{alias}"
return mapping
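
The registry approach above can be tried on a two-entry list: every name and alias resolves to one `CommandDef`, and subcommands fall back to the pipe-separated pattern in `args_hint` when none are declared explicitly. This is a reduced sketch. `REGISTRY`, `LOOKUP`, `resolve`, and `subcommands_for` are illustrative names, and the `cli_only`/`gateway_only` flags are left out.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CommandDef:
    name: str
    description: str
    category: str
    aliases: tuple = ()
    args_hint: str = ""
    subcommands: tuple = ()


REGISTRY = [
    CommandDef("background", "Run a prompt in the background", "Session",
               aliases=("bg",), args_hint="<prompt>"),
    CommandDef("voice", "Toggle voice mode", "Configuration",
               args_hint="[on|off|tts|status]"),
]

# Every canonical name and alias points at the same CommandDef instance.
LOOKUP = {n: cmd for cmd in REGISTRY for n in (cmd.name, *cmd.aliases)}

PIPE_SUBS_RE = re.compile(r"[a-z]+(?:\|[a-z]+)+")


def resolve(name: str) -> Optional[CommandDef]:
    """Resolve a name or alias, with or without the leading slash."""
    return LOOKUP.get(name.lower().lstrip("/"))


def subcommands_for(cmd: CommandDef) -> list:
    """Prefer explicit subcommands; otherwise parse them from args_hint."""
    if cmd.subcommands:
        return list(cmd.subcommands)
    m = PIPE_SUBS_RE.search(cmd.args_hint)
    return m.group(0).split("|") if m else []
```

Because the fallback also matches placeholder hints such as `[level|show|hide]`, commands whose hint mixes placeholders with literal options are better served by an explicit `subcommands` tuple.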
# ---------------------------------------------------------------------------
# Autocomplete
# ---------------------------------------------------------------------------
class SlashCommandCompleter(Completer):
"""Autocomplete for built-in slash commands and optional skill commands."""
"""Autocomplete for built-in slash commands, subcommands, and skill commands."""
def __init__(
self,
skill_commands_provider: Callable[[], Mapping[str, dict[str, Any]]] | None = None,
model_completer_provider: Callable[[], dict[str, Any]] | None = None,
) -> None:
self._skill_commands_provider = skill_commands_provider
# model_completer_provider returns {"current_provider": str,
# "providers": {id: label, ...}, "models_for": callable(provider) -> list[str]}
self._model_completer_provider = model_completer_provider
self._model_info_cache: dict[str, Any] | None = None
self._model_info_cache_time: float = 0
def _get_model_info(self) -> dict[str, Any]:
"""Get cached model/provider info for /model autocomplete."""
import time
now = time.monotonic()
if self._model_info_cache is not None and now - self._model_info_cache_time < 60:
return self._model_info_cache
if self._model_completer_provider is None:
return {}
try:
self._model_info_cache = self._model_completer_provider() or {}
self._model_info_cache_time = now
except Exception:
self._model_info_cache = self._model_info_cache or {}
return self._model_info_cache
def _iter_skill_commands(self) -> Mapping[str, dict[str, Any]]:
if self._skill_commands_provider is None:
@@ -92,9 +311,152 @@ class SlashCommandCompleter(Completer):
"""
return f"{cmd_name} " if cmd_name == word else cmd_name
@staticmethod
def _extract_path_word(text: str) -> str | None:
"""Extract the current word if it looks like a file path.
Returns the path-like token under the cursor, or None if the
current word doesn't look like a path. A word is path-like when
it starts with ``./``, ``../``, ``~/``, ``/``, or contains a
``/`` separator (e.g. ``src/main.py``).
"""
if not text:
return None
# Walk backwards to find the start of the current "word".
# Words are delimited by spaces, but paths can contain almost anything.
i = len(text) - 1
while i >= 0 and text[i] != " ":
i -= 1
word = text[i + 1:]
if not word:
return None
# Only trigger path completion for path-like tokens
if word.startswith(("./", "../", "~/", "/")) or "/" in word:
return word
return None
@staticmethod
def _path_completions(word: str, limit: int = 30):
"""Yield Completion objects for file paths matching *word*."""
expanded = os.path.expanduser(word)
# Split into directory part and prefix to match inside it
if expanded.endswith("/"):
search_dir = expanded
prefix = ""
else:
search_dir = os.path.dirname(expanded) or "."
prefix = os.path.basename(expanded)
try:
entries = os.listdir(search_dir)
except OSError:
return
count = 0
prefix_lower = prefix.lower()
for entry in sorted(entries):
if prefix and not entry.lower().startswith(prefix_lower):
continue
if count >= limit:
break
full_path = os.path.join(search_dir, entry)
is_dir = os.path.isdir(full_path)
# Build the completion text (what replaces the typed word)
if word.startswith("~"):
display_path = "~/" + os.path.relpath(full_path, os.path.expanduser("~"))
elif os.path.isabs(word):
display_path = full_path
else:
# Keep relative
display_path = os.path.relpath(full_path)
if is_dir:
display_path += "/"
suffix = "/" if is_dir else ""
meta = "dir" if is_dir else _file_size_label(full_path)
yield Completion(
display_path,
start_position=-len(word),
display=entry + suffix,
display_meta=meta,
)
count += 1
def get_completions(self, document, complete_event):
text = document.text_before_cursor
if not text.startswith("/"):
# Try file path completion for non-slash input
path_word = self._extract_path_word(text)
if path_word is not None:
yield from self._path_completions(path_word)
return
# Check if we're completing a subcommand (base command already typed)
parts = text.split(maxsplit=1)
base_cmd = parts[0].lower()
if len(parts) > 1 or (len(parts) == 1 and text.endswith(" ")):
sub_text = parts[1] if len(parts) > 1 else ""
sub_lower = sub_text.lower()
# /model gets two-stage completion:
# Stage 1: provider names (with : suffix)
# Stage 2: after "provider:", list that provider's models
if base_cmd == "/model" and " " not in sub_text:
info = self._get_model_info()
if info:
current_prov = info.get("current_provider", "")
providers = info.get("providers", {})
models_for = info.get("models_for")
if ":" in sub_text:
# Stage 2: "anthropic:cl" → models for anthropic
prov_part, model_part = sub_text.split(":", 1)
model_lower = model_part.lower()
if models_for:
try:
prov_models = models_for(prov_part)
except Exception:
prov_models = []
for mid in prov_models:
if mid.lower().startswith(model_lower) and mid.lower() != model_lower:
full = f"{prov_part}:{mid}"
yield Completion(
full,
start_position=-len(sub_text),
display=mid,
)
else:
# Stage 1: providers sorted: non-current first, current last
for pid, plabel in sorted(
providers.items(),
key=lambda kv: (kv[0] == current_prov, kv[0]),
):
display_name = f"{pid}:"
if display_name.lower().startswith(sub_lower):
meta = f"({plabel})" if plabel != pid else ""
if pid == current_prov:
meta = f"(current — {plabel})" if plabel != pid else "(current)"
yield Completion(
display_name,
start_position=-len(sub_text),
display=display_name,
display_meta=meta,
)
return
# Static subcommand completions
if " " not in sub_text and base_cmd in SUBCOMMANDS:
for sub in SUBCOMMANDS[base_cmd]:
if sub.startswith(sub_lower) and sub != sub_lower:
yield Completion(
sub,
start_position=-len(sub_text),
display=sub,
)
return
word = text[1:]
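Review note: the two-stage `/model` completion in this hunk is easier to follow without the prompt_toolkit plumbing. The sketch below is a minimal stand-alone version. The function name `complete_model_arg` and the sample provider and model names are hypothetical and do not appear in the diff; it returns plain strings rather than `Completion` objects.

```python
from typing import Callable, Iterable


def complete_model_arg(
    sub_text: str,
    providers: dict[str, str],
    models_for: Callable[[str], Iterable[str]],
    current_provider: str = "",
) -> list[str]:
    """Return replacement strings for the argument typed after ``/model ``."""
    if ":" in sub_text:
        # Stage 2: "provider:partial" -> full "provider:model" strings.
        provider, partial = sub_text.split(":", 1)
        partial = partial.lower()
        try:
            models = list(models_for(provider))
        except Exception:
            models = []  # an unknown provider simply yields no completions
        return [
            f"{provider}:{m}"
            for m in models
            if m.lower().startswith(partial) and m.lower() != partial
        ]
    # Stage 1: provider names with a trailing colon, current provider last.
    ordered = sorted(providers, key=lambda p: (p == current_provider, p))
    return [f"{p}:" for p in ordered if f"{p}:".lower().startswith(sub_text.lower())]
```

Sorting on the tuple `(p == current_provider, p)` works because `False` sorts before `True`, so the provider already in use ends up at the bottom of the list, where it is least likely to be picked by accident.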
@@ -120,3 +482,102 @@ class SlashCommandCompleter(Completer):
display=cmd,
display_meta=f"{short_desc}",
)
# ---------------------------------------------------------------------------
# Inline auto-suggest (ghost text) for slash commands
# ---------------------------------------------------------------------------
class SlashCommandAutoSuggest(AutoSuggest):
"""Inline ghost-text suggestions for slash commands and their subcommands.
Shows the rest of a command or subcommand in dim text as you type.
Falls back to history-based suggestions for non-slash input.
"""
def __init__(
self,
history_suggest: AutoSuggest | None = None,
completer: SlashCommandCompleter | None = None,
) -> None:
self._history = history_suggest
self._completer = completer # Reuse its model cache
def get_suggestion(self, buffer, document):
text = document.text_before_cursor
# Only suggest for slash commands
if not text.startswith("/"):
# Fall back to history for regular text
if self._history:
return self._history.get_suggestion(buffer, document)
return None
parts = text.split(maxsplit=1)
base_cmd = parts[0].lower()
if len(parts) == 1 and not text.endswith(" "):
# Still typing the command name: /upd → suggest "ate"
word = text[1:].lower()
for cmd in COMMANDS:
cmd_name = cmd[1:] # strip leading /
if cmd_name.startswith(word) and cmd_name != word:
return Suggestion(cmd_name[len(word):])
return None
# Command is complete — suggest subcommands or model names
sub_text = parts[1] if len(parts) > 1 else ""
sub_lower = sub_text.lower()
# /model gets two-stage ghost text
if base_cmd == "/model" and " " not in sub_text and self._completer:
info = self._completer._get_model_info()
if info:
providers = info.get("providers", {})
models_for = info.get("models_for")
current_prov = info.get("current_provider", "")
if ":" in sub_text:
# Stage 2: after provider:, suggest model
prov_part, model_part = sub_text.split(":", 1)
model_lower = model_part.lower()
if models_for:
try:
for mid in models_for(prov_part):
if mid.lower().startswith(model_lower) and mid.lower() != model_lower:
return Suggestion(mid[len(model_part):])
except Exception:
pass
else:
# Stage 1: suggest provider name with :
for pid in sorted(providers, key=lambda p: (p == current_prov, p)):
candidate = f"{pid}:"
if candidate.lower().startswith(sub_lower) and candidate.lower() != sub_lower:
return Suggestion(candidate[len(sub_text):])
# Static subcommands
if base_cmd in SUBCOMMANDS and SUBCOMMANDS[base_cmd]:
if " " not in sub_text:
for sub in SUBCOMMANDS[base_cmd]:
if sub.startswith(sub_lower) and sub != sub_lower:
return Suggestion(sub[len(sub_text):])
# Fall back to history
if self._history:
return self._history.get_suggestion(buffer, document)
return None
def _file_size_label(path: str) -> str:
"""Return a compact human-readable file size, or '' on error."""
try:
size = os.path.getsize(path)
except OSError:
return ""
if size < 1024:
return f"{size}B"
if size < 1024 * 1024:
return f"{size / 1024:.0f}K"
if size < 1024 * 1024 * 1024:
return f"{size / (1024 * 1024):.1f}M"
return f"{size / (1024 * 1024 * 1024):.1f}G"
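Review note: the ghost-text logic in `SlashCommandAutoSuggest.get_suggestion` uses the same rule in all three places (command names, providers, static subcommands): take the first candidate that strictly extends what was typed and return only the untyped remainder. A minimal sketch of that rule, assuming a hypothetical helper name `ghost_suffix` that is not part of the diff:

```python
from typing import Optional


def ghost_suffix(typed: str, candidates: list[str]) -> Optional[str]:
    """Return the untyped remainder of the first candidate that extends *typed*."""
    typed_lower = typed.lower()
    for candidate in candidates:
        lowered = candidate.lower()
        # Skip exact matches: there is nothing left to suggest for them.
        if lowered.startswith(typed_lower) and lowered != typed_lower:
            # Slice by the typed length so only the missing tail is shown.
            return candidate[len(typed):]
    return None
```

Because the first match wins, the order of the candidate list decides which suggestion appears; the diff relies on the ordering of `COMMANDS` and `SUBCOMMANDS` for this.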
+220 -4
@@ -25,6 +25,18 @@ from typing import Dict, Any, Optional, List, Tuple
_IS_WINDOWS = platform.system() == "Windows"
_ENV_VAR_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
# Env var names written to .env that aren't in OPTIONAL_ENV_VARS
# (managed by setup/provider flows directly).
_EXTRA_ENV_KEYS = frozenset({
"OPENAI_API_KEY", "OPENAI_BASE_URL",
"ANTHROPIC_API_KEY", "ANTHROPIC_TOKEN",
"AUXILIARY_VISION_MODEL",
"DISCORD_HOME_CHANNEL", "TELEGRAM_HOME_CHANNEL",
"SIGNAL_ACCOUNT", "SIGNAL_HTTP_URL",
"SIGNAL_ALLOWED_USERS", "SIGNAL_GROUP_ALLOWED_USERS",
"TERMINAL_ENV", "TERMINAL_SSH_KEY", "TERMINAL_SSH_PORT",
"WHATSAPP_MODE", "WHATSAPP_ENABLED",
})
import yaml
@@ -118,6 +130,14 @@ DEFAULT_CONFIG = {
# Each entry is "host_path:container_path" (standard Docker -v syntax).
# Example: ["/home/user/projects:/workspace/projects", "/data:/data"]
"docker_volumes": [],
# Explicit opt-in: mount the host cwd into /workspace for Docker sessions.
# Default off because passing host directories into a sandbox weakens isolation.
"docker_mount_cwd_to_workspace": False,
# Persistent shell — keep a long-lived bash shell across execute() calls
# so cwd/env vars/shell variables survive between commands.
# Enabled by default for non-local backends (SSH); local is always opt-in
# via TERMINAL_LOCAL_PERSISTENT env var.
"persistent_shell": True,
},
"browser": {
@@ -129,7 +149,7 @@ DEFAULT_CONFIG = {
# When enabled, the agent takes a snapshot of the working directory once per
# conversation turn (on first write_file/patch call). Use /rollback to restore.
"checkpoints": {
"enabled": False,
"enabled": True,
"max_snapshots": 50, # Max checkpoints to keep per directory
},
@@ -139,6 +159,12 @@ DEFAULT_CONFIG = {
"summary_model": "google/gemini-3-flash-preview",
"summary_provider": "auto",
},
"smart_model_routing": {
"enabled": False,
"max_simple_chars": 160,
"max_simple_words": 28,
"cheap_model": {},
},
# Auxiliary model config — provider:model for each side task.
# Format: provider is the provider name, model is the model slug.
@@ -177,6 +203,12 @@ DEFAULT_CONFIG = {
"base_url": "",
"api_key": "",
},
"approval": {
"provider": "auto",
"model": "", # fast/cheap model recommended (e.g. gemini-flash, haiku)
"base_url": "",
"api_key": "",
},
"mcp": {
"provider": "auto",
"model": "",
@@ -197,8 +229,15 @@ DEFAULT_CONFIG = {
"resume_display": "full",
"bell_on_complete": False,
"show_reasoning": False,
"streaming": False,
"show_cost": False, # Show $ cost in the status bar (off by default)
"skin": "default",
},
# Privacy settings
"privacy": {
"redact_pii": False, # When True, hash user IDs and strip phone numbers from LLM context
},
# Text-to-speech configuration
"tts": {
@@ -283,6 +322,14 @@ DEFAULT_CONFIG = {
"auto_thread": True, # Auto-create threads on @mention in channels (like Slack)
},
# Approval mode for dangerous commands:
# manual — always prompt the user (default)
# smart — use auxiliary LLM to auto-approve low-risk commands, prompt for high-risk
# off — skip all approval prompts (equivalent to --yolo)
"approvals": {
"mode": "manual",
},
# Permanently allowed dangerous command patterns (added via "always" approval)
"command_allowlist": [],
# User-defined quick commands that bypass the agent loop (type: exec only)
@@ -302,7 +349,7 @@ DEFAULT_CONFIG = {
},
# Config schema version - bump this when adding new required fields
"_config_version": 8,
"_config_version": 9,
}
# =============================================================================
@@ -424,6 +471,20 @@ OPTIONAL_ENV_VARS = {
"category": "provider",
"advanced": True,
},
"DEEPSEEK_API_KEY": {
"description": "DeepSeek API key for direct DeepSeek access",
"prompt": "DeepSeek API Key",
"url": "https://platform.deepseek.com/api_keys",
"password": True,
"category": "provider",
},
"DEEPSEEK_BASE_URL": {
"description": "Custom DeepSeek API base URL (advanced)",
"prompt": "DeepSeek Base URL",
"url": "",
"password": False,
"category": "provider",
},
# ── Tool API keys ──
"FIRECRAWL_API_KEY": {
@@ -458,6 +519,14 @@ OPTIONAL_ENV_VARS = {
"password": False,
"category": "tool",
},
"BROWSER_USE_API_KEY": {
"description": "Browser Use API key for cloud browser (optional — local browser works without this)",
"prompt": "Browser Use API key",
"url": "https://browser-use.com/",
"tools": ["browser_navigate", "browser_click"],
"password": True,
"category": "tool",
},
"FAL_KEY": {
"description": "FAL API key for image generation",
"prompt": "FAL API key",
@@ -716,7 +785,15 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
Dict with migration results: {"env_added": [...], "config_added": [...], "warnings": [...]}
"""
results = {"env_added": [], "config_added": [], "warnings": []}
# ── Always: sanitize .env (split concatenated keys) ──
try:
fixes = sanitize_env_file()
if fixes and not quiet:
print(f" ✓ Repaired .env file ({fixes} corrupted entries fixed)")
except Exception:
pass # best-effort; don't block migration on sanitize failure
# Check config version
current_ver, latest_ver = check_config_version()
@@ -759,6 +836,18 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
tz_display = config["timezone"] or "(server-local)"
print(f" ✓ Added timezone to config.yaml: {tz_display}")
# ── Version 8 → 9: clear ANTHROPIC_TOKEN from .env ──
# The new Anthropic auth flow no longer uses this env var.
if current_ver < 9:
try:
old_token = get_env_value("ANTHROPIC_TOKEN")
if old_token:
save_env_value("ANTHROPIC_TOKEN", "")
if not quiet:
print(" ✓ Cleared ANTHROPIC_TOKEN from .env (no longer used)")
except Exception:
pass
if current_ver < latest_ver and not quiet:
print(f"Config version: {current_ver} → {latest_ver}")
@@ -968,6 +1057,19 @@ _FALLBACK_COMMENT = """
# fallback_model:
# provider: openrouter
# model: anthropic/claude-sonnet-4
#
# ── Smart Model Routing ────────────────────────────────────────────────
# Optional cheap-vs-strong routing for simple turns.
# Keeps the primary model for complex work, but can route short/simple
# messages to a cheaper model across providers.
#
# smart_model_routing:
# enabled: true
# max_simple_chars: 160
# max_simple_words: 28
# cheap_model:
# provider: openrouter
# model: google/gemini-2.5-flash
"""
@@ -998,6 +1100,19 @@ _COMMENTED_SECTIONS = """
# fallback_model:
# provider: openrouter
# model: anthropic/claude-sonnet-4
#
# ── Smart Model Routing ────────────────────────────────────────────────
# Optional cheap-vs-strong routing for simple turns.
# Keeps the primary model for complex work, but can route short/simple
# messages to a cheaper model across providers.
#
# smart_model_routing:
# enabled: true
# max_simple_chars: 160
# max_simple_words: 28
# cheap_model:
# provider: openrouter
# model: google/gemini-2.5-flash
"""
@@ -1046,6 +1161,102 @@ def load_env() -> Dict[str, str]:
return env_vars
def _sanitize_env_lines(lines: list) -> list:
"""Fix corrupted .env lines before writing.
Handles two known corruption patterns:
1. Concatenated KEY=VALUE pairs on a single line (missing newline between
entries, e.g. ``ANTHROPIC_API_KEY=sk-...OPENAI_BASE_URL=https://...``).
2. Stale ``KEY=***`` placeholder entries left by incomplete setup runs.
Uses a known-keys set (OPTIONAL_ENV_VARS + _EXTRA_ENV_KEYS) so we only
split on real Hermes env var names, avoiding false positives from values
that happen to contain uppercase text with ``=``.
"""
# Build the known keys set lazily from OPTIONAL_ENV_VARS + extras.
# Done inside the function so OPTIONAL_ENV_VARS is guaranteed to be defined.
known_keys = set(OPTIONAL_ENV_VARS.keys()) | _EXTRA_ENV_KEYS
sanitized: list[str] = []
for line in lines:
raw = line.rstrip("\r\n")
stripped = raw.strip()
# Preserve blank lines and comments
if not stripped or stripped.startswith("#"):
sanitized.append(raw + "\n")
continue
# Detect concatenated KEY=VALUE pairs on one line.
# Search for known KEY= patterns at any position in the line.
split_positions = []
for key_name in known_keys:
needle = key_name + "="
idx = stripped.find(needle)
while idx >= 0:
split_positions.append(idx)
idx = stripped.find(needle, idx + len(needle))
if len(split_positions) > 1:
split_positions.sort()
# Deduplicate (shouldn't happen, but be safe)
split_positions = sorted(set(split_positions))
for i, pos in enumerate(split_positions):
end = split_positions[i + 1] if i + 1 < len(split_positions) else len(stripped)
part = stripped[pos:end].strip()
if part:
sanitized.append(part + "\n")
else:
sanitized.append(stripped + "\n")
return sanitized
def sanitize_env_file() -> int:
"""Read, sanitize, and rewrite ~/.hermes/.env in place.
Returns the number of lines that were fixed (concatenation splits +
placeholder removals). Returns 0 when no changes are needed.
"""
env_path = get_env_path()
if not env_path.exists():
return 0
read_kw = {"encoding": "utf-8", "errors": "replace"} if _IS_WINDOWS else {}
write_kw = {"encoding": "utf-8"} if _IS_WINDOWS else {}
with open(env_path, **read_kw) as f:
original_lines = f.readlines()
sanitized = _sanitize_env_lines(original_lines)
if sanitized == original_lines:
return 0
# Count fixes: difference in line count (from splits) + removed lines
fixes = abs(len(sanitized) - len(original_lines))
if fixes == 0:
# Lines changed content (e.g. *** removal) even if count is same
fixes = sum(1 for a, b in zip(original_lines, sanitized) if a != b)
fixes += abs(len(sanitized) - len(original_lines))
fd, tmp_path = tempfile.mkstemp(dir=str(env_path.parent), suffix=".tmp", prefix=".env_")
try:
with os.fdopen(fd, "w", **write_kw) as f:
f.writelines(sanitized)
f.flush()
os.fsync(f.fileno())
os.replace(tmp_path, env_path)
except BaseException:
try:
os.unlink(tmp_path)
except OSError:
pass
raise
_secure_file(env_path)
return fixes
def save_env_value(key: str, value: str):
"""Save or update a value in ~/.hermes/.env."""
if not _ENV_VAR_NAME_RE.match(key):
@@ -1063,6 +1274,8 @@ def save_env_value(key: str, value: str):
if env_path.exists():
with open(env_path, **read_kw) as f:
lines = f.readlines()
# Sanitize on every read: split concatenated keys, drop stale placeholders
lines = _sanitize_env_lines(lines)
# Find and update or append
found = False
@@ -1183,6 +1396,7 @@ def show_config():
("VOICE_TOOLS_OPENAI_KEY", "OpenAI (STT/TTS)"),
("FIRECRAWL_API_KEY", "Firecrawl"),
("BROWSERBASE_API_KEY", "Browserbase"),
("BROWSER_USE_API_KEY", "Browser Use"),
("FAL_KEY", "FAL"),
]
@@ -1329,7 +1543,7 @@ def set_config_value(key: str, value: str):
# Check if it's an API key (goes to .env)
api_keys = [
'OPENROUTER_API_KEY', 'OPENAI_API_KEY', 'ANTHROPIC_API_KEY', 'VOICE_TOOLS_OPENAI_KEY',
'FIRECRAWL_API_KEY', 'FIRECRAWL_API_URL', 'BROWSERBASE_API_KEY', 'BROWSERBASE_PROJECT_ID',
'FIRECRAWL_API_KEY', 'FIRECRAWL_API_URL', 'BROWSERBASE_API_KEY', 'BROWSERBASE_PROJECT_ID', 'BROWSER_USE_API_KEY',
'FAL_KEY', 'TELEGRAM_BOT_TOKEN', 'DISCORD_BOT_TOKEN',
'TERMINAL_SSH_HOST', 'TERMINAL_SSH_USER', 'TERMINAL_SSH_KEY',
'SUDO_PASSWORD', 'SLACK_BOT_TOKEN', 'SLACK_APP_TOKEN',
@@ -1388,9 +1602,11 @@ def set_config_value(key: str, value: str):
"terminal.singularity_image": "TERMINAL_SINGULARITY_IMAGE",
"terminal.modal_image": "TERMINAL_MODAL_IMAGE",
"terminal.daytona_image": "TERMINAL_DAYTONA_IMAGE",
"terminal.docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
"terminal.cwd": "TERMINAL_CWD",
"terminal.timeout": "TERMINAL_TIMEOUT",
"terminal.sandbox_dir": "TERMINAL_SANDBOX_DIR",
"terminal.persistent_shell": "TERMINAL_PERSISTENT_SHELL",
}
if key in _config_to_env_sync:
save_env_value(_config_to_env_sync[key], str(value))
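Review note: the splitting step of `_sanitize_env_lines` can be exercised in isolation. The sketch below handles a single line and takes the known-key set as a parameter; the name `split_concatenated` and the sample values are hypothetical. Like the code in the diff, it drops any text that precedes the first known `KEY=` when a split happens, and it does not drop `KEY=***` placeholders.

```python
def split_concatenated(line: str, known_keys: set[str]) -> list[str]:
    """Split one .env line wherever a known ``KEY=`` token starts."""
    stripped = line.strip()
    positions: set[int] = set()
    for key in known_keys:
        needle = key + "="
        idx = stripped.find(needle)
        while idx >= 0:
            positions.add(idx)
            idx = stripped.find(needle, idx + len(needle))
    if len(positions) <= 1:
        # Zero or one known key on the line: nothing to split.
        return [stripped]
    cuts = sorted(positions)
    bounds = cuts[1:] + [len(stripped)]
    parts = [stripped[start:end].strip() for start, end in zip(cuts, bounds)]
    return [part for part in parts if part]


# Example: two entries that lost the newline between them.
print(split_concatenated(
    "OPENAI_API_KEY=sk-abcOPENAI_BASE_URL=https://x\n",
    {"OPENAI_API_KEY", "OPENAI_BASE_URL"},
))
```

One caveat that applies to this approach in general: if one known key is a suffix of another, the shorter name also matches inside the longer one and produces an extra split position. Whether the real key set contains such a pair cannot be determined from this diff.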
+1
@@ -570,6 +570,7 @@ def run_doctor(args):
# MiniMax APIs don't support /models endpoint — https://github.com/NousResearch/hermes-agent/issues/811
("MiniMax", ("MINIMAX_API_KEY",), None, "MINIMAX_BASE_URL", False),
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), None, "MINIMAX_CN_BASE_URL", False),
("AI Gateway", ("AI_GATEWAY_API_KEY",), "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
]
for _pname, _env_vars, _default_url, _base_env, _supports_health_check in _apikey_providers:
_key = ""
+161 -24
@@ -119,17 +119,62 @@ def is_windows() -> bool:
# Service Configuration
# =============================================================================
SERVICE_NAME = "hermes-gateway"
_SERVICE_BASE = "hermes-gateway"
SERVICE_DESCRIPTION = "Hermes Agent Gateway - Messaging Platform Integration"
def get_service_name() -> str:
"""Derive a systemd service name scoped to this HERMES_HOME.
Default ``~/.hermes`` returns ``hermes-gateway`` (backward compatible).
Any other HERMES_HOME appends a short hash so multiple installations
can each have their own systemd service without conflicting.
"""
import hashlib
from pathlib import Path as _Path # local import to avoid monkeypatch interference
home = _Path(os.getenv("HERMES_HOME", _Path.home() / ".hermes")).resolve()
default = (_Path.home() / ".hermes").resolve()
if home == default:
return _SERVICE_BASE
suffix = hashlib.sha256(str(home).encode()).hexdigest()[:8]
return f"{_SERVICE_BASE}-{suffix}"
SERVICE_NAME = _SERVICE_BASE # backward-compat for external importers; prefer get_service_name()
def get_systemd_unit_path(system: bool = False) -> Path:
name = get_service_name()
if system:
return Path("/etc/systemd/system") / f"{SERVICE_NAME}.service"
return Path.home() / ".config" / "systemd" / "user" / f"{SERVICE_NAME}.service"
return Path("/etc/systemd/system") / f"{name}.service"
return Path.home() / ".config" / "systemd" / "user" / f"{name}.service"
def _ensure_user_systemd_env() -> None:
"""Ensure DBUS_SESSION_BUS_ADDRESS and XDG_RUNTIME_DIR are set for systemctl --user.
On headless servers (SSH sessions), these env vars may be missing even when
the user's systemd instance is running (via linger). Without them,
``systemctl --user`` fails with "Failed to connect to bus: No medium found".
We detect the standard socket path and set the vars so all subsequent
subprocess calls inherit them.
"""
uid = os.getuid()
if "XDG_RUNTIME_DIR" not in os.environ:
runtime_dir = f"/run/user/{uid}"
if Path(runtime_dir).exists():
os.environ["XDG_RUNTIME_DIR"] = runtime_dir
if "DBUS_SESSION_BUS_ADDRESS" not in os.environ:
xdg_runtime = os.environ.get("XDG_RUNTIME_DIR", f"/run/user/{uid}")
bus_path = Path(xdg_runtime) / "bus"
if bus_path.exists():
os.environ["DBUS_SESSION_BUS_ADDRESS"] = f"unix:path={bus_path}"
def _systemctl_cmd(system: bool = False) -> list[str]:
if not system:
_ensure_user_systemd_env()
return ["systemctl"] if system else ["systemctl", "--user"]
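Review note: the naming scheme in `get_service_name()` is deterministic, so the same HERMES_HOME always maps to the same unit name. The sketch below reproduces the derivation with both directories passed in explicitly, which makes it testable without touching the environment. The function name `scoped_service_name` and the example paths are hypothetical.

```python
import hashlib
from pathlib import Path


def scoped_service_name(home: Path, default_home: Path, base: str = "hermes-gateway") -> str:
    """Return *base* for the default home, else *base* plus an 8-char hash suffix."""
    home = home.resolve()
    if home == default_home.resolve():
        return base  # backward compatible name for the default install
    # Hash the resolved path so symlinked and direct paths get the same suffix.
    suffix = hashlib.sha256(str(home).encode()).hexdigest()[:8]
    return f"{base}-{suffix}"


default = Path("/tmp/home/.hermes")
print(scoped_service_name(default, default))
print(scoped_service_name(Path("/tmp/other/.hermes"), default))
```

Eight hex characters give 32 bits of suffix, which is ample for telling apart a handful of installations on one machine.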
@@ -350,8 +395,6 @@ def get_hermes_cli_path() -> str:
# =============================================================================
def generate_systemd_unit(system: bool = False, run_as_user: str | None = None) -> str:
import shutil
python_path = get_python_path()
working_dir = str(PROJECT_ROOT)
venv_dir = str(PROJECT_ROOT / "venv")
@@ -360,7 +403,8 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
# Build a PATH that includes the venv, node_modules, and standard system dirs
sane_path = f"{venv_bin}:{node_bin}:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
hermes_cli = shutil.which("hermes") or f"{python_path} -m hermes_cli.main"
hermes_home = str(Path(os.getenv("HERMES_HOME", Path.home() / ".hermes")).resolve())
if system:
username, group_name, home_dir = _system_service_identity(run_as_user)
@@ -380,11 +424,12 @@ Environment="USER={username}"
Environment="LOGNAME={username}"
Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Environment="HERMES_HOME={hermes_home}"
Restart=on-failure
RestartSec=10
KillMode=mixed
KillSignal=SIGTERM
TimeoutStopSec=15
TimeoutStopSec=60
StandardOutput=journal
StandardError=journal
@@ -399,15 +444,15 @@ After=network.target
[Service]
Type=simple
ExecStart={python_path} -m hermes_cli.main gateway run --replace
ExecStop={hermes_cli} gateway stop
WorkingDirectory={working_dir}
Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Environment="HERMES_HOME={hermes_home}"
Restart=on-failure
RestartSec=10
KillMode=mixed
KillSignal=SIGTERM
TimeoutStopSec=15
TimeoutStopSec=60
StandardOutput=journal
StandardError=journal
@@ -455,7 +500,7 @@ def _print_linger_enable_warning(username: str, detail: str | None = None) -> No
print(f" sudo loginctl enable-linger {username}")
print()
print(" Then restart the gateway:")
print(f" systemctl --user restart {SERVICE_NAME}.service")
print(f" systemctl --user restart {get_service_name()}.service")
print()
@@ -517,6 +562,12 @@ def systemd_install(force: bool = False, system: bool = False, run_as_user: str
scope_flag = " --system" if system else ""
if unit_path.exists() and not force:
if not systemd_unit_is_current(system=system):
print(f"↻ Repairing outdated {_service_scope_label(system)} systemd service at: {unit_path}")
refresh_systemd_unit_if_needed(system=system)
subprocess.run(_systemctl_cmd(system) + ["enable", get_service_name()], check=True)
print(f"{_service_scope_label(system).capitalize()} service definition updated")
return
print(f"Service already installed at: {unit_path}")
print("Use --force to reinstall")
return
@@ -526,7 +577,7 @@ def systemd_install(force: bool = False, system: bool = False, run_as_user: str
unit_path.write_text(generate_systemd_unit(system=system, run_as_user=run_as_user), encoding="utf-8")
subprocess.run(_systemctl_cmd(system) + ["daemon-reload"], check=True)
subprocess.run(_systemctl_cmd(system) + ["enable", SERVICE_NAME], check=True)
subprocess.run(_systemctl_cmd(system) + ["enable", get_service_name()], check=True)
print()
print(f"{_service_scope_label(system).capitalize()} service installed and enabled!")
@@ -534,7 +585,7 @@ def systemd_install(force: bool = False, system: bool = False, run_as_user: str
print("Next steps:")
print(f" {'sudo ' if system else ''}hermes gateway start{scope_flag} # Start the service")
print(f" {'sudo ' if system else ''}hermes gateway status{scope_flag} # Check status")
print(f" {'journalctl' if system else 'journalctl --user'} -u {SERVICE_NAME} -f # View logs")
print(f" {'journalctl' if system else 'journalctl --user'} -u {get_service_name()} -f # View logs")
print()
if system:
@@ -552,8 +603,8 @@ def systemd_uninstall(system: bool = False):
if system:
_require_root_for_system_service("uninstall")
subprocess.run(_systemctl_cmd(system) + ["stop", SERVICE_NAME], check=False)
subprocess.run(_systemctl_cmd(system) + ["disable", SERVICE_NAME], check=False)
subprocess.run(_systemctl_cmd(system) + ["stop", get_service_name()], check=False)
subprocess.run(_systemctl_cmd(system) + ["disable", get_service_name()], check=False)
unit_path = get_systemd_unit_path(system=system)
if unit_path.exists():
@@ -569,7 +620,7 @@ def systemd_start(system: bool = False):
if system:
_require_root_for_system_service("start")
refresh_systemd_unit_if_needed(system=system)
subprocess.run(_systemctl_cmd(system) + ["start", SERVICE_NAME], check=True)
subprocess.run(_systemctl_cmd(system) + ["start", get_service_name()], check=True)
print(f"{_service_scope_label(system).capitalize()} service started")
@@ -578,7 +629,7 @@ def systemd_stop(system: bool = False):
system = _select_systemd_scope(system)
if system:
_require_root_for_system_service("stop")
subprocess.run(_systemctl_cmd(system) + ["stop", SERVICE_NAME], check=True)
subprocess.run(_systemctl_cmd(system) + ["stop", get_service_name()], check=True)
print(f"{_service_scope_label(system).capitalize()} service stopped")
@@ -588,7 +639,7 @@ def systemd_restart(system: bool = False):
if system:
_require_root_for_system_service("restart")
refresh_systemd_unit_if_needed(system=system)
subprocess.run(_systemctl_cmd(system) + ["restart", SERVICE_NAME], check=True)
subprocess.run(_systemctl_cmd(system) + ["restart", get_service_name()], check=True)
print(f"{_service_scope_label(system).capitalize()} service restarted")
@@ -613,12 +664,12 @@ def systemd_status(deep: bool = False, system: bool = False):
print()
subprocess.run(
_systemctl_cmd(system) + ["status", SERVICE_NAME, "--no-pager"],
_systemctl_cmd(system) + ["status", get_service_name(), "--no-pager"],
capture_output=False,
)
result = subprocess.run(
_systemctl_cmd(system) + ["is-active", SERVICE_NAME],
_systemctl_cmd(system) + ["is-active", get_service_name()],
capture_output=True,
text=True,
)
@@ -657,7 +708,7 @@ def systemd_status(deep: bool = False, system: bool = False):
if deep:
print()
print("Recent logs:")
subprocess.run(_journalctl_cmd(system) + ["-u", SERVICE_NAME, "-n", "20", "--no-pager"])
subprocess.run(_journalctl_cmd(system) + ["-u", get_service_name(), "-n", "20", "--no-pager"])
# =============================================================================
@@ -684,6 +735,7 @@ def generate_launchd_plist() -> str:
<string>hermes_cli.main</string>
<string>gateway</string>
<string>run</string>
<string>--replace</string>
</array>
<key>WorkingDirectory</key>
@@ -707,10 +759,45 @@ def generate_launchd_plist() -> str:
</plist>
"""
def launchd_plist_is_current() -> bool:
"""Check if the installed launchd plist matches the currently generated one."""
plist_path = get_launchd_plist_path()
if not plist_path.exists():
return False
installed = plist_path.read_text(encoding="utf-8")
expected = generate_launchd_plist()
return _normalize_service_definition(installed) == _normalize_service_definition(expected)
def refresh_launchd_plist_if_needed() -> bool:
"""Rewrite the installed launchd plist when the generated definition has changed.
Unlike systemd, launchd picks up plist changes on the next ``launchctl stop``/
``launchctl start`` cycle; no daemon-reload is needed. We still unload/reload
to make launchd re-read the updated plist immediately.
"""
plist_path = get_launchd_plist_path()
if not plist_path.exists() or launchd_plist_is_current():
return False
plist_path.write_text(generate_launchd_plist(), encoding="utf-8")
# Unload/reload so launchd picks up the new definition
subprocess.run(["launchctl", "unload", str(plist_path)], check=False)
subprocess.run(["launchctl", "load", str(plist_path)], check=False)
print("↻ Updated gateway launchd service definition to match the current Hermes install")
return True
def launchd_install(force: bool = False):
plist_path = get_launchd_plist_path()
if plist_path.exists() and not force:
if not launchd_plist_is_current():
print(f"↻ Repairing outdated launchd service at: {plist_path}")
refresh_launchd_plist_if_needed()
print("✓ Service definition updated")
return
print(f"Service already installed at: {plist_path}")
print("Use --force to reinstall")
return
@@ -739,7 +826,16 @@ def launchd_uninstall():
print("✓ Service uninstalled")
def launchd_start():
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
refresh_launchd_plist_if_needed()
plist_path = get_launchd_plist_path()
try:
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
except subprocess.CalledProcessError as e:
if e.returncode != 3 or not plist_path.exists():
raise
print("↻ launchd job was unloaded; reloading service definition")
subprocess.run(["launchctl", "load", str(plist_path)], check=True)
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
print("✓ Service started")
def launchd_stop():
@@ -747,21 +843,36 @@ def launchd_stop():
print("✓ Service stopped")
def launchd_restart():
launchd_stop()
try:
launchd_stop()
except subprocess.CalledProcessError as e:
if e.returncode != 3:
raise
print("↻ launchd job was unloaded; skipping stop")
launchd_start()
def launchd_status(deep: bool = False):
plist_path = get_launchd_plist_path()
result = subprocess.run(
["launchctl", "list", "ai.hermes.gateway"],
capture_output=True,
text=True
)
print(f"Launchd plist: {plist_path}")
if launchd_plist_is_current():
print("✓ Service definition matches the current Hermes install")
else:
print("⚠ Service definition is stale relative to the current Hermes install")
print(" Run: hermes gateway start")
if result.returncode == 0:
print("✓ Gateway service is loaded")
print(result.stdout)
else:
print("✗ Gateway service is not loaded")
print(" Service definition exists locally but launchd has not loaded it.")
print(" Run: hermes gateway start")
if deep:
log_file = get_hermes_home() / "logs" / "gateway.log"
@@ -1118,7 +1229,7 @@ def _is_service_running() -> bool:
if user_unit_exists:
result = subprocess.run(
_systemctl_cmd(False) + ["is-active", SERVICE_NAME],
_systemctl_cmd(False) + ["is-active", get_service_name()],
capture_output=True, text=True
)
if result.stdout.strip() == "active":
@@ -1126,7 +1237,7 @@ def _is_service_running() -> bool:
if system_unit_exists:
result = subprocess.run(
_systemctl_cmd(True) + ["is-active", SERVICE_NAME],
_systemctl_cmd(True) + ["is-active", get_service_name()],
capture_output=True, text=True
)
if result.stdout.strip() == "active":
@@ -1477,14 +1588,17 @@ def gateway_command(args):
# Try service first, fall back to killing and restarting
service_available = False
system = getattr(args, 'system', False)
service_configured = False
if is_linux() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
service_configured = True
try:
systemd_restart(system=system)
service_available = True
except subprocess.CalledProcessError:
pass
elif is_macos() and get_launchd_plist_path().exists():
service_configured = True
try:
launchd_restart()
service_available = True
@@ -1492,6 +1606,29 @@ def gateway_command(args):
pass
if not service_available:
# systemd/launchd restart failed — check if linger is the issue
if is_linux():
linger_ok, _detail = get_systemd_linger_status()
if linger_ok is not True:
import getpass
_username = getpass.getuser()
print()
print("⚠ Cannot restart gateway as a service — linger is not enabled.")
print(" The gateway user service requires linger to function on headless servers.")
print()
print(f" Run: sudo loginctl enable-linger {_username}")
print()
print(" Then restart the gateway:")
print(" hermes gateway restart")
return
if service_configured:
print()
print("✗ Gateway service restart failed.")
print(" The service definition exists, but the service manager did not recover it.")
print(" Fix the service, then retry: hermes gateway start")
sys.exit(1)
# Manual restart: kill existing processes
killed = kill_gateway_processes()
if killed:
+156 -21
@@ -768,6 +768,7 @@ def cmd_model(args):
"kimi-coding": "Kimi / Moonshot",
"minimax": "MiniMax",
"minimax-cn": "MiniMax (China)",
"ai-gateway": "AI Gateway",
"custom": "Custom endpoint",
}
active_label = provider_labels.get(active, active)
@@ -787,6 +788,7 @@ def cmd_model(args):
("kimi-coding", "Kimi / Moonshot (Moonshot AI direct API)"),
("minimax", "MiniMax (global direct API)"),
("minimax-cn", "MiniMax China (domestic direct API)"),
("ai-gateway", "AI Gateway (Vercel — 200+ models, pay-per-use)"),
]
# Add user-defined custom providers from config.yaml
@@ -855,7 +857,7 @@ def cmd_model(args):
_model_flow_anthropic(config, current_model)
elif selected_provider == "kimi-coding":
_model_flow_kimi(config, current_model)
elif selected_provider in ("zai", "minimax", "minimax-cn"):
elif selected_provider in ("zai", "minimax", "minimax-cn", "ai-gateway"):
_model_flow_api_key_provider(config, selected_provider, current_model)
@@ -1112,8 +1114,32 @@ def _model_flow_custom(config):
effective_key = api_key or current_key
from hermes_cli.models import probe_api_models
probe = probe_api_models(effective_key, effective_url)
if probe.get("used_fallback") and probe.get("resolved_base_url"):
print(
f"Warning: endpoint verification worked at {probe['resolved_base_url']}/models, "
f"not the exact URL you entered. Saving the working base URL instead."
)
effective_url = probe["resolved_base_url"]
if base_url:
base_url = effective_url
elif probe.get("models") is not None:
print(
f"Verified endpoint via {probe.get('probed_url')} "
f"({len(probe.get('models') or [])} model(s) visible)"
)
else:
print(
f"Warning: could not verify this endpoint via {probe.get('probed_url')}. "
f"Hermes will still save it."
)
if probe.get("suggested_base_url"):
print(f" If this server expects /v1, try base URL: {probe['suggested_base_url']}")
if base_url:
save_env_value("OPENAI_BASE_URL", base_url)
save_env_value("OPENAI_BASE_URL", effective_url)
if api_key:
save_env_value("OPENAI_API_KEY", api_key)
@@ -2098,7 +2124,17 @@ def _restore_stashed_changes(
print(" Review `git diff` / `git status` if Hermes behaves unexpectedly.")
return True
def _invalidate_update_cache():
"""Delete the update-check cache so ``hermes --version`` doesn't
report a stale "commits behind" count after a successful update."""
try:
cache_file = Path(os.getenv(
"HERMES_HOME", Path.home() / ".hermes"
)) / ".update_check"
if cache_file.exists():
cache_file.unlink()
except Exception:
pass
def cmd_update(args):
"""Update Hermes Agent to the latest version."""
@@ -2171,6 +2207,7 @@ def cmd_update(args):
commit_count = int(result.stdout.strip())
if commit_count == 0:
_invalidate_update_cache()
print("✓ Already up to date!")
return
@@ -2191,6 +2228,8 @@ def cmd_update(args):
prompt_user=prompt_for_restore,
)
_invalidate_update_cache()
# Reinstall Python dependencies (prefer uv for speed, fall back to pip)
print("→ Updating Python dependencies...")
uv_bin = shutil.which("uv")
@@ -2277,26 +2316,121 @@ def cmd_update(args):
print()
print("✓ Update complete!")
# Auto-restart gateway if it's running as a systemd service
# Auto-restart gateway if it's running.
# Uses the PID file (scoped to HERMES_HOME) to find this
# installation's gateway — safe with multiple installations.
try:
check = subprocess.run(
["systemctl", "--user", "is-active", "hermes-gateway"],
capture_output=True, text=True, timeout=5,
from gateway.status import get_running_pid, remove_pid_file
from hermes_cli.gateway import (
get_service_name, get_launchd_plist_path, is_macos, is_linux,
refresh_launchd_plist_if_needed,
_ensure_user_systemd_env, get_systemd_linger_status,
)
if check.stdout.strip() == "active":
print()
print("→ Gateway service is running — restarting to pick up changes...")
restart = subprocess.run(
["systemctl", "--user", "restart", "hermes-gateway"],
capture_output=True, text=True, timeout=15,
import signal as _signal
_gw_service_name = get_service_name()
existing_pid = get_running_pid()
has_systemd_service = False
has_launchd_service = False
try:
_ensure_user_systemd_env()
check = subprocess.run(
["systemctl", "--user", "is-active", _gw_service_name],
capture_output=True, text=True, timeout=5,
)
if restart.returncode == 0:
print("✓ Gateway restarted.")
else:
print(f"⚠ Gateway restart failed: {restart.stderr.strip()}")
print(" Try manually: hermes gateway restart")
except (FileNotFoundError, subprocess.TimeoutExpired):
pass # No systemd (macOS, WSL1, etc.) — skip silently
has_systemd_service = check.stdout.strip() == "active"
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
# Check for macOS launchd service
if is_macos():
try:
plist_path = get_launchd_plist_path()
if plist_path.exists():
check = subprocess.run(
["launchctl", "list", "ai.hermes.gateway"],
capture_output=True, text=True, timeout=5,
)
has_launchd_service = check.returncode == 0
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
if existing_pid or has_systemd_service or has_launchd_service:
print()
# When a service manager is handling the gateway, let it
# manage the lifecycle — don't manually SIGTERM the PID
# (launchd KeepAlive would respawn immediately, causing races).
if has_systemd_service:
import time as _time
if existing_pid:
try:
os.kill(existing_pid, _signal.SIGTERM)
print(f"→ Stopped gateway process (PID {existing_pid})")
except ProcessLookupError:
pass
except PermissionError:
print(f"⚠ Permission denied killing gateway PID {existing_pid}")
remove_pid_file()
_time.sleep(1) # Brief pause for port/socket release
print("→ Restarting gateway service...")
restart = subprocess.run(
["systemctl", "--user", "restart", _gw_service_name],
capture_output=True, text=True, timeout=15,
)
if restart.returncode == 0:
print("✓ Gateway restarted.")
else:
print(f"⚠ Gateway restart failed: {restart.stderr.strip()}")
# Check if linger is the issue
if is_linux():
linger_ok, _detail = get_systemd_linger_status()
if linger_ok is not True:
import getpass
_username = getpass.getuser()
print()
print(" Linger must be enabled for the gateway user service to function.")
print(f" Run: sudo loginctl enable-linger {_username}")
print()
print(" Then restart the gateway:")
print(" hermes gateway restart")
else:
print(" Try manually: hermes gateway restart")
elif has_launchd_service:
# Refresh the plist first (picks up --replace and other
# changes from the update we just pulled).
refresh_launchd_plist_if_needed()
# Explicit stop+start — don't rely on KeepAlive respawn
# after a manual SIGTERM, which would race with the
# PID file cleanup.
print("→ Restarting gateway service...")
stop = subprocess.run(
["launchctl", "stop", "ai.hermes.gateway"],
capture_output=True, text=True, timeout=10,
)
start = subprocess.run(
["launchctl", "start", "ai.hermes.gateway"],
capture_output=True, text=True, timeout=10,
)
if start.returncode == 0:
print("✓ Gateway restarted via launchd.")
else:
print(f"⚠ Gateway restart failed: {start.stderr.strip()}")
print(" Try manually: hermes gateway restart")
elif existing_pid:
try:
os.kill(existing_pid, _signal.SIGTERM)
print(f"→ Stopped gateway process (PID {existing_pid})")
except ProcessLookupError:
pass # Already gone
except PermissionError:
print(f"⚠ Permission denied killing gateway PID {existing_pid}")
remove_pid_file()
print("  ℹ️ Gateway was running manually (not as a service).")
print(" Restart it with: hermes gateway run")
except Exception as e:
logger.debug("Gateway restart during update failed: %s", e)
print()
print("Tip: You can now select a provider and model:")
@@ -2859,7 +2993,8 @@ For more help on a command:
skills_install = skills_subparsers.add_parser("install", help="Install a skill")
skills_install.add_argument("identifier", help="Skill identifier (e.g. openai/skills/skill-creator)")
skills_install.add_argument("--category", default="", help="Category folder to install into")
skills_install.add_argument("--force", "--yes", "-y", dest="force", action="store_true", help="Install despite blocked scan verdict")
skills_install.add_argument("--force", action="store_true", help="Install despite blocked scan verdict")
skills_install.add_argument("--yes", "-y", action="store_true", help="Skip confirmation prompt (needed in TUI mode)")
skills_inspect = skills_subparsers.add_parser("inspect", help="Preview a skill without installing")
skills_inspect.add_argument("identifier", help="Skill identifier")
+292 -21
@@ -8,6 +8,7 @@ Add, remove, or reorder entries here — both `hermes setup` and
from __future__ import annotations
import json
import os
import urllib.request
import urllib.error
from difflib import get_close_matches
@@ -78,6 +79,24 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"claude-sonnet-4-20250514",
"claude-haiku-4-5-20251001",
],
"deepseek": [
"deepseek-chat",
"deepseek-reasoner",
],
"ai-gateway": [
"anthropic/claude-opus-4.6",
"anthropic/claude-sonnet-4.6",
"anthropic/claude-sonnet-4.5",
"anthropic/claude-haiku-4.5",
"openai/gpt-5",
"openai/gpt-4.1",
"openai/gpt-4.1-mini",
"google/gemini-3-pro-preview",
"google/gemini-3-flash",
"google/gemini-2.5-pro",
"google/gemini-2.5-flash",
"deepseek/deepseek-v3.2",
],
}
_PROVIDER_LABELS = {
@@ -89,6 +108,8 @@ _PROVIDER_LABELS = {
"minimax": "MiniMax",
"minimax-cn": "MiniMax (China)",
"anthropic": "Anthropic",
"deepseek": "DeepSeek",
"ai-gateway": "AI Gateway",
"custom": "Custom endpoint",
}
@@ -103,6 +124,10 @@ _PROVIDER_ALIASES = {
"minimax_cn": "minimax-cn",
"claude": "anthropic",
"claude-code": "anthropic",
"deep-seek": "deepseek",
"aigateway": "ai-gateway",
"vercel": "ai-gateway",
"vercel-ai-gateway": "ai-gateway",
}
@@ -137,6 +162,7 @@ def list_available_providers() -> list[dict[str, str]]:
_PROVIDER_ORDER = [
"openrouter", "nous", "openai-codex",
"zai", "kimi-coding", "minimax", "minimax-cn", "anthropic",
"ai-gateway", "deepseek", "custom",
]
# Build reverse alias map
aliases_for: dict[str, list[str]] = {}
@@ -150,9 +176,12 @@ def list_available_providers() -> list[dict[str, str]]:
# Check if this provider has credentials available
has_creds = False
try:
from hermes_cli.runtime_provider import resolve_runtime_provider
runtime = resolve_runtime_provider(requested=pid)
has_creds = bool(runtime.get("api_key"))
if pid == "custom":
has_creds = bool(_get_custom_base_url())
else:
from hermes_cli.runtime_provider import resolve_runtime_provider
runtime = resolve_runtime_provider(requested=pid)
has_creds = bool(runtime.get("api_key"))
except Exception:
pass
result.append({
@@ -191,6 +220,19 @@ def parse_model_input(raw: str, current_provider: str) -> tuple[str, str]:
return (current_provider, stripped)
def _get_custom_base_url() -> str:
"""Get the custom endpoint base_url from config.yaml."""
try:
from hermes_cli.config import load_config
config = load_config()
model_cfg = config.get("model", {})
if isinstance(model_cfg, dict):
return str(model_cfg.get("base_url", "")).strip()
except Exception:
pass
return ""
def curated_models_for_provider(provider: Optional[str]) -> list[tuple[str, str]]:
"""Return ``(model_id, description)`` tuples for a provider's model list.
@@ -212,6 +254,111 @@ def curated_models_for_provider(provider: Optional[str]) -> list[tuple[str, str]
return [(m, "") for m in models]
def detect_provider_for_model(
model_name: str,
current_provider: str,
) -> Optional[tuple[str, str]]:
"""Auto-detect the best provider for a model name.
Returns ``(provider_id, model_name)``; the model name may be remapped
(e.g. bare ``deepseek-chat`` -> ``deepseek/deepseek-chat`` for OpenRouter).
Returns ``None`` when no confident match is found.
Priority:
1. Direct provider with credentials (highest)
2. Direct provider without credentials -> remap to OpenRouter slug
3. OpenRouter catalog match
"""
name = (model_name or "").strip()
if not name:
return None
name_lower = name.lower()
# Aggregators list other providers' models — never auto-switch TO them
_AGGREGATORS = {"nous", "openrouter"}
# If the model belongs to the current provider's catalog, don't suggest switching
current_models = _PROVIDER_MODELS.get(current_provider, [])
if any(name_lower == m.lower() for m in current_models):
return None
# --- Step 1: check static provider catalogs for a direct match ---
direct_match: Optional[str] = None
for pid, models in _PROVIDER_MODELS.items():
if pid == current_provider or pid in _AGGREGATORS:
continue
if any(name_lower == m.lower() for m in models):
direct_match = pid
break
if direct_match:
# Check if we have credentials for this provider
has_creds = False
try:
from hermes_cli.auth import PROVIDER_REGISTRY
pconfig = PROVIDER_REGISTRY.get(direct_match)
if pconfig:
import os
for env_var in pconfig.api_key_env_vars:
if os.getenv(env_var, "").strip():
has_creds = True
break
except Exception:
pass
if has_creds:
return (direct_match, name)
# No direct creds — try to find this model on OpenRouter instead
or_slug = _find_openrouter_slug(name)
if or_slug:
return ("openrouter", or_slug)
# Still return the direct provider — credential resolution will
# give a clear error rather than silently using the wrong provider
return (direct_match, name)
# --- Step 2: check OpenRouter catalog ---
# First try exact match (handles provider/model format)
or_slug = _find_openrouter_slug(name)
if or_slug:
if current_provider != "openrouter":
return ("openrouter", or_slug)
# Already on openrouter, just return the resolved slug
if or_slug != name:
return ("openrouter", or_slug)
return None # already on openrouter with matching name
return None
def _find_openrouter_slug(model_name: str) -> Optional[str]:
"""Find the full OpenRouter model slug for a bare or partial model name.
Handles:
- Exact match: ``anthropic/claude-opus-4.6`` -> as-is
- Bare name: ``deepseek-chat`` -> ``deepseek/deepseek-chat``
- Bare name: ``claude-opus-4.6`` -> ``anthropic/claude-opus-4.6``
"""
name_lower = model_name.strip().lower()
if not name_lower:
return None
# Exact match (already has provider/ prefix)
for mid, _ in OPENROUTER_MODELS:
if name_lower == mid.lower():
return mid
# Try matching just the model part (after the /)
for mid, _ in OPENROUTER_MODELS:
if "/" in mid:
_, model_part = mid.split("/", 1)
if name_lower == model_part.lower():
return mid
return None
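A usage-oriented sketch of the two-pass lookup above: exact slug first, then the part after the `/`. The `CATALOG` entries and the `find_slug` name are hypothetical stand-ins, since the real `OPENROUTER_MODELS` list is not shown in this diff.

```python
from typing import List, Optional, Tuple

# Hypothetical catalog of (slug, description) pairs for illustration only.
CATALOG: List[Tuple[str, str]] = [
    ("anthropic/claude-opus-4.6", ""),
    ("deepseek/deepseek-chat", ""),
]


def find_slug(name: str, catalog: List[Tuple[str, str]] = CATALOG) -> Optional[str]:
    """Resolve a full or bare model name to a catalog slug, case-insensitively."""
    wanted = name.strip().lower()
    if not wanted:
        return None
    # Pass 1: the caller already supplied a full provider/model slug.
    for slug, _ in catalog:
        if wanted == slug.lower():
            return slug
    # Pass 2: match only the model part after the first "/".
    for slug, _ in catalog:
        if "/" in slug and wanted == slug.split("/", 1)[1].lower():
            return slug
    return None


print(find_slug("deepseek-chat"))
print(find_slug("Anthropic/Claude-Opus-4.6"))
print(find_slug("unknown-model"))
```

Running the exact pass first means a full slug can never be shadowed by a bare-name match from another vendor.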
def normalize_provider(provider: Optional[str]) -> str:
"""Normalize provider aliases to Hermes' canonical provider ids.
@@ -261,6 +408,22 @@ def provider_model_ids(provider: Optional[str]) -> list[str]:
live = _fetch_anthropic_models()
if live:
return live
if normalized == "ai-gateway":
live = _fetch_ai_gateway_models()
if live:
return live
if normalized == "custom":
base_url = _get_custom_base_url()
if base_url:
# Try common API key env vars for custom endpoints
api_key = (
os.getenv("CUSTOM_API_KEY", "")
or os.getenv("OPENAI_API_KEY", "")
or os.getenv("OPENROUTER_API_KEY", "")
)
live = fetch_api_models(api_key, base_url)
if live:
return live
return list(_PROVIDER_MODELS.get(normalized, []))
@@ -308,6 +471,89 @@ def _fetch_anthropic_models(timeout: float = 5.0) -> Optional[list[str]]:
return None
def probe_api_models(
api_key: Optional[str],
base_url: Optional[str],
timeout: float = 5.0,
) -> dict[str, Any]:
"""Probe an OpenAI-compatible ``/models`` endpoint with light URL heuristics."""
normalized = (base_url or "").strip().rstrip("/")
if not normalized:
return {
"models": None,
"probed_url": None,
"resolved_base_url": "",
"suggested_base_url": None,
"used_fallback": False,
}
if normalized.endswith("/v1"):
alternate_base = normalized[:-3].rstrip("/")
else:
alternate_base = normalized + "/v1"
candidates: list[tuple[str, bool]] = [(normalized, False)]
if alternate_base and alternate_base != normalized:
candidates.append((alternate_base, True))
tried: list[str] = []
headers: dict[str, str] = {}
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
for candidate_base, is_fallback in candidates:
url = candidate_base.rstrip("/") + "/models"
tried.append(url)
req = urllib.request.Request(url, headers=headers)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
data = json.loads(resp.read().decode())
return {
"models": [m.get("id", "") for m in data.get("data", [])],
"probed_url": url,
"resolved_base_url": candidate_base.rstrip("/"),
"suggested_base_url": alternate_base if alternate_base != candidate_base else normalized,
"used_fallback": is_fallback,
}
except Exception:
continue
return {
"models": None,
"probed_url": tried[-1] if tried else normalized.rstrip("/") + "/models",
"resolved_base_url": normalized,
"suggested_base_url": alternate_base if alternate_base != normalized else None,
"used_fallback": False,
}
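The URL heuristic inside `probe_api_models()` is easier to reason about when separated from the network call. The sketch below extracts only the candidate-list step; `candidate_bases` is a hypothetical helper name, not a function in the codebase.

```python
from typing import List, Tuple


def candidate_bases(base_url: str) -> List[Tuple[str, bool]]:
    """Return (base_url, is_fallback) pairs in the order they would be probed."""
    normalized = base_url.strip().rstrip("/")
    if not normalized:
        return []
    # Toggle the "/v1" suffix to get the alternate base URL.
    if normalized.endswith("/v1"):
        alternate = normalized[:-3].rstrip("/")
    else:
        alternate = normalized + "/v1"
    candidates = [(normalized, False)]
    if alternate and alternate != normalized:
        candidates.append((alternate, True))
    return candidates


for base, is_fallback in candidate_bases("http://localhost:8000/v1/"):
    print(base + "/models", "(fallback)" if is_fallback else "(as entered)")
```

The URL the user entered is always tried first, so the fallback only matters when that exact URL does not answer.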
def _fetch_ai_gateway_models(timeout: float = 5.0) -> Optional[list[str]]:
"""Fetch available language models with tool-use from AI Gateway."""
api_key = os.getenv("AI_GATEWAY_API_KEY", "").strip()
if not api_key:
return None
base_url = os.getenv("AI_GATEWAY_BASE_URL", "").strip()
if not base_url:
from hermes_constants import AI_GATEWAY_BASE_URL
base_url = AI_GATEWAY_BASE_URL
url = base_url.rstrip("/") + "/models"
headers: dict[str, str] = {"Authorization": f"Bearer {api_key}"}
req = urllib.request.Request(url, headers=headers)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
data = json.loads(resp.read().decode())
return [
m["id"]
for m in data.get("data", [])
if m.get("id")
and m.get("type") == "language"
and "tool-use" in (m.get("tags") or [])
]
except Exception:
return None
def fetch_api_models(
api_key: Optional[str],
base_url: Optional[str],
@@ -318,22 +564,7 @@ def fetch_api_models(
Returns a list of model ID strings, or ``None`` if the endpoint could not
be reached (network error, timeout, auth failure, etc.).
"""
if not base_url:
return None
url = base_url.rstrip("/") + "/models"
headers: dict[str, str] = {}
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
req = urllib.request.Request(url, headers=headers)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
data = json.loads(resp.read().decode())
# Standard OpenAI format: {"data": [{"id": "model-name", ...}, ...]}
return [m.get("id", "") for m in data.get("data", [])]
except Exception:
return None
return probe_api_models(api_key, base_url, timeout=timeout).get("models")
def validate_requested_model(
@@ -376,13 +607,53 @@ def validate_requested_model(
"message": "Model names cannot contain spaces.",
}
# Custom endpoints can serve any model — skip validation
if normalized == "custom":
probe = probe_api_models(api_key, base_url)
api_models = probe.get("models")
if api_models is not None:
if requested in set(api_models):
return {
"accepted": True,
"persist": True,
"recognized": True,
"message": None,
}
suggestions = get_close_matches(requested, api_models, n=3, cutoff=0.5)
suggestion_text = ""
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
message = (
f"Note: `{requested}` was not found in this custom endpoint's model listing "
f"({probe.get('probed_url')}). It may still work if the server supports hidden or aliased models."
f"{suggestion_text}"
)
if probe.get("used_fallback"):
message += (
f"\n Endpoint verification succeeded after trying `{probe.get('resolved_base_url')}`. "
f"Consider saving that as your base URL."
)
return {
"accepted": True,
"persist": True,
"recognized": False,
"message": message,
}
message = (
f"Note: could not reach this custom endpoint's model listing at `{probe.get('probed_url')}`. "
f"Hermes will still save `{requested}`, but the endpoint should expose `/models` for verification."
)
if probe.get("suggested_base_url"):
message += f"\n If this server expects `/v1`, try base URL: `{probe.get('suggested_base_url')}`"
return {
"accepted": True,
"persist": True,
"recognized": False,
"message": None,
"message": message,
}
# Probe the live API to check if the model actually exists
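The "Similar models" hint in the custom-endpoint branch relies on `difflib.get_close_matches` with `n=3` and `cutoff=0.5`. A small sketch of that call, using made-up model IDs and a hypothetical `suggest` wrapper:

```python
from difflib import get_close_matches
from typing import List


def suggest(requested: str, available: List[str], limit: int = 3) -> List[str]:
    """Return up to `limit` model IDs that resemble `requested`, best match first."""
    return get_close_matches(requested, available, n=limit, cutoff=0.5)


# Illustrative model IDs only; a real endpoint's listing will differ.
models = ["gpt-4o-mini", "gpt-4o", "llama-3"]
print(suggest("gpt-4o-mni", models))
```

With a cutoff of 0.5, an unrelated ID such as `llama-3` is filtered out while near-typos are kept.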
+449
@@ -0,0 +1,449 @@
"""
Hermes Plugin System
====================
Discovers, loads, and manages plugins from three sources:
1. **User plugins**: ``~/.hermes/plugins/<name>/``
2. **Project plugins**: ``./.hermes/plugins/<name>/``
3. **Pip plugins**: packages that expose the ``hermes_agent.plugins``
entry-point group.
Each directory plugin must contain a ``plugin.yaml`` manifest **and** an
``__init__.py`` with a ``register(ctx)`` function.
Lifecycle hooks
---------------
Plugins may register callbacks for any of the hooks in ``VALID_HOOKS``.
The agent core calls ``invoke_hook(name, **kwargs)`` at the appropriate
points.
Tool registration
-----------------
``PluginContext.register_tool()`` delegates to ``tools.registry.register()``
so plugin-defined tools appear alongside the built-in tools.
"""
from __future__ import annotations
import importlib
import importlib.metadata
import importlib.util
import logging
import os
import sys
import types
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Callable, Dict, List, Optional, Set
try:
import yaml
except ImportError:  # pragma: no cover - yaml is optional at import time
yaml = None # type: ignore[assignment]
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
VALID_HOOKS: Set[str] = {
"pre_tool_call",
"post_tool_call",
"pre_llm_call",
"post_llm_call",
"on_session_start",
"on_session_end",
}
ENTRY_POINTS_GROUP = "hermes_agent.plugins"
_NS_PARENT = "hermes_plugins"
# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------
@dataclass
class PluginManifest:
"""Parsed representation of a plugin.yaml manifest."""
name: str
version: str = ""
description: str = ""
author: str = ""
requires_env: List[str] = field(default_factory=list)
provides_tools: List[str] = field(default_factory=list)
provides_hooks: List[str] = field(default_factory=list)
source: str = "" # "user", "project", or "entrypoint"
path: Optional[str] = None
@dataclass
class LoadedPlugin:
"""Runtime state for a single loaded plugin."""
manifest: PluginManifest
module: Optional[types.ModuleType] = None
tools_registered: List[str] = field(default_factory=list)
hooks_registered: List[str] = field(default_factory=list)
enabled: bool = False
error: Optional[str] = None
# ---------------------------------------------------------------------------
# PluginContext: handed to each plugin's ``register()`` function
# ---------------------------------------------------------------------------
class PluginContext:
"""Facade given to plugins so they can register tools and hooks."""
def __init__(self, manifest: PluginManifest, manager: "PluginManager"):
self.manifest = manifest
self._manager = manager
# -- tool registration --------------------------------------------------
def register_tool(
self,
name: str,
toolset: str,
schema: dict,
handler: Callable,
check_fn: Callable | None = None,
requires_env: list | None = None,
is_async: bool = False,
description: str = "",
emoji: str = "",
) -> None:
"""Register a tool in the global registry **and** track it as plugin-provided."""
from tools.registry import registry
registry.register(
name=name,
toolset=toolset,
schema=schema,
handler=handler,
check_fn=check_fn,
requires_env=requires_env,
is_async=is_async,
description=description,
emoji=emoji,
)
self._manager._plugin_tool_names.add(name)
logger.debug("Plugin %s registered tool: %s", self.manifest.name, name)
# -- hook registration --------------------------------------------------
def register_hook(self, hook_name: str, callback: Callable) -> None:
"""Register a lifecycle hook callback.
Unknown hook names produce a warning but are still stored so
forward-compatible plugins don't break.
"""
if hook_name not in VALID_HOOKS:
logger.warning(
"Plugin '%s' registered unknown hook '%s' "
"(valid: %s)",
self.manifest.name,
hook_name,
", ".join(sorted(VALID_HOOKS)),
)
self._manager._hooks.setdefault(hook_name, []).append(callback)
logger.debug("Plugin %s registered hook: %s", self.manifest.name, hook_name)
# ---------------------------------------------------------------------------
# PluginManager
# ---------------------------------------------------------------------------
class PluginManager:
"""Central manager that discovers, loads, and invokes plugins."""
def __init__(self) -> None:
self._plugins: Dict[str, LoadedPlugin] = {}
self._hooks: Dict[str, List[Callable]] = {}
self._plugin_tool_names: Set[str] = set()
self._discovered: bool = False
# -----------------------------------------------------------------------
# Public
# -----------------------------------------------------------------------
def discover_and_load(self) -> None:
"""Scan all plugin sources and load each plugin found."""
if self._discovered:
return
self._discovered = True
manifests: List[PluginManifest] = []
# 1. User plugins (~/.hermes/plugins/)
hermes_home = os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes"))
user_dir = Path(hermes_home) / "plugins"
manifests.extend(self._scan_directory(user_dir, source="user"))
# 2. Project plugins (./.hermes/plugins/)
project_dir = Path.cwd() / ".hermes" / "plugins"
manifests.extend(self._scan_directory(project_dir, source="project"))
# 3. Pip / entry-point plugins
manifests.extend(self._scan_entry_points())
# Load each manifest
for manifest in manifests:
self._load_plugin(manifest)
if manifests:
logger.info(
"Plugin discovery complete: %d found, %d enabled",
len(self._plugins),
sum(1 for p in self._plugins.values() if p.enabled),
)
# -----------------------------------------------------------------------
# Directory scanning
# -----------------------------------------------------------------------
def _scan_directory(self, path: Path, source: str) -> List[PluginManifest]:
"""Read ``plugin.yaml`` manifests from subdirectories of *path*."""
manifests: List[PluginManifest] = []
if not path.is_dir():
return manifests
for child in sorted(path.iterdir()):
if not child.is_dir():
continue
manifest_file = child / "plugin.yaml"
if not manifest_file.exists():
manifest_file = child / "plugin.yml"
if not manifest_file.exists():
logger.debug("Skipping %s (no plugin.yaml)", child)
continue
try:
if yaml is None:
logger.warning("PyYAML not installed; cannot load %s", manifest_file)
continue
data = yaml.safe_load(manifest_file.read_text()) or {}
manifest = PluginManifest(
name=data.get("name", child.name),
version=str(data.get("version", "")),
description=data.get("description", ""),
author=data.get("author", ""),
requires_env=data.get("requires_env", []),
provides_tools=data.get("provides_tools", []),
provides_hooks=data.get("provides_hooks", []),
source=source,
path=str(child),
)
manifests.append(manifest)
except Exception as exc:
logger.warning("Failed to parse %s: %s", manifest_file, exc)
return manifests
# -----------------------------------------------------------------------
# Entry-point scanning
# -----------------------------------------------------------------------
def _scan_entry_points(self) -> List[PluginManifest]:
"""Check ``importlib.metadata`` for pip-installed plugins."""
manifests: List[PluginManifest] = []
try:
eps = importlib.metadata.entry_points()
# Python 3.10+ results support .select(); Python 3.8/3.9 return a plain dict
if hasattr(eps, "select"):
group_eps = eps.select(group=ENTRY_POINTS_GROUP)
elif isinstance(eps, dict):
group_eps = eps.get(ENTRY_POINTS_GROUP, [])
else:
group_eps = [ep for ep in eps if ep.group == ENTRY_POINTS_GROUP]
for ep in group_eps:
manifest = PluginManifest(
name=ep.name,
source="entrypoint",
path=ep.value,
)
manifests.append(manifest)
except Exception as exc:
logger.debug("Entry-point scan failed: %s", exc)
return manifests
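The version-tolerant entry-point lookup used in `_scan_entry_points()` can be reduced to a small helper. This is a sketch under the assumption that only two shapes need handling: objects with `.select()` (Python 3.10 and later) and the plain dict returned on Python 3.8/3.9. `entry_points_for` is a hypothetical name.

```python
import importlib.metadata
from typing import List


def entry_points_for(group: str) -> List[importlib.metadata.EntryPoint]:
    """Return the entry points registered under `group` on any supported Python."""
    eps = importlib.metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: the result object can be filtered by group.
        return list(eps.select(group=group))
    # Python 3.8/3.9: a mapping of group name to a list of entry points.
    return list(eps.get(group, []))


for ep in entry_points_for("console_scripts"):
    print(ep.name, "->", ep.value)
```

A group with no registered plugins yields an empty list, so callers do not need a separate "not found" branch.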
# -----------------------------------------------------------------------
# Loading
# -----------------------------------------------------------------------
def _load_plugin(self, manifest: PluginManifest) -> None:
"""Import a plugin module and call its ``register(ctx)`` function."""
loaded = LoadedPlugin(manifest=manifest)
try:
if manifest.source in ("user", "project"):
module = self._load_directory_module(manifest)
else:
module = self._load_entrypoint_module(manifest)
loaded.module = module
# Call register()
register_fn = getattr(module, "register", None)
if register_fn is None:
loaded.error = "no register() function"
logger.warning("Plugin '%s' has no register() function", manifest.name)
else:
ctx = PluginContext(manifest, self)
register_fn(ctx)
loaded.tools_registered = [
t for t in self._plugin_tool_names
if t not in {
n
for name, p in self._plugins.items()
for n in p.tools_registered
}
]
loaded.hooks_registered = list(
{
h
for h, cbs in self._hooks.items()
if cbs # non-empty
}
- {
h
for name, p in self._plugins.items()
for h in p.hooks_registered
}
)
loaded.enabled = True
except Exception as exc:
loaded.error = str(exc)
logger.warning("Failed to load plugin '%s': %s", manifest.name, exc)
self._plugins[manifest.name] = loaded
def _load_directory_module(self, manifest: PluginManifest) -> types.ModuleType:
"""Import a directory-based plugin as ``hermes_plugins.<name>``."""
plugin_dir = Path(manifest.path) # type: ignore[arg-type]
init_file = plugin_dir / "__init__.py"
if not init_file.exists():
raise FileNotFoundError(f"No __init__.py in {plugin_dir}")
# Ensure the namespace parent package exists
if _NS_PARENT not in sys.modules:
ns_pkg = types.ModuleType(_NS_PARENT)
ns_pkg.__path__ = [] # type: ignore[attr-defined]
ns_pkg.__package__ = _NS_PARENT
sys.modules[_NS_PARENT] = ns_pkg
module_name = f"{_NS_PARENT}.{manifest.name.replace('-', '_')}"
spec = importlib.util.spec_from_file_location(
module_name,
init_file,
submodule_search_locations=[str(plugin_dir)],
)
if spec is None or spec.loader is None:
raise ImportError(f"Cannot create module spec for {init_file}")
module = importlib.util.module_from_spec(spec)
module.__package__ = module_name
module.__path__ = [str(plugin_dir)] # type: ignore[attr-defined]
sys.modules[module_name] = module
spec.loader.exec_module(module)
return module
def _load_entrypoint_module(self, manifest: PluginManifest) -> types.ModuleType:
"""Load a pip-installed plugin via its entry-point reference."""
eps = importlib.metadata.entry_points()
if hasattr(eps, "select"):
group_eps = eps.select(group=ENTRY_POINTS_GROUP)
elif isinstance(eps, dict):
group_eps = eps.get(ENTRY_POINTS_GROUP, [])
else:
group_eps = [ep for ep in eps if ep.group == ENTRY_POINTS_GROUP]
for ep in group_eps:
if ep.name == manifest.name:
return ep.load()
raise ImportError(
f"Entry point '{manifest.name}' not found in group '{ENTRY_POINTS_GROUP}'"
)
# -----------------------------------------------------------------------
# Hook invocation
# -----------------------------------------------------------------------
def invoke_hook(self, hook_name: str, **kwargs: Any) -> None:
"""Call all registered callbacks for *hook_name*.
Each callback is wrapped in its own try/except so a misbehaving
plugin cannot break the core agent loop.
"""
callbacks = self._hooks.get(hook_name, [])
for cb in callbacks:
try:
cb(**kwargs)
except Exception as exc:
logger.warning(
"Hook '%s' callback %s raised: %s",
hook_name,
getattr(cb, "__name__", repr(cb)),
exc,
)
# -----------------------------------------------------------------------
# Introspection
# -----------------------------------------------------------------------
def list_plugins(self) -> List[Dict[str, Any]]:
"""Return a list of info dicts for all discovered plugins."""
result: List[Dict[str, Any]] = []
for name, loaded in sorted(self._plugins.items()):
result.append(
{
"name": name,
"version": loaded.manifest.version,
"description": loaded.manifest.description,
"source": loaded.manifest.source,
"enabled": loaded.enabled,
"tools": len(loaded.tools_registered),
"hooks": len(loaded.hooks_registered),
"error": loaded.error,
}
)
return result
# ---------------------------------------------------------------------------
# Module-level singleton & convenience functions
# ---------------------------------------------------------------------------
_plugin_manager: Optional[PluginManager] = None
def get_plugin_manager() -> PluginManager:
"""Return (and lazily create) the global PluginManager singleton."""
global _plugin_manager
if _plugin_manager is None:
_plugin_manager = PluginManager()
return _plugin_manager
def discover_plugins() -> None:
"""Discover and load all plugins (idempotent)."""
get_plugin_manager().discover_and_load()
def invoke_hook(hook_name: str, **kwargs: Any) -> None:
"""Invoke a lifecycle hook on all loaded plugins."""
get_plugin_manager().invoke_hook(hook_name, **kwargs)
def get_plugin_tool_names() -> Set[str]:
"""Return the set of tool names registered by plugins."""
return get_plugin_manager()._plugin_tool_names
+152 -123
@@ -59,6 +59,7 @@ _DEFAULT_PROVIDER_MODELS = {
"kimi-coding": ["kimi-k2.5", "kimi-k2-thinking", "kimi-k2-turbo-preview"],
"minimax": ["MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
"minimax-cn": ["MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
"ai-gateway": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5", "google/gemini-3-flash"],
}
@@ -227,54 +228,86 @@ def prompt(question: str, default: str = None, password: bool = False) -> str:
sys.exit(1)
def _curses_prompt_choice(question: str, choices: list, default: int = 0) -> int:
"""Single-select menu using curses to avoid simple_term_menu rendering bugs."""
try:
import curses
result_holder = [default]
def _curses_menu(stdscr):
curses.curs_set(0)
if curses.has_colors():
curses.start_color()
curses.use_default_colors()
curses.init_pair(1, curses.COLOR_GREEN, -1)
curses.init_pair(2, curses.COLOR_YELLOW, -1)
cursor = default
while True:
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
try:
stdscr.addnstr(
0,
0,
question,
max_x - 1,
curses.A_BOLD | (curses.color_pair(2) if curses.has_colors() else 0),
)
except curses.error:
pass
for i, choice in enumerate(choices):
y = i + 2
if y >= max_y - 1:
break
arrow = "" if i == cursor else " "
line = f" {arrow} {choice}"
attr = curses.A_NORMAL
if i == cursor:
attr = curses.A_BOLD
if curses.has_colors():
attr |= curses.color_pair(1)
try:
stdscr.addnstr(y, 0, line, max_x - 1, attr)
except curses.error:
pass
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
cursor = (cursor - 1) % len(choices)
elif key in (curses.KEY_DOWN, ord("j")):
cursor = (cursor + 1) % len(choices)
elif key in (curses.KEY_ENTER, 10, 13):
result_holder[0] = cursor
return
elif key in (27, ord("q")):
return
curses.wrapper(_curses_menu)
return result_holder[0]
except Exception:
return -1
def prompt_choice(question: str, choices: list, default: int = 0) -> int:
"""Prompt for a choice from a list with arrow key navigation.
Escape keeps the current default (skips the question).
Ctrl+C exits the wizard.
"""
print(color(question, Colors.YELLOW))
# Try to use interactive menu if available
try:
from simple_term_menu import TerminalMenu
import re
# Strip emoji characters — simple_term_menu miscalculates visual
# width of emojis, causing duplicated/garbled lines on redraw.
_emoji_re = re.compile(
"[\U0001f300-\U0001f9ff\U00002600-\U000027bf\U0000fe00-\U0000fe0f"
"\U0001fa00-\U0001fa6f\U0001fa70-\U0001faff\u200d]+",
flags=re.UNICODE,
)
menu_choices = [f" {_emoji_re.sub('', choice).strip()}" for choice in choices]
print_info(" ↑/↓ Navigate Enter Select Esc Skip Ctrl+C Exit")
terminal_menu = TerminalMenu(
menu_choices,
cursor_index=default,
menu_cursor="",
menu_cursor_style=("fg_green", "bold"),
menu_highlight_style=("fg_green",),
cycle_cursor=True,
clear_screen=False,
)
idx = terminal_menu.show()
if idx is None: # User pressed Escape — keep current value
print_info(f" Skipped (keeping current)")
idx = _curses_prompt_choice(question, choices, default)
if idx >= 0:
if idx == default:
print_info(" Skipped (keeping current)")
print()
return default
print() # Add newline after selection
print()
return idx
except (ImportError, NotImplementedError):
pass
except Exception as e:
print(f" (Interactive menu unavailable: {e})")
# Fallback to number-based selection (simple_term_menu doesn't support Windows)
print(color(question, Colors.YELLOW))
for i, choice in enumerate(choices):
marker = "" if i == default else ""
if i == default:
@@ -344,84 +377,15 @@ def prompt_checklist(title: str, items: list, pre_selected: list = None) -> list
if pre_selected is None:
pre_selected = []
print(color(title, Colors.YELLOW))
print_info(" SPACE Toggle ENTER Confirm ESC Skip Ctrl+C Exit")
print()
from hermes_cli.curses_ui import curses_checklist
try:
from simple_term_menu import TerminalMenu
import re
# Strip emoji characters from menu labels — simple_term_menu miscalculates
# visual width of emojis on macOS, causing duplicated/garbled lines.
_emoji_re = re.compile(
"[\U0001f300-\U0001f9ff\U00002600-\U000027bf\U0000fe00-\U0000fe0f"
"\U0001fa00-\U0001fa6f\U0001fa70-\U0001faff\u200d]+",
flags=re.UNICODE,
)
menu_items = [f" {_emoji_re.sub('', item).strip()}" for item in items]
# Map pre-selected indices to the actual menu entry strings
preselected = [menu_items[i] for i in pre_selected if i < len(menu_items)]
terminal_menu = TerminalMenu(
menu_items,
multi_select=True,
show_multi_select_hint=False,
multi_select_cursor="[✓] ",
multi_select_select_on_accept=False,
multi_select_empty_ok=True,
preselected_entries=preselected if preselected else None,
menu_cursor="",
menu_cursor_style=("fg_green", "bold"),
menu_highlight_style=("fg_green",),
cycle_cursor=True,
clear_screen=False,
)
terminal_menu.show()
if terminal_menu.chosen_menu_entries is None:
print_info(" Skipped (keeping current)")
return list(pre_selected)
selected = list(terminal_menu.chosen_menu_indices or [])
return selected
except (ImportError, NotImplementedError):
# Fallback: numbered toggle interface (simple_term_menu doesn't support Windows)
selected = set(pre_selected)
while True:
for i, item in enumerate(items):
marker = color("[✓]", Colors.GREEN) if i in selected else "[ ]"
print(f" {marker} {i + 1}. {item}")
print()
try:
value = input(
color(" Toggle # (or Enter to confirm): ", Colors.DIM)
).strip()
if not value:
break
idx = int(value) - 1
if 0 <= idx < len(items):
if idx in selected:
selected.discard(idx)
else:
selected.add(idx)
else:
print_error(f"Enter a number between 1 and {len(items)}")
except ValueError:
print_error("Enter a number")
except (KeyboardInterrupt, EOFError):
print()
return []
# Clear and redraw (simple approach)
print()
return sorted(selected)
chosen = curses_checklist(
title,
items,
set(pre_selected),
cancel_returns=set(pre_selected),
)
return sorted(chosen)
def _prompt_api_key(var: dict):
@@ -761,6 +725,7 @@ def setup_model_provider(config: dict):
"MiniMax (global endpoint)",
"MiniMax China (mainland China endpoint)",
"Anthropic (Claude models — API key or Claude Code subscription)",
"AI Gateway (Vercel — 200+ models, pay-per-use)",
]
if keep_label:
provider_choices.append(keep_label)
@@ -780,6 +745,7 @@ def setup_model_provider(config: dict):
selected_provider = (
None # "nous", "openai-codex", "openrouter", "custom", or None (keep)
)
selected_base_url = None # deferred until after model selection
nous_models = [] # populated if Nous login succeeds
if provider_idx == 0: # Nous Portal (OAuth)
@@ -933,11 +899,35 @@ def setup_model_provider(config: dict):
base_url = prompt(
" API base URL (e.g., https://api.example.com/v1)", current_url
)
).strip()
api_key = prompt(" API key", password=True)
model_name = prompt(" Model name (e.g., gpt-4, claude-3-opus)", current_model)
if base_url:
from hermes_cli.models import probe_api_models
probe = probe_api_models(api_key, base_url)
if probe.get("used_fallback") and probe.get("resolved_base_url"):
print_warning(
f"Endpoint verification worked at {probe['resolved_base_url']}/models, "
f"not the exact URL you entered. Saving the working base URL instead."
)
base_url = probe["resolved_base_url"]
elif probe.get("models") is not None:
print_success(
f"Verified endpoint via {probe.get('probed_url')} "
f"({len(probe.get('models') or [])} model(s) visible)"
)
else:
print_warning(
f"Could not verify this endpoint via {probe.get('probed_url')}. "
f"Hermes will still save it."
)
if probe.get("suggested_base_url"):
print_info(
f" If this server expects /v1, try base URL: {probe['suggested_base_url']}"
)
save_env_value("OPENAI_BASE_URL", base_url)
if api_key:
save_env_value("OPENAI_API_KEY", api_key)
@@ -1038,8 +1028,8 @@ def setup_model_provider(config: dict):
if existing_custom:
save_env_value("OPENAI_BASE_URL", "")
save_env_value("OPENAI_API_KEY", "")
_update_config_for_provider("zai", zai_base_url, default_model="glm-5")
_set_model_provider(config, "zai", zai_base_url)
selected_base_url = zai_base_url
elif provider_idx == 5: # Kimi / Moonshot
selected_provider = "kimi-coding"
@@ -1071,8 +1061,8 @@ def setup_model_provider(config: dict):
if existing_custom:
save_env_value("OPENAI_BASE_URL", "")
save_env_value("OPENAI_API_KEY", "")
_update_config_for_provider("kimi-coding", pconfig.inference_base_url, default_model="kimi-k2.5")
_set_model_provider(config, "kimi-coding", pconfig.inference_base_url)
selected_base_url = pconfig.inference_base_url
elif provider_idx == 6: # MiniMax
selected_provider = "minimax"
@@ -1104,8 +1094,8 @@ def setup_model_provider(config: dict):
if existing_custom:
save_env_value("OPENAI_BASE_URL", "")
save_env_value("OPENAI_API_KEY", "")
_update_config_for_provider("minimax", pconfig.inference_base_url, default_model="MiniMax-M2.5")
_set_model_provider(config, "minimax", pconfig.inference_base_url)
selected_base_url = pconfig.inference_base_url
elif provider_idx == 7: # MiniMax China
selected_provider = "minimax-cn"
@@ -1137,8 +1127,8 @@ def setup_model_provider(config: dict):
if existing_custom:
save_env_value("OPENAI_BASE_URL", "")
save_env_value("OPENAI_API_KEY", "")
_update_config_for_provider("minimax-cn", pconfig.inference_base_url, default_model="MiniMax-M2.5")
_set_model_provider(config, "minimax-cn", pconfig.inference_base_url)
selected_base_url = pconfig.inference_base_url
elif provider_idx == 8: # Anthropic
selected_provider = "anthropic"
@@ -1241,10 +1231,42 @@ def setup_model_provider(config: dict):
save_env_value("OPENAI_API_KEY", "")
# Don't save base_url for Anthropic — resolve_runtime_provider()
# always hardcodes it. Stale base_urls contaminate other providers.
_update_config_for_provider("anthropic", "", default_model="claude-opus-4-6")
_set_model_provider(config, "anthropic")
selected_base_url = ""
# else: provider_idx == 9 (Keep current) — only shown when a provider already exists
elif provider_idx == 9: # AI Gateway
selected_provider = "ai-gateway"
print()
print_header("AI Gateway API Key")
pconfig = PROVIDER_REGISTRY["ai-gateway"]
print_info(f"Provider: {pconfig.name}")
print_info("Get your API key at: https://vercel.com/docs/ai-gateway")
print()
existing_key = get_env_value("AI_GATEWAY_API_KEY")
if existing_key:
print_info(f"Current: {existing_key[:8]}... (configured)")
if prompt_yes_no("Update API key?", False):
api_key = prompt(" AI Gateway API key", password=True)
if api_key:
save_env_value("AI_GATEWAY_API_KEY", api_key)
print_success("AI Gateway API key updated")
else:
api_key = prompt(" AI Gateway API key", password=True)
if api_key:
save_env_value("AI_GATEWAY_API_KEY", api_key)
print_success("AI Gateway API key saved")
else:
print_warning("Skipped - agent won't work without an API key")
# Clear custom endpoint vars if switching
if existing_custom:
save_env_value("OPENAI_BASE_URL", "")
save_env_value("OPENAI_API_KEY", "")
_update_config_for_provider("ai-gateway", pconfig.inference_base_url, default_model="anthropic/claude-opus-4.6")
_set_model_provider(config, "ai-gateway", pconfig.inference_base_url)
# else: provider_idx == 10 (Keep current) — only shown when a provider already exists
# Normalize "keep current" to an explicit provider so downstream logic
# doesn't fall back to the generic OpenRouter/static-model path.
if selected_provider is None:
@@ -1281,6 +1303,7 @@ def setup_model_provider(config: dict):
"minimax": "MiniMax",
"minimax-cn": "MiniMax CN",
"anthropic": "Anthropic",
"ai-gateway": "AI Gateway",
"custom": "your custom endpoint",
}
_prov_display = _prov_names.get(selected_provider, selected_provider or "your provider")
@@ -1414,7 +1437,7 @@ def setup_model_provider(config: dict):
_set_default_model(config, custom)
_update_config_for_provider("openai-codex", DEFAULT_CODEX_BASE_URL)
_set_model_provider(config, "openai-codex", DEFAULT_CODEX_BASE_URL)
elif selected_provider in ("zai", "kimi-coding", "minimax", "minimax-cn"):
elif selected_provider in ("zai", "kimi-coding", "minimax", "minimax-cn", "ai-gateway"):
_setup_provider_model_selection(
config, selected_provider, current_model,
prompt_choice, prompt,
@@ -1472,6 +1495,12 @@ def setup_model_provider(config: dict):
)
print_success(f"Model set to: {_display}")
# Write provider+base_url to config.yaml only after model selection is complete.
# This prevents a race condition where the gateway picks up a new provider
# before the model name has been updated to match.
if selected_provider in ("zai", "kimi-coding", "minimax", "minimax-cn", "anthropic") and selected_base_url is not None:
_update_config_for_provider(selected_provider, selected_base_url)
save_config(config)
+26 -16
@@ -304,7 +304,7 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all",
def do_install(identifier: str, category: str = "", force: bool = False,
console: Optional[Console] = None) -> None:
console: Optional[Console] = None, skip_confirm: bool = False) -> None:
"""Fetch, quarantine, scan, confirm, and install a skill."""
from tools.skills_hub import (
GitHubAuth, create_source_router, ensure_hub_dirs,
@@ -378,7 +378,8 @@ def do_install(identifier: str, category: str = "", force: bool = False,
c.print(Panel("\n".join(metadata_lines), title="Upstream Metadata", border_style="blue"))
# Confirm with user — show appropriate warning based on source
if not force:
# skip_confirm bypasses the prompt (needed in TUI mode where input() hangs)
if not force and not skip_confirm:
c.print()
if bundle.source == "official":
c.print(Panel(
@@ -598,20 +599,23 @@ def do_audit(name: Optional[str] = None, console: Optional[Console] = None) -> N
c.print()
def do_uninstall(name: str, console: Optional[Console] = None) -> None:
def do_uninstall(name: str, console: Optional[Console] = None,
skip_confirm: bool = False) -> None:
"""Remove a hub-installed skill with confirmation."""
from tools.skills_hub import uninstall_skill
c = console or _console
c.print(f"\n[bold]Uninstall '{name}'?[/]")
try:
answer = input("Confirm [y/N]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
answer = "n"
if answer not in ("y", "yes"):
c.print("[dim]Cancelled.[/]\n")
return
# skip_confirm bypasses the prompt (needed in TUI mode where input() hangs)
if not skip_confirm:
c.print(f"\n[bold]Uninstall '{name}'?[/]")
try:
answer = input("Confirm [y/N]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
answer = "n"
if answer not in ("y", "yes"):
c.print("[dim]Cancelled.[/]\n")
return
success, msg = uninstall_skill(name)
if success:
@@ -923,7 +927,8 @@ def skills_command(args) -> None:
elif action == "search":
do_search(args.query, source=args.source, limit=args.limit)
elif action == "install":
do_install(args.identifier, category=args.category, force=args.force)
do_install(args.identifier, category=args.category, force=args.force,
skip_confirm=getattr(args, "yes", False))
elif action == "inspect":
do_inspect(args.identifier)
elif action == "list":
@@ -1054,11 +1059,15 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
return
identifier = args[0]
category = ""
force = any(flag in args for flag in ("--force", "--yes", "-y"))
# --yes / -y bypasses confirmation prompt (needed in TUI mode)
# --force handles reinstall override
skip_confirm = any(flag in args for flag in ("--yes", "-y"))
force = "--force" in args
for i, a in enumerate(args):
if a == "--category" and i + 1 < len(args):
category = args[i + 1]
do_install(identifier, category=category, force=force, console=c)
do_install(identifier, category=category, force=force,
skip_confirm=skip_confirm, console=c)
elif action == "inspect":
if not args:
@@ -1088,9 +1097,10 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
elif action == "uninstall":
if not args:
c.print("[bold red]Usage:[/] /skills uninstall <name>\n")
c.print("[bold red]Usage:[/] /skills uninstall <name> [--yes]\n")
return
do_uninstall(args[0], console=c)
skip_confirm = any(flag in args for flag in ("--yes", "-y"))
do_uninstall(args[0], console=c, skip_confirm=skip_confirm)
elif action == "publish":
if not args:
+8
@@ -60,6 +60,12 @@ All fields are optional. Missing values inherit from the ``default`` skin.
# Tool prefix: character for tool output lines (default: ┊)
tool_prefix: ""
# Tool emojis: override the default emoji for any tool (used in spinners & progress)
tool_emojis:
terminal: "" # Override terminal tool emoji
web_search: "🔮" # Override web_search tool emoji
# Any tool not listed here uses its registry default
USAGE
=====
@@ -111,6 +117,7 @@ class SkinConfig:
spinner: Dict[str, Any] = field(default_factory=dict)
branding: Dict[str, str] = field(default_factory=dict)
tool_prefix: str = ""
tool_emojis: Dict[str, str] = field(default_factory=dict) # per-tool emoji overrides
banner_logo: str = "" # Rich-markup ASCII art logo (replaces HERMES_AGENT_LOGO)
banner_hero: str = "" # Rich-markup hero art (replaces HERMES_CADUCEUS)
@@ -541,6 +548,7 @@ def _build_skin_config(data: Dict[str, Any]) -> SkinConfig:
spinner=spinner,
branding=branding,
tool_prefix=data.get("tool_prefix", default.get("tool_prefix", "")),
tool_emojis=data.get("tool_emojis", {}),
banner_logo=data.get("banner_logo", ""),
banner_hero=data.get("banner_hero", ""),
)
+6 -1
@@ -275,8 +275,13 @@ def show_status(args):
print(color("◆ Gateway Service", Colors.CYAN, Colors.BOLD))
if sys.platform.startswith('linux'):
try:
from hermes_cli.gateway import get_service_name
_gw_svc = get_service_name()
except Exception:
_gw_svc = "hermes-gateway"
result = subprocess.run(
["systemctl", "--user", "is-active", "hermes-gateway"],
["systemctl", "--user", "is-active", _gw_svc],
capture_output=True,
text=True
)
+56 -21
@@ -190,6 +190,7 @@ TOOL_CATEGORIES = {
"name": "Local Browser",
"tag": "Free headless Chromium (no API key needed)",
"env_vars": [],
"browser_provider": None,
"post_setup": "browserbase", # Same npm install for agent-browser
},
{
@@ -199,6 +200,16 @@ TOOL_CATEGORIES = {
{"key": "BROWSERBASE_API_KEY", "prompt": "Browserbase API key", "url": "https://browserbase.com"},
{"key": "BROWSERBASE_PROJECT_ID", "prompt": "Browserbase project ID"},
],
"browser_provider": "browserbase",
"post_setup": "browserbase",
},
{
"name": "Browser Use",
"tag": "Cloud browser with remote execution",
"env_vars": [
{"key": "BROWSER_USE_API_KEY", "prompt": "Browser Use API key", "url": "https://browser-use.com"},
],
"browser_provider": "browser-use",
"post_setup": "browserbase",
},
],
@@ -575,10 +586,10 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
configured = ""
env_vars = p.get("env_vars", [])
if not env_vars or all(get_env_value(v["key"]) for v in env_vars):
if p.get("tts_provider") and config.get("tts", {}).get("provider") == p["tts_provider"]:
if _is_provider_active(p, config):
configured = " [active]"
elif not env_vars:
configured = " [active]" if config.get("tts", {}).get("provider", "edge") == p.get("tts_provider", "") else ""
configured = ""
else:
configured = " [configured]"
provider_choices.append(f"{p['name']}{tag}{configured}")
@@ -587,15 +598,7 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
provider_choices.append("Skip — keep defaults / configure later")
# Detect current provider as default
default_idx = 0
for i, p in enumerate(providers):
if p.get("tts_provider") and config.get("tts", {}).get("provider") == p["tts_provider"]:
default_idx = i
break
env_vars = p.get("env_vars", [])
if env_vars and all(get_env_value(v["key"]) for v in env_vars):
default_idx = i
break
default_idx = _detect_active_provider_index(providers, config)
provider_idx = _prompt_choice(f" {title}:", provider_choices, default_idx)
@@ -607,6 +610,28 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
_configure_provider(providers[provider_idx], config)
def _is_provider_active(provider: dict, config: dict) -> bool:
"""Check if a provider entry matches the currently active config."""
if provider.get("tts_provider"):
return config.get("tts", {}).get("provider") == provider["tts_provider"]
if "browser_provider" in provider:
current = config.get("browser", {}).get("cloud_provider")
return provider["browser_provider"] == current
return False
def _detect_active_provider_index(providers: list, config: dict) -> int:
"""Return the index of the currently active provider, or 0."""
for i, p in enumerate(providers):
if _is_provider_active(p, config):
return i
# Fallback: env vars present → likely configured
env_vars = p.get("env_vars", [])
if env_vars and all(get_env_value(v["key"]) for v in env_vars):
return i
return 0
def _configure_provider(provider: dict, config: dict):
"""Configure a single provider - prompt for API keys and set config."""
env_vars = provider.get("env_vars", [])
@@ -615,6 +640,15 @@ def _configure_provider(provider: dict, config: dict):
if provider.get("tts_provider"):
config.setdefault("tts", {})["provider"] = provider["tts_provider"]
# Set browser cloud provider in config if applicable
if "browser_provider" in provider:
bp = provider["browser_provider"]
if bp:
config.setdefault("browser", {})["cloud_provider"] = bp
_print_success(f" Browser cloud provider set to: {bp}")
else:
config.get("browser", {}).pop("cloud_provider", None)
if not env_vars:
_print_success(f" {provider['name']} - no configuration needed!")
return
@@ -767,7 +801,7 @@ def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):
configured = ""
env_vars = p.get("env_vars", [])
if not env_vars or all(get_env_value(v["key"]) for v in env_vars):
if p.get("tts_provider") and config.get("tts", {}).get("provider") == p["tts_provider"]:
if _is_provider_active(p, config):
configured = " [active]"
elif not env_vars:
configured = ""
@@ -775,15 +809,7 @@ def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):
configured = " [configured]"
provider_choices.append(f"{p['name']}{tag}{configured}")
default_idx = 0
for i, p in enumerate(providers):
if p.get("tts_provider") and config.get("tts", {}).get("provider") == p["tts_provider"]:
default_idx = i
break
env_vars = p.get("env_vars", [])
if env_vars and all(get_env_value(v["key"]) for v in env_vars):
default_idx = i
break
default_idx = _detect_active_provider_index(providers, config)
provider_idx = _prompt_choice(" Select provider:", provider_choices, default_idx)
_reconfigure_provider(providers[provider_idx], config)
@@ -797,6 +823,15 @@ def _reconfigure_provider(provider: dict, config: dict):
config.setdefault("tts", {})["provider"] = provider["tts_provider"]
_print_success(f" TTS provider set to: {provider['tts_provider']}")
if "browser_provider" in provider:
bp = provider["browser_provider"]
if bp:
config.setdefault("browser", {})["cloud_provider"] = bp
_print_success(f" Browser cloud provider set to: {bp}")
else:
config.get("browser", {}).pop("cloud_provider", None)
_print_success(f" Browser set to local mode")
if not env_vars:
_print_success(f" {provider['name']} - no configuration needed!")
return
+9 -3
@@ -133,7 +133,13 @@ def uninstall_gateway_service():
if platform.system() != "Linux":
return False
service_file = Path.home() / ".config" / "systemd" / "user" / "hermes-gateway.service"
try:
from hermes_cli.gateway import get_service_name
svc_name = get_service_name()
except Exception:
svc_name = "hermes-gateway"
service_file = Path.home() / ".config" / "systemd" / "user" / f"{svc_name}.service"
if not service_file.exists():
return False
@@ -141,14 +147,14 @@ def uninstall_gateway_service():
try:
# Stop the service
subprocess.run(
["systemctl", "--user", "stop", "hermes-gateway"],
["systemctl", "--user", "stop", svc_name],
capture_output=True,
check=False
)
# Disable the service
subprocess.run(
["systemctl", "--user", "disable", "hermes-gateway"],
["systemctl", "--user", "disable", svc_name],
capture_output=True,
check=False
)
+4
@@ -8,5 +8,9 @@ OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
OPENROUTER_MODELS_URL = f"{OPENROUTER_BASE_URL}/models"
OPENROUTER_CHAT_URL = f"{OPENROUTER_BASE_URL}/chat/completions"
AI_GATEWAY_BASE_URL = "https://ai-gateway.vercel.sh/v1"
AI_GATEWAY_MODELS_URL = f"{AI_GATEWAY_BASE_URL}/models"
AI_GATEWAY_CHAT_URL = f"{AI_GATEWAY_BASE_URL}/chat/completions"
NOUS_API_BASE_URL = "https://inference-api.nousresearch.com/v1"
NOUS_API_CHAT_URL = f"{NOUS_API_BASE_URL}/chat/completions"
+3 -2
@@ -114,11 +114,12 @@ class HonchoClientConfig:
@classmethod
def from_env(cls, workspace_id: str = "hermes") -> HonchoClientConfig:
"""Create config from environment variables (fallback)."""
api_key = os.environ.get("HONCHO_API_KEY")
return cls(
workspace_id=workspace_id,
api_key=os.environ.get("HONCHO_API_KEY"),
api_key=api_key,
environment=os.environ.get("HONCHO_ENVIRONMENT", "production"),
enabled=True,
enabled=bool(api_key),
)
@classmethod
+6 -1
@@ -927,6 +927,11 @@ class HonchoSessionManager:
return False
assistant_peer = self._get_or_create_peer(session.assistant_peer_id)
honcho_session = self._sessions_cache.get(session.honcho_session_id)
if not honcho_session:
logger.warning("No Honcho session cached for '%s', skipping AI seed", session_key)
return False
try:
wrapped = (
f"<ai_identity_seed>\n"
@@ -935,7 +940,7 @@ class HonchoSessionManager:
f"{content.strip()}\n"
f"</ai_identity_seed>"
)
assistant_peer.add_message("assistant", wrapped)
honcho_session.add_messages([assistant_peer.message(wrapped)])
logger.info("Seeded AI identity from '%s' into %s", source, session_key)
return True
except Exception as e:
+43 -6
@@ -113,6 +113,13 @@ try:
except Exception as e:
logger.debug("MCP tool discovery failed: %s", e)
# Plugin tool discovery (user/project/pip plugins)
try:
from hermes_cli.plugins import discover_plugins
discover_plugins()
except Exception as e:
logger.debug("Plugin discovery failed: %s", e)
# =============================================================================
# Backward-compat constants (built once after discovery)
@@ -222,6 +229,16 @@ def get_tool_definitions(
for ts_name in get_all_toolsets():
tools_to_include.update(resolve_toolset(ts_name))
# Always include plugin-registered tools — they bypass the toolset filter
# because their toolsets are dynamic (created at plugin load time).
try:
from hermes_cli.plugins import get_plugin_tool_names
plugin_tools = get_plugin_tool_names()
if plugin_tools:
tools_to_include.update(plugin_tools)
except Exception:
pass
# Ask the registry for schemas (only returns tools whose check_fn passes)
filtered_tools = registry.get_definitions(tools_to_include, quiet=quiet_mode)
@@ -267,6 +284,8 @@ def handle_function_call(
task_id: Optional[str] = None,
user_task: Optional[str] = None,
enabled_tools: Optional[List[str]] = None,
honcho_manager: Optional[Any] = None,
honcho_session_key: Optional[str] = None,
) -> str:
"""
Main function call dispatcher that routes calls to the tool registry.
@@ -298,21 +317,39 @@ def handle_function_call(
if function_name in _AGENT_LOOP_TOOLS:
return json.dumps({"error": f"{function_name} must be handled by the agent loop"})
try:
from hermes_cli.plugins import invoke_hook
invoke_hook("pre_tool_call", tool_name=function_name, args=function_args, task_id=task_id or "")
except Exception:
pass
if function_name == "execute_code":
# Prefer the caller-provided list so subagents can't overwrite
# the parent's tool set via the process-global.
sandbox_enabled = enabled_tools if enabled_tools is not None else _last_resolved_tool_names
return registry.dispatch(
result = registry.dispatch(
function_name, function_args,
task_id=task_id,
enabled_tools=sandbox_enabled,
honcho_manager=honcho_manager,
honcho_session_key=honcho_session_key,
)
else:
result = registry.dispatch(
function_name, function_args,
task_id=task_id,
user_task=user_task,
honcho_manager=honcho_manager,
honcho_session_key=honcho_session_key,
)
return registry.dispatch(
function_name, function_args,
task_id=task_id,
user_task=user_task,
)
try:
from hermes_cli.plugins import invoke_hook
invoke_hook("post_tool_call", tool_name=function_name, args=function_args, result=result, task_id=task_id or "")
except Exception:
pass
return result
except Exception as e:
error_msg = f"Error executing {function_name}: {str(e)}"
+231
@@ -0,0 +1,231 @@
---
name: base
description: Query Base (Ethereum L2) blockchain data with USD pricing — wallet balances, token info, transaction details, gas analysis, contract inspection, whale detection, and live network stats. Uses Base RPC + CoinGecko. No API key required.
version: 0.1.0
author: youssefea
license: MIT
metadata:
hermes:
tags: [Base, Blockchain, Crypto, Web3, RPC, DeFi, EVM, L2, Ethereum]
related_skills: []
---
# Base Blockchain Skill
Query Base (Ethereum L2) on-chain data enriched with USD pricing via CoinGecko.
It provides eight commands: wallet portfolio, token info, transactions, gas analysis,
contract inspection, whale detection, network stats, and price lookup.
No API key is needed, and the helper uses only the Python standard library (urllib, json, argparse).
---
## When to Use
- User asks for a Base wallet balance, token holdings, or portfolio value
- User wants to inspect a specific transaction by hash
- User wants ERC-20 token metadata, price, supply, or market cap
- User wants to understand Base gas costs and L1 data fees
- User wants to inspect a contract (ERC type detection, proxy resolution)
- User wants to find large ETH transfers (whale detection)
- User wants Base network health, gas price, or ETH price
- User asks "what's the price of USDC/AERO/DEGEN/ETH?"
---
## Prerequisites
The helper script uses only Python standard library (urllib, json, argparse).
No external packages required.
Pricing data comes from CoinGecko's free API (no key needed, rate-limited
to ~10-30 requests/minute). For faster lookups, use the `--no-prices` flag.
---
## Quick Reference
RPC endpoint (default): https://mainnet.base.org
Override: export BASE_RPC_URL=https://your-private-rpc.com
Helper script path: ~/.hermes/skills/blockchain/base/scripts/base_client.py
```
python3 base_client.py wallet <address> [--limit N] [--all] [--no-prices]
python3 base_client.py tx <hash>
python3 base_client.py token <contract_address>
python3 base_client.py gas
python3 base_client.py contract <address>
python3 base_client.py whales [--min-eth N]
python3 base_client.py stats
python3 base_client.py price <contract_address_or_symbol>
```
---
## Procedure
### 0. Setup Check
```bash
python3 --version
# Optional: override the RPC endpoint (default shown below); point this at a private RPC for better rate limits
export BASE_RPC_URL="https://mainnet.base.org"
# Confirm connectivity
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py stats
```
### 1. Wallet Portfolio
Get ETH balance and ERC-20 token holdings with USD values.
Checks ~15 well-known Base tokens (USDC, WETH, AERO, DEGEN, etc.)
via on-chain `balanceOf` calls. Tokens sorted by value, dust filtered.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
wallet 0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045
```
Flags:
- `--limit N` — show top N tokens (default: 20)
- `--all` — show all tokens, no dust filter, no limit
- `--no-prices` — skip CoinGecko price lookups (faster, RPC-only)
Output includes: ETH balance + USD value, token list with prices sorted
by value, dust count, total portfolio value in USD.
Note: Only checks known tokens. Unknown ERC-20s are not discovered.
Use the `token` command with a specific contract address for any token.
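As a rough illustration of the on-chain `balanceOf` check described above, the sketch below builds a JSON-RPC `eth_call` request and decodes the returned word. It is not taken from `base_client.py`: the function names are hypothetical, and the token address is simply the one used in the token example elsewhere in this skill.

```python
import json

# First 4 bytes of keccak256("balanceOf(address)"), the standard ERC-20 selector.
BALANCE_OF_SELECTOR = "0x70a08231"

def balance_of_call(token: str, holder: str, request_id: int = 1) -> dict:
    """Build an eth_call request asking `token` for the balance of `holder`."""
    # ABI encoding: selector followed by the address left-padded to 32 bytes.
    padded_holder = holder.lower().removeprefix("0x").rjust(64, "0")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_call",
        "params": [{"to": token, "data": BALANCE_OF_SELECTOR + padded_holder}, "latest"],
    }

def decode_uint256(result_hex: str, decimals: int) -> float:
    """Convert a hex-encoded uint256 result into a human-readable amount."""
    raw = int(result_hex, 16) if result_hex not in ("", "0x") else 0
    return raw / 10 ** decimals

payload = balance_of_call(
    "0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913",  # token contract from the token example
    "0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045",  # wallet from the wallet example
)
print(json.dumps(payload, indent=2))
```

The payload would be POSTed to the RPC endpoint; the `result` field of the reply is what `decode_uint256` expects, scaled by the token's `decimals()`.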
### 2. Transaction Details
Inspect a full transaction by its hash. Shows ETH value transferred,
gas used, fee in ETH/USD, status, and decoded ERC-20/ERC-721 transfers.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
tx 0xabc123...your_tx_hash_here
```
Output: hash, block, from, to, value (ETH + USD), gas price, gas used,
fee, status, contract creation address (if any), token transfers.
### 3. Token Info
Get ERC-20 token metadata: name, symbol, decimals, total supply, price,
market cap, and contract code size.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
token 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913
```
Output: name, symbol, decimals, total supply, price, market cap.
Reads name/symbol/decimals directly from the contract via eth_call.
### 4. Gas Analysis
Detailed gas analysis with cost estimates for common operations.
Shows current gas price, base fee trends over 10 blocks, block
utilization, and estimated costs for ETH transfers, ERC-20 transfers,
and swaps.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py gas
```
Output: current gas price, base fee, block utilization, 10-block trend,
cost estimates in ETH and USD.
Note: Base is an L2 — actual transaction costs include an L1 data
posting fee that depends on calldata size and L1 gas prices. The
estimates shown are for L2 execution only.
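The L2 execution estimate reduces to simple arithmetic: gas used times gas price, converted from wei to ETH and then to USD. The sketch below shows that calculation only. The function name is hypothetical, the gas price and ETH price are made-up inputs rather than live data, and the L1 data fee is deliberately not modeled.

```python
WEI_PER_ETH = 10 ** 18
GWEI = 10 ** 9

def l2_execution_fee(gas_used: int, gas_price_wei: int, eth_usd: float) -> tuple:
    """Return (fee_eth, fee_usd) for the L2 execution part of a transaction."""
    fee_eth = gas_used * gas_price_wei / WEI_PER_ETH
    return fee_eth, fee_eth * eth_usd

# 21,000 gas is the fixed cost of a plain ETH transfer on EVM chains.
# Assumed inputs: 0.01 gwei gas price, ETH at 3000 USD.
fee_eth, fee_usd = l2_execution_fee(21_000, gas_price_wei=GWEI // 100, eth_usd=3000.0)
print(f"{fee_eth:.10f} ETH (~${fee_usd:.6f})")
```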
### 5. Contract Inspection
Inspect an address: determine if it's an EOA or contract, detect
ERC-20/ERC-721/ERC-1155 interfaces, resolve EIP-1967 proxy
implementation addresses.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
contract 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913
```
Output: is_contract, code size, ETH balance, detected interfaces
(ERC-20, ERC-721, ERC-1155), ERC-20 metadata, proxy implementation
address.
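EIP-1967 proxy resolution amounts to reading one well-known storage slot and taking the low 20 bytes as an address. The following is a minimal sketch under that assumption; the helper names are hypothetical and the actual logic in `base_client.py` may differ.

```python
from typing import Optional

# Slot defined by EIP-1967: keccak256("eip1967.proxy.implementation") - 1
EIP1967_IMPL_SLOT = "0x360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc"

def implementation_slot_request(proxy: str, request_id: int = 1) -> dict:
    """Build an eth_getStorageAt request for the implementation slot of `proxy`."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getStorageAt",
        "params": [proxy, EIP1967_IMPL_SLOT, "latest"],
    }

def address_from_slot(word_hex: str) -> Optional[str]:
    """Return the address held in the low 20 bytes of a 32-byte word, or None if zero."""
    word = word_hex.removeprefix("0x").rjust(64, "0")
    address = word[-40:]
    return None if int(address, 16) == 0 else "0x" + address
```

A zero word means the slot is unset, so the address is either not a proxy or uses a pattern other than EIP-1967.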
### 6. Whale Detector
Scan the most recent block for large ETH transfers with USD values.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
whales --min-eth 1.0
```
Note: scans the latest block only — point-in-time snapshot, not historical.
Default threshold is 1.0 ETH (lower than Solana's default since ETH
values are higher).
### 7. Network Stats
Live Base network health: latest block, chain ID, gas price, base fee,
block utilization, transaction count, and ETH price.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py stats
```
### 8. Price Lookup
Quick price check for any token by contract address or known symbol.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price ETH
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price USDC
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price AERO
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price DEGEN
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913
```
Known symbols: ETH, WETH, USDC, cbETH, AERO, DEGEN, TOSHI, BRETT,
WELL, wstETH, rETH, cbBTC.
---
## Pitfalls
- **CoinGecko rate-limits** — free tier allows ~10-30 requests/minute.
Price lookups use 1 request per token. Use `--no-prices` for speed.
- **Public RPC rate-limits** — Base's public RPC limits requests.
For production use, set BASE_RPC_URL to a private endpoint
(Alchemy, QuickNode, Infura).
- **Wallet shows known tokens only** — unlike Solana, EVM chains have no
built-in "get all tokens" RPC. The wallet command checks ~15 popular
Base tokens via `balanceOf`. Unknown ERC-20s won't appear. Use the
`token` command for any specific contract.
- **Token names read from contract** — if a contract doesn't implement
`name()` or `symbol()`, these fields may be empty. Known tokens have
hardcoded labels as fallback.
- **Gas estimates are L2 only** — Base transaction costs include an L1
data posting fee (depends on calldata size and L1 gas prices). The gas
command estimates L2 execution cost only.
- **Whale detector scans latest block only** — not historical. Results
vary by the moment you query. Default threshold is 1.0 ETH.
- **Proxy detection** — only EIP-1967 proxies are detected. Other proxy
patterns (EIP-1167 minimal proxy, custom storage slots) are not checked.
- **Retry on 429** — both RPC and CoinGecko calls retry up to 2 times
with exponential backoff on rate-limit errors.
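The retry behaviour in the last pitfall can be sketched as follows. This is an illustration of retry-on-429 with exponential backoff using only the standard library, not the helper script's actual code; the function names and the 1-second base delay are assumptions.

```python
import time
import urllib.error
import urllib.request

def backoff_delays(max_retries: int = 2, base_delay: float = 1.0) -> list:
    """Delays before each retry: base_delay, 2*base_delay, 4*base_delay, ..."""
    return [base_delay * 2 ** attempt for attempt in range(max_retries)]

def fetch_with_retry(url: str, max_retries: int = 2, base_delay: float = 1.0) -> bytes:
    """GET `url`, retrying only on HTTP 429 and re-raising anything else."""
    delays = backoff_delays(max_retries, base_delay)
    for attempt in range(max_retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            # Give up on non-rate-limit errors or once retries are exhausted.
            if err.code != 429 or attempt == max_retries:
                raise
            time.sleep(delays[attempt])
    raise RuntimeError("retry loop exited unexpectedly")
```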
---
## Verification
```bash
# Should print Base chain ID (8453), latest block, gas price, and ETH price
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py stats
```
File diff suppressed because it is too large
@@ -0,0 +1,116 @@
---
name: blender-mcp
description: Control Blender directly from Hermes via socket connection to the blender-mcp addon. Create 3D objects, materials, animations, and run arbitrary Blender Python (bpy) code. Use when user wants to create or modify anything in Blender.
version: 1.0.0
requires: Blender 4.3+ (desktop instance required, headless not supported)
author: alireza78a
tags: [blender, 3d, animation, modeling, bpy, mcp]
---
# Blender MCP
Control a running Blender instance from Hermes via socket on TCP port 9876.
## Setup (one-time)
### 1. Install the Blender addon
curl -sL https://raw.githubusercontent.com/ahujasid/blender-mcp/main/addon.py -o ~/Desktop/blender_mcp_addon.py
In Blender:
Edit > Preferences > Add-ons > Install > select blender_mcp_addon.py
Enable "Interface: Blender MCP"
### 2. Start the socket server in Blender
Press N in the Blender viewport to open the sidebar.
Find "BlenderMCP" tab and click "Start Server".
### 3. Verify connection
nc -z -w2 localhost 9876 && echo "OPEN" || echo "CLOSED"
## Protocol
Plain UTF-8 JSON over TCP -- no length prefix.
Send: {"type": "<command>", "params": {<kwargs>}}
Receive: {"status": "success", "result": <value>}
{"status": "error", "message": "<reason>"}
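Because there is no length prefix, a client can only tell that a reply is complete by checking whether the bytes received so far parse as JSON. The sketch below illustrates that framing rule and the success/error envelope; the helper names are hypothetical.

```python
import json

def encode_command(cmd_type: str, **params) -> bytes:
    """Serialize a command in the {"type": ..., "params": {...}} shape."""
    return json.dumps({"type": cmd_type, "params": params}).encode("utf-8")

def try_decode(buffer: bytes):
    """Return the parsed reply once `buffer` holds complete JSON, else None."""
    try:
        return json.loads(buffer.decode("utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        return None  # incomplete so far; keep reading from the socket

def unwrap(reply: dict):
    """Return the result of a success reply, or raise with the error message."""
    if reply.get("status") == "success":
        return reply.get("result")
    raise RuntimeError(reply.get("message", "unknown error"))

# Simulated reply arriving in two chunks.
reply_bytes = b'{"status": "success", "result": 42}'
print(try_decode(reply_bytes[:10]), unwrap(try_decode(reply_bytes)))
```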
## Available Commands
| type | params | description |
|-------------------------|-------------------|---------------------------------|
| execute_code | code (str) | Run arbitrary bpy Python code |
| get_scene_info | (none) | List all objects in scene |
| get_object_info | object_name (str) | Details on a specific object |
| get_viewport_screenshot | (none) | Screenshot of current viewport |
## Python Helper
Use this inside execute_code tool calls:
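import json
import socket

def blender_exec(code: str, host="localhost", port=9876, timeout=15):
    """Send one execute_code command to the addon and return the parsed reply."""
    payload = json.dumps({"type": "execute_code", "params": {"code": code}})
    # create_connection applies the timeout to connect() as well as recv()
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(payload.encode("utf-8"))
        buf = b""
        while True:
            try:
                chunk = s.recv(4096)
            except socket.timeout:
                break
            if not chunk:
                break
            buf += chunk
            try:
                # No length prefix: the reply is complete once it parses as JSON
                return json.loads(buf.decode("utf-8"))
            except (json.JSONDecodeError, UnicodeDecodeError):
                continue
    # Timed out or connection closed before a full reply arrived
    raise ConnectionError("no complete JSON reply received from Blender")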
## Common bpy Patterns
### Clear scene
bpy.ops.object.select_all(action='SELECT')
bpy.ops.object.delete()
### Add mesh objects
bpy.ops.mesh.primitive_uv_sphere_add(radius=1, location=(0, 0, 0))
bpy.ops.mesh.primitive_cube_add(size=2, location=(3, 0, 0))
bpy.ops.mesh.primitive_cylinder_add(radius=0.5, depth=2, location=(-3, 0, 0))
### Create and assign material
mat = bpy.data.materials.new(name="MyMat")
mat.use_nodes = True
bsdf = mat.node_tree.nodes.get("Principled BSDF")
bsdf.inputs["Base Color"].default_value = (R, G, B, 1.0)
bsdf.inputs["Roughness"].default_value = 0.3
bsdf.inputs["Metallic"].default_value = 0.0
obj.data.materials.append(mat)
### Keyframe animation
obj.location = (0, 0, 0)
obj.keyframe_insert(data_path="location", frame=1)
obj.location = (0, 0, 3)
obj.keyframe_insert(data_path="location", frame=60)
### Render to file
bpy.context.scene.render.filepath = "/tmp/render.png"
bpy.context.scene.render.engine = 'CYCLES'
bpy.ops.render.render(write_still=True)
## Pitfalls
- Must check socket is open before running (nc -z localhost 9876)
- Addon server must be started inside Blender each session (N-panel > BlenderMCP > Connect)
- Break complex scenes into multiple smaller execute_code calls to avoid timeouts
- Render output path must be absolute (/tmp/...) not relative
- shade_smooth() requires object to be selected and in object mode
@@ -0,0 +1,422 @@
---
name: oss-forensics
description: |
Supply chain investigation, evidence recovery, and forensic analysis for GitHub repositories.
Covers deleted commit recovery, force-push detection, IOC extraction, multi-source evidence
collection, hypothesis formation/validation, and structured forensic reporting.
Inspired by RAPTOR's 1800+ line OSS Forensics system.
category: security
triggers:
- "investigate this repository"
- "investigate [owner/repo]"
- "check for supply chain compromise"
- "recover deleted commits"
- "forensic analysis of [owner/repo]"
- "was this repo compromised"
- "supply chain attack"
- "suspicious commit"
- "force push detected"
- "IOC extraction"
toolsets:
- terminal
- web
- file
- delegation
---
# OSS Security Forensics Skill
A 7-phase multi-agent investigation framework for researching open-source supply chain attacks.
Adapted from RAPTOR's forensics system. Covers GitHub Archive, Wayback Machine, GitHub API,
local git analysis, IOC extraction, evidence-backed hypothesis formation and validation,
and final forensic report generation.
---
## ⚠️ Anti-Hallucination Guardrails
Read these before every investigation step. Violating them invalidates the report.
1. **Evidence-First Rule**: Every claim in any report, hypothesis, or summary MUST cite at least one evidence ID (`EV-XXXX`). Assertions without citations are forbidden.
2. **STAY IN YOUR LANE**: Each sub-agent (investigator) has a single data source. Do NOT mix sources. The GH Archive investigator does not query the GitHub API, and vice versa. Role boundaries are hard.
3. **Fact vs. Hypothesis Separation**: Mark all unverified inferences with `[HYPOTHESIS]`. Only statements verified against original sources may be stated as facts.
4. **No Evidence Fabrication**: The hypothesis validator MUST mechanically check that every cited evidence ID actually exists in the evidence store before accepting a hypothesis.
5. **Proof-Required Disproval**: A hypothesis cannot be dismissed without a specific, evidence-backed counter-argument. "No evidence found" is not sufficient to disprove—it only makes a hypothesis inconclusive.
6. **SHA/URL Double-Verification**: Any commit SHA, URL, or external identifier cited as evidence must be independently confirmed from at least two sources before being marked as verified.
7. **Suspicious Code Rule**: Never run code found inside the investigated repository locally. Analyze statically only, or use `execute_code` in a sandboxed environment.
8. **Secret Redaction**: Any API keys, tokens, or credentials discovered during investigation must be redacted in the final report. Log them internally only.
---
## Example Scenarios
- **Scenario A: Dependency Confusion**: A malicious package `internal-lib-v2` is uploaded to NPM with a higher version than the internal one. The investigator must track when this package was first seen and if any PushEvents in the target repo updated `package.json` to this version.
- **Scenario B: Maintainer Takeover**: A long-term contributor's account is used to push a backdoored `.github/workflows/build.yml`. The investigator looks for PushEvents from this user after a long period of inactivity or from a new IP/location (if detectable via BigQuery).
- **Scenario C: Force-Push Hide**: A developer accidentally commits a production secret, then force-pushes to "fix" it. The investigator uses `git fsck` and GH Archive to recover the original commit SHA and verify what was leaked.
---
> **Path convention**: Throughout this skill, `SKILL_DIR` refers to the root of this skill's
> installation directory (the folder containing this `SKILL.md`). When the skill is loaded,
> resolve `SKILL_DIR` to the actual path — e.g. `~/.hermes/skills/security/oss-forensics/`
> or the `optional-skills/` equivalent. All script and template references are relative to it.
## Phase 0: Initialization
1. Create investigation working directory:
```bash
mkdir investigation_$(echo "REPO_NAME" | tr '/' '_')
cd investigation_$(echo "REPO_NAME" | tr '/' '_')
```
2. Initialize the evidence store:
```bash
python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json list
```
3. Copy the forensic report template:
```bash
cp SKILL_DIR/templates/forensic-report.md ./investigation-report.md
```
4. Create an `iocs.md` file to track Indicators of Compromise as they are discovered.
5. Record the investigation start time, target repository, and stated investigation goal.
---
## Phase 1: Prompt Parsing and IOC Extraction
**Goal**: Extract all structured investigative targets from the user's request.
**Actions**:
- Parse the user prompt and extract:
- Target repository (`owner/repo`)
- Target actors (GitHub handles, email addresses)
- Time window of interest (commit date ranges, PR timestamps)
- Provided Indicators of Compromise: commit SHAs, file paths, package names, IP addresses, domains, API keys/tokens, malicious URLs
- Any linked vendor security reports or blog posts
**Tools**: Reasoning only, or `execute_code` for regex extraction from large text blocks.
**Output**: Populate `iocs.md` with extracted IOCs. Each IOC must have:
- Type (from: COMMIT_SHA, FILE_PATH, API_KEY, SECRET, IP_ADDRESS, DOMAIN, PACKAGE_NAME, ACTOR_USERNAME, MALICIOUS_URL, OTHER)
- Value
- Source (user-provided, inferred)
**Reference**: See [evidence-types.md](./references/evidence-types.md) for IOC taxonomy.
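For large text blocks, regex extraction might look like the sketch below. The patterns are deliberately simple and will produce false positives; they, the function name, and the record shape (`type`, `value`, `source`) are illustrative assumptions rather than part of the skill's scripts.

```python
import re

# Candidate patterns only; every hit still needs human or cross-source review.
IOC_PATTERNS = {
    "COMMIT_SHA": re.compile(r"\b[0-9a-f]{40}\b"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    # Any URL is only a candidate until shown to be malicious.
    "MALICIOUS_URL": re.compile(r"https?://[^\s\"'<>)]+"),
    "API_KEY": re.compile(r"\b(?:ghp_[A-Za-z0-9]{36}|AKIA[0-9A-Z]{16})\b"),
}

def extract_iocs(text: str, source: str = "user-provided") -> list:
    """Return de-duplicated IOC records with the three required fields."""
    found = []
    for ioc_type, pattern in IOC_PATTERNS.items():
        for value in sorted(set(pattern.findall(text))):
            found.append({"type": ioc_type, "value": value, "source": source})
    return found

sample = "See commit " + "a" * 40 + " contacting 192.0.2.1 via https://evil.example.com/payload.sh"
for ioc in extract_iocs(sample):
    print(ioc)
```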
---
## Phase 2: Parallel Evidence Collection
Spawn up to 5 specialist investigator sub-agents using `delegate_task` (batch mode, max 3 concurrent). Each investigator has a **single data source** and must not mix sources.
> **Orchestrator note**: Pass the IOC list from Phase 1 and the investigation time window in the `context` field of each delegated task.
---
### Investigator 1: Local Git Investigator
**ROLE BOUNDARY**: You query the LOCAL GIT REPOSITORY ONLY. Do not call any external APIs.
**Actions**:
```bash
# Clone repository
git clone https://github.com/OWNER/REPO.git target_repo && cd target_repo
# Full commit log with stats
git log --all --full-history --stat --format="%H|%ae|%an|%ai|%s" > ../git_log.txt
# Detect force-push evidence (orphaned/dangling commits)
git fsck --lost-found --unreachable 2>&1 | grep commit > ../dangling_commits.txt
# Check reflog for rewritten history
git reflog --all > ../reflog.txt
# List ALL branches including deleted remote refs
git branch -a -v > ../branches.txt
# Find suspicious large binary additions
git log --all --diff-filter=A --name-only --format="%H %ai" -- "*.so" "*.dll" "*.exe" "*.bin" > ../binary_additions.txt
# Check for GPG signature anomalies
git log --show-signature --format="%H %ai %aN" > ../signature_check.txt 2>&1
```
**Evidence to collect** (add via `python3 SKILL_DIR/scripts/evidence-store.py add`):
- Each dangling commit SHA → type: `git`
- Force-push evidence (reflog showing history rewrite) → type: `git`
- Unsigned commits from verified contributors → type: `git`
- Suspicious binary file additions → type: `git`
**Reference**: See [recovery-techniques.md](./references/recovery-techniques.md) for accessing force-pushed commits.
---
### Investigator 2: GitHub API Investigator
**ROLE BOUNDARY**: You query the GITHUB REST API ONLY. Do not run git commands locally.
**Actions**:
```bash
# Commits (paginated)
curl -s "https://api.github.com/repos/OWNER/REPO/commits?per_page=100" > api_commits.json
# Pull requests, open and closed (deleted PRs are not returned; compare against archive sources)
curl -s "https://api.github.com/repos/OWNER/REPO/pulls?state=all&per_page=100" > api_prs.json
# Issues
curl -s "https://api.github.com/repos/OWNER/REPO/issues?state=all&per_page=100" > api_issues.json
# Contributors and collaborator changes
curl -s "https://api.github.com/repos/OWNER/REPO/contributors" > api_contributors.json
# Repository events (the API exposes at most the 300 most recent; this fetches the first page of 100)
curl -s "https://api.github.com/repos/OWNER/REPO/events?per_page=100" > api_events.json
# Check specific suspicious commit SHA details
curl -s "https://api.github.com/repos/OWNER/REPO/git/commits/SHA" > commit_detail.json
# Releases
curl -s "https://api.github.com/repos/OWNER/REPO/releases?per_page=100" > api_releases.json
# Check if a specific commit exists (force-pushed commits may 404 on commits/ but succeed on git/commits/)
curl -s "https://api.github.com/repos/OWNER/REPO/commits/SHA" | jq .sha
```
**Cross-reference targets** (flag discrepancies as evidence):
- PR exists in archive but missing from API → evidence of deletion
- Contributor in archive events but not in contributors list → evidence of permission revocation
- Commit in archive PushEvents but not in API commit list → evidence of force-push/deletion
**Reference**: See [evidence-types.md](./references/evidence-types.md) for GH event types.
---
### Investigator 3: Wayback Machine Investigator
**ROLE BOUNDARY**: You query the WAYBACK MACHINE CDX API ONLY. Do not use the GitHub API.
**Goal**: Recover deleted GitHub pages (READMEs, issues, PRs, releases, wiki pages).
**Actions**:
```bash
# Search for archived snapshots of the repo main page
curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO&output=json&limit=100&from=YYYYMMDD&to=YYYYMMDD" > wayback_main.json
# Search for a specific deleted issue
curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO/issues/NUM&output=json&limit=50" > wayback_issue_NUM.json
# Search for a specific deleted PR
curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO/pull/NUM&output=json&limit=50" > wayback_pr_NUM.json
# Fetch the best snapshot of a page
# Use the Wayback Machine URL: https://web.archive.org/web/TIMESTAMP/ORIGINAL_URL
# Example: https://web.archive.org/web/20240101000000*/github.com/OWNER/REPO
# Advanced: Search for deleted releases/tags
curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO/releases/tag/*&output=json" > wayback_tags.json
# Advanced: Search for historical wiki changes
curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO/wiki/*&output=json" > wayback_wiki.json
```
**Evidence to collect**:
- Archived snapshots of deleted issues/PRs with their content
- Historical README versions showing changes
- Evidence of content present in archive but missing from current GitHub state
**Reference**: See [github-archive-guide.md](./references/github-archive-guide.md) for CDX API parameters.
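With `output=json`, the CDX API returns a list of rows whose first row holds the field names. A minimal sketch for turning a saved response into records and snapshot URLs follows; the function names are hypothetical and the sample row is invented for illustration.

```python
import json

def parse_cdx(cdx_json: str) -> list:
    """Turn CDX output=json (header row + data rows) into a list of dicts."""
    rows = json.loads(cdx_json) if cdx_json.strip() else []
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]

def snapshot_url(record: dict) -> str:
    """Build the replay URL for one archived capture."""
    return f"https://web.archive.org/web/{record['timestamp']}/{record['original']}"

# Invented sample in the shape the CDX API returns.
sample = json.dumps([
    ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
    ["com,github)/owner/repo", "20240101000000", "https://github.com/OWNER/REPO",
     "text/html", "200", "ABC", "1234"],
])
for record in parse_cdx(sample):
    print(snapshot_url(record))
```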
---
### Investigator 4: GH Archive / BigQuery Investigator
**ROLE BOUNDARY**: You query GITHUB ARCHIVE via BIGQUERY ONLY. This is a tamper-proof record of all public GitHub events.
> **Prerequisites**: Requires Google Cloud credentials with BigQuery access (`gcloud auth application-default login`). If unavailable, skip this investigator and note it in the report.
**Cost Optimization Rules** (MANDATORY):
1. ALWAYS run a `--dry_run` before every query to estimate cost.
2. Use `_TABLE_SUFFIX` to filter by date range and minimize scanned data.
3. Only SELECT the columns you need.
4. Add a LIMIT unless aggregating.
```bash
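# Template: safe BigQuery query for PushEvents to OWNER/REPO
# (payload is a JSON-encoded string in these tables, so its fields are read with JSON_EXTRACT functions)
bq query --use_legacy_sql=false --dry_run "
SELECT created_at, actor.login,
JSON_EXTRACT(payload, '\$.commits') AS commits,
JSON_EXTRACT_SCALAR(payload, '\$.before') AS before_sha,
JSON_EXTRACT_SCALAR(payload, '\$.head') AS head_sha,
JSON_EXTRACT_SCALAR(payload, '\$.size') AS size,
JSON_EXTRACT_SCALAR(payload, '\$.distinct_size') AS distinct_size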
FROM \`githubarchive.month.*\`
WHERE _TABLE_SUFFIX BETWEEN 'YYYYMM' AND 'YYYYMM'
AND type = 'PushEvent'
AND repo.name = 'OWNER/REPO'
LIMIT 1000
"
# If cost is acceptable, re-run without --dry_run
# Detect force-pushes: zero-distinct_size PushEvents mean commits were force-erased
# payload.distinct_size = 0 AND payload.size > 0 → force push indicator
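# Check for deleted branch events (payload fields are extracted from the JSON string)
bq query --use_legacy_sql=false "
SELECT created_at, actor.login,
JSON_EXTRACT_SCALAR(payload, '\$.ref') AS ref,
JSON_EXTRACT_SCALAR(payload, '\$.ref_type') AS ref_type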
FROM \`githubarchive.month.*\`
WHERE _TABLE_SUFFIX BETWEEN 'YYYYMM' AND 'YYYYMM'
AND type = 'DeleteEvent'
AND repo.name = 'OWNER/REPO'
LIMIT 200
"
```
**Evidence to collect**:
- Force-push events (payload.size > 0, payload.distinct_size = 0)
- DeleteEvents for branches/tags
- WorkflowRunEvents for suspicious CI/CD automation
- PushEvents that precede a "gap" in the git log (evidence of rewrite)
**Reference**: See [github-archive-guide.md](./references/github-archive-guide.md) for all 12 event types and query patterns.
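Once PushEvent rows are exported (for example as JSON), the force-push indicator stated above can be applied mechanically. The sketch below assumes each row is a dict with `created_at`, `size`, and `distinct_size` keys, possibly as strings; those key names and the function names are assumptions about the export, and a flagged row is only a lead that still needs corroboration.

```python
def is_force_push_indicator(event: dict) -> bool:
    """Apply the heuristic: size > 0 and distinct_size == 0."""
    size = int(event.get("size") or 0)
    distinct_size = int(event.get("distinct_size") or 0)
    return size > 0 and distinct_size == 0

def flag_events(events: list) -> list:
    """Return flagged events in chronological order."""
    flagged = [e for e in events if is_force_push_indicator(e)]
    return sorted(flagged, key=lambda e: e["created_at"])

# Invented rows for illustration.
events = [
    {"created_at": "2024-01-03T10:00:00Z", "size": "2", "distinct_size": "0"},
    {"created_at": "2024-01-01T09:00:00Z", "size": "1", "distinct_size": "1"},
    {"created_at": "2024-01-02T08:00:00Z", "size": "3", "distinct_size": "0"},
]
for event in flag_events(events):
    print(event["created_at"])
```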
---
### Investigator 5: IOC Enrichment Investigator
**ROLE BOUNDARY**: You enrich EXISTING IOCs from Phase 1 using passive public sources ONLY. Do not execute any code from the target repository.
**Actions**:
- For each commit SHA: attempt recovery via direct GitHub URL (`github.com/OWNER/REPO/commit/SHA.patch`)
- For each domain/IP: check passive DNS, WHOIS records (via `web_extract` on public WHOIS services)
- For each package name: check npm/PyPI for matching malicious package reports
- For each actor username: check GitHub profile, contribution history, account age
- Recover force-pushed commits using 3 methods (see [recovery-techniques.md](./references/recovery-techniques.md))
---
## Phase 3: Evidence Consolidation
After all investigators complete:
1. Run `python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json list` to see all collected evidence.
2. For each piece of evidence, verify the `content_sha256` hash matches the original source.
3. Group evidence by:
- **Timeline**: Sort all timestamped evidence chronologically
- **Actor**: Group by GitHub handle or email
- **IOC**: Link evidence to the IOC it relates to
4. Identify **discrepancies**: items present in one source but absent in another (key deletion indicators).
5. Flag evidence as `[VERIFIED]` (confirmed from 2+ independent sources) or `[UNVERIFIED]` (single source only).
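The consolidation steps above can be sketched roughly as follows. Only `content_sha256` comes from this skill; the other record fields (`id`, `actor`, `timestamp`, `sources`) and the function names are assumptions about the evidence store's shape, so adapt them to what `evidence-store.py` actually writes.

```python
import hashlib
from collections import defaultdict

def content_matches(record: dict, original: bytes) -> bool:
    """True if the stored hash equals the SHA-256 of the re-fetched source bytes."""
    return hashlib.sha256(original).hexdigest() == record["content_sha256"]

def verification_label(record: dict) -> str:
    """[VERIFIED] requires two or more distinct sources."""
    return "[VERIFIED]" if len(set(record.get("sources", []))) >= 2 else "[UNVERIFIED]"

def group_ids(records: list, key: str) -> dict:
    """Group evidence IDs by `key` (e.g. "actor"), ordered chronologically."""
    groups = defaultdict(list)
    for record in sorted(records, key=lambda r: r.get("timestamp", "")):
        groups[record.get(key, "unknown")].append(record["id"])
    return dict(groups)

# Invented records for illustration.
records = [
    {"id": "EV-0002", "actor": "alice", "timestamp": "2024-01-02T00:00:00Z",
     "sources": ["gh_api"], "content_sha256": hashlib.sha256(b"b").hexdigest()},
    {"id": "EV-0001", "actor": "alice", "timestamp": "2024-01-01T00:00:00Z",
     "sources": ["gh_api", "gh_archive"], "content_sha256": hashlib.sha256(b"a").hexdigest()},
]
print(group_ids(records, "actor"))
```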
---
## Phase 4: Hypothesis Formation
A hypothesis must:
- State a specific claim (e.g., "Actor X force-pushed to BRANCH on DATE to erase commit SHA")
- Cite at least 2 evidence IDs that support it (`EV-XXXX`, `EV-YYYY`)
- Identify what evidence would disprove it
- Be labeled `[HYPOTHESIS]` until validated
**Common hypothesis templates** (see [investigation-templates.md](./references/investigation-templates.md)):
- Maintainer Compromise: legitimate account used post-takeover to inject malicious code
- Dependency Confusion: package name squatting to intercept installs
- CI/CD Injection: malicious workflow changes to run code during builds
- Typosquatting: near-identical package name targeting misspellers
- Credential Leak: token/key accidentally committed then force-pushed to erase
For each hypothesis, spawn a `delegate_task` sub-agent to attempt to find disconfirming evidence before confirming.
---
## Phase 5: Hypothesis Validation
The validator sub-agent MUST mechanically check:
1. For each hypothesis, extract all cited evidence IDs.
2. Verify each ID exists in `evidence.json` (hard failure if any ID is missing → hypothesis rejected as potentially fabricated).
3. Verify each `[VERIFIED]` piece of evidence was confirmed from 2+ sources.
4. Check logical consistency: does the timeline depicted by the evidence support the hypothesis?
5. Check for alternative explanations: could the same evidence pattern arise from a benign cause?
**Output**:
- `VALIDATED`: All evidence cited, verified, logically consistent, no plausible alternative explanation.
- `INCONCLUSIVE`: Evidence supports hypothesis but alternative explanations exist or evidence is insufficient.
- `REJECTED`: Missing evidence IDs, unverified evidence cited as fact, logical inconsistency detected.
Rejected hypotheses feed back into Phase 4 for refinement (max 3 iterations).
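The mechanical part of validation, checking that every cited ID exists, is easy to automate. The sketch below assumes IDs look like `EV-0001` (four digits) and that the set of stored IDs has already been loaded from `evidence.json`; the function name and the `PASSES_ID_CHECK` status are hypothetical, and the remaining checks (source verification, timeline, alternative explanations) still need reasoning.

```python
import re

EVIDENCE_ID = re.compile(r"\bEV-\d{4}\b")  # assumed ID format

def precheck_hypothesis(text: str, store_ids: set) -> tuple:
    """Return (status, missing_ids) for the citation checks only."""
    cited = set(EVIDENCE_ID.findall(text))
    missing = sorted(cited - set(store_ids))
    # Reject if any cited ID is absent, or fewer than two IDs are cited (Phase 4 rule).
    if missing or len(cited) < 2:
        return "REJECTED", missing
    return "PASSES_ID_CHECK", missing

print(precheck_hypothesis("Actor X force-pushed [EV-0001] [EV-0009]", {"EV-0001"}))
```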
---
## Phase 6: Final Report Generation
Populate `investigation-report.md` using the template in [forensic-report.md](./templates/forensic-report.md).
**Mandatory sections**:
- Executive Summary: one-paragraph verdict (Compromised / Clean / Inconclusive) with confidence level
- Timeline: chronological reconstruction of all significant events with evidence citations
- Validated Hypotheses: each with status and supporting evidence IDs
- Evidence Registry: table of all `EV-XXXX` entries with source, type, and verification status
- IOC List: all extracted and enriched Indicators of Compromise
- Chain of Custody: how evidence was collected, from what sources, at what timestamps
- Recommendations: immediate mitigations if compromise detected; monitoring recommendations
**Report rules**:
- Every factual claim must have at least one `[EV-XXXX]` citation
- Executive Summary must state confidence level (High / Medium / Low)
- All secrets/credentials must be redacted to `[REDACTED]`
---
## Phase 7: Completion
1. Run final evidence count: `python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json list`
2. Archive the full investigation directory.
3. If compromise is confirmed:
- List immediate mitigations (rotate credentials, pin dependency hashes, notify affected users)
- Identify affected versions/packages
- Note disclosure obligations (if a public package: coordinate with the package registry)
4. Present the final `investigation-report.md` to the user.
---
## Ethical Use Guidelines
This skill is designed for **defensive security investigation** — protecting open-source software from supply chain attacks. It must not be used for:
- **Harassment or stalking** of contributors or maintainers
- **Doxing** — correlating GitHub activity to real identities for malicious purposes
- **Competitive intelligence** — investigating proprietary or internal repositories without authorization
- **False accusations** — publishing investigation results without validated evidence (see anti-hallucination guardrails)
Investigations should be conducted with the principle of **minimal intrusion**: collect only the evidence necessary to validate or refute the hypothesis. When publishing results, follow responsible disclosure practices and coordinate with affected maintainers before public disclosure.
If the investigation reveals a genuine compromise, follow the coordinated vulnerability disclosure process:
1. Notify the repository maintainers privately first
2. Allow reasonable time for remediation (typically 90 days)
3. Coordinate with package registries (npm, PyPI, etc.) if published packages are affected
4. File a CVE if appropriate
---
## API Rate Limiting
GitHub REST API enforces rate limits that will interrupt large investigations if not managed.
**Authenticated requests**: 5,000/hour (requires `GITHUB_TOKEN` env var or `gh` CLI auth)
**Unauthenticated requests**: 60/hour (unusable for investigations)
**Best practices**:
- Always authenticate: `export GITHUB_TOKEN=ghp_...` or use `gh` CLI (auto-authenticates)
- Use conditional requests (`If-None-Match` / `If-Modified-Since` headers) to avoid consuming quota on unchanged data
- For paginated endpoints, fetch all pages in sequence — don't parallelize against the same endpoint
- Check the `X-RateLimit-Remaining` header; if it drops below 100, pause until the time given by `X-RateLimit-Reset` (a UTC epoch timestamp)
- BigQuery has its own quotas and billing (the on-demand free tier covers the first 1 TiB of query data processed per month); always dry-run first
- Wayback Machine CDX API: no formal rate limit, but be courteous (1-2 req/sec max)
If rate-limited mid-investigation, record the partial results in the evidence store and note the limitation in the report.
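Two of the practices above, pausing on low quota and walking pages in sequence, reduce to small header-parsing helpers. This is a sketch with hypothetical function names; it assumes headers are supplied as a plain dict with the exact capitalisation shown, which real HTTP clients may not guarantee.

```python
import re
import time
from typing import Optional

def seconds_until_reset(headers: dict, threshold: int = 100,
                        now: Optional[float] = None) -> float:
    """Return how long to pause based on X-RateLimit-* headers (0 if quota is fine)."""
    remaining = int(headers.get("X-RateLimit-Remaining", threshold))
    if remaining >= threshold:
        return 0.0
    reset_epoch = int(headers.get("X-RateLimit-Reset", 0))
    now = time.time() if now is None else now
    return max(0.0, reset_epoch - now)

def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Extract the rel="next" URL from a GitHub Link header, if present."""
    if not link_header:
        return None
    match = re.search(r'<([^>]+)>;\s*rel="next"', link_header)
    return match.group(1) if match else None
```

A fetch loop would call `next_page_url` on each response's `Link` header until it returns `None`, sleeping for `seconds_until_reset` between requests when it is non-zero.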
---
## Reference Materials
- [github-archive-guide.md](./references/github-archive-guide.md) — BigQuery queries, CDX API, 12 event types
- [evidence-types.md](./references/evidence-types.md) — IOC taxonomy, evidence source types, observation types
- [recovery-techniques.md](./references/recovery-techniques.md) — Recovering deleted commits, PRs, issues
- [investigation-templates.md](./references/investigation-templates.md) — Pre-built hypothesis templates per attack type
- [evidence-store.py](./scripts/evidence-store.py) — CLI tool for managing the evidence JSON store
- [forensic-report.md](./templates/forensic-report.md) — Structured report template
@@ -0,0 +1,89 @@
# Evidence Types Reference
Taxonomy of all evidence types, IOC types, GitHub event types, and observation types
used in OSS forensic investigations.
---
## Evidence Source Types
| Type | Description | Example Sources |
|------|-------------|-----------------|
| `git` | Data from local git repository analysis | `git log`, `git fsck`, `git reflog`, `git blame` |
| `gh_api` | Data from GitHub REST API responses | `/repos/.../commits`, `/repos/.../pulls`, `/repos/.../events` |
| `gh_archive` | Data from GitHub Archive (BigQuery) | `githubarchive.month.*` BigQuery tables |
| `web_archive` | Archived web pages from Wayback Machine | CDX API results, `web.archive.org/web/...` snapshots |
| `ioc` | Indicator of Compromise from any source | Extracted from vendor reports, git history, network traces |
| `analysis` | Derived insight from cross-source correlation | "SHA present in archive but absent from API" |
| `vendor_report` | External security vendor or researcher report | CVE advisories, blog posts, NVD records |
| `manual` | Manually recorded observation by investigator | Notes on behavioral patterns, timeline gaps |
---
## IOC Types
| Type | Description | Example |
|------|-------------|---------|
| `COMMIT_SHA` | A git commit hash linked to malicious activity | `abc123def456...` |
| `FILE_PATH` | A suspicious file inside the repository | `src/utils/crypto.js`, `dist/index.min.js` |
| `API_KEY` | An API key accidentally committed | `AKIA...` (AWS), `ghp_...` (GitHub PAT) |
| `SECRET` | A generic secret / credential | Database password, private key blob |
| `IP_ADDRESS` | A C2 server or attacker IP | `192.0.2.1` |
| `DOMAIN` | A malicious or suspicious domain | `evil-cdn.io`, typosquatted package registry domain |
| `PACKAGE_NAME` | A malicious or squatted package name | `colo-rs` (typosquatting `color`), `lodash-utils` |
| `ACTOR_USERNAME` | A GitHub handle linked to the attack | `malicious-bot-account` |
| `MALICIOUS_URL` | A URL to a malicious resource | `https://evil.example.com/payload.sh` |
| `WORKFLOW_FILE` | A suspicious CI/CD workflow file | `.github/workflows/release.yml` |
| `BRANCH_NAME` | A suspicious branch | `refs/heads/temp-fix-do-not-merge` |
| `TAG_NAME` | A suspicious git tag | `v1.0.0-security-patch` |
| `RELEASE_NAME` | A suspicious release | Release with no associated tag or changelog |
| `OTHER` | Catch-all for unclassified IOCs | — |
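
A rough first-pass classifier can pre-sort raw strings into some of the IOC types above before an investigator reviews them. This is a minimal sketch under stated assumptions: the function name `guess_ioc_type` and the heuristics (40-hex SHA, the `AKIA`/`ghp_` prefixes from the table, URL and workflow-path prefixes) are illustrative, not part of the skill.

```python
import ipaddress
import re

# Heuristic patterns only; illustrative assumptions, not an exhaustive ruleset.
_SHA_RE = re.compile(r"^[0-9a-f]{40}$")
_KEY_PREFIXES = ("AKIA", "ghp_")  # prefixes taken from the API_KEY row above


def guess_ioc_type(value: str) -> str:
    """Return a best-guess IOC type name for a raw string, or OTHER."""
    text = value.strip()
    if _SHA_RE.match(text.lower()):
        return "COMMIT_SHA"
    if text.startswith(_KEY_PREFIXES):
        return "API_KEY"
    try:
        ipaddress.ip_address(text)
        return "IP_ADDRESS"
    except ValueError:
        pass  # not an IP literal; keep checking
    if text.startswith(("http://", "https://")):
        return "MALICIOUS_URL"
    if text.startswith(".github/workflows/"):
        return "WORKFLOW_FILE"
    return "OTHER"
```

Anything that falls through to `OTHER` (domains, package names, branch names) still needs manual typing; the guess is a starting point, not a verdict.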
---
## GitHub Archive Event Types (12 Types)
| Event Type | Forensic Relevance |
|------------|-------------------|
| `PushEvent` | Core: `payload.distinct_size=0` with `payload.size>0` → force push. `payload.before`/`payload.head` shows rewritten history. |
| `PullRequestEvent` | Detects deleted PRs, rapid open→close patterns, PRs from new accounts |
| `IssuesEvent` | Detects deleted issues, coordinated labeling, rapid closure of vulnerability reports |
| `IssueCommentEvent` | Deleted comments, rapid activity bursts |
| `WatchEvent` | Star-farming campaigns (coordinated starring from new accounts) |
| `ForkEvent` | Unusual fork patterns before malicious commit |
| `CreateEvent` | Branch/tag creation: signals new release or code injection point |
| `DeleteEvent` | Branch/tag deletion: critical — often used to hide traces |
| `ReleaseEvent` | Unauthorized releases, release artifacts modified post-publish |
| `MemberEvent` | Collaborator added/removed: maintainer compromise indicator |
| `PublicEvent` | Repository made public (sometimes to drop malicious code briefly) |
| `WorkflowRunEvent` | CI/CD pipeline executions: workflow injection, secret exfiltration |
---
## Evidence Verification States
| State | Meaning |
|-------|---------|
| `unverified` | Collected from a single source, not cross-referenced |
| `single_source` | The primary source has been confirmed directly (e.g., SHA resolves on GitHub), but no second source |
| `multi_source_verified` | Confirmed from 2+ independent sources (e.g., GH Archive AND GitHub API both show the same event) |
Only `multi_source_verified` evidence may be cited as fact in validated hypotheses.
`unverified` and `single_source` evidence must be labeled `[UNVERIFIED]` or `[SINGLE-SOURCE]`.
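
The labeling rule can be applied mechanically when rendering citations. A minimal sketch, assuming a hypothetical helper named `cite` that takes an evidence ID and its verification state:

```python
# Labels required by the rule above; multi-source evidence needs no label.
LABELS = {
    "unverified": "[UNVERIFIED]",
    "single_source": "[SINGLE-SOURCE]",
    "multi_source_verified": "",
}


def cite(evidence_id: str, verification: str) -> str:
    """Format a citation such as '[EV-0001] [SINGLE-SOURCE]'."""
    if verification not in LABELS:
        raise ValueError(f"unknown verification state: {verification!r}")
    reference = f"[{evidence_id}]"
    label = LABELS[verification]
    return f"{reference} {label}" if label else reference
```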
---
## Observation Types (Patterned after RAPTOR)
| Type | Description |
|------|-------------|
| `CommitObservation` | Specific commit SHA with metadata (author, date, files changed) |
| `ForceWashObservation` | Evidence that commits were force-erased from a branch |
| `DanglingCommitObservation` | SHA present in git object store but unreachable from any ref |
| `IssueObservation` | A GitHub issue (current or archived) with title, body, timestamp |
| `PRObservation` | A GitHub PR (current or archived) with diff summary, reviewers |
| `IOC` | A single Indicator of Compromise with context |
| `TimelineGap` | A period with unusual absence of expected activity |
| `ActorAnomalyObservation` | Behavioral anomaly for a specific GitHub actor |
| `WorkflowAnomalyObservation` | Suspicious CI/CD workflow change or unexpected run |
| `CrossSourceDiscrepancy` | Item present in one source but absent in another (strong deletion indicator) |
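
A `CrossSourceDiscrepancy` is essentially a set difference between two sources. The sketch below assumes commit SHAs have already been collected from GH Archive and from the GitHub API; the dataclass fields and the function name are hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrossSourceDiscrepancy:
    sha: str
    present_in: str   # evidence type where the SHA was seen
    absent_from: str  # evidence type where it is missing


def find_discrepancies(archive_shas, api_shas):
    """Return SHAs recorded in the archive but no longer visible via the API."""
    missing = sorted(set(archive_shas) - set(api_shas))
    return [CrossSourceDiscrepancy(sha, "gh_archive", "gh_api") for sha in missing]
```

Each returned item is a candidate for recovery and should be recorded as `analysis` evidence until a second source confirms it.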
@@ -0,0 +1,184 @@
# GitHub Archive Query Guide (BigQuery)
GitHub Archive records every public GitHub event as an immutable JSON record. The data is queryable through Google BigQuery and is the most reliable source for forensic investigation, because recorded events cannot be deleted or modified afterward.
## Public Dataset
- **Project**: `githubarchive`
- **Tables**: `day.YYYYMMDD`, `month.YYYYMM`, `year.YYYY`
- **Cost**: $6.25 per TiB scanned. Always run dry runs first.
- **Access**: Requires a Google Cloud account with BigQuery enabled. Free tier includes 1 TiB/month of queries.
---
## The 12 GitHub Event Types
| Event Type | What It Records | Forensic Value |
|------------|-----------------|----------------|
| `PushEvent` | Commits pushed to a branch | Force-push detection, commit timeline, author attribution |
| `PullRequestEvent` | PR opened, closed, merged, reopened | Deleted PR recovery, review timeline |
| `IssuesEvent` | Issue opened, closed, reopened, labeled | Deleted issue recovery, social engineering traces |
| `IssueCommentEvent` | Comments on issues and PRs | Deleted comment recovery, communication patterns |
| `CreateEvent` | Branch, tag, or repository creation | Suspicious branch creation, tag timing |
| `DeleteEvent` | Branch or tag deletion | Evidence of cleanup after compromise |
| `MemberEvent` | Collaborator added or removed | Permission changes, access escalation |
| `PublicEvent` | Repository made public | Accidental exposure of private repos |
| `WatchEvent` | User stars a repository | Actor reconnaissance patterns |
| `ForkEvent` | Repository forked | Exfiltration of code before cleanup |
| `ReleaseEvent` | Release published, edited, deleted | Malicious release injection, deleted release recovery |
| `WorkflowRunEvent` | GitHub Actions workflow triggered | CI/CD abuse, unauthorized workflow runs |
---
## Query Templates
### Basic: All Events for a Repository
```sql
SELECT
created_at,
type,
actor.login,
repo.name,
payload
FROM
`githubarchive.day.20240101` -- Adjust date
WHERE
repo.name = 'owner/repo'
AND type IN ('PushEvent', 'DeleteEvent', 'MemberEvent')
ORDER BY
created_at ASC
```
### Force-Push Detection
Force-pushes produce PushEvents where commits are overwritten. Key indicators:
- `payload.distinct_size = 0` with `payload.size > 0` → commits were erased
- `payload.before` contains the SHA before the rewrite (recoverable)
```sql
SELECT
created_at,
actor.login,
JSON_EXTRACT_SCALAR(payload, '$.before') AS before_sha,
JSON_EXTRACT_SCALAR(payload, '$.head') AS after_sha,
JSON_EXTRACT_SCALAR(payload, '$.size') AS total_commits,
JSON_EXTRACT_SCALAR(payload, '$.distinct_size') AS distinct_commits,
JSON_EXTRACT_SCALAR(payload, '$.ref') AS branch_ref
FROM
`githubarchive.month.*`
WHERE
_TABLE_SUFFIX BETWEEN '202401' AND '202403'
AND type = 'PushEvent'
AND repo.name = 'owner/repo'
AND CAST(JSON_EXTRACT_SCALAR(payload, '$.size') AS INT64) > 0
AND CAST(JSON_EXTRACT_SCALAR(payload, '$.distinct_size') AS INT64) = 0
ORDER BY
created_at ASC
```
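
The same two indicators can be re-checked client-side on exported rows. A minimal sketch, assuming each row's `payload` arrives as a JSON string (as the `JSON_EXTRACT_SCALAR` calls imply); the function name is hypothetical.

```python
import json


def is_force_push_candidate(payload_json: str) -> bool:
    """True when a PushEvent payload has size > 0 but distinct_size == 0."""
    payload = json.loads(payload_json)
    try:
        size = int(payload["size"])
        distinct = int(payload["distinct_size"])
    except (KeyError, TypeError, ValueError):
        return False  # fields missing or malformed: cannot classify
    return size > 0 and distinct == 0
```

A positive result is only a candidate; confirm it by checking whether `payload.before` is still reachable from the branch.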
### Deleted Branch/Tag Detection
```sql
SELECT
created_at,
actor.login,
JSON_EXTRACT_SCALAR(payload, '$.ref') AS deleted_ref,
JSON_EXTRACT_SCALAR(payload, '$.ref_type') AS ref_type
FROM
`githubarchive.month.*`
WHERE
_TABLE_SUFFIX BETWEEN '202401' AND '202403'
AND type = 'DeleteEvent'
AND repo.name = 'owner/repo'
ORDER BY
created_at ASC
```
### Collaborator Permission Changes
```sql
SELECT
created_at,
actor.login,
JSON_EXTRACT_SCALAR(payload, '$.action') AS action,
JSON_EXTRACT_SCALAR(payload, '$.member.login') AS member
FROM
`githubarchive.month.*`
WHERE
_TABLE_SUFFIX BETWEEN '202401' AND '202403'
AND type = 'MemberEvent'
AND repo.name = 'owner/repo'
ORDER BY
created_at ASC
```
### CI/CD Workflow Activity
```sql
SELECT
created_at,
actor.login,
JSON_EXTRACT_SCALAR(payload, '$.action') AS action,
JSON_EXTRACT_SCALAR(payload, '$.workflow_run.name') AS workflow_name,
JSON_EXTRACT_SCALAR(payload, '$.workflow_run.conclusion') AS conclusion,
JSON_EXTRACT_SCALAR(payload, '$.workflow_run.head_sha') AS head_sha
FROM
`githubarchive.month.*`
WHERE
_TABLE_SUFFIX BETWEEN '202401' AND '202403'
AND type = 'WorkflowRunEvent'
AND repo.name = 'owner/repo'
ORDER BY
created_at ASC
```
### Actor Activity Profiling
```sql
SELECT
type,
COUNT(*) AS event_count,
MIN(created_at) AS first_event,
MAX(created_at) AS last_event
FROM
`githubarchive.month.*`
WHERE
_TABLE_SUFFIX BETWEEN '202301' AND '202412'
AND actor.login = 'suspicious-username'
GROUP BY type
ORDER BY event_count DESC
```
---
## Cost Optimization (MANDATORY)
1. **Always dry run first**: Add `--dry_run` flag to `bq query` to see estimated bytes scanned before executing.
2. **Use `_TABLE_SUFFIX`**: Narrow the date range as much as possible. `day.*` tables are cheapest for narrow windows; `month.*` for broader sweeps.
3. **Select only needed columns**: Avoid `SELECT *`. The `payload` column is large — only select specific JSON paths.
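4. **Add LIMIT during exploration**: `LIMIT 1000` keeps result sets manageable. Note that `LIMIT` does not by itself reduce bytes scanned unless the table is clustered, so it is not a substitute for a dry run. Remove it only for final exhaustive queries.
5. **Filter on top-level columns first**: Put predicates on `type`, `repo.name`, and `actor.login` ahead of payload extraction. BigQuery has no indexes; bytes scanned depend on which columns are referenced and which tables match, so referencing `payload` is usually what makes a query expensive.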
**Cost estimation**: A single month of GH Archive data is ~1-2 TiB uncompressed. Querying a specific repo + event type with `_TABLE_SUFFIX` typically scans 1-10 GiB ($0.006-$0.06).
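
The arithmetic behind these figures, plus a dry run from Python, can be sketched as follows. The helper names are hypothetical; `dry_run_bytes` assumes the `google-cloud-bigquery` package and working credentials, and imports the library lazily so the cost helper runs without it.

```python
TIB = 1024 ** 4  # bytes in one tebibyte


def estimate_cost_usd(bytes_scanned: int, usd_per_tib: float = 6.25) -> float:
    """Convert bytes scanned into an on-demand cost estimate."""
    return bytes_scanned / TIB * usd_per_tib


def dry_run_bytes(sql: str) -> int:
    """Ask BigQuery how many bytes a query would scan, without running it."""
    from google.cloud import bigquery  # lazy import: only needed for real dry runs

    client = bigquery.Client()
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed
```

For example, `estimate_cost_usd(dry_run_bytes(sql))` gives a figure to check against a budget before executing `sql`.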
---
## Accessing via Hermes
**Option A: BigQuery CLI** (if `gcloud` is installed)
```bash
bq query --use_legacy_sql=false --format=json "YOUR QUERY"
```
**Option B: Python** (via `execute_code`)
```python
from google.cloud import bigquery
client = bigquery.Client()
query = "YOUR QUERY"
results = client.query(query).result()
for row in results:
    print(dict(row))
```
**Option C: No GCP credentials available**
If BigQuery is unavailable, document this limitation in the report. Use the other 4 investigators (Git, GitHub API, Wayback Machine, IOC Enrichment) — they cover most investigation needs without BigQuery.
@@ -0,0 +1,131 @@
# Investigation Templates
Pre-built hypothesis and investigation templates for common supply chain attack scenarios.
Each template includes: attack pattern, key evidence to collect, and hypothesis starters.
---
## Template 1: Maintainer Account Compromise
**Pattern**: Attacker gains access to a legitimate maintainer account (phishing, credential stuffing)
and uses it to push malicious code, create backdoored releases, or exfiltrate CI secrets.
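**Real-world examples**: eslint-scope (2018) and ua-parser-js (2021), where maintainer registry accounts were hijacked. Related cases in which an attacker obtained maintainer-level access by other means: XZ Utils (2024), Codecov (2021), event-stream (2018).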
**Key Evidence to Collect**:
- [ ] Push events from maintainer account outside normal working hours/timezone
- [ ] Commits adding new dependencies, obfuscated code, or modified build scripts
- [ ] Release creation immediately after suspicious push (to maximize package distribution)
- [ ] MemberEvent adding unknown collaborators (attacker adding backup access)
- [ ] WorkflowRunEvent with unexpected secret access or exfiltration-like behavior
- [ ] Account login location changes (check social media, conference talks for corroboration)
**Hypothesis Starters**:
```
[HYPOTHESIS] Actor <HANDLE>'s account was compromised on or around <DATE>,
based on anomalous commit timing [EV-XXXX] and geographic access patterns [EV-YYYY].
```
```
[HYPOTHESIS] Release <VERSION> was published by the compromised account to push
malicious code to downstream users, evidenced by the malicious commit [EV-XXXX]
being added <N> hours before the release [EV-YYYY].
```
---
## Template 2: Malicious Dependency Injection
**Pattern**: A trusted package is modified to include malicious code in a dependency,
or a new malicious dependency is injected into an existing package.
**Key Evidence to Collect**:
- [ ] Diff of `package.json`/`requirements.txt`/`go.mod` before and after suspicious commit
- [ ] The new dependency's publication timestamp vs. the injection commit timestamp
- [ ] Whether the new dependency exists on npm/PyPI and who owns it
- [ ] Any obfuscation patterns in the injected dependency code
- [ ] Install-time scripts (`postinstall`, `setup.py`, etc.) that execute code on install
**Hypothesis Starters**:
```
[HYPOTHESIS] Commit <SHA> [EV-XXXX] introduced dependency <PACKAGE@VERSION>
which appears to be a malicious package published by actor <HANDLE> [EV-YYYY],
designed to execute <BEHAVIOR> during installation.
```
---
## Template 3: CI/CD Pipeline Injection
**Pattern**: Attacker modifies GitHub Actions workflows to steal secrets, exfiltrate code,
or inject malicious artifacts into the build output.
**Key Evidence to Collect**:
- [ ] Diff of all `.github/workflows/*.yml` files before/after suspicious period
- [ ] WorkflowRunEvents triggered by the modified workflows
- [ ] Any `curl`, `wget`, or network calls added to workflow steps
- [ ] New or modified `env:` sections referencing `secrets.*`
- [ ] Artifacts produced by modified workflow runs
**Hypothesis Starters**:
```
[HYPOTHESIS] Workflow file <FILE> was modified in commit <SHA> [EV-XXXX] to
exfiltrate repository secrets via <METHOD>, as evidenced by the added network
call pattern [EV-YYYY].
```
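
Two of the checks above (added network calls and `secrets.*` references) lend themselves to a line-by-line scan of workflow text. A minimal sketch with hypothetical names; the regular expressions are assumptions and will miss obfuscated calls:

```python
import re

# Illustrative patterns; real workflows may hide calls behind scripts or actions.
PATTERNS = {
    "network_call": re.compile(r"\b(curl|wget)\b"),
    "secret_reference": re.compile(r"\$\{\{\s*secrets\.[A-Za-z0-9_]+\s*\}\}"),
}


def scan_workflow(text: str):
    """Return (line_number, reason) pairs for suspicious workflow lines."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for reason, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, reason))
    return findings
```

Run it on the workflow file before and after the suspicious commit and compare the two result lists.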
---
## Template 4: Typosquatting / Dependency Confusion
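**Pattern**: Attacker registers a package with a name similar to a popular package to catch users who mistype it (typosquatting), or publishes a public package under an internal package's name so that misconfigured resolvers install it instead of the private one (dependency confusion).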
**Key Evidence to Collect**:
- [ ] Registration timestamp of the suspicious package on the registry
- [ ] Package content: does it contain malicious code or is it a stub?
- [ ] Download statistics for the suspicious package
- [ ] Names of internal packages that could be targeted (if private repo scope)
- [ ] Any references to the legitimate package in the malicious one's metadata
**Hypothesis Starters**:
```
[HYPOTHESIS] Package <MALICIOUS_NAME> was registered on <DATE> [EV-XXXX] to
typosquat on <LEGITIMATE_NAME>, targeting users who misspell the package name.
The package contains <BEHAVIOR> [EV-YYYY].
```
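
Name similarity can be quantified with edit distance when triaging candidate typosquats. The sketch below uses plain Levenshtein distance; the function names and the threshold of 2 are assumptions, not part of the skill.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def likely_typosquats(candidate, known_packages, max_distance=2):
    """Return known package names within max_distance edits of candidate."""
    return [
        name for name in known_packages
        if name != candidate and edit_distance(candidate, name) <= max_distance
    ]
```

A small distance only suggests confusion potential; registration date and package content still decide whether the name is malicious.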
---
## Template 5: Force-Push History Rewrite (Evidence Erasure)
**Pattern**: After a malicious commit is detected (or before wider notice), the attacker
force-pushes to remove the malicious commit from branch history.
**Detection is key** — this template focuses on proving the erasure happened.
**Key Evidence to Collect**:
- [ ] GH Archive PushEvent with `distinct_size=0` (force push indicator) [EV-XXXX]
- [ ] The SHA of the commit BEFORE the force push (from GH Archive `payload.before`)
- [ ] Recovery of the erased commit via direct URL or `git fetch origin SHA`
- [ ] Wayback Machine snapshot of the commit page before erasure
- [ ] Timeline gap in git log (N commits visible in archive but M < N in current repo)
**Hypothesis Starters**:
```
[HYPOTHESIS] Actor <HANDLE> force-pushed branch <BRANCH> on <DATE> [EV-XXXX]
to erase commit <SHA> [EV-YYYY], which contained <MALICIOUS_CONTENT>.
The erased commit was recovered via <METHOD> [EV-ZZZZ].
```
---
## Cross-Cutting Investigation Checklist
Apply to every investigation regardless of template:
- [ ] Check all contributors for newly created accounts (< 30 days old at time of malicious activity)
- [ ] Check if any maintainer account changed email in the period (sign of account takeover)
- [ ] Verify GPG signatures on suspicious commits match known maintainer keys
- [ ] Check if the repository changed ownership or transferred orgs near the incident
- [ ] Look for "cleanup" commits immediately after the malicious commit (cover-up pattern)
- [ ] Check related packages/repos by the same author for similar patterns
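
The first checklist item reduces to a date comparison. A minimal sketch, assuming ISO 8601 timestamps for account creation and for the suspicious activity; the helper names are hypothetical and the 30-day default comes from the checklist:

```python
from datetime import datetime, timedelta


def _parse(stamp: str) -> datetime:
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def is_new_account(account_created_at: str, activity_at: str, threshold_days: int = 30) -> bool:
    """True when the account was younger than threshold_days at activity time."""
    age = _parse(activity_at) - _parse(account_created_at)
    return timedelta(0) <= age < timedelta(days=threshold_days)
```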
@@ -0,0 +1,164 @@
# Deleted Content Recovery Techniques
## Key Insight: GitHub Never Fully Deletes Force-Pushed Commits
Force-pushed commits are removed from the branch history but REMAIN on GitHub's servers until garbage collection runs (which can take weeks to months). This is the foundation of deleted commit recovery.
---
## Method 1: Direct GitHub URL (Fastest — No Auth Required)
If you have a commit SHA, access it directly even if it was force-pushed off a branch:
```bash
# View commit metadata
curl -s "https://github.com/OWNER/REPO/commit/SHA"
# Download as patch (includes full diff)
curl -s "https://github.com/OWNER/REPO/commit/SHA.patch" > recovered_commit.patch
# Download as diff
curl -s "https://github.com/OWNER/REPO/commit/SHA.diff" > recovered_commit.diff
# Example (Istio credential leak - real incident):
curl -s "https://github.com/istio/istio/commit/FORCE_PUSHED_SHA.patch"
```
**When this works**: SHA is known (from GH Archive, Wayback Machine, or `git fsck`)
**When this fails**: GitHub has already garbage-collected the object (rare; typically 30-90 days after the force-push)
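
The same URL patterns can be built and fetched from Python. A minimal sketch using only the standard library; `commit_urls` and `fetch_patch` are hypothetical helper names, and `OWNER`, `REPO`, and the SHA are supplied by the investigator.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def commit_urls(owner: str, repo: str, sha: str) -> dict:
    """Build the HTML, patch, and diff URLs for a commit."""
    base = f"https://github.com/{owner}/{repo}/commit/{sha}"
    return {"html": base, "patch": base + ".patch", "diff": base + ".diff"}


def fetch_patch(owner: str, repo: str, sha: str, timeout: float = 30):
    """Return the patch text, or None when GitHub answers 404."""
    try:
        with urlopen(commit_urls(owner, repo, sha)["patch"], timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except HTTPError as error:
        if error.code == 404:
            return None  # object gone or SHA wrong
        raise
```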
---
## Method 2: GitHub REST API
```bash
# Works for commits force-pushed off branches but still on server
# Note: /commits/SHA may 404, but /git/commits/SHA often succeeds for orphaned commits
curl -s "https://api.github.com/repos/OWNER/REPO/git/commits/SHA" | jq .
# Get the tree (file listing) of a force-pushed commit
curl -s "https://api.github.com/repos/OWNER/REPO/git/trees/SHA?recursive=1" | jq .
# Get a specific file from a force-pushed commit
curl -s "https://api.github.com/repos/OWNER/REPO/contents/PATH?ref=SHA" | jq -r .content | base64 -d
```
---
## Method 3: Git Fetch by SHA (Local — Requires Clone)
```bash
# Fetch an orphaned commit directly by SHA into local repo
cd target_repo
git fetch origin SHA
git log FETCH_HEAD -1 # view the commit
git diff FETCH_HEAD~1 FETCH_HEAD # view the diff
# If the SHA was recently force-pushed it will still be fetchable
# This stops working once GitHub GC runs
```
---
## Method 4: Dangling Commits via git fsck
```bash
cd target_repo
# Find all unreachable objects (includes force-pushed commits)
git fsck --unreachable --no-reflogs 2>&1 | grep "unreachable commit" | awk '{print $3}' > dangling_shas.txt
# For each dangling commit, get its metadata
while read -r sha; do
  echo "=== $sha ===" >> dangling_details.txt
  git show --stat "$sha" >> dangling_details.txt 2>&1
done < dangling_shas.txt
# Note: dangling objects only exist in LOCAL clone — not the same as GitHub's copies
# GitHub's copies are accessible via Methods 1-3 until GC runs
```
---
## Recovering Deleted GitHub Issues and PRs
### Via Wayback Machine CDX API
```bash
# Find all archived snapshots of a specific issue
curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO/issues/NUMBER&output=json&limit=50&fl=timestamp,statuscode,original" | python3 -m json.tool
# Fetch the best snapshot
# Use the timestamp from the CDX result:
# https://web.archive.org/web/TIMESTAMP/https://github.com/OWNER/REPO/issues/NUMBER
curl -s "https://web.archive.org/web/TIMESTAMP/https://github.com/OWNER/REPO/issues/NUMBER" > issue_NUMBER_archived.html
# Find all snapshots of the repo in a date range
curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO*&output=json&from=20240101&to=20240201&limit=200&fl=timestamp,urlkey,statuscode" | python3 -m json.tool
```
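
With `output=json`, the CDX API returns a list of rows whose first row holds the field names. A minimal sketch for turning that into records and snapshot URLs; the function names are hypothetical, and `snapshot_url` assumes the `timestamp` and `original` fields were requested via `fl=`.

```python
def parse_cdx(rows):
    """Convert CDX JSON rows (header row first) into a list of dicts."""
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def snapshot_url(record: dict) -> str:
    """Build the Wayback Machine URL for one parsed CDX record."""
    return f"https://web.archive.org/web/{record['timestamp']}/{record['original']}"
```

Filtering the parsed records on `statuscode == "200"` before fetching avoids downloading archived error pages.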
### Via GitHub API (Limited — Only Non-Deleted Content)
```bash
# Closed issues (not deleted) are retrievable
curl -s "https://api.github.com/repos/OWNER/REPO/issues?state=closed&per_page=100" | jq '.[].number'
# Note: DELETED issues/PRs do NOT appear in the API. Use Wayback Machine or GH Archive for those.
```
### Via GitHub Archive (For Event History — Not Content)
```sql
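-- Find all IssuesEvents for a repo in a date range
SELECT
created_at,
actor.login,
JSON_EXTRACT_SCALAR(payload, '$.action') AS action,
JSON_EXTRACT_SCALAR(payload, '$.issue.number') AS issue_number,
JSON_EXTRACT_SCALAR(payload, '$.issue.title') AS issue_title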
FROM `githubarchive.day.*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240201'
AND type = 'IssuesEvent'
AND repo.name = 'OWNER/REPO'
ORDER BY created_at
```
---
## Recovering Deleted Files from a Known Commit
```bash
# If you have the commit SHA (even force-pushed):
git show SHA:path/to/file.py > recovered_file.py
# Or via API (base64 encoded content):
curl -s "https://api.github.com/repos/OWNER/REPO/contents/path/to/file.py?ref=SHA" | python3 -c "
import sys, json, base64
d = json.load(sys.stdin)
print(base64.b64decode(d['content']).decode())
"
```
---
## Evidence Recording
After recovering any deleted content, immediately record it:
```bash
python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json add \
--source "git fetch origin FORCE_PUSHED_SHA" \
--content "Recovered commit: FORCE_PUSHED_SHA | Author: attacker@example.com | Date: 2024-01-15 | Added file: malicious.sh" \
--type git \
--actor "attacker-handle" \
--url "https://github.com/OWNER/REPO/commit/FORCE_PUSHED_SHA.patch" \
--timestamp "2024-01-15T00:00:00Z" \
--verification single_source \
--notes "Commit force-pushed off main branch on 2024-01-16. Recovered via direct fetch."
```
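
`evidence-store.py` stores a SHA-256 of each entry's `content` on `add`, and its `verify` command recomputes it. The check can be reproduced independently on a loaded store; a minimal sketch, with `tampered_ids` as a hypothetical helper name:

```python
import hashlib


def content_sha256(content: str) -> str:
    """Hash evidence content the same way the store does (UTF-8, SHA-256)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def tampered_ids(entries):
    """Return IDs of entries whose stored hash no longer matches their content."""
    return [
        entry["id"] for entry in entries
        if content_sha256(entry["content"]) != entry.get("content_sha256")
    ]
```

Passing the `evidence` list from `evidence.json` to `tampered_ids` should return an empty list for an untouched store.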
---
## Recovery Failure Modes
| Failure | Cause | Workaround |
|---------|-------|------------|
| `git fetch origin SHA` returns "not our ref" | GitHub GC already ran | Try Method 1/2, search Wayback Machine |
| `github.com/OWNER/REPO/commit/SHA` returns 404 | GC ran or SHA is wrong | Verify SHA via GH Archive; try partial SHA search |
| Wayback Machine has no snapshots | Page was never crawled by IA | Check `commoncrawl.org`, check Google Cache |
| BigQuery shows event but no content | GH Archive stores event metadata, not file contents | Recovery only reveals the event occurred, not the content |
@@ -0,0 +1,313 @@
#!/usr/bin/env python3
"""
OSS Forensics Evidence Store Manager
Manages a JSON-based evidence store for forensic investigations.
Commands:
add - Add a piece of evidence
list - List all evidence (optionally filter by type or actor)
verify - Re-check SHA-256 hashes for integrity
query - Search evidence by keyword
export - Export evidence as a Markdown table
summary - Print investigation statistics
Usage example:
python3 evidence-store.py --store evidence.json add \
--source "git fsck output" --content "dangling commit abc123" \
--type git --actor "malicious-user" --url "https://github.com/owner/repo/commit/abc123"
python3 evidence-store.py --store evidence.json list --type git
python3 evidence-store.py --store evidence.json verify
python3 evidence-store.py --store evidence.json export > evidence-table.md
"""
import json
import argparse
import os
import datetime
import hashlib
import sys
EVIDENCE_TYPES = [
"git", # Local git repository data (commits, reflog, fsck)
"gh_api", # GitHub REST API responses
"gh_archive", # GitHub Archive / BigQuery query results
"web_archive", # Wayback Machine snapshots
"ioc", # Indicator of Compromise (SHA, domain, IP, package name, etc.)
"analysis", # Derived analysis / cross-source correlation result
"manual", # Manually noted observation
"vendor_report", # External security vendor report excerpt
]
VERIFICATION_STATES = ["unverified", "single_source", "multi_source_verified"]
IOC_TYPES = [
"COMMIT_SHA", "FILE_PATH", "API_KEY", "SECRET", "IP_ADDRESS",
"DOMAIN", "PACKAGE_NAME", "ACTOR_USERNAME", "MALICIOUS_URL",
"WORKFLOW_FILE", "BRANCH_NAME", "TAG_NAME", "RELEASE_NAME", "OTHER",
]
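def _now_iso():
    # isoformat() on an aware UTC datetime already ends in "+00:00";
    # swap that for a single "Z" instead of appending a second suffix.
    return datetime.datetime.now(datetime.timezone.utc).isoformat(timespec="seconds").replace("+00:00", "Z")
def _sha256(content: str) -> str:
    return hashlib.sha256(content.encode("utf-8")).hexdigest()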
class EvidenceStore:
    def __init__(self, filepath: str):
        self.filepath = filepath
        self.data = {
            "metadata": {
                "version": "2.0",
                "created_at": _now_iso(),
                "last_updated": _now_iso(),
                "investigation": "",
                "target_repo": "",
            },
            "evidence": [],
            "chain_of_custody": [],
        }
        if os.path.exists(filepath):
            try:
                with open(filepath, "r", encoding="utf-8") as f:
                    self.data = json.load(f)
            except (json.JSONDecodeError, IOError) as e:
                print(f"Error loading evidence store '{filepath}': {e}", file=sys.stderr)
                print("Hint: The file might be corrupted. Check for manual edits or syntax errors.", file=sys.stderr)
                sys.exit(1)
    def _save(self):
        self.data["metadata"]["last_updated"] = _now_iso()
        with open(self.filepath, "w", encoding="utf-8") as f:
            json.dump(self.data, f, indent=2, ensure_ascii=False)
    def _next_id(self) -> str:
        return f"EV-{len(self.data['evidence']) + 1:04d}"
    def add(
        self,
        source: str,
        content: str,
        evidence_type: str,
        actor: str = None,
        url: str = None,
        timestamp: str = None,
        ioc_type: str = None,
        verification: str = "unverified",
        notes: str = None,
    ) -> str:
        evidence_id = self._next_id()
        entry = {
            "id": evidence_id,
            "type": evidence_type,
            "source": source,
            "content": content,
            "content_sha256": _sha256(content),
            "actor": actor,
            "url": url,
            "event_timestamp": timestamp,
            "collected_at": _now_iso(),
            "ioc_type": ioc_type,
            "verification": verification,
            "notes": notes,
        }
        self.data["evidence"].append(entry)
        self.data["chain_of_custody"].append({
            "action": "add",
            "evidence_id": evidence_id,
            "timestamp": _now_iso(),
            "source": source,
        })
        self._save()
        return evidence_id
    def list_evidence(self, filter_type: str = None, filter_actor: str = None):
        results = self.data["evidence"]
        if filter_type:
            results = [e for e in results if e.get("type") == filter_type]
        if filter_actor:
            results = [e for e in results if e.get("actor") == filter_actor]
        return results
    def verify_integrity(self):
        """Re-compute SHA-256 for all entries and report mismatches."""
        issues = []
        for entry in self.data["evidence"]:
            expected = _sha256(entry["content"])
            stored = entry.get("content_sha256", "")
            if expected != stored:
                issues.append({
                    "id": entry["id"],
                    "stored_sha256": stored,
                    "computed_sha256": expected,
                })
        return issues
    def query(self, keyword: str):
        """Search for keyword in content, source, actor, or url."""
        keyword_lower = keyword.lower()
        return [
            e for e in self.data["evidence"]
            if keyword_lower in (e.get("content", "") or "").lower()
            or keyword_lower in (e.get("source", "") or "").lower()
            or keyword_lower in (e.get("actor", "") or "").lower()
            or keyword_lower in (e.get("url", "") or "").lower()
        ]
    def export_markdown(self) -> str:
        lines = [
            "# Evidence Registry",
            "",
            f"**Store**: `{self.filepath}`",
            f"**Last Updated**: {self.data['metadata'].get('last_updated', 'N/A')}",
            f"**Total Evidence Items**: {len(self.data['evidence'])}",
            "",
            "| ID | Type | Source | Actor | Verification | Event Timestamp | URL |",
            "|----|------|--------|-------|--------------|-----------------|-----|",
        ]
        for e in self.data["evidence"]:
            url = e.get("url") or ""
            url_display = f"[link]({url})" if url else ""
            lines.append(
                f"| {e['id']} | {e.get('type','')} | {e.get('source','')} "
                f"| {e.get('actor') or ''} | {e.get('verification','')} "
                f"| {e.get('event_timestamp') or ''} | {url_display} |"
            )
        lines.append("")
        lines.append("## Chain of Custody")
        lines.append("")
        lines.append("| Evidence ID | Action | Timestamp | Source |")
        lines.append("|-------------|--------|-----------|--------|")
        for c in self.data["chain_of_custody"]:
            lines.append(
                f"| {c.get('evidence_id','')} | {c.get('action','')} "
                f"| {c.get('timestamp','')} | {c.get('source','')} |"
            )
        return "\n".join(lines)
    def summary(self) -> dict:
        by_type = {}
        by_verification = {}
        actors = set()
        for e in self.data["evidence"]:
            t = e.get("type", "unknown")
            by_type[t] = by_type.get(t, 0) + 1
            v = e.get("verification", "unverified")
            by_verification[v] = by_verification.get(v, 0) + 1
            if e.get("actor"):
                actors.add(e["actor"])
        return {
            "total": len(self.data["evidence"]),
            "by_type": by_type,
            "by_verification": by_verification,
            "unique_actors": sorted(actors),
        }
def main():
    parser = argparse.ArgumentParser(
        description="OSS Forensics Evidence Store Manager v2.0",
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument("--store", default="evidence.json", help="Path to evidence JSON file (default: evidence.json)")
    subparsers = parser.add_subparsers(dest="command", metavar="COMMAND")
    # --- add ---
    add_p = subparsers.add_parser("add", help="Add a new evidence entry")
    add_p.add_argument("--source", required=True, help="Where this evidence came from (e.g. 'git fsck', 'GH API /commits')")
    add_p.add_argument("--content", required=True, help="The evidence content (commit SHA, API response excerpt, etc.)")
    add_p.add_argument("--type", required=True, choices=EVIDENCE_TYPES, dest="evidence_type", help="Evidence type")
    add_p.add_argument("--actor", help="GitHub handle or email of associated actor")
    add_p.add_argument("--url", help="URL to original source")
    add_p.add_argument("--timestamp", help="When the event occurred (ISO 8601)")
    add_p.add_argument("--ioc-type", choices=IOC_TYPES, help="IOC subtype (for --type ioc)")
    add_p.add_argument("--verification", choices=VERIFICATION_STATES, default="unverified")
    add_p.add_argument("--notes", help="Additional investigator notes")
    add_p.add_argument("--quiet", action="store_true", help="Suppress success message")
    # --- list ---
    list_p = subparsers.add_parser("list", help="List all evidence entries")
    list_p.add_argument("--type", dest="filter_type", choices=EVIDENCE_TYPES, help="Filter by type")
    list_p.add_argument("--actor", dest="filter_actor", help="Filter by actor")
    # --- verify ---
    subparsers.add_parser("verify", help="Verify SHA-256 integrity of all evidence content")
    # --- query ---
    query_p = subparsers.add_parser("query", help="Search evidence by keyword")
    query_p.add_argument("keyword", help="Keyword to search for")
    # --- export ---
    subparsers.add_parser("export", help="Export evidence as a Markdown table (stdout)")
    # --- summary ---
    subparsers.add_parser("summary", help="Print investigation statistics")
    args = parser.parse_args()
    if not args.command:
        parser.print_help()
        sys.exit(0)
    store = EvidenceStore(args.store)
    if args.command == "add":
        eid = store.add(
            source=args.source,
            content=args.content,
            evidence_type=args.evidence_type,
            actor=args.actor,
            url=args.url,
            timestamp=args.timestamp,
            ioc_type=args.ioc_type,
            verification=args.verification,
            notes=args.notes,
        )
        if not getattr(args, "quiet", False):
            print(f"✓ Added evidence: {eid}")
    elif args.command == "list":
        items = store.list_evidence(
            filter_type=getattr(args, "filter_type", None),
            filter_actor=getattr(args, "filter_actor", None),
        )
        if not items:
            print("No evidence found.")
        for e in items:
            actor_str = f" | actor: {e['actor']}" if e.get("actor") else ""
            url_str = f" | {e['url']}" if e.get("url") else ""
            print(f"[{e['id']}] {e['type']:12s} | {e['verification']:20s} | {e['source']}{actor_str}{url_str}")
    elif args.command == "verify":
        issues = store.verify_integrity()
        if not issues:
            print(f"✓ All {len(store.data['evidence'])} evidence entries passed SHA-256 integrity check.")
        else:
            print(f"{len(issues)} integrity issue(s) detected:")
            for i in issues:
                print(f" [{i['id']}] stored={i['stored_sha256'][:16]}... computed={i['computed_sha256'][:16]}...")
            sys.exit(1)
    elif args.command == "query":
        results = store.query(args.keyword)
        print(f"Found {len(results)} result(s) for '{args.keyword}':")
        for e in results:
            print(f" [{e['id']}] {e['type']} | {e['source']} | {e['content'][:80]}")
    elif args.command == "export":
        print(store.export_markdown())
    elif args.command == "summary":
        s = store.summary()
        print(f"Total evidence items : {s['total']}")
        print(f"By type : {json.dumps(s['by_type'], indent=2)}")
        print(f"By verification : {json.dumps(s['by_verification'], indent=2)}")
        print(f"Unique actors : {s['unique_actors']}")
if __name__ == "__main__":
    main()
@@ -0,0 +1,151 @@
# Forensic Investigation Report
> **Instructions**: Fill in all sections. Every factual claim must cite at least one `[EV-XXXX]` evidence ID.
> Remove placeholder text and instruction notes before finalizing. Redact all secrets to `[REDACTED]`.
---
## Executive Summary
**Target Repository**: `OWNER/REPO`
**Investigation Period**: YYYY-MM-DD to YYYY-MM-DD
**Verdict**: <!-- Compromised / Clean / Inconclusive -->
**Confidence Level**: <!-- High / Medium / Low -->
**Report Date**: YYYY-MM-DD
**Investigator**: <!-- Agent session ID or analyst name -->
<!-- One paragraph: what was investigated, what was found, what is recommended. -->
---
## Timeline of Events
> All timestamps in UTC. Each event must cite at least one evidence ID.
| Timestamp (UTC) | Event | Evidence IDs | Source |
|-----------------|-------|--------------|--------|
| YYYY-MM-DDTHH:MM:SSZ | _Describe event_ | [EV-XXXX] | git / gh_api / gh_archive / web_archive |
| | | | |
---
## Validated Hypotheses
### Hypothesis 1: <!-- Short title -->
**Status**: <!-- VALIDATED / INCONCLUSIVE / REJECTED -->
**Claim**: _Full statement of the hypothesis._
**Supporting Evidence**:
- [EV-XXXX]: _What this evidence shows_
- [EV-YYYY]: _What this evidence shows_
**Counter-Evidence Considered**: _What might disprove this, and why it was ruled out or not._
**Confidence**: <!-- High / Medium / Low, and why -->
---
## Indicators of Compromise (IOC List)
| Type | Value | Status | Evidence |
|------|-------|--------|----------|
| COMMIT_SHA | `abc123...` | Confirmed malicious | [EV-XXXX] |
| ACTOR_USERNAME | `handle` | Suspected compromised | [EV-YYYY] |
| FILE_PATH | `src/evil.js` | Confirmed malicious | [EV-ZZZZ] |
| DOMAIN | `evil-cdn.io` | Confirmed C2 | [EV-WWWW] |
---
## Affected Versions
| Version / Tag | Published | Contains Malicious Code | Evidence |
|---------------|-----------|------------------------|----------|
| `v1.2.3` | YYYY-MM-DD | Yes / No / Unknown | [EV-XXXX] |
---
## Evidence Registry
> Generated by: `python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json export`
<!-- Paste the Markdown table output from the evidence-store.py export command here -->
| ID | Type | Source | Actor | Verification | Event Timestamp | URL |
|----|------|--------|-------|--------------|-----------------|-----|
| EV-0001 | | | | | | |
---
## Chain of Custody
> Generated by: `python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json export`
<!-- Paste the chain of custody section from the export output here -->
| Evidence ID | Action | Timestamp | Source |
|-------------|--------|-----------|--------|
| EV-0001 | add | | |
---
## Technical Findings
### Git History Analysis
_Summarize findings from local git analysis: dangling commits, reflog anomalies, unsigned commits, binary additions, etc._
### GitHub API Analysis
_Summarize findings from GitHub REST API: deleted PRs/issues, contributor changes, release anomalies, etc._
### GitHub Archive Analysis
_Summarize findings from BigQuery: force-push events, delete events, workflow anomalies, member changes, etc._
_Note: If BigQuery was unavailable, state this explicitly._
### Wayback Machine Analysis
_Summarize findings from archive.org: recovered deleted pages, historical content differences, etc._
### IOC Enrichment
_Summarize enrichment results: WHOIS data for domains, recovered commit content, actor account analysis, etc._
---
## Recommendations
### Immediate Actions (If Compromise Confirmed)
- [ ] Rotate all GitHub tokens, API keys, and credentials that may have been exposed
- [ ] Pin dependency versions to hashes in all affected packages
- [ ] Publish a security advisory / CVE if applicable
- [ ] Notify downstream users/package registries (npm, PyPI, etc.)
- [ ] Revoke access for the compromised account and re-secure with hardware 2FA
- [ ] Audit all CI/CD workflow files for unauthorized modifications
- [ ] Review all releases published during the compromise window
### Monitoring Recommendations
- [ ] Enable branch protection on `main`/`master` (require code review, disallow force-push)
- [ ] Enable required commit signing (GPG/SSH)
- [ ] Set up GitHub audit log streaming for future monitoring
- [ ] Pin critical dependencies to known-good SHAs in lock files
---
## Limitations and Caveats
- _List any data sources that were unavailable (e.g., no BigQuery access)_
- _Note any evidence that is single-source only (not independently verified)_
- _Note any hypotheses that could not be confirmed or denied_
---
## References
- Evidence store: `evidence.json` (SHA-256 integrity: run `python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json verify`)
- Related issues: <!-- Link to GitHub issues, CVEs, security advisories -->
- RAPTOR framework: https://github.com/gadievron/raptor
@@ -0,0 +1,43 @@
# Malicious Package Investigation Report
---
## 📦 Package Metadata
- **Package Name**:
- **Registry**: [NPM / PyPI / RubyGems / etc.]
- **Affected Versions**:
- **Malicious Version(s)**:
- **Downloads at Time of Detection**:
- **Package URL**:
---
## 🚩 Indicators of Compromise (IOCs)
- **Malicious URL(s)**:
- **Exfiltrated Data Types**: [Environment variables, ~/.ssh/id_rsa, /etc/shadow, etc.]
- **Exfiltration Method**: [DNS tunneling, HTTP POST to C2, etc.]
- **C2 IP/Domain**:
---
## 🛠️ Analysis Summary
- **Primary Mechanism**: [Typosquatting / Dependency Confusion / Maintainer Takeover]
- **Behavior Description**:
- [Example: Installs a postinstall script that exfiltrates environment variables.]
- [Example: Patches `setup.py` to download a secondary payload.]
---
## 🔍 Evidence Registry
| Evidence ID | Type | Source | Description |
|-------------|------|--------|-------------|
| EV-XXXX | ioc | NPM | Package install script snapshot |
| EV-YYYY | web | Wayback| Historical version comparison |
---
## 🛡️ Recommended Mitigations
1. [ ] Unpublish/Report the package to the registry.
2. [ ] Audit `package-lock.json` or `requirements.txt` across all projects.
3. [ ] Rotate secrets exfiltrated via environment variables.
4. [ ] Pin specific hashes (SHASUM) for mission-critical dependencies.
-218
@@ -1,218 +0,0 @@
# Checkpoint & Rollback — Implementation Plan
## Goal
Automatic filesystem snapshots before destructive file operations, with user-facing rollback. The agent never sees or interacts with this — it's transparent infrastructure.
## Design Principles
1. **Not a tool** — the LLM never knows about it. Zero prompt tokens, zero tool schema overhead.
2. **Once per turn** — checkpoint at most once per conversation turn (user message → agent response cycle), triggered lazily on the first file-mutating operation. Not on every write.
3. **Opt-in via config** — disabled by default, enabled with `checkpoints: true` in config.yaml.
4. **Works on any directory** — uses a shadow git repo completely separate from the user's project git. Works on git repos, non-git directories, anything.
5. **User-facing rollback**: `/rollback` slash command (CLI + gateway) to list and restore checkpoints. Also `hermes rollback` CLI subcommand.
## Architecture
```
~/.hermes/checkpoints/
{sha256(abs_dir)[:16]}/ # Shadow git repo per working directory
HEAD, refs/, objects/... # Standard git internals
HERMES_WORKDIR # Original dir path (for display)
info/exclude # Default excludes (node_modules, .env, etc.)
```
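A minimal sketch of how the per-directory shadow repo path could be derived, assuming the layout above. The name `shadow_repo_path` and the `CHECKPOINT_ROOT` constant are illustrative; the helper taken from PR #559 (`_shadow_repo_path()`) may differ in detail.

```python
import hashlib
from pathlib import Path

# Assumed root, taken from the layout shown above.
CHECKPOINT_ROOT = Path.home() / ".hermes" / "checkpoints"


def shadow_repo_path(working_dir: str, root: Path = CHECKPOINT_ROOT) -> Path:
    """Map a working directory to a deterministic shadow repo location.

    The directory is resolved to an absolute path first, so "." and the
    equivalent absolute path hash to the same shadow repo.
    """
    abs_dir = str(Path(working_dir).resolve())
    digest = hashlib.sha256(abs_dir.encode("utf-8")).hexdigest()[:16]
    return root / digest
```

Truncating the digest to 16 hex characters keeps directory names short while making collisions between distinct working directories very unlikely in practice.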
### Core: CheckpointManager (new file: tools/checkpoint_manager.py)
Adapted from PR #559's CheckpointStore. Key changes from the PR:
- **Not a tool** — no schema, no registry entry, no handler
- **Turn-scoped deduplication** — tracks `_checkpointed_dirs: Set[str]` per turn
- **Configurable** — reads `checkpoints` config key
- **Pruning** — keeps last N snapshots per directory (default 50), prunes on take
```python
class CheckpointManager:
def __init__(self, enabled: bool = False, max_snapshots: int = 50):
self.enabled = enabled
self.max_snapshots = max_snapshots
self._checkpointed_dirs: Set[str] = set() # reset each turn
def new_turn(self):
"""Call at start of each conversation turn to reset dedup."""
self._checkpointed_dirs.clear()
def ensure_checkpoint(self, working_dir: str, reason: str = "auto") -> None:
"""Take a checkpoint if enabled and not already done this turn."""
if not self.enabled:
return
abs_dir = str(Path(working_dir).resolve())
if abs_dir in self._checkpointed_dirs:
return
self._checkpointed_dirs.add(abs_dir)
try:
self._take(abs_dir, reason)
except Exception as e:
logger.debug("Checkpoint failed (non-fatal): %s", e)
def list_checkpoints(self, working_dir: str) -> List[dict]:
"""List available checkpoints for a directory."""
...
def restore(self, working_dir: str, commit_hash: str) -> dict:
"""Restore files to a checkpoint state."""
...
def _take(self, working_dir: str, reason: str):
"""Shadow git: add -A + commit. Prune if over max_snapshots."""
...
def _prune(self, shadow_repo: Path):
"""Keep only last max_snapshots commits."""
...
```
### Integration Point: run_agent.py
The AIAgent already owns the conversation loop. Add CheckpointManager as an instance attribute:
```python
class AIAgent:
def __init__(self, ...):
...
# Checkpoint manager — reads config to determine if enabled
self._checkpoint_mgr = CheckpointManager(
enabled=config.get("checkpoints", False),
max_snapshots=config.get("checkpoint_max_snapshots", 50),
)
```
**Turn boundary** — in `run_conversation()`, call `new_turn()` at the start of each agent iteration (before processing tool calls):
```python
# Inside the main loop, before _execute_tool_calls():
self._checkpoint_mgr.new_turn()
```
**Trigger point** — in `_execute_tool_calls()`, before dispatching file-mutating tools:
```python
# Before the handle_function_call dispatch:
if function_name in ("write_file", "patch"):
# Determine working dir from the file path in the args
file_path = function_args.get("path", "") or function_args.get("old_string", "")
if file_path:
work_dir = str(Path(file_path).parent.resolve())
self._checkpoint_mgr.ensure_checkpoint(work_dir, f"before {function_name}")
```
This means:
- First `write_file` in a turn → checkpoint (fast, one `git add -A && git commit`)
- Subsequent writes in the same turn → no-op (already checkpointed)
- Next turn (new user message) → fresh checkpoint eligibility
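The three cases above can be checked with a stripped-down stand-in for the manager. `DedupDemo` is a hypothetical class for illustration only: it records the reason instead of running git, and it omits the `enabled` flag and error handling.

```python
from pathlib import Path
from typing import List, Set


class DedupDemo:
    """Stand-in that mimics only the turn-scoped deduplication logic."""

    def __init__(self) -> None:
        self._checkpointed_dirs: Set[str] = set()
        self.taken: List[str] = []  # reasons for checkpoints actually taken

    def new_turn(self) -> None:
        self._checkpointed_dirs.clear()

    def ensure_checkpoint(self, working_dir: str, reason: str = "auto") -> None:
        abs_dir = str(Path(working_dir).resolve())
        if abs_dir in self._checkpointed_dirs:
            return  # already checkpointed this turn
        self._checkpointed_dirs.add(abs_dir)
        self.taken.append(reason)


demo = DedupDemo()
demo.new_turn()
demo.ensure_checkpoint(".", "before write_file")  # first write: checkpoint
demo.ensure_checkpoint(".", "before patch")       # same turn: no-op
demo.new_turn()
demo.ensure_checkpoint(".", "before write_file")  # next turn: checkpoint again
print(demo.taken)  # ['before write_file', 'before write_file']
```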
### Config
Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`:
```python
"checkpoints": False, # Enable filesystem checkpoints before destructive ops
"checkpoint_max_snapshots": 50, # Max snapshots to keep per directory
```
User enables with:
```yaml
# ~/.hermes/config.yaml
checkpoints: true
```
### User-Facing Rollback
**CLI slash command** — add `/rollback` to `process_command()` in `cli.py`:
```
/rollback — List recent checkpoints for the current directory
/rollback <hash> — Restore files to that checkpoint
```
Shows a numbered list:
```
📸 Checkpoints for /home/user/project:
1. abc1234 2026-03-09 21:15 before write_file (3 files changed)
2. def5678 2026-03-09 20:42 before patch (1 file changed)
3. ghi9012 2026-03-09 20:30 before write_file (2 files changed)
Use /rollback <number> to restore, e.g. /rollback 1
```
**Gateway slash command** — add `/rollback` to gateway/run.py with the same behavior.
**CLI subcommand** — `hermes rollback` (optional, lower priority).
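The usage text above mentions both `/rollback <hash>` and `/rollback <number>`. One way to accept either form is sketched below. The function name `resolve_checkpoint_ref` and the `"hash"` key on each checkpoint dict are assumptions for illustration, not part of the plan.

```python
from typing import Dict, List, Optional


def resolve_checkpoint_ref(arg: str, checkpoints: List[Dict[str, str]]) -> Optional[str]:
    """Turn a /rollback argument into a commit hash, or None if it does not resolve.

    `checkpoints` is assumed to be ordered as displayed (newest first),
    with each entry carrying a "hash" key.
    """
    arg = arg.strip()
    if not arg:
        return None
    # A number within range is treated as a 1-based index into the list.
    if arg.isdigit() and 1 <= int(arg) <= len(checkpoints):
        return checkpoints[int(arg) - 1]["hash"]
    # Otherwise treat it as a hash prefix and require a unique match.
    matches = [c["hash"] for c in checkpoints if c["hash"].startswith(arg)]
    return matches[0] if len(matches) == 1 else None
```

Requiring a unique prefix match avoids restoring the wrong snapshot when two hashes share leading characters.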
### What Gets Excluded (not checkpointed)
Same as the PR's defaults — written to the shadow repo's `info/exclude`:
```
node_modules/
dist/
build/
.env
.env.*
__pycache__/
*.pyc
.DS_Store
*.log
.cache/
.venv/
.git/
```
Also respects the project's `.gitignore` if present (shadow repo can read it via `core.excludesFile`).
### Safety
- `ensure_checkpoint()` wraps everything in try/except — a checkpoint failure never blocks the actual file operation
- Shadow repo is completely isolated — GIT_DIR + GIT_WORK_TREE env vars, never touches user's .git
- If git isn't installed, checkpoints silently disable
- Large directories: add a file count check — skip checkpoint if >50K files to avoid slowdowns
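A rough sketch of the isolation and guard checks listed above, using only the standard library. The helper names (`git_available`, `git_env`, `too_many_files`) are hypothetical; the 50K limit comes from the bullet above.

```python
import os
import shutil
from pathlib import Path
from typing import Dict

MAX_FILES = 50_000  # skip checkpointing above this many files


def git_available() -> bool:
    """Checkpoints are silently disabled when git is not on PATH."""
    return shutil.which("git") is not None


def git_env(shadow_repo: Path, working_dir: Path) -> Dict[str, str]:
    """Environment that points git at the shadow repo, not the user's .git."""
    env = dict(os.environ)
    env["GIT_DIR"] = str(shadow_repo)
    env["GIT_WORK_TREE"] = str(working_dir)
    return env


def too_many_files(working_dir: Path, limit: int = MAX_FILES) -> bool:
    """Count files under working_dir, stopping early once the limit is exceeded."""
    count = 0
    for _root, _dirs, files in os.walk(working_dir):
        count += len(files)
        if count > limit:
            return True
    return False
```

Note that this walk counts every file, including those in excluded directories such as `node_modules/`; a real guard might prune those first.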
## Files to Create/Modify
| File | Change |
|------|--------|
| `tools/checkpoint_manager.py` | **NEW** — CheckpointManager class (adapted from PR #559) |
| `run_agent.py` | Add CheckpointManager init + trigger in `_execute_tool_calls()` |
| `hermes_cli/config.py` | Add `checkpoints` + `checkpoint_max_snapshots` to DEFAULT_CONFIG |
| `cli.py` | Add `/rollback` slash command handler |
| `gateway/run.py` | Add `/rollback` slash command handler |
| `tests/tools/test_checkpoint_manager.py` | **NEW** — tests (adapted from PR #559's tests) |
## What We Take From PR #559
- `_shadow_repo_path()` — deterministic path hashing ✅
- `_git_env()` — GIT_DIR/GIT_WORK_TREE isolation ✅
- `_run_git()` — subprocess wrapper with timeout ✅
- `_init_shadow_repo()` — shadow repo initialization ✅
- `DEFAULT_EXCLUDES` list ✅
- Test structure and patterns ✅
## What We Change From PR #559
- **Remove tool schema/registry** — not a tool
- **Remove injection into file_operations.py and patch_parser.py** — trigger from run_agent.py instead
- **Add turn-scoped deduplication** — one checkpoint per turn, not per operation
- **Add pruning** — keep last N snapshots
- **Add config flag** — opt-in, not mandatory
- **Add /rollback command** — user-facing restore UI
- **Add file count guard** — skip huge directories
## Implementation Order
1. `tools/checkpoint_manager.py` — core class with take/list/restore/prune
2. `tests/tools/test_checkpoint_manager.py` — tests
3. `hermes_cli/config.py` — config keys
4. `run_agent.py` — integration (init + trigger)
5. `cli.py`: `/rollback` slash command
6. `gateway/run.py`: `/rollback` slash command
7. Full test suite run + manual smoke test
+1 -1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent"
version = "0.2.0"
version = "0.3.0"
description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
readme = "README.md"
requires-python = ">=3.11"
+498 -185
@@ -90,6 +90,7 @@ from agent.display import (
KawaiiSpinner, build_tool_preview as _build_tool_preview,
get_cute_tool_message as _get_cute_tool_message_impl,
_detect_tool_failure,
get_tool_emoji as _get_tool_emoji,
)
from agent.trajectory import (
convert_scratchpad_to_think, has_incomplete_scratchpad,
@@ -204,6 +205,33 @@ _NEVER_PARALLEL_TOOLS = frozenset({"clarify"})
# Maximum number of concurrent worker threads for parallel tool execution.
_MAX_TOOL_WORKERS = 8
# Patterns that indicate a terminal command may modify/delete files.
_DESTRUCTIVE_PATTERNS = re.compile(
r"""(?:^|\s|&&|\|\||;|`)(?:
rm\s|rmdir\s|
mv\s|
sed\s+-i|
truncate\s|
dd\s|
shred\s|
git\s+(?:reset|clean|checkout)\s
)""",
re.VERBOSE,
)
# Output redirects that overwrite files (> but not >>)
_REDIRECT_OVERWRITE = re.compile(r'[^>]>[^>]|^>[^>]')
def _is_destructive_command(cmd: str) -> bool:
"""Heuristic: does this terminal command look like it modifies/deletes files?"""
if not cmd:
return False
if _DESTRUCTIVE_PATTERNS.search(cmd):
return True
if _REDIRECT_OVERWRITE.search(cmd):
return True
return False
def _inject_honcho_turn_context(content, turn_context: str):
"""Append Honcho recall to the current-turn user message without mutating history.
@@ -268,6 +296,7 @@ class AIAgent:
reasoning_callback: callable = None,
clarify_callback: callable = None,
step_callback: callable = None,
stream_delta_callback: callable = None,
max_tokens: int = None,
reasoning_config: Dict[str, Any] = None,
prefill_messages: List[Dict[str, Any]] = None,
@@ -367,6 +396,7 @@ class AIAgent:
self.reasoning_callback = reasoning_callback
self.clarify_callback = clarify_callback
self.step_callback = step_callback
self.stream_delta_callback = stream_delta_callback
self._last_reported_tool = None # Track for "new tool" mode
# Interrupt mechanism for breaking out of tool loops
@@ -516,6 +546,8 @@ class AIAgent:
effective_key = api_key or resolve_anthropic_token() or ""
self._anthropic_api_key = effective_key
self._anthropic_base_url = base_url
from agent.anthropic_adapter import _is_oauth_token as _is_oat
self._is_anthropic_oauth = _is_oat(effective_key)
self._anthropic_client = build_anthropic_client(effective_key, base_url)
# No OpenAI client needed for Anthropic mode
self.client = None
@@ -784,7 +816,7 @@ class AIAgent:
logger.debug("peer %s memory_mode=honcho: local USER.md writes disabled", _hcfg.peer_name or "user")
# Skills config: nudge interval for skill creation reminders
self._skill_nudge_interval = 15
self._skill_nudge_interval = 10
try:
from hermes_cli.config import load_config as _load_skills_config
skills_config = _load_skills_config().get("skills", {})
@@ -828,9 +860,9 @@ class AIAgent:
"""Verbose print — suppressed when streaming TTS is active.
Pass ``force=True`` for error/warning messages that should always be
shown even during streaming TTS playback.
shown even during streaming playback (TTS or display).
"""
if not force and getattr(self, "_stream_callback", None) is not None:
if not force and self._has_stream_consumers():
return
print(*args, **kwargs)
@@ -2574,15 +2606,39 @@ class AIAgent:
def _close_request_openai_client(self, client: Any, *, reason: str) -> None:
self._close_openai_client(client, reason=reason, shared=False)
def _run_codex_stream(self, api_kwargs: dict, client: Any = None):
def _run_codex_stream(self, api_kwargs: dict, client: Any = None, on_first_delta: callable = None):
"""Execute one streaming Responses API request and return the final response."""
active_client = client or self._ensure_primary_openai_client(reason="codex_stream_direct")
max_stream_retries = 1
has_tool_calls = False
first_delta_fired = False
for attempt in range(max_stream_retries + 1):
try:
with active_client.responses.stream(**api_kwargs) as stream:
for _ in stream:
pass
for event in stream:
if self._interrupt_requested:
break
event_type = getattr(event, "type", "")
# Fire callbacks on text content deltas (suppress during tool calls)
if "output_text.delta" in event_type or event_type == "response.output_text.delta":
delta_text = getattr(event, "delta", "")
if delta_text and not has_tool_calls:
if not first_delta_fired:
first_delta_fired = True
if on_first_delta:
try:
on_first_delta()
except Exception:
pass
self._fire_stream_delta(delta_text)
# Track tool calls to suppress text streaming
elif "function_call" in event_type:
has_tool_calls = True
# Fire reasoning callbacks
elif "reasoning" in event_type and "delta" in event_type:
reasoning_text = getattr(event, "delta", "")
if reasoning_text:
self._fire_reasoning_delta(reasoning_text)
return stream.get_final_response()
except RuntimeError as exc:
err_text = str(exc)
@@ -2763,6 +2819,7 @@ class AIAgent:
result["response"] = self._run_codex_stream(
api_kwargs,
client=request_client_holder["client"],
on_first_delta=getattr(self, "_codex_on_first_delta", None),
)
elif self.api_mode == "anthropic_messages":
result["response"] = self._anthropic_messages_create(api_kwargs)
@@ -2804,116 +2861,246 @@ class AIAgent:
raise result["error"]
return result["response"]
def _streaming_api_call(self, api_kwargs: dict, stream_callback):
"""Streaming variant of _interruptible_api_call for voice TTS pipeline.
# ── Unified streaming API call ─────────────────────────────────────────
Uses ``stream=True`` and forwards content deltas to *stream_callback*
in real-time. Returns a ``SimpleNamespace`` that mimics a normal
``ChatCompletion`` so the rest of the agent loop works unchanged.
def _fire_stream_delta(self, text: str) -> None:
"""Fire all registered stream delta callbacks (display + TTS)."""
for cb in (self.stream_delta_callback, self._stream_callback):
if cb is not None:
try:
cb(text)
except Exception:
pass
This method is separate from ``_interruptible_api_call`` to keep the
core agent loop untouched for non-voice users.
def _fire_reasoning_delta(self, text: str) -> None:
"""Fire reasoning callback if registered."""
cb = self.reasoning_callback
if cb is not None:
try:
cb(text)
except Exception:
pass
def _has_stream_consumers(self) -> bool:
"""Return True if any streaming consumer is registered."""
return (
self.stream_delta_callback is not None
or getattr(self, "_stream_callback", None) is not None
)
def _interruptible_streaming_api_call(
self, api_kwargs: dict, *, on_first_delta: callable = None
):
"""Streaming variant of _interruptible_api_call for real-time token delivery.
Handles all three api_modes:
- chat_completions: stream=True on OpenAI-compatible endpoints
- anthropic_messages: client.messages.stream() via Anthropic SDK
- codex_responses: delegates to _run_codex_stream (already streaming)
Fires stream_delta_callback and _stream_callback for each text token.
Tool-call turns suppress the callback; only text-only final responses
stream to the consumer. Returns a SimpleNamespace that mimics the
non-streaming response shape so the rest of the agent loop is unchanged.
Falls back to _interruptible_api_call on provider errors indicating
streaming is not supported.
"""
if self.api_mode == "codex_responses":
# Codex streams internally via _run_codex_stream. The main dispatch
# in _interruptible_api_call already calls it; we just need to
# ensure on_first_delta reaches it. Store it on the instance
# temporarily so _run_codex_stream can pick it up.
self._codex_on_first_delta = on_first_delta
try:
return self._interruptible_api_call(api_kwargs)
finally:
self._codex_on_first_delta = None
result = {"response": None, "error": None}
request_client_holder = {"client": None}
first_delta_fired = {"done": False}
deltas_were_sent = {"yes": False} # Track if any deltas were fired (for fallback)
def _fire_first_delta():
if not first_delta_fired["done"] and on_first_delta:
first_delta_fired["done"] = True
try:
on_first_delta()
except Exception:
pass
def _call_chat_completions():
"""Stream a chat completions response."""
stream_kwargs = {**api_kwargs, "stream": True, "stream_options": {"include_usage": True}}
request_client_holder["client"] = self._create_request_openai_client(
reason="chat_completion_stream_request"
)
stream = request_client_holder["client"].chat.completions.create(**stream_kwargs)
content_parts: list = []
tool_calls_acc: dict = {}
finish_reason = None
model_name = None
role = "assistant"
reasoning_parts: list = []
usage_obj = None
for chunk in stream:
if self._interrupt_requested:
break
if not chunk.choices:
if hasattr(chunk, "model") and chunk.model:
model_name = chunk.model
# Usage comes in the final chunk with empty choices
if hasattr(chunk, "usage") and chunk.usage:
usage_obj = chunk.usage
continue
delta = chunk.choices[0].delta
if hasattr(chunk, "model") and chunk.model:
model_name = chunk.model
# Accumulate reasoning content
reasoning_text = getattr(delta, "reasoning_content", None) or getattr(delta, "reasoning", None)
if reasoning_text:
reasoning_parts.append(reasoning_text)
self._fire_reasoning_delta(reasoning_text)
# Accumulate text content — fire callback only when no tool calls
if delta and delta.content:
content_parts.append(delta.content)
if not tool_calls_acc:
_fire_first_delta()
self._fire_stream_delta(delta.content)
deltas_were_sent["yes"] = True
# Accumulate tool call deltas (silently, no callback)
if delta and delta.tool_calls:
for tc_delta in delta.tool_calls:
idx = tc_delta.index if tc_delta.index is not None else 0
if idx not in tool_calls_acc:
tool_calls_acc[idx] = {
"id": tc_delta.id or "",
"type": "function",
"function": {"name": "", "arguments": ""},
}
entry = tool_calls_acc[idx]
if tc_delta.id:
entry["id"] = tc_delta.id
if tc_delta.function:
if tc_delta.function.name:
entry["function"]["name"] += tc_delta.function.name
if tc_delta.function.arguments:
entry["function"]["arguments"] += tc_delta.function.arguments
if chunk.choices[0].finish_reason:
finish_reason = chunk.choices[0].finish_reason
# Usage in the final chunk
if hasattr(chunk, "usage") and chunk.usage:
usage_obj = chunk.usage
# Build mock response matching non-streaming shape
full_content = "".join(content_parts) or None
mock_tool_calls = None
if tool_calls_acc:
mock_tool_calls = []
for idx in sorted(tool_calls_acc):
tc = tool_calls_acc[idx]
mock_tool_calls.append(SimpleNamespace(
id=tc["id"],
type=tc["type"],
function=SimpleNamespace(
name=tc["function"]["name"],
arguments=tc["function"]["arguments"],
),
))
full_reasoning = "".join(reasoning_parts) or None
mock_message = SimpleNamespace(
role=role,
content=full_content,
tool_calls=mock_tool_calls,
reasoning_content=full_reasoning,
)
mock_choice = SimpleNamespace(
index=0,
message=mock_message,
finish_reason=finish_reason or "stop",
)
return SimpleNamespace(
id="stream-" + str(uuid.uuid4()),
model=model_name,
choices=[mock_choice],
usage=usage_obj,
)
def _call_anthropic():
"""Stream an Anthropic Messages API response.
Fires delta callbacks for real-time token delivery, but returns
the native Anthropic Message object from get_final_message() so
the rest of the agent loop (validation, tool extraction, etc.)
works unchanged.
"""
has_tool_use = False
# Use the Anthropic SDK's streaming context manager
with self._anthropic_client.messages.stream(**api_kwargs) as stream:
for event in stream:
if self._interrupt_requested:
break
event_type = getattr(event, "type", None)
if event_type == "content_block_start":
block = getattr(event, "content_block", None)
if block and getattr(block, "type", None) == "tool_use":
has_tool_use = True
elif event_type == "content_block_delta":
delta = getattr(event, "delta", None)
if delta:
delta_type = getattr(delta, "type", None)
if delta_type == "text_delta":
text = getattr(delta, "text", "")
if text and not has_tool_use:
_fire_first_delta()
self._fire_stream_delta(text)
elif delta_type == "thinking_delta":
thinking_text = getattr(delta, "thinking", "")
if thinking_text:
self._fire_reasoning_delta(thinking_text)
# Return the native Anthropic Message for downstream processing
return stream.get_final_message()
def _call():
try:
stream_kwargs = {**api_kwargs, "stream": True}
request_client_holder["client"] = self._create_request_openai_client(
reason="chat_completion_stream_request"
)
stream = request_client_holder["client"].chat.completions.create(**stream_kwargs)
content_parts: list[str] = []
tool_calls_acc: dict[int, dict] = {}
finish_reason = None
model_name = None
role = "assistant"
for chunk in stream:
if not chunk.choices:
if hasattr(chunk, "model") and chunk.model:
model_name = chunk.model
continue
delta = chunk.choices[0].delta
if hasattr(chunk, "model") and chunk.model:
model_name = chunk.model
if delta and delta.content:
content_parts.append(delta.content)
try:
stream_callback(delta.content)
except Exception:
pass
if delta and delta.tool_calls:
for tc_delta in delta.tool_calls:
idx = tc_delta.index if tc_delta.index is not None else 0
if idx in tool_calls_acc and tc_delta.id and tc_delta.id != tool_calls_acc[idx]["id"]:
matched = False
for eidx, eentry in tool_calls_acc.items():
if eentry["id"] == tc_delta.id:
idx = eidx
matched = True
break
if not matched:
idx = (max(k for k in tool_calls_acc if isinstance(k, int)) + 1) if tool_calls_acc else 0
if idx not in tool_calls_acc:
tool_calls_acc[idx] = {
"id": tc_delta.id or "",
"type": "function",
"function": {"name": "", "arguments": ""},
}
entry = tool_calls_acc[idx]
if tc_delta.id:
entry["id"] = tc_delta.id
if tc_delta.function:
if tc_delta.function.name:
entry["function"]["name"] += tc_delta.function.name
if tc_delta.function.arguments:
entry["function"]["arguments"] += tc_delta.function.arguments
if chunk.choices[0].finish_reason:
finish_reason = chunk.choices[0].finish_reason
full_content = "".join(content_parts) or None
mock_tool_calls = None
if tool_calls_acc:
mock_tool_calls = []
for idx in sorted(tool_calls_acc):
tc = tool_calls_acc[idx]
mock_tool_calls.append(SimpleNamespace(
id=tc["id"],
type=tc["type"],
function=SimpleNamespace(
name=tc["function"]["name"],
arguments=tc["function"]["arguments"],
),
))
mock_message = SimpleNamespace(
role=role,
content=full_content,
tool_calls=mock_tool_calls,
reasoning_content=None,
)
mock_choice = SimpleNamespace(
index=0,
message=mock_message,
finish_reason=finish_reason or "stop",
)
mock_response = SimpleNamespace(
id="stream-" + str(uuid.uuid4()),
model=model_name,
choices=[mock_choice],
usage=None,
)
result["response"] = mock_response
if self.api_mode == "anthropic_messages":
self._try_refresh_anthropic_client_credentials()
result["response"] = _call_anthropic()
else:
result["response"] = _call_chat_completions()
except Exception as e:
result["error"] = e
if deltas_were_sent["yes"]:
# Streaming failed AFTER some tokens were already delivered
# to consumers. Don't fall back — that would cause
# double-delivery (partial streamed + full non-streamed).
# Let the error propagate; the partial content already
# reached the user via the stream.
logger.warning("Streaming failed after partial delivery, not falling back: %s", e)
result["error"] = e
else:
# Streaming failed before any tokens reached consumers.
# Safe to fall back to the standard non-streaming path.
logger.info("Streaming failed before delivery, falling back to non-streaming: %s", e)
try:
result["response"] = self._interruptible_api_call(api_kwargs)
except Exception as fallback_err:
result["error"] = fallback_err
finally:
request_client = request_client_holder.get("client")
if request_client is not None:
@@ -2939,7 +3126,7 @@ class AIAgent:
self._close_request_openai_client(request_client, reason="stream_interrupt_abort")
except Exception:
pass
raise InterruptedError("Agent interrupted during API call")
raise InterruptedError("Agent interrupted during streaming API call")
if result["error"] is not None:
raise result["error"]
return result["response"]
@@ -3187,6 +3374,7 @@ class AIAgent:
tools=self.tools,
max_tokens=self.max_tokens,
reasoning_config=self.reasoning_config,
is_oauth=getattr(self, "_is_anthropic_oauth", False),
)
if self.api_mode == "codex_responses":
@@ -3301,8 +3489,7 @@ class AIAgent:
extra_body["provider"] = provider_preferences
_is_nous = "nousresearch" in self.base_url.lower()
_is_mistral = "api.mistral.ai" in self.base_url.lower()
if (_is_openrouter or _is_nous) and not _is_mistral:
if self._supports_reasoning_extra_body():
if self.reasoning_config is not None:
rc = dict(self.reasoning_config)
# Nous Portal requires reasoning enabled — don't send
@@ -3326,6 +3513,34 @@ class AIAgent:
return api_kwargs
def _supports_reasoning_extra_body(self) -> bool:
"""Return True when reasoning extra_body is safe to send for this route/model.
OpenRouter forwards unknown extra_body fields to upstream providers.
Some providers/routes reject `reasoning` with 400s, so gate it to
known reasoning-capable model families and direct Nous Portal.
"""
base_url = (self.base_url or "").lower()
if "nousresearch" in base_url:
return True
if "ai-gateway.vercel.sh" in base_url:
return True
if "openrouter" not in base_url:
return False
if "api.mistral.ai" in base_url:
return False
model = (self.model or "").lower()
reasoning_model_prefixes = (
"deepseek/",
"anthropic/",
"openai/",
"x-ai/",
"google/gemini-2",
"qwen/qwen3",
)
return any(model.startswith(prefix) for prefix in reasoning_model_prefixes)
def _build_assistant_message(self, assistant_message, finish_reason: str) -> dict:
"""Build a normalized assistant message dict from an API response message.
@@ -3345,8 +3560,7 @@ class AIAgent:
reasoning_text = combined or None
if reasoning_text and self.verbose_logging:
preview = reasoning_text[:100] + "..." if len(reasoning_text) > 100 else reasoning_text
logging.debug(f"Captured reasoning ({len(reasoning_text)} chars): {preview}")
logging.debug(f"Captured reasoning ({len(reasoning_text)} chars): {reasoning_text}")
if reasoning_text and self.reasoning_callback:
try:
@@ -3490,7 +3704,8 @@ class AIAgent:
flush_content = (
"[System: The session is being compressed. "
"Please save anything worth remembering to your memories.]"
"Save anything worth remembering — prioritize user preferences, "
"corrections, and recurring patterns over task-specific details.]"
)
_sentinel = f"__flush_{id(self)}_{time.monotonic()}"
flush_msg = {"role": "user", "content": flush_content, "_flush_sentinel": _sentinel}
@@ -3579,7 +3794,7 @@ class AIAgent:
tool_calls = assistant_msg.tool_calls
elif self.api_mode == "anthropic_messages" and not _aux_available:
from agent.anthropic_adapter import normalize_anthropic_response as _nar_flush
_flush_msg, _ = _nar_flush(response)
_flush_msg, _ = _nar_flush(response, strip_tool_prefix=getattr(self, '_is_anthropic_oauth', False))
if _flush_msg and _flush_msg.tool_calls:
tool_calls = _flush_msg.tool_calls
elif hasattr(response, "choices") and response.choices:
@@ -3765,6 +3980,8 @@ class AIAgent:
return handle_function_call(
function_name, function_args, effective_task_id,
enabled_tools=list(self.valid_tool_names) if self.valid_tool_names else None,
honcho_manager=self._honcho,
honcho_session_key=self._honcho_session_key,
)
def _execute_tool_calls_concurrent(self, assistant_message, messages: list, effective_task_id: str, api_call_count: int = 0) -> None:
@@ -3815,6 +4032,18 @@ class AIAgent:
except Exception:
pass
# Checkpoint before destructive terminal commands
if function_name == "terminal" and self._checkpoint_mgr.enabled:
try:
cmd = function_args.get("command", "")
if _is_destructive_command(cmd):
cwd = function_args.get("workdir") or os.getenv("TERMINAL_CWD", os.getcwd())
self._checkpoint_mgr.ensure_checkpoint(
cwd, f"before terminal: {cmd[:60]}"
)
except Exception:
pass
parsed_calls.append((tool_call, function_name, function_args))
# ── Logging / callbacks ──────────────────────────────────────────
@@ -3823,8 +4052,12 @@ class AIAgent:
print(f" ⚡ Concurrent: {num_tools} tool calls — {tool_names_str}")
for i, (tc, name, args) in enumerate(parsed_calls, 1):
args_str = json.dumps(args, ensure_ascii=False)
args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
print(f" 📞 Tool {i}: {name}({list(args.keys())}) - {args_preview}")
if self.verbose_logging:
print(f" 📞 Tool {i}: {name}({list(args.keys())})")
print(f" Args: {args_str}")
else:
args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
print(f" 📞 Tool {i}: {name}({list(args.keys())}) - {args_preview}")
for _, name, args in parsed_calls:
if self.tool_progress_callback:
@@ -3889,17 +4122,20 @@ class AIAgent:
logger.warning("Tool %s returned error (%.2fs): %s", function_name, tool_duration, result_preview)
if self.verbose_logging:
result_preview = function_result[:200] if len(function_result) > 200 else function_result
logging.debug(f"Tool {function_name} completed in {tool_duration:.2f}s")
logging.debug(f"Tool result preview: {result_preview}...")
logging.debug(f"Tool result ({len(function_result)} chars): {function_result}")
# Print cute message per tool
if self.quiet_mode:
cute_msg = _get_cute_tool_message_impl(name, args, tool_duration, result=function_result)
print(f" {cute_msg}")
elif not self.quiet_mode:
response_preview = function_result[:self.log_prefix_chars] + "..." if len(function_result) > self.log_prefix_chars else function_result
print(f" ✅ Tool {i+1} completed in {tool_duration:.2f}s - {response_preview}")
if self.verbose_logging:
print(f" ✅ Tool {i+1} completed in {tool_duration:.2f}s")
print(f" Result: {function_result}")
else:
response_preview = function_result[:self.log_prefix_chars] + "..." if len(function_result) > self.log_prefix_chars else function_result
print(f" ✅ Tool {i+1} completed in {tool_duration:.2f}s - {response_preview}")
# Truncate oversized results
MAX_TOOL_RESULT_CHARS = 100_000
@@ -3975,8 +4211,12 @@ class AIAgent:
if not self.quiet_mode:
args_str = json.dumps(function_args, ensure_ascii=False)
args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
print(f" 📞 Tool {i}: {function_name}({list(function_args.keys())}) - {args_preview}")
if self.verbose_logging:
print(f" 📞 Tool {i}: {function_name}({list(function_args.keys())})")
print(f" Args: {args_str}")
else:
args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
print(f" 📞 Tool {i}: {function_name}({list(function_args.keys())}) - {args_preview}")
if self.tool_progress_callback:
try:
@@ -3997,6 +4237,18 @@ class AIAgent:
except Exception:
pass # never block tool execution
# Checkpoint before destructive terminal commands
if function_name == "terminal" and self._checkpoint_mgr.enabled:
try:
cmd = function_args.get("command", "")
if _is_destructive_command(cmd):
cwd = function_args.get("workdir") or os.getenv("TERMINAL_CWD", os.getcwd())
self._checkpoint_mgr.ensure_checkpoint(
cwd, f"before terminal: {cmd[:60]}"
)
except Exception:
pass # never block tool execution
tool_start_time = time.time()
if function_name == "todo":
@@ -4083,25 +4335,9 @@ class AIAgent:
spinner.stop(cute_msg)
elif self.quiet_mode:
self._vprint(f" {cute_msg}")
elif self.quiet_mode and self._stream_callback is None:
elif self.quiet_mode and not self._has_stream_consumers():
face = random.choice(KawaiiSpinner.KAWAII_WAITING)
tool_emoji_map = {
'web_search': '🔍', 'web_extract': '📄', 'web_crawl': '🕸️',
'terminal': '💻', 'process': '⚙️',
'read_file': '📖', 'write_file': '✍️', 'patch': '🔧', 'search_files': '🔎',
'browser_navigate': '🌐', 'browser_snapshot': '📸',
'browser_click': '👆', 'browser_type': '⌨️',
'browser_scroll': '📜', 'browser_back': '◀️',
'browser_press': '⌨️', 'browser_close': '🚪',
'browser_get_images': '🖼️', 'browser_vision': '👁️',
'image_generate': '🎨', 'text_to_speech': '🔊',
'vision_analyze': '👁️', 'mixture_of_agents': '🧠',
'skills_list': '📚', 'skill_view': '📚',
'cronjob': '',
'send_message': '📨', 'todo': '📋', 'memory': '🧠', 'session_search': '🔍',
'clarify': '', 'execute_code': '🐍', 'delegate_task': '🔀',
}
emoji = tool_emoji_map.get(function_name, '')
emoji = _get_tool_emoji(function_name)
preview = _build_tool_preview(function_name, function_args) or function_name
if len(preview) > 30:
preview = preview[:27] + "..."
@@ -4112,6 +4348,8 @@ class AIAgent:
function_result = handle_function_call(
function_name, function_args, effective_task_id,
enabled_tools=list(self.valid_tool_names) if self.valid_tool_names else None,
honcho_manager=self._honcho,
honcho_session_key=self._honcho_session_key,
)
_spinner_result = function_result
except Exception as tool_error:
@@ -4126,13 +4364,17 @@ class AIAgent:
function_result = handle_function_call(
function_name, function_args, effective_task_id,
enabled_tools=list(self.valid_tool_names) if self.valid_tool_names else None,
honcho_manager=self._honcho,
honcho_session_key=self._honcho_session_key,
)
except Exception as tool_error:
function_result = f"Error executing tool '{function_name}': {tool_error}"
logger.error("handle_function_call raised for %s: %s", function_name, tool_error, exc_info=True)
tool_duration = time.time() - tool_start_time
result_preview = function_result[:200] if len(function_result) > 200 else function_result
result_preview = function_result if self.verbose_logging else (
function_result[:200] if len(function_result) > 200 else function_result
)
# Log tool errors to the persistent error log so [error] tags
# in the UI always have a corresponding detailed entry on disk.
@@ -4142,7 +4384,7 @@ class AIAgent:
if self.verbose_logging:
logging.debug(f"Tool {function_name} completed in {tool_duration:.2f}s")
logging.debug(f"Tool result preview: {result_preview}...")
logging.debug(f"Tool result ({len(function_result)} chars): {function_result}")
# Guard against tools returning absurdly large content that would
# blow up the context window. 100K chars ≈ 25K tokens — generous
@@ -4165,8 +4407,12 @@ class AIAgent:
messages.append(tool_msg)
if not self.quiet_mode:
response_preview = function_result[:self.log_prefix_chars] + "..." if len(function_result) > self.log_prefix_chars else function_result
print(f" ✅ Tool {i} completed in {tool_duration:.2f}s - {response_preview}")
if self.verbose_logging:
print(f" ✅ Tool {i} completed in {tool_duration:.2f}s")
print(f" Result: {function_result}")
else:
response_preview = function_result[:self.log_prefix_chars] + "..." if len(function_result) > self.log_prefix_chars else function_result
print(f" ✅ Tool {i} completed in {tool_duration:.2f}s - {response_preview}")
if self._interrupt_requested and i < len(assistant_message.tool_calls):
remaining = len(assistant_message.tool_calls) - i
@@ -4264,9 +4510,8 @@ class AIAgent:
api_messages.insert(sys_offset + idx, pfm.copy())
summary_extra_body = {}
_is_openrouter = "openrouter" in self.base_url.lower()
_is_nous = "nousresearch" in self.base_url.lower()
if _is_openrouter or _is_nous:
if self._supports_reasoning_extra_body():
if self.reasoning_config is not None:
summary_extra_body["reasoning"] = self.reasoning_config
else:
@@ -4310,9 +4555,10 @@ class AIAgent:
if self.api_mode == "anthropic_messages":
from agent.anthropic_adapter import build_anthropic_kwargs as _bak, normalize_anthropic_response as _nar
_ant_kw = _bak(model=self.model, messages=api_messages, tools=None,
max_tokens=self.max_tokens, reasoning_config=self.reasoning_config)
max_tokens=self.max_tokens, reasoning_config=self.reasoning_config,
is_oauth=getattr(self, '_is_anthropic_oauth', False))
summary_response = self._anthropic_messages_create(_ant_kw)
_msg, _ = _nar(summary_response)
_msg, _ = _nar(summary_response, strip_tool_prefix=getattr(self, '_is_anthropic_oauth', False))
final_response = (_msg.content or "").strip()
else:
summary_response = self._ensure_primary_openai_client(reason="iteration_limit_summary").chat.completions.create(**summary_kwargs)
@@ -4340,9 +4586,10 @@ class AIAgent:
elif self.api_mode == "anthropic_messages":
from agent.anthropic_adapter import build_anthropic_kwargs as _bak2, normalize_anthropic_response as _nar2
_ant_kw2 = _bak2(model=self.model, messages=api_messages, tools=None,
is_oauth=getattr(self, '_is_anthropic_oauth', False),
max_tokens=self.max_tokens, reasoning_config=self.reasoning_config)
retry_response = self._anthropic_messages_create(_ant_kw2)
_retry_msg, _ = _nar2(retry_response)
_retry_msg, _ = _nar2(retry_response, strip_tool_prefix=getattr(self, '_is_anthropic_oauth', False))
final_response = (_retry_msg.content or "").strip()
else:
summary_kwargs = {
@@ -4385,6 +4632,7 @@ class AIAgent:
task_id: str = None,
stream_callback: Optional[callable] = None,
persist_user_message: Optional[str] = None,
sync_honcho: bool = True,
) -> Dict[str, Any]:
"""
Run a complete conversation with tool calling until completion.
@@ -4400,6 +4648,8 @@ class AIAgent:
persist_user_message: Optional clean user message to store in
transcripts/history when user_message contains API-only
synthetic prefixes.
sync_honcho: When False, skip writing the final synthetic turn back
to Honcho or queuing follow-up prefetch work.
Returns:
Dict: Complete conversation result with final response and message history
@@ -4456,8 +4706,9 @@ class AIAgent:
self._turns_since_memory += 1
if self._turns_since_memory >= self._memory_nudge_interval:
user_message += (
"\n\n[System: You've had several exchanges in this session. "
"Consider whether there's anything worth saving to your memories.]"
"\n\n[System: You've had several exchanges. Consider: "
"has the user shared preferences, corrected you, or revealed "
"something about their workflow worth remembering for future sessions?]"
)
self._turns_since_memory = 0
@@ -4467,8 +4718,9 @@ class AIAgent:
and self._iters_since_skill >= self._skill_nudge_interval
and "skill_manage" in self.valid_tool_names):
user_message += (
"\n\n[System: The previous task involved many steps. "
"If you discovered a reusable workflow, consider saving it as a skill.]"
"\n\n[System: The previous task involved many tool calls. "
"Save the approach as a skill if it's reusable, or update "
"any existing skill you used if it was wrong or incomplete.]"
)
self._iters_since_skill = 0
@@ -4722,8 +4974,8 @@ class AIAgent:
self._vprint(f"\n{self.log_prefix}🔄 Making API call #{api_call_count}/{self.max_iterations}...")
self._vprint(f"{self.log_prefix} 📊 Request size: {len(api_messages)} messages, ~{approx_tokens:,} tokens (~{total_chars:,} chars)")
self._vprint(f"{self.log_prefix} 🔧 Available tools: {len(self.tools) if self.tools else 0}")
elif self._stream_callback is None:
# Animated thinking spinner in quiet mode (skip during streaming TTS)
elif not self._has_stream_consumers():
# Animated thinking spinner in quiet mode (skip during streaming)
face = random.choice(KawaiiSpinner.KAWAII_THINKING)
verb = random.choice(KawaiiSpinner.THINKING_VERBS)
if self.thinking_callback:
@@ -4763,33 +5015,22 @@ class AIAgent:
if os.getenv("HERMES_DUMP_REQUESTS", "").strip().lower() in {"1", "true", "yes", "on"}:
self._dump_api_request_debug(api_kwargs, reason="preflight")
cb = getattr(self, "_stream_callback", None)
if cb is not None and self.api_mode == "chat_completions":
response = self._streaming_api_call(api_kwargs, cb)
if self._has_stream_consumers():
# Streaming path: fire delta callbacks for real-time
# token delivery to CLI display, gateway, or TTS.
def _stop_spinner():
nonlocal thinking_spinner
if thinking_spinner:
thinking_spinner.stop("")
thinking_spinner = None
if self.thinking_callback:
self.thinking_callback("")
response = self._interruptible_streaming_api_call(
api_kwargs, on_first_delta=_stop_spinner
)
else:
response = self._interruptible_api_call(api_kwargs)
# Forward full response to TTS callback for non-streaming providers
# (e.g. Anthropic) so voice TTS still works via batch delivery.
if cb is not None and response:
try:
content = None
# Try choices first — _interruptible_api_call converts all
# providers (including Anthropic) to this format.
try:
content = response.choices[0].message.content
except (AttributeError, IndexError):
pass
# Fallback: Anthropic native content blocks
if not content and self.api_mode == "anthropic_messages":
text_parts = [
block.text for block in getattr(response, "content", [])
if getattr(block, "type", None) == "text" and getattr(block, "text", None)
]
content = " ".join(text_parts) if text_parts else None
if content:
cb(content)
except Exception:
pass
api_duration = time.time() - api_start_time
@@ -5017,6 +5258,15 @@ class AIAgent:
if hasattr(response, 'usage') and response.usage:
if self.api_mode in ("codex_responses", "anthropic_messages"):
prompt_tokens = getattr(response.usage, 'input_tokens', 0) or 0
if self.api_mode == "anthropic_messages":
# Anthropic splits input into cache_read + cache_creation
# + non-cached input_tokens. Without adding the cached
# portions, the context bar shows only the tiny non-cached
# portion (e.g. 3 tokens) instead of the real total (~18K).
# Other providers (OpenAI/Codex) already include cached
# tokens in their input_tokens/prompt_tokens field.
prompt_tokens += getattr(response.usage, 'cache_read_input_tokens', 0) or 0
prompt_tokens += getattr(response.usage, 'cache_creation_input_tokens', 0) or 0
completion_tokens = getattr(response.usage, 'output_tokens', 0) or 0
total_tokens = (
getattr(response.usage, 'total_tokens', None)
@@ -5044,6 +5294,22 @@ class AIAgent:
self.session_completion_tokens += completion_tokens
self.session_total_tokens += total_tokens
self.session_api_calls += 1
# Persist token counts to session DB for /insights.
# Gateway sessions persist via session_store.update_session()
# after run_conversation returns, so only persist here for
# CLI (and other non-gateway) platforms to avoid double-counting.
if (self._session_db and self.session_id
and getattr(self, 'platform', None) == 'cli'):
try:
self._session_db.update_token_counts(
self.session_id,
input_tokens=prompt_tokens,
output_tokens=completion_tokens,
model=self.model,
)
except Exception:
pass # never block the agent loop
if self.verbose_logging:
logging.debug(f"Token usage: prompt={usage_dict['prompt_tokens']:,}, completion={usage_dict['completion_tokens']:,}, total={usage_dict['total_tokens']:,}")
@@ -5226,6 +5492,27 @@ class AIAgent:
'request entity too large', # OpenRouter/Nous 413 safety net
'prompt is too long', # Anthropic: "prompt is too long: N tokens > M maximum"
])
# Fallback heuristic: Anthropic sometimes returns a generic
# 400 invalid_request_error with just "Error" as the message
# when the context is too large. If the error message is very
# short/generic AND the session is large, treat it as a
# probable context-length error and attempt compression rather
# than aborting. This prevents an infinite failure loop where
# each failed message gets persisted, making the session even
# larger. (#1630)
if not is_context_length_error and status_code == 400:
ctx_len = getattr(getattr(self, 'context_compressor', None), 'context_length', 200000)
is_large_session = approx_tokens > ctx_len * 0.4 or len(api_messages) > 80
is_generic_error = len(error_msg.strip()) < 30 # e.g. just "error"
if is_large_session and is_generic_error:
is_context_length_error = True
self._vprint(
f"{self.log_prefix}⚠️ Generic 400 with large session "
f"(~{approx_tokens:,} tokens, {len(api_messages)} msgs) — "
f"treating as probable context overflow.",
force=True,
)
if is_context_length_error:
compressor = self.context_compressor
@@ -5292,10 +5579,19 @@ class AIAgent:
# These indicate a problem with the request itself (bad model ID,
# invalid API key, forbidden, etc.) and will never succeed on retry.
# Note: 413 and context-length errors are excluded — handled above.
# 429 (rate limit) is transient and MUST be retried with backoff.
# 529 (Anthropic overloaded) is also transient.
# Also catch local validation errors (ValueError, TypeError) — these
# are programming bugs, not transient failures.
_RETRYABLE_STATUS_CODES = {413, 429, 529}
is_local_validation_error = isinstance(api_error, (ValueError, TypeError))
is_client_status_error = isinstance(status_code, int) and 400 <= status_code < 500 and status_code != 413
# Detect generic 400s from Anthropic OAuth (transient server-side failures).
# Real invalid_request_error responses include a descriptive message;
# transient ones contain only "Error" or are empty. (ref: issue #1608)
_err_body = getattr(api_error, "body", None) or {}
_err_message = (_err_body.get("error", {}).get("message", "") if isinstance(_err_body, dict) else "")
_is_generic_400 = (status_code == 400 and _err_message.strip().lower() in ("error", ""))
is_client_status_error = isinstance(status_code, int) and 400 <= status_code < 500 and status_code not in _RETRYABLE_STATUS_CODES and not _is_generic_400
is_client_error = (is_local_validation_error or is_client_status_error or any(phrase in error_msg for phrase in [
'error code: 401', 'error code: 403',
'error code: 404', 'error code: 422',
@@ -5316,7 +5612,19 @@ class AIAgent:
self._vprint(f"{self.log_prefix}❌ Non-retryable client error detected. Aborting immediately.", force=True)
self._vprint(f"{self.log_prefix} 💡 This type of error won't be fixed by retrying.", force=True)
logging.error(f"{self.log_prefix}Non-retryable client error: {api_error}")
self._persist_session(messages, conversation_history)
# Skip session persistence when the error is likely
# context-overflow related (status 400 + large session).
# Persisting the failed user message would make the
# session even larger, causing the same failure on the
# next attempt. (#1630)
if status_code == 400 and (approx_tokens > 50000 or len(api_messages) > 80):
self._vprint(
f"{self.log_prefix}⚠️ Skipping session persistence "
f"for large failed session to prevent growth loop.",
force=True,
)
else:
self._persist_session(messages, conversation_history)
return {
"final_response": None,
"messages": messages,
@@ -5391,7 +5699,9 @@ class AIAgent:
assistant_message, finish_reason = self._normalize_codex_response(response)
elif self.api_mode == "anthropic_messages":
from agent.anthropic_adapter import normalize_anthropic_response
assistant_message, finish_reason = normalize_anthropic_response(response)
assistant_message, finish_reason = normalize_anthropic_response(
response, strip_tool_prefix=getattr(self, "_is_anthropic_oauth", False)
)
else:
assistant_message = response.choices[0].message
@@ -5418,7 +5728,10 @@ class AIAgent:
# Handle assistant response
if assistant_message.content and not self.quiet_mode:
self._vprint(f"{self.log_prefix}🤖 Assistant: {assistant_message.content[:100]}{'...' if len(assistant_message.content) > 100 else ''}")
if self.verbose_logging:
self._vprint(f"{self.log_prefix}🤖 Assistant: {assistant_message.content}")
else:
self._vprint(f"{self.log_prefix}🤖 Assistant: {assistant_message.content[:100]}{'...' if len(assistant_message.content) > 100 else ''}")
# Notify progress callback of model's thinking (used by subagent
# delegation to relay the child's reasoning to the parent display).
@@ -5889,7 +6202,7 @@ class AIAgent:
self._persist_session(messages, conversation_history)
# Sync conversation to Honcho for user modeling
if final_response and not interrupted:
if final_response and not interrupted and sync_honcho:
self._honcho_sync(original_user_message, final_response)
self._queue_honcho_prefetch(original_user_message)
+10 -3
@@ -483,6 +483,8 @@ install_system_packages() {
elif command -v sudo &> /dev/null; then
if [ "$IS_INTERACTIVE" = true ]; then
echo ""
log_info "sudo is needed ONLY to install optional system packages (${pkgs[*]}) via your package manager."
log_info "Hermes Agent itself does not require or retain root access."
read -p "Install ${description}? (requires sudo) [y/N] " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
@@ -496,8 +498,9 @@ install_system_packages() {
# Non-interactive (e.g. curl | bash) but a terminal is available.
# Read the prompt from /dev/tty (same approach the setup wizard uses).
echo ""
log_info "Installing ${description} requires sudo."
read -p "Install? [Y/n] " -n 1 -r < /dev/tty
log_info "sudo is needed ONLY to install optional system packages (${pkgs[*]}) via your package manager."
log_info "Hermes Agent itself does not require or retain root access."
read -p "Install ${description}? [Y/n] " -n 1 -r < /dev/tty
echo
if [[ $REPLY =~ ^[Yy]$ ]] || [[ -z $REPLY ]]; then
if sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a $install_cmd < /dev/tty; then
@@ -688,7 +691,9 @@ install_deps() {
sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a apt-get update -qq && sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a apt-get install -y -qq build-essential python3-dev libffi-dev >/dev/null 2>&1 || true
log_success "Build tools installed"
else
read -p "Install build tools (build-essential, python3-dev)? (requires sudo) [Y/n] " -n 1 -r < /dev/tty
log_info "sudo is needed ONLY to install build tools (build-essential, python3-dev, libffi-dev) via apt."
log_info "Hermes Agent itself does not require or retain root access."
read -p "Install build tools? [Y/n] " -n 1 -r < /dev/tty
echo
if [[ $REPLY =~ ^[Yy]$ ]] || [[ -z $REPLY ]]; then
sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a apt-get update -qq && sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a apt-get install -y -qq build-essential python3-dev libffi-dev >/dev/null 2>&1 || true
@@ -908,6 +913,8 @@ install_node_deps() {
cd "$INSTALL_DIR" && npx playwright install chromium 2>/dev/null || true
;;
*)
log_info "Playwright may request sudo to install browser system dependencies (shared libraries)."
log_info "This is standard Playwright setup — Hermes itself does not require root access."
cd "$INSTALL_DIR" && npx playwright install --with-deps chromium 2>/dev/null || true
;;
esac
+111 -162
View File
@@ -5,12 +5,26 @@ description: "Production pipeline for ASCII art video — any format. Converts v
# ASCII Video Production Pipeline
Full production pipeline for rendering any content as colored ASCII character video.
## Creative Standard
This is visual art. ASCII characters are the medium; cinema is the standard.
**Before writing a single line of code**, articulate the creative concept. What is the mood? What visual story does this tell? What makes THIS project different from every other ASCII video? The user's prompt is a starting point — interpret it with creative ambition, not literal transcription.
**First-render excellence is non-negotiable.** The output must be visually striking without requiring revision rounds. If something looks generic, flat, or like "AI-generated ASCII art," it is wrong — rethink the creative concept before shipping.
**Go beyond the reference vocabulary.** The effect catalogs, shader presets, and palette libraries in the references are a starting vocabulary. For every project, combine, modify, and invent new patterns. The catalog is a palette of paints — you write the painting.
**Be proactively creative.** Extend the skill's vocabulary when the project calls for it. If the references don't have what the vision demands, build it. Include at least one visual moment the user didn't ask for but will appreciate — a transition, an effect, a color choice that elevates the whole piece.
**Cohesive aesthetic over technical correctness.** All scenes in a video must feel connected by a unifying visual language — shared color temperature, related character palettes, consistent motion vocabulary. A technically correct video where every scene uses a random different effect is an aesthetic failure.
**Dense, layered, considered.** Every frame should reward viewing. Never flat black backgrounds. Always multi-grid composition. Always per-scene variation. Always intentional color.
## Modes
| Mode | Input | Output | Read |
|------|-------|--------|------|
| Mode | Input | Output | Reference |
|------|-------|--------|-----------|
| **Video-to-ASCII** | Video file | ASCII recreation of source footage | `references/inputs.md` § Video Sampling |
| **Audio-reactive** | Audio file | Generative visuals driven by audio features | `references/inputs.md` § Audio Analysis |
| **Generative** | None (or seed params) | Procedural ASCII animation | `references/effects.md` |
@@ -20,210 +34,154 @@ Full production pipeline for rendering any content as colored ASCII character vi
## Stack
Single self-contained Python script per project. No GPU.
Single self-contained Python script per project. No GPU required.
| Layer | Tool | Purpose |
|-------|------|---------|
| Core | Python 3.10+, NumPy | Math, array ops, vectorized effects |
| Signal | SciPy | FFT, peak detection (audio modes only) |
| Imaging | Pillow (PIL) | Font rasterization, video frame decoding, image I/O |
| Video I/O | ffmpeg (CLI) | Decode input, encode output segments, mux audio, mix tracks |
| Parallel | concurrent.futures / multiprocessing | N workers for batch/clip rendering |
| TTS | ElevenLabs API (or similar) | Generate narration clips for quote/testimonial videos |
| Optional | OpenCV | Video frame sampling, edge detection, optical flow |
| Signal | SciPy | FFT, peak detection (audio modes) |
| Imaging | Pillow (PIL) | Font rasterization, frame decoding, image I/O |
| Video I/O | ffmpeg (CLI) | Decode input, encode output, mux audio |
| Parallel | concurrent.futures | N workers for batch/clip rendering |
| TTS | ElevenLabs API (optional) | Generate narration clips |
| Optional | OpenCV | Video frame sampling, edge detection |
## Pipeline Architecture (v2)
## Pipeline Architecture
Every mode follows the same 6-stage pipeline. See `references/architecture.md` for implementation details, `references/scenes.md` for scene protocol, and `references/composition.md` for multi-grid composition and tonemap.
Every mode follows the same 6-stage pipeline:
```
┌─────────┐ ┌──────────┐ ┌───────────┐ ┌──────────┐ ┌─────────┐ ┌────────┐
│ 1.INPUT │→│ 2.ANALYZE │→│ 3.SCENE_FN │→│ 4.TONEMAP │→│ 5.SHADE │→│ 6.ENCODE│
│ load src │ │ features │ │ → canvas │ │ normalize │ │ post-fx │ │ → video │
└─────────┘ └──────────┘ └───────────┘ └──────────┘ └─────────┘ └────────┘
INPUT → ANALYZE → SCENE_FN → TONEMAP → SHADE → ENCODE
```
1. **INPUT** — Load/decode source material (video frames, audio samples, images, or nothing)
2. **ANALYZE** — Extract per-frame features (audio bands, video luminance/edges, motion vectors)
3. **SCENE_FN** — Scene function renders directly to pixel canvas (`uint8 H,W,3`). May internally compose multiple character grids via `_render_vf()` + pixel blend modes. See `references/composition.md`
4. **TONEMAP** — Percentile-based adaptive brightness normalization with per-scene gamma. Replaces linear brightness multipliers. See `references/composition.md` § Adaptive Tonemap
5. **SHADE** — Apply post-processing `ShaderChain` + `FeedbackBuffer`. See `references/shaders.md`
3. **SCENE_FN** — Scene function renders to pixel canvas (`uint8 H,W,3`). Composes multiple character grids via `_render_vf()` + pixel blend modes. See `references/composition.md`
4. **TONEMAP** — Percentile-based adaptive brightness normalization. See `references/composition.md` § Adaptive Tonemap
5. **SHADE** — Post-processing via `ShaderChain` + `FeedbackBuffer`. See `references/shaders.md`
6. **ENCODE** — Pipe raw RGB frames to ffmpeg for H.264/GIF encoding
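
As a rough illustration of the TONEMAP stage, the sketch below shows one way percentile-based normalization with a gamma curve could work. It is not the implementation from `references/composition.md`, which is not shown here: the function names `percentile` and `tonemap`, the default percentiles, and the default gamma are all assumptions chosen for the example, and it uses plain Python lists for readability where the skill's stack would use NumPy arrays.

```python
def percentile(values, pct):
    """Linearly interpolated percentile of a non-empty sequence (pct in 0..100)."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return float(ordered[0])
    rank = (pct / 100.0) * (len(ordered) - 1)
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    frac = rank - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


def tonemap(pixels, lo_pct=1.0, hi_pct=99.5, gamma=0.75):
    """Map brightness values to 0..255 using percentile anchors and a gamma curve.

    lo_pct / hi_pct / gamma are hypothetical defaults, not values from the skill.
    """
    lo = percentile(pixels, lo_pct)
    hi = percentile(pixels, hi_pct)
    if hi <= lo:
        # Flat frame: nothing to stretch, so return black instead of dividing by zero.
        return [0] * len(pixels)
    out = []
    for p in pixels:
        t = (p - lo) / (hi - lo)        # stretch so lo maps to 0 and hi maps to 1
        t = min(1.0, max(0.0, t))       # clip outliers beyond the percentile anchors
        out.append(int(255 * t ** gamma + 0.5))
    return out


if __name__ == "__main__":
    # A dim frame is stretched to the full output range.
    print(tonemap([10, 20, 30, 40, 50], lo_pct=0, hi_pct=100, gamma=1.0))
```

Anchoring on percentiles rather than the raw minimum and maximum keeps a few very bright or very dark cells from compressing the rest of the frame, which is presumably why the pipeline prefers it over a fixed brightness multiplier.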
## Creative Direction
**Every project should look and feel different.** The references provide a vocabulary of building blocks — don't copy them verbatim. Combine, modify, and invent.
### Aesthetic Dimensions to Vary
### Aesthetic Dimensions
| Dimension | Options | Reference |
|-----------|---------|-----------|
| **Character palette** | Density ramps, block elements, symbols, scripts (katakana, Greek, runes, braille), dots, project-specific | `architecture.md` § Character Palettes |
| **Color strategy** | HSV (angle/distance/time/value mapped), OKLAB/OKLCH (perceptually uniform), discrete RGB palettes, auto-generated harmony (complementary/triadic/analogous/tetradic), monochrome, temperature | `architecture.md` § Color System |
| **Color tint** | Warm, cool, amber, matrix green, neon pink, sepia, ice, blood, void, sunset | `shaders.md` § Color Grade |
| **Background texture** | Sine fields, fBM noise, domain warp, voronoi cells, reaction-diffusion, cellular automata, video source | `effects.md` § Background Fills, Noise-Based Fields, Simulation-Based Fields |
| **Primary effects** | Rings, spirals, tunnel, vortex, waves, interference, aurora, ripple, fire, strange attractors, SDFs (geometric shapes with smooth booleans) | `effects.md` § Radial / Wave / Fire / SDF-Based Fields |
| **Particles** | Energy sparks, snow, rain, bubbles, runes, binary data, orbits, gravity wells, flocking boids, flow-field followers, trail-drawing particles | `effects.md` § Particle Systems |
| **Shader mood** | Retro CRT, clean modern, glitch art, cinematic, dreamy, harsh industrial, psychedelic | `shaders.md` § Design Philosophy |
| **Character palette** | Density ramps, block elements, symbols, scripts (katakana, Greek, runes, braille), project-specific | `architecture.md` § Palettes |
| **Color strategy** | HSV, OKLAB/OKLCH, discrete RGB palettes, auto-generated harmony, monochrome, temperature | `architecture.md` § Color System |
| **Background texture** | Sine fields, fBM noise, domain warp, voronoi, reaction-diffusion, cellular automata, video | `effects.md` |
| **Primary effects** | Rings, spirals, tunnel, vortex, waves, interference, aurora, fire, SDFs, strange attractors | `effects.md` |
| **Particles** | Sparks, snow, rain, bubbles, runes, orbits, flocking boids, flow-field followers, trails | `effects.md` § Particles |
| **Shader mood** | Retro CRT, clean modern, glitch art, cinematic, dreamy, industrial, psychedelic | `shaders.md` |
| **Grid density** | xs(8px) through xxl(40px), mixed per layer | `architecture.md` § Grid System |
| **Font** | Menlo, Monaco, Courier, SF Mono, JetBrains Mono, Fira Code, IBM Plex | `architecture.md` § Font Selection |
| **Coordinate space** | Cartesian, polar, tiled, rotated, skewed, fisheye, twisted, Möbius, domain-warped | `effects.md` § Coordinate Transforms |
| **Mirror mode** | None, horizontal, vertical, quad, diagonal, kaleidoscope | `shaders.md` § Mirror Effects |
| **Masking** | Circle, rect, ring, gradient, text stencil, value-field-as-mask, animated iris/wipe/dissolve | `composition.md` § Masking |
| **Temporal motion** | Static, audio-reactive, eased keyframes, morphing between fields, temporal noise (smooth in-place evolution) | `effects.md` § Temporal Coherence |
| **Transition style** | Crossfade, wipe (directional/radial), dissolve, glitch cut, iris open/close, mask-based reveal | `shaders.md` § Transitions, `composition.md` § Animated Masks |
| **Aspect ratio** | Landscape (16:9), portrait (9:16), square (1:1), ultrawide (21:9) | `architecture.md` § Resolution Presets |
| **Coordinate space** | Cartesian, polar, tiled, rotated, fisheye, Möbius, domain-warped | `effects.md` § Transforms |
| **Feedback** | Zoom tunnel, rainbow trails, ghostly echo, rotating mandala, color evolution | `composition.md` § Feedback |
| **Masking** | Circle, ring, gradient, text stencil, animated iris/wipe/dissolve | `composition.md` § Masking |
| **Transitions** | Crossfade, wipe, dissolve, glitch cut, iris, mask-based reveal | `shaders.md` § Transitions |
### Per-Section Variation
Never use the same config for the entire video. For each section/scene/quote:
- Choose a **different background effect** (or compose 2-3)
- Choose a **different character palette** (match the mood)
- Choose a **different color strategy** (or at minimum a different hue)
- Vary **shader intensity** (more bloom during peaks, more grain during quiet)
- Use **different particle types** if particles are active
Never use the same config for the entire video. For each section/scene:
- **Different background effect** (or compose 2-3)
- **Different character palette** (match the mood)
- **Different color strategy** (or at minimum a different hue)
- **Vary shader intensity** (more bloom during peaks, more grain during quiet)
- **Different particle types** if particles are active
### Project-Specific Invention
For every project, invent at least one of:
- A custom character palette matching the theme
- A custom background effect (combine/modify existing ones)
- A custom background effect (combine/modify existing building blocks)
- A custom color palette (discrete RGB set matching the brand/mood)
- A custom particle character set
- A novel scene transition or visual moment
Don't just pick from the catalog. The catalog is vocabulary — you write the poem.
## Workflow
### Step 1: Determine Mode and Gather Requirements
### Step 1: Creative Vision
Before any code, articulate the creative concept:
- **Mood/atmosphere**: What should the viewer feel? Energetic, meditative, chaotic, elegant, ominous?
- **Visual story**: What happens over the duration? Build tension? Transform? Dissolve?
- **Color world**: Warm/cool? Monochrome? Neon? Earth tones? What's the dominant hue?
- **Character texture**: Dense data? Sparse stars? Organic dots? Geometric blocks?
- **What makes THIS different**: What's the one thing that makes this project unique?
- **Emotional arc**: How do scenes progress? Open with energy, build to climax, resolve?
Map the user's prompt to aesthetic choices. A "chill lo-fi visualizer" demands different everything from a "glitch cyberpunk data stream."
### Step 2: Technical Design
Establish with user:
- **Input source** — file path, format, duration
- **Mode** — which of the 6 modes above
- **Sections** — time-mapped style changes (timestamps → effect names)
- **Resolution** — landscape 1920x1080 (default), portrait 1080x1920, square 1080x1080 @ 24fps; GIFs typically 640x360 @ 15fps
- **Style direction** — dense/sparse, bright/dark, chaotic/minimal, color palette
- **Text/branding** — easter eggs, overlays, credits, themed character sets
- **Output format** — MP4 (default), GIF, PNG sequence
- **Aspect ratio** — landscape (16:9), portrait (9:16 for TikTok/Reels/Stories), square (1:1 for IG feed)
### Step 2: Detect Hardware and Set Quality
Before building the script, detect the user's hardware and set appropriate defaults. See `references/optimization.md` § Hardware Detection.
```python
hw = detect_hardware()
profile = quality_profile(hw, target_duration, user_quality_pref)
log(f"Hardware: {hw['cpu_count']} cores, {hw['mem_gb']:.1f}GB RAM")
log(f"Render: {profile['vw']}x{profile['vh']} @{profile['fps']}fps, {profile['workers']} workers")
```
Never hardcode worker counts, resolution, or CRF. Always detect and adapt.
- **Resolution** — landscape 1920x1080 (default), portrait 1080x1920, square 1080x1080 @ 24fps
- **Hardware detection** — auto-detect cores/RAM, set quality profile. See `references/optimization.md`
- **Sections** — map timestamps to scene functions, each with its own effect/palette/color/shader config
- **Output format** — MP4 (default), GIF (640x360 @ 15fps), PNG sequence
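A minimal sketch of the hardware detection step, assuming only the standard library. The function names `detect_hardware()` and `quality_profile()` and the dictionary keys match the usage shown in this document, but the preset table, the thresholds, and the 4 GB fallback are illustrative assumptions, not the values in `references/optimization.md`.

```python
import os

# Hypothetical presets: (width, height, fps). The real values live in optimization.md.
PRESETS = {
    "draft": (640, 360, 12),
    "preview": (1280, 720, 24),
    "production": (1920, 1080, 24),
}


def detect_hardware():
    """Return the core count and physical RAM in GB (assumed 4.0 if unknown)."""
    cpu_count = os.cpu_count() or 1
    mem_gb = 4.0  # assumed fallback when the platform offers no lookup
    try:
        # POSIX-only lookup; other platforms raise and keep the fallback.
        mem_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024 ** 3
    except (AttributeError, ValueError, OSError):
        pass
    return {"cpu_count": cpu_count, "mem_gb": mem_gb}


def quality_profile(hw, target_duration, user_quality_pref=None):
    """Pick resolution, fps, and worker count from hardware and user preference."""
    if user_quality_pref in PRESETS:
        name = user_quality_pref
    elif hw["cpu_count"] <= 2 or hw["mem_gb"] < 4:
        name = "draft"
    elif target_duration > 600:
        name = "preview"  # long renders default to a cheaper preset
    else:
        name = "production"
    vw, vh, fps = PRESETS[name]
    # Leave one core free for the main process and the encoder.
    workers = max(1, hw["cpu_count"] - 1)
    return {"name": name, "vw": vw, "vh": vh, "fps": fps, "workers": workers}
```

The returned dictionaries plug directly into the `log(...)` calls shown in the hardware detection step, since they expose `cpu_count`, `mem_gb`, `vw`, `vh`, `fps`, and `workers`.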
### Step 3: Build the Script
Write as a single Python file. Major components:
Single Python file. Components (with references):
1. **Hardware detection + quality profile** — see `references/optimization.md`
2. **Input loader** — mode-dependent; see `references/inputs.md`
3. **Feature analyzer** — audio FFT, video luminance, or pass-through
4. **Grid + renderer** — multi-density character grids with bitmap cache; `_render_vf()` helper for value/hue field → canvas
5. **Character palettes** — multiple palettes chosen per project theme; see `references/architecture.md`
6. **Color system** — HSV + discrete RGB palettes as needed; see `references/architecture.md`
7. **Scene functions** — each returns `canvas (uint8 H,W,3)` directly. May compose multiple grids internally via pixel blend modes. See `references/scenes.md` + `references/composition.md`
8. **Tonemap** — adaptive brightness normalization with per-scene gamma; see `references/composition.md`
9. **Shader pipeline** — `ShaderChain` + `FeedbackBuffer` per-section config; see `references/shaders.md`
10. **Scene table + dispatcher** — maps time ranges to scene functions + shader/feedback configs; see `references/scenes.md`
11. **Parallel encoder** — N-worker batch clip rendering with ffmpeg pipes
1. **Hardware detection + quality profile** — `references/optimization.md`
2. **Input loader** — mode-dependent; `references/inputs.md`
3. **Feature analyzer** — audio FFT, video luminance, or synthetic
4. **Grid + renderer** — multi-density grids with bitmap cache; `references/architecture.md`
5. **Character palettes** — multiple per project; `references/architecture.md` § Palettes
6. **Color system** — HSV + discrete RGB + harmony generation; `references/architecture.md` § Color
7. **Scene functions** — each returns `canvas (uint8 H,W,3)`; `references/scenes.md`
8. **Tonemap** — adaptive brightness normalization; `references/composition.md`
9. **Shader pipeline** — `ShaderChain` + `FeedbackBuffer`; `references/shaders.md`
10. **Scene table + dispatcher** — time → scene function + config; `references/scenes.md`
11. **Parallel encoder** — N-worker clip rendering with ffmpeg pipes
12. **Main** — orchestrate full pipeline
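The scene table and dispatcher (component 10) can be sketched as follows. This is an illustration under assumptions: the entry keys (`start`, `end`, `fn`, `gamma`), the helper names `find_scene()` and `render_at()`, and the placeholder scene functions are hypothetical; the actual `SCENES` structure is defined in `references/scenes.md`. Real scene functions return a `uint8` canvas rather than a string.

```python
def scene_intro(r, f, t, S):
    """Placeholder scene; a real one returns a canvas of shape (H, W, 3)."""
    return f"intro@{t:.1f}"


def scene_quote(r, f, t, S):
    """Placeholder scene; receives time local to its own start."""
    return f"quote@{t:.1f}"


# Hypothetical table: each entry maps a time range to a scene function and config.
SCENES = [
    {"start": 0.0, "end": 5.0, "fn": scene_intro, "gamma": 0.75},
    {"start": 5.0, "end": 12.0, "fn": scene_quote, "gamma": 0.55},
]


def find_scene(scenes, t):
    """Return the entry whose [start, end) range contains t; hold the last one past the end."""
    for entry in scenes:
        if entry["start"] <= t < entry["end"]:
            return entry
    return scenes[-1]


def render_at(r, f, t, S, scenes=SCENES):
    """Dispatch global time t to the matching scene using scene-local time."""
    entry = find_scene(scenes, t)
    local_t = t - entry["start"]
    return entry["fn"](r, f, local_t, S), entry["gamma"]
```

Passing scene-local time keeps each scene function independent of where it sits in the timeline, so scenes can be reordered by editing only the table.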
### Step 4: Handle Critical Bugs
### Step 4: Quality Verification
#### Font Cell Height (macOS Pillow)
- **Test frames first**: render single frames at key timestamps before full render
- **Brightness check**: `canvas.mean() > 8` for all ASCII content. If dark, lower gamma
- **Visual coherence**: do all scenes feel like they belong to the same video?
- **Creative vision check**: does the output match the concept from Step 1? If it looks generic, go back
`textbbox()` returns wrong height. Use `font.getmetrics()`:
## Critical Implementation Notes
```python
ascent, descent = font.getmetrics()
cell_height = ascent + descent # correct
```
### Brightness — Use `tonemap()`, Not Linear Multipliers
#### ffmpeg Pipe Deadlock
Never use `stderr=subprocess.PIPE` with long-running ffmpeg. Redirect to file:
```python
stderr_fh = open(err_path, "w")
pipe = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.DEVNULL, stderr=stderr_fh)
```
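For context, a self-contained sketch of the same redirect pattern wrapped in open/close helpers. To keep it runnable without ffmpeg installed, the demo launches a stand-in Python child that writes far more than 64KB to stderr; the helper names `open_encoder()`, `close_encoder()`, and `demo()` are assumptions for illustration.

```python
import os
import subprocess
import sys
import tempfile


def open_encoder(cmd, err_path):
    """Start a long-running child with stderr redirected to a file, not a pipe."""
    stderr_fh = open(err_path, "w")
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.DEVNULL, stderr=stderr_fh)
    return proc, stderr_fh


def close_encoder(proc, stderr_fh):
    """Signal end of input, wait for the child, and release the log handle."""
    proc.stdin.close()
    code = proc.wait()
    stderr_fh.close()
    return code


# Stand-in for ffmpeg: consumes stdin and logs ~4KB to stderr per 64KB chunk read.
CHILD = (
    "import sys\n"
    "for chunk in iter(lambda: sys.stdin.buffer.read(65536), b''):\n"
    "    sys.stderr.write('x' * 4096 + '\\n')\n"
)


def demo():
    """Feed 4MB of dummy frame data and return (exit code, stderr log size)."""
    with tempfile.TemporaryDirectory() as tmp:
        err_path = os.path.join(tmp, "encoder.log")
        proc, fh = open_encoder([sys.executable, "-c", CHILD], err_path)
        for _ in range(64):
            proc.stdin.write(b"\0" * 65536)
        code = close_encoder(proc, fh)
        return code, os.path.getsize(err_path)
```

With a real encoder, `cmd` would be the ffmpeg command line and the loop would write one frame's raw bytes per iteration. The log file also preserves the encoder's error output for diagnosis after a failed render.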
#### Brightness — Use `tonemap()`, Not Linear Multipliers
ASCII on black is inherently dark. This is the #1 visual issue. **Do NOT use linear `* N` brightness multipliers** — they clip highlights and wash out the image. Instead, use the **adaptive tonemap** function from `references/composition.md`:
This is the #1 visual issue. ASCII on black is inherently dark. **Never use `canvas * N` multipliers** — they clip highlights. Use adaptive tonemap:
```python
def tonemap(canvas, gamma=0.75):
"""Percentile-based adaptive normalization + gamma. Replaces all brightness multipliers."""
f = canvas.astype(np.float32)
lo = np.percentile(f, 1) # black point (1st percentile)
hi = np.percentile(f, 99.5) # white point (99.5th percentile)
if hi - lo < 1: hi = lo + 1
f = (f - lo) / (hi - lo)
f = np.clip(f, 0, 1) ** gamma # gamma < 1 = brighter mids
lo, hi = np.percentile(f[::4, ::4], [1, 99.5])
if hi - lo < 10: hi = lo + 10
f = np.clip((f - lo) / (hi - lo), 0, 1) ** gamma
return (f * 255).astype(np.uint8)
```
Pipeline ordering: `scene_fn() → tonemap() → FeedbackBuffer → ShaderChain → ffmpeg`
Pipeline: `scene_fn() → tonemap() → FeedbackBuffer → ShaderChain → ffmpeg`
Per-scene gamma overrides for destructive effects:
- Default: `gamma=0.75`
- Solarize scenes: `gamma=0.55` (solarize darkens above-threshold pixels)
- Posterize scenes: `gamma=0.50` (quantization loses brightness range)
- Already-bright scenes: `gamma=0.85`
Per-scene gamma: default 0.75, solarize 0.55, posterize 0.50, bright scenes 0.85. Use `screen` blend (not `overlay`) for dark layers.
Additional brightness best practices:
- Dense animated backgrounds — never flat black, always fill the grid
- Vignette minimum clamped to 0.15 (not 0.12)
- Bloom threshold lowered to 130 (not 170) so more pixels contribute to glow
- Use `screen` blend mode (not `overlay`) when compositing dark ASCII layers — overlay squares dark values: `2 * 0.12 * 0.12 = 0.03`
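The screen-versus-overlay point can be checked numerically. The sketch below uses the standard screen and overlay formulas on float values in `[0, 1]`; the helper names `blend_screen()` and `blend_overlay()` are illustrative stand-ins and are not the `blend_canvas()` implementation from `references/composition.md`.

```python
import numpy as np


def blend_screen(a, b):
    """Screen: inverts, multiplies, inverts again. Never darker than either input."""
    return 1.0 - (1.0 - a) * (1.0 - b)


def blend_overlay(a, b):
    """Overlay: multiply-like below 0.5 (darkens darks), screen-like above."""
    return np.where(a < 0.5, 2.0 * a * b, 1.0 - 2.0 * (1.0 - a) * (1.0 - b))


# Two dim ASCII layers at 12% brightness.
dark = np.full((2, 2), 0.12, dtype=np.float32)
overlay_result = blend_overlay(dark, dark)  # about 0.029: darker than either layer
screen_result = blend_screen(dark, dark)    # about 0.226: brighter than either layer
```

For two layers at 0.12, overlay drops the result to roughly 0.03 while screen lifts it to roughly 0.23, which is why screen is the safer default for dark ASCII layers.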
### Font Cell Height
#### Font Compatibility
macOS Pillow: `textbbox()` returns wrong height. Use `font.getmetrics()`: `cell_height = ascent + descent`. See `references/troubleshooting.md`.
Not all Unicode characters render in all fonts. Validate palettes at init:
```python
for c in palette:
    img = Image.new("L", (20, 20), 0)
    ImageDraw.Draw(img).text((0, 0), c, fill=255, font=font)
    if np.array(img).max() == 0:
        log(f"WARNING: char '{c}' (U+{ord(c):04X}) not in font, removing from palette")
```
### ffmpeg Pipe Deadlock
### Step 4b: Per-Clip Architecture (for segmented videos)
Never `stderr=subprocess.PIPE` with long-running ffmpeg — buffer fills at 64KB and deadlocks. Redirect to file. See `references/troubleshooting.md`.
When the video has discrete segments (quotes, scenes, chapters), render each as a separate clip file. This enables:
- Re-rendering individual clips without touching the rest (`--clip q05`)
- Faster iteration on specific sections
- Easy reordering or trimming in post
### Font Compatibility
```python
segments = [
    {"id": "intro", "start": 0.0, "end": 5.0, "type": "intro"},
    {"id": "q00", "start": 5.0, "end": 12.0, "type": "quote", "qi": 0, ...},
    {"id": "t00", "start": 12.0, "end": 13.5, "type": "transition", ...},
    {"id": "outro", "start": 208.0, "end": 211.6, "type": "outro"},
]
Not all Unicode chars render in all fonts. Validate palettes at init — render each char, check for blank output. See `references/troubleshooting.md`.
from concurrent.futures import ProcessPoolExecutor, as_completed
with ProcessPoolExecutor(max_workers=hw["workers"]) as pool:
    futures = {pool.submit(render_clip, seg, features, path): seg["id"]
               for seg, path in clip_args}
    for fut in as_completed(futures):
        fut.result()
```
### Per-Clip Architecture
CLI: `--clip q00 t00 q01` to re-render specific clips, `--list` to show segments, `--skip-render` to re-stitch only.
For segmented videos (quotes, scenes, chapters), render each as a separate clip file for parallel rendering and selective re-rendering. See `references/scenes.md`.
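The `--clip`, `--list`, and `--skip-render` flags mentioned for per-clip rendering could be wired up with `argparse` roughly as follows. The flag names come from this document; the helper names `build_parser()` and `select_segments()` and the error message are assumptions.

```python
import argparse


def build_parser():
    """CLI for selective clip rendering (flag names taken from the text above)."""
    p = argparse.ArgumentParser(description="Render segmented ASCII video clips")
    p.add_argument("--clip", nargs="+", metavar="ID", default=None,
                   help="re-render only these clip ids, e.g. --clip q00 t00 q01")
    p.add_argument("--list", action="store_true",
                   help="print the segment table and exit")
    p.add_argument("--skip-render", action="store_true",
                   help="skip rendering and only re-stitch existing clips")
    return p


def select_segments(segments, clip_ids):
    """Return the segments to render, preserving timeline order."""
    if not clip_ids:
        return list(segments)
    known = {seg["id"] for seg in segments}
    missing = [cid for cid in clip_ids if cid not in known]
    if missing:
        raise SystemExit(f"unknown clip id(s): {', '.join(missing)}")
    wanted = set(clip_ids)
    return [seg for seg in segments if seg["id"] in wanted]
```

Filtering against the segment table, rather than iterating over the ids as typed, keeps the render order stable regardless of the order given on the command line.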
### Step 5: Render and Iterate
Performance targets per frame:
## Performance Targets
| Component | Budget |
|-----------|--------|
@@ -233,24 +191,15 @@ Performance targets per frame:
| Shader pipeline | 5-25ms |
| **Total** | ~100-200ms/frame |
**Fast iteration**: render single test frames to check brightness/layout before full render:
```python
canvas = render_single_frame(frame_index, features, renderer)
Image.fromarray(canvas).save("test.png")
```
**Brightness verification**: sample 5-10 frames across video, check `mean > 8` for ASCII content.
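The brightness check can be automated with a small sampling helper. This is a sketch under assumptions: `check_brightness()` is a hypothetical name, and `render_frame` stands for any callable that maps a frame index to a `uint8` canvas (for example a wrapper around `render_single_frame`). The threshold of 8 is the one stated above.

```python
import numpy as np


def check_brightness(render_frame, n_frames, samples=8, threshold=8.0):
    """Render evenly spaced frames and return (index, mean) for those that are too dark."""
    count = min(samples, n_frames)
    indices = np.linspace(0, n_frames - 1, num=count, dtype=int)
    dark = []
    for i in indices:
        mean = float(render_frame(int(i)).mean())
        if mean <= threshold:
            dark.append((int(i), mean))
    return dark


def fake_render(i):
    """Stand-in renderer for the demo: the first half of the video is black."""
    level = 0 if i < 50 else 40
    return np.full((4, 4, 3), level, dtype=np.uint8)


too_dark = check_brightness(fake_render, n_frames=100, samples=5)
```

An empty result means every sampled frame passed; otherwise the returned indices point at the scenes whose gamma should be lowered.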
## References
| File | Contents |
|------|----------|
| `references/architecture.md` | Grid system (landscape/portrait/square resolution presets), font selection, character palettes (library of 20+), color system (HSV + OKLAB/OKLCH + discrete RGB + color harmony generation + perceptual gradient interpolation), `_render_vf()` helper, compositing, v2 effect function contract |
| `references/inputs.md` | All input sources: audio analysis, video sampling, image conversion, text/lyrics, TTS integration (ElevenLabs, voice assignment, audio mixing) |
| `references/effects.md` | Effect building blocks: 20+ value field generators (trig, noise/fBM, domain warp, voronoi, reaction-diffusion, cellular automata, strange attractors, SDFs), 8 hue field generators, coordinate transforms (rotate/tile/polar/Möbius), temporal coherence (easing, keyframes, morphing), radial/wave/fire effects, advanced particles (flocking, flow fields, trails), composing guide |
| `references/shaders.md` | 38 shader implementations (geometry, channel, color, glow, noise, pattern, tone, glitch, mirror), `ShaderChain` class, full `_apply_shader_step()` dispatch, audio-reactive scaling, transitions, tint presets |
| `references/composition.md` | **v2 core**: pixel blend modes (20 modes with implementations), multi-grid composition, `_render_vf()` helper, adaptive `tonemap()`, per-scene gamma, `FeedbackBuffer` with spatial transforms, `PixelBlendStack`, masking/stencil system (shape masks, text stencils, animated masks, boolean ops) |
| `references/scenes.md` | **v2 scene protocol**: scene function contract (local time convention), `Renderer` class, `SCENES` table structure, `render_clip()` loop, beat-synced cutting, parallel rendering + pickling constraints, 4 complete scene examples, scene design checklist |
| `references/design-patterns.md` | **Scene composition patterns**: layer hierarchy (bg/content/accent), directional parameter arcs vs oscillation, scene concepts and visual metaphors, counter-rotating dual systems, wave collision, progressive fragmentation, entropy/consumption, staggered layer entry (crescendo), scene ordering |
| `references/troubleshooting.md` | NumPy broadcasting traps, blend mode pitfalls, multiprocessing/pickling issues, brightness diagnostics, ffmpeg deadlocks, font issues, performance bottlenecks, common mistakes |
| `references/optimization.md` | Hardware detection, adaptive quality profiles (draft/preview/production/max), CLI integration, vectorized effect patterns, parallel rendering, memory management |
| `references/architecture.md` | Grid system, resolution presets, font selection, character palettes (20+), color system (HSV + OKLAB + discrete RGB + harmony generation), `_render_vf()` helper, GridLayer class |
| `references/composition.md` | Pixel blend modes (20 modes), `blend_canvas()`, multi-grid composition, adaptive `tonemap()`, `FeedbackBuffer`, `PixelBlendStack`, masking/stencil system |
| `references/effects.md` | Effect building blocks: value field generators, hue fields, noise/fBM/domain warp, voronoi, reaction-diffusion, cellular automata, SDFs, strange attractors, particle systems, coordinate transforms, temporal coherence |
| `references/shaders.md` | `ShaderChain`, `_apply_shader_step()` dispatch, 38 shader catalog, audio-reactive scaling, transitions, tint presets, output format encoding, terminal rendering |
| `references/scenes.md` | Scene protocol, `Renderer` class, `SCENES` table, `render_clip()`, beat-synced cutting, parallel rendering, design patterns (layer hierarchy, directional arcs, visual metaphors, compositional techniques), complete scene examples at every complexity level, scene design checklist |
| `references/inputs.md` | Audio analysis (FFT, bands, beats), video sampling, image conversion, text/lyrics, TTS integration (ElevenLabs, voice assignment, audio mixing) |
| `references/optimization.md` | Hardware detection, quality profiles, vectorized patterns, parallel rendering, memory management, performance budgets |
| `references/troubleshooting.md` | NumPy broadcasting traps, blend mode pitfalls, multiprocessing/pickling, brightness diagnostics, ffmpeg issues, font problems, common mistakes |
@@ -1,14 +1,6 @@
# Architecture Reference
**Cross-references:**
- Effect building blocks (value fields, noise, SDFs, particles): `effects.md`
- `_render_vf()`, blend modes, tonemap, masking: `composition.md`
- Scene protocol, render_clip, SCENES table: `scenes.md`
- Shader pipeline, feedback buffer, output encoding: `shaders.md`
- Complete scene examples: `examples.md`
- Input sources (audio analysis, video, TTS): `inputs.md`
- Performance tuning, hardware detection: `optimization.md`
- Common bugs (broadcasting, font, encoding): `troubleshooting.md`
> **See also:** composition.md · effects.md · scenes.md · shaders.md · inputs.md · optimization.md · troubleshooting.md
## Grid System
@@ -2,13 +2,7 @@
The composable system is the core of visual complexity. It operates at three levels: pixel-level blend modes, multi-grid composition, and adaptive brightness management. This document covers all three, plus the masking/stencil system for spatial control.
**Cross-references:**
- Grid system, palettes, color (HSV + OKLAB): `architecture.md`
- Effect building blocks (value fields, hue fields, particles): `effects.md`
- Scene protocol, render_clip, SCENES table: `scenes.md`
- Shader pipeline, feedback buffer: `shaders.md`
- Complete scene examples with blend/mask usage: `examples.md`
- Blend mode pitfalls (overlay crush, division by zero): `troubleshooting.md`
> **See also:** architecture.md · effects.md · scenes.md · shaders.md · troubleshooting.md
## Pixel-Level Blend Modes
@@ -1,193 +0,0 @@
# Scene Design Patterns
**Cross-references:**
- Scene protocol, SCENES table: `scenes.md`
- Blend modes, multi-grid composition, tonemap: `composition.md`
- Effect building blocks (value fields, noise, SDFs): `effects.md`
- Shader pipeline, feedback buffer: `shaders.md`
- Complete scene examples: `examples.md`
Higher-order patterns for composing scenes that feel intentional rather than random. These patterns use the existing building blocks (value fields, blend modes, shaders, feedback) but organize them with compositional intent.
## Layer Hierarchy
Every scene should have clear visual layers with distinct roles:
| Layer | Grid | Brightness | Purpose |
|-------|------|-----------|---------|
| **Background** | xs or sm (dense) | 0.1–0.25 | Atmosphere, texture. Never competes with content. |
| **Content** | md (balanced) | 0.4–0.8 | The main visual idea. Carries the scene's concept. |
| **Accent** | lg or sm (sparse) | 0.5–1.0 (sparse coverage) | Highlights, punctuation, sparse bright points. |
The background sets mood. The content layer is what the scene *is about*. The accent adds visual interest without overwhelming.
```python
def fx_example(r, f, t, S):
    local = t
    progress = min(local / 5.0, 1.0)
    g_bg = r.get_grid("sm")
    g_main = r.get_grid("md")
    g_accent = r.get_grid("lg")
    # --- Background: dim atmosphere ---
    bg_val = vf_smooth_noise(g_bg, f, t * 0.3, S, octaves=2, bri=0.15)
    # ... render bg to canvas
    # --- Content: the main visual idea ---
    content_val = vf_spiral(g_main, f, t, S, n_arms=n_arms, tightness=tightness)
    # ... render content on top of canvas
    # --- Accent: sparse highlights ---
    accent_val = vf_noise_static(g_accent, f, t, S, density=0.05)
    # ... render accent on top
    return canvas
```
## Directional Parameter Arcs
Parameters should *go somewhere* over the scene's duration — not oscillate aimlessly with `sin(t * N)`.
**Bad:** `twist = 3.0 + 2.0 * math.sin(t * 0.6)` — wobbles back and forth, feels aimless.
**Good:** `twist = 2.0 + progress * 5.0` — starts gentle, ends intense. The scene *builds*.
Use `progress = min(local / duration, 1.0)` (0→1 over the scene) to drive directional change:
| Pattern | Formula | Feel |
|---------|---------|------|
| Linear ramp | `progress * range` | Steady buildup |
| Ease-out | `1 - (1 - progress) ** 2` | Fast start, gentle finish |
| Ease-in | `progress ** 2` | Slow start, accelerating |
| Step reveal | `np.clip((progress - 0.5) / 0.25, 0, 1)` | Nothing until 50%, then fades in |
| Build + plateau | `min(1.0, progress * 1.5)` | Reaches full at 67%, holds |
Oscillation is fine for *secondary* parameters (saturation shimmer, hue drift). But the *defining* parameter of the scene should have a direction.
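The arc formulas in the table translate directly into small helpers. The function names below are illustrative; the formulas are the ones listed above.

```python
def scene_progress(local_t, duration):
    """Map scene-local time to a 0..1 progress value, clamped at both ends."""
    return min(max(local_t / duration, 0.0), 1.0)


def ease_out(p):
    """Fast start, gentle finish."""
    return 1 - (1 - p) ** 2


def ease_in(p):
    """Slow start, accelerating."""
    return p ** 2


def step_reveal(p, start=0.5, width=0.25):
    """Zero until `start`, then fades in over `width` of the scene."""
    return min(max((p - start) / width, 0.0), 1.0)


def build_plateau(p, rate=1.5):
    """Ramps up at `rate`, then holds at full strength."""
    return min(1.0, p * rate)


# Directional arc for a defining parameter: twist grows from 2.0 to 7.0 over 5 seconds.
twist_at_start = 2.0 + scene_progress(0.0, 5.0) * 5.0
twist_at_end = 2.0 + scene_progress(5.0, 5.0) * 5.0
```

Because each helper takes and returns a value in `[0, 1]`, they compose: `ease_out(step_reveal(p))` gives a late entry that decelerates as it arrives.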
### Examples of Directional Arcs
| Scene concept | Parameter | Arc |
|--------------|-----------|-----|
| Emergence | Ring radius | 0 → max (ease-out) |
| Shatter | Voronoi cell count | 8 → 38 (linear) |
| Descent | Tunnel speed | 2.0 → 10.0 (linear) |
| Mandala | Shape complexity | ring → +polygon → +star → +rosette (step reveals) |
| Crescendo | Layer count | 1 → 7 (staggered entry) |
| Entropy | Geometry visibility | 1.0 → 0.0 (consumed) |
## Scene Concepts
Each scene should be built around a *visual idea*, not an effect name.
**Bad:** "fx_plasma_cascade" — named after the effect. No concept.
**Good:** "fx_emergence" — a point of light expands into a field. The name tells you *what happens*.
Good scene concepts have:
1. A **visual metaphor** (emergence, descent, collision, entropy)
2. A **directional arc** (things change from A to B, not oscillate)
3. **Motivated layer choices** (each layer serves the concept)
4. **Motivated feedback** (transform direction matches the metaphor)
| Concept | Metaphor | Feedback transform | Why |
|---------|----------|-------------------|-----|
| Emergence | Birth, expansion | zoom-out | Past frames expand outward |
| Descent | Falling, acceleration | zoom-in | Past frames rush toward center |
| Inferno | Rising fire | shift-up | Past frames rise with the flames |
| Entropy | Decay, dissolution | none | Clean, no persistence — things disappear |
| Crescendo | Accumulation | zoom + hue_shift | Everything compounds and shifts |
## Compositional Techniques
### Counter-Rotating Dual Systems
Two instances of the same effect rotating in opposite directions create visual interference:
```python
# Primary spiral (clockwise)
s1_val = vf_spiral(g_main, f, t * 1.5, S, n_arms=n_arms_1, tightness=tightness_1)
# Counter-rotating spiral (counter-clockwise via negative time)
s2_val = vf_spiral(g_accent, f, -t * 1.2, S, n_arms=n_arms_2, tightness=tightness_2)
# Screen blend creates bright interference at crossing points
canvas = blend_canvas(canvas_with_s1, c2, "screen", 0.7)
```
Works with spirals, vortexes, rings. The counter-rotation creates constantly shifting interference patterns.
### Wave Collision
Two wave fronts converging from opposite sides, meeting at a collision point:
```python
collision_phase = abs(progress - 0.5) * 2 # 1→0→1 (0 at collision)
# Wave A approaches from left
offset_a = (1 - progress) * g.cols * 0.4
wave_a = np.sin((g.cc + offset_a) * 0.08 + t * 2) * 0.5 + 0.5
# Wave B approaches from right
offset_b = -(1 - progress) * g.cols * 0.4
wave_b = np.sin((g.cc + offset_b) * 0.08 - t * 2) * 0.5 + 0.5
# Interference peaks at collision
combined = wave_a * 0.5 + wave_b * 0.5 + np.abs(wave_a - wave_b) * (1 - collision_phase) * 0.5
```
### Progressive Fragmentation
Voronoi with cell count increasing over time — visual shattering:
```python
n_pts = int(8 + progress * 30) # 8 cells → 38 cells
# Pre-generate enough points, slice to n_pts
px = base_x[:n_pts] + np.sin(t * 0.3 + np.arange(n_pts) * 0.7) * (3 + progress * 3)
```
The edge glow width can also increase with progress to emphasize the cracks.
### Entropy / Consumption
A clean geometric pattern being overtaken by an organic process:
```python
# Geometry fades out
geo_val = clean_pattern * max(0.05, 1.0 - progress * 0.9)
# Organic process grows in
rd_val = vf_reaction_diffusion(g, f, t, S) * min(1.0, progress * 1.5)
# Render geometry first, organic on top — organic consumes geometry
```
### Staggered Layer Entry (Crescendo)
Layers enter one at a time, building to overwhelming density:
```python
def layer_strength(enter_t, ramp=1.5):
    """0.0 until enter_t, ramps to 1.0 over ramp seconds."""
    return max(0.0, min(1.0, (local - enter_t) / ramp))
# Layer 1: always present
s1 = layer_strength(0.0)
# Layer 2: enters at 2s
s2 = layer_strength(2.0)
# Layer 3: enters at 4s
s3 = layer_strength(4.0)
# ... etc
# Each layer uses a different effect, grid, palette, and blend mode
# Screen blend between layers so they accumulate light
```
For a 15-second crescendo, 7 layers entering every 2 seconds works well. Use different blend modes (screen for most, add for energy, colordodge for the final wash).
## Scene Ordering
For a multi-scene reel or video:
- **Vary mood between adjacent scenes** — don't put two calm scenes next to each other
- **Randomize order** rather than grouping by type — prevents "effect demo" feel
- **End on the strongest scene** — crescendo or something with a clear payoff
- **Open with energy** — grab attention in the first 2 seconds
+39 -143
@@ -2,13 +2,7 @@
Effect building blocks that produce visual patterns. In v2, these are used **inside scene functions** that return a pixel canvas directly. The building blocks below operate on grid coordinate arrays and produce `(chars, colors)` or value/hue fields that the scene function renders to canvas via `_render_vf()`.
**Cross-references:**
- Grid system, palettes, color: `architecture.md`
- `_render_vf()`, blend modes, tonemap, masking: `composition.md`
- Scene protocol, render_clip, SCENES table: `scenes.md`
- Shader pipeline, feedback buffer: `shaders.md`
- Complete scene examples using these effects: `examples.md`
- Common bugs (broadcasting, clipping): `troubleshooting.md`
> **See also:** architecture.md · composition.md · scenes.md · shaders.md · troubleshooting.md
## Design Philosophy
@@ -109,142 +103,7 @@ def bg_cellular(g, f, t, n_centers=12, hue=0.5, bri=0.6, pal=PAL_BLOCKS):
---
## Radial Effects
### Concentric Rings
Bass/sub-driven pulsing rings from center. Scale ring count and thickness with bass energy.
```python
def eff_rings(g, f, t, hue=0.5, n_base=6, pal=PAL_DEFAULT):
    n_rings = int(n_base + f["sub_r"] * 25 + f["bass"] * 10)
    spacing = 2 + f["bass_r"] * 7 + f["rms"] * 3
    ring_cv = np.zeros((g.rows, g.cols), dtype=np.float32)
    for ri in range(n_rings):
        rad = (ri+1) * spacing + f["bdecay"] * 15
        wobble = f["mid_r"]*5*np.sin(g.angle*3 + t*4) + f["hi_r"]*3*np.sin(g.angle*7 - t*6)
        rd = np.abs(g.dist - rad - wobble)
        th = 1 + f["sub"] * 3
        ring_cv = np.maximum(ring_cv, np.clip((1 - rd/th) * (0.4 + f["bass"]*0.8), 0, 1))
    # Color by angle + distance for rainbow rings
    h = g.angle/(2*np.pi) + g.dist*0.005 + f["sub_r"]*0.2
    return ring_cv, h
```
### Radial Rays
```python
def eff_rays(g, f, t, n_base=8, hue=0.5):
    n_rays = int(n_base + f["hi_r"] * 25)
    ray = np.clip(np.cos(g.angle*n_rays + t*3) * f["bdecay"]*0.6 * (1-g.dist_n), 0, 0.7)
    return ray
```
### Spiral Arms (Logarithmic)
```python
def eff_spiral(g, f, t, n_arms=3, tightness=2.5, hue=0.5):
    arm_cv = np.zeros((g.rows, g.cols), dtype=np.float32)
    for ai in range(n_arms):
        offset = ai * 2*np.pi / n_arms
        log_r = np.log(g.dist + 1) * tightness
        arm_phase = g.angle + offset - log_r + t * 0.8
        arm_val = np.clip(np.cos(arm_phase * n_arms) * 0.6 + 0.2, 0, 1)
        arm_val *= (0.4 + f["rms"]*0.6) * np.clip(1 - g.dist_n*0.5, 0.2, 1)
        arm_cv = np.maximum(arm_cv, arm_val)
    return arm_cv
```
### Center Glow / Pulse
```python
def eff_glow(g, f, t, intensity=0.6, spread=2.0):
    return np.clip(intensity * np.exp(-g.dist_n * spread) * (0.5 + f["rms"]*2 + np.sin(t*1.2)*0.2), 0, 0.9)
```
### Tunnel / Depth
```python
def eff_tunnel(g, f, t, speed=3.0, complexity=6):
    tunnel_d = 1.0 / (g.dist_n + 0.1)
    v1 = np.sin(tunnel_d*2 - t*speed) * 0.45 + 0.55
    v2 = np.sin(g.angle*complexity + tunnel_d*1.5 - t*2) * 0.35 + 0.55
    return v1 * 0.5 + v2 * 0.5
```
### Vortex (Rotating Distortion)
```python
def eff_vortex(g, f, t, twist=3.0, pulse=True):
    """Twisting radial pattern -- distance modulates angle."""
    twisted = g.angle + g.dist_n * twist * np.sin(t * 0.5)
    val = np.sin(twisted * 4 - t * 2) * 0.5 + 0.5
    if pulse:
        val *= 0.5 + f.get("bass", 0.3) * 0.8
    return np.clip(val, 0, 1)
```
---
## Wave Effects
### Multi-Band Frequency Waves
Each frequency band draws its own wave at different spatial/temporal frequencies:
```python
def eff_freq_waves(g, f, t, bands=None):
    if bands is None:
        bands = [("sub",0.06,1.2,0.0), ("bass",0.10,2.0,0.08), ("lomid",0.15,3.0,0.16),
                 ("mid",0.22,4.5,0.25), ("himid",0.32,6.5,0.4), ("hi",0.45,8.5,0.55)]
    mid = g.rows / 2.0
    composite = np.zeros((g.rows, g.cols), dtype=np.float32)
    for band_key, sf, tf, hue_base in bands:
        amp = f.get(band_key, 0.3) * g.rows * 0.4
        y_wave = mid - np.sin(g.cc*sf + t*tf) * amp
        y_wave += np.sin(g.cc*sf*2.3 + t*tf*1.7) * amp * 0.2  # harmonic
        dist = np.abs(g.rr - y_wave)
        thickness = 2 + f.get(band_key, 0.3) * 5
        intensity = np.clip((1 - dist/thickness) * f.get(band_key, 0.3) * 1.5, 0, 1)
        composite = np.maximum(composite, intensity)
    return composite
```
### Interference Pattern
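Overlapping sine waves at spread orientations create moiré-like patterns. The default is 5 waves, and the loop is capped at the five listed feature drivers: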
```python
def eff_interference(g, f, t, n_waves=5):
    """Parametric interference -- vary n_waves for complexity."""
    # Each wave has different orientation, frequency, and feature driver
    drivers = ["mid_r", "himid_r", "bass_r", "lomid_r", "hi_r"]
    vals = np.zeros((g.rows, g.cols), dtype=np.float32)
    for i in range(min(n_waves, len(drivers))):
        angle = i * np.pi / n_waves  # spread orientations
        freq = 0.06 + i * 0.03
        sp = 0.5 + i * 0.3
        proj = g.cc * np.cos(angle) + g.rr * np.sin(angle)
        vals += np.sin(proj * freq + t * sp) * f.get(drivers[i], 0.3) * 2.5
    return np.clip(vals * 0.12 + 0.45, 0.1, 1)
```
### Aurora / Horizontal Bands
```python
def eff_aurora(g, f, t, hue=0.4, n_bands=3):
    val = np.zeros((g.rows, g.cols), dtype=np.float32)
    for i in range(n_bands):
        freq_r = 0.08 + i * 0.04
        freq_c = 0.012 + i * 0.008
        sp_r = 0.7 + i * 0.3
        sp_c = 0.18 + i * 0.12
        val += np.sin(g.rr*freq_r + t*sp_r) * np.sin(g.cc*freq_c + t*sp_c) * (0.6 / n_bands)
    return np.clip(val * (f.get("lomid_r", 0.3)*3 + 0.2), 0, 0.7)
```
### Ripple (Point-Source Waves)
```python
def eff_ripple(g, f, t, sources=None, freq=0.3, damping=0.02):
    """Concentric ripples from point sources. Sources = [(row_frac, col_frac), ...]"""
    if sources is None:
        sources = [(0.5, 0.5)]  # center
    val = np.zeros((g.rows, g.cols), dtype=np.float32)
    for ry, rx in sources:
        dy = g.rr - g.rows * ry
        dx = g.cc - g.cols * rx
        d = np.sqrt(dy**2 + dx**2)
        val += np.sin(d * freq - t * 4) * np.exp(-d * damping) * 0.5
    return np.clip(val + 0.5, 0, 1)
```
> **Note:** The v1 `eff_rings`, `eff_rays`, `eff_spiral`, `eff_glow`, `eff_tunnel`, `eff_vortex`, `eff_freq_waves`, `eff_interference`, `eff_aurora`, and `eff_ripple` functions are superseded by the `vf_*` value field generators below (used via `_render_vf()`). The `vf_*` versions integrate with the multi-grid composition pipeline and are preferred for all new scenes.
---
@@ -1967,3 +1826,40 @@ def scene_complex(r, f, t, S):
```
Vary the **value field combo**, **hue field**, **palette**, **blend modes**, **feedback config**, and **shader chain** per section for maximum visual variety. With 12 value fields × 8 hue fields × 14 palettes × 20 blend modes × 7 feedback transforms × 38 shaders, there are over 7 million single-choice combinations before any parameter tuning or layering.
---
## Combining Effects — Creative Guide
The catalog above is vocabulary. Here's how to compose it into something that looks intentional.
### Layering for Depth
Every scene should have at least two layers at different grid densities:
- **Background** (sm or xs): dense, dim texture that prevents flat black. fBM, smooth noise, or domain warp at low brightness (bri=0.15-0.25).
- **Content** (md): the main visual — rings, voronoi, spirals, tunnel. Full brightness.
- **Accent** (lg or xl): sparse highlights — particles, text stencil, glow pulse. Screen-blended on top.
### Interesting Effect Pairs
| Pair | Blend | Why it works |
|------|-------|-------------|
| fBM + voronoi edges | `screen` | Organic fills the cells, edges add structure |
| Domain warp + plasma | `difference` | Psychedelic organic interference |
| Tunnel + vortex | `screen` | Depth perspective + rotational energy |
| Spiral + interference | `exclusion` | Moire patterns from different spatial frequencies |
| Reaction-diffusion + fire | `add` | Living organic base + dynamic foreground |
| SDF geometry + domain warp | `screen` | Clean shapes floating in organic texture |
### Effects as Masks
Any value field can be used as a mask for another effect via `mask_from_vf()`:
- Voronoi cells masking fire (fire visible only inside cells)
- fBM masking a solid color layer (organic color clouds)
- SDF shapes masking a reaction-diffusion field
- Animated iris/wipe revealing one effect over another
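A rough illustration of the masking idea using plain NumPy. This is not the `mask_from_vf()` helper itself, whose signature is not shown here; `soft_mask()` and `masked_composite()` are hypothetical names, and the sketch assumes the value field has already been resampled to pixel resolution.

```python
import numpy as np


def soft_mask(val, threshold=0.5, softness=0.1):
    """Turn a 0..1 value field into a mask with a soft edge above `threshold`."""
    return np.clip((val - threshold) / max(softness, 1e-6), 0.0, 1.0)


def masked_composite(fg, bg, mask):
    """Show `fg` where the mask is 1 and `bg` where it is 0 (uint8 canvases)."""
    m = mask[..., None].astype(np.float32)  # broadcast over the RGB axis
    out = fg.astype(np.float32) * m + bg.astype(np.float32) * (1.0 - m)
    return out.astype(np.uint8)


# Demo: a horizontal gradient reveals a bright layer on the right side only.
val = np.tile(np.linspace(0.0, 1.0, 8, dtype=np.float32), (4, 1))
fg = np.full((4, 8, 3), 200, dtype=np.uint8)  # e.g. the fire layer
bg = np.full((4, 8, 3), 20, dtype=np.uint8)   # e.g. the dim background
out = masked_composite(fg, bg, soft_mask(val))
```

Animating `threshold` over time turns the same two functions into a wipe or dissolve, since the revealed region grows as the threshold drops.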
### Inventing New Effects
For every project, create at least one effect that isn't in the catalog:
- **Combine two vf_* functions** with math: `np.clip(vf_fbm(...) * vf_rings(...), 0, 1)`
- **Apply coordinate transforms** before evaluation: `vf_plasma(twisted_grid, ...)`
- **Use one field to modulate another's parameters**: `vf_spiral(..., tightness=2 + vf_fbm(...) * 5)`
- **Stack time offsets**: render the same field at `t` and `t - 0.5`, difference-blend for motion trails
- **Mirror a value field** through an SDF boundary for kaleidoscopic geometry
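Two of the techniques above, multiplying fields and using one field to drive another's parameter, can be sketched with stand-in generators. The functions `field_rings()` and `field_waves()` are simplified placeholders written for this example, not the `vf_*` functions from the catalog, and they take raw coordinate arrays rather than the grid object.

```python
import numpy as np


def make_coords(rows, cols):
    """Row/column coordinate arrays plus distance from the grid center."""
    yy, xx = np.mgrid[0:rows, 0:cols].astype(np.float32)
    dist = np.hypot(yy - rows / 2, xx - cols / 2)
    return yy, xx, dist


def field_rings(dist, t, spacing=4.0):
    """Stand-in ring field in 0..1; `spacing` may be a scalar or a per-cell array."""
    return np.sin(dist / spacing - t * 2.0) * 0.5 + 0.5


def field_waves(yy, xx, t):
    """Stand-in smooth field in 0..1, playing the role of a noise source."""
    return np.sin(xx * 0.15 + t) * np.sin(yy * 0.11 - t * 0.7) * 0.5 + 0.5


def combined_fields(rows, cols, t):
    """Return (product field, parameter-modulated field) for one frame."""
    yy, xx, dist = make_coords(rows, cols)
    waves = field_waves(yy, xx, t)
    # Technique 1: multiply two fields so rings only show where waves are bright.
    product = np.clip(field_rings(dist, t) * waves, 0.0, 1.0)
    # Technique 2: let one field set another's parameter per cell.
    modulated = field_rings(dist, t, spacing=3.0 + waves * 5.0)
    return product, modulated
```

Because NumPy broadcasts the per-cell `spacing` array, parameter modulation costs no more than the scalar version, which keeps invented effects within the per-frame budget.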
@@ -1,416 +0,0 @@
# Scene Examples
**Cross-references:**
- Grid system, palettes, color (HSV + OKLAB): `architecture.md`
- Effect building blocks (value fields, noise, SDFs, particles): `effects.md`
- `_render_vf()`, blend modes, tonemap, masking: `composition.md`
- Scene protocol, render_clip, SCENES table: `scenes.md`
- Shader pipeline, feedback buffer, ShaderChain: `shaders.md`
- Input sources (audio features, video features): `inputs.md`
- Performance tuning: `optimization.md`
- Common bugs: `troubleshooting.md`
Copy-paste-ready scene functions at increasing complexity. Each is a complete, working v2 scene function that returns a pixel canvas. See `scenes.md` for the scene protocol and `composition.md` for blend modes and tonemap.
---
## Minimal — Single Grid, Single Effect
### Breathing Plasma
One grid, one value field, one hue field. The simplest possible scene.
```python
def fx_breathing_plasma(r, f, t, S):
    """Plasma field with time-cycling hue. Audio modulates brightness."""
    canvas = _render_vf(r, "md",
        lambda g, f, t, S: vf_plasma(g, f, t, S) * 1.3,
        hf_time_cycle(0.08), PAL_DENSE, f, t, S, sat=0.8)
    return canvas
```
### Reaction-Diffusion Coral
Single grid, simulation-based field. Evolves organically over time.
```python
def fx_coral(r, f, t, S):
    """Gray-Scott reaction-diffusion: coral branching pattern.
    Slow-evolving, organic. Best for ambient/chill sections."""
    canvas = _render_vf(r, "sm",
        lambda g, f, t, S: vf_reaction_diffusion(g, f, t, S,
            feed=0.037, kill=0.060, steps_per_frame=6, init_mode="center"),
        hf_distance(0.55, 0.015), PAL_DOTS, f, t, S, sat=0.7)
    return canvas
```
### SDF Geometry
Geometric shapes from SDFs. Clean, precise, graphic.
```python
def fx_sdf_rings(r, f, t, S):
    """Concentric SDF rings with smooth pulsing."""
    def val_fn(g, f, t, S):
        d1 = sdf_ring(g, radius=0.15 + f.get("bass", 0.3) * 0.05, thickness=0.015)
        d2 = sdf_ring(g, radius=0.25 + f.get("mid", 0.3) * 0.05, thickness=0.012)
        d3 = sdf_ring(g, radius=0.35 + f.get("hi", 0.3) * 0.04, thickness=0.010)
        combined = sdf_smooth_union(sdf_smooth_union(d1, d2, 0.05), d3, 0.05)
        return sdf_glow(combined, falloff=0.08) * (0.5 + f.get("rms", 0.3) * 0.8)
    canvas = _render_vf(r, "md", val_fn, hf_angle(0.0), PAL_STARS, f, t, S, sat=0.85)
    return canvas
```
---
## Standard — Two Grids + Blend
### Tunnel Through Noise
Two grids at different densities, screen blended. The fine noise texture shows through the coarser tunnel characters.
```python
def fx_tunnel_noise(r, f, t, S):
    """Tunnel depth on md grid + fBM noise on sm grid, screen blended."""
    canvas_a = _render_vf(r, "md",
        lambda g, f, t, S: vf_tunnel(g, f, t, S, speed=4.0, complexity=8) * 1.2,
        hf_distance(0.5, 0.02), PAL_BLOCKS, f, t, S, sat=0.7)
    canvas_b = _render_vf(r, "sm",
        lambda g, f, t, S: vf_fbm(g, f, t, S, octaves=4, freq=0.05, speed=0.15) * 1.3,
        hf_time_cycle(0.06), PAL_RUNE, f, t, S, sat=0.6)
    return blend_canvas(canvas_a, canvas_b, "screen", 0.7)
```
### Voronoi Cells + Spiral Overlay
Voronoi cell edges with a spiral arm pattern overlaid.
```python
def fx_voronoi_spiral(r, f, t, S):
    """Voronoi edge detection on md + logarithmic spiral on lg."""
    canvas_a = _render_vf(r, "md",
        lambda g, f, t, S: vf_voronoi(g, f, t, S,
            n_cells=15, mode="edge", edge_width=2.0, speed=0.4),
        hf_angle(0.2), PAL_CIRCUIT, f, t, S, sat=0.75)
    canvas_b = _render_vf(r, "lg",
        lambda g, f, t, S: vf_spiral(g, f, t, S, n_arms=4, tightness=3.0) * 1.2,
        hf_distance(0.1, 0.03), PAL_BLOCKS, f, t, S, sat=0.9)
    return blend_canvas(canvas_a, canvas_b, "exclusion", 0.6)
```
### Domain-Warped fBM
Two layers of the same fBM, one domain-warped, difference-blended for psychedelic organic texture.
```python
def fx_organic_warp(r, f, t, S):
    """Clean fBM vs domain-warped fBM, difference blended."""
    canvas_a = _render_vf(r, "sm",
        lambda g, f, t, S: vf_fbm(g, f, t, S, octaves=5, freq=0.04, speed=0.1),
        hf_plasma(0.2), PAL_DENSE, f, t, S, sat=0.6)
    canvas_b = _render_vf(r, "md",
        lambda g, f, t, S: vf_domain_warp(g, f, t, S,
            warp_strength=20.0, freq=0.05, speed=0.15),
        hf_time_cycle(0.05), PAL_BRAILLE, f, t, S, sat=0.7)
    return blend_canvas(canvas_a, canvas_b, "difference", 0.7)
```
---
## Complex — Three Grids + Conditional + Feedback
### Psychedelic Cathedral
Three-grid composition with beat-triggered kaleidoscope and feedback zoom tunnel. The most visually complex pattern.
```python
def fx_cathedral(r, f, t, S):
    """Three-layer cathedral: interference + rings + noise, kaleidoscope on beat,
    feedback zoom tunnel."""
    # Layer 1: interference pattern on sm grid
    canvas_a = _render_vf(r, "sm",
        lambda g, f, t, S: vf_interference(g, f, t, S, n_waves=7) * 1.3,
        hf_angle(0.0), PAL_MATH, f, t, S, sat=0.8)
    # Layer 2: pulsing rings on md grid
    canvas_b = _render_vf(r, "md",
        lambda g, f, t, S: vf_rings(g, f, t, S, n_base=10, spacing_base=3) * 1.4,
        hf_distance(0.3, 0.02), PAL_STARS, f, t, S, sat=0.9)
    # Layer 3: temporal noise on lg grid (slow morph)
    canvas_c = _render_vf(r, "lg",
        lambda g, f, t, S: vf_temporal_noise(g, f, t, S,
            freq=0.04, t_freq=0.2, octaves=3),
        hf_time_cycle(0.12), PAL_BLOCKS, f, t, S, sat=0.7)
    # Blend: A screen B, then difference with C
    result = blend_canvas(canvas_a, canvas_b, "screen", 0.8)
    result = blend_canvas(result, canvas_c, "difference", 0.5)
    # Beat-triggered kaleidoscope
    if f.get("bdecay", 0) > 0.3:
        folds = 6 if f.get("sub_r", 0.3) > 0.4 else 8
        result = sh_kaleidoscope(result.copy(), folds=folds)
    return result
# Scene table entry with feedback:
# {"start": 30.0, "end": 50.0, "name": "cathedral", "fx": fx_cathedral,
#  "gamma": 0.65, "shaders": [("bloom", {"thr": 110}), ("chromatic", {"amt": 4}),
#                             ("vignette", {"s": 0.2}), ("grain", {"amt": 8})],
#  "feedback": {"decay": 0.75, "blend": "screen", "opacity": 0.35,
#               "transform": "zoom", "transform_amt": 0.012, "hue_shift": 0.015}}
```
### Masked Reaction-Diffusion with Attractor Overlay
Reaction-diffusion visible only through an animated iris mask, with a strange attractor density field underneath.
```python
def fx_masked_life(r, f, t, S):
    """Attractor base + reaction-diffusion visible through iris mask + particles."""
    g_sm = r.get_grid("sm")
    g_md = r.get_grid("md")
    # Layer 1: strange attractor density field (background)
    canvas_bg = _render_vf(r, "sm",
        lambda g, f, t, S: vf_strange_attractor(g, f, t, S,
            attractor="clifford", n_points=30000),
        hf_time_cycle(0.04), PAL_DOTS, f, t, S, sat=0.5)
    # Layer 2: reaction-diffusion (foreground, will be masked)
    canvas_rd = _render_vf(r, "md",
        lambda g, f, t, S: vf_reaction_diffusion(g, f, t, S,
            feed=0.046, kill=0.063, steps_per_frame=4, init_mode="ring"),
        hf_angle(0.15), PAL_HALFFILL, f, t, S, sat=0.85)
    # Animated iris mask: opens over first 5 seconds of scene
    scene_start = S.get("_scene_start", t)
    if "_scene_start" not in S:
        S["_scene_start"] = t
    mask = mask_iris(g_md, t, scene_start, scene_start + 5.0,
        max_radius=0.6)
    canvas_rd = apply_mask_canvas(canvas_rd, mask, bg_canvas=canvas_bg)
    # Layer 3: flow-field particles following the R-D gradient
    rd_field = vf_reaction_diffusion(g_sm, f, t, S,
        feed=0.046, kill=0.063, steps_per_frame=0)  # read without stepping
    ch_p, co_p = update_flow_particles(S, g_sm, f, rd_field,
        n=300, speed=0.8, char_set=list("·•◦∘°"))
    canvas_p = g_sm.render(ch_p, co_p)
    result = blend_canvas(canvas_rd, canvas_p, "add", 0.7)
    return result
```
### Morphing Field Sequence with Eased Keyframes
Demonstrates temporal coherence: smooth morphing between effects with keyframed parameters.
```python
def fx_morphing_journey(r, f, t, S):
    """Morphs through 4 value fields over 20 seconds with eased transitions.
    The twist parameter is also keyframed."""
    # Keyframed twist parameter
    twist = keyframe(t, [(0, 1.0), (5, 5.0), (10, 2.0), (15, 8.0), (20, 1.0)],
        ease_fn=ease_in_out_cubic, loop=True)
    # Sequence of value fields with 2s crossfade
    fields = [
        lambda g, f, t, S: vf_plasma(g, f, t, S),
        lambda g, f, t, S: vf_vortex(g, f, t, S, twist=twist),
        lambda g, f, t, S: vf_fbm(g, f, t, S, octaves=5, freq=0.04),
        lambda g, f, t, S: vf_domain_warp(g, f, t, S, warp_strength=15),
    ]
    durations = [5.0, 5.0, 5.0, 5.0]
    val_fn = lambda g, f, t, S: vf_sequence(g, f, t, S, fields, durations,
        crossfade=2.0)
    # Render with slowly rotating hue
    canvas = _render_vf(r, "md", val_fn, hf_time_cycle(0.06),
        PAL_DENSE, f, t, S, sat=0.8)
    # Second layer: tiled version of same sequence at smaller grid
    tiled_fn = lambda g, f, t, S: vf_sequence(
        make_tgrid(g, *uv_tile(g, 3, 3, mirror=True)),
        f, t, S, fields, durations, crossfade=2.0)
    canvas_b = _render_vf(r, "sm", tiled_fn, hf_angle(0.1),
        PAL_RUNE, f, t, S, sat=0.6)
    return blend_canvas(canvas, canvas_b, "screen", 0.5)
```
---
## Specialized — Unique State Patterns
### Game of Life with Ghost Trails
Cellular automaton with analog fade trails. Beat injects random cells.
```python
def fx_life(r, f, t, S):
    """Conway's Game of Life with fading ghost trails.
    Beat events inject random live cells for disruption."""
    canvas = _render_vf(r, "sm",
        lambda g, f, t, S: vf_game_of_life(g, f, t, S,
            rule="life", steps_per_frame=1, fade=0.92, density=0.25),
        hf_fixed(0.33), PAL_BLOCKS, f, t, S, sat=0.8)
    # Overlay: coral automaton on lg grid for chunky texture
    canvas_b = _render_vf(r, "lg",
        lambda g, f, t, S: vf_game_of_life(g, f, t, S,
            rule="coral", steps_per_frame=1, fade=0.85, density=0.15, seed=99),
        hf_time_cycle(0.1), PAL_HATCH, f, t, S, sat=0.6)
    return blend_canvas(canvas, canvas_b, "screen", 0.5)
```
### Boids Flock Over Voronoi
Emergent swarm movement over a cellular background.
```python
def fx_boid_swarm(r, f, t, S):
    """Flocking boids over animated voronoi cells."""
    # Background: voronoi cells
    canvas_bg = _render_vf(r, "md",
        lambda g, f, t, S: vf_voronoi(g, f, t, S,
            n_cells=20, mode="distance", speed=0.2),
        hf_distance(0.4, 0.02), PAL_CIRCUIT, f, t, S, sat=0.5)
    # Foreground: boids
    g = r.get_grid("md")
    ch_b, co_b = update_boids(S, g, f, n_boids=150, perception=6.0,
        max_speed=1.5, char_set=list("▸▹►▻→⟶"))
    canvas_boids = g.render(ch_b, co_b)
    # Trails for the boids
    # (boid positions are stored in S["boid_x"], S["boid_y"])
    S["px"] = list(S.get("boid_x", []))
    S["py"] = list(S.get("boid_y", []))
    ch_t, co_t = draw_particle_trails(S, g, max_trail=6, fade=0.6)
    canvas_trails = g.render(ch_t, co_t)
    result = blend_canvas(canvas_bg, canvas_trails, "add", 0.3)
    result = blend_canvas(result, canvas_boids, "add", 0.9)
    return result
```
### Fire Rising Through SDF Text Stencil
Fire effect visible only through text letterforms.
```python
def fx_fire_text(r, f, t, S):
    """Fire columns visible through text stencil. Text acts as window."""
    g = r.get_grid("lg")
    # Full-screen fire (will be masked)
    canvas_fire = _render_vf(r, "sm",
        lambda g, f, t, S: np.clip(
            vf_fbm(g, f, t, S, octaves=4, freq=0.08, speed=0.8) *
            (1.0 - g.rr / g.rows) *  # fade toward top
            (0.6 + f.get("bass", 0.3) * 0.8), 0, 1),
        hf_fixed(0.05), PAL_BLOCKS, f, t, S, sat=0.9)  # fire hue
    # Background: dark domain warp
    canvas_bg = _render_vf(r, "md",
        lambda g, f, t, S: vf_domain_warp(g, f, t, S,
            warp_strength=8, freq=0.03, speed=0.05) * 0.3,
        hf_fixed(0.6), PAL_DENSE, f, t, S, sat=0.4)
    # Text stencil mask
    mask = mask_text(g, "FIRE", row_frac=0.45)
    # Expand vertically for multi-row coverage
    for offset in range(-2, 3):
        shifted = mask_text(g, "FIRE", row_frac=0.45 + offset / g.rows)
        mask = mask_union(mask, shifted)
    canvas_masked = apply_mask_canvas(canvas_fire, mask, bg_canvas=canvas_bg)
    return canvas_masked
```
### Portrait Mode: Vertical Rain + Quote
Optimized for 9:16. Uses vertical space for long rain trails and stacked text.
```python
def fx_portrait_rain_quote(r, f, t, S):
    """Portrait-optimized: matrix rain (long vertical trails) with stacked quote.
    Designed for 1080x1920 (9:16)."""
    g = r.get_grid("md")  # ~112x100 in portrait
    # Matrix rain: long trails benefit from portrait's extra rows
    ch, co, S = eff_matrix_rain(g, f, t, S,
        hue=0.33, bri=0.6, pal=PAL_KATA, speed_base=0.4, speed_beat=2.5)
    canvas_rain = g.render(ch, co)
    # Tunnel depth underneath for texture
    canvas_tunnel = _render_vf(r, "sm",
        lambda g, f, t, S: vf_tunnel(g, f, t, S, speed=3.0, complexity=6) * 0.8,
        hf_fixed(0.33), PAL_BLOCKS, f, t, S, sat=0.5)
    result = blend_canvas(canvas_tunnel, canvas_rain, "screen", 0.8)
    # Quote text, portrait layout: short lines, many of them
    g_text = r.get_grid("lg")  # ~90x80 in portrait
    quote_lines = layout_text_portrait(
        "The code is the art and the art is the code",
        max_chars_per_line=20)
    # Center vertically
    block_start = (g_text.rows - len(quote_lines)) // 2
    ch_t = np.full((g_text.rows, g_text.cols), " ", dtype="U1")
    co_t = np.zeros((g_text.rows, g_text.cols, 3), dtype=np.uint8)
    total_chars = sum(len(line) for line in quote_lines)
    # Record the scene start on the first frame, then measure progress from it
    if "_scene_start" not in S:
        S["_scene_start"] = t
    progress = min(1.0, (t - S["_scene_start"]) / 3.0)
    render_typewriter(ch_t, co_t, quote_lines, block_start, g_text.cols,
        progress, total_chars, (200, 255, 220), t)
    canvas_text = g_text.render(ch_t, co_t)
    result = blend_canvas(result, canvas_text, "add", 0.9)
    return result
```
---
## Scene Table Template
Wire scenes into a complete video:
```python
SCENES = [
    {"start": 0.0, "end": 5.0, "name": "coral",
     "fx": fx_coral, "grid": "sm", "gamma": 0.70,
     "shaders": [("bloom", {"thr": 110}), ("vignette", {"s": 0.2})],
     "feedback": {"decay": 0.8, "blend": "screen", "opacity": 0.3,
                  "transform": "zoom", "transform_amt": 0.01}},
    {"start": 5.0, "end": 15.0, "name": "tunnel_noise",
     "fx": fx_tunnel_noise, "grid": "md", "gamma": 0.75,
     "shaders": [("chromatic", {"amt": 3}), ("bloom", {"thr": 120}),
                 ("scanlines", {"intensity": 0.06}), ("grain", {"amt": 8})],
     "feedback": None},
    {"start": 15.0, "end": 35.0, "name": "cathedral",
     "fx": fx_cathedral, "grid": "sm", "gamma": 0.65,
     "shaders": [("bloom", {"thr": 100}), ("chromatic", {"amt": 5}),
                 ("color_wobble", {"amt": 0.2}), ("vignette", {"s": 0.18})],
     "feedback": {"decay": 0.75, "blend": "screen", "opacity": 0.35,
                  "transform": "zoom", "transform_amt": 0.012, "hue_shift": 0.015}},
    {"start": 35.0, "end": 50.0, "name": "morphing",
     "fx": fx_morphing_journey, "grid": "md", "gamma": 0.70,
     "shaders": [("bloom", {"thr": 110}), ("grain", {"amt": 6})],
     "feedback": {"decay": 0.7, "blend": "screen", "opacity": 0.25,
                  "transform": "rotate_cw", "transform_amt": 0.003}},
]
```
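Before a long render it can help to sanity-check the table's timing. The sketch below assumes only the `start`, `end`, and `name` keys used in the template; `check_scene_table` is a hypothetical helper, and whether the renderer itself tolerates gaps or overlaps is not stated here.

```python
def check_scene_table(scenes):
    """Return a list of timing problems found in a scene table (empty if none)."""
    problems = []
    ordered = sorted(scenes, key=lambda s: s["start"])
    for s in ordered:
        if s["end"] <= s["start"]:
            problems.append(f"{s['name']}: end must be after start")
    for prev, cur in zip(ordered, ordered[1:]):
        if cur["start"] < prev["end"]:
            problems.append(f"{prev['name']} overlaps {cur['name']}")
        elif cur["start"] > prev["end"]:
            problems.append(f"gap between {prev['name']} and {cur['name']}")
    return problems

# Reduced demo table: only the timing keys matter for this check
demo = [
    {"start": 0.0, "end": 5.0, "name": "coral"},
    {"start": 5.0, "end": 15.0, "name": "tunnel_noise"},
    {"start": 16.0, "end": 35.0, "name": "cathedral"},  # deliberate 1 s gap
]
problems = check_scene_table(demo)
```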
@@ -1,13 +1,6 @@
# Input Sources
**Cross-references:**
- Grid system, resolution presets: `architecture.md`
- Effect building blocks (audio-reactive modulation): `effects.md`
- Scene protocol, SCENES table (feature routing): `scenes.md`
- Shader pipeline, output encoding: `shaders.md`
- Performance tuning (audio chunking, WAV caching): `optimization.md`
- Common bugs (sample rate, dtype, silence handling): `troubleshooting.md`
- Complete scene examples with feature usage: `examples.md`
> **See also:** architecture.md · effects.md · scenes.md · shaders.md · optimization.md · troubleshooting.md
## Audio Analysis
@@ -1,14 +1,6 @@
# Optimization Reference
**Cross-references:**
- Grid system, resolution presets, portrait GridLayer: `architecture.md`
- Effect building blocks (pre-computation strategies): `effects.md`
- `_render_vf()`, tonemap (subsampled percentile): `composition.md`
- Scene protocol, render_clip: `scenes.md`
- Shader pipeline, encoding (ffmpeg flags): `shaders.md`
- Input sources (audio chunking, WAV extraction): `inputs.md`
- Common bugs (memory, OOM, frame drops): `troubleshooting.md`
- Complete scene examples: `examples.md`
> **See also:** architecture.md · composition.md · scenes.md · shaders.md · inputs.md · troubleshooting.md
## Hardware Detection
+616 -11
@@ -1,18 +1,214 @@
# Scene System Reference
# Scene System & Creative Composition
**Cross-references:**
- Grid system, palettes, color (HSV + OKLAB): `architecture.md`
- Effect building blocks (value fields, noise, SDFs, particles): `effects.md`
- `_render_vf()`, blend modes, tonemap, masking: `composition.md`
- Shader pipeline, feedback buffer, ShaderChain: `shaders.md`
- Complete scene examples at every complexity level: `examples.md`
- Input sources (audio features, video features): `inputs.md`
- Performance tuning, portrait CLI: `optimization.md`
- Common bugs (state leaks, frame drops): `troubleshooting.md`
> **See also:** architecture.md · composition.md · effects.md · shaders.md
## Scene Design Philosophy
Scenes are storytelling units, not effect demos. Every scene needs:
- A **concept** — what is happening visually? Not "plasma + rings" but "emergence from void" or "crystallization"
- An **arc** — how does it change over its duration? Build, decay, transform, reveal?
- A **role** — how does it serve the larger video narrative? Opening tension, peak energy, resolution?
The design patterns below provide compositional techniques. The scene examples show them in practice at increasing complexity. The protocol section covers the technical contract.
Good scene design starts with the concept, then selects effects and parameters that serve it. The design patterns section shows *how* to compose layers intentionally. The examples section shows complete working scenes at every complexity level. The protocol section covers the technical contract that all scenes must follow.
---
## Scene Design Patterns
Higher-order patterns for composing scenes that feel intentional rather than random. These patterns use the existing building blocks (value fields, blend modes, shaders, feedback) but organize them with compositional intent.
## Layer Hierarchy
Every scene should have clear visual layers with distinct roles:
| Layer | Grid | Brightness | Purpose |
|-------|------|-----------|---------|
| **Background** | xs or sm (dense) | 0.1–0.25 | Atmosphere, texture. Never competes with content. |
| **Content** | md (balanced) | 0.4–0.8 | The main visual idea. Carries the scene's concept. |
| **Accent** | lg or sm (sparse) | 0.5–1.0 (sparse coverage) | Highlights, punctuation, sparse bright points. |
The background sets mood. The content layer is what the scene *is about*. The accent adds visual interest without overwhelming.
```python
def fx_example(r, f, t, S):
    local = t
    progress = min(local / 5.0, 1.0)
    g_bg = r.get_grid("sm")
    g_main = r.get_grid("md")
    g_accent = r.get_grid("lg")
    # Illustrative values only: drive the defining parameter from progress
    n_arms, tightness = 4, 2.0 + progress * 5.0
    # --- Background: dim atmosphere ---
    bg_val = vf_smooth_noise(g_bg, f, t * 0.3, S, octaves=2, bri=0.15)
    # ... render bg to canvas
    # --- Content: the main visual idea ---
    content_val = vf_spiral(g_main, f, t, S, n_arms=n_arms, tightness=tightness)
    # ... render content on top of canvas
    # --- Accent: sparse highlights ---
    accent_val = vf_noise_static(g_accent, f, t, S, density=0.05)
    # ... render accent on top
    return canvas
```
## Directional Parameter Arcs
Parameters should *go somewhere* over the scene's duration — not oscillate aimlessly with `sin(t * N)`.
**Bad:** `twist = 3.0 + 2.0 * math.sin(t * 0.6)` — wobbles back and forth, feels aimless.
**Good:** `twist = 2.0 + progress * 5.0` — starts gentle, ends intense. The scene *builds*.
Use `progress = min(local / duration, 1.0)` (0→1 over the scene) to drive directional change:
| Pattern | Formula | Feel |
|---------|---------|------|
| Linear ramp | `progress * range` | Steady buildup |
| Ease-out | `1 - (1 - progress) ** 2` | Fast start, gentle finish |
| Ease-in | `progress ** 2` | Slow start, accelerating |
| Step reveal | `np.clip((progress - 0.5) / 0.25, 0, 1)` | Nothing until 50%, then fades in |
| Build + plateau | `min(1.0, progress * 1.5)` | Reaches full at 67%, holds |
Oscillation is fine for *secondary* parameters (saturation shimmer, hue drift). But the *defining* parameter of the scene should have a direction.
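The formulas in the table can be written as small pure functions of `progress`. This is a sketch for experimenting with the curves, not part of the library; the function names are illustrative, and `scene_progress` adds a lower clamp at 0 that the formula in the text does not have.

```python
def scene_progress(local, duration):
    """0 -> 1 over the scene; clamped at both ends (lower clamp is an addition)."""
    return max(0.0, min(local / duration, 1.0))

def linear(p):
    return p

def ease_out(p):
    """Fast start, gentle finish."""
    return 1 - (1 - p) ** 2

def ease_in(p):
    """Slow start, accelerating."""
    return p ** 2

def step_reveal(p, start=0.5, width=0.25):
    """Zero until `start`, then ramps to 1 over `width`."""
    return max(0.0, min(1.0, (p - start) / width))

def build_plateau(p, rate=1.5):
    """Reaches 1.0 at p = 1 / rate, then holds."""
    return min(1.0, p * rate)

# Directional twist: starts gentle, ends intense
p = scene_progress(local=2.5, duration=5.0)
twist = 2.0 + ease_in(p) * 5.0
```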
### Examples of Directional Arcs
| Scene concept | Parameter | Arc |
|--------------|-----------|-----|
| Emergence | Ring radius | 0 → max (ease-out) |
| Shatter | Voronoi cell count | 8 → 38 (linear) |
| Descent | Tunnel speed | 2.0 → 10.0 (linear) |
| Mandala | Shape complexity | ring → +polygon → +star → +rosette (step reveals) |
| Crescendo | Layer count | 1 → 7 (staggered entry) |
| Entropy | Geometry visibility | 1.0 → 0.0 (consumed) |
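Each row of the table maps `progress` onto a parameter range. A minimal sketch of that mapping, assuming a hypothetical `arc` helper; the 0.4 maximum ring radius is an illustrative value, not one taken from the text.

```python
def arc(progress, start, end, ease=lambda p: p):
    """Interpolate from start to end along an easing curve (linear by default)."""
    p = max(0.0, min(progress, 1.0))
    return start + (end - start) * ease(p)

progress = 0.5
tunnel_speed = arc(progress, 2.0, 10.0)   # Descent: linear 2.0 -> 10.0
n_cells = int(arc(progress, 8, 38))       # Shatter: linear 8 -> 38
geometry = arc(progress, 1.0, 0.0)        # Entropy: 1.0 -> 0.0
ring_radius = arc(progress, 0.0, 0.4,     # Emergence: ease-out, 0 -> max
                  ease=lambda p: 1 - (1 - p) ** 2)
```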
## Scene Concepts
Each scene should be built around a *visual idea*, not an effect name.
**Bad:** "fx_plasma_cascade" — named after the effect. No concept.
**Good:** "fx_emergence" — a point of light expands into a field. The name tells you *what happens*.
Good scene concepts have:
1. A **visual metaphor** (emergence, descent, collision, entropy)
2. A **directional arc** (things change from A to B, not oscillate)
3. **Motivated layer choices** (each layer serves the concept)
4. **Motivated feedback** (transform direction matches the metaphor)
| Concept | Metaphor | Feedback transform | Why |
|---------|----------|-------------------|-----|
| Emergence | Birth, expansion | zoom-out | Past frames expand outward |
| Descent | Falling, acceleration | zoom-in | Past frames rush toward center |
| Inferno | Rising fire | shift-up | Past frames rise with the flames |
| Entropy | Decay, dissolution | none | Clean, no persistence — things disappear |
| Crescendo | Accumulation | zoom + hue_shift | Everything compounds and shifts |
## Compositional Techniques
### Counter-Rotating Dual Systems
Two instances of the same effect rotating in opposite directions create visual interference:
```python
# Primary spiral (clockwise)
s1_val = vf_spiral(g_main, f, t * 1.5, S, n_arms=n_arms_1, tightness=tightness_1)
# Counter-rotating spiral (counter-clockwise via negative time)
s2_val = vf_spiral(g_accent, f, -t * 1.2, S, n_arms=n_arms_2, tightness=tightness_2)
# Screen blend creates bright interference at crossing points
canvas = blend_canvas(canvas_with_s1, c2, "screen", 0.7)
```
Works with spirals, vortexes, rings. The counter-rotation creates constantly shifting interference patterns.
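A self-contained sketch of the same idea on raw value fields. `toy_spiral` is a stand-in, not the library's `vf_spiral`, and `screen` applies the standard screen formula `1 - (1 - a)(1 - b)` to scalar fields, whereas `blend_canvas` works on pixel canvases; the opacity mix is an assumption about how such a blend is typically weighted.

```python
import numpy as np

def toy_spiral(rows, cols, t, n_arms=3, tightness=2.0):
    """Stand-in spiral field in [0, 1]; not the library's vf_spiral."""
    yy, xx = np.mgrid[0:rows, 0:cols]
    dx, dy = xx - cols / 2, yy - rows / 2
    angle = np.arctan2(dy, dx)
    dist = np.hypot(dx, dy) + 1e-6  # avoid log(0) at the center
    return 0.5 + 0.5 * np.sin(n_arms * angle + tightness * np.log(dist) - t)

def screen(a, b, opacity=1.0):
    """Screen blend of two fields in [0, 1]: bright wherever either is bright."""
    blended = 1.0 - (1.0 - a) * (1.0 - b)
    return a * (1.0 - opacity) + blended * opacity

t = 2.0
s1 = toy_spiral(24, 48, t * 1.5)    # time runs forward
s2 = toy_spiral(24, 48, -t * 1.2)   # time runs backward: opposite rotation
combined = screen(s1, s2, 0.7)
```

Because screen never darkens its first input, the crossing points of the two spirals show up as the brightest regions.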
### Wave Collision
Two wave fronts converging from opposite sides, meeting at a collision point:
```python
collision_phase = abs(progress - 0.5) * 2 # 1→0→1 (0 at collision)
# Wave A approaches from left
offset_a = (1 - progress) * g.cols * 0.4
wave_a = np.sin((g.cc + offset_a) * 0.08 + t * 2) * 0.5 + 0.5
# Wave B approaches from right
offset_b = -(1 - progress) * g.cols * 0.4
wave_b = np.sin((g.cc + offset_b) * 0.08 - t * 2) * 0.5 + 0.5
# Interference peaks at collision
combined = wave_a * 0.5 + wave_b * 0.5 + np.abs(wave_a - wave_b) * (1 - collision_phase) * 0.5
```
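The snippet above can be run on a single row of columns without a grid object. In this sketch `np.arange(cols)` stands in for `g.cc`; everything else follows the formulas as written. Since `0.5 * (a + b) + 0.5 * |a - b|` equals `max(a, b)`, the combined value stays within [0, 1] for every `progress`.

```python
import numpy as np

def wave_collision(cols, progress, t):
    """Return (combined, wave_a, wave_b) for one row of `cols` columns."""
    cc = np.arange(cols, dtype=float)          # stand-in for g.cc
    collision_phase = abs(progress - 0.5) * 2  # 1 -> 0 -> 1, 0 at collision
    offset_a = (1 - progress) * cols * 0.4
    offset_b = -(1 - progress) * cols * 0.4
    wave_a = np.sin((cc + offset_a) * 0.08 + t * 2) * 0.5 + 0.5
    wave_b = np.sin((cc + offset_b) * 0.08 - t * 2) * 0.5 + 0.5
    combined = (wave_a * 0.5 + wave_b * 0.5
                + np.abs(wave_a - wave_b) * (1 - collision_phase) * 0.5)
    return combined, wave_a, wave_b

# At the collision point the result is the brighter of the two waves
row, a, b = wave_collision(80, 0.5, 1.0)
```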
### Progressive Fragmentation
Voronoi with cell count increasing over time — visual shattering:
```python
n_pts = int(8 + progress * 30) # 8 cells → 38 cells
# Pre-generate enough points, slice to n_pts
px = base_x[:n_pts] + np.sin(t * 0.3 + np.arange(n_pts) * 0.7) * (3 + progress * 3)
```
The edge glow width can also increase with progress to emphasize the cracks.
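A rough NumPy sketch of the whole fragmentation step, including the point pre-generation the comment mentions. The 80x24 point range, the `cos` drift on `py`, and the edge measure (gap between nearest and second-nearest point) are assumptions made for illustration; the library's `vf_voronoi` may compute edges differently.

```python
import numpy as np

MAX_PTS = 38
rng = np.random.default_rng(7)
# Pre-generate enough points once; each frame slices the first n_pts
base_x = rng.uniform(0, 80, MAX_PTS)
base_y = rng.uniform(0, 24, MAX_PTS)

def fragment_edges(rows, cols, progress, t, edge_width=1.5):
    """Return (edge field in [0, 1], number of active cells)."""
    n_pts = int(8 + progress * 30)             # 8 cells -> 38 cells
    drift = 3 + progress * 3
    phase = np.arange(n_pts) * 0.7
    px = base_x[:n_pts] + np.sin(t * 0.3 + phase) * drift
    py = base_y[:n_pts] + np.cos(t * 0.3 + phase) * drift  # assumed y drift
    yy, xx = np.mgrid[0:rows, 0:cols]
    # Distance from every cell to every point: shape (rows, cols, n_pts)
    dist = np.hypot(xx[..., None] - px, yy[..., None] - py)
    dist.sort(axis=-1)
    gap = dist[..., 1] - dist[..., 0]          # small gap means near a cell border
    return np.clip(1.0 - gap / edge_width, 0.0, 1.0), n_pts

edges_start, n_start = fragment_edges(24, 80, 0.0, 0.0)
edges_end, n_end = fragment_edges(24, 80, 1.0, 0.0)
```

Raising `edge_width` with `progress` is one way to widen the glow as the cracks multiply.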
### Entropy / Consumption
A clean geometric pattern being overtaken by an organic process:
```python
# Geometry fades out
geo_val = clean_pattern * max(0.05, 1.0 - progress * 0.9)
# Organic process grows in
rd_val = vf_reaction_diffusion(g, f, t, S) * min(1.0, progress * 1.5)
# Render geometry first, organic on top — organic consumes geometry
```
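The two weights above can be checked in isolation on plain arrays. In this sketch `entropy_mix` is a hypothetical helper, and compositing the organic field "on top" as `organic + geometry * (1 - organic)` is one possible reading of the last comment, not necessarily what the renderer does.

```python
import numpy as np

def entropy_mix(clean_pattern, organic, progress):
    """Fade geometry out while an organic field grows in and covers it."""
    geo_val = clean_pattern * max(0.05, 1.0 - progress * 0.9)
    org_val = organic * min(1.0, progress * 1.5)
    # Organic on top: wherever it is present it hides the geometry beneath
    return org_val + geo_val * (1.0 - org_val)

rng = np.random.default_rng(1)
clean = rng.random((6, 10))     # stand-in for a clean geometric field
organic = rng.random((6, 10))   # stand-in for a reaction-diffusion field
start = entropy_mix(clean, organic, 0.0)
end = entropy_mix(clean, organic, 1.0)
```

At `progress = 0` the output is the geometry alone; by `progress = 1` the geometry contributes about a tenth of its original strength.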
### Staggered Layer Entry (Crescendo)
Layers enter one at a time, building to overwhelming density:
```python
def layer_strength(enter_t, ramp=1.5):
    """0.0 until enter_t, ramps to 1.0 over ramp seconds."""
    return max(0.0, min(1.0, (local - enter_t) / ramp))
# Layer 1: enters at 0s, so it fades in over the first ramp seconds
s1 = layer_strength(0.0)
# Layer 2: enters at 2s
s2 = layer_strength(2.0)
# Layer 3: enters at 4s
s3 = layer_strength(4.0)
# ... etc
# Each layer uses a different effect, grid, palette, and blend mode
# Screen blend between layers so they accumulate light
```
For a 15-second crescendo, 7 layers entering every 2 seconds works well. Use different blend modes (screen for most, add for energy, colordodge for the final wash).
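The 15-second, 7-layer schedule can be sketched as a list of entry times. This version passes `local` explicitly instead of reading it from the enclosing scope; `ENTER_TIMES` and `layer_strengths` are illustrative names, not part of the library.

```python
def layer_strength(local, enter_t, ramp=1.5):
    """0.0 until enter_t, then ramps to 1.0 over `ramp` seconds."""
    return max(0.0, min(1.0, (local - enter_t) / ramp))

# Seven layers, one entering every 2 seconds
ENTER_TIMES = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

def layer_strengths(local):
    """Strength of every layer at scene-local time `local` (seconds)."""
    return [layer_strength(local, enter_t) for enter_t in ENTER_TIMES]

early = layer_strengths(3.5)   # layers 1 and 2 fully in, the rest still silent
late = layer_strengths(15.0)   # everything at full strength
```

Each strength can then scale its layer's opacity in the blend, so a layer contributes nothing before its entry time.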
## Scene Ordering
For a multi-scene reel or video:
- **Vary mood between adjacent scenes** — don't put two calm scenes next to each other
- **Randomize order** rather than grouping by type — prevents "effect demo" feel
- **End on the strongest scene** — crescendo or something with a clear payoff
- **Open with energy** — grab attention in the first 2 seconds
---
## Scene Protocol
Scenes are the top-level creative unit. Each scene is a time-bounded segment with its own effect function, shader chain, feedback configuration, and tone-mapping gamma.
## Scene Protocol (v2)
### Scene Protocol (v2)
### Function Signature
@@ -404,3 +600,412 @@ For each scene:
7. **Configure feedback** for trailing/recursive looks — or None for clean cuts
8. **Set gamma** if using destructive shaders (solarize, posterize)
9. **Test with --test-frame** at the scene's midpoint before full render
---
## Scene Examples
Copy-paste-ready scene functions at increasing complexity. Each is a complete, working v2 scene function that returns a pixel canvas. See the Scene Protocol section above for the function contract and `composition.md` for blend modes and tonemap.
---
### Minimal — Single Grid, Single Effect
### Breathing Plasma
One grid, one value field, one hue field. The simplest possible scene.
```python
def fx_breathing_plasma(r, f, t, S):
    """Plasma field with time-cycling hue. Audio modulates brightness."""
    canvas = _render_vf(r, "md",
        lambda g, f, t, S: vf_plasma(g, f, t, S) * 1.3,
        hf_time_cycle(0.08), PAL_DENSE, f, t, S, sat=0.8)
    return canvas
```
### Reaction-Diffusion Coral
Single grid, simulation-based field. Evolves organically over time.
```python
def fx_coral(r, f, t, S):
    """Gray-Scott reaction-diffusion: coral branching pattern.
    Slow-evolving, organic. Best for ambient/chill sections."""
    canvas = _render_vf(r, "sm",
        lambda g, f, t, S: vf_reaction_diffusion(g, f, t, S,
            feed=0.037, kill=0.060, steps_per_frame=6, init_mode="center"),
        hf_distance(0.55, 0.015), PAL_DOTS, f, t, S, sat=0.7)
    return canvas
```
### SDF Geometry
Geometric shapes from SDFs. Clean, precise, graphic.
```python
def fx_sdf_rings(r, f, t, S):
    """Concentric SDF rings with smooth pulsing."""
    def val_fn(g, f, t, S):
        d1 = sdf_ring(g, radius=0.15 + f.get("bass", 0.3) * 0.05, thickness=0.015)
        d2 = sdf_ring(g, radius=0.25 + f.get("mid", 0.3) * 0.05, thickness=0.012)
        d3 = sdf_ring(g, radius=0.35 + f.get("hi", 0.3) * 0.04, thickness=0.010)
        combined = sdf_smooth_union(sdf_smooth_union(d1, d2, 0.05), d3, 0.05)
        return sdf_glow(combined, falloff=0.08) * (0.5 + f.get("rms", 0.3) * 0.8)
    canvas = _render_vf(r, "md", val_fn, hf_angle(0.0), PAL_STARS, f, t, S, sat=0.85)
    return canvas
```
---
### Standard — Two Grids + Blend
### Tunnel Through Noise
Two grids at different densities, screen blended. The fine noise texture shows through the coarser tunnel characters.
```python
def fx_tunnel_noise(r, f, t, S):
    """Tunnel depth on md grid + fBM noise on sm grid, screen blended."""
    canvas_a = _render_vf(r, "md",
        lambda g, f, t, S: vf_tunnel(g, f, t, S, speed=4.0, complexity=8) * 1.2,
        hf_distance(0.5, 0.02), PAL_BLOCKS, f, t, S, sat=0.7)
    canvas_b = _render_vf(r, "sm",
        lambda g, f, t, S: vf_fbm(g, f, t, S, octaves=4, freq=0.05, speed=0.15) * 1.3,
        hf_time_cycle(0.06), PAL_RUNE, f, t, S, sat=0.6)
    return blend_canvas(canvas_a, canvas_b, "screen", 0.7)
```
### Voronoi Cells + Spiral Overlay
Voronoi cell edges with a spiral arm pattern overlaid.
```python
def fx_voronoi_spiral(r, f, t, S):
    """Voronoi edge detection on md + logarithmic spiral on lg."""
    canvas_a = _render_vf(r, "md",
        lambda g, f, t, S: vf_voronoi(g, f, t, S,
            n_cells=15, mode="edge", edge_width=2.0, speed=0.4),
        hf_angle(0.2), PAL_CIRCUIT, f, t, S, sat=0.75)
    canvas_b = _render_vf(r, "lg",
        lambda g, f, t, S: vf_spiral(g, f, t, S, n_arms=4, tightness=3.0) * 1.2,
        hf_distance(0.1, 0.03), PAL_BLOCKS, f, t, S, sat=0.9)
    return blend_canvas(canvas_a, canvas_b, "exclusion", 0.6)
```
### Domain-Warped fBM
Two layers of the same fBM, one domain-warped, difference-blended for psychedelic organic texture.
```python
def fx_organic_warp(r, f, t, S):
    """Clean fBM vs domain-warped fBM, difference blended."""
    canvas_a = _render_vf(r, "sm",
        lambda g, f, t, S: vf_fbm(g, f, t, S, octaves=5, freq=0.04, speed=0.1),
        hf_plasma(0.2), PAL_DENSE, f, t, S, sat=0.6)
    canvas_b = _render_vf(r, "md",
        lambda g, f, t, S: vf_domain_warp(g, f, t, S,
            warp_strength=20.0, freq=0.05, speed=0.15),
        hf_time_cycle(0.05), PAL_BRAILLE, f, t, S, sat=0.7)
    return blend_canvas(canvas_a, canvas_b, "difference", 0.7)
```
---
### Complex — Three Grids + Conditional + Feedback
### Psychedelic Cathedral
Three-grid composition with beat-triggered kaleidoscope and feedback zoom tunnel. The most visually complex pattern.
```python
def fx_cathedral(r, f, t, S):
    """Three-layer cathedral: interference + rings + noise, kaleidoscope on beat,
    feedback zoom tunnel."""
    # Layer 1: interference pattern on sm grid
    canvas_a = _render_vf(r, "sm",
        lambda g, f, t, S: vf_interference(g, f, t, S, n_waves=7) * 1.3,
        hf_angle(0.0), PAL_MATH, f, t, S, sat=0.8)
    # Layer 2: pulsing rings on md grid
    canvas_b = _render_vf(r, "md",
        lambda g, f, t, S: vf_rings(g, f, t, S, n_base=10, spacing_base=3) * 1.4,
        hf_distance(0.3, 0.02), PAL_STARS, f, t, S, sat=0.9)
    # Layer 3: temporal noise on lg grid (slow morph)
    canvas_c = _render_vf(r, "lg",
        lambda g, f, t, S: vf_temporal_noise(g, f, t, S,
            freq=0.04, t_freq=0.2, octaves=3),
        hf_time_cycle(0.12), PAL_BLOCKS, f, t, S, sat=0.7)
    # Blend: A screen B, then difference with C
    result = blend_canvas(canvas_a, canvas_b, "screen", 0.8)
    result = blend_canvas(result, canvas_c, "difference", 0.5)
    # Beat-triggered kaleidoscope
    if f.get("bdecay", 0) > 0.3:
        folds = 6 if f.get("sub_r", 0.3) > 0.4 else 8
        result = sh_kaleidoscope(result.copy(), folds=folds)
    return result
# Scene table entry with feedback:
# {"start": 30.0, "end": 50.0, "name": "cathedral", "fx": fx_cathedral,
#  "gamma": 0.65, "shaders": [("bloom", {"thr": 110}), ("chromatic", {"amt": 4}),
#                             ("vignette", {"s": 0.2}), ("grain", {"amt": 8})],
#  "feedback": {"decay": 0.75, "blend": "screen", "opacity": 0.35,
#               "transform": "zoom", "transform_amt": 0.012, "hue_shift": 0.015}}
```
### Masked Reaction-Diffusion with Attractor Overlay
Reaction-diffusion visible only through an animated iris mask, with a strange attractor density field underneath.
```python
def fx_masked_life(r, f, t, S):
    """Attractor base + reaction-diffusion visible through iris mask + particles."""
    g_sm = r.get_grid("sm")
    g_md = r.get_grid("md")
    # Layer 1: strange attractor density field (background)
    canvas_bg = _render_vf(r, "sm",
        lambda g, f, t, S: vf_strange_attractor(g, f, t, S,
            attractor="clifford", n_points=30000),
        hf_time_cycle(0.04), PAL_DOTS, f, t, S, sat=0.5)
    # Layer 2: reaction-diffusion (foreground, will be masked)
    canvas_rd = _render_vf(r, "md",
        lambda g, f, t, S: vf_reaction_diffusion(g, f, t, S,
            feed=0.046, kill=0.063, steps_per_frame=4, init_mode="ring"),
        hf_angle(0.15), PAL_HALFFILL, f, t, S, sat=0.85)
    # Animated iris mask: opens over first 5 seconds of scene
    scene_start = S.get("_scene_start", t)
    if "_scene_start" not in S:
        S["_scene_start"] = t
    mask = mask_iris(g_md, t, scene_start, scene_start + 5.0,
        max_radius=0.6)
    canvas_rd = apply_mask_canvas(canvas_rd, mask, bg_canvas=canvas_bg)
    # Layer 3: flow-field particles following the R-D gradient
    rd_field = vf_reaction_diffusion(g_sm, f, t, S,
        feed=0.046, kill=0.063, steps_per_frame=0)  # read without stepping
    ch_p, co_p = update_flow_particles(S, g_sm, f, rd_field,
        n=300, speed=0.8, char_set=list("·•◦∘°"))
    canvas_p = g_sm.render(ch_p, co_p)
    result = blend_canvas(canvas_rd, canvas_p, "add", 0.7)
    return result
```
### Morphing Field Sequence with Eased Keyframes
Demonstrates temporal coherence: smooth morphing between effects with keyframed parameters.
```python
def fx_morphing_journey(r, f, t, S):
    """Morphs through 4 value fields over 20 seconds with eased transitions.
    The twist parameter is also keyframed."""
    # Keyframed twist parameter
    twist = keyframe(t, [(0, 1.0), (5, 5.0), (10, 2.0), (15, 8.0), (20, 1.0)],
        ease_fn=ease_in_out_cubic, loop=True)
    # Sequence of value fields with 2s crossfade
    fields = [
        lambda g, f, t, S: vf_plasma(g, f, t, S),
        lambda g, f, t, S: vf_vortex(g, f, t, S, twist=twist),
        lambda g, f, t, S: vf_fbm(g, f, t, S, octaves=5, freq=0.04),
        lambda g, f, t, S: vf_domain_warp(g, f, t, S, warp_strength=15),
    ]
    durations = [5.0, 5.0, 5.0, 5.0]
    val_fn = lambda g, f, t, S: vf_sequence(g, f, t, S, fields, durations,
        crossfade=2.0)
    # Render with slowly rotating hue
    canvas = _render_vf(r, "md", val_fn, hf_time_cycle(0.06),
        PAL_DENSE, f, t, S, sat=0.8)
    # Second layer: tiled version of same sequence at smaller grid
    tiled_fn = lambda g, f, t, S: vf_sequence(
        make_tgrid(g, *uv_tile(g, 3, 3, mirror=True)),
        f, t, S, fields, durations, crossfade=2.0)
    canvas_b = _render_vf(r, "sm", tiled_fn, hf_angle(0.1),
        PAL_RUNE, f, t, S, sat=0.6)
    return blend_canvas(canvas, canvas_b, "screen", 0.5)
```
---
### Specialized — Unique State Patterns
### Game of Life with Ghost Trails
Cellular automaton with analog fade trails. Beat injects random cells.
```python
def fx_life(r, f, t, S):
    """Conway's Game of Life with fading ghost trails.
    Beat events inject random live cells for disruption."""
    canvas = _render_vf(r, "sm",
        lambda g, f, t, S: vf_game_of_life(g, f, t, S,
            rule="life", steps_per_frame=1, fade=0.92, density=0.25),
        hf_fixed(0.33), PAL_BLOCKS, f, t, S, sat=0.8)
    # Overlay: coral automaton on lg grid for chunky texture
    canvas_b = _render_vf(r, "lg",
        lambda g, f, t, S: vf_game_of_life(g, f, t, S,
            rule="coral", steps_per_frame=1, fade=0.85, density=0.15, seed=99),
        hf_time_cycle(0.1), PAL_HATCH, f, t, S, sat=0.6)
    return blend_canvas(canvas, canvas_b, "screen", 0.5)
```
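The "analog fade trail" idea can be shown in isolation. The sketch below is not the `vf_game_of_life` implementation (which is not shown here). It assumes a boolean grid with wrap-around edges and a float trail buffer that decays by `fade` each frame; `life_step` and `update_trail` are hypothetical names:

```python
import numpy as np


def life_step(alive):
    """One Conway step on a boolean grid with wrap-around edges."""
    a = alive.astype(np.int32)
    neighbors = sum(
        np.roll(np.roll(a, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # Birth on exactly 3 neighbors; survival on 2 or 3.
    return (neighbors == 3) | (alive & (neighbors == 2))


def update_trail(trail, alive, fade=0.92):
    """Live cells are 1.0; cells that died decay geometrically toward 0."""
    return np.maximum(alive.astype(np.float32), trail * fade)
```

The trail buffer, rather than the boolean grid, would be the value field handed to the renderer, so a cell that just died still draws at 0.92, then about 0.85, and so on.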
### Boids Flock Over Voronoi
Emergent swarm movement over a cellular background.
```python
def fx_boid_swarm(r, f, t, S):
"""Flocking boids over animated voronoi cells."""
# Background: voronoi cells
canvas_bg = _render_vf(r, "md",
lambda g, f, t, S: vf_voronoi(g, f, t, S,
n_cells=20, mode="distance", speed=0.2),
hf_distance(0.4, 0.02), PAL_CIRCUIT, f, t, S, sat=0.5)
# Foreground: boids
g = r.get_grid("md")
ch_b, co_b = update_boids(S, g, f, n_boids=150, perception=6.0,
max_speed=1.5, char_set=list("▸▹►▻→⟶"))
canvas_boids = g.render(ch_b, co_b)
# Trails for the boids
# (boid positions are stored in S["boid_x"], S["boid_y"])
S["px"] = list(S.get("boid_x", []))
S["py"] = list(S.get("boid_y", []))
ch_t, co_t = draw_particle_trails(S, g, max_trail=6, fade=0.6)
canvas_trails = g.render(ch_t, co_t)
result = blend_canvas(canvas_bg, canvas_trails, "add", 0.3)
result = blend_canvas(result, canvas_boids, "add", 0.9)
return result
```
### Fire Rising Through SDF Text Stencil
Fire effect visible only through text letterforms.
```python
def fx_fire_text(r, f, t, S):
"""Fire columns visible through text stencil. Text acts as window."""
g = r.get_grid("lg")
# Full-screen fire (will be masked)
canvas_fire = _render_vf(r, "sm",
lambda g, f, t, S: np.clip(
vf_fbm(g, f, t, S, octaves=4, freq=0.08, speed=0.8) *
(1.0 - g.rr / g.rows) * # fade toward top
(0.6 + f.get("bass", 0.3) * 0.8), 0, 1),
hf_fixed(0.05), PAL_BLOCKS, f, t, S, sat=0.9) # fire hue
# Background: dark domain warp
canvas_bg = _render_vf(r, "md",
lambda g, f, t, S: vf_domain_warp(g, f, t, S,
warp_strength=8, freq=0.03, speed=0.05) * 0.3,
hf_fixed(0.6), PAL_DENSE, f, t, S, sat=0.4)
# Text stencil mask
mask = mask_text(g, "FIRE", row_frac=0.45)
# Expand vertically for multi-row coverage
for offset in range(-2, 3):
shifted = mask_text(g, "FIRE", row_frac=0.45 + offset / g.rows)
mask = mask_union(mask, shifted)
canvas_masked = apply_mask_canvas(canvas_fire, mask, bg_canvas=canvas_bg)
return canvas_masked
```
### Portrait Mode: Vertical Rain + Quote
Optimized for 9:16. Uses vertical space for long rain trails and stacked text.
```python
def fx_portrait_rain_quote(r, f, t, S):
"""Portrait-optimized: matrix rain (long vertical trails) with stacked quote.
Designed for 1080x1920 (9:16)."""
g = r.get_grid("md") # ~112x100 in portrait
# Matrix rain — long trails benefit from portrait's extra rows
ch, co, S = eff_matrix_rain(g, f, t, S,
hue=0.33, bri=0.6, pal=PAL_KATA, speed_base=0.4, speed_beat=2.5)
canvas_rain = g.render(ch, co)
# Tunnel depth underneath for texture
canvas_tunnel = _render_vf(r, "sm",
lambda g, f, t, S: vf_tunnel(g, f, t, S, speed=3.0, complexity=6) * 0.8,
hf_fixed(0.33), PAL_BLOCKS, f, t, S, sat=0.5)
result = blend_canvas(canvas_tunnel, canvas_rain, "screen", 0.8)
# Quote text — portrait layout: short lines, many of them
g_text = r.get_grid("lg") # ~90x80 in portrait
quote_lines = layout_text_portrait(
"The code is the art and the art is the code",
max_chars_per_line=20)
# Center vertically
block_start = (g_text.rows - len(quote_lines)) // 2
ch_t = np.full((g_text.rows, g_text.cols), " ", dtype="U1")
co_t = np.zeros((g_text.rows, g_text.cols, 3), dtype=np.uint8)
total_chars = sum(len(l) for l in quote_lines)
S.setdefault("_scene_start", t) # remember the first frame this scene ran
progress = min(1.0, (t - S["_scene_start"]) / 3.0)
render_typewriter(ch_t, co_t, quote_lines, block_start, g_text.cols,
progress, total_chars, (200, 255, 220), t)
canvas_text = g_text.render(ch_t, co_t)
result = blend_canvas(result, canvas_text, "add", 0.9)
return result
```
---
### Scene Table Template
Wire scenes into a complete video:
```python
SCENES = [
{"start": 0.0, "end": 5.0, "name": "coral",
"fx": fx_coral, "grid": "sm", "gamma": 0.70,
"shaders": [("bloom", {"thr": 110}), ("vignette", {"s": 0.2})],
"feedback": {"decay": 0.8, "blend": "screen", "opacity": 0.3,
"transform": "zoom", "transform_amt": 0.01}},
{"start": 5.0, "end": 15.0, "name": "tunnel_noise",
"fx": fx_tunnel_noise, "grid": "md", "gamma": 0.75,
"shaders": [("chromatic", {"amt": 3}), ("bloom", {"thr": 120}),
("scanlines", {"intensity": 0.06}), ("grain", {"amt": 8})],
"feedback": None},
{"start": 15.0, "end": 35.0, "name": "cathedral",
"fx": fx_cathedral, "grid": "sm", "gamma": 0.65,
"shaders": [("bloom", {"thr": 100}), ("chromatic", {"amt": 5}),
("color_wobble", {"amt": 0.2}), ("vignette", {"s": 0.18})],
"feedback": {"decay": 0.75, "blend": "screen", "opacity": 0.35,
"transform": "zoom", "transform_amt": 0.012, "hue_shift": 0.015}},
{"start": 35.0, "end": 50.0, "name": "morphing",
"fx": fx_morphing_journey, "grid": "md", "gamma": 0.70,
"shaders": [("bloom", {"thr": 110}), ("grain", {"amt": 6})],
"feedback": {"decay": 0.7, "blend": "screen", "opacity": 0.25,
"transform": "rotate_cw", "transform_amt": 0.003}},
]
```
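How a renderer picks the active entry from such a table is defined in `scenes.md`, not here. As a rough sketch, assuming each scene covers the half-open interval `[start, end)` and the table is sorted by start time, lookup and a gap/overlap check could look like this (`scene_at` and `check_contiguous` are hypothetical helpers):

```python
def scene_at(scenes, t):
    """Return the scene whose [start, end) interval contains t, else None."""
    for scene in scenes:
        if scene["start"] <= t < scene["end"]:
            return scene
    return None


def check_contiguous(scenes):
    """Return (prev_name, next_name) pairs where scenes leave a gap or overlap."""
    return [(a["name"], b["name"])
            for a, b in zip(scenes, scenes[1:])
            if a["end"] != b["start"]]
```

Running `check_contiguous(SCENES)` before a full render is a cheap way to catch a mistyped boundary. An empty list means every scene starts exactly where the previous one ends.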
@@ -2,14 +2,9 @@
Post-processing effects applied to the pixel canvas (`numpy uint8 array, shape (H,W,3)`) after character rendering and before encoding. Also covers **pixel-level blend modes**, **feedback buffers**, and the **ShaderChain** compositor.
**Cross-references:**
- Grid system, palettes, color (HSV + OKLAB): `architecture.md`
- Effect building blocks (value fields, noise, SDFs): `effects.md`
- `_render_vf()`, blend modes, tonemap, masking: `composition.md`
- Scene protocol, render_clip, SCENES table: `scenes.md`
- Complete scene examples with shader usage: `examples.md`
- Performance tuning (frame budget, worker count): `optimization.md`
- Encoding pitfalls (ffmpeg flags, color space): `troubleshooting.md`
> **See also:** composition.md (blend modes, tonemap) · effects.md · scenes.md · architecture.md · optimization.md · troubleshooting.md
>
> **Blend modes:** For the 20 pixel blend modes and `blend_canvas()`, see `composition.md`. All blending uses `blend_canvas(base, top, mode, opacity)`.
## Design Philosophy
@@ -1,14 +1,19 @@
# Troubleshooting Reference
**Cross-references:**
- Grid system, palettes, font selection: `architecture.md`
- Effect building blocks (value fields, noise, SDFs): `effects.md`
- `_render_vf()`, blend modes, tonemap: `composition.md`
- Scene protocol, render_clip, SCENES table: `scenes.md`
- Shader pipeline, feedback buffer, encoding: `shaders.md`
- Input sources (audio, video, TTS): `inputs.md`
- Performance tuning, hardware detection: `optimization.md`
- Complete scene examples: `examples.md`
> **See also:** composition.md · architecture.md · shaders.md · scenes.md · optimization.md
## Quick Diagnostic
| Symptom | Likely Cause | Fix |
|---------|-------------|-----|
| All black output | tonemap gamma too high or no effects rendering | Lower gamma to 0.5, check scene_fn returns non-zero canvas |
| Washed out / too bright | Linear brightness multiplier instead of tonemap | Replace `canvas * N` with `tonemap(canvas, gamma=0.75)` |
| ffmpeg hangs mid-render | stderr=subprocess.PIPE deadlock | Redirect stderr to file |
| "read-only" array error | broadcast_to view without .copy() | Add `.copy()` after broadcast_to |
| PicklingError | Lambda or closure in SCENES table | Define all fx_* at module level |
| Random dark holes in output | Font missing Unicode glyphs | Validate palettes at init |
| Audio-visual desync | Frame timing accumulation | Use integer frame counter, compute t fresh each frame |
| Single-color flat output | Hue field shape mismatch | Ensure h,s,v arrays all (rows,cols) before hsv2rgb |
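
The "integer frame counter" fix for audio-visual desync can be sketched as follows. This is an illustration of the pattern only; the function names are hypothetical and the real render loop is not shown in this section:

```python
def frame_time(frame_index, fps):
    """Timestamp computed fresh from the integer frame counter."""
    return frame_index / fps


def accumulated_time(n_frames, fps):
    """Anti-pattern: repeatedly adding 1/fps lets rounding error build up."""
    t = 0.0
    for _ in range(n_frames):
        t += 1.0 / fps
    return t
```

With plain float addition the drift is tiny, but the same accumulating pattern with rounded or variable steps drifts faster. Deriving `t` from the frame index keeps every frame's timestamp independent of the frames before it.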
Common bugs, gotchas, and platform-specific issues encountered during ASCII video development.
@@ -339,3 +344,22 @@ val = np.clip(vf_plasma(g, f, t, S) * 1.5, 0, 1)
```
The `_render_vf()` helper clips automatically, but if you're building custom scenes, clip explicitly.
## Brightness Best Practices
- Dense animated backgrounds — never flat black, always fill the grid
- Vignette minimum clamped to 0.15 (not 0.12)
- Bloom threshold 130 (not 170) so more pixels contribute to glow
- Use `screen` blend mode (not `overlay`) for dark ASCII layers — overlay squares dark values: `2 * 0.12 * 0.12 = 0.03`
- FeedbackBuffer decay minimum 0.5 — below that, feedback disappears too fast to see
- Value field floor: `vf * 0.8 + 0.05` ensures no cell is truly zero
- Per-scene gamma overrides: default 0.75, solarize 0.55, posterize 0.50, bright scenes 0.85
- Test frames early: render single frames at key timestamps before committing to full render
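
The `screen` versus `overlay` arithmetic above can be checked directly. The sketch below uses the standard per-channel formulas on values normalized to 0..1; it assumes `blend_canvas()` follows these textbook definitions, which this section does not confirm:

```python
def screen(a, b):
    """Screen: inverts, multiplies, inverts again, so the result never gets darker."""
    return 1.0 - (1.0 - a) * (1.0 - b)


def overlay(a, b):
    """Overlay: multiplies when the base is dark, screens when it is bright."""
    if a < 0.5:
        return 2.0 * a * b
    return 1.0 - 2.0 * (1.0 - a) * (1.0 - b)


dark = 0.12
print(round(overlay(dark, dark), 4))  # 0.0288
print(round(screen(dark, dark), 4))   # 0.2256
```

Two layers at 0.12 come out near 0.03 under overlay but near 0.23 under screen, which is why screen suits dark ASCII layers.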
**Quick checklist before full render:**
1. Render 3 test frames (start, middle, end)
2. Check `canvas.mean() > 8` after tonemap
3. Check no scene is visually flat black
4. Verify per-section variation (different bg/palette/color per scene)
5. Confirm shader chain includes bloom (threshold 130)
6. Confirm vignette strength ≤ 0.25
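
Checklist items 1 to 3 can be partly automated. A minimal sketch, assuming test frames are available as tonemapped `uint8` arrays of shape `(H, W, 3)`; `check_test_frames` is a hypothetical helper, and the threshold of 8 comes from item 2:

```python
import numpy as np


def check_test_frames(frames, min_mean=8.0):
    """Return indices of frames whose mean brightness is at or below min_mean.

    frames: iterable of uint8 arrays shaped (H, W, 3), already tonemapped.
    """
    return [i for i, canvas in enumerate(frames)
            if float(canvas.mean()) <= min_mean]
```

An empty result means every sampled frame cleared the brightness floor. It does not replace looking at the frames, since a frame can pass the mean check and still be visually flat.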
+2
@@ -114,6 +114,7 @@ curl -s "https://export.arxiv.org/api/query?id_list=2402.03300,2401.12345,2403.0
After fetching metadata for a paper, generate a BibTeX entry:
{% raw %}
```bash
curl -s "https://export.arxiv.org/api/query?id_list=1706.03762" | python3 -c "
import sys, xml.etree.ElementTree as ET
@@ -139,6 +140,7 @@ print(f' url = {{https://arxiv.org/abs/{raw_id}}}')
print('}')
"
```
{% endraw %}
## Reading Paper Content
@@ -215,6 +215,7 @@ def generate_citation_key(bibtex: str) -> str:
### Complete Citation Manager Class
{% raw %}
```python
"""
Citation Manager - Verified citation workflow for ML papers.
@@ -377,6 +378,7 @@ if __name__ == "__main__":
if bibtex:
print(bibtex)
```
{% endraw %}
### Quick Functions
+94
@@ -295,3 +295,97 @@ class TestOnConnect:
mock_conn = MagicMock(spec=acp.Client)
agent.on_connect(mock_conn)
assert agent._conn is mock_conn
# ---------------------------------------------------------------------------
# Slash commands
# ---------------------------------------------------------------------------
class TestSlashCommands:
"""Test slash command dispatch in the ACP adapter."""
def _make_state(self, mock_manager):
state = mock_manager.create_session(cwd="/tmp")
state.agent.model = "test-model"
state.agent.provider = "openrouter"
state.model = "test-model"
return state
def test_help_lists_commands(self, agent, mock_manager):
state = self._make_state(mock_manager)
result = agent._handle_slash_command("/help", state)
assert result is not None
assert "/help" in result
assert "/model" in result
assert "/tools" in result
assert "/reset" in result
def test_model_shows_current(self, agent, mock_manager):
state = self._make_state(mock_manager)
result = agent._handle_slash_command("/model", state)
assert "test-model" in result
def test_context_empty(self, agent, mock_manager):
state = self._make_state(mock_manager)
state.history = []
result = agent._handle_slash_command("/context", state)
assert "empty" in result.lower()
def test_context_with_messages(self, agent, mock_manager):
state = self._make_state(mock_manager)
state.history = [
{"role": "user", "content": "hello"},
{"role": "assistant", "content": "hi"},
]
result = agent._handle_slash_command("/context", state)
assert "2 messages" in result
assert "user: 1" in result
def test_reset_clears_history(self, agent, mock_manager):
state = self._make_state(mock_manager)
state.history = [{"role": "user", "content": "hello"}]
result = agent._handle_slash_command("/reset", state)
assert "cleared" in result.lower()
assert len(state.history) == 0
def test_version(self, agent, mock_manager):
state = self._make_state(mock_manager)
result = agent._handle_slash_command("/version", state)
assert HERMES_VERSION in result
def test_unknown_command_returns_none(self, agent, mock_manager):
state = self._make_state(mock_manager)
result = agent._handle_slash_command("/nonexistent", state)
assert result is None
@pytest.mark.asyncio
async def test_slash_command_intercepted_in_prompt(self, agent, mock_manager):
"""Slash commands should be handled without calling the LLM."""
new_resp = await agent.new_session(cwd="/tmp")
mock_conn = AsyncMock(spec=acp.Client)
agent._conn = mock_conn
prompt = [TextContentBlock(type="text", text="/help")]
resp = await agent.prompt(prompt=prompt, session_id=new_resp.session_id)
assert resp.stop_reason == "end_turn"
mock_conn.session_update.assert_called_once()
@pytest.mark.asyncio
async def test_unknown_slash_falls_through_to_llm(self, agent, mock_manager):
"""Unknown /commands should be sent to the LLM, not intercepted."""
new_resp = await agent.new_session(cwd="/tmp")
mock_conn = AsyncMock(spec=acp.Client)
agent._conn = mock_conn
# Mock run_in_executor to avoid actually running the agent
with patch("asyncio.get_running_loop") as mock_loop:
mock_loop.return_value.run_in_executor = AsyncMock(return_value={
"final_response": "I processed /foo",
"messages": [],
})
prompt = [TextContentBlock(type="text", text="/foo bar")]
resp = await agent.prompt(prompt=prompt, session_id=new_resp.session_id)
assert resp.stop_reason == "end_turn"
+123
@@ -0,0 +1,123 @@
"""Tests for get_tool_emoji in agent/display.py — skin + registry integration."""
from unittest.mock import patch as mock_patch, MagicMock
from agent.display import get_tool_emoji
class TestGetToolEmoji:
"""Verify the skin → registry → fallback resolution chain."""
def test_returns_registry_emoji_when_no_skin(self):
"""Registry-registered emoji is used when no skin is active."""
import sys
# get_tool_emoji imports tools.registry lazily, so stub the module in sys.modules
mock_reg = MagicMock()
mock_reg.get_emoji.return_value = "📖"
mock_module = MagicMock()
mock_module.registry = mock_reg
with mock_patch("agent.display._get_skin", return_value=None), \
mock_patch.dict(sys.modules, {"tools.registry": mock_module}):
result = get_tool_emoji("read_file")
assert result == "📖"
def test_skin_override_takes_precedence(self):
"""Skin tool_emojis override registry defaults."""
skin = MagicMock()
skin.tool_emojis = {"terminal": ""}
with mock_patch("agent.display._get_skin", return_value=skin):
result = get_tool_emoji("terminal")
assert result == ""
def test_skin_empty_dict_falls_through(self):
"""Empty skin tool_emojis falls through to registry."""
skin = MagicMock()
skin.tool_emojis = {}
mock_reg = MagicMock()
mock_reg.get_emoji.return_value = "💻"
import sys
mock_module = MagicMock()
mock_module.registry = mock_reg
with mock_patch("agent.display._get_skin", return_value=skin), \
mock_patch.dict(sys.modules, {"tools.registry": mock_module}):
result = get_tool_emoji("terminal")
assert result == "💻"
def test_fallback_default(self):
"""When neither skin nor registry has an emoji, use the default."""
skin = MagicMock()
skin.tool_emojis = {}
mock_reg = MagicMock()
mock_reg.get_emoji.return_value = ""
import sys
mock_module = MagicMock()
mock_module.registry = mock_reg
with mock_patch("agent.display._get_skin", return_value=skin), \
mock_patch.dict(sys.modules, {"tools.registry": mock_module}):
result = get_tool_emoji("unknown_tool")
assert result == ""
def test_custom_default(self):
"""Custom default is returned when nothing matches."""
with mock_patch("agent.display._get_skin", return_value=None):
mock_reg = MagicMock()
mock_reg.get_emoji.return_value = ""
import sys
mock_module = MagicMock()
mock_module.registry = mock_reg
with mock_patch.dict(sys.modules, {"tools.registry": mock_module}):
result = get_tool_emoji("x", default="⚙️")
assert result == "⚙️"
def test_skin_override_only_for_matching_tool(self):
"""Skin override for one tool doesn't affect others."""
skin = MagicMock()
skin.tool_emojis = {"terminal": ""}
mock_reg = MagicMock()
mock_reg.get_emoji.return_value = "🔍"
import sys
mock_module = MagicMock()
mock_module.registry = mock_reg
with mock_patch("agent.display._get_skin", return_value=skin), \
mock_patch.dict(sys.modules, {"tools.registry": mock_module}):
assert get_tool_emoji("terminal") == "" # skin override
assert get_tool_emoji("web_search") == "🔍" # registry fallback
class TestSkinConfigToolEmojis:
"""Verify SkinConfig handles tool_emojis field correctly."""
def test_skin_config_has_tool_emojis_field(self):
from hermes_cli.skin_engine import SkinConfig
skin = SkinConfig(name="test")
assert skin.tool_emojis == {}
def test_skin_config_accepts_tool_emojis(self):
from hermes_cli.skin_engine import SkinConfig
emojis = {"terminal": "", "web_search": "🔮"}
skin = SkinConfig(name="test", tool_emojis=emojis)
assert skin.tool_emojis == emojis
def test_build_skin_config_includes_tool_emojis(self):
from hermes_cli.skin_engine import _build_skin_config
data = {
"name": "custom",
"tool_emojis": {"terminal": "🗡️", "patch": "⚒️"},
}
skin = _build_skin_config(data)
assert skin.tool_emojis == {"terminal": "🗡️", "patch": "⚒️"}
def test_build_skin_config_empty_tool_emojis_default(self):
from hermes_cli.skin_engine import _build_skin_config
data = {"name": "minimal"}
skin = _build_skin_config(data)
assert skin.tool_emojis == {}
+61
@@ -0,0 +1,61 @@
from agent.smart_model_routing import choose_cheap_model_route
_BASE_CONFIG = {
"enabled": True,
"cheap_model": {
"provider": "openrouter",
"model": "google/gemini-2.5-flash",
},
}
def test_returns_none_when_disabled():
cfg = {**_BASE_CONFIG, "enabled": False}
assert choose_cheap_model_route("what time is it in tokyo?", cfg) is None
def test_routes_short_simple_prompt():
result = choose_cheap_model_route("what time is it in tokyo?", _BASE_CONFIG)
assert result is not None
assert result["provider"] == "openrouter"
assert result["model"] == "google/gemini-2.5-flash"
assert result["routing_reason"] == "simple_turn"
def test_skips_long_prompt():
prompt = "please summarize this carefully " * 20
assert choose_cheap_model_route(prompt, _BASE_CONFIG) is None
def test_skips_code_like_prompt():
prompt = "debug this traceback: ```python\nraise ValueError('bad')\n```"
assert choose_cheap_model_route(prompt, _BASE_CONFIG) is None
def test_skips_tool_heavy_prompt_keywords():
prompt = "implement a patch for this docker error"
assert choose_cheap_model_route(prompt, _BASE_CONFIG) is None
def test_resolve_turn_route_falls_back_to_primary_when_route_runtime_cannot_be_resolved(monkeypatch):
from agent.smart_model_routing import resolve_turn_route
monkeypatch.setattr(
"hermes_cli.runtime_provider.resolve_runtime_provider",
lambda **kwargs: (_ for _ in ()).throw(RuntimeError("bad route")),
)
result = resolve_turn_route(
"what time is it in tokyo?",
_BASE_CONFIG,
{
"model": "anthropic/claude-sonnet-4",
"provider": "openrouter",
"base_url": "https://openrouter.ai/api/v1",
"api_mode": "chat_completions",
"api_key": "sk-primary",
},
)
assert result["model"] == "anthropic/claude-sonnet-4"
assert result["runtime"]["provider"] == "openrouter"
assert result["label"] is None
+6
@@ -26,6 +26,12 @@ def _isolate_hermes_home(tmp_path, monkeypatch):
(fake_home / "memories").mkdir()
(fake_home / "skills").mkdir()
monkeypatch.setenv("HERMES_HOME", str(fake_home))
# Reset plugin singleton so tests don't leak plugins from ~/.hermes/plugins/
try:
import hermes_cli.plugins as _plugins_mod
monkeypatch.setattr(_plugins_mod, "_plugin_manager", None)
except Exception:
pass
# Tests should not inherit the agent's current gateway/messaging surface.
# Individual tests that need gateway behavior set these explicitly.
monkeypatch.delenv("HERMES_SESSION_PLATFORM", raising=False)
+20 -3
@@ -304,17 +304,34 @@ class TestMarkJobRun:
class TestGetDueJobs:
-def test_past_due_returned(self, tmp_cron_dir):
def test_past_due_within_window_returned(self, tmp_cron_dir):
"""Jobs less than 2 minutes late are still considered due (not stale)."""
job = create_job(prompt="Due now", schedule="every 1h")
-# Force next_run_at to the past
# Force next_run_at to just 1 minute ago (within the 2-min window)
jobs = load_jobs()
-jobs[0]["next_run_at"] = (datetime.now() - timedelta(minutes=5)).isoformat()
jobs[0]["next_run_at"] = (datetime.now() - timedelta(seconds=60)).isoformat()
save_jobs(jobs)
due = get_due_jobs()
assert len(due) == 1
assert due[0]["id"] == job["id"]
def test_stale_past_due_skipped(self, tmp_cron_dir):
"""Recurring jobs more than 2 minutes late are fast-forwarded, not fired."""
job = create_job(prompt="Stale", schedule="every 1h")
# Force next_run_at to 5 minutes ago (beyond the 2-min window)
jobs = load_jobs()
jobs[0]["next_run_at"] = (datetime.now() - timedelta(minutes=5)).isoformat()
save_jobs(jobs)
due = get_due_jobs()
assert len(due) == 0
# next_run_at should be fast-forwarded to the future
updated = get_job(job["id"])
from cron.jobs import _ensure_aware, _hermes_now
next_dt = _ensure_aware(datetime.fromisoformat(updated["next_run_at"]))
assert next_dt > _hermes_now()
def test_future_not_returned(self, tmp_cron_dir):
create_job(prompt="Not yet", schedule="every 1h")
due = get_due_jobs()
+21 -5
@@ -65,6 +65,14 @@ class TestHandleBackgroundCommand:
assert "Usage:" in result
assert "/background" in result
@pytest.mark.asyncio
async def test_bg_alias_no_prompt_shows_usage(self):
"""Running /bg with no prompt shows usage."""
runner = _make_runner()
event = _make_event(text="/bg")
result = await runner._handle_background_command(event)
assert "Usage:" in result
@pytest.mark.asyncio
async def test_empty_prompt_shows_usage(self):
"""Running /background with only whitespace shows usage."""
@@ -264,11 +272,14 @@ class TestBackgroundInHelp:
assert "/background" in result
def test_background_is_known_command(self):
-"""The /background command is in the _known_commands set."""
-from gateway.run import GatewayRunner
-import inspect
-source = inspect.getsource(GatewayRunner._handle_message)
-assert '"background"' in source
"""The /background command is in GATEWAY_KNOWN_COMMANDS."""
from hermes_cli.commands import GATEWAY_KNOWN_COMMANDS
assert "background" in GATEWAY_KNOWN_COMMANDS
def test_bg_alias_is_known_command(self):
"""The /bg alias is in GATEWAY_KNOWN_COMMANDS."""
from hermes_cli.commands import GATEWAY_KNOWN_COMMANDS
assert "bg" in GATEWAY_KNOWN_COMMANDS
# ---------------------------------------------------------------------------
@@ -284,6 +295,11 @@ class TestBackgroundInCLICommands:
from hermes_cli.commands import COMMANDS
assert "/background" in COMMANDS
def test_bg_alias_in_commands_dict(self):
"""The /bg alias is in the COMMANDS dict."""
from hermes_cli.commands import COMMANDS
assert "/bg" in COMMANDS
def test_background_in_session_category(self):
"""The /background command is in the Session category."""
from hermes_cli.commands import COMMANDS_BY_CATEGORY
+22
@@ -83,6 +83,14 @@ class TestSessionResetPolicy:
assert policy.at_hour == 4
assert policy.idle_minutes == 1440
def test_from_dict_treats_null_values_as_defaults(self):
restored = SessionResetPolicy.from_dict(
{"mode": None, "at_hour": None, "idle_minutes": None}
)
assert restored.mode == "both"
assert restored.at_hour == 4
assert restored.idle_minutes == 1440
class TestGatewayConfigRoundtrip:
def test_full_roundtrip(self):
@@ -96,6 +104,7 @@ class TestGatewayConfigRoundtrip:
},
reset_triggers=["/new"],
quick_commands={"limits": {"type": "exec", "command": "echo ok"}},
group_sessions_per_user=False,
)
d = config.to_dict()
restored = GatewayConfig.from_dict(d)
@@ -104,6 +113,7 @@ class TestGatewayConfigRoundtrip:
assert restored.platforms[Platform.TELEGRAM].token == "tok_123"
assert restored.reset_triggers == ["/new"]
assert restored.quick_commands == {"limits": {"type": "exec", "command": "echo ok"}}
assert restored.group_sessions_per_user is False
class TestLoadGatewayConfig:
@@ -125,6 +135,18 @@ class TestLoadGatewayConfig:
assert config.quick_commands == {"limits": {"type": "exec", "command": "echo ok"}}
def test_bridges_group_sessions_per_user_from_config_yaml(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
config_path = hermes_home / "config.yaml"
config_path.write_text("group_sessions_per_user: false\n", encoding="utf-8")
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
config = load_gateway_config()
assert config.group_sessions_per_user is False
def test_invalid_quick_commands_in_config_yaml_are_ignored(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
@@ -0,0 +1,83 @@
"""Tests for Discord thread participation persistence.
Verifies that _bot_participated_threads survives adapter restarts by
being persisted to ~/.hermes/discord_threads.json.
"""
import json
import os
from unittest.mock import patch
import pytest
class TestDiscordThreadPersistence:
"""Thread IDs are saved to disk and reloaded on init."""
def _make_adapter(self, tmp_path):
"""Build a minimal DiscordAdapter with HERMES_HOME pointed at tmp_path."""
from gateway.config import PlatformConfig
from gateway.platforms.discord import DiscordAdapter
config = PlatformConfig(enabled=True, token="test-token")
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
return DiscordAdapter(config=config)
def test_starts_empty_when_no_state_file(self, tmp_path):
adapter = self._make_adapter(tmp_path)
assert adapter._bot_participated_threads == set()
def test_track_thread_persists_to_disk(self, tmp_path):
adapter = self._make_adapter(tmp_path)
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
adapter._track_thread("111")
adapter._track_thread("222")
state_file = tmp_path / "discord_threads.json"
assert state_file.exists()
saved = json.loads(state_file.read_text())
assert set(saved) == {"111", "222"}
def test_threads_survive_restart(self, tmp_path):
"""Threads tracked by one adapter instance are visible to the next."""
adapter1 = self._make_adapter(tmp_path)
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
adapter1._track_thread("aaa")
adapter1._track_thread("bbb")
adapter2 = self._make_adapter(tmp_path)
assert "aaa" in adapter2._bot_participated_threads
assert "bbb" in adapter2._bot_participated_threads
def test_duplicate_track_does_not_double_save(self, tmp_path):
adapter = self._make_adapter(tmp_path)
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
adapter._track_thread("111")
adapter._track_thread("111") # no-op
saved = json.loads((tmp_path / "discord_threads.json").read_text())
assert saved.count("111") == 1
def test_caps_at_max_tracked_threads(self, tmp_path):
adapter = self._make_adapter(tmp_path)
adapter._MAX_TRACKED_THREADS = 5
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
for i in range(10):
adapter._track_thread(str(i))
assert len(adapter._bot_participated_threads) == 5
def test_corrupted_state_file_falls_back_to_empty(self, tmp_path):
state_file = tmp_path / "discord_threads.json"
state_file.write_text("not valid json{{{")
adapter = self._make_adapter(tmp_path)
assert adapter._bot_participated_threads == set()
def test_missing_hermes_home_does_not_crash(self, tmp_path):
"""Load/save tolerate missing directories."""
fake_home = tmp_path / "nonexistent" / "deep"
with patch.dict(os.environ, {"HERMES_HOME": str(fake_home)}):
from gateway.platforms.discord import DiscordAdapter
# _load should return empty set, not crash
threads = DiscordAdapter._load_participated_threads()
assert threads == set()
+317
@@ -0,0 +1,317 @@
"""
Tests for extract_local_files() auto-detection of bare local file paths
in model response text for native media delivery.
Covers: path matching, code-block exclusion, URL rejection, tilde expansion,
deduplication, text cleanup, and extension routing.
Based on PR #1636 by sudoingX (salvaged + hardened).
"""
import os
from unittest.mock import patch
import pytest
from gateway.platforms.base import BasePlatformAdapter
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _extract(content: str, existing_files: set[str] | None = None):
"""
Run extract_local_files with os.path.isfile mocked to return True
for any path in *existing_files* (expanded form). If *existing_files*
is None every path passes.
"""
existing = existing_files
def fake_isfile(p):
if existing is None:
return True
return p in existing
def fake_expanduser(p):
if p.startswith("~/"):
return "/home/user" + p[1:]
return p
with patch("os.path.isfile", side_effect=fake_isfile), \
patch("os.path.expanduser", side_effect=fake_expanduser):
return BasePlatformAdapter.extract_local_files(content)
# ---------------------------------------------------------------------------
# Basic detection
# ---------------------------------------------------------------------------
class TestBasicDetection:
def test_absolute_path_image(self):
paths, cleaned = _extract("Here is the screenshot /root/screenshots/game.png enjoy")
assert paths == ["/root/screenshots/game.png"]
assert "/root/screenshots/game.png" not in cleaned
assert "Here is the screenshot" in cleaned
def test_tilde_path_image(self):
paths, cleaned = _extract("Check out ~/photos/cat.jpg for the cat")
assert paths == ["/home/user/photos/cat.jpg"]
assert "~/photos/cat.jpg" not in cleaned
def test_video_extensions(self):
for ext in (".mp4", ".mov", ".avi", ".mkv", ".webm"):
text = f"Video at /tmp/clip{ext} here"
paths, _ = _extract(text)
assert len(paths) == 1, f"Failed for {ext}"
assert paths[0] == f"/tmp/clip{ext}"
def test_image_extensions(self):
for ext in (".png", ".jpg", ".jpeg", ".gif", ".webp"):
text = f"Image at /tmp/pic{ext} here"
paths, _ = _extract(text)
assert len(paths) == 1, f"Failed for {ext}"
assert paths[0] == f"/tmp/pic{ext}"
def test_case_insensitive_extension(self):
paths, _ = _extract("See /tmp/PHOTO.PNG and /tmp/vid.MP4 now")
assert len(paths) == 2
def test_multiple_paths(self):
text = "First /tmp/a.png then /tmp/b.jpg and /tmp/c.mp4 done"
paths, cleaned = _extract(text)
assert len(paths) == 3
assert "/tmp/a.png" in paths
assert "/tmp/b.jpg" in paths
assert "/tmp/c.mp4" in paths
for p in paths:
assert p not in cleaned
def test_path_at_line_start(self):
paths, _ = _extract("/var/data/image.png")
assert paths == ["/var/data/image.png"]
def test_path_at_end_of_line(self):
paths, _ = _extract("saved to /var/data/image.png")
assert paths == ["/var/data/image.png"]
def test_path_with_dots_in_directory(self):
paths, _ = _extract("See /opt/my.app/assets/logo.png here")
assert paths == ["/opt/my.app/assets/logo.png"]
def test_path_with_hyphens(self):
paths, _ = _extract("File at /tmp/my-screenshot-2024.png done")
assert paths == ["/tmp/my-screenshot-2024.png"]
# ---------------------------------------------------------------------------
# Non-existent files are skipped
# ---------------------------------------------------------------------------
class TestIsfileGuard:
def test_nonexistent_path_skipped(self):
"""Paths that don't exist on disk are not extracted."""
paths, cleaned = _extract(
"See /tmp/nope.png here",
existing_files=set(), # nothing exists
)
assert paths == []
assert "/tmp/nope.png" in cleaned # not stripped
def test_only_existing_paths_extracted(self):
"""Mix of existing and non-existing — only existing are returned."""
paths, cleaned = _extract(
"A /tmp/real.png and /tmp/fake.jpg end",
existing_files={"/tmp/real.png"},
)
assert paths == ["/tmp/real.png"]
assert "/tmp/real.png" not in cleaned
assert "/tmp/fake.jpg" in cleaned
# ---------------------------------------------------------------------------
# URL false-positive prevention
# ---------------------------------------------------------------------------
class TestURLRejection:
def test_https_url_not_matched(self):
"""Paths embedded in HTTP URLs must not be extracted."""
paths, cleaned = _extract("Visit https://example.com/images/photo.png for details")
# The regex lookbehind must reject the URL's path segment on its own.
# In production, isfile would also return False for /images/photo.png,
# but _extract mocks isfile to return True for every path here, so the
# lookbehind is the only guard this test exercises.
assert paths == []
assert "https://example.com/images/photo.png" in cleaned
def test_http_url_not_matched(self):
paths, _ = _extract("See http://cdn.example.com/assets/banner.jpg here")
assert paths == []
def test_file_url_not_matched(self):
paths, _ = _extract("Open file:///home/user/doc.png in browser")
# file:// has :// before /home so lookbehind blocks it
assert paths == []
# ---------------------------------------------------------------------------
# Code block exclusion
# ---------------------------------------------------------------------------
class TestCodeBlockExclusion:
def test_fenced_code_block_skipped(self):
text = "Here's how:\n```python\nimg = open('/tmp/image.png')\n```\nDone."
paths, cleaned = _extract(text)
assert paths == []
assert "/tmp/image.png" in cleaned # not stripped
def test_inline_code_skipped(self):
text = "Use the path `/tmp/image.png` in your config"
paths, cleaned = _extract(text)
assert paths == []
assert "`/tmp/image.png`" in cleaned
def test_path_outside_code_block_still_matched(self):
text = (
"```\ncode: /tmp/inside.png\n```\n"
"But this one is real: /tmp/outside.png"
)
paths, _ = _extract(text, existing_files={"/tmp/outside.png"})
assert paths == ["/tmp/outside.png"]
def test_mixed_inline_code_and_bare_path(self):
text = "Config uses `/etc/app/bg.png` but output is /tmp/result.jpg"
paths, cleaned = _extract(text, existing_files={"/tmp/result.jpg"})
assert paths == ["/tmp/result.jpg"]
assert "`/etc/app/bg.png`" in cleaned
assert "/tmp/result.jpg" not in cleaned
def test_multiline_fenced_block(self):
text = (
"```bash\n"
"cp /source/a.png /dest/b.png\n"
"mv /source/c.mp4 /dest/d.mp4\n"
"```\n"
"Files are ready."
)
paths, _ = _extract(text)
assert paths == []
# ---------------------------------------------------------------------------
# Deduplication
# ---------------------------------------------------------------------------
class TestDeduplication:
def test_duplicate_paths_deduplicated(self):
text = "See /tmp/img.png and also /tmp/img.png again"
paths, _ = _extract(text)
assert paths == ["/tmp/img.png"]
def test_tilde_and_expanded_same_file(self):
"""~/photos/a.png and /home/user/photos/a.png are the same file."""
text = "See ~/photos/a.png and /home/user/photos/a.png here"
paths, _ = _extract(text, existing_files={"/home/user/photos/a.png"})
assert len(paths) == 1
assert paths[0] == "/home/user/photos/a.png"
# ---------------------------------------------------------------------------
# Text cleanup
# ---------------------------------------------------------------------------
class TestTextCleanup:
def test_path_removed_from_text(self):
paths, cleaned = _extract("Before /tmp/x.png after")
assert "Before" in cleaned
assert "after" in cleaned
assert "/tmp/x.png" not in cleaned
def test_excessive_blank_lines_collapsed(self):
text = "Before\n\n\n/tmp/x.png\n\n\nAfter"
_, cleaned = _extract(text)
assert "\n\n\n" not in cleaned
def test_no_paths_text_unchanged(self):
text = "This is a normal response with no file paths."
paths, cleaned = _extract(text)
assert paths == []
assert cleaned == text
def test_tilde_form_cleaned_from_text(self):
"""The raw ~/... form should be removed, not the expanded /home/user/... form."""
text = "Output saved to ~/result.png for review"
paths, cleaned = _extract(text)
assert paths == ["/home/user/result.png"]
assert "~/result.png" not in cleaned
def test_only_path_in_text(self):
"""If the response is just a path, cleaned text is empty."""
paths, cleaned = _extract("/tmp/screenshot.png")
assert paths == ["/tmp/screenshot.png"]
assert cleaned == ""
# ---------------------------------------------------------------------------
# Edge cases
# ---------------------------------------------------------------------------
class TestEdgeCases:
def test_empty_string(self):
paths, cleaned = _extract("")
assert paths == []
assert cleaned == ""
def test_no_media_extensions(self):
"""Non-media extensions should not be matched."""
paths, _ = _extract("See /tmp/data.csv and /tmp/script.py and /tmp/notes.txt")
assert paths == []
def test_path_with_spaces_not_matched(self):
"""Paths with spaces are intentionally not matched (avoids false positives)."""
paths, _ = _extract("File at /tmp/my file.png here")
assert paths == []
def test_windows_path_not_matched(self):
"""Windows-style paths should not match."""
paths, _ = _extract("See C:\\Users\\test\\image.png")
assert paths == []
def test_relative_path_not_matched(self):
"""Relative paths like ./image.png should not match."""
paths, _ = _extract("File at ./screenshots/image.png here")
assert paths == []
def test_bare_filename_not_matched(self):
"""Just 'image.png' without a path should not match."""
paths, _ = _extract("Open image.png to see")
assert paths == []
def test_path_followed_by_punctuation(self):
"""Path followed by comma, period, paren should still match."""
for suffix in [",", ".", ")", ":", ";"]:
text = f"See /tmp/img.png{suffix} details"
paths, _ = _extract(text)
assert len(paths) == 1, f"Failed with suffix '{suffix}'"
def test_path_in_parentheses(self):
paths, _ = _extract("(see /tmp/img.png)")
assert paths == ["/tmp/img.png"]
def test_path_in_quotes(self):
paths, _ = _extract('The file is "/tmp/img.png" right here')
assert paths == ["/tmp/img.png"]
def test_deep_nested_path(self):
paths, _ = _extract("At /a/b/c/d/e/f/g/h/image.png end")
assert paths == ["/a/b/c/d/e/f/g/h/image.png"]
if __name__ == "__main__":
pytest.main([__file__, "-v"])
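The behaviours these tests pin down (URL rejection via a lookbehind, code-span exclusion, the isfile guard, deduplication, and blank-line cleanup) can be sketched in one small function. This is a minimal illustration, not the gateway's implementation: the function name `extract_media_paths`, the extension list, and both regexes are assumptions made for the example.

```python
import os
import re

# Hypothetical extension list; the real set is not shown in this excerpt.
MEDIA_EXTS = ("png", "jpg", "jpeg", "gif", "webp", "mp4", "mov")

# The lookbehind rejects a path start preceded by a word character, '/', ':',
# '.', '~' or '-', which rules out URL path segments and relative paths.
_PATH_RE = re.compile(
    r"(?<![\w/:.~-])(~?/(?:[\w.\-]+/)*[\w.\-]+\.(?:%s))\b" % "|".join(MEDIA_EXTS),
    re.IGNORECASE,
)
# Fenced blocks first, then single-line inline code spans.
_CODE_RE = re.compile(r"```.*?```|`[^`\n]+`", re.DOTALL)


def extract_media_paths(text, isfile=os.path.isfile):
    """Return (paths, cleaned_text) for bare media paths found in text."""
    code_spans = [m.span() for m in _CODE_RE.finditer(text)]

    def in_code(pos):
        return any(start <= pos < end for start, end in code_spans)

    found, to_strip = [], []
    for m in _PATH_RE.finditer(text):
        if in_code(m.start()):
            continue  # paths inside code are left untouched
        path = os.path.expanduser(m.group(1))
        if not isfile(path):
            continue  # non-existent files stay in the text
        if path not in found:
            found.append(path)  # deduplicate on the expanded form
        to_strip.append(m.span())

    cleaned = text
    for start, end in reversed(to_strip):
        cleaned = cleaned[:start] + cleaned[end:]
    cleaned = re.sub(r"\n{3,}", "\n\n", cleaned).strip()
    return found, cleaned
```

Passing `isfile` as a parameter mirrors what the tests do with mocking: with `isfile=lambda p: True`, only the regex and the code-span check decide what is extracted.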
+28
@@ -90,6 +90,7 @@ class TestGatewayHonchoLifecycle:
runner = _make_runner()
event = _make_event()
runner._shutdown_gateway_honcho = MagicMock()
runner._async_flush_memories = AsyncMock()
runner.session_store = MagicMock()
runner.session_store._generate_session_key.return_value = "gateway-key"
runner.session_store._entries = {
@@ -100,4 +101,31 @@ class TestGatewayHonchoLifecycle:
result = await runner._handle_reset_command(event)
runner._shutdown_gateway_honcho.assert_called_once_with("gateway-key")
runner._async_flush_memories.assert_called_once_with("old-session", "gateway-key")
assert "Session reset" in result
def test_flush_memories_reuses_gateway_session_key_and_skips_honcho_sync(self):
runner = _make_runner()
runner.session_store = MagicMock()
runner.session_store.load_transcript.return_value = [
{"role": "user", "content": "a"},
{"role": "assistant", "content": "b"},
{"role": "user", "content": "c"},
{"role": "assistant", "content": "d"},
]
tmp_agent = MagicMock()
with (
patch("gateway.run._resolve_runtime_agent_kwargs", return_value={"api_key": "test-key"}),
patch("gateway.run._resolve_gateway_model", return_value="model-name"),
patch("run_agent.AIAgent", return_value=tmp_agent) as mock_agent_cls,
):
runner._flush_memories_for_session("old-session", "gateway-key")
mock_agent_cls.assert_called_once()
_, kwargs = mock_agent_cls.call_args
assert kwargs["session_id"] == "old-session"
assert kwargs["honcho_session_key"] == "gateway-key"
tmp_agent.run_conversation.assert_called_once()
_, run_kwargs = tmp_agent.run_conversation.call_args
assert run_kwargs["sync_honcho"] is False
-25
@@ -1,25 +0,0 @@
from unittest.mock import patch
import pytest
@pytest.mark.asyncio
async def test_image_enrichment_uses_athabasca_upload_guidance_without_stale_r2_warning():
from gateway.run import GatewayRunner
runner = object.__new__(GatewayRunner)
with patch(
"tools.vision_tools.vision_analyze_tool",
return_value='{"success": true, "analysis": "A painted serpent warrior."}',
):
enriched = await runner._enrich_message_with_vision(
"caption",
["/tmp/test.jpg"],
)
assert "R2 not configured" not in enriched
assert "Gateway media URL available for reference" not in enriched
assert "POST /api/uploads" in enriched
assert "Do not store the local cache path" in enriched
assert "caption" in enriched
+156
@@ -0,0 +1,156 @@
"""Tests for PII redaction in gateway session context prompts."""
from gateway.session import (
SessionContext,
SessionSource,
build_session_context_prompt,
_hash_id,
_hash_sender_id,
_hash_chat_id,
_looks_like_phone,
)
from gateway.config import Platform, HomeChannel
# ---------------------------------------------------------------------------
# Low-level helpers
# ---------------------------------------------------------------------------
class TestHashHelpers:
def test_hash_id_deterministic(self):
assert _hash_id("12345") == _hash_id("12345")
def test_hash_id_12_hex_chars(self):
h = _hash_id("user-abc")
assert len(h) == 12
assert all(c in "0123456789abcdef" for c in h)
def test_hash_sender_id_prefix(self):
assert _hash_sender_id("12345").startswith("user_")
assert len(_hash_sender_id("12345")) == 17 # "user_" + 12
def test_hash_chat_id_preserves_prefix(self):
result = _hash_chat_id("telegram:12345")
assert result.startswith("telegram:")
assert "12345" not in result
def test_hash_chat_id_no_prefix(self):
result = _hash_chat_id("12345")
assert len(result) == 12
assert "12345" not in result
def test_looks_like_phone(self):
assert _looks_like_phone("+15551234567")
assert _looks_like_phone("15551234567")
assert _looks_like_phone("+1-555-123-4567")
assert not _looks_like_phone("alice")
assert not _looks_like_phone("user-123")
assert not _looks_like_phone("")
# ---------------------------------------------------------------------------
# Integration: build_session_context_prompt
# ---------------------------------------------------------------------------
def _make_context(
user_id="user-123",
user_name=None,
chat_id="telegram:99999",
platform=Platform.TELEGRAM,
home_channels=None,
):
source = SessionSource(
platform=platform,
chat_id=chat_id,
chat_type="dm",
user_id=user_id,
user_name=user_name,
)
return SessionContext(
source=source,
connected_platforms=[platform],
home_channels=home_channels or {},
)
class TestBuildSessionContextPromptRedaction:
def test_no_redaction_by_default(self):
ctx = _make_context(user_id="user-123")
prompt = build_session_context_prompt(ctx)
assert "user-123" in prompt
def test_user_id_hashed_when_redact_pii(self):
ctx = _make_context(user_id="user-123")
prompt = build_session_context_prompt(ctx, redact_pii=True)
assert "user-123" not in prompt
assert "user_" in prompt # hashed ID present
def test_user_name_not_redacted(self):
ctx = _make_context(user_id="user-123", user_name="Alice")
prompt = build_session_context_prompt(ctx, redact_pii=True)
assert "Alice" in prompt
# user_id should not appear when user_name is present (name takes priority)
assert "user-123" not in prompt
def test_home_channel_id_hashed(self):
hc = {
Platform.TELEGRAM: HomeChannel(
platform=Platform.TELEGRAM,
chat_id="telegram:99999",
name="Home Chat",
)
}
ctx = _make_context(home_channels=hc)
prompt = build_session_context_prompt(ctx, redact_pii=True)
assert "99999" not in prompt
assert "telegram:" in prompt # prefix preserved
assert "Home Chat" in prompt # name not redacted
def test_home_channel_id_preserved_without_redaction(self):
hc = {
Platform.TELEGRAM: HomeChannel(
platform=Platform.TELEGRAM,
chat_id="telegram:99999",
name="Home Chat",
)
}
ctx = _make_context(home_channels=hc)
prompt = build_session_context_prompt(ctx, redact_pii=False)
assert "99999" in prompt
def test_redaction_is_deterministic(self):
ctx = _make_context(user_id="+15551234567")
prompt1 = build_session_context_prompt(ctx, redact_pii=True)
prompt2 = build_session_context_prompt(ctx, redact_pii=True)
assert prompt1 == prompt2
def test_different_ids_produce_different_hashes(self):
ctx1 = _make_context(user_id="user-A")
ctx2 = _make_context(user_id="user-B")
p1 = build_session_context_prompt(ctx1, redact_pii=True)
p2 = build_session_context_prompt(ctx2, redact_pii=True)
assert p1 != p2
def test_discord_ids_not_redacted_even_with_flag(self):
"""Discord needs real IDs for <@user_id> mentions."""
ctx = _make_context(user_id="123456789", platform=Platform.DISCORD)
prompt = build_session_context_prompt(ctx, redact_pii=True)
assert "123456789" in prompt
def test_whatsapp_ids_redacted(self):
ctx = _make_context(user_id="+15551234567", platform=Platform.WHATSAPP)
prompt = build_session_context_prompt(ctx, redact_pii=True)
assert "+15551234567" not in prompt
assert "user_" in prompt
def test_signal_ids_redacted(self):
ctx = _make_context(user_id="+15551234567", platform=Platform.SIGNAL)
prompt = build_session_context_prompt(ctx, redact_pii=True)
assert "+15551234567" not in prompt
assert "user_" in prompt
def test_slack_ids_not_redacted(self):
"""Slack may need IDs for mentions too."""
ctx = _make_context(user_id="U12345ABC", platform=Platform.SLACK)
prompt = build_session_context_prompt(ctx, redact_pii=True)
assert "U12345ABC" in prompt
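The helper contracts checked in `TestHashHelpers` can be illustrated with a short sketch. The hash function (SHA-256), the phone regex, and the un-prefixed function names are assumptions for the example; the excerpt only fixes the observable behaviour (12 hex characters, a `user_` prefix, a preserved platform prefix, and which strings count as phone numbers).

```python
import hashlib
import re


def hash_id(value: str) -> str:
    """Deterministic 12-character hex digest (SHA-256 is an assumption)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def hash_sender_id(value: str) -> str:
    """Hashed sender ID with a readable 'user_' prefix."""
    return "user_" + hash_id(value)


def hash_chat_id(value: str) -> str:
    """Hash only the part after 'platform:', keeping the prefix readable."""
    prefix, sep, rest = value.partition(":")
    if sep:
        return f"{prefix}:{hash_id(rest)}"
    return hash_id(value)


# Hypothetical pattern: optional '+', then at least eight digits, allowing
# hyphens or spaces between them.
_PHONE_RE = re.compile(r"\+?\d[\d\-\s]{6,}\d")


def looks_like_phone(value: str) -> bool:
    return bool(_PHONE_RE.fullmatch(value.strip()))
```

Because the digest is deterministic, the same user maps to the same pseudonym across prompts, which is what `test_redaction_is_deterministic` relies on.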
+25
@@ -199,3 +199,28 @@ class TestHandleResumeCommand:
assert real_key not in runner._running_agents
db.close()
@pytest.mark.asyncio
async def test_resume_flushes_memories_with_gateway_session_key(self, tmp_path):
"""Resume should preserve the gateway session key for Honcho flushes."""
from hermes_state import SessionDB
db = SessionDB(db_path=tmp_path / "state.db")
db.create_session("old_session", "telegram")
db.set_session_title("old_session", "Old Work")
db.create_session("current_session_001", "telegram")
event = _make_event(text="/resume Old Work")
runner = _make_runner(
session_db=db,
current_session_id="current_session_001",
event=event,
)
await runner._handle_resume_command(event)
runner._async_flush_memories.assert_called_once_with(
"current_session_001",
_session_key_for_event(event),
)
db.close()
@@ -0,0 +1,89 @@
import pytest
from gateway.config import GatewayConfig, Platform, PlatformConfig
from gateway.platforms.base import BasePlatformAdapter
from gateway.run import GatewayRunner
from gateway.status import read_runtime_status
class _RetryableFailureAdapter(BasePlatformAdapter):
def __init__(self):
super().__init__(PlatformConfig(enabled=True, token="***"), Platform.TELEGRAM)
async def connect(self) -> bool:
self._set_fatal_error(
"telegram_connect_error",
"Telegram startup failed: temporary DNS resolution failure.",
retryable=True,
)
return False
async def disconnect(self) -> None:
self._mark_disconnected()
async def send(self, chat_id, content, reply_to=None, metadata=None):
raise NotImplementedError
async def get_chat_info(self, chat_id):
return {"id": chat_id}
class _DisabledAdapter(BasePlatformAdapter):
def __init__(self):
super().__init__(PlatformConfig(enabled=False, token="***"), Platform.TELEGRAM)
async def connect(self) -> bool:
raise AssertionError("connect should not be called for disabled platforms")
async def disconnect(self) -> None:
self._mark_disconnected()
async def send(self, chat_id, content, reply_to=None, metadata=None):
raise NotImplementedError
async def get_chat_info(self, chat_id):
return {"id": chat_id}
@pytest.mark.asyncio
async def test_runner_returns_failure_for_retryable_startup_errors(monkeypatch, tmp_path):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
config = GatewayConfig(
platforms={
Platform.TELEGRAM: PlatformConfig(enabled=True, token="***")
},
sessions_dir=tmp_path / "sessions",
)
runner = GatewayRunner(config)
monkeypatch.setattr(runner, "_create_adapter", lambda platform, platform_config: _RetryableFailureAdapter())
ok = await runner.start()
assert ok is False
assert runner.should_exit_cleanly is False
state = read_runtime_status()
assert state["gateway_state"] == "startup_failed"
assert "temporary DNS resolution failure" in state["exit_reason"]
assert state["platforms"]["telegram"]["state"] == "fatal"
assert state["platforms"]["telegram"]["error_code"] == "telegram_connect_error"
@pytest.mark.asyncio
async def test_runner_allows_cron_only_mode_when_no_platforms_are_enabled(monkeypatch, tmp_path):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
config = GatewayConfig(
platforms={
Platform.TELEGRAM: PlatformConfig(enabled=False, token="***")
},
sessions_dir=tmp_path / "sessions",
)
runner = GatewayRunner(config)
ok = await runner.start()
assert ok is True
assert runner.should_exit_cleanly is False
assert runner.adapters == {}
state = read_runtime_status()
assert state["gateway_state"] == "running"
+94
@@ -369,6 +369,54 @@ class TestWhatsAppDMSessionKeyConsistency:
)
assert store._generate_session_key(source) == build_session_key(source)
def test_store_creates_distinct_group_sessions_per_user(self, store):
first = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="alice",
user_name="Alice",
)
second = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="bob",
user_name="Bob",
)
first_entry = store.get_or_create_session(first)
second_entry = store.get_or_create_session(second)
assert first_entry.session_key == "agent:main:discord:group:guild-123:alice"
assert second_entry.session_key == "agent:main:discord:group:guild-123:bob"
assert first_entry.session_id != second_entry.session_id
def test_store_shares_group_sessions_when_disabled_in_config(self, store):
store.config.group_sessions_per_user = False
first = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="alice",
user_name="Alice",
)
second = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="bob",
user_name="Bob",
)
first_entry = store.get_or_create_session(first)
second_entry = store.get_or_create_session(second)
assert first_entry.session_key == "agent:main:discord:group:guild-123"
assert second_entry.session_key == "agent:main:discord:group:guild-123"
assert first_entry.session_id == second_entry.session_id
def test_telegram_dm_includes_chat_id(self):
"""Non-WhatsApp DMs should also include chat_id to separate users."""
source = SessionSource(
@@ -398,6 +446,41 @@ class TestWhatsAppDMSessionKeyConsistency:
key = build_session_key(source)
assert key == "agent:main:discord:group:guild-123"
def test_group_sessions_are_isolated_per_user_when_user_id_present(self):
first = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="alice",
)
second = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="bob",
)
assert build_session_key(first) == "agent:main:discord:group:guild-123:alice"
assert build_session_key(second) == "agent:main:discord:group:guild-123:bob"
assert build_session_key(first) != build_session_key(second)
def test_group_sessions_can_be_shared_when_isolation_disabled(self):
first = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="alice",
)
second = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="bob",
)
assert build_session_key(first, group_sessions_per_user=False) == "agent:main:discord:group:guild-123"
assert build_session_key(second, group_sessions_per_user=False) == "agent:main:discord:group:guild-123"
def test_group_thread_includes_thread_id(self):
"""Forum-style threads need a distinct session key within one group."""
source = SessionSource(
@@ -409,6 +492,17 @@ class TestWhatsAppDMSessionKeyConsistency:
key = build_session_key(source)
assert key == "agent:main:telegram:group:-1002285219667:17585"
def test_group_thread_sessions_are_isolated_per_user(self):
source = SessionSource(
platform=Platform.TELEGRAM,
chat_id="-1002285219667",
chat_type="group",
thread_id="17585",
user_id="42",
)
key = build_session_key(source)
assert key == "agent:main:telegram:group:-1002285219667:17585:42"
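The group key layout asserted above (chat ID, then optional thread ID, then optional user ID) can be sketched as follows. The `Source` dataclass and `session_key` function are hypothetical stand-ins for `SessionSource` and `build_session_key`; DM and WhatsApp-specific handling is deliberately left out because the excerpt does not show it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Source:
    """Hypothetical, trimmed-down stand-in for SessionSource."""
    platform: str
    chat_id: str
    chat_type: str
    user_id: Optional[str] = None
    thread_id: Optional[str] = None


def session_key(source: Source, group_sessions_per_user: bool = True) -> str:
    """Build a colon-separated key for group chats (sketch only)."""
    parts = ["agent", "main", source.platform, source.chat_type, source.chat_id]
    if source.thread_id:
        parts.append(source.thread_id)  # forum-style threads get their own key
    if source.chat_type == "group" and group_sessions_per_user and source.user_id:
        parts.append(source.user_id)  # per-user isolation inside a group
    return ":".join(parts)
```

Appending the user ID last keeps the shared key a strict prefix of every per-user key, so switching `group_sessions_per_user` off simply collapses all participants onto that prefix.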
class TestSessionStoreEntriesAttribute:
"""Regression: /reset must access _entries, not _sessions."""
+81
@@ -0,0 +1,81 @@
"""Tests for SSL certificate auto-detection in gateway/run.py."""
import importlib
import os
from unittest.mock import patch, MagicMock
def _load_ensure_ssl():
    """Build a standalone copy of _ensure_ssl_certs.

    gateway/run.py has heavy dependencies, so instead of importing the whole
    gateway this helper replicates the function's logic in a throwaway module.
    """
    # Testing against the real module would work, since conftest isolates
    # HERMES_HOME, but importing it risks side effects. Replicating the logic
    # here keeps the test controlled.
    from types import ModuleType
    import textwrap, ssl as _ssl  # noqa: F401
    code = textwrap.dedent("""\
        import os, ssl
        def _ensure_ssl_certs():
            if "SSL_CERT_FILE" in os.environ:
                return
            paths = ssl.get_default_verify_paths()
            for candidate in (paths.cafile, paths.openssl_cafile):
                if candidate and os.path.exists(candidate):
                    os.environ["SSL_CERT_FILE"] = candidate
                    return
            try:
                import certifi
                os.environ["SSL_CERT_FILE"] = certifi.where()
                return
            except ImportError:
                pass
            for candidate in (
                "/etc/ssl/certs/ca-certificates.crt",
                "/etc/ssl/cert.pem",
            ):
                if os.path.exists(candidate):
                    os.environ["SSL_CERT_FILE"] = candidate
                    return
    """)
    mod = ModuleType("_ssl_helper")
    exec(code, mod.__dict__)
    return mod._ensure_ssl_certs
class TestEnsureSslCerts:
def test_respects_existing_env_var(self):
fn = _load_ensure_ssl()
with patch.dict(os.environ, {"SSL_CERT_FILE": "/custom/ca.pem"}):
fn()
assert os.environ["SSL_CERT_FILE"] == "/custom/ca.pem"
def test_sets_from_ssl_default_paths(self, tmp_path):
fn = _load_ensure_ssl()
cert = tmp_path / "ca.crt"
cert.write_text("FAKE CERT")
mock_paths = MagicMock()
mock_paths.cafile = str(cert)
mock_paths.openssl_cafile = None
env = {k: v for k, v in os.environ.items() if k != "SSL_CERT_FILE"}
with patch.dict(os.environ, env, clear=True), \
patch("ssl.get_default_verify_paths", return_value=mock_paths):
fn()
assert os.environ.get("SSL_CERT_FILE") == str(cert)
def test_no_op_when_nothing_found(self):
fn = _load_ensure_ssl()
mock_paths = MagicMock()
mock_paths.cafile = None
mock_paths.openssl_cafile = None
env = {k: v for k, v in os.environ.items() if k != "SSL_CERT_FILE"}
with patch.dict(os.environ, env, clear=True), \
patch("ssl.get_default_verify_paths", return_value=mock_paths), \
patch("os.path.exists", return_value=False), \
patch.dict("sys.modules", {"certifi": None}):
fn()
assert "SSL_CERT_FILE" not in os.environ
+36
@@ -26,8 +26,44 @@ class TestGatewayPidState:
assert status.get_running_pid() is None
assert not pid_path.exists()
def test_get_running_pid_accepts_gateway_metadata_when_cmdline_unavailable(self, tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
pid_path = tmp_path / "gateway.pid"
pid_path.write_text(json.dumps({
"pid": os.getpid(),
"kind": "hermes-gateway",
"argv": ["python", "-m", "hermes_cli.main", "gateway"],
"start_time": 123,
}))
monkeypatch.setattr(status.os, "kill", lambda pid, sig: None)
monkeypatch.setattr(status, "_get_process_start_time", lambda pid: 123)
monkeypatch.setattr(status, "_read_process_cmdline", lambda pid: None)
assert status.get_running_pid() == os.getpid()
class TestGatewayRuntimeStatus:
def test_write_runtime_status_overwrites_stale_pid_on_restart(self, tmp_path, monkeypatch):
"""Regression: setdefault() preserved stale PID from previous process (#1631)."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
# Simulate a previous gateway run that left a state file with a stale PID
state_path = tmp_path / "gateway_state.json"
state_path.write_text(json.dumps({
"pid": 99999,
"start_time": 1000.0,
"kind": "hermes-gateway",
"platforms": {},
"updated_at": "2025-01-01T00:00:00Z",
}))
status.write_runtime_status(gateway_state="running")
payload = status.read_runtime_status()
assert payload["pid"] == os.getpid(), "PID should be overwritten, not preserved via setdefault"
assert payload["start_time"] != 1000.0, "start_time should be overwritten on restart"
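The regression named in the docstring above comes down to how `dict.setdefault` treats keys that already exist. The sketch below contrasts the two merge strategies on a state dict loaded from a previous run; the function names are hypothetical and the real `write_runtime_status` is not reproduced here.

```python
def merge_with_setdefault(previous: dict, pid: int, start_time: float) -> dict:
    """Buggy pattern: existing keys win, so a stale PID survives a restart."""
    payload = dict(previous)
    payload.setdefault("pid", pid)
    payload.setdefault("start_time", start_time)
    return payload


def merge_with_overwrite(previous: dict, pid: int, start_time: float) -> dict:
    """Fixed pattern: process identity is always taken from the current run."""
    payload = dict(previous)
    payload["pid"] = pid
    payload["start_time"] = start_time
    return payload
```

`setdefault` is still appropriate for fields that should persist across restarts, such as the `platforms` mapping; only values that identify the running process need unconditional assignment.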
def test_write_runtime_status_records_platform_failure(self, tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
+133
@@ -0,0 +1,133 @@
"""Tests for gateway /status behavior and token persistence."""
from datetime import datetime
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock
import pytest
from gateway.config import GatewayConfig, Platform, PlatformConfig
from gateway.platforms.base import MessageEvent
from gateway.session import SessionEntry, SessionSource, build_session_key
def _make_source() -> SessionSource:
return SessionSource(
platform=Platform.TELEGRAM,
user_id="u1",
chat_id="c1",
user_name="tester",
chat_type="dm",
)
def _make_event(text: str) -> MessageEvent:
return MessageEvent(
text=text,
source=_make_source(),
message_id="m1",
)
def _make_runner(session_entry: SessionEntry):
from gateway.run import GatewayRunner
runner = object.__new__(GatewayRunner)
runner.config = GatewayConfig(
platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="***")}
)
adapter = MagicMock()
adapter.send = AsyncMock()
runner.adapters = {Platform.TELEGRAM: adapter}
runner._voice_mode = {}
runner.hooks = SimpleNamespace(emit=AsyncMock(), loaded_hooks=False)
runner.session_store = MagicMock()
runner.session_store.get_or_create_session.return_value = session_entry
runner.session_store.load_transcript.return_value = []
runner.session_store.has_any_sessions.return_value = True
runner.session_store.append_to_transcript = MagicMock()
runner.session_store.rewrite_transcript = MagicMock()
runner.session_store.update_session = MagicMock()
runner._running_agents = {}
runner._pending_messages = {}
runner._pending_approvals = {}
runner._session_db = None
runner._reasoning_config = None
runner._provider_routing = {}
runner._fallback_model = None
runner._show_reasoning = False
runner._is_user_authorized = lambda _source: True
runner._set_session_env = lambda _context: None
runner._should_send_voice_reply = lambda *_args, **_kwargs: False
runner._send_voice_reply = AsyncMock()
runner._capture_gateway_honcho_if_configured = lambda *args, **kwargs: None
runner._emit_gateway_run_progress = AsyncMock()
return runner
@pytest.mark.asyncio
async def test_status_command_reports_running_agent_without_interrupt(monkeypatch):
session_entry = SessionEntry(
session_key=build_session_key(_make_source()),
session_id="sess-1",
created_at=datetime.now(),
updated_at=datetime.now(),
platform=Platform.TELEGRAM,
chat_type="dm",
total_tokens=321,
)
runner = _make_runner(session_entry)
running_agent = MagicMock()
runner._running_agents[build_session_key(_make_source())] = running_agent
result = await runner._handle_message(_make_event("/status"))
assert "**Tokens:** 321" in result
assert "**Agent Running:** Yes ⚡" in result
running_agent.interrupt.assert_not_called()
assert runner._pending_messages == {}
@pytest.mark.asyncio
async def test_handle_message_persists_agent_token_counts(monkeypatch):
import gateway.run as gateway_run
session_entry = SessionEntry(
session_key=build_session_key(_make_source()),
session_id="sess-1",
created_at=datetime.now(),
updated_at=datetime.now(),
platform=Platform.TELEGRAM,
chat_type="dm",
)
runner = _make_runner(session_entry)
runner.session_store.load_transcript.return_value = [{"role": "user", "content": "earlier"}]
runner._run_agent = AsyncMock(
return_value={
"final_response": "ok",
"messages": [],
"tools": [],
"history_offset": 0,
"last_prompt_tokens": 80,
"input_tokens": 120,
"output_tokens": 45,
"model": "openai/test-model",
}
)
monkeypatch.setattr(gateway_run, "_resolve_runtime_agent_kwargs", lambda: {"api_key": "***"})
monkeypatch.setattr(
"agent.model_metadata.get_model_context_length",
lambda *_args, **_kwargs: 100000,
)
result = await runner._handle_message(_make_event("hello"))
assert result == "ok"
runner.session_store.update_session.assert_called_once_with(
session_entry.session_key,
input_tokens=120,
output_tokens=45,
last_prompt_tokens=80,
model="openai/test-model",
)
+24
@@ -51,3 +51,27 @@ async def test_enrich_message_with_transcription_skips_when_stt_disabled():
assert "transcription is disabled" in result.lower()
assert "caption" in result
@pytest.mark.asyncio
async def test_enrich_message_with_transcription_avoids_bogus_no_provider_message_for_backend_key_errors():
from gateway.run import GatewayRunner
runner = GatewayRunner.__new__(GatewayRunner)
runner.config = GatewayConfig(stt_enabled=True)
with patch(
"tools.transcription_tools.transcribe_audio",
return_value={"success": False, "error": "VOICE_TOOLS_OPENAI_KEY not set"},
), patch(
"tools.transcription_tools.get_stt_model_from_config",
return_value=None,
):
result = await runner._enrich_message_with_transcription(
"caption",
["/tmp/voice.ogg"],
)
assert "No STT provider is configured" not in result
assert "trouble transcribing" in result
assert "caption" in result
+33
@@ -100,6 +100,39 @@ async def test_polling_conflict_stops_polling_and_notifies_handler(monkeypatch):
fatal_handler.assert_awaited_once()
@pytest.mark.asyncio
async def test_connect_marks_retryable_fatal_error_for_startup_network_failure(monkeypatch):
adapter = TelegramAdapter(PlatformConfig(enabled=True, token="***"))
monkeypatch.setattr(
"gateway.status.acquire_scoped_lock",
lambda scope, identity, metadata=None: (True, None),
)
monkeypatch.setattr(
"gateway.status.release_scoped_lock",
lambda scope, identity: None,
)
builder = MagicMock()
builder.token.return_value = builder
app = SimpleNamespace(
bot=SimpleNamespace(),
updater=SimpleNamespace(),
add_handler=MagicMock(),
initialize=AsyncMock(side_effect=RuntimeError("Temporary failure in name resolution")),
start=AsyncMock(),
)
builder.build.return_value = app
monkeypatch.setattr("gateway.platforms.telegram.Application", SimpleNamespace(builder=MagicMock(return_value=builder)))
ok = await adapter.connect()
assert ok is False
assert adapter.fatal_error_code == "telegram_connect_error"
assert adapter.fatal_error_retryable is True
assert "Temporary failure in name resolution" in adapter.fatal_error_message
@pytest.mark.asyncio
async def test_disconnect_skips_inactive_updater_and_app(monkeypatch):
adapter = TelegramAdapter(PlatformConfig(enabled=True, token="***"))
+25 -1
@@ -7,7 +7,7 @@ or corrupt user-visible content.
import re
import sys
-from unittest.mock import MagicMock
+from unittest.mock import AsyncMock, MagicMock
import pytest
@@ -392,3 +392,27 @@ class TestStripMdv2:
def test_empty_string(self):
assert _strip_mdv2("") == ""
@pytest.mark.asyncio
async def test_send_escapes_chunk_indicator_for_markdownv2(adapter):
adapter.MAX_MESSAGE_LENGTH = 80
adapter._bot = MagicMock()
sent_texts = []
async def _fake_send_message(**kwargs):
sent_texts.append(kwargs["text"])
msg = MagicMock()
msg.message_id = len(sent_texts)
return msg
adapter._bot.send_message = AsyncMock(side_effect=_fake_send_message)
content = ("**bold** chunk content " * 12).strip()
result = await adapter.send("123", content)
assert result.success is True
assert len(sent_texts) > 1
assert re.search(r" \\\([0-9]+/[0-9]+\\\)$", sent_texts[0])
assert re.search(r" \\\([0-9]+/[0-9]+\\\)$", sent_texts[-1])
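Telegram's MarkdownV2 mode treats `(` and `)` as reserved characters, so a chunk suffix such as ` (1/3)` has to be sent as ` \(1/3\)`, which is what the regex in the test above checks for. A minimal sketch of that suffixing step, with hypothetical helper names and no claim about how the adapter splits content into chunks:

```python
def chunk_indicator(index: int, total: int, markdown_v2: bool = True) -> str:
    """Return a ' (i/n)' suffix, backslash-escaped for MarkdownV2 if requested."""
    if markdown_v2:
        return f" \\({index}/{total}\\)"
    return f" ({index}/{total})"


def add_indicators(chunks, markdown_v2: bool = True):
    """Append an indicator to every chunk when there is more than one."""
    total = len(chunks)
    if total <= 1:
        return list(chunks)
    return [
        chunk + chunk_indicator(i, total, markdown_v2)
        for i, chunk in enumerate(chunks, start=1)
    ]
```

Escaping only the indicator, rather than re-escaping the whole chunk, avoids double-escaping content that has already been converted to MarkdownV2.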
+7 -8
@@ -475,16 +475,15 @@ class TestDiscordPlayTtsSkip:
class TestVoiceInHelp:
def test_voice_in_help_output(self):
-from gateway.run import GatewayRunner
-import inspect
-source = inspect.getsource(GatewayRunner._handle_help_command)
-assert "/voice" in source
+"""The gateway help text includes /voice (generated from registry)."""
+from hermes_cli.commands import gateway_help_lines
+help_text = "\n".join(gateway_help_lines())
+assert "/voice" in help_text
def test_voice_is_known_command(self):
-from gateway.run import GatewayRunner
-import inspect
-source = inspect.getsource(GatewayRunner._handle_message)
-assert '"voice"' in source
+"""The /voice command is in GATEWAY_KNOWN_COMMANDS."""
+from hermes_cli.commands import GATEWAY_KNOWN_COMMANDS
+assert "voice" in GATEWAY_KNOWN_COMMANDS
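The rewritten tests no longer grep handler source code; they query data derived from a single registry. The sketch below shows one way such a registry could produce both a known-command set and help lines. Only the `name` and `aliases` fields are taken from the tests in this diff; the `description` and `gateway` fields, the sample entries, and the helper names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CommandDef:
    name: str
    description: str            # hypothetical field
    aliases: Tuple[str, ...] = ()
    gateway: bool = True        # hypothetical flag: exposed on messaging platforms


# Illustrative entries only; the real registry holds more than 30 commands.
REGISTRY = (
    CommandDef("new", "Start a new session", aliases=("reset",)),
    CommandDef("voice", "Toggle voice mode"),
    CommandDef("paste", "Attach a clipboard image", gateway=False),
)

# Map every canonical name and alias to its definition once, at import time.
_LOOKUP = {}
for _cmd in REGISTRY:
    _LOOKUP[_cmd.name] = _cmd
    for _alias in _cmd.aliases:
        _LOOKUP[_alias] = _cmd

KNOWN_GATEWAY_COMMANDS = frozenset(
    name for name, cmd in _LOOKUP.items() if cmd.gateway
)


def lookup_command(text: str) -> Optional[CommandDef]:
    """Resolve '/name' or an alias to its CommandDef, or None if unknown."""
    return _LOOKUP.get(text.lstrip("/").lower())


def help_lines():
    """One help line per gateway-visible command, in registry order."""
    return [f"/{cmd.name} - {cmd.description}" for cmd in REGISTRY if cmd.gateway]
```

With this shape, adding a command means adding one `CommandDef`; the help text, the known-command set, and alias resolution all follow from it, so tests can assert on the derived data instead of inspecting function source.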
# =====================================================================
+382 -21
@@ -1,19 +1,22 @@
-"""Tests for shared slash command definitions and autocomplete."""
+"""Tests for the central command registry and autocomplete."""
from prompt_toolkit.completion import CompleteEvent
from prompt_toolkit.document import Document
-from hermes_cli.commands import COMMANDS, SlashCommandCompleter
-# All commands that must be present in the shared COMMANDS dict.
-EXPECTED_COMMANDS = {
-"/help", "/tools", "/toolsets", "/model", "/provider", "/prompt",
-"/personality", "/clear", "/history", "/new", "/reset", "/retry",
-"/undo", "/save", "/config", "/cron", "/skills", "/platforms",
-"/verbose", "/reasoning", "/compress", "/title", "/usage", "/insights", "/paste",
-"/reload-mcp", "/rollback", "/background", "/skin", "/voice", "/quit",
-}
+from hermes_cli.commands import (
+COMMAND_REGISTRY,
+COMMANDS,
+COMMANDS_BY_CATEGORY,
+CommandDef,
+GATEWAY_KNOWN_COMMANDS,
+SUBCOMMANDS,
+SlashCommandAutoSuggest,
+SlashCommandCompleter,
+gateway_help_lines,
+resolve_command,
+slack_subcommand_map,
+telegram_bot_commands,
+)
def _completions(completer: SlashCommandCompleter, text: str):
@@ -25,21 +28,200 @@ def _completions(completer: SlashCommandCompleter, text: str):
)
class TestCommands:
def test_shared_commands_include_cli_specific_entries(self):
"""Entries that previously only existed in cli.py are now in the shared dict."""
assert COMMANDS["/paste"] == "Check clipboard for an image and attach it"
assert COMMANDS["/reload-mcp"] == "Reload MCP servers from config.yaml"
# ---------------------------------------------------------------------------
# CommandDef registry tests
# ---------------------------------------------------------------------------
def test_all_expected_commands_present(self):
"""Regression guard — every known command must appear in the shared dict."""
assert set(COMMANDS.keys()) == EXPECTED_COMMANDS
class TestCommandRegistry:
def test_registry_is_nonempty(self):
assert len(COMMAND_REGISTRY) > 30
def test_every_entry_is_commanddef(self):
for entry in COMMAND_REGISTRY:
assert isinstance(entry, CommandDef), f"Unexpected type: {type(entry)}"
def test_no_duplicate_canonical_names(self):
names = [cmd.name for cmd in COMMAND_REGISTRY]
assert len(names) == len(set(names)), f"Duplicate names: {[n for n in names if names.count(n) > 1]}"
def test_no_alias_collides_with_canonical_name(self):
"""An alias must not shadow another command's canonical name."""
canonical_names = {cmd.name for cmd in COMMAND_REGISTRY}
for cmd in COMMAND_REGISTRY:
for alias in cmd.aliases:
if alias in canonical_names:
# reset -> new is intentional (reset IS an alias for new)
target = next(c for c in COMMAND_REGISTRY if c.name == alias)
# This should only happen if the alias points to the same entry
assert resolve_command(alias).name == cmd.name or alias == cmd.name, \
f"Alias '{alias}' of '{cmd.name}' shadows canonical '{target.name}'"
def test_every_entry_has_valid_category(self):
valid_categories = {"Session", "Configuration", "Tools & Skills", "Info", "Exit"}
for cmd in COMMAND_REGISTRY:
assert cmd.category in valid_categories, f"{cmd.name} has invalid category '{cmd.category}'"
def test_cli_only_and_gateway_only_are_mutually_exclusive(self):
for cmd in COMMAND_REGISTRY:
assert not (cmd.cli_only and cmd.gateway_only), \
f"{cmd.name} cannot be both cli_only and gateway_only"
# ---------------------------------------------------------------------------
# resolve_command tests
# ---------------------------------------------------------------------------
class TestResolveCommand:
def test_canonical_name_resolves(self):
assert resolve_command("help").name == "help"
assert resolve_command("background").name == "background"
def test_alias_resolves_to_canonical(self):
assert resolve_command("bg").name == "background"
assert resolve_command("reset").name == "new"
assert resolve_command("q").name == "quit"
assert resolve_command("exit").name == "quit"
assert resolve_command("gateway").name == "platforms"
assert resolve_command("set-home").name == "sethome"
assert resolve_command("reload_mcp").name == "reload-mcp"
def test_leading_slash_stripped(self):
assert resolve_command("/help").name == "help"
assert resolve_command("/bg").name == "background"
def test_unknown_returns_none(self):
assert resolve_command("nonexistent") is None
assert resolve_command("") is None
# ---------------------------------------------------------------------------
# Derived dicts (backwards compat)
# ---------------------------------------------------------------------------
class TestDerivedDicts:
def test_commands_dict_excludes_gateway_only(self):
"""gateway_only commands should NOT appear in the CLI COMMANDS dict."""
for cmd in COMMAND_REGISTRY:
if cmd.gateway_only:
assert f"/{cmd.name}" not in COMMANDS, \
f"gateway_only command /{cmd.name} should not be in COMMANDS"
def test_commands_dict_includes_all_cli_commands(self):
for cmd in COMMAND_REGISTRY:
if not cmd.gateway_only:
assert f"/{cmd.name}" in COMMANDS, \
f"/{cmd.name} missing from COMMANDS dict"
def test_commands_dict_includes_aliases(self):
assert "/bg" in COMMANDS
assert "/reset" in COMMANDS
assert "/q" in COMMANDS
assert "/exit" in COMMANDS
assert "/reload_mcp" in COMMANDS
assert "/gateway" in COMMANDS
def test_commands_by_category_covers_all_categories(self):
registry_categories = {cmd.category for cmd in COMMAND_REGISTRY if not cmd.gateway_only}
assert set(COMMANDS_BY_CATEGORY.keys()) == registry_categories
def test_every_command_has_nonempty_description(self):
for cmd, desc in COMMANDS.items():
assert isinstance(desc, str) and len(desc) > 0, f"{cmd} has empty description"
# ---------------------------------------------------------------------------
# Gateway helpers
# ---------------------------------------------------------------------------
class TestGatewayKnownCommands:
def test_excludes_cli_only(self):
for cmd in COMMAND_REGISTRY:
if cmd.cli_only:
assert cmd.name not in GATEWAY_KNOWN_COMMANDS, \
f"cli_only command '{cmd.name}' should not be in GATEWAY_KNOWN_COMMANDS"
def test_includes_gateway_commands(self):
for cmd in COMMAND_REGISTRY:
if not cmd.cli_only:
assert cmd.name in GATEWAY_KNOWN_COMMANDS
for alias in cmd.aliases:
assert alias in GATEWAY_KNOWN_COMMANDS
def test_bg_alias_in_gateway(self):
assert "bg" in GATEWAY_KNOWN_COMMANDS
assert "background" in GATEWAY_KNOWN_COMMANDS
def test_is_frozenset(self):
assert isinstance(GATEWAY_KNOWN_COMMANDS, frozenset)
class TestGatewayHelpLines:
def test_returns_nonempty_list(self):
lines = gateway_help_lines()
assert len(lines) > 10
def test_excludes_cli_only_commands(self):
lines = gateway_help_lines()
joined = "\n".join(lines)
for cmd in COMMAND_REGISTRY:
if cmd.cli_only:
assert f"`/{cmd.name}" not in joined, \
f"cli_only command /{cmd.name} should not be in gateway help"
def test_includes_alias_note_for_bg(self):
lines = gateway_help_lines()
bg_line = [l for l in lines if "/background" in l]
assert len(bg_line) == 1
assert "/bg" in bg_line[0]
class TestTelegramBotCommands:
def test_returns_list_of_tuples(self):
cmds = telegram_bot_commands()
assert len(cmds) > 10
for name, desc in cmds:
assert isinstance(name, str)
assert isinstance(desc, str)
def test_no_hyphens_in_command_names(self):
"""Telegram does not support hyphens in command names."""
for name, _ in telegram_bot_commands():
assert "-" not in name, f"Telegram command '{name}' contains a hyphen"
def test_excludes_cli_only(self):
names = {name for name, _ in telegram_bot_commands()}
for cmd in COMMAND_REGISTRY:
if cmd.cli_only:
tg_name = cmd.name.replace("-", "_")
assert tg_name not in names
class TestSlackSubcommandMap:
def test_returns_dict(self):
mapping = slack_subcommand_map()
assert isinstance(mapping, dict)
assert len(mapping) > 10
def test_values_are_slash_prefixed(self):
for key, val in slack_subcommand_map().items():
assert val.startswith("/"), f"Slack mapping for '{key}' should start with /"
def test_includes_aliases(self):
mapping = slack_subcommand_map()
assert "bg" in mapping
assert "reset" in mapping
def test_excludes_cli_only(self):
mapping = slack_subcommand_map()
for cmd in COMMAND_REGISTRY:
if cmd.cli_only:
assert cmd.name not in mapping
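The registry tests above pin down a few behaviours: aliases resolve to a canonical entry, a leading slash is ignored, `cli_only` entries are hidden from the gateway, and Telegram names carry no hyphens. The sketch below shows one way those properties could be derived from a single list of definitions. It is not the `hermes_cli.commands` implementation; the field defaults, the sample entries, and the assumption that `paste` is `cli_only` are all illustrative.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CommandDef:
    # Field names follow the attributes the tests read; defaults are assumptions.
    name: str
    description: str
    category: str = "Info"
    aliases: tuple[str, ...] = ()
    cli_only: bool = False
    gateway_only: bool = False


# Illustrative entries only; the real registry has more than 30 commands.
REGISTRY = [
    CommandDef("help", "Show available commands"),
    CommandDef("background", "Run a prompt in the background", "Session", ("bg",)),
    CommandDef("new", "Start a new session", "Session", ("reset",)),
    CommandDef("reload-mcp", "Reload MCP servers from config.yaml",
               "Tools & Skills", ("reload_mcp",)),
    CommandDef("paste", "Check clipboard for an image and attach it",
               cli_only=True),  # assumption: treated as CLI-only here
]

# One lookup table covering canonical names and aliases.
_LOOKUP = {key: cmd for cmd in REGISTRY for key in (cmd.name, *cmd.aliases)}


def resolve_command(name: str) -> CommandDef | None:
    """Return the canonical entry for a name or alias, ignoring a leading slash."""
    return _LOOKUP.get(name.lstrip("/"))


# Everything the gateway may dispatch: names plus aliases, minus CLI-only entries.
GATEWAY_KNOWN = frozenset(
    key for cmd in REGISTRY if not cmd.cli_only for key in (cmd.name, *cmd.aliases)
)


def telegram_names() -> list[str]:
    """Telegram command names cannot contain hyphens, so swap them for underscores."""
    return [cmd.name.replace("-", "_") for cmd in REGISTRY if not cmd.cli_only]
```

Deriving every platform view from one list is what lets tests like `test_excludes_cli_only` iterate the registry instead of hard-coding an expected set.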
# ---------------------------------------------------------------------------
# Autocomplete (SlashCommandCompleter)
# ---------------------------------------------------------------------------
class TestSlashCommandCompleter:
# -- basic prefix completion -----------------------------------------
@@ -54,7 +236,7 @@ class TestSlashCommandCompleter:
def test_builtin_completion_display_meta_shows_description(self):
completions = _completions(SlashCommandCompleter(), "/help")
assert len(completions) == 1
-assert completions[0].display_meta_text == "Show this help message"
+assert completions[0].display_meta_text == "Show available commands"
# -- exact-match trailing space --------------------------------------
@@ -143,3 +325,182 @@ class TestSlashCommandCompleter:
completions = _completions(completer, "/no-desc")
assert len(completions) == 1
assert "Skill command" in completions[0].display_meta_text
# ── SUBCOMMANDS extraction ──────────────────────────────────────────────
class TestSubcommands:
def test_explicit_subcommands_extracted(self):
"""Commands with explicit subcommands on CommandDef are extracted."""
assert "/prompt" in SUBCOMMANDS
assert "clear" in SUBCOMMANDS["/prompt"]
def test_reasoning_has_subcommands(self):
assert "/reasoning" in SUBCOMMANDS
subs = SUBCOMMANDS["/reasoning"]
assert "high" in subs
assert "show" in subs
assert "hide" in subs
def test_voice_has_subcommands(self):
assert "/voice" in SUBCOMMANDS
assert "on" in SUBCOMMANDS["/voice"]
assert "off" in SUBCOMMANDS["/voice"]
def test_cron_has_subcommands(self):
assert "/cron" in SUBCOMMANDS
assert "list" in SUBCOMMANDS["/cron"]
assert "add" in SUBCOMMANDS["/cron"]
def test_commands_without_subcommands_not_in_dict(self):
"""Plain commands should not appear in SUBCOMMANDS."""
assert "/help" not in SUBCOMMANDS
assert "/quit" not in SUBCOMMANDS
assert "/clear" not in SUBCOMMANDS
# ── Subcommand tab completion ───────────────────────────────────────────
class TestSubcommandCompletion:
def test_subcommand_completion_after_space(self):
"""Typing '/reasoning ' then Tab should show subcommands."""
completions = _completions(SlashCommandCompleter(), "/reasoning ")
texts = {c.text for c in completions}
assert "high" in texts
assert "show" in texts
def test_subcommand_prefix_filters(self):
"""Typing '/reasoning sh' should only show 'show'."""
completions = _completions(SlashCommandCompleter(), "/reasoning sh")
texts = {c.text for c in completions}
assert texts == {"show"}
def test_subcommand_exact_match_suppressed(self):
"""Typing the full subcommand shouldn't re-suggest it."""
completions = _completions(SlashCommandCompleter(), "/reasoning show")
texts = {c.text for c in completions}
assert "show" not in texts
def test_no_subcommands_for_plain_command(self):
"""Commands without subcommands yield nothing after space."""
completions = _completions(SlashCommandCompleter(), "/help ")
assert completions == []
# ── Two-stage /model completion ─────────────────────────────────────────
def _model_completer() -> SlashCommandCompleter:
"""Build a completer with mock model/provider info."""
return SlashCommandCompleter(
model_completer_provider=lambda: {
"current_provider": "openrouter",
"providers": {
"anthropic": "Anthropic",
"openrouter": "OpenRouter",
"nous": "Nous Research",
},
"models_for": lambda p: {
"anthropic": ["claude-sonnet-4-20250514", "claude-opus-4-20250414"],
"openrouter": ["anthropic/claude-sonnet-4", "google/gemini-2.5-pro"],
"nous": ["hermes-3-llama-3.1-405b"],
}.get(p, []),
}
)
class TestModelCompletion:
def test_stage1_shows_providers(self):
completions = _completions(_model_completer(), "/model ")
texts = {c.text for c in completions}
assert "anthropic:" in texts
assert "openrouter:" in texts
assert "nous:" in texts
def test_stage1_current_provider_last(self):
completions = _completions(_model_completer(), "/model ")
texts = [c.text for c in completions]
assert texts[-1] == "openrouter:"
def test_stage1_current_provider_labeled(self):
completions = _completions(_model_completer(), "/model ")
for c in completions:
if c.text == "openrouter:":
assert "current" in c.display_meta_text.lower()
break
else:
raise AssertionError("openrouter: not found in completions")
def test_stage1_prefix_filters(self):
completions = _completions(_model_completer(), "/model an")
texts = {c.text for c in completions}
assert texts == {"anthropic:"}
def test_stage2_shows_models(self):
completions = _completions(_model_completer(), "/model anthropic:")
texts = {c.text for c in completions}
assert "anthropic:claude-sonnet-4-20250514" in texts
assert "anthropic:claude-opus-4-20250414" in texts
def test_stage2_prefix_filters_models(self):
completions = _completions(_model_completer(), "/model anthropic:claude-s")
texts = {c.text for c in completions}
assert "anthropic:claude-sonnet-4-20250514" in texts
assert "anthropic:claude-opus-4-20250414" not in texts
def test_stage2_no_model_provider_returns_empty(self):
completions = _completions(SlashCommandCompleter(), "/model ")
assert completions == []
# ── Ghost text (SlashCommandAutoSuggest) ────────────────────────────────
def _suggestion(text: str, completer=None) -> str | None:
"""Get ghost text suggestion for given input."""
suggest = SlashCommandAutoSuggest(completer=completer)
doc = Document(text=text)
class FakeBuffer:
pass
result = suggest.get_suggestion(FakeBuffer(), doc)
return result.text if result else None
class TestGhostText:
def test_command_name_suggestion(self):
"""/he → 'lp'"""
assert _suggestion("/he") == "lp"
def test_command_name_suggestion_reasoning(self):
"""/rea → 'soning'"""
assert _suggestion("/rea") == "soning"
def test_no_suggestion_for_complete_command(self):
assert _suggestion("/help") is None
def test_subcommand_suggestion(self):
"""/reasoning h → 'igh'"""
assert _suggestion("/reasoning h") == "igh"
def test_subcommand_suggestion_show(self):
"""/reasoning sh → 'ow'"""
assert _suggestion("/reasoning sh") == "ow"
def test_no_suggestion_for_non_slash(self):
assert _suggestion("hello") is None
def test_model_stage1_ghost_text(self):
"""/model a → 'nthropic:'"""
completer = _model_completer()
assert _suggestion("/model a", completer=completer) == "nthropic:"
def test_model_stage2_ghost_text(self):
"""/model anthropic:cl → rest of first matching model"""
completer = _model_completer()
s = _suggestion("/model anthropic:cl", completer=completer)
assert s is not None
assert s.startswith("aude-")
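The `/model` tests above describe a two-stage completion: before a colon is typed the candidates are `provider:` prefixes, and after it they are `provider:model` strings; ghost text is the untyped remainder of the first candidate. A minimal sketch of that logic follows, assuming hypothetical helpers `model_candidates` and `ghost_text` that operate on the argument after `/model `. It does not use prompt_toolkit and does not model the "current provider last" ordering that the tests also check.

```python
from __future__ import annotations

# Sample data copied from the test fixture; the mapping shape is an assumption.
PROVIDERS = {
    "anthropic": ["claude-sonnet-4-20250514", "claude-opus-4-20250414"],
    "nous": ["hermes-3-llama-3.1-405b"],
}


def model_candidates(arg: str) -> list[str]:
    """Stage 1 offers 'provider:'; once a colon is present, stage 2 offers models."""
    if ":" not in arg:
        return [f"{p}:" for p in PROVIDERS if p.startswith(arg)]
    provider, _, prefix = arg.partition(":")
    return [
        f"{provider}:{m}"
        for m in PROVIDERS.get(provider, [])
        if m.startswith(prefix)
    ]


def ghost_text(arg: str) -> str | None:
    """Return the remainder of the first candidate that extends the input, if any."""
    for cand in model_candidates(arg):
        if cand != arg:
            return cand[len(arg):]
    return None
```

Because every candidate starts with the typed text, slicing by `len(arg)` yields exactly the suffix a ghost-text widget would display.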
+142
@@ -12,9 +12,12 @@ from hermes_cli.config import (
ensure_hermes_home,
load_config,
load_env,
migrate_config,
save_config,
save_env_value,
save_env_value_secure,
sanitize_env_file,
_sanitize_env_lines,
)
@@ -203,3 +206,142 @@ class TestSaveConfigAtomicity:
raw = yaml.safe_load(f)
assert raw["model"] == "test/atomic-model"
assert raw["agent"]["max_turns"] == 77
class TestSanitizeEnvLines:
"""Tests for .env file corruption repair."""
def test_splits_concatenated_keys(self):
"""Two KEY=VALUE pairs jammed on one line get split."""
lines = ["ANTHROPIC_API_KEY=sk-ant-xxxOPENAI_BASE_URL=https://api.openai.com/v1\n"]
result = _sanitize_env_lines(lines)
assert result == [
"ANTHROPIC_API_KEY=sk-ant-xxx\n",
"OPENAI_BASE_URL=https://api.openai.com/v1\n",
]
def test_preserves_clean_file(self):
"""A well-formed .env file passes through unchanged (modulo trailing newlines)."""
lines = [
"OPENROUTER_API_KEY=sk-or-xxx\n",
"FIRECRAWL_API_KEY=fc-xxx\n",
"# a comment\n",
"\n",
]
result = _sanitize_env_lines(lines)
assert result == lines
def test_preserves_comments_and_blanks(self):
lines = ["# comment\n", "\n", "KEY=val\n"]
result = _sanitize_env_lines(lines)
assert result == lines
def test_adds_missing_trailing_newline(self):
"""Lines missing trailing newline get one added."""
lines = ["FOO_BAR=baz"]
result = _sanitize_env_lines(lines)
assert result == ["FOO_BAR=baz\n"]
def test_three_concatenated_keys(self):
"""Three known keys on one line all get separated."""
lines = ["FAL_KEY=111FIRECRAWL_API_KEY=222GITHUB_TOKEN=333\n"]
result = _sanitize_env_lines(lines)
assert result == [
"FAL_KEY=111\n",
"FIRECRAWL_API_KEY=222\n",
"GITHUB_TOKEN=333\n",
]
def test_value_with_equals_sign_not_split(self):
"""A value containing '=' shouldn't be falsely split (lowercase in value)."""
lines = ["OPENAI_BASE_URL=https://api.example.com/v1?key=abc123\n"]
result = _sanitize_env_lines(lines)
assert result == lines
def test_unknown_keys_not_split(self):
"""Unknown key names on one line are NOT split (avoids false positives)."""
lines = ["CUSTOM_VAR=value123OTHER_THING=value456\n"]
result = _sanitize_env_lines(lines)
# Unknown keys stay on one line — no false split
assert len(result) == 1
def test_value_ending_with_digits_still_splits(self):
"""Concatenation is detected even when value ends with digits."""
lines = ["OPENROUTER_API_KEY=sk-or-v1-abc123OPENAI_BASE_URL=https://api.openai.com/v1\n"]
result = _sanitize_env_lines(lines)
assert len(result) == 2
assert result[0].startswith("OPENROUTER_API_KEY=")
assert result[1].startswith("OPENAI_BASE_URL=")
def test_save_env_value_fixes_corruption_on_write(self, tmp_path):
"""save_env_value sanitizes corrupted lines when writing a new key."""
env_file = tmp_path / ".env"
env_file.write_text(
"ANTHROPIC_API_KEY=sk-antOPENAI_BASE_URL=https://api.openai.com/v1\n"
"FAL_KEY=existing\n"
)
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
save_env_value("MESSAGING_CWD", "/tmp")
content = env_file.read_text()
lines = content.strip().split("\n")
# Corrupted line should be split, new key added
assert "ANTHROPIC_API_KEY=sk-ant" in lines
assert "OPENAI_BASE_URL=https://api.openai.com/v1" in lines
assert "MESSAGING_CWD=/tmp" in lines
def test_sanitize_env_file_returns_fix_count(self, tmp_path):
"""sanitize_env_file reports how many entries were fixed."""
env_file = tmp_path / ".env"
env_file.write_text(
"FAL_KEY=good\n"
"OPENROUTER_API_KEY=valFIRECRAWL_API_KEY=val2\n"
)
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
fixes = sanitize_env_file()
assert fixes > 0
# Verify file is now clean
content = env_file.read_text()
assert "OPENROUTER_API_KEY=val\n" in content
assert "FIRECRAWL_API_KEY=val2\n" in content
def test_sanitize_env_file_noop_on_clean_file(self, tmp_path):
"""No changes when file is already clean."""
env_file = tmp_path / ".env"
env_file.write_text("GOOD_KEY=good\nOTHER_KEY=other\n")
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
fixes = sanitize_env_file()
assert fixes == 0
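The `TestSanitizeEnvLines` cases above imply a repair rule: split a line only where a known key name begins in the middle of it, leave unknown names alone, and normalise the trailing newline. The sketch below shows one way to get that behaviour. It is not the `_sanitize_env_lines` implementation; the `KNOWN_KEYS` list and the lookbehind heuristic (a key start must not follow an uppercase letter or underscore) are assumptions chosen to satisfy the test inputs.

```python
from __future__ import annotations

import re

# Assumption: a small allow-list; the real list of known keys is not shown here.
KNOWN_KEYS = (
    "ANTHROPIC_API_KEY", "OPENAI_BASE_URL", "OPENROUTER_API_KEY",
    "FIRECRAWL_API_KEY", "FAL_KEY", "GITHUB_TOKEN",
)

# A known key followed by '=', not glued onto a longer uppercase identifier.
_KEY_START = re.compile(
    r"(?<![A-Z_])(?:" + "|".join(map(re.escape, KNOWN_KEYS)) + r")="
)


def split_env_line(line: str) -> list[str]:
    """Split one .env line wherever a known KEY= begins; always end with a newline."""
    body = line.rstrip("\n")
    if not body.strip() or body.lstrip().startswith("#"):
        return [body + "\n"]  # blank lines and comments pass through
    # Offset 0 is always a boundary, so unknown keys yield a single piece.
    cuts = sorted({0, *(m.start() for m in _KEY_START.finditer(body))})
    bounds = cuts + [len(body)]
    return [body[a:b] + "\n" for a, b in zip(bounds, bounds[1:])]
```

Restricting splits to an allow-list is what keeps a value such as `https://api.example.com/v1?key=abc123` intact. The trade-off in this sketch is that a value ending in an uppercase letter directly before a known key would not be split.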
class TestAnthropicTokenMigration:
"""Test that config version 8→9 clears ANTHROPIC_TOKEN."""
def _write_config_version(self, tmp_path, version):
config_path = tmp_path / "config.yaml"
import yaml
config_path.write_text(yaml.safe_dump({"_config_version": version}))
def test_clears_token_on_upgrade_to_v9(self, tmp_path):
"""ANTHROPIC_TOKEN is cleared unconditionally when upgrading to v9."""
self._write_config_version(tmp_path, 8)
(tmp_path / ".env").write_text("ANTHROPIC_TOKEN=old-token\n")
with patch.dict(os.environ, {
"HERMES_HOME": str(tmp_path),
"ANTHROPIC_TOKEN": "old-token",
}):
migrate_config(interactive=False, quiet=True)
assert load_env().get("ANTHROPIC_TOKEN") == ""
def test_skips_on_version_9_or_later(self, tmp_path):
"""Already at v9 — ANTHROPIC_TOKEN is not touched."""
self._write_config_version(tmp_path, 9)
(tmp_path / ".env").write_text("ANTHROPIC_TOKEN=current-token\n")
with patch.dict(os.environ, {
"HERMES_HOME": str(tmp_path),
"ANTHROPIC_TOKEN": "current-token",
}):
migrate_config(interactive=False, quiet=True)
assert load_env().get("ANTHROPIC_TOKEN") == "current-token"
+3 -3
@@ -39,7 +39,7 @@ def test_systemd_status_warns_when_linger_disabled(monkeypatch, tmp_path, capsys
monkeypatch.setattr(gateway, "get_systemd_linger_status", lambda: (False, ""))
def fake_run(cmd, capture_output=False, text=False, check=False):
-if cmd[:4] == ["systemctl", "--user", "status", gateway.SERVICE_NAME]:
+if cmd[:4] == ["systemctl", "--user", "status", gateway.get_service_name()]:
return SimpleNamespace(returncode=0, stdout="", stderr="")
if cmd[:3] == ["systemctl", "--user", "is-active"]:
return SimpleNamespace(returncode=0, stdout="active\n", stderr="")
@@ -76,7 +76,7 @@ def test_systemd_install_checks_linger_status(monkeypatch, tmp_path, capsys):
assert unit_path.exists()
assert [cmd for cmd, _ in calls] == [
["systemctl", "--user", "daemon-reload"],
-["systemctl", "--user", "enable", gateway.SERVICE_NAME],
+["systemctl", "--user", "enable", gateway.get_service_name()],
]
assert helper_calls == [True]
assert "User service installed and enabled" in out
@@ -110,7 +110,7 @@ def test_systemd_install_system_scope_skips_linger_and_uses_systemctl(monkeypatc
assert unit_path.read_text(encoding="utf-8") == "scope=True user=alice\n"
assert [cmd for cmd, _ in calls] == [
["systemctl", "daemon-reload"],
-["systemctl", "enable", gateway.SERVICE_NAME],
+["systemctl", "enable", gateway.get_service_name()],
]
assert helper_calls == []
assert "Configured to run as: alice" not in out # generated test unit has no User= line
+1 -1
@@ -114,7 +114,7 @@ def test_systemd_install_calls_linger_helper(monkeypatch, tmp_path, capsys):
assert unit_path.exists()
assert [cmd for cmd, _ in calls] == [
["systemctl", "--user", "daemon-reload"],
-["systemctl", "--user", "enable", gateway.SERVICE_NAME],
+["systemctl", "--user", "enable", gateway.get_service_name()],
]
assert helper_calls == [True]
assert "User service installed and enabled" in out

Some files were not shown because too many files have changed in this diff.