Compare commits

111 Commits

Author SHA1 Message Date
teknium1 d6ab35c1a3 fix(signal): align send() signature with base class (content, reply_to, metadata)
Signal's send() used 'text' instead of 'content' and 'reply_to_message_id'
instead of 'reply_to', mismatching BasePlatformAdapter.send(). Callers in
gateway/run.py use keyword args matching the base interface, so Signal's
send() was missing its required 'text' positional arg.

Fixes: 'SignalAdapter.send() missing 1 required positional argument: text'
2026-03-10 15:18:26 -07:00
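The signature mismatch fixed in d6ab35c1a3 can be reproduced in isolation. This is a minimal sketch, assuming simplified synchronous stand-ins: `BaseAdapter`, `OldSignalAdapter`, `FixedSignalAdapter`, `deliver`, and the `chat_id` parameter are hypothetical names for illustration, not the project's real classes.

```python
class BaseAdapter:
    """Hypothetical stand-in for the shared adapter interface."""

    def send(self, chat_id, content, reply_to=None, metadata=None):
        raise NotImplementedError


class OldSignalAdapter(BaseAdapter):
    # Parameter names drifted away from the base class.
    def send(self, chat_id, text, reply_to_message_id=None):
        return f"{chat_id}: {text}"


class FixedSignalAdapter(BaseAdapter):
    # Same parameter names as the base class, so keyword callers work.
    def send(self, chat_id, content, reply_to=None, metadata=None):
        return f"{chat_id}: {content}"


def deliver(adapter):
    # Callers use keyword arguments that match the base interface.
    return adapter.send(chat_id="+100", content="hello", reply_to=None, metadata=None)
```

Calling `deliver(OldSignalAdapter())` raises a `TypeError`, while the fixed adapter accepts the same call unchanged.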
teknium1 cea78c5e27 fix(gateway): add metadata param to _keep_typing and base send_typing
_keep_typing() was called with metadata= for thread-aware typing
indicators, but neither it nor the base send_typing() accepted
that parameter. Most adapter overrides (Slack, Discord, Telegram,
WhatsApp, HA) already accept metadata=None, but the base class
and Signal adapter did not.

- Add metadata=None to BasePlatformAdapter.send_typing()
- Add metadata=None to BasePlatformAdapter._keep_typing(), pass through
- Add metadata=None to SignalAdapter.send_typing()

Fixes TypeError in _process_message_background for Signal.
2026-03-10 15:08:40 -07:00
teknium1 d04b9f4dc5 fix(signal): use media_urls/media_types instead of non-existent image_paths/audio_path/document_paths
The Signal adapter was passing image_paths, audio_path, and document_paths
to MessageEvent.__init__(), but those fields don't exist on the dataclass.
MessageEvent uses media_urls (List[str]) and media_types (List[str]).

Changes:
- Replace separate image_paths/audio_path/document_paths with unified
  media_urls and media_types lists (matching Discord, Slack, etc.)
- Add _ext_to_mime() helper to map file extensions to MIME types
- Use Signal's contentType from attachment metadata when available,
  falling back to extension-based mapping
- Update message type detection to check media_types prefixes

Fixes TypeError: MessageEvent.__init__() got an unexpected keyword
argument 'image_paths'
2026-03-10 14:58:16 -07:00
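The MIME fallback described in d04b9f4dc5 could look roughly like the following. This is a sketch only: it uses the standard-library `mimetypes` module rather than the adapter's actual `_ext_to_mime()` table, and the attachment dictionary keys (`contentType`, `filename`) and the function names are assumptions.

```python
import mimetypes
from pathlib import Path


def ext_to_mime(path, default="application/octet-stream"):
    """Map a file extension to a MIME type, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(Path(path).name)
    return guessed or default


def attachment_mime(attachment):
    """Prefer the declared content type; fall back to the extension."""
    return attachment.get("contentType") or ext_to_mime(attachment.get("filename", ""))


def message_kind(media_types):
    """Classify a message by the prefix of its first recognised media type."""
    for mime in media_types:
        for prefix in ("image", "audio", "video"):
            if mime.startswith(prefix + "/"):
                return prefix
    return "document" if media_types else "text"
```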
teknium1 8eefbef91c fix: replace ANSI response box with Rich Panel + reduce widget flashing
Major UX improvements:

1. Response box now uses a Rich Panel rendered through ChatConsole
   instead of hand-rolled ANSI box-drawing borders. Rich Panels
   adapt to terminal width at render time, wrap content inside
   the borders properly, and use skin colors natively.

2. ChatConsole now reads terminal width at render time via
   shutil.get_terminal_size() instead of defaulting to 80 cols.
   All Rich output adapts to the current terminal size.

3. User-input separator reduced to fixed 40-char width so it
   never wraps regardless of terminal resize.

4. Approval and clarify countdown repaints throttled to every 5s
   (was 1s), dramatically reducing flicker in Kitty/ghostty.
   Selection changes still trigger instant repaints via key bindings.

5. Sudo widget now uses dynamic _panel_box_width() instead of
   hardcoded border strings.

Tests: 2860 passed.
2026-03-10 07:04:02 -07:00
teknium1 e590caf8d8 Revert "Merge PR #702: feat: configurable embedding infrastructure — local (fastembed) + API (OpenAI)"
This reverts commit 46b95ee694, reversing
changes made to 0fdeffe6c4.
2026-03-10 07:00:54 -07:00
teknium1 46b95ee694 Merge PR #702: feat: configurable embedding infrastructure — local (fastembed) + API (OpenAI)
Authored by teyrebaz33. Adds agent/embeddings.py with Embedder protocol,
FastEmbedEmbedder (local, 384d), OpenAIEmbedder (API, 1536d), factory,
and cosine similarity utilities. 30 tests. Optional fastembed dependency.
Infrastructure for #509 (cognitive memory) and #489 (semantic search).
Closes #675.
2026-03-10 06:59:22 -07:00
teknium1 0fdeffe6c4 fix: replace silent exception swallowing with debug logging across tools
Add logger.debug() calls to 27 bare 'except: pass' blocks across 7 core
files, giving visibility into errors that were previously silently
swallowed. This makes it much easier to diagnose user-reported issues
from debug logs.

Files changed:
- tools/terminal_tool.py: 5 catches (stat, termios, fd close, cleanup)
- tools/delegate_tool.py: 7 catches + added logger (spinner, callbacks)
- tools/browser_tool.py: 5 catches (screenshot/recording cleanup, daemon kill)
- tools/code_execution_tool.py: 2 remaining catches (socket, server close)
- gateway/session.py: 2 catches (platform enum parse, temp file cleanup)
- agent/display.py: 2 catches + added logger (JSON parse in failure detect)
- agent/prompt_builder.py: 1 catch (skill description read)

Deliberately kept bare pass for:
- ImportError checks for optional dependencies (terminal_tool.py)
- SystemExit/KeyboardInterrupt handlers
- Spinner _write catch (would spam on every frame when stdout closed)
- process_registry PID-alive check (canonical os.kill(pid,0) pattern)

Extends the pattern from PR #686 (@aydnOktay).
2026-03-10 06:59:20 -07:00
teyrebaz33 cc4ead999a feat: configurable embedding infrastructure — local (fastembed) + API (OpenAI) (#675)
- Add agent/embeddings.py with Embedder protocol, FastEmbedEmbedder, OpenAIEmbedder
- Factory function get_embedder() reads provider from config.yaml embeddings section
- Lazy initialization — no startup impact, model loaded on first embed call
- cosine_similarity() and cosine_similarity_matrix() utility functions included
- Add fastembed as optional dependency in pyproject.toml
- 30 unit tests, all passing

Closes #675
2026-03-10 06:56:18 -07:00
teknium1 60cba55d82 Merge PR #701: fix: tool call repair — auto-lowercase, fuzzy match, helpful error on unknown tool
Authored by teyrebaz33. Adds _repair_tool_call() method: tries lowercase,
normalize (hyphens/spaces → underscores), then fuzzy match (difflib, 0.7
cutoff). Replaces hard abort after 3 retries with graceful error message
sent back to model for self-correction. Fixed bug where valid tool calls
in a mixed batch would get no results (now all get results).
Fixes #520.
2026-03-10 06:54:17 -07:00
teyrebaz33 1caee06b22 fix: tool call repair — auto-lowercase, fuzzy match, helpful error on unknown tool (#520)
- Add _repair_tool_call(): tries lowercase, normalize, then fuzzy match (difflib 0.7)
- Replace 3-retry-then-abort with graceful error: model receives helpful message and self-corrects
- Conversation stays alive instead of dying on hallucinated tool names

Closes #520
2026-03-10 06:54:11 -07:00
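The repair order described in 1caee06b22 (lowercase, normalize, then fuzzy match with a 0.7 cutoff) can be sketched with the standard-library `difflib`. The function name `repair_tool_name` and its return convention (`None` when nothing matches) are assumptions for illustration, not the project's `_repair_tool_call()` itself.

```python
import difflib


def repair_tool_name(name, valid_names, cutoff=0.7):
    """Return a valid tool name for `name`, or None if no repair applies."""
    if name in valid_names:
        return name
    lowered = name.lower()
    if lowered in valid_names:
        return lowered
    # Hyphens and spaces become underscores.
    normalized = lowered.replace("-", "_").replace(" ", "_")
    if normalized in valid_names:
        return normalized
    # Last resort: closest match above the similarity cutoff.
    matches = difflib.get_close_matches(normalized, list(valid_names), n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

When the result is `None`, the caller can return an error message listing the valid names to the model instead of aborting the conversation.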
teknium1 a6eaf0f41f Merge PR #700: fix(config): atomic write for config.yaml to prevent data loss on crash
Authored by alireza78a. Adds atomic_yaml_write() to utils.py (mirrors
existing atomic_json_write pattern), replaces bare open('w') in
save_config(). Integrated with max_turns normalization and commented
sections via extra_content param. 3 new tests for crash safety.
2026-03-10 06:48:43 -07:00
alireza78a fadad820dd fix(config): atomic write for config.yaml to prevent data loss on crash 2026-03-10 06:48:37 -07:00
teknium1 e8b19b5826 fix: cap user-input separator at 120 cols (matches response box) 2026-03-10 06:47:26 -07:00
teknium1 9ea2209a43 fix: reduce approval/clarify widget flashing + dynamic border widths
Three UI improvements:

1. Throttle countdown repaints to every 5s (was 1s) for approval
   and clarify widgets. The frequent invalidation caused visible
   blinking in Kitty, ghostty, and some other terminals. Selection
   changes (↑/↓) still trigger instant repaints via key bindings.

2. Make the sudo widget use dynamic _panel_box_width() instead of
   hardcoded border strings — adapts to terminal width on resize.

3. Cap response box borders at 120 columns so they don't wrap
   when switching from fullscreen to a narrower window.

Tests: 2857 passed.
2026-03-10 06:44:13 -07:00
teknium1 87af622df4 Merge PR #686: improve error handling and logging in code execution tool
Authored by @aydnOktay. Adds exc_info=True to exception logging, replaces
silent pass statements with logger.debug calls, fixes variable shadowing
in _kill_process_group nested except blocks.
2026-03-10 06:43:11 -07:00
teknium1 2c21c4b897 Merge PR #698: fix(security): pipe sudo password via stdin instead of shell cmdline
Authored by johnh4098. Fixes CWE-214: SUDO_PASSWORD was visible in
/proc/PID/cmdline via echo pipe. Now passed through subprocess stdin.
All 6 backends updated: local, ssh, docker, singularity pipe via stdin;
modal and daytona use printf fallback (remote sandbox, documented).
2026-03-10 06:38:44 -07:00
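The idea behind the CWE-214 fix in PR #698 is that a secret written to a child's stdin never appears in any process's argument list. A minimal sketch, assuming a hypothetical helper name `run_with_secret_on_stdin`; the real backends are not shown in this log.

```python
import subprocess


def run_with_secret_on_stdin(argv, secret):
    """Run `argv`, delivering `secret` on stdin rather than on the command line."""
    return subprocess.run(
        argv,
        input=secret + "\n",   # sent through a pipe, not visible in /proc/PID/cmdline
        capture_output=True,
        text=True,
        check=False,
    )


# With sudo, -S reads the password from stdin and -p '' suppresses the prompt:
#   run_with_secret_on_stdin(["sudo", "-S", "-p", "", "true"], password)
```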
teknium1 771969f747 fix: wire up enabled_tools in agent loop + simplify sandbox tool selection
Completes the fix started in 8318a51 — handle_function_call() accepted
enabled_tools but run_agent.py never passed it. Now both call sites in
_execute_tool_calls() pass self.valid_tool_names, so each agent session
uses its own tool list instead of the process-global
_last_resolved_tool_names (which subagents can overwrite).

Also simplifies the redundant ternary in code_execution_tool.py:
sandbox_tools is already computed correctly (intersection with session
tools, or full SANDBOX_ALLOWED_TOOLS as fallback), so the conditional
was dead logic.

Inspired by PR #663 (JasonOA888). Closes #662.
Tests: 2857 passed.
2026-03-10 06:35:28 -07:00
johnh4098 e9742e202f fix(security): pipe sudo password via stdin instead of shell cmdline 2026-03-10 06:34:59 -07:00
teknium1 a2ea85924a Merge PR #687: fix(file_tools): pass docker_volumes to sandbox container config
Authored by manuelschipper. Adds missing docker_volumes key to
container_config in file_tools.py, matching terminal_tool.py.
Without this, Docker sandbox containers created by file operations
lack user volume mounts when file tools run before terminal.
2026-03-10 06:33:30 -07:00
teknium1 8318a519e6 fix: pass enabled_tools through handle_function_call to avoid global race
The process-global _last_resolved_tool_names gets overwritten when
subagents resolve their own toolsets, causing execute_code in the
parent agent to generate imports for the wrong set of tools.

Fix: handle_function_call() now accepts an enabled_tools parameter.
run_agent.py already passes self.valid_tool_names at both call sites.
This change makes model_tools.py actually use it, falling back to the
global only when the caller doesn't provide a list (backward compat).
2026-03-10 06:32:08 -07:00
teknium1 8ef3c815e7 Merge PR #680: feat: add Nous Portal API key provider
Authored by Indelwin. Adds 'nous-api' provider for direct API key
access to Nous Portal inference, mirroring how OpenRouter and other
API-key providers work. Includes PROVIDER_REGISTRY entry, setup wizard
option, OPTIONAL_ENV_VARS, provider aliases, and test.
Fixes #644.
2026-03-10 06:31:03 -07:00
Indelwin de07aa7c40 feat: add Nous Portal API key provider (#644)
Add support for using Nous Portal via a direct API key, mirroring
how OpenRouter and other API-key providers work. This gives users a
simpler alternative to the OAuth device-code flow when they already
have a Nous API key.

Changes:
- Add 'nous-api' to PROVIDER_REGISTRY as an api_key provider
  pointing to https://inference-api.nousresearch.com/v1
- Add NOUS_API_KEY and NOUS_BASE_URL to OPTIONAL_ENV_VARS
- Add NOUS_API_BASE_URL / NOUS_API_CHAT_URL to hermes_constants
- Add 'Nous Portal API key' as first option in setup wizard
- Add provider aliases (nous_api, nousapi, nous-portal-api)
- Add test for nous-api runtime provider resolution

Closes #644
2026-03-10 06:28:00 -07:00
teknium1 928bb16da1 fix: forward thread_id to Telegram adapter + update send_typing signatures
Part 2 of thread_id forum topic fix: add metadata param to
send_voice, send_image, send_animation, send_typing in Telegram
adapter and pass message_thread_id to all Bot API calls. Update
send_typing signature in Discord, Slack, WhatsApp, HomeAssistant
for compatibility.

Based on the fix proposed by @Bitstreamono in PR #656.
2026-03-10 06:26:32 -07:00
teknium1 441f498d6f Merge PR #679: fix(code_execution): handle empty enabled_sandbox_tools in schema description
Authored by 0xbyt4. Fixes broken 'from hermes_tools import , ...'
syntax in schema description when no sandbox tools are enabled.
Adds 29 new tests for schema generation, env var filtering,
edge cases, and interrupt handling.
2026-03-10 06:22:56 -07:00
teknium1 a630ca15de fix: forward thread_id metadata for Telegram forum topic routing
Replies in Telegram forum topics (supergroups with topics) now land in
the correct topic thread instead of 'General'.

- base.py: build thread_id metadata from event.source, pass to all
  send/media calls; add metadata param to send_typing, send_image,
  send_animation, send_voice, send_video, send_document, send_image_file,
  _keep_typing
- telegram.py: extract thread_id from metadata and pass as
  message_thread_id to all Bot API calls (send_photo, send_voice,
  send_audio, send_animation, send_chat_action)
- run.py: pass thread_id metadata to progress/streaming send calls
- discord/slack/whatsapp/homeassistant: update send_typing signature

Based on the fix proposed by @Bitstreamono in PR #656.
2026-03-10 06:21:15 -07:00
0xbyt4 52e3580cd4 refactor: merge new tests into test_code_execution.py
Move all new tests (schema, env filtering, edge cases, interrupt) into
the existing test_code_execution.py instead of a separate file.
Delete the now-redundant test_code_execution_schema.py.
2026-03-10 06:18:27 -07:00
0xbyt4 694a3ebdd5 fix(code_execution): handle empty enabled_sandbox_tools in schema description
build_execute_code_schema(set()) produced "from hermes_tools import , ..."
in the code property description — invalid Python syntax shown to the model.

This triggers when a user enables only the code_execution toolset without
any of the sandbox-allowed tools (e.g. `hermes tools code_execution`),
because SANDBOX_ALLOWED_TOOLS & {"execute_code"} = empty set.

Also adds 29 unit tests covering build_execute_code_schema, environment
variable filtering, execute_code edge cases, and interrupt handling.
2026-03-10 06:18:27 -07:00
teknium1 2a062e2f45 Merge PR #840: background process notification modes + fix spinner line spam
- feat(gateway): configurable background_process_notifications (off/result/error/all)
- fix(display): rate-limit spinner flushes to prevent line spam under patch_stdout

Background notifications inspired by @PeterFile (PR #593).
2026-03-10 06:17:18 -07:00
teknium1 49ec1c9e8f Merge PR #655: fix: normalize max turns config path
Authored by stablegenius49. Rebased onto current main, resolved 3
conflicts (load_config encoding, save_config commented sections, setup
default value), fixed missing MagicMock import, aligned DEFAULT_CONFIG
default to 90 (matching cli.py).

Migrates legacy root-level max_turns to agent.max_turns across all
config loaders (load_config, load_cli_config, save_config, setup).
Adds _normalize_max_turns_config() for consistent migration.
Fixes #634.
2026-03-10 06:05:20 -07:00
stablegenius49 4bd579f915 fix: normalize max turns config path 2026-03-10 06:05:02 -07:00
teknium1 e4adb67ed8 fix(display): rate-limit spinner flushes to prevent line spam under patch_stdout
The KawaiiSpinner animation would occasionally spam dozens of duplicate
lines instead of overwriting in-place with \r. This happened because
prompt_toolkit's StdoutProxy processes each flush() as a separate
run_in_terminal() call — when the write thread is slow (busy event loop
during long tool executions), each \r frame gets its own call, and the
terminal layout save/restore between calls breaks the \r overwrite
semantics.

Fix: rate-limit flush() calls to at most every 0.4s. Between flushes,
\r-frame writes accumulate in StdoutProxy's buffer. When flushed, they
concatenate into one string (e.g. \r frame1 \r frame2 \r frame3) and
are written in a single run_in_terminal() call where \r works correctly.

The spinner still animates (flush ~2.5x/sec) but each flush batches
~3 frames, guaranteeing the \r collapse always works. Most visible
with execute_code and terminal tools (3+ second executions).
2026-03-10 06:02:07 -07:00
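The rate limit described in e4adb67ed8 can be sketched as a small wrapper that writes every frame but flushes at most once per interval. `RateLimitedFlusher` and its injectable `clock` argument are hypothetical; the real change lives in the spinner code and is not reproduced here.

```python
import time


class RateLimitedFlusher:
    """Write every frame, but flush the stream at most once per `min_interval` seconds."""

    def __init__(self, stream, min_interval=0.4, clock=time.monotonic):
        self._stream = stream
        self._min_interval = min_interval
        self._clock = clock
        self._last_flush = None

    def write_frame(self, frame):
        # Frames accumulate in the stream's buffer between flushes.
        self._stream.write("\r" + frame)
        now = self._clock()
        if self._last_flush is None or now - self._last_flush >= self._min_interval:
            self._stream.flush()
            self._last_flush = now
```

With a 0.12s frame interval, several frames are batched into each flush, so they reach the terminal as a single write.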
teknium1 ff09cad879 Merge PR #621: fix: limit concurrent Modal sandbox creations to avoid deadlocks
Authored by voteblake.

- Semaphore limits concurrent Modal sandbox creations to 8 (configurable)
  to prevent thread pool deadlocks when 86+ tasks fire simultaneously
- Modal cleanup guard for failed init (prevents AttributeError)
- CWD override to /app for TB2 containers
- Add /home/ to host path validation for container backends
2026-03-10 05:57:54 -07:00
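Bounding concurrent sandbox creations, as PR #621 does, is a standard semaphore pattern. A minimal sketch, assuming a hypothetical `CreationLimiter` wrapper and a caller-supplied `factory`; the Modal-specific code is not shown in this log.

```python
import threading


class CreationLimiter:
    """Allow at most `max_concurrent` factory calls to run at the same time."""

    def __init__(self, max_concurrent=8):
        self._semaphore = threading.Semaphore(max_concurrent)

    def create(self, factory):
        # Callers beyond the limit block here until a slot frees up.
        with self._semaphore:
            return factory()
```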
teknium1 580e6ba2ff feat: add proper favicon and logo for landing page and docs site
Generated favicon files (ico, 16x16, 32x32, 180x180, 192x192, 512x512)
from the Hermes Agent logo. Replaces the inline SVG caduceus emoji with
real favicon files so Google's favicon service can pick up the logo.

Landing page: updated <link> tags to reference favicon.ico, favicon PNGs,
and apple-touch-icon.
Docusaurus: updated config to use favicon.ico and logo.png instead of
favicon.svg.
2026-03-10 05:51:45 -07:00
teknium1 d6d5a43d3a Merge PR #627: fix: continue non-tool replies after output-length truncation
Authored by tripledoublev (vincent). Rebased onto current main and
conflict-resolved.

When finish_reason='length' on a non-tool chat-completions response,
instead of rolling back and returning None, the agent now:
- Appends the truncated text and a continuation prompt
- Retries up to 3 times, accumulating partial chunks
- Concatenates all chunks into the final response
- Preserves existing rollback behavior for tool-call truncations
2026-03-10 04:33:14 -07:00
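The continuation behavior from PR #627 can be sketched as a loop that accumulates chunks while the model reports `finish_reason='length'`. Everything here is illustrative: `call_model` is a hypothetical callable returning `(text, finish_reason)`, and the continuation prompt wording is an assumption.

```python
CONTINUE_PROMPT = "Continue exactly where you left off."  # assumed wording


def complete_with_continuation(call_model, messages, max_continuations=3):
    """Concatenate partial replies, retrying while the output was length-truncated."""
    messages = list(messages)  # do not mutate the caller's history
    chunks = []
    for attempt in range(max_continuations + 1):
        text, finish_reason = call_model(messages)
        chunks.append(text)
        if finish_reason != "length" or attempt == max_continuations:
            break
        # Feed the truncated text back, followed by a continuation prompt.
        messages.append({"role": "assistant", "content": text})
        messages.append({"role": "user", "content": CONTINUE_PROMPT})
    return "".join(chunks)
```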
teknium1 d723208b1b Merge PR #617: Improve skills tool error handling
Authored by aydnOktay. Adds logging to skills_tool.py with specific
exception handling for file read errors (UnicodeDecodeError, PermissionError)
vs unexpected exceptions, replacing bare except-and-continue blocks.
2026-03-10 04:32:26 -07:00
vincent b0a5fe8974 fix: continue after output-length truncation 2026-03-10 04:30:19 -07:00
teknium1 899dfdcfb9 Merge PR #616: fix: retry with rebuilt payload after compression
Authored by tripledoublev.

After context compression on 413/400 errors, the inner retry loop was
reusing the stale pre-compression api_messages payload. Fix breaks out
of the inner retry loop so the outer loop rebuilds api_messages from
the now-compressed messages list. Adds regression test verifying the
second request actually contains the compressed payload.
2026-03-10 04:22:42 -07:00
teknium1 8f0b07ed29 Merge PR #611: fix(session): atomic write for sessions.json to prevent data loss on crash
Authored by alireza78a.

Replaces open('w') + json.dump with tempfile.mkstemp + os.replace atomic
write pattern, matching the existing pattern in cron/jobs.py. Prevents
silent session loss if the process crashes or gets OOM-killed mid-write.

Resolved conflict: kept encoding='utf-8' from HEAD in the new fdopen call.
2026-03-10 04:18:53 -07:00
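The `tempfile.mkstemp` + `os.replace` pattern named in PR #611 can be sketched as follows. The helper name `write_json_atomically` is hypothetical, and the `fsync` call is an extra durability step assumed here rather than taken from the commit.

```python
import json
import os
import tempfile


def write_json_atomically(path, data):
    """Write JSON to a temp file in the same directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace is atomic when source and target are on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the previous file untouched and clean up the partial temp file.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

A crash before `os.replace` leaves the old file intact; a crash after it leaves the new one. Readers never see a half-written file.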
teknium1 f16f2912cf Merge PR #607: fix: reset all retry counters at start of run_conversation()
Authored by 0xbyt4. Adds missing resets for _incomplete_scratchpad_retries and _codex_incomplete_retries to prevent stale counters carrying over between CLI conversations.
2026-03-10 04:17:47 -07:00
teknium1 af748539f8 Merge PR #608: fix: remove unused imports and unnecessary f-strings
Authored by JackTheGit.

- Remove unused 'random' import from agent/display.py
- Remove unused 'Optional' import from agent/redact.py
- Remove unnecessary f-string prefixes in batch_runner.py
2026-03-10 04:16:23 -07:00
teknium1 695c017411 Merge PR #603: fix: return deny on approval callback timeout instead of None
Authored by 0xbyt4.

_approval_callback() had no return statement after the timeout break,
causing it to return None instead of 'deny'. Callers in approval.py
expect one of 'once', 'session', 'always', or 'deny'. This matches
the existing timeout behavior in approval.py:209.
2026-03-10 04:15:31 -07:00
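The bug fixed in PR #603 is a function that falls off the end on timeout and implicitly returns `None`. A minimal sketch of the corrected shape, assuming a hypothetical queue-based `wait_for_approval` helper rather than the real `_approval_callback()`:

```python
import queue


def wait_for_approval(responses, timeout):
    """Return the user's choice, or 'deny' if none arrives within `timeout` seconds."""
    try:
        return responses.get(timeout=timeout)
    except queue.Empty:
        # Explicit return: without it the function would yield None on timeout.
        return "deny"
```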
teknium1 5e6c7bc205 Merge PR #602: fix: prevent data loss in clipboard PNG conversion when ImageMagick fails
Authored by 0xbyt4. Only deletes temp .bmp after confirmed successful conversion, restores original on failure. Adds 3 tests.
2026-03-10 04:15:05 -07:00
teknium1 e8cec55fad feat(gateway): configurable background process watcher notifications
Add display.background_process_notifications config option to control
how chatty the gateway process watcher is when using
terminal(background=true, check_interval=...) from messaging platforms.

Modes:
  - all:    running-output updates + final message (default, current behavior)
  - result: only the final completion message
  - error:  only the final message when exit code != 0
  - off:    no watcher messages at all

Also supports HERMES_BACKGROUND_NOTIFICATIONS env var override.

Includes 12 tests (5 config loading + 7 watcher behavior).

Inspired by @PeterFile's PR #593. Closes #592.
2026-03-10 04:12:39 -07:00
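The four modes listed in e8cec55fad reduce to a small decision function. This is a sketch under assumptions: `should_notify` and its parameters are hypothetical names, and the real watcher logic is not shown in this log.

```python
def should_notify(mode, is_final, exit_code=None):
    """Decide whether the watcher should send a message for this update."""
    if mode == "off":
        return False
    if mode == "all":
        return True          # running-output updates and the final message
    if not is_final:
        return False         # 'result' and 'error' suppress intermediate updates
    if mode == "result":
        return True
    if mode == "error":
        return exit_code != 0
    raise ValueError(f"unknown notification mode: {mode!r}")
```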
teknium1 67fc6bc4e9 Merge PR #600: fix(security): use in-memory set for permanent allowlist save
Authored by alireza78a. Uses _permanent_approved directly instead of re-reading from disk, preventing potential data loss if a previous save failed.
2026-03-10 04:12:11 -07:00
teknium1 cbca0225f6 Merge PR #599: fix: strip MarkdownV2 italic markers in Telegram plaintext fallback
Authored by 0xbyt4.
2026-03-10 04:09:33 -07:00
teknium1 36ac91c902 Merge PR #598: feat(skill): expand duckduckgo-search with DDGS Python API coverage
Authored by areu01or00. Adds Python DDGS library examples for text, news, images, and video search with structured return field docs.
2026-03-10 04:08:53 -07:00
teknium1 a2902fbad5 Merge PR #594: Improve TTS error handling and logging
Authored by aydnOktay. Adds specific exception handlers, ffmpeg return code checking, and exc_info logging to tts_tool.py.
2026-03-10 04:04:17 -07:00
teknium1 d03de749a1 fix: add themed hero art for all skins, fix triple-quote syntax
Each themed skin (ares, poseidon, sisyphus, charizard) now has custom
banner_hero art that replaces the default Hermes caduceus. The hero art
uses braille-dot patterns themed to each skin:
- Ares: shield/spear emblem in crimson/bronze
- Poseidon: trident with wave patterns in blue/seafoam
- Sisyphus: boulder on slope in grayscale
- Charizard: dragon silhouette in orange/ember

Also fixes triple-quote string termination that caused a syntax error
in the previous commit.
2026-03-10 03:54:12 -07:00
Dev User c3dec1dcda fix(file_tools): pass docker_volumes to sandbox container config
file_tools.py creates its own Docker sandbox when read_file/search_files
runs before any terminal command. The container_config was missing
docker_volumes, so the sandbox had no user volume mounts — breaking
access to heartbeat state, cron output, and all other mounted data.

Matches the existing pattern in terminal_tool.py:872.

Missed in original PR #158 (feat: add docker_volumes config).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 10:18:33 +01:00
teknium1 4945240fc3 feat: add poseidon/sisyphus/charizard skins + banner logo support
Adds 3 new built-in skins (poseidon, sisyphus, charizard) with full
customization — colors, spinner faces/verbs/wings, branding text, and
custom ASCII art banner logos. Total: 7 built-in skins.

Also adds banner_logo and banner_hero fields to SkinConfig, allowing
any skin to replace the HERMES-AGENT ASCII art logo and the caduceus
hero art with custom artwork. The CLI now renders the skin's logo when
available, falling back to the default Hermes logo.

Skins with custom logos: ares, poseidon, sisyphus, charizard
Skins using default logo: default, mono, slate
2026-03-10 02:11:50 -07:00
teknium1 f6bc620d39 fix: apply skin colors to local build_welcome_banner in cli.py
cli.py had a local copy of build_welcome_banner() that shadowed the
imported one from banner.py. This local copy had all colors hardcoded,
so /skin changes had no visible effect on the banner.

Now the local copy resolves skin colors at render time using
get_active_skin(), matching the banner.py behavior. All hardcoded
#FFD700/#CD7F32/#FFBF00/#B8860B/#FFF8DC/#8B8682 values in the local
function are replaced with skin-aware lookups.
2026-03-10 00:58:42 -07:00
teknium1 b4b46d1b67 docs: comprehensive skin/theme system documentation
- AGENTS.md: add Skin/Theme System section with architecture, skinnable
  elements table, built-in skins list, adding built-in/user skins guide,
  YAML example; add skin_engine.py to project structure; mention skin
  engine in CLI Architecture section
- CONTRIBUTING.md: add skin_engine.py to project structure; add 'Adding
  a Skin/Theme' section with YAML schema, activation instructions
- cli-config.yaml.example: add full skin config documentation with
  schema reference, built-in skins list, all color/spinner/branding keys
- docs/skins/example-skin.yaml: complete annotated skin template with
  all available fields and inline documentation
- hermes_cli/skin_engine.py: expand module docstring to full schema
  reference with all fields documented, usage examples, built-in skins
  list
2026-03-10 00:51:27 -07:00
teknium1 c1775de56f feat: filesystem checkpoints and /rollback command
Automatic filesystem snapshots before destructive file operations,
with user-facing rollback.  Inspired by PR #559 (by @alireza78a).

Architecture:
- Shadow git repos at ~/.hermes/checkpoints/{hash}/ via GIT_DIR
- CheckpointManager: take/list/restore, turn-scoped dedup, pruning
- Transparent — the LLM never sees it, no tool schema, no tokens
- Once per turn — only first write_file/patch triggers a snapshot

Integration:
- Config: checkpoints.enabled + checkpoints.max_snapshots
- CLI flag: hermes --checkpoints
- Trigger: run_agent.py _execute_tool_calls() before write_file/patch
- /rollback slash command in CLI + gateway (list, restore by number)
- Pre-rollback snapshot auto-created on restore (undo the undo)

Safety:
- Never blocks file operations — all errors silently logged
- Skips root dir, home dir, dirs >50K files
- Disables gracefully when git not installed
- Shadow repo completely isolated from project git

Tests: 35 new tests, all passing (2798 total suite)
Docs: feature page, config reference, CLI commands reference
2026-03-10 00:49:15 -07:00
teknium1 de6750ed23 feat: add data-driven skin/theme engine for CLI customization
Adds a skin system that lets users customize the CLI's visual appearance
through data files (YAML) rather than code changes. Skins define: color
palette, spinner faces/verbs/wings, branding text, and tool output prefix.

New files:
- hermes_cli/skin_engine.py — SkinConfig dataclass, built-in skins
  (default, ares, mono, slate), YAML loader for user skins from
  ~/.hermes/skins/, skin management API
- tests/hermes_cli/test_skin_engine.py — 26 tests covering config,
  built-in skins, user YAML skins, display integration

Modified files:
- agent/display.py — skin-aware spinner wings, faces, verbs, tool prefix
- hermes_cli/banner.py — skin-aware banner colors (title, border, accent,
  dim, text, session) via _skin_color()/_skin_branding() helpers
- cli.py — /skin command handler, skin init from config, skin-aware
  response box label and welcome message
- hermes_cli/config.py — add display.skin default
- hermes_cli/commands.py — add /skin to slash commands

Built-in skins:
- default: classic Hermes gold/kawaii
- ares: crimson/bronze war-god theme (from community PRs #579/#725)
- mono: clean grayscale
- slate: cool blue developer theme

User skins: drop a YAML file in ~/.hermes/skins/ with name, colors,
spinner, branding, and tool_prefix fields. Missing values inherit from
the default skin.
2026-03-10 00:37:28 -07:00
teknium1 c0ffd6b704 feat: expand OpenClaw migration to cover all platform channels, provider keys, model/TTS config, shared skills, and daily memory
Adds 9 new migration categories to the OpenClaw-to-Hermes migration script:

Platform channels (non-secret, in user-data preset):
- discord-settings: bot token + allowlist → .env
- slack-settings: bot/app tokens + allowlist → .env
- whatsapp-settings: allowlist → .env
- signal-settings: account, HTTP URL, allowlist → .env

Configuration:
- model-config: default model → config.yaml
- tts-config: TTS provider/voice settings → config.yaml tts.*

Data:
- shared-skills: ~/.openclaw/skills/ → ~/.hermes/skills/openclaw-imports/
- daily-memory: workspace/memory/*.md entries → merged into MEMORY.md

Secrets (full preset only, requires --migrate-secrets):
- provider-keys: OpenRouter/OpenAI/Anthropic API keys, ElevenLabs/OpenAI TTS keys

Bug fix: workspace-agents now records 'skipped' status when source is
missing instead of silently returning (invisible failure in reports).

Total migration options: 10 → 19
Tests: 14 → 24 (10 new tests covering all new categories)
Full suite: 2798 passed, 0 failures
2026-03-10 00:35:14 -07:00
teknium1 8b9de366f2 Merge PR #570: feat: OpenClaw migration skill + CLI panel width improvements
Authored by unmodeled-tyler. Adds openclaw-migration skill to optional-skills/
with migration script, SKILL.md, and 7 tests. Also improves clarify/approval
panel rendering with dynamic width calculation.
2026-03-10 00:06:40 -07:00
teknium1 60d3f79c72 Merge PR #565: fix: sanitize FTS5 queries and close mirror DB connections
Authored by 0xbyt4. No linked issue.

- Sanitize user input before FTS5 MATCH to prevent OperationalError on
  special characters (C++, unbalanced quotes, dangling operators, etc.)
- Close SessionDB connection in mirror._append_to_sqlite() via finally block
- Added tests for both fixes
2026-03-09 23:59:26 -07:00
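One common way to make arbitrary user input safe for an FTS5 `MATCH` is to quote every token as a string literal. The sketch below shows that approach; it is not necessarily the sanitizer that PR #565 added, and `sanitize_fts5_query` is a hypothetical name.

```python
def sanitize_fts5_query(raw):
    """Quote each whitespace-separated token so FTS5 treats it as a literal string."""
    tokens = raw.split()
    # Inside an FTS5 string literal, a double quote is escaped by doubling it.
    quoted = ['"' + token.replace('"', '""') + '"' for token in tokens]
    # An empty result means there is nothing to search for; callers should skip MATCH.
    return " ".join(quoted)
```

Quoting also neutralises bare operators such as a dangling `AND`, which become ordinary search terms.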
teknium1 6f3a673aba fix: restore success-path server_sock.close() before rpc_thread.join()
PR #568 moved the close entirely to the finally block, but the success-path
close is needed to break the RPC thread out of accept() immediately. Without
it, rpc_thread.join(3) may block for up to 3 seconds if the child process
never connected. The finally-block close remains as a safety net for the
exception/error path (the actual fd leak fix).
2026-03-09 23:40:20 -07:00
teknium1 ab6a6338c4 Merge PR #568: fix(code-execution): close server socket in finally block to prevent fd leak
Authored by alireza78a. Moves server_sock.close() into the finally block so
the socket fd is always cleaned up, even if an exception occurs between socket
creation and the success-path close.
2026-03-09 23:39:13 -07:00
teknium1 1ec8c1fcaa Merge PR #564: fix: count actual tool calls instead of tool-related messages
Authored by 0xbyt4. Fixes tool_call_count double-counting tool responses
and under-counting parallel tool calls.
2026-03-09 23:32:54 -07:00
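The counting fix in PR #564 amounts to summing the `tool_calls` entries on assistant messages instead of counting tool-related messages. A minimal sketch, assuming OpenAI-style message dictionaries and a hypothetical `count_tool_calls` helper:

```python
def count_tool_calls(messages):
    """Count individual tool calls, ignoring tool-response messages."""
    return sum(
        len(message.get("tool_calls") or [])
        for message in messages
        if message.get("role") == "assistant"
    )
```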
teknium1 739eb6702e Merge PR #551: Make skill file writes atomic
Authored by aydnOktay. Adds _atomic_write_text() helper using tempfile.mkstemp()
+ os.replace() to prevent skill file corruption on crash/interrupt. All 7
write_text() calls in skill_manager_tool.py converted, including rollback writes
during security scans.
2026-03-09 23:31:43 -07:00
teknium1 1aa7badb3c fix: add missing Platform.SIGNAL to toolset mappings, update test + config docs
Platform.SIGNAL was missing from default_toolset_map and platform_config_key
in gateway/run.py, causing Signal to silently fall back to hermes-telegram
toolset (same bug as HomeAssistant, fixed in PR #538).

Also updates:
- tests/test_toolsets.py: include hermes-signal and hermes-homeassistant in
  the platform core-tools consistency check
- cli-config.yaml.example: document signal and homeassistant platform keys
2026-03-09 23:27:19 -07:00
teknium1 ee4008431a fix: stop terminal border flashing with steady cursor and TUI spinner widget
Cherry-picked and improved from PR #470 (fixes #464).

Problem: On Ubuntu 24.04 with ghostty + tmux, the prompt input box
border lines flash due to cursor blink and raw spinner terminal writes
conflicting with prompt_toolkit's rendering.

Changes:
- cli.py: Add CursorShape.BLOCK to Application() to disable cursor blink
- cli.py: Add thinking_callback + spinner_widget in TUI layout so
  thinking status displays as a proper prompt_toolkit widget instead of
  raw terminal writes that conflict with the TUI renderer
- run_agent.py: Add thinking_callback parameter to AIAgent; when set,
  uses the callback instead of KawaiiSpinner for thinking display

What was NOT changed (preserving existing behavior):
- agent/display.py: Untouched. KawaiiSpinner _write() stdout capture,
  _animate() logic, and 0.12s frame interval all preserved. This
  protects subagent stdout redirection and keeps smooth animations
  for non-CLI contexts (gateway, batch runner).
- Original emoji spinner types (brain/sparkle/pulse/moon/star) preserved
  for all non-CLI contexts.

Fixes from original PR #470:
- CursorShape.STEADY_BLOCK -> CursorShape.BLOCK (STEADY_BLOCK doesn't
  exist in prompt_toolkit 3.0.52)
- Removed duplicate self._spinner_text = '' line
- Removed redundant nested if-checks

Tested: 2706 tests pass, interactive CLI verified via tmux.
2026-03-09 23:26:43 -07:00
teknium1 88f8bcde38 Merge PR #538: fix cron HERMES_HOME path mismatch, missing HomeAssistant toolset mapping, Daytona timeout drift
Authored by Himess. Three independent fixes:
- cron/jobs.py: respect HERMES_HOME env var (consistent with scheduler.py)
- gateway/run.py: add Platform.HOMEASSISTANT to toolset mappings
- tools/environments/daytona.py: use time.monotonic() for timeout deadline
2026-03-09 23:20:52 -07:00
teknium1 2285615010 Merge PR #533: fix: use regex for search output parsing to handle Windows drive-letter paths
Authored by Himess. Replaces split(':', 2) with regex that optionally
captures Windows drive-letter prefix in rg/grep output parsing. Fixes
search_files returning zero results on Windows where paths like
C:\path\file.py:42:content were misparsed by naive colon splitting.
No behavior change on Unix/Mac.
2026-03-09 23:18:42 -07:00
teknium1 805ce8177b Merge PR #529: fix: restrict .env file permissions to owner-only
Authored by Himess. Adds 0600 chmod on ~/.hermes/.env after writing API keys,
matching the existing pattern in auth.py for auth.json.
2026-03-09 23:10:59 -07:00
teknium1 bdce33e239 Merge PR #810: fix(cli): handle unquoted multi-word session names in -c/--continue and -r/--resume 2026-03-09 23:08:45 -07:00
Teknium 9be8d88ccc Merge pull request #815 from NousResearch/hermes/hermes-5ab2a29e
Add hermes-atropos-environments bundled skill
2026-03-09 23:06:19 -07:00
teknium1 6ab3ebf195 Add hermes-atropos-environments skill (bundled)
Add comprehensive skill for building, testing, and debugging Hermes Agent
RL environments for Atropos training. Includes:

- SKILL.md: Full guide covering HermesAgentBaseEnv interface, required
  methods, config class, CLI modes (serve/process/evaluate), reward
  function patterns, common pitfalls, and minimum implementation checklist
- New 'Inference Setup' section: instructs the agent to always ask the
  user for their inference provider (OpenRouter + model choice, self-hosted
  VLLM endpoint, or other OpenAI-compatible API) before running tests
- references/agentresult-fields.md: AgentResult dataclass field reference
- references/atropos-base-env.md: Atropos BaseEnv API reference
- references/usage-patterns.md: Step-by-step patterns for process,
  evaluate, serve, and smoke test modes

Will be auto-synced to ~/.hermes/skills/ via skills_sync.
2026-03-09 23:04:17 -07:00
teknium1 0a628c1aef fix(cli): handle unquoted multi-word session names in -c/--continue and -r/--resume
When a user runs `hermes -w -c Pokemon Agent Dev` without quoting the
session name, argparse would fail with:
  error: argument command: invalid choice: 'Agent'

This is because argparse parses `-c Pokemon` (consuming one token via
nargs='?'), then sees 'Agent' and tries to match it as a subcommand.

Fix: add _coalesce_session_name_args() that pre-processes sys.argv before
argparse, joining consecutive non-flag, non-subcommand tokens after -c or
-r into a single argument. This makes both quoted and unquoted multi-word
session names work transparently.

Includes 17 tests covering all edge cases: multi-word names, single-word,
bare flags, flag ordering, subcommand boundaries, and passthrough.
2026-03-09 21:36:29 -07:00
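The argv pre-processing described in this commit could look roughly like the sketch below. The function name, the flag set, and especially the `SUBCOMMANDS` set are assumptions made for illustration; the real `_coalesce_session_name_args()` and the CLI's actual subcommand list are not shown in this comparison.

```python
from typing import List

SESSION_FLAGS = {"-c", "--continue", "-r", "--resume"}
# Hypothetical subcommand names, used only to show the boundary check.
SUBCOMMANDS = {"chat", "setup", "gateway", "doctor"}


def coalesce_session_name_args(argv: List[str]) -> List[str]:
    """Join the words that follow -c/-r into one argument before argparse runs."""
    out: List[str] = []
    i = 0
    while i < len(argv):
        token = argv[i]
        out.append(token)
        i += 1
        if token not in SESSION_FLAGS:
            continue
        # Collect consecutive tokens that are neither flags nor subcommands.
        words: List[str] = []
        while (
            i < len(argv)
            and not argv[i].startswith("-")
            and argv[i] not in SUBCOMMANDS
        ):
            words.append(argv[i])
            i += 1
        if words:
            out.append(" ".join(words))
    return out
```

With this in place, `["-w", "-c", "Pokemon", "Agent", "Dev"]` reaches argparse as `["-w", "-c", "Pokemon Agent Dev"]`, while a bare `-c` or a `-c` followed directly by a subcommand passes through unchanged.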
teknium1 36328a996f Merge PR #458: Add explicit UTF-8 encoding to config/data file I/O
Authored by shitcoinsherpa. Adds encoding='utf-8' to all text-mode
open() calls in gateway/run.py, gateway/config.py, hermes_cli/config.py,
hermes_cli/main.py, and hermes_cli/status.py. Prevents encoding errors
on Windows where the default locale is not UTF-8.

Also fixed 4 additional open() calls in gateway/run.py that were added
after the PR branch was created.
2026-03-09 21:19:20 -07:00
shitcoinsherpa 4bc32dc0f1 Fix password reader for Windows using msvcrt.getwch()
The existing password prompt uses /dev/tty and termios to read input
with echo disabled. Neither exists on Windows.

On Windows, msvcrt.getwch() reads a single character from the console
without echoing it. This adds a Windows code path that uses getwch()
in a loop, collecting characters until Enter is pressed.

The Unix path using termios and /dev/tty is unchanged.
2026-03-09 21:15:59 -07:00
teknium1 4de5e017f1 Merge PR #457: Use pywinpty for PTY support on Windows
Authored by shitcoinsherpa. Imports winpty.PtyProcess on Windows instead
of ptyprocess.PtyProcess, and adds platform markers to the [pty] extra
so the correct package is installed automatically.
2026-03-09 21:09:56 -07:00
teknium1 3e352f8a0d fix: add upstream guard for non-dict function_args + tests for build_tool_preview
Complements PR #453 by 0xbyt4. Adds isinstance(dict) guard in
run_agent.py to catch cases where json.loads returns non-dict
(e.g. null, list, string) before they reach downstream code.

Also adds 15 tests for build_tool_preview covering None args,
empty dicts, known/unknown tools, fallback keys, truncation,
and all special-cased tools (process, todo, memory, session_search).
2026-03-09 21:01:40 -07:00
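A small sketch of why the `isinstance` guard matters: `json.loads` happily returns `None`, a list, or a string for valid JSON that is not an object. The helper name `parse_function_args` and the choice to fall back to an empty dict are assumptions for illustration; the commit does not say what `run_agent.py` does once the guard triggers.

```python
import json
from typing import Any, Dict, Optional


def parse_function_args(raw: Optional[str]) -> Dict[str, Any]:
    """Parse tool-call arguments, accepting only JSON objects."""
    if not raw:
        return {}
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    # "null", "[1, 2]" and '"text"' are valid JSON but not dicts.
    return parsed if isinstance(parsed, dict) else {}
```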
teknium1 28ae5db9b0 Merge PR #453: fix: handle None args in build_tool_preview
Authored by 0xbyt4. Adds defensive guard for None/empty args in
build_tool_preview() to prevent crashes when a model returns null
tool call arguments.
2026-03-09 20:58:34 -07:00
teknium1 d5811c887a Merge: fix double judge call + eval buffer pollution in WebResearchEnv 2026-03-09 20:57:54 -07:00
teknium1 975fd86dc4 fix: eliminate double LLM judge call and eval buffer pollution
evaluate() was calling _llm_judge twice per item (once via
compute_reward, once directly) — double the API cost for no benefit.
Now extracts correctness from compute_reward's buffer instead.

Also: compute_reward appends to training metric buffers during eval,
which would pollute wandb training charts. Now rolls back buffer
entries added during eval so training metrics stay clean.
2026-03-09 20:57:46 -07:00
teknium1 0ff7fe3ee2 Merge PR #439: docs: fix spelling of 'publicly'
Authored by JackTheGit. Simple typo fix: publically → publicly in axolotl reference docs.
2026-03-09 20:55:37 -07:00
teknium1 b9d55d5719 feat: add pokemon-player skill with battle-tested gameplay tips
Comprehensive skill for playing Pokemon Red/Blue via the pokemon-agent
package (NousResearch/pokemon-agent). Includes:

- Full startup procedure (uv venv, server, localhost.run dashboard tunnel)
- Save/load lifecycle and naming conventions
- Gameplay loop with emphasis on frequent vision checks
- Hard-learned navigation tips:
  - Use vision every 2-4 steps (RAM state is blind to obstacles)
  - Wait 2-3 seconds after door/stair warps for map transitions
  - Sidestep after exiting buildings to avoid re-entering
  - Hold B to speed Gen 1's slow text scrolling
  - Ledges are one-way — use vision to find gaps
- Battle strategy, type chart, Gen 1 quirks
- Memory conventions with PKM: prefix
- Progression milestones through all 8 gyms + Elite Four
2026-03-09 20:29:38 -07:00
teknium1 ab7dc22984 Merge: WebResearchEnv evaluate() with full agent loop + tools 2026-03-09 19:53:36 -07:00
teknium1 bf8350ac18 fix: evaluate() uses full agent loop with tools, not single-turn
The evaluate method was doing single-turn chat_completion (no tools),
which defeats the purpose of an agentic research benchmark. Fixed to
run the full HermesAgentLoop with web_search/web_extract tools.

Results comparison (Claude Sonnet 4.5, FRAMES benchmark):
  Without tools (broken): 0.56 mean correctness
  With agent loop + tools: 1.00 mean correctness, 0.994 reward

New eval metrics: mean_correctness, mean_reward, mean_tool_calls,
tool_usage_rate — all logged via evaluate_log() in lighteval format.
2026-03-09 19:53:28 -07:00
teknium1 a5c6348d41 Merge: WebResearchEnv compute_reward fix (verified with live test) 2026-03-09 19:29:19 -07:00
teknium1 320f881e0b fix: WebResearchEnv compute_reward extracts from AgentResult.messages
AgentResult has .messages (list of dicts), not .final_response or
.tool_calls. Fixed compute_reward to extract the final response
and tool names from the message history.

Verified with live process mode test:
  - Agent used 7 tool calls (web_search, web_extract)
  - Produced a 1106-char researched response about Winter Olympics
  - Reward: 0.384 (partial correctness via LLM judge)
  - JSONL output contains valid tokens, masks, scores, messages
2026-03-09 19:29:12 -07:00
0xbyt4 d8df91dfa8 fix: resolve merge conflict with main in clipboard.py 2026-03-09 03:50:29 +03:00
aydnOktay 7b1f40dd00 Improve error handling and logging in code execution tool 2026-03-08 14:50:23 +03:00
vincent 86eed141af fix: rebuild compressed payload before retry 2026-03-07 18:55:01 -05:00
Blake Johnson c6df39955c fix: limit concurrent Modal sandbox creations to avoid deadlocks
- Add max_concurrent_tasks config (default 8) with semaphore in TB2 eval
- Pass cwd: /app via register_task_env_overrides for TB2 tasks
- Add /home/ to host path prefixes as safety net for container backends

When all 86 TerminalBench2 tasks fire simultaneously, each creates a Modal sandbox
via asyncio.run() inside a thread pool worker. Modal's blocking calls deadlock
when too many are created at once. The semaphore ensures max 8 concurrent creations.

Co-Authored-By: hermes-agent[bot] <hermes-agent[bot]@users.noreply.github.com>
2026-03-07 14:02:34 -08:00
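The concurrency cap can be sketched with a semaphore around the creation step. This version uses `asyncio.Semaphore` and a short sleep as a stand-in for sandbox creation; the commit does not say which semaphore type the TB2 eval uses, so treat the structure and all names as assumptions.

```python
import asyncio
from typing import List, Tuple


async def run_limited(task_count: int, max_concurrent: int = 8) -> Tuple[List[int], int]:
    """Run task_count fake creations, never more than max_concurrent at once."""
    semaphore = asyncio.Semaphore(max_concurrent)
    active = 0
    peak = 0

    async def create_sandbox(index: int) -> int:
        nonlocal active, peak
        async with semaphore:  # waits here once max_concurrent are in flight
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.001)  # placeholder for the real creation call
            active -= 1
        return index

    results = await asyncio.gather(*(create_sandbox(i) for i in range(task_count)))
    return list(results), peak
```

Calling `asyncio.run(run_limited(86, 8))` starts all 86 coroutines, but the recorded peak of simultaneous creations never exceeds 8.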
aydnOktay 19459b7623 Improve skills tool error handling 2026-03-08 00:30:49 +03:00
alireza78a b0b19fdeb1 fix(session): atomic write for sessions.json to prevent data loss on crash 2026-03-07 20:57:00 +03:30
0xbyt4 8c26a057a3 fix: reset all retry counters at start of run_conversation()
_incomplete_scratchpad_retries and _codex_incomplete_retries were not
reset at the start of run_conversation(). In CLI mode, where the same
AIAgent instance is reused across conversations, stale counters from
a previous conversation could carry over, causing premature retry
exhaustion and partial responses.
2026-03-07 20:12:08 +03:00
JackTheGit ae4644f495 Fix Ruff lint warnings (unused imports and unnecessary f-strings) 2026-03-07 17:08:09 +00:00
0xbyt4 70cffa4d3b fix: return "deny" on approval callback timeout instead of None
_approval_callback() had no return statement after the timeout break,
causing it to return None. Callers expect a string ("once", "session",
"always", or "deny"), so None could lead to undefined behavior when
approving dangerous commands.
2026-03-07 20:02:13 +03:00
0xbyt4 ee7d8c56c7 fix: prevent data loss in clipboard PNG conversion when ImageMagick fails
_convert_to_png() renamed the original file to .bmp before calling
ImageMagick convert, then unconditionally deleted the .bmp regardless
of whether convert succeeded. If convert failed, both files were gone.

- Only delete .bmp after confirmed successful conversion
- Restore original file on convert failure, timeout, or missing binary
- Add 3 tests covering failure, not-installed, and timeout scenarios
2026-03-07 20:02:12 +03:00
alireza78a 40bc7216e1 fix(security): use in-memory set for permanent allowlist save 2026-03-07 19:33:30 +03:30
0xbyt4 5cdcb9e26f fix: strip MarkdownV2 italic markers in Telegram plaintext fallback
When MarkdownV2 parsing fails, _strip_mdv2() removes escape backslashes
and bold markers (*text*) but missed italic markers (_text_). Users saw
raw underscores around italic text in the plaintext fallback.

- Add regex to strip _text_ italic markers in _strip_mdv2()
- Use word boundary lookaround to preserve snake_case identifiers
- Add tests for _strip_mdv2 covering italic, bold, snake_case, and edge cases
2026-03-07 18:55:25 +03:00
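One way to strip `_text_` markers while leaving snake_case identifiers alone is to require that neither underscore touches a word character on its outer side. The pattern below is a sketch under that assumption, not the regex actually added to `_strip_mdv2()`.

```python
import re

# An opening "_" must not follow a word character and a closing "_" must not
# precede one, so the underscores inside snake_case_name never match.
_ITALIC = re.compile(r"(?<![A-Za-z0-9_])_([^_\n]+)_(?![A-Za-z0-9_])")


def strip_italic_markers(text: str) -> str:
    """Remove MarkdownV2 italic underscores, keeping the enclosed text."""
    return _ITALIC.sub(r"\1", text)
```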
areu01or00 ce7e7fef30 docs(skill): expand duckduckgo-search with DDGS Python API coverage
Add Python DDGS library examples for all 4 search types (text, news,
images, videos) with return field documentation, quick reference table,
and validated gotchas. Reorganize to put Python API primary, CLI secondary.
Soften Firecrawl-fallback framing. All examples validated on ddgs==9.11.2.
2026-03-07 21:15:29 +05:30
aydnOktay 86caa8539c Improve TTS error handling and logging 2026-03-07 16:53:30 +03:00
Tyler 53b4b7651a Add official OpenClaw migration skill for Hermes Agent
Introduces a new OpenClaw-to-Hermes migration skill with a Python
helper script that handles importing SOUL.md, memories, user profiles,
messaging settings, command allowlists, skills, TTS assets, and
workspace instructions.

Supports two migration presets (user-data / full), three skill conflict
modes (skip / overwrite / rename), overflow file export for entries that
exceed character limits, and granular include/exclude option filtering.

Includes detailed SKILL.md agent instructions covering the clarify-tool
interaction protocol, decision-to-command mapping, post-run reporting
rules, and path resolution guidance.

Adds dynamic panel width calculation to CLI clarify/approval widgets so
panels adapt to content and terminal size.

Includes 7 new tests covering presets, include/exclude, conflict modes,
overflow exports, and skills_guard integration.
2026-03-06 18:57:12 -08:00
alireza78a a857321463 fix(code-execution): close server socket in finally block to prevent fd leak 2026-03-07 05:49:48 +03:30
0xbyt4 33cfe1515d fix: sanitize FTS5 queries and close mirror DB connections
Two bugs fixed:

1. search_messages() crashes with OperationalError when user queries
   contain FTS5 special characters (+, ", (, {, dangling AND/OR, etc).
   Added _sanitize_fts5_query() to strip dangerous operators and a
   fallback try-except for edge cases.

2. _append_to_sqlite() in mirror.py creates a new SessionDB per call
   but never closes it, leaking SQLite connections. Added finally block
   to ensure db.close() is always called.
2026-03-07 04:24:45 +03:00
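A conservative sketch of query sanitization for the first fix. It replaces every non-word character with a space and drops `AND`/`OR`/`NOT` at either end of the query; this is an assumption about one workable approach, not the rules implemented by the actual `_sanitize_fts5_query()`. Malformed operator sequences in the middle of a query can still fail, which is where the try-except fallback mentioned above stays useful.

```python
import re

_NON_WORD = re.compile(r"[^\w\s]")
_OPERATORS = {"AND", "OR", "NOT"}


def sanitize_fts5_query(query: str) -> str:
    """Reduce a user query to plain words that are safe to pass to MATCH."""
    tokens = _NON_WORD.sub(" ", query).split()
    # Boolean operators need operands on both sides, so trim them at the edges.
    while tokens and tokens[0] in _OPERATORS:
        tokens.pop(0)
    while tokens and tokens[-1] in _OPERATORS:
        tokens.pop()
    return " ".join(tokens)
```

The result can be empty (for example when the input is only punctuation), so a caller would need to skip the search in that case.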
0xbyt4 3b43f7267a fix: count actual tool calls instead of tool-related messages
tool_call_count was inaccurate in two ways:

1. Under-counting: an assistant message with N parallel tool calls
   (e.g. "kill the light and shut off the fan" = 2 ha_call_service)
   only incremented tool_call_count by 1 instead of N.

2. Over-counting: tool response messages (role=tool) also incremented
   tool_call_count, double-counting every tool interaction.

Combined: 2 parallel tool calls produced tool_call_count=3 (1 from
assistant + 2 from tool responses) instead of the correct value of 2.

Fix: only count from assistant messages with tool_calls, incrementing
by len(tool_calls) to handle parallel calls correctly. Tool response
messages no longer affect tool_call_count.

This impacts /insights and /usage accuracy for sessions with tool use.
2026-03-07 04:07:52 +03:00
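The counting rule from this commit, sketched over OpenAI-format messages. The function name is illustrative; only the rule itself (count `len(tool_calls)` on assistant messages, ignore `role=tool`) comes from the commit message.

```python
from typing import Any, Dict, List


def count_tool_calls(messages: List[Dict[str, Any]]) -> int:
    """Count tool calls requested by the assistant, ignoring tool responses."""
    total = 0
    for message in messages:
        if message.get("role") != "assistant":
            continue  # role=tool responses must not be counted
        total += len(message.get("tool_calls") or [])
    return total
```

For the two-parallel-call case described above (one assistant message with two `tool_calls`, followed by two `role=tool` responses), this yields 2 rather than 3.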
unmodeled-tyler 1755a9e38a Design agent migration skill for Hermes Agent from OpenClaw | Run
successful dry tests with reports
2026-03-06 15:12:45 -08:00
aydnOktay 566aeaeefa Make skill file writes atomic 2026-03-07 00:49:10 +03:00
Himess 7a0544ab57 fix: three small inconsistencies across cron, gateway, and daytona
1. cron/jobs.py: respect HERMES_HOME env var for job storage path.
   scheduler.py already uses os.getenv("HERMES_HOME", ...) but jobs.py
   hardcodes Path.home() / ".hermes", causing path mismatch when
   HERMES_HOME is set.

2. gateway/run.py: add Platform.HOMEASSISTANT to default_toolset_map
   and platform_config_key. The adapter and hermes-homeassistant
   toolset both exist but the mapping dicts omit it, so HomeAssistant
   events silently fall back to the Telegram toolset.

3. tools/environments/daytona.py: use time.monotonic() for deadline
   instead of float subtraction. All other backends (docker, ssh,
   singularity, local) use monotonic clock for timeout tracking.
   The accumulator pattern (deadline -= 0.2) drifts because
   t.join(0.2) + interrupt checks take longer than 0.2s per iteration.
2026-03-06 16:52:17 +03:00
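The third fix swaps a decrementing accumulator for an absolute deadline on the monotonic clock. A minimal sketch of that pattern follows; the function name, return values, and the `should_interrupt` hook are assumptions, not the code in `tools/environments/daytona.py`.

```python
import threading
import time
from typing import Callable


def wait_with_deadline(
    thread: threading.Thread,
    timeout: float,
    poll: float = 0.2,
    should_interrupt: Callable[[], bool] = lambda: False,
) -> str:
    """Wait for thread to finish, timing out against a fixed monotonic deadline."""
    deadline = time.monotonic() + timeout  # computed once, never decremented
    while thread.is_alive():
        if should_interrupt():
            return "interrupted"
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return "timeout"
        # Slow iterations cannot stretch the total wait, because the
        # remaining time is re-derived from the clock on every pass.
        thread.join(min(poll, remaining))
    return "done"
```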
Himess 453e0677d6 fix: use regex for search output parsing to handle Windows drive-letter paths
The ripgrep/grep output parser uses `split(':', 2)` to extract
file:lineno:content from match lines. On Windows, absolute paths
contain a drive letter colon (e.g. `C:\Users\foo\bar.py:42:content`),
so `split(':', 2)` produces `["C", "\Users\...", "42:content"]`.
`int(parts[1])` then raises ValueError and the match is silently
dropped. All search results are lost on Windows.

Same category as #390 — string-based path parsing that fails on
Windows. Replace `split()` with a regex that optionally captures
the drive letter prefix: `^([A-Za-z]:)?(.*?):(\d+):(.*)$`.

Applied to both `_search_with_rg` and `_search_with_grep`.
2026-03-06 15:54:33 +03:00
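The regex quoted in the commit message can be exercised on its own. The wrapper function below is an illustrative assumption; only the pattern is taken from the commit.

```python
import re
from typing import Optional, Tuple

# Pattern from the commit message: optional drive letter, lazy path,
# line number, then the matched content.
MATCH_LINE = re.compile(r"^([A-Za-z]:)?(.*?):(\d+):(.*)$")


def parse_match_line(line: str) -> Optional[Tuple[str, int, str]]:
    """Split one rg/grep output line into (path, line number, content)."""
    match = MATCH_LINE.match(line)
    if match is None:
        return None
    drive, rest, lineno, content = match.groups()
    # Re-attach the drive prefix so Windows paths come back intact.
    return (drive or "") + rest, int(lineno), content
```

Because the path group is lazy, colons inside the matched content stay in the content field on both Windows and Unix paths.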
Himess 32dbd31b9a fix: restrict .env file permissions to owner-only
save_env_value() writes API keys to ~/.hermes/.env but never sets file
permissions, leaving the file world-readable (0644). auth.py already
restricts auth.json to 0600 — apply the same treatment to .env.

Skipped on Windows where chmod is not effective.
2026-03-06 15:14:26 +03:00
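A short sketch of the permission step, assuming a helper named `restrict_to_owner` (hypothetical; the commit places the logic inside `save_env_value()`).

```python
import os
import stat
import sys
from pathlib import Path


def restrict_to_owner(path: Path) -> None:
    """Limit a secrets file to owner read/write (0600) where chmod applies."""
    if sys.platform == "win32":
        return  # skipped on Windows, as in the commit
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```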
shitcoinsherpa 81986022b7 Add explicit encoding="utf-8" to all config/data file open() calls
On Windows, open() defaults to the system locale encoding (cp1252,
cp1254, etc.) rather than UTF-8. This breaks any file containing
non-ASCII characters, and also causes crashes when writing JSON with
ensure_ascii=False.

This adds encoding="utf-8" to open() calls in:
- gateway/run.py (config.yaml reads/writes throughout)
- gateway/config.py (gateway.json and config.yaml)
- hermes_cli/config.py (config.yaml load/save)
- hermes_cli/main.py (session export with ensure_ascii=False)
- hermes_cli/status.py (jobs.json and sessions.json)
2026-03-05 17:16:04 -05:00
shitcoinsherpa dcba291d45 Use pywinpty instead of ptyprocess on Windows for PTY support
ptyprocess depends on Unix-only APIs (fork, openpty) and cannot work
on Windows at all. pywinpty provides a compatible PtyProcess interface
using the Windows ConPTY API.

This conditionally imports winpty.PtyProcess on Windows and
ptyprocess.PtyProcess on Unix. The pyproject.toml pty extra now uses
platform markers so the correct package is installed automatically.
2026-03-05 17:16:04 -05:00
shitcoinsherpa 48e65631f6 Fix auth store file lock for Windows (msvcrt) with reentrancy support
fcntl is not available on Windows. This adds msvcrt.locking as a
fallback for cross-process advisory locking on Windows.

msvcrt.locking is not reentrant within the same thread, unlike fcntl.flock.
This matters because resolve_codex_runtime_credentials holds the lock and
then calls _save_codex_tokens, which tries to acquire it again. Without
reentrancy tracking, the nested acquisition blocks on Windows until the
15-second lock timeout expires.

Uses threading.local() to track lock depth per thread, allowing nested
acquisitions to pass through without re-acquiring the underlying lock.

Also handles msvcrt-specific requirements: file must be opened in r+ mode
(not a+), must have at least 1 byte of content, and the file pointer must
be at position 0 before locking.
2026-03-05 17:16:03 -05:00
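The per-thread depth counter can be sketched independently of the OS locking call. Here the real lock and unlock steps are passed in as callables so the example runs anywhere; the class name and that injection style are assumptions, not the structure of the auth store code.

```python
import threading
from contextlib import contextmanager
from typing import Callable, Iterator


class ReentrantFileLock:
    """Wrap a non-reentrant lock so nested use in one thread passes through."""

    def __init__(self, acquire: Callable[[], None], release: Callable[[], None]):
        self._acquire = acquire  # e.g. a wrapper around msvcrt.locking / fcntl.flock
        self._release = release
        self._local = threading.local()  # each thread tracks its own depth

    @contextmanager
    def hold(self) -> Iterator[None]:
        depth = getattr(self._local, "depth", 0)
        if depth == 0:
            self._acquire()  # only the outermost entry takes the real lock
        self._local.depth = depth + 1
        try:
            yield
        finally:
            self._local.depth -= 1
            if self._local.depth == 0:
                self._release()  # only the outermost exit releases it
```

A nested `with lock.hold():` inside another `with lock.hold():` on the same thread therefore triggers exactly one acquire and one release.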
0xbyt4 14a11d24b4 fix: handle None args in build_tool_preview
When an LLM returns null/empty tool call arguments, json.loads()
produces None. build_tool_preview then crashes with
"argument of type 'NoneType' is not iterable" on the `in` check.
Return None early when args is falsy.
2026-03-05 23:09:11 +03:00
JackTheGit 71c0cd00e5 docs: fix spelling of 'publicly' 2026-03-05 16:46:21 +00:00
102 changed files with 8498 additions and 591 deletions
+102 -1
@@ -31,7 +31,8 @@ hermes-agent/
│ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
│ ├── commands.py # Slash command definitions + SlashCommandCompleter
│ ├── callbacks.py # Terminal callbacks (clarify, sudo, approval)
│ └── setup.py # Interactive setup wizard
│ ├── setup.py # Interactive setup wizard
│ └── skin_engine.py # Skin/theme engine — CLI visual customization
├── tools/ # Tool implementations (one file per tool)
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
│ ├── approval.py # Dangerous command detection
@@ -121,6 +122,7 @@ Messages follow OpenAI format: `{"role": "system/user/assistant/tool", ...}`. Re
- **Rich** for banner/panels, **prompt_toolkit** for input with autocomplete
- **KawaiiSpinner** (`agent/display.py`) — animated faces during API calls, `┊` activity feed for tool results
- `load_cli_config()` in cli.py merges hardcoded defaults + user config YAML
- **Skin engine** (`hermes_cli/skin_engine.py`) — data-driven CLI theming; initialized from `display.skin` config key at startup; skins customize banner colors, spinner faces/verbs/wings, tool prefix, response box, branding text
- `process_command()` is a method on `HermesCLI` (not in commands.py)
- Skill slash commands: `agent/skill_commands.py` scans `~/.hermes/skills/`, injects as **user message** (not system prompt) to preserve prompt caching
@@ -195,6 +197,94 @@ The registry handles schema collection, dispatch, availability checking, and err
---
## Skin/Theme System
The skin engine (`hermes_cli/skin_engine.py`) provides data-driven CLI visual customization. Skins are **pure data** — no code changes needed to add a new skin.
### Architecture
```
hermes_cli/skin_engine.py # SkinConfig dataclass, built-in skins, YAML loader
~/.hermes/skins/*.yaml # User-installed custom skins (drop-in)
```
- `init_skin_from_config()` — called at CLI startup, reads `display.skin` from config
- `get_active_skin()` — returns cached `SkinConfig` for the current skin
- `set_active_skin(name)` — switches skin at runtime (used by `/skin` command)
- `load_skin(name)` — loads from user skins first, then built-ins, then falls back to default
- Missing skin values inherit from the `default` skin automatically
### What skins customize
| Element | Skin Key | Used By |
|---------|----------|---------|
| Banner panel border | `colors.banner_border` | `banner.py` |
| Banner panel title | `colors.banner_title` | `banner.py` |
| Banner section headers | `colors.banner_accent` | `banner.py` |
| Banner dim text | `colors.banner_dim` | `banner.py` |
| Banner body text | `colors.banner_text` | `banner.py` |
| Response box border | `colors.response_border` | `cli.py` |
| Spinner faces (waiting) | `spinner.waiting_faces` | `display.py` |
| Spinner faces (thinking) | `spinner.thinking_faces` | `display.py` |
| Spinner verbs | `spinner.thinking_verbs` | `display.py` |
| Spinner wings (optional) | `spinner.wings` | `display.py` |
| Tool output prefix | `tool_prefix` | `display.py` |
| Agent name | `branding.agent_name` | `banner.py`, `cli.py` |
| Welcome message | `branding.welcome` | `cli.py` |
| Response box label | `branding.response_label` | `cli.py` |
| Prompt symbol | `branding.prompt_symbol` | `cli.py` |
### Built-in skins
- `default` — Classic Hermes gold/kawaii (the current look)
- `ares` — Crimson/bronze war-god theme with custom spinner wings
- `mono` — Clean grayscale monochrome
- `slate` — Cool blue developer-focused theme
### Adding a built-in skin
Add to `_BUILTIN_SKINS` dict in `hermes_cli/skin_engine.py`:
```python
"mytheme": {
"name": "mytheme",
"description": "Short description",
"colors": { ... },
"spinner": { ... },
"branding": { ... },
"tool_prefix": "",
},
```
### User skins (YAML)
Users create `~/.hermes/skins/<name>.yaml`:
```yaml
name: cyberpunk
description: Neon-soaked terminal theme
colors:
banner_border: "#FF00FF"
banner_title: "#00FFFF"
banner_accent: "#FF1493"
spinner:
thinking_verbs: ["jacking in", "decrypting", "uploading"]
wings:
- ["⟨⚡", "⚡⟩"]
branding:
agent_name: "Cyber Agent"
response_label: " ⚡ Cyber "
tool_prefix: "▏"
```
Activate with `/skin cyberpunk` or `display.skin: cyberpunk` in config.yaml.
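The rule that missing skin values inherit from the `default` skin amounts to a recursive dict merge. The sketch below shows that idea with made-up default values; `merge_skin` and `DEFAULT_SKIN` are hypothetical names and do not reflect how `skin_engine.py` actually builds a `SkinConfig`.

```python
from copy import deepcopy
from typing import Any, Dict

# Placeholder defaults for illustration only; not the real default skin.
DEFAULT_SKIN: Dict[str, Any] = {
    "colors": {"banner_border": "#111111", "banner_title": "#222222"},
    "spinner": {"thinking_verbs": ["thinking"]},
    "tool_prefix": "|",
}


def merge_skin(default: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return default with override layered on top, section by section."""
    merged = deepcopy(default)  # never mutate the shared default skin
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_skin(merged[key], value)  # merge nested sections
        else:
            merged[key] = value  # scalars and lists replace the default
    return merged
```

A user skin that sets only `colors.banner_border` and `tool_prefix` would keep the default `banner_title` and spinner settings.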
---
## Important Policies
### Prompt Caching Must Not Break
@@ -210,6 +300,17 @@ Cache-breaking forces dramatically higher costs. The ONLY time we alter context
- **CLI**: Uses current directory (`.` → `os.getcwd()`)
- **Messaging**: Uses `MESSAGING_CWD` env var (default: home directory)
### Background Process Notifications (Gateway)
When `terminal(background=true, check_interval=...)` is used, the gateway runs a watcher that
pushes status updates to the user's chat. Control verbosity with `display.background_process_notifications`
in config.yaml (or `HERMES_BACKGROUND_NOTIFICATIONS` env var):
- `all` — running-output updates + final message (default)
- `result` — only the final completion message
- `error` — only the final message when exit code != 0
- `off` — no watcher messages at all
---
## Known Pitfalls
+52 -1
@@ -139,7 +139,8 @@ hermes-agent/
│ ├── commands.py # Slash command definitions + autocomplete
│ ├── callbacks.py # Interactive callbacks (clarify, sudo, approval)
│ ├── doctor.py # Diagnostics
│ └── skills_hub.py # Skills Hub CLI + /skills slash command
│ ├── skills_hub.py # Skills Hub CLI + /skills slash command
│ └── skin_engine.py # Skin/theme engine — data-driven CLI visual customization
├── tools/ # Tool implementations (self-registering)
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
@@ -375,6 +376,56 @@ If the field is omitted or empty, the skill loads on all platforms (backward com
---
## Adding a Skin / Theme
Hermes uses a data-driven skin system — no code changes needed to add a new skin.
**Option A: User skin (YAML file)**
Create `~/.hermes/skins/<name>.yaml`:
```yaml
name: mytheme
description: Short description of the theme
colors:
banner_border: "#HEX" # Panel border color
banner_title: "#HEX" # Panel title color
banner_accent: "#HEX" # Section header color
banner_dim: "#HEX" # Muted/dim text color
banner_text: "#HEX" # Body text color
response_border: "#HEX" # Response box border
spinner:
waiting_faces: ["(⚔)", "(⛨)"]
thinking_faces: ["(⚔)", "(⌁)"]
thinking_verbs: ["forging", "plotting"]
wings: # Optional left/right decorations
- ["⟪⚔", "⚔⟫"]
branding:
agent_name: "My Agent"
welcome: "Welcome message"
response_label: " ⚔ Agent "
prompt_symbol: "⚔ "
tool_prefix: "╎" # Tool output line prefix
```
All fields are optional — missing values inherit from the default skin.
**Option B: Built-in skin**
Add to `_BUILTIN_SKINS` dict in `hermes_cli/skin_engine.py`. Use the same schema as above but as a Python dict. Built-in skins ship with the package and are always available.
**Activating:**
- CLI: `/skin mytheme` or set `display.skin: mytheme` in config.yaml
- Config: `display: { skin: mytheme }`
See `hermes_cli/skin_engine.py` for the full schema and existing skins as examples.
---
## Cross-Platform Compatibility
Hermes runs on Linux, macOS, and Windows. When writing code that touches the OS:
+74 -6
@@ -5,8 +5,8 @@ Used by AIAgent._execute_tool_calls for CLI feedback.
"""
import json
import logging
import os
import random
import sys
import threading
import time
@@ -15,6 +15,49 @@ import time
_RED = "\033[31m"
_RESET = "\033[0m"
logger = logging.getLogger(__name__)
# =========================================================================
# Skin-aware helpers (lazy import to avoid circular deps)
# =========================================================================
def _get_skin():
"""Get the active skin config, or None if not available."""
try:
from hermes_cli.skin_engine import get_active_skin
return get_active_skin()
except Exception:
return None
def get_skin_faces(key: str, default: list) -> list:
"""Get spinner face list from active skin, falling back to default."""
skin = _get_skin()
if skin:
faces = skin.get_spinner_list(key)
if faces:
return faces
return default
def get_skin_verbs() -> list:
"""Get thinking verbs from active skin."""
skin = _get_skin()
if skin:
verbs = skin.get_spinner_list("thinking_verbs")
if verbs:
return verbs
return KawaiiSpinner.THINKING_VERBS
def get_skin_tool_prefix() -> str:
"""Get tool output prefix character from active skin."""
skin = _get_skin()
if skin:
return skin.tool_prefix
return ""
# =========================================================================
# Tool preview (one-line summary of a tool call's primary argument)
@@ -22,6 +65,8 @@ _RESET = "\033[0m"
def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str:
"""Build a short preview of a tool call's primary argument for display."""
if not args:
return None
primary_args = {
"terminal": "command", "web_search": "query", "web_extract": "urls",
"read_file": "path", "write_file": "path", "patch": "path",
@@ -163,6 +208,7 @@ class KawaiiSpinner:
self.frame_idx = 0
self.start_time = None
self.last_line_len = 0
self._last_flush_time = 0.0 # Rate-limit flushes for patch_stdout compat
# Capture stdout NOW, before any redirect_stdout(devnull) from
# child agents can replace sys.stdout with a black hole.
self._out = sys.stdout
@@ -177,15 +223,34 @@ class KawaiiSpinner:
pass
def _animate(self):
# Cache skin wings at start (avoid per-frame imports)
skin = _get_skin()
wings = skin.get_spinner_wings() if skin else []
while self.running:
if os.getenv("HERMES_SPINNER_PAUSE"):
time.sleep(0.1)
continue
frame = self.spinner_frames[self.frame_idx % len(self.spinner_frames)]
elapsed = time.time() - self.start_time
line = f" {frame} {self.message} ({elapsed:.1f}s)"
if wings:
left, right = wings[self.frame_idx % len(wings)]
line = f" {left} {frame} {self.message} {right} ({elapsed:.1f}s)"
else:
line = f" {frame} {self.message} ({elapsed:.1f}s)"
pad = max(self.last_line_len - len(line), 0)
self._write(f"\r{line}{' ' * pad}", end='', flush=True)
# Rate-limit flush() calls to avoid spinner spam under
# prompt_toolkit's patch_stdout. Each flush() pushes a queue
# item that may trigger a separate run_in_terminal() call; if
# items are processed one-at-a-time the \r overwrite is lost
# and every frame appears on its own line. By flushing at
# most every 0.4s we guarantee multiple \r-frames are batched
# into a single write, so the terminal collapses them correctly.
now = time.time()
should_flush = (now - self._last_flush_time) >= 0.4
self._write(f"\r{line}{' ' * pad}", end='', flush=should_flush)
if should_flush:
self._last_flush_time = now
self.last_line_len = len(line)
self.frame_idx += 1
time.sleep(0.12)
@@ -300,7 +365,7 @@ def _detect_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]
if exit_code is not None and exit_code != 0:
return True, f" [exit {exit_code}]"
except (json.JSONDecodeError, TypeError, AttributeError):
pass
logger.debug("Could not parse terminal result as JSON for exit code check")
return False, ""
# Memory-specific: distinguish "full" from real errors
@@ -310,7 +375,7 @@ def _detect_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]
if data.get("success") is False and "exceed the limit" in data.get("error", ""):
return True, " [full]"
except (json.JSONDecodeError, TypeError, AttributeError):
pass
logger.debug("Could not parse memory result as JSON for capacity check")
# Generic heuristic for non-terminal tools
lower = result[:500].lower()
@@ -332,6 +397,7 @@ def get_cute_tool_message(
"""
dur = f"{duration:.1f}s"
is_failure, failure_suffix = _detect_tool_failure(tool_name, result)
skin_prefix = get_skin_tool_prefix()
def _trunc(s, n=40):
s = str(s)
@@ -342,7 +408,9 @@ def get_cute_tool_message(
return ("..." + p[-(n-3):]) if len(p) > n else p
def _wrap(line: str) -> str:
"""Append failure suffix when the tool failed."""
"""Apply skin tool prefix and failure suffix."""
if skin_prefix != "":
line = line.replace("", skin_prefix, 1)
if not is_failure:
return line
return f"{line}{failure_suffix}"
+2 -2
@@ -159,8 +159,8 @@ def _read_skill_description(skill_file: Path, max_chars: int = 60) -> str:
if len(desc) > max_chars:
desc = desc[:max_chars - 3] + "..."
return desc
except Exception:
pass
except Exception as e:
logger.debug("Failed to read skill description from %s: %s", skill_file, e)
return ""
-1
@@ -10,7 +10,6 @@ the first 6 and last 4 characters for debuggability.
import logging
import os
import re
from typing import Optional
logger = logging.getLogger(__name__)
+8 -8
@@ -606,7 +606,7 @@ class BatchRunner:
# Create batches
self.batches = self._create_batches()
print(f"📊 Batch Runner Initialized")
print("📊 Batch Runner Initialized")
print(f" Dataset: {self.dataset_file} ({len(self.dataset)} prompts)")
print(f" Batch size: {self.batch_size}")
print(f" Total batches: {len(self.batches)}")
@@ -826,7 +826,7 @@ class BatchRunner:
print("=" * 70)
print(f" Original dataset size: {len(self.dataset):,} prompts")
print(f" Already completed: {len(skipped_indices):,} prompts")
print(f" ─────────────────────────────────────────")
print(" ─────────────────────────────────────────")
print(f" 🎯 RESUMING WITH: {len(filtered_entries):,} prompts")
print(f" New batches created: {len(batches_to_process)}")
print("=" * 70 + "\n")
@@ -888,7 +888,7 @@ class BatchRunner:
]
print(f"✅ Created {len(tasks)} batch tasks")
print(f"🚀 Starting parallel batch processing...\n")
print("🚀 Starting parallel batch processing...\n")
# Use rich Progress for better visual tracking with persistent bottom bar
# redirect_stdout/stderr lets rich manage all output so progress bar stays clean
@@ -1057,7 +1057,7 @@ class BatchRunner:
print(f"✅ Total trajectories in merged file: {total_entries - filtered_entries}")
print(f"✅ Total batch files merged: {batch_files_found}")
print(f"⏱️ Total duration: {round(time.time() - start_time, 2)}s")
print(f"\n📈 Tool Usage Statistics:")
print("\n📈 Tool Usage Statistics:")
print("-" * 70)
if total_tool_stats:
@@ -1084,7 +1084,7 @@ class BatchRunner:
# Print reasoning coverage stats
total_discarded = sum(r.get("discarded_no_reasoning", 0) for r in results)
print(f"\n🧠 Reasoning Coverage:")
print("\n🧠 Reasoning Coverage:")
print("-" * 70)
total_turns = total_reasoning_stats["total_assistant_turns"]
with_reasoning = total_reasoning_stats["turns_with_reasoning"]
@@ -1101,8 +1101,8 @@ class BatchRunner:
print(f" 🚫 Samples discarded (zero reasoning): {total_discarded:,}")
print(f"\n💾 Results saved to: {self.output_dir}")
print(f" - Trajectories: trajectories.jsonl (combined)")
print(f" - Individual batches: batch_*.jsonl (for debugging)")
print(" - Trajectories: trajectories.jsonl (combined)")
print(" - Individual batches: batch_*.jsonl (for debugging)")
print(f" - Statistics: {self.stats_file.name}")
print(f" - Checkpoint: {self.checkpoint_file.name}")
@@ -1238,7 +1238,7 @@ def main(
with open(prefill_messages_file, 'r', encoding='utf-8') as f:
prefill_messages = json.load(f)
if not isinstance(prefill_messages, list):
print(f"❌ Error: prefill_messages_file must contain a JSON array of messages")
print("❌ Error: prefill_messages_file must contain a JSON array of messages")
return
print(f"💬 Loaded {len(prefill_messages)} prefill messages from {prefill_messages_file}")
except Exception as e:
+60 -5
@@ -11,6 +11,7 @@ model:
# Inference provider selection:
# "auto" - Use Nous Portal if logged in, otherwise OpenRouter/env vars (default)
# "nous-api" - Use Nous Portal via API key (requires: NOUS_API_KEY)
# "openrouter" - Always use OpenRouter API key from OPENROUTER_API_KEY
# "nous" - Always use Nous Portal (requires: hermes login)
# "zai" - Use z.ai / ZhipuAI GLM models (requires: GLM_API_KEY)
@@ -402,11 +403,13 @@ agent:
# discord: [web, vision, skills, todo]
#
# If not set, defaults are:
# cli: hermes-cli (everything + cronjob management)
# telegram: hermes-telegram (terminal, file, web, vision, image, tts, browser, skills, todo, cronjob, messaging)
# discord: hermes-discord (same as telegram)
# whatsapp: hermes-whatsapp (same as telegram)
# slack: hermes-slack (same as telegram)
# cli: hermes-cli (everything + cronjob management)
# telegram: hermes-telegram (terminal, file, web, vision, image, tts, browser, skills, todo, cronjob, messaging)
# discord: hermes-discord (same as telegram)
# whatsapp: hermes-whatsapp (same as telegram)
# slack: hermes-slack (same as telegram)
# signal: hermes-signal (same as telegram)
# homeassistant: hermes-homeassistant (same as telegram)
#
platform_toolsets:
cli: [hermes-cli]
@@ -414,6 +417,8 @@ platform_toolsets:
discord: [hermes-discord]
whatsapp: [hermes-whatsapp]
slack: [hermes-slack]
signal: [hermes-signal]
homeassistant: [hermes-homeassistant]
# ─────────────────────────────────────────────────────────────────────────────
# Available toolsets (use these names in platform_toolsets or the toolsets list)
@@ -651,7 +656,57 @@ display:
# Toggle at runtime with /verbose in the CLI
tool_progress: all
# Background process notifications (gateway/messaging only).
# Controls how chatty the process watcher is when you use
# terminal(background=true, check_interval=...) from Telegram/Discord/etc.
# off: No watcher messages at all
# result: Only the final completion message
# error: Only the final message when exit code != 0
# all: Running output updates + final message (default)
background_process_notifications: all
# Play terminal bell when agent finishes a response.
# Useful for long-running tasks — your terminal will ding when the agent is done.
# Works over SSH. Most terminals can be configured to flash the taskbar or play a sound.
bell_on_complete: false
# ───────────────────────────────────────────────────────────────────────────
# Skin / Theme
# ───────────────────────────────────────────────────────────────────────────
# Customize CLI visual appearance — banner colors, spinner faces, tool prefix,
# response box label, and branding text. Change at runtime with /skin <name>.
#
# Built-in skins:
# default — Classic Hermes gold/kawaii
# ares — Crimson/bronze war-god theme with spinner wings
# mono — Clean grayscale monochrome
# slate — Cool blue developer-focused
#
# Custom skins: drop a YAML file in ~/.hermes/skins/<name>.yaml
# Schema (all fields optional, missing values inherit from default):
#
# name: my-theme
# description: Short description
# colors:
# banner_border: "#HEX" # Panel border
# banner_title: "#HEX" # Panel title
# banner_accent: "#HEX" # Section headers (Available Tools, etc.)
# banner_dim: "#HEX" # Dim/muted text
# banner_text: "#HEX" # Body text (tool names, skill names)
# ui_accent: "#HEX" # UI accent color
# response_border: "#HEX" # Response box border color
# spinner:
# waiting_faces: ["(⚔)", "(⛨)"] # Faces shown while waiting
# thinking_faces: ["(⚔)", "(⌁)"] # Faces shown while thinking
# thinking_verbs: ["forging", "plotting"] # Verbs for spinner messages
# wings: # Optional left/right spinner decorations
# - ["⟪⚔", "⚔⟫"]
# - ["⟪▲", "▲⟫"]
# branding:
# agent_name: "My Agent" # Banner title and branding
# welcome: "Welcome message" # Shown at CLI startup
# response_label: " ⚔ Agent " # Response box header label
# prompt_symbol: "⚔ " # Prompt symbol
# tool_prefix: "╎" # Tool output line prefix (default: ┊)
#
skin: default
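The schema comment says every field is optional and missing values inherit from the default skin. A minimal sketch of that inheritance rule follows, assuming a one-level merge per section. `DEFAULT_SKIN` and `merge_skin` are hypothetical names, not the skin engine's API; the default values are the fallbacks visible in the CLI hunks of this diff.

```python
from copy import deepcopy

# Hypothetical defaults, taken from fallback values shown elsewhere in this diff.
DEFAULT_SKIN = {
    "name": "default",
    "colors": {
        "banner_border": "#CD7F32",
        "banner_title": "#FFD700",
        "banner_accent": "#FFBF00",
    },
    "branding": {"agent_name": "Hermes Agent", "tool_prefix": "┊"},
}


def merge_skin(custom, base=DEFAULT_SKIN):
    """Return a copy of `base` with the fields present in `custom` overriding it."""
    merged = deepcopy(base)
    for key, value in custom.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key].update(value)   # override only the keys the skin sets
        else:
            merged[key] = value
    return merged


if __name__ == "__main__":
    skin = merge_skin({"name": "my-theme", "colors": {"banner_border": "#112233"}})
    print(skin["colors"])
```

A custom skin that sets only `banner_border` keeps the default title and accent colors and the default tool prefix.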
+337 -80
@@ -19,6 +19,7 @@ import sys
import json
import atexit
import uuid
import textwrap
from pathlib import Path
from datetime import datetime
from typing import List, Dict, Any, Optional
@@ -45,6 +46,11 @@ from prompt_toolkit.widgets import TextArea
from prompt_toolkit.key_binding import KeyBindings
from prompt_toolkit import print_formatted_text as _pt_print
from prompt_toolkit.formatted_text import ANSI as _PT_ANSI
try:
from prompt_toolkit.cursor_shapes import CursorShape
_STEADY_CURSOR = CursorShape.BLOCK # Non-blinking block cursor
except (ImportError, AttributeError):
_STEADY_CURSOR = None
import threading
import queue
@@ -196,6 +202,7 @@ def load_cli_config() -> Dict[str, Any]:
"display": {
"compact": False,
"resume_display": "full",
"skin": "default",
},
"clarify": {
"timeout": 120, # Seconds to wait for a clarify answer before auto-proceeding
@@ -250,8 +257,13 @@ def load_cli_config() -> Dict[str, Any]:
if key not in defaults and key != "model":
defaults[key] = file_config[key]
# Handle root-level max_turns (backwards compat) - copy to agent.max_turns
if "max_turns" in file_config and "agent" not in file_config:
# Handle legacy root-level max_turns (backwards compat) - copy to
# agent.max_turns whenever the nested key is missing.
agent_file_config = file_config.get("agent")
if "max_turns" in file_config and not (
isinstance(agent_file_config, dict)
and agent_file_config.get("max_turns") is not None
):
defaults["agent"]["max_turns"] = file_config["max_turns"]
except Exception as e:
logger.warning("Failed to load cli-config.yaml: %s", e)
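The precedence this hunk introduces can be restated as a standalone function. This is a sketch for illustration only: `resolve_max_turns` and `default_max_turns` are hypothetical names, and the real code mutates `defaults["agent"]` in place rather than returning a value.

```python
def resolve_max_turns(file_config, default_max_turns):
    """Prefer agent.max_turns, then the legacy root-level key, then the default."""
    agent_cfg = file_config.get("agent")
    if isinstance(agent_cfg, dict) and agent_cfg.get("max_turns") is not None:
        return agent_cfg["max_turns"]
    if "max_turns" in file_config:
        return file_config["max_turns"]   # legacy root-level setting
    return default_max_turns


if __name__ == "__main__":
    # The old check skipped the legacy key whenever an "agent" section existed at all.
    print(resolve_max_turns({"max_turns": 30, "agent": {"verbose": True}}, 90))
```

The case the old condition missed is a config with a root-level `max_turns` and an `agent` section that sets other keys but not `max_turns`.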
@@ -377,6 +389,13 @@ def load_cli_config() -> Dict[str, Any]:
# Load configuration at module startup
CLI_CONFIG = load_cli_config()
# Initialize the skin engine from config
try:
from hermes_cli.skin_engine import init_skin_from_config
init_skin_from_config(CLI_CONFIG)
except Exception:
pass # Skin engine is optional — default skin used if unavailable
from rich.console import Console
from rich.panel import Panel
from rich.table import Table
@@ -695,6 +714,8 @@ class ChatConsole:
def print(self, *args, **kwargs):
self._buffer.seek(0)
self._buffer.truncate()
# Read terminal width at render time so panels adapt to current size
self._inner.width = shutil.get_terminal_size((80, 24)).columns
self._inner.print(*args, **kwargs)
output = self._buffer.getvalue()
for line in output.rstrip("\n").split("\n"):
@@ -828,25 +849,43 @@ def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dic
layout_table.add_column("right", justify="left")
# Build left content: caduceus + model info
left_lines = ["", HERMES_CADUCEUS, ""]
# Resolve skin colors for the banner
try:
from hermes_cli.skin_engine import get_active_skin
_bskin = get_active_skin()
_accent = _bskin.get_color("banner_accent", "#FFBF00")
_dim = _bskin.get_color("banner_dim", "#B8860B")
_text = _bskin.get_color("banner_text", "#FFF8DC")
_session_c = _bskin.get_color("session_border", "#8B8682")
_title_c = _bskin.get_color("banner_title", "#FFD700")
_border_c = _bskin.get_color("banner_border", "#CD7F32")
_agent_name = _bskin.get_branding("agent_name", "Hermes Agent")
except Exception:
_bskin = None
_accent, _dim, _text = "#FFBF00", "#B8860B", "#FFF8DC"
_session_c, _title_c, _border_c = "#8B8682", "#FFD700", "#CD7F32"
_agent_name = "Hermes Agent"
_hero = getattr(_bskin, "banner_hero", None) or HERMES_CADUCEUS
left_lines = ["", _hero, ""]
# Shorten model name for display
model_short = model.split("/")[-1] if "/" in model else model
if len(model_short) > 28:
model_short = model_short[:25] + "..."
ctx_str = f" [dim #B8860B]·[/] [dim #B8860B]{_format_context_length(context_length)} context[/]" if context_length else ""
left_lines.append(f"[#FFBF00]{model_short}[/]{ctx_str} [dim #B8860B]·[/] [dim #B8860B]Nous Research[/]")
left_lines.append(f"[dim #B8860B]{cwd}[/]")
ctx_str = f" [dim {_dim}]·[/] [dim {_dim}]{_format_context_length(context_length)} context[/]" if context_length else ""
left_lines.append(f"[{_accent}]{model_short}[/]{ctx_str} [dim {_dim}]·[/] [dim {_dim}]Nous Research[/]")
left_lines.append(f"[dim {_dim}]{cwd}[/]")
# Add session ID if provided
if session_id:
left_lines.append(f"[dim #8B8682]Session: {session_id}[/]")
left_lines.append(f"[dim {_session_c}]Session: {session_id}[/]")
left_content = "\n".join(left_lines)
# Build right content: tools list grouped by toolset
right_lines = []
right_lines.append("[bold #FFBF00]Available Tools[/]")
right_lines.append(f"[bold {_accent}]Available Tools[/]")
# Group tools by toolset (include all possible tools, both enabled and disabled)
toolsets_dict = {}
@@ -883,7 +922,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dic
if name in disabled_tools:
colored_names.append(f"[red]{name}[/]")
else:
colored_names.append(f"[#FFF8DC]{name}[/]")
colored_names.append(f"[{_text}]{name}[/]")
tools_str = ", ".join(colored_names)
# Truncate if too long (accounting for markup)
@@ -905,18 +944,18 @@ def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dic
elif name in disabled_tools:
colored_names.append(f"[red]{name}[/]")
else:
colored_names.append(f"[#FFF8DC]{name}[/]")
colored_names.append(f"[{_text}]{name}[/]")
tools_str = ", ".join(colored_names)
right_lines.append(f"[dim #B8860B]{toolset}:[/] {tools_str}")
right_lines.append(f"[dim {_dim}]{toolset}:[/] {tools_str}")
if remaining_toolsets > 0:
right_lines.append(f"[dim #B8860B](and {remaining_toolsets} more toolsets...)[/]")
right_lines.append(f"[dim {_dim}](and {remaining_toolsets} more toolsets...)[/]")
right_lines.append("")
# Add skills section
right_lines.append("[bold #FFBF00]Available Skills[/]")
right_lines.append(f"[bold {_accent}]Available Skills[/]")
skills_by_category = _get_available_skills()
total_skills = sum(len(s) for s in skills_by_category.values())
@@ -932,12 +971,12 @@ def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dic
# Truncate if still too long
if len(skills_str) > 50:
skills_str = skills_str[:47] + "..."
right_lines.append(f"[dim #B8860B]{category}:[/] [#FFF8DC]{skills_str}[/]")
right_lines.append(f"[dim {_dim}]{category}:[/] [{_text}]{skills_str}[/]")
else:
right_lines.append("[dim #B8860B]No skills installed[/]")
right_lines.append(f"[dim {_dim}]No skills installed[/]")
right_lines.append("")
right_lines.append(f"[dim #B8860B]{len(tools)} tools · {total_skills} skills · /help for commands[/]")
right_lines.append(f"[dim {_dim}]{len(tools)} tools · {total_skills} skills · /help for commands[/]")
right_content = "\n".join(right_lines)
@@ -947,16 +986,17 @@ def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dic
# Wrap in a panel with the title
outer_panel = Panel(
layout_table,
title=f"[bold #FFD700]Hermes Agent {VERSION}[/]",
border_style="#CD7F32",
title=f"[bold {_title_c}]{_agent_name} {VERSION}[/]",
border_style=_border_c,
padding=(0, 2),
)
# Print the big HERMES-AGENT logo — skip if terminal is too narrow
# Print the big logo — use skin's custom logo if available
console.print()
term_width = shutil.get_terminal_size().columns
if term_width >= 95:
console.print(HERMES_AGENT_LOGO)
_logo = getattr(_bskin, "banner_logo", None) or HERMES_AGENT_LOGO
console.print(_logo)
console.print()
# Print the panel with caduceus and info
@@ -1045,6 +1085,7 @@ class HermesCLI:
verbose: bool = False,
compact: bool = False,
resume: str = None,
checkpoints: bool = False,
):
"""
Initialize the Hermes CLI.
@@ -1126,6 +1167,13 @@ class HermesCLI:
if invalid:
self.console.print(f"[bold red]Warning: Unknown toolsets: {', '.join(invalid)}[/]")
# Filesystem checkpoints: CLI flag > config
cp_cfg = CLI_CONFIG.get("checkpoints", {})
if isinstance(cp_cfg, bool):
cp_cfg = {"enabled": cp_cfg}
self.checkpoints_enabled = checkpoints or cp_cfg.get("enabled", False)
self.checkpoint_max_snapshots = cp_cfg.get("max_snapshots", 50)
# Ephemeral system prompt: env var takes precedence, then config
self.system_prompt = (
os.getenv("HERMES_EPHEMERAL_SYSTEM_PROMPT", "")
@@ -1187,6 +1235,7 @@ class HermesCLI:
# History file for persistent input recall across sessions
self._history_file = Path.home() / ".hermes_history"
self._last_invalidate: float = 0.0 # throttle UI repaints
self._spinner_text: str = "" # thinking spinner text for TUI
def _invalidate(self, min_interval: float = 0.25) -> None:
"""Throttled UI repaint — prevents terminal blinking on slow/SSH connections."""
@@ -1250,6 +1299,11 @@ class HermesCLI:
return changed
def _on_thinking(self, text: str) -> None:
"""Called by agent when thinking starts/stops. Updates TUI spinner."""
self._spinner_text = text or ""
self._invalidate()
def _ensure_runtime_credentials(self) -> bool:
"""
Ensure runtime credentials are resolved before agent use.
@@ -1388,6 +1442,9 @@ class HermesCLI:
clarify_callback=self._clarify_callback,
honcho_session_key=self.session_id,
fallback_model=self._fallback_model,
thinking_callback=self._on_thinking,
checkpoints_enabled=self.checkpoints_enabled,
checkpoint_max_snapshots=self.checkpoint_max_snapshots,
)
# Apply any pending title now that the session exists in the DB
if self._pending_title and self._session_db:
@@ -1657,6 +1714,55 @@ class HermesCLI:
self._image_counter -= 1
return False
def _handle_rollback_command(self, command: str):
"""Handle /rollback — list or restore filesystem checkpoints."""
from tools.checkpoint_manager import CheckpointManager, format_checkpoint_list
if not hasattr(self, 'agent') or not self.agent:
print(" No active agent session.")
return
mgr = self.agent._checkpoint_mgr
if not mgr.enabled:
print(" Checkpoints are not enabled.")
print(" Enable with: hermes --checkpoints")
print(" Or in config.yaml: checkpoints: { enabled: true }")
return
cwd = os.getenv("TERMINAL_CWD", os.getcwd())
parts = command.split(maxsplit=1)
arg = parts[1].strip() if len(parts) > 1 else ""
if not arg:
# List checkpoints
checkpoints = mgr.list_checkpoints(cwd)
print(format_checkpoint_list(checkpoints, cwd))
else:
# Restore by number or hash
checkpoints = mgr.list_checkpoints(cwd)
if not checkpoints:
print(f" No checkpoints found for {cwd}")
return
target_hash = None
try:
idx = int(arg) - 1 # 1-indexed for user
if 0 <= idx < len(checkpoints):
target_hash = checkpoints[idx]["hash"]
else:
print(f" Invalid checkpoint number. Use 1-{len(checkpoints)}.")
return
except ValueError:
# Try as a git hash
target_hash = arg
result = mgr.restore(cwd, target_hash)
if result["success"]:
print(f" ✅ Restored to checkpoint {result['restored_to']}: {result['reason']}")
print(" A pre-rollback snapshot was saved automatically.")
else:
print(f"{result['error']}")
def _handle_paste_command(self):
"""Handle /paste — explicitly check clipboard for an image.
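The argument handling in `/rollback` (a 1-indexed number, otherwise a hash) can be isolated as below. This is a sketch with a hypothetical helper name, `resolve_checkpoint`; it assumes only that each checkpoint is a dict with a `"hash"` key, as the hunk uses.

```python
def resolve_checkpoint(arg, checkpoints):
    """Map a /rollback argument to a checkpoint hash, or None if out of range."""
    try:
        idx = int(arg) - 1              # users count from 1
    except ValueError:
        return arg                      # not a number: treat it as a hash
    if 0 <= idx < len(checkpoints):
        return checkpoints[idx]["hash"]
    return None                         # numeric but outside 1..len(checkpoints)


if __name__ == "__main__":
    cps = [{"hash": "abc1234"}, {"hash": "def5678"}]
    print(resolve_checkpoint("2", cps), resolve_checkpoint("abc1234", cps))
```

One consequence of trying `int()` first: an abbreviated hash made up only of digits is read as a list position, not as a hash.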
@@ -2666,6 +2772,10 @@ class HermesCLI:
self._handle_paste_command()
elif cmd_lower == "/reload-mcp":
self._reload_mcp()
elif cmd_lower.startswith("/rollback"):
self._handle_rollback_command(cmd_original)
elif cmd_lower.startswith("/skin"):
self._handle_skin_command(cmd_original)
else:
# Check for skill slash commands (/gif-search, /axolotl, etc.)
base_cmd = cmd_lower.split()[0]
@@ -2685,6 +2795,43 @@ class HermesCLI:
return True
def _handle_skin_command(self, cmd: str):
"""Handle /skin [name] — show or change the display skin."""
try:
from hermes_cli.skin_engine import list_skins, set_active_skin, get_active_skin_name
except ImportError:
print("Skin engine not available.")
return
parts = cmd.strip().split(maxsplit=1)
if len(parts) < 2 or not parts[1].strip():
# Show current skin and list available
current = get_active_skin_name()
skins = list_skins()
print(f"\n Current skin: {current}")
print(" Available skins:")
for s in skins:
marker = "" if s["name"] == current else " "
source = f" ({s['source']})" if s["source"] == "user" else ""
print(f" {marker} {s['name']}{source}{s['description']}")
print("\n Usage: /skin <name>")
print(" Custom skins: drop a YAML file in ~/.hermes/skins/\n")
return
new_skin = parts[1].strip().lower()
available = {s["name"] for s in list_skins()}
if new_skin not in available:
print(f" Unknown skin: {new_skin}")
print(f" Available: {', '.join(sorted(available))}")
return
set_active_skin(new_skin)
if save_config_value("display.skin", new_skin):
print(f" Skin set to: {new_skin} (saved)")
else:
print(f" Skin set to: {new_skin}")
print(" Note: banner colors will update on next session start.")
def _toggle_verbose(self):
"""Cycle tool progress mode: off → new → all → verbose → off."""
cycle = ["off", "new", "all", "verbose"]
@@ -2933,8 +3080,16 @@ class HermesCLI:
# Trigger prompt_toolkit repaint from this (non-main) thread
self._invalidate()
# Poll in 1-second ticks so the countdown refreshes in the UI.
# Each tick triggers an invalidate() to repaint the hint line.
# Poll for the user's response. The countdown in the hint line
# updates on each invalidate — but frequent repaints cause visible
# flicker in some terminals (Kitty, ghostty). We only refresh the
# countdown every 5 s; selection changes (↑/↓) trigger instant
# repaints via the key bindings.
_last_countdown_refresh = _time.monotonic()
while True:
try:
result = response_queue.get(timeout=1)
@@ -2944,8 +3099,14 @@ class HermesCLI:
remaining = self._clarify_deadline - _time.monotonic()
if remaining <= 0:
break
# Repaint so the countdown updates
self._invalidate()
# Only repaint every 5 s for the countdown — avoids flicker
now = _time.monotonic()
if now - _last_countdown_refresh >= 5.0:
_last_countdown_refresh = now
self._invalidate()
# Timed out — tear down the UI and let the agent decide
self._clarify_state = None
@@ -3025,6 +3186,9 @@ class HermesCLI:
self._invalidate()
# Same throttled countdown as _clarify_callback — repaint only
# every 5 s to avoid flicker in Kitty / ghostty / etc.
_last_countdown_refresh = _time.monotonic()
while True:
try:
result = response_queue.get(timeout=1)
@@ -3036,11 +3200,16 @@ class HermesCLI:
remaining = self._approval_deadline - _time.monotonic()
if remaining <= 0:
break
self._invalidate()
now = _time.monotonic()
if now - _last_countdown_refresh >= 5.0:
_last_countdown_refresh = now
self._invalidate()
self._approval_state = None
self._approval_deadline = 0
self._invalidate()
return "deny"
def chat(self, message, images: list = None) -> Optional[str]:
"""
Send a message to the agent and get a response.
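The throttled countdown used by both the clarify and approval callbacks follows one pattern: poll the queue on a short tick, but repaint only every few seconds. A minimal sketch of that pattern is below; the function name and the `repaint`, `poll`, and `clock` parameters are hypothetical, introduced so the loop can be run without a TUI.

```python
import queue
import time


def wait_with_throttled_repaint(response_queue, deadline, repaint,
                                refresh_every=5.0, poll=1.0, clock=time.monotonic):
    """Return the queued answer, or None once `deadline` has passed."""
    last_refresh = clock()
    while True:
        try:
            return response_queue.get(timeout=poll)
        except queue.Empty:
            now = clock()
            if deadline - now <= 0:
                return None             # timed out; the caller decides what to do
            if now - last_refresh >= refresh_every:
                last_refresh = now
                repaint()               # countdown refresh, at most once per interval


if __name__ == "__main__":
    q = queue.Queue()
    q.put("once")
    print(wait_with_throttled_repaint(q, time.monotonic() + 10, lambda: None))
```

Polling stays at one-second granularity so an answer is picked up promptly, while the repaint rate is decoupled from it.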
@@ -3079,8 +3248,7 @@ class HermesCLI:
# Add user message to history
self.conversation_history.append({"role": "user", "content": message})
w = shutil.get_terminal_size().columns
_cprint(f"{_GOLD}{'' * w}{_RST}")
_cprint(f"{_GOLD}{'' * 40}{_RST}")
print(flush=True)
try:
@@ -3155,15 +3323,25 @@ class HermesCLI:
response = response + "\n\n---\n_[Interrupted - processing new message]_"
if response:
w = shutil.get_terminal_size().columns
label = " ⚕ Hermes "
fill = w - 2 - len(label) # 2 for ╭ and ╮
top = f"{_GOLD}╭─{label}{'' * max(fill - 1, 0)}{_RST}"
bot = f"{_GOLD}{'' * (w - 2)}{_RST}"
# Use a Rich Panel for the response box — adapts to terminal
# width at render time instead of hard-coding border length.
try:
from hermes_cli.skin_engine import get_active_skin
_skin = get_active_skin()
label = _skin.get_branding("response_label", "⚕ Hermes")
_resp_color = _skin.get_color("response_border", "#CD7F32")
except Exception:
label = "⚕ Hermes"
_resp_color = "#CD7F32"
# Render box + response as a single _cprint call so
# nothing can interleave between the box borders.
_cprint(f"\n{top}\n{response}\n\n{bot}")
_chat_console = ChatConsole()
_chat_console.print(Panel(
response,
title=f"[bold]{label}[/bold]",
title_align="left",
border_style=_resp_color,
padding=(1, 2),
))
# Play terminal bell when agent finishes (if enabled).
# Works over SSH — the bell propagates to the user's terminal.
@@ -3228,7 +3406,15 @@ class HermesCLI:
if self._preload_resumed_session():
self._display_resumed_history()
self.console.print("[#FFF8DC]Welcome to Hermes Agent! Type your message or /help for commands.[/]")
try:
from hermes_cli.skin_engine import get_active_skin
_welcome_skin = get_active_skin()
_welcome_text = _welcome_skin.get_branding("welcome", "Welcome to Hermes Agent! Type your message or /help for commands.")
_welcome_color = _welcome_skin.get_color("banner_text", "#FFF8DC")
except Exception:
_welcome_text = "Welcome to Hermes Agent! Type your message or /help for commands."
_welcome_color = "#FFF8DC"
self.console.print(f"[{_welcome_color}]{_welcome_text}[/]")
self.console.print()
# State for async operation
@@ -3616,6 +3802,8 @@ class HermesCLI:
return "type password (hidden), Enter to skip"
if cli_ref._approval_state:
return ""
if cli_ref._clarify_freetext:
return "type your answer here and press Enter"
if cli_ref._clarify_state:
return ""
if cli_ref._agent_running:
@@ -3666,6 +3854,20 @@ class HermesCLI:
# right up against the top rule of the input area
return 1 if cli_ref._agent_running else 0
def get_spinner_text():
txt = cli_ref._spinner_text
if not txt:
return []
return [('class:hint', f' {txt}')]
def get_spinner_height():
return 1 if cli_ref._spinner_text else 0
spinner_widget = Window(
content=FormattedTextControl(get_spinner_text),
height=get_spinner_height,
)
spacer = Window(
content=FormattedTextControl(get_hint_text),
height=get_hint_height,
@@ -3673,6 +3875,32 @@ class HermesCLI:
# --- Clarify tool: dynamic display widget for questions + choices ---
def _panel_box_width(title: str, content_lines: list[str], min_width: int = 46, max_width: int = 76) -> int:
"""Choose a stable panel width wide enough for the title and content."""
term_cols = shutil.get_terminal_size((100, 20)).columns
longest = max([len(title)] + [len(line) for line in content_lines] + [min_width - 4])
inner = min(max(longest + 4, min_width - 2), max_width - 2, max(24, term_cols - 6))
return inner + 2 # account for the single leading/trailing spaces inside borders
def _wrap_panel_text(text: str, width: int, subsequent_indent: str = "") -> list[str]:
wrapped = textwrap.wrap(
text,
width=max(8, width),
break_long_words=False,
break_on_hyphens=False,
subsequent_indent=subsequent_indent,
)
return wrapped or [""]
def _append_panel_line(lines, border_style: str, content_style: str, text: str, box_width: int) -> None:
inner_width = max(0, box_width - 2)
lines.append((border_style, "│ "))
lines.append((content_style, text.ljust(inner_width)))
lines.append((border_style, " │\n"))
def _append_blank_panel_line(lines, border_style: str, box_width: int) -> None:
lines.append((border_style, "│" + (" " * box_width) + "│\n"))
def _get_clarify_display():
"""Build styled text for the clarify question/choices panel."""
state = cli_ref._clarify_state
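The width arithmetic behind these helpers can be checked with a plain-string version of the panel. This is a sketch under assumptions: `render_panel` is a hypothetical name, and the border glyphs are assumed from the box-drawing characters visible in the removed lines of this diff. Every row comes out `box_width + 2` characters wide.

```python
import textwrap


def render_panel(title, body, box_width):
    """Render a bordered panel as a list of equal-width strings."""
    inner = max(0, box_width - 2)       # one space of padding on each side
    top = "╭─ " + title + " " + "─" * max(0, box_width - len(title) - 3) + "╮"
    blank = "│" + " " * box_width + "│"
    wrapped = textwrap.wrap(
        body,
        width=max(8, inner),
        break_long_words=False,
        break_on_hyphens=False,
    ) or [""]
    rows = [top, blank]
    rows += ["│ " + line.ljust(inner) + " │" for line in wrapped]
    rows += [blank, "╰" + "─" * box_width + "╯"]
    return rows


if __name__ == "__main__":
    print("\n".join(render_panel("Title", "Enter password below, or press Enter to skip", 30)))
```

Widths here are counted with `len()`, which counts code points rather than terminal cells, so titles containing wide glyphs such as emoji may still render a column off.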
@@ -3682,43 +3910,62 @@ class HermesCLI:
question = state["question"]
choices = state.get("choices") or []
selected = state.get("selected", 0)
preview_lines = _wrap_panel_text(question, 60)
for i, choice in enumerate(choices):
prefix = " " if i == selected and not cli_ref._clarify_freetext else " "
preview_lines.extend(_wrap_panel_text(f"{prefix}{choice}", 60, subsequent_indent=" "))
other_label = (
" Other (type below)" if cli_ref._clarify_freetext
else " Other (type your answer)" if selected == len(choices)
else " Other (type your answer)"
)
preview_lines.extend(_wrap_panel_text(other_label, 60, subsequent_indent=" "))
box_width = _panel_box_width("Hermes needs your input", preview_lines)
inner_text_width = max(8, box_width - 2)
lines = []
# Box top border
lines.append(('class:clarify-border', '╭─ '))
lines.append(('class:clarify-title', 'Hermes needs your input'))
lines.append(('class:clarify-border', ' ─────────────────────────────\n'))
lines.append(('class:clarify-border', '\n'))
lines.append(('class:clarify-border', ' ' + ('─' * max(0, box_width - len("Hermes needs your input") - 3)) + '╮\n'))
_append_blank_panel_line(lines, 'class:clarify-border', box_width)
# Question text
lines.append(('class:clarify-border', ''))
lines.append(('class:clarify-question', question))
lines.append(('', '\n'))
lines.append(('class:clarify-border', '\n'))
for wrapped in _wrap_panel_text(question, inner_text_width):
_append_panel_line(lines, 'class:clarify-border', 'class:clarify-question', wrapped, box_width)
_append_blank_panel_line(lines, 'class:clarify-border', box_width)
if cli_ref._clarify_freetext and not choices:
guidance = "Type your answer in the prompt below, then press Enter."
for wrapped in _wrap_panel_text(guidance, inner_text_width):
_append_panel_line(lines, 'class:clarify-border', 'class:clarify-choice', wrapped, box_width)
_append_blank_panel_line(lines, 'class:clarify-border', box_width)
if choices:
# Multiple-choice mode: show selectable options
for i, choice in enumerate(choices):
lines.append(('class:clarify-border', ''))
if i == selected and not cli_ref._clarify_freetext:
lines.append(('class:clarify-selected', f' {choice}'))
else:
lines.append(('class:clarify-choice', f' {choice}'))
lines.append(('', '\n'))
style = 'class:clarify-selected' if i == selected and not cli_ref._clarify_freetext else 'class:clarify-choice'
prefix = ' ' if i == selected and not cli_ref._clarify_freetext else ' '
wrapped_lines = _wrap_panel_text(f"{prefix}{choice}", inner_text_width, subsequent_indent=" ")
for wrapped in wrapped_lines:
_append_panel_line(lines, 'class:clarify-border', style, wrapped, box_width)
# "Other" option (5th line, only shown when choices exist)
other_idx = len(choices)
lines.append(('class:clarify-border', ''))
if selected == other_idx and not cli_ref._clarify_freetext:
lines.append(('class:clarify-selected', ' Other (type your answer)'))
other_style = 'class:clarify-selected'
other_label = ' Other (type your answer)'
elif cli_ref._clarify_freetext:
lines.append(('class:clarify-active-other', ' Other (type below)'))
other_style = 'class:clarify-active-other'
other_label = ' Other (type below)'
else:
lines.append(('class:clarify-choice', ' Other (type your answer)'))
lines.append(('', '\n'))
other_style = 'class:clarify-choice'
other_label = ' Other (type your answer)'
for wrapped in _wrap_panel_text(other_label, inner_text_width, subsequent_indent=" "):
_append_panel_line(lines, 'class:clarify-border', other_style, wrapped, box_width)
lines.append(('class:clarify-border', '\n'))
lines.append(('class:clarify-border', '──────────────────────────────────────────────────\n'))
_append_blank_panel_line(lines, 'class:clarify-border', box_width)
lines.append(('class:clarify-border', '╰' + ('─' * box_width) + '╯\n'))
return lines
clarify_widget = ConditionalContainer(
@@ -3735,16 +3982,18 @@ class HermesCLI:
state = cli_ref._sudo_state
if not state:
return []
title = '🔐 Sudo Password Required'
body = 'Enter password below (hidden), or press Enter to skip'
box_width = _panel_box_width(title, [body])
inner = max(0, box_width - 2)
lines = []
lines.append(('class:sudo-border', '╭─ '))
lines.append(('class:sudo-title', '🔐 Sudo Password Required'))
lines.append(('class:sudo-border', ' ──────────────────────────\n'))
lines.append(('class:sudo-border', '\n'))
lines.append(('class:sudo-border', ''))
lines.append(('class:sudo-text', 'Enter password below (hidden), or press Enter to skip'))
lines.append(('', '\n'))
lines.append(('class:sudo-border', '\n'))
lines.append(('class:sudo-border', '╰──────────────────────────────────────────────────╯\n'))
lines.append(('class:sudo-title', title))
lines.append(('class:sudo-border', ' ' + ('─' * max(0, box_width - len(title) - 3)) + '╮\n'))
_append_blank_panel_line(lines, 'class:sudo-border', box_width)
_append_panel_line(lines, 'class:sudo-border', 'class:sudo-text', body, box_width)
_append_blank_panel_line(lines, 'class:sudo-border', box_width)
lines.append(('class:sudo-border', '╰' + ('─' * box_width) + '╯\n'))
return lines
sudo_widget = ConditionalContainer(
@@ -3773,29 +4022,32 @@ class HermesCLI:
"always": "Add to permanent allowlist",
"deny": "Deny",
}
preview_lines = _wrap_panel_text(description, 60)
preview_lines.extend(_wrap_panel_text(cmd_display, 60))
for i, choice in enumerate(choices):
prefix = ' ' if i == selected else ' '
preview_lines.extend(_wrap_panel_text(f"{prefix}{choice_labels.get(choice, choice)}", 60, subsequent_indent=" "))
box_width = _panel_box_width("⚠️ Dangerous Command", preview_lines)
inner_text_width = max(8, box_width - 2)
lines = []
lines.append(('class:approval-border', '╭─ '))
lines.append(('class:approval-title', '⚠️ Dangerous Command'))
lines.append(('class:approval-border', ' ───────────────────────────────\n'))
lines.append(('class:approval-border', '\n'))
lines.append(('class:approval-border', ''))
lines.append(('class:approval-desc', description))
lines.append(('', '\n'))
lines.append(('class:approval-border', ''))
lines.append(('class:approval-cmd', cmd_display))
lines.append(('', '\n'))
lines.append(('class:approval-border', '\n'))
lines.append(('class:approval-border', ' ' + ('─' * max(0, box_width - len("⚠️ Dangerous Command") - 3)) + '╮\n'))
_append_blank_panel_line(lines, 'class:approval-border', box_width)
for wrapped in _wrap_panel_text(description, inner_text_width):
_append_panel_line(lines, 'class:approval-border', 'class:approval-desc', wrapped, box_width)
for wrapped in _wrap_panel_text(cmd_display, inner_text_width):
_append_panel_line(lines, 'class:approval-border', 'class:approval-cmd', wrapped, box_width)
_append_blank_panel_line(lines, 'class:approval-border', box_width)
for i, choice in enumerate(choices):
lines.append(('class:approval-border', ''))
label = choice_labels.get(choice, choice)
if i == selected:
lines.append(('class:approval-selected', f' {label}'))
else:
lines.append(('class:approval-choice', f' {label}'))
lines.append(('', '\n'))
lines.append(('class:approval-border', '\n'))
lines.append(('class:approval-border', '╰──────────────────────────────────────────────────────╯\n'))
style = 'class:approval-selected' if i == selected else 'class:approval-choice'
prefix = ' ' if i == selected else ' '
for wrapped in _wrap_panel_text(f"{prefix}{label}", inner_text_width, subsequent_indent=" "):
_append_panel_line(lines, 'class:approval-border', style, wrapped, box_width)
_append_blank_panel_line(lines, 'class:approval-border', box_width)
lines.append(('class:approval-border', '╰' + ('─' * box_width) + '╯\n'))
return lines
approval_widget = ConditionalContainer(
@@ -3848,6 +4100,7 @@ class HermesCLI:
sudo_widget,
approval_widget,
clarify_widget,
spinner_widget,
spacer,
input_rule_top,
image_bar,
@@ -3902,6 +4155,7 @@ class HermesCLI:
style=style,
full_screen=False,
mouse_support=False,
**({'cursor': _STEADY_CURSOR} if _STEADY_CURSOR is not None else {}),
)
self._app = app # Store reference for clarify_callback
@@ -3970,6 +4224,7 @@ class HermesCLI:
self.chat(user_input, images=submit_images or None)
finally:
self._agent_running = False
self._spinner_text = ""
app.invalidate() # Refresh status line
except Exception as e:
@@ -4030,6 +4285,7 @@ def main(
resume: str = None,
worktree: bool = False,
w: bool = False,
checkpoints: bool = False,
):
"""
Hermes Agent CLI - Interactive AI Assistant
@@ -4134,6 +4390,7 @@ def main(
verbose=verbose,
compact=compact,
resume=resume,
checkpoints=checkpoints,
)
# Inject worktree context into agent's system prompt
+1 -1
@@ -26,7 +26,7 @@ except ImportError:
# Configuration
# =============================================================================
HERMES_DIR = Path.home() / ".hermes"
HERMES_DIR = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
CRON_DIR = HERMES_DIR / "cron"
JOBS_FILE = CRON_DIR / "jobs.json"
OUTPUT_DIR = CRON_DIR / "output"
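The `HERMES_HOME` override in the hunk above can be illustrated in isolation. The following is a minimal sketch, not the project's code: `resolve_hermes_dir` and `cron_paths` are hypothetical helper names, and treating an empty `HERMES_HOME` as unset is an assumption of this sketch (plain `os.getenv` with a default would return the empty string instead).

```python
import os
from pathlib import Path


def resolve_hermes_dir(env=None) -> Path:
    """Return the Hermes home directory, honouring HERMES_HOME when set.

    `env` defaults to os.environ; passing a dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    override = env.get("HERMES_HOME")
    # Assumption of this sketch: an empty value falls back to ~/.hermes.
    return Path(override) if override else Path.home() / ".hermes"


def cron_paths(hermes_dir: Path) -> dict:
    """Derive the cron directory layout from the resolved home directory."""
    cron_dir = hermes_dir / "cron"
    return {
        "cron": cron_dir,
        "jobs": cron_dir / "jobs.json",
        "output": cron_dir / "output",
    }
```

Deriving every path from one resolved root keeps the override consistent across `CRON_DIR`, `JOBS_FILE`, and `OUTPUT_DIR`.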
+89
@@ -0,0 +1,89 @@
# ============================================================================
# Hermes Agent — Example Skin Template
# ============================================================================
#
# Copy this file to ~/.hermes/skins/<name>.yaml to create a custom skin.
# All fields are optional — missing values inherit from the default skin.
# Activate with: /skin <name> or display.skin: <name> in config.yaml
#
# See hermes_cli/skin_engine.py for the full schema reference.
# ============================================================================
# Required: unique skin name (used in /skin command and config)
name: example
description: An example custom skin — copy and modify this template
# ── Colors ──────────────────────────────────────────────────────────────────
# Hex color values for Rich markup. These control the CLI's visual palette.
colors:
# Banner panel (the startup welcome box)
banner_border: "#CD7F32" # Panel border
banner_title: "#FFD700" # Panel title text
banner_accent: "#FFBF00" # Section headers (Available Tools, Skills, etc.)
banner_dim: "#B8860B" # Dim/muted text (separators, model info)
banner_text: "#FFF8DC" # Body text (tool names, skill names)
# UI elements
ui_accent: "#FFBF00" # General accent color
ui_label: "#4dd0e1" # Labels
ui_ok: "#4caf50" # Success indicators
ui_error: "#ef5350" # Error indicators
ui_warn: "#ffa726" # Warning indicators
# Input area
prompt: "#FFF8DC" # Prompt text color
input_rule: "#CD7F32" # Horizontal rule around input
# Response box
response_border: "#FFD700" # Response box border (ANSI color)
# Session display
session_label: "#DAA520" # Session label
session_border: "#8B8682" # Session ID dim color
# ── Spinner ─────────────────────────────────────────────────────────────────
# Customize the animated spinner shown during API calls and tool execution.
spinner:
# Faces shown while waiting for the API response
waiting_faces:
- "(。◕‿◕。)"
- "(◕‿◕✿)"
- "٩(◕‿◕。)۶"
# Faces shown during extended thinking/reasoning
thinking_faces:
- "(。•́︿•̀。)"
- "(◔_◔)"
- "(¬‿¬)"
# Verbs used in spinner messages (e.g., "pondering your request...")
thinking_verbs:
- "pondering"
- "contemplating"
- "musing"
- "ruminating"
# Optional: left/right decorations around the spinner
# Each entry is a [left, right] pair. Omit entirely for no wings.
# wings:
# - ["⟪⚔", "⚔⟫"]
# - ["⟪▲", "▲⟫"]
# ── Branding ────────────────────────────────────────────────────────────────
# Text strings used throughout the CLI interface.
branding:
agent_name: "Hermes Agent" # Banner title, about display
welcome: "Welcome! Type your message or /help for commands."
goodbye: "Goodbye! ⚕" # Exit message
response_label: " ⚕ Hermes " # Response box header label
prompt_symbol: " " # Input prompt symbol
help_header: "(^_^)? Available Commands" # /help header text
# ── Tool Output ─────────────────────────────────────────────────────────────
# Character used as the prefix for tool output lines.
# Default is "┊" (thin dotted vertical line). Some alternatives:
# "╎" (light triple dash vertical)
# "▏" (left one-eighth block)
# "│" (box drawing light vertical)
# "┃" (box drawing heavy vertical)
tool_prefix: "┊"
@@ -29,6 +29,10 @@ env:
wandb_name: "terminal-bench-2"
ensure_scores_are_not_same: false
data_dir_to_save_evals: "environments/benchmarks/evals/terminal-bench-2"
# CRITICAL: Limit concurrent Modal sandbox creations to avoid deadlocks.
# Modal's blocking calls (App.lookup, etc.) deadlock when too many sandboxes
# are created simultaneously inside thread pool workers via asyncio.run().
max_concurrent_tasks: 8
openai:
base_url: "https://openrouter.ai/api/v1"
@@ -118,6 +118,15 @@ class TerminalBench2EvalConfig(HermesAgentEnvConfig):
"Tasks exceeding this are scored as FAIL. Default 30 minutes.",
)
# --- Concurrency control ---
max_concurrent_tasks: int = Field(
default=8,
description="Maximum number of tasks to run concurrently. "
"Limits concurrent Modal sandbox creations to avoid async/threading deadlocks. "
"Modal has internal limits and creating too many sandboxes simultaneously "
"causes blocking calls to deadlock inside the thread pool.",
)
# Tasks that cannot run properly on Modal and are excluded from scoring.
MODAL_INCOMPATIBLE_TASKS = {
@@ -430,7 +439,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
}
# --- 2. Register per-task Modal image override ---
register_task_env_overrides(task_id, {"modal_image": modal_image})
register_task_env_overrides(task_id, {"modal_image": modal_image, "cwd": "/app"})
logger.info(
"Task %s: registered image override for task_id %s",
task_name, task_id[:8],
@@ -733,12 +742,23 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
print(f" Tool thread pool: {self.config.tool_pool_size}")
print(f" Terminal timeout: {self.config.terminal_timeout}s/cmd")
print(f" Terminal lifetime: {self.config.terminal_lifetime}s (auto: task_timeout + 120)")
print(f" Max concurrent tasks: {self.config.max_concurrent_tasks}")
print(f"{'='*60}\n")
# Semaphore to limit concurrent Modal sandbox creations.
# Without this, all 86 tasks fire simultaneously, each creating a Modal
# sandbox via asyncio.run() inside a thread pool worker. Modal's blocking
# calls (App.lookup, etc.) deadlock when too many are created at once.
semaphore = asyncio.Semaphore(self.config.max_concurrent_tasks)
async def _eval_with_semaphore(item):
async with semaphore:
return await self._eval_with_timeout(item)
# Fire all tasks with wall-clock timeout, track live accuracy on the bar
total_tasks = len(self.all_eval_items)
eval_tasks = [
asyncio.ensure_future(self._eval_with_timeout(item))
asyncio.ensure_future(_eval_with_semaphore(item))
for item in self.all_eval_items
]
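The semaphore pattern introduced above can be sketched independently of Modal. This is an illustrative example only: `run_bounded`, `demo_bounded`, and the fake task are hypothetical names, and `asyncio.sleep` stands in for the real sandbox work.

```python
import asyncio


async def run_bounded(items, worker, limit):
    """Run worker(item) for every item with at most `limit` coroutines in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        # Tasks beyond the limit wait here until a slot is released.
        async with semaphore:
            return await worker(item)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(item) for item in items))


async def demo_bounded(n_items=20, limit=8):
    """Record the peak number of simultaneously running fake tasks."""
    in_flight = 0
    peak = 0

    async def fake_task(item):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stand-in for sandbox creation and evaluation
        in_flight -= 1
        return item * 2

    results = await run_bounded(range(n_items), fake_task, limit)
    return results, peak
```

All tasks are still scheduled up front, so progress tracking over the full task list keeps working; only the body of each task is gated.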
+107 -32
@@ -356,10 +356,19 @@ class WebResearchEnv(HermesAgentBaseEnv):
efficiency_weight * efficiency penalizes wasteful tool usage
+ diversity_bonus source diversity (2 distinct domains)
"""
final_response: str = result.final_response or ""
tools_used: list[str] = [
tc.tool_name for tc in (result.tool_calls or [])
] if hasattr(result, "tool_calls") and result.tool_calls else []
# Extract final response from messages (last assistant message with content)
final_response = ""
tools_used: list[str] = []
for msg in reversed(result.messages):
if msg.get("role") == "assistant" and msg.get("content") and not final_response:
final_response = msg["content"]
# Collect tool names from tool call messages
if msg.get("role") == "assistant" and msg.get("tool_calls"):
for tc in msg["tool_calls"]:
fn = tc.get("function", {}) if isinstance(tc, dict) else {}
name = fn.get("name", "")
if name:
tools_used.append(name)
tool_call_count: int = result.turns_used or len(tools_used)
cfg = self.config
@@ -416,8 +425,16 @@ class WebResearchEnv(HermesAgentBaseEnv):
# ------------------------------------------------------------------
async def evaluate(self, *args, **kwargs) -> None:
"""Run evaluation on the held-out split using the agent loop."""
"""Run evaluation on the held-out split using the full agent loop with tools.
Each eval item runs through the same agent loop as training:
the model can use web_search, web_extract, etc. to research answers.
This measures actual agentic research capability, not just knowledge.
"""
import time
import uuid
from environments.agent_loop import HermesAgentLoop
from environments.tool_context import ToolContext
items = self._eval_items
if not items:
@@ -427,43 +444,88 @@ class WebResearchEnv(HermesAgentBaseEnv):
eval_size = min(self.config.eval_size, len(items))
eval_items = items[:eval_size]
logger.info(f"Running eval on {len(eval_items)} questions...")
logger.info(f"Running eval on {len(eval_items)} questions (with agent loop + tools)...")
start_time = time.time()
samples = []
for item in eval_items:
# Resolve tools once for all eval items
tools, valid_names = self._resolve_tools_for_group()
for i, item in enumerate(eval_items):
task_id = str(uuid.uuid4())
logger.info(f"Eval [{i+1}/{len(eval_items)}]: {item['question'][:80]}...")
try:
# Use the base env's agent loop for eval (same as training)
prompt = self.format_prompt(item)
completion = await self.server.chat_completion(
messages=[
{"role": "system", "content": self.config.system_prompt or ""},
{"role": "user", "content": prompt},
],
n=1,
# Build messages
messages: List[Dict[str, Any]] = []
if self.config.system_prompt:
messages.append({"role": "system", "content": self.config.system_prompt})
messages.append({"role": "user", "content": self.format_prompt(item)})
# Run the full agent loop with tools
agent = HermesAgentLoop(
server=self.server,
tool_schemas=tools,
valid_tool_names=valid_names,
max_turns=self.config.max_agent_turns,
task_id=task_id,
temperature=0.0, # Deterministic for eval
max_tokens=self.config.max_token_length,
temperature=0.0,
split="eval",
extra_body=self.config.extra_body,
)
result = await agent.run(messages)
response_content = (
completion.choices[0].message.content if completion.choices else ""
)
# Extract final response and tool usage from messages
final_response = ""
tool_call_count = 0
for msg in reversed(result.messages):
if msg.get("role") == "assistant" and msg.get("content") and not final_response:
final_response = msg["content"]
if msg.get("role") == "assistant" and msg.get("tool_calls"):
tool_call_count += len(msg["tool_calls"])
# Score the response
correctness = await self._llm_judge(
question=item["question"],
expected=item["answer"],
model_answer=response_content,
# Compute reward (includes LLM judge for correctness)
# Temporarily save buffer lengths so we can extract the
# correctness score without calling judge twice, and avoid
# polluting training metric buffers with eval data.
buf_len = len(self._correctness_buffer)
ctx = ToolContext(task_id)
try:
reward = await self.compute_reward(item, result, ctx)
finally:
ctx.cleanup()
# Extract correctness from the buffer (compute_reward appended it)
# then remove eval entries from training buffers
correctness = (
self._correctness_buffer[buf_len]
if len(self._correctness_buffer) > buf_len
else 0.0
)
# Roll back buffers to avoid polluting training metrics
for buf in (
self._reward_buffer, self._correctness_buffer,
self._tool_usage_buffer, self._efficiency_buffer,
self._diversity_buffer,
):
if len(buf) > buf_len:
buf.pop()
samples.append({
"prompt": item["question"],
"response": response_content,
"response": final_response[:500],
"expected": item["answer"],
"correctness": correctness,
"reward": reward,
"tool_calls": tool_call_count,
"turns": result.turns_used,
})
logger.info(
f" → correctness={correctness:.2f}, reward={reward:.3f}, "
f"tools={tool_call_count}, turns={result.turns_used}"
)
except Exception as e:
logger.error(f"Eval error on item: {e}")
samples.append({
@@ -471,20 +533,33 @@ class WebResearchEnv(HermesAgentBaseEnv):
"response": f"ERROR: {e}",
"expected": item["answer"],
"correctness": 0.0,
"reward": 0.0,
"tool_calls": 0,
"turns": 0,
})
end_time = time.time()
# Compute metrics
# Compute aggregate metrics
correctness_scores = [s["correctness"] for s in samples]
rewards = [s["reward"] for s in samples]
tool_counts = [s["tool_calls"] for s in samples]
n = len(samples)
eval_metrics = {
"eval/mean_correctness": (
sum(correctness_scores) / len(correctness_scores)
if correctness_scores else 0.0
),
"eval/n_items": len(samples),
"eval/mean_correctness": sum(correctness_scores) / n if n else 0.0,
"eval/mean_reward": sum(rewards) / n if n else 0.0,
"eval/mean_tool_calls": sum(tool_counts) / n if n else 0.0,
"eval/tool_usage_rate": sum(1 for t in tool_counts if t > 0) / n if n else 0.0,
"eval/n_items": n,
}
logger.info(
f"Eval complete — correctness={eval_metrics['eval/mean_correctness']:.3f}, "
f"reward={eval_metrics['eval/mean_reward']:.3f}, "
f"tool_usage={eval_metrics['eval/tool_usage_rate']:.0%}"
)
await self.evaluate_log(
metrics=eval_metrics,
samples=samples,
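The message-walking logic used in the WebResearchEnv hunks above (last assistant message with content, plus tool names from `tool_calls`) can be sketched as a standalone helper. `summarize_messages` is a hypothetical name, and the message shape is assumed to follow the OpenAI-style dicts the diff reads from.

```python
from typing import Any, Dict, List, Tuple


def summarize_messages(messages: List[Dict[str, Any]]) -> Tuple[str, List[str]]:
    """Return (final assistant text, tool names) from a chat transcript.

    Walks the transcript newest-first, so tool names come back in reverse
    chronological order.
    """
    final_response = ""
    tools_used: List[str] = []
    for msg in reversed(messages):
        if msg.get("role") != "assistant":
            continue
        # The first assistant message with content seen in reverse is the final answer.
        if msg.get("content") and not final_response:
            final_response = msg["content"]
        for tc in msg.get("tool_calls") or []:
            fn = tc.get("function", {}) if isinstance(tc, dict) else {}
            name = fn.get("name", "")
            if name:
                tools_used.append(name)
    return final_response, tools_used
```

Because the walk is reversed, callers that care about call order would need to reverse `tools_used`; counting calls is unaffected.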
+3 -3
@@ -270,7 +270,7 @@ def load_gateway_config() -> GatewayConfig:
gateway_config_path = Path.home() / ".hermes" / "gateway.json"
if gateway_config_path.exists():
try:
with open(gateway_config_path, "r") as f:
with open(gateway_config_path, "r", encoding="utf-8") as f:
data = json.load(f)
config = GatewayConfig.from_dict(data)
except Exception as e:
@@ -283,7 +283,7 @@ def load_gateway_config() -> GatewayConfig:
import yaml
config_yaml_path = Path.home() / ".hermes" / "config.yaml"
if config_yaml_path.exists():
with open(config_yaml_path) as f:
with open(config_yaml_path, encoding="utf-8") as f:
yaml_cfg = yaml.safe_load(f) or {}
sr = yaml_cfg.get("session_reset")
if sr and isinstance(sr, dict):
@@ -441,5 +441,5 @@ def save_gateway_config(config: GatewayConfig) -> None:
gateway_config_path = Path.home() / ".hermes" / "gateway.json"
gateway_config_path.parent.mkdir(parents=True, exist_ok=True)
with open(gateway_config_path, "w") as f:
with open(gateway_config_path, "w", encoding="utf-8") as f:
json.dump(config.to_dict(), f, indent=2)
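The reason for passing `encoding="utf-8"` explicitly is that `open()` otherwise uses the platform's locale encoding, which is not UTF-8 everywhere. A minimal round-trip sketch follows; `save_config` and `load_config` are hypothetical names, and `ensure_ascii=False` is an addition of this sketch (the diff does not use it) so that non-ASCII text actually reaches the file.

```python
import json
import tempfile
from pathlib import Path


def save_config(path: Path, data: dict) -> None:
    """Write JSON with an explicit encoding so the result is locale-independent."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False is this sketch's choice; it keeps characters unescaped.
        json.dump(data, f, indent=2, ensure_ascii=False)


def load_config(path: Path) -> dict:
    """Read JSON back using the same explicit encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```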
+4
@@ -111,6 +111,7 @@ def _append_to_jsonl(session_id: str, message: dict) -> None:
def _append_to_sqlite(session_id: str, message: dict) -> None:
"""Append a message to the SQLite session database."""
db = None
try:
from hermes_state import SessionDB
db = SessionDB()
@@ -121,3 +122,6 @@ def _append_to_sqlite(session_id: str, message: dict) -> None:
)
except Exception as e:
logger.debug("Mirror SQLite write failed: %s", e)
finally:
if db is not None:
db.close()
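The `db = None` / `finally` pattern above guards against two failure points: the constructor raising (nothing to close) and the write raising (handle must still be closed). A small sketch under stated assumptions: `FakeSessionDB` and `mirror_write` are hypothetical stand-ins, not the project's `SessionDB` API.

```python
import logging

logger = logging.getLogger(__name__)


class FakeSessionDB:
    """Hypothetical stand-in for a database handle with a close() method."""

    def __init__(self, fail=False):
        self.fail = fail
        self.closed = False
        self.rows = []

    def append_message(self, session_id, message):
        if self.fail:
            raise RuntimeError("simulated write failure")
        self.rows.append((session_id, message))

    def close(self):
        self.closed = True


def mirror_write(db_factory, session_id, message):
    """Best-effort write that always closes the handle if one was opened."""
    db = None  # stays None if db_factory() itself raises
    try:
        db = db_factory()
        db.append_message(session_id, message)
    except Exception as e:
        logger.debug("Mirror write failed: %s", e)
    finally:
        if db is not None:
            db.close()
    return db
```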
+16 -6
@@ -413,11 +413,12 @@ class BasePlatformAdapter(ABC):
"""
return SendResult(success=False, error="Not supported")
async def send_typing(self, chat_id: str) -> None:
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""
Send a typing indicator.
Override in subclasses if the platform supports it.
metadata: optional dict with platform-specific context (e.g. thread_id for Slack).
"""
pass
@@ -620,7 +621,7 @@ class BasePlatformAdapter(ABC):
return media, cleaned
async def _keep_typing(self, chat_id: str, interval: float = 2.0) -> None:
async def _keep_typing(self, chat_id: str, interval: float = 2.0, metadata=None) -> None:
"""
Continuously send typing indicator until cancelled.
@@ -629,7 +630,7 @@ class BasePlatformAdapter(ABC):
"""
try:
while True:
await self.send_typing(chat_id)
await self.send_typing(chat_id, metadata=metadata)
await asyncio.sleep(interval)
except asyncio.CancelledError:
pass # Normal cancellation when handler completes
@@ -687,7 +688,8 @@ class BasePlatformAdapter(ABC):
self._active_sessions[session_key] = interrupt_event
# Start continuous typing indicator (refreshes every 2 seconds)
typing_task = asyncio.create_task(self._keep_typing(event.source.chat_id))
_thread_metadata = {"thread_id": event.source.thread_id} if event.source.thread_id else None
typing_task = asyncio.create_task(self._keep_typing(event.source.chat_id, metadata=_thread_metadata))
try:
# Call the handler (this can take a while with tool calls)
@@ -711,7 +713,8 @@ class BasePlatformAdapter(ABC):
result = await self.send(
chat_id=event.source.chat_id,
content=text_content,
reply_to=event.message_id
reply_to=event.message_id,
metadata=_thread_metadata,
)
# Log send failures (don't raise - user already saw tool progress)
@@ -721,7 +724,8 @@ class BasePlatformAdapter(ABC):
fallback_result = await self.send(
chat_id=event.source.chat_id,
content=f"(Response formatting failed, plain text:)\n\n{text_content[:3500]}",
reply_to=event.message_id
reply_to=event.message_id,
metadata=_thread_metadata,
)
if not fallback_result.success:
print(f"[{self.name}] Fallback send also failed: {fallback_result.error}")
@@ -743,12 +747,14 @@ class BasePlatformAdapter(ABC):
chat_id=event.source.chat_id,
animation_url=image_url,
caption=alt_text if alt_text else None,
metadata=_thread_metadata,
)
else:
img_result = await self.send_image(
chat_id=event.source.chat_id,
image_url=image_url,
caption=alt_text if alt_text else None,
metadata=_thread_metadata,
)
if not img_result.success:
logger.error("[%s] Failed to send image: %s", self.name, img_result.error)
@@ -769,21 +775,25 @@ class BasePlatformAdapter(ABC):
media_result = await self.send_voice(
chat_id=event.source.chat_id,
audio_path=media_path,
metadata=_thread_metadata,
)
elif ext in _VIDEO_EXTS:
media_result = await self.send_video(
chat_id=event.source.chat_id,
video_path=media_path,
metadata=_thread_metadata,
)
elif ext in _IMAGE_EXTS:
media_result = await self.send_image_file(
chat_id=event.source.chat_id,
image_path=media_path,
metadata=_thread_metadata,
)
else:
media_result = await self.send_document(
chat_id=event.source.chat_id,
file_path=media_path,
metadata=_thread_metadata,
)
if not media_result.success:
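The thread-aware typing loop from this file can be exercised without a real platform. The sketch below is illustrative: `RecordingAdapter` and `demo_typing` are hypothetical, and the adapter records calls instead of contacting a messaging service.

```python
import asyncio
from typing import Any, Dict, List, Optional, Tuple


class RecordingAdapter:
    """Hypothetical adapter that records typing calls instead of sending them."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Optional[Dict[str, Any]]]] = []

    async def send_typing(self, chat_id: str, metadata=None) -> None:
        self.calls.append((chat_id, metadata))

    async def keep_typing(self, chat_id: str, interval: float = 2.0, metadata=None) -> None:
        """Refresh the typing indicator until the task is cancelled."""
        try:
            while True:
                await self.send_typing(chat_id, metadata=metadata)
                await asyncio.sleep(interval)
        except asyncio.CancelledError:
            pass  # normal shutdown once the handler has finished


async def demo_typing(thread_id: Optional[str]):
    adapter = RecordingAdapter()
    # Only build metadata when the event actually belongs to a thread.
    metadata = {"thread_id": thread_id} if thread_id else None
    task = asyncio.create_task(
        adapter.keep_typing("chat-1", interval=0.01, metadata=metadata)
    )
    await asyncio.sleep(0.035)  # stand-in for the message handler's work
    task.cancel()
    await task  # keep_typing swallows the cancellation, so this returns normally
    return adapter.calls
```

Passing `metadata` as a keyword with a `None` default is what lets adapters without thread support ignore it.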
+1 -1
@@ -359,7 +359,7 @@ class DiscordAdapter(BasePlatformAdapter):
print(f"[{self.name}] Failed to send image attachment, falling back to URL: {e}")
return await super().send_image(chat_id, image_url, caption, reply_to)
async def send_typing(self, chat_id: str) -> None:
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Send typing indicator."""
if self._client:
try:
+1 -1
@@ -419,7 +419,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):
except Exception as e:
return SendResult(success=False, error=str(e))
async def send_typing(self, chat_id: str) -> None:
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""No typing indicator for Home Assistant."""
pass
+33 -22
@@ -104,6 +104,20 @@ def _is_audio_ext(ext: str) -> bool:
return ext.lower() in (".mp3", ".wav", ".ogg", ".m4a", ".aac")
_EXT_TO_MIME = {
".jpg": "image/jpeg", ".jpeg": "image/jpeg", ".png": "image/png",
".gif": "image/gif", ".webp": "image/webp",
".ogg": "audio/ogg", ".mp3": "audio/mpeg", ".wav": "audio/wav",
".m4a": "audio/mp4", ".aac": "audio/aac",
".mp4": "video/mp4", ".pdf": "application/pdf", ".zip": "application/zip",
}
def _ext_to_mime(ext: str) -> str:
"""Map file extension to MIME type."""
return _EXT_TO_MIME.get(ext.lower(), "application/octet-stream")
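How the extension fallback combines with the MIME-prefix message-type detection in this change can be sketched as follows. The `MessageType` enum here is a hypothetical stand-in for the gateway's own type, the table is a shortened copy, and `resolve_content_type` and `classify` are illustrative names.

```python
from enum import Enum
from typing import List, Optional


class MessageType(Enum):
    """Hypothetical stand-in for the gateway's MessageType."""
    TEXT = "text"
    VOICE = "voice"
    IMAGE = "image"


_EXT_TO_MIME = {
    ".jpg": "image/jpeg", ".png": "image/png",
    ".ogg": "audio/ogg", ".mp3": "audio/mpeg",
    ".pdf": "application/pdf",
}


def ext_to_mime(ext: str) -> str:
    """Map a file extension to a MIME type, with a generic binary fallback."""
    return _EXT_TO_MIME.get(ext.lower(), "application/octet-stream")


def resolve_content_type(reported: Optional[str], ext: str) -> str:
    """Prefer the platform-reported type; fall back to the extension table."""
    return reported or ext_to_mime(ext)


def classify(media_types: List[str]) -> MessageType:
    """Audio takes precedence over images; anything else stays TEXT."""
    if any(mt.startswith("audio/") for mt in media_types):
        return MessageType.VOICE
    if any(mt.startswith("image/") for mt in media_types):
        return MessageType.IMAGE
    return MessageType.TEXT
```

Under this scheme a document-only message keeps the text type while its path still travels in the media lists.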
def _render_mentions(text: str, mentions: list) -> str:
"""Replace Signal mention placeholders (\\uFFFC) with readable @identifiers.
@@ -404,9 +418,8 @@ class SignalAdapter(BasePlatformAdapter):
# Process attachments
attachments_data = data_message.get("attachments", [])
image_paths = []
audio_path = None
document_paths = []
media_urls = []
media_types = []
if attachments_data and not getattr(self, "ignore_attachments", False):
for att in attachments_data:
@@ -420,12 +433,10 @@ class SignalAdapter(BasePlatformAdapter):
try:
cached_path, ext = await self._fetch_attachment(att_id)
if cached_path:
if _is_image_ext(ext):
image_paths.append(cached_path)
elif _is_audio_ext(ext):
audio_path = cached_path
else:
document_paths.append(cached_path)
# Use contentType from Signal if available, else map from extension
content_type = att.get("contentType") or _ext_to_mime(ext)
media_urls.append(cached_path)
media_types.append(content_type)
except Exception:
logger.exception("Signal: failed to fetch attachment %s", att_id)
@@ -440,12 +451,13 @@ class SignalAdapter(BasePlatformAdapter):
chat_id_alt=group_id if is_group else None,
)
# Determine message type
# Determine message type from media
msg_type = MessageType.TEXT
if audio_path:
msg_type = MessageType.VOICE
elif image_paths:
msg_type = MessageType.IMAGE
if media_types:
if any(mt.startswith("audio/") for mt in media_types):
msg_type = MessageType.VOICE
elif any(mt.startswith("image/") for mt in media_types):
msg_type = MessageType.IMAGE
# Parse timestamp from envelope data (milliseconds since epoch)
ts_ms = envelope_data.get("timestamp", 0)
@@ -462,9 +474,8 @@ class SignalAdapter(BasePlatformAdapter):
source=source,
text=text or "",
message_type=msg_type,
image_paths=image_paths,
audio_path=audio_path,
document_paths=document_paths,
media_urls=media_urls,
media_types=media_types,
timestamp=timestamp,
)
@@ -546,16 +557,16 @@ class SignalAdapter(BasePlatformAdapter):
async def send(
self,
chat_id: str,
text: str,
reply_to_message_id: Optional[str] = None,
**kwargs,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a text message."""
await self._stop_typing_indicator(chat_id)
params: Dict[str, Any] = {
"account": self.account,
"message": text,
"message": content,
}
if chat_id.startswith("group:"):
@@ -569,7 +580,7 @@ class SignalAdapter(BasePlatformAdapter):
return SendResult(success=True)
return SendResult(success=False, error="RPC send failed")
async def send_typing(self, chat_id: str) -> None:
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Send a typing indicator."""
params: Dict[str, Any] = {
"account": self.account,
+1 -1
@@ -185,7 +185,7 @@ class SlackAdapter(BasePlatformAdapter):
except Exception as e:
return SendResult(success=False, error=str(e))
async def send_typing(self, chat_id: str) -> None:
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Slack doesn't have a direct typing indicator API for bots."""
pass
+18 -2
@@ -86,6 +86,9 @@ def _strip_mdv2(text: str) -> str:
cleaned = re.sub(r'\\([_*\[\]()~`>#\+\-=|{}.!\\])', r'\1', text)
# Remove MarkdownV2 bold markers that format_message converted from **bold**
cleaned = re.sub(r'\*([^*]+)\*', r'\1', cleaned)
# Remove MarkdownV2 italic markers that format_message converted from *italic*
# Lookarounds (?<!\w) and (?!\w) require that the underscores are not inside a word,
# which avoids breaking snake_case like my_variable_name
cleaned = re.sub(r'(?<!\w)_([^_]+)_(?!\w)', r'\1', cleaned)
return cleaned
@@ -286,6 +289,7 @@ class TelegramAdapter(BasePlatformAdapter):
audio_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send audio as a native Telegram voice message or audio file."""
if not self._bot:
@@ -299,19 +303,23 @@ class TelegramAdapter(BasePlatformAdapter):
with open(audio_path, "rb") as audio_file:
# .ogg files -> send as voice (round playable bubble)
if audio_path.endswith(".ogg") or audio_path.endswith(".opus"):
_voice_thread = metadata.get("thread_id") if metadata else None
msg = await self._bot.send_voice(
chat_id=int(chat_id),
voice=audio_file,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=int(_voice_thread) if _voice_thread else None,
)
else:
# .mp3 and others -> send as audio file
_audio_thread = metadata.get("thread_id") if metadata else None
msg = await self._bot.send_audio(
chat_id=int(chat_id),
audio=audio_file,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=int(_audio_thread) if _audio_thread else None,
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
@@ -352,6 +360,7 @@ class TelegramAdapter(BasePlatformAdapter):
image_url: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an image natively as a Telegram photo.
@@ -363,11 +372,13 @@ class TelegramAdapter(BasePlatformAdapter):
try:
# Telegram can send photos directly from URLs (up to ~5MB)
_photo_thread = metadata.get("thread_id") if metadata else None
msg = await self._bot.send_photo(
chat_id=int(chat_id),
photo=image_url,
caption=caption[:1024] if caption else None, # Telegram caption limit
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=int(_photo_thread) if _photo_thread else None,
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
@@ -398,17 +409,20 @@ class TelegramAdapter(BasePlatformAdapter):
animation_url: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an animated GIF natively as a Telegram animation (auto-plays inline)."""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
_anim_thread = metadata.get("thread_id") if metadata else None
msg = await self._bot.send_animation(
chat_id=int(chat_id),
animation=animation_url,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=int(_anim_thread) if _anim_thread else None,
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
@@ -416,13 +430,15 @@ class TelegramAdapter(BasePlatformAdapter):
# Fallback: try as a regular photo
return await self.send_image(chat_id, animation_url, caption, reply_to)
async def send_typing(self, chat_id: str) -> None:
async def send_typing(self, chat_id: str, metadata: Optional[Dict[str, Any]] = None) -> None:
"""Send typing indicator."""
if self._bot:
try:
_typing_thread = metadata.get("thread_id") if metadata else None
await self._bot.send_chat_action(
chat_id=int(chat_id),
action="typing"
action="typing",
message_thread_id=int(_typing_thread) if _typing_thread else None,
)
except Exception:
pass # Ignore typing indicator failures
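The italic-stripping regex added to `_strip_mdv2` in this file can be checked in isolation. The wrapper name `strip_italic_markers` is hypothetical; the pattern is the one from the diff.

```python
import re

# An underscore pair counts as italic markup only when neither underscore
# touches a word character on its outer side. "_" itself is a \w character,
# so dunder names are left alone as well.
_ITALIC_RE = re.compile(r'(?<!\w)_([^_]+)_(?!\w)')


def strip_italic_markers(text: str) -> str:
    """Remove _italic_ markers while leaving snake_case identifiers intact."""
    return _ITALIC_RE.sub(r'\1', text)
```

For example, `strip_italic_markers("use my_variable_name and _italic_ text")` leaves the identifier untouched and unwraps only `_italic_`.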
+1 -1
View File
@@ -493,7 +493,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
file_name or os.path.basename(file_path),
)
async def send_typing(self, chat_id: str) -> None:
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Send typing indicator via bridge."""
if not self._running:
return
+169 -42
@@ -48,7 +48,7 @@ _config_path = _hermes_home / 'config.yaml'
if _config_path.exists():
try:
import yaml as _yaml
with open(_config_path) as _f:
with open(_config_path, encoding="utf-8") as _f:
_cfg = _yaml.safe_load(_f) or {}
# Top-level simple values (fallback only — don't override .env)
for _key, _val in _cfg.items():
@@ -316,7 +316,7 @@ class GatewayRunner:
import yaml as _y
cfg_path = _hermes_home / "config.yaml"
if cfg_path.exists():
with open(cfg_path) as _f:
with open(cfg_path, encoding="utf-8") as _f:
cfg = _y.safe_load(_f) or {}
file_path = cfg.get("prefill_messages_file", "")
except Exception:
@@ -354,7 +354,7 @@ class GatewayRunner:
import yaml as _y
cfg_path = _hermes_home / "config.yaml"
if cfg_path.exists():
with open(cfg_path) as _f:
with open(cfg_path, encoding="utf-8") as _f:
cfg = _y.safe_load(_f) or {}
return (cfg.get("agent", {}).get("system_prompt", "") or "").strip()
except Exception:
@@ -375,7 +375,7 @@ class GatewayRunner:
import yaml as _y
cfg_path = _hermes_home / "config.yaml"
if cfg_path.exists():
with open(cfg_path) as _f:
with open(cfg_path, encoding="utf-8") as _f:
cfg = _y.safe_load(_f) or {}
effort = str(cfg.get("agent", {}).get("reasoning_effort", "") or "").strip()
except Exception:
@@ -391,6 +391,41 @@ class GatewayRunner:
logger.warning("Unknown reasoning_effort '%s', using default (medium)", effort)
return None
@staticmethod
def _load_background_notifications_mode() -> str:
"""Load background process notification mode from config or env var.
Modes:
- ``all``: push running-output updates *and* the final message (default)
- ``result``: only the final completion message (regardless of exit code)
- ``error``: only the final message when exit code is non-zero
- ``off``: no watcher messages at all
"""
mode = os.getenv("HERMES_BACKGROUND_NOTIFICATIONS", "")
if not mode:
try:
import yaml as _y
cfg_path = _hermes_home / "config.yaml"
if cfg_path.exists():
with open(cfg_path, encoding="utf-8") as _f:
cfg = _y.safe_load(_f) or {}
raw = cfg.get("display", {}).get("background_process_notifications")
if raw is False:
mode = "off"
elif raw not in (None, ""):
mode = str(raw)
except Exception:
pass
mode = (mode or "all").strip().lower()
valid = {"all", "result", "error", "off"}
if mode not in valid:
logger.warning(
"Unknown background_process_notifications '%s', defaulting to 'all'",
mode,
)
return "all"
return mode
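The precedence and normalization rules of `_load_background_notifications_mode` can be sketched as a pure function. `normalize_mode` is a hypothetical name and takes the env and config values as arguments rather than reading them. The `is False` check matters because PyYAML parses a bare `off` as the boolean `False`.

```python
VALID_MODES = {"all", "result", "error", "off"}


def normalize_mode(env_value: str = "", config_value=None) -> str:
    """Resolve the notification mode: env var first, then config, then 'all'."""
    mode = env_value
    if not mode:
        if config_value is False:
            # A bare `off` in YAML arrives as boolean False.
            mode = "off"
        elif config_value not in (None, ""):
            mode = str(config_value)
    mode = (mode or "all").strip().lower()
    # Unknown values fall back to the default instead of raising.
    return mode if mode in VALID_MODES else "all"
```

Note that a boolean `True` from the config becomes the string `"true"`, which is not a valid mode and therefore resolves to `"all"`.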
@staticmethod
def _load_provider_routing() -> dict:
"""Load OpenRouter provider routing preferences from config.yaml."""
@@ -398,7 +433,7 @@ class GatewayRunner:
import yaml as _y
cfg_path = _hermes_home / "config.yaml"
if cfg_path.exists():
with open(cfg_path) as _f:
with open(cfg_path, encoding="utf-8") as _f:
cfg = _y.safe_load(_f) or {}
return cfg.get("provider_routing", {}) or {}
except Exception:
@@ -416,7 +451,7 @@ class GatewayRunner:
import yaml as _y
cfg_path = _hermes_home / "config.yaml"
if cfg_path.exists():
with open(cfg_path) as _f:
with open(cfg_path, encoding="utf-8") as _f:
cfg = _y.safe_load(_f) or {}
fb = cfg.get("fallback_model", {}) or {}
if fb.get("provider") and fb.get("model"):
@@ -771,7 +806,7 @@ class GatewayRunner:
_known_commands = {"new", "reset", "help", "status", "stop", "model",
"personality", "retry", "undo", "sethome", "set-home",
"compress", "usage", "insights", "reload-mcp", "reload_mcp",
"update", "title", "resume", "provider"}
"update", "title", "resume", "provider", "rollback"}
if command and command in _known_commands:
await self.hooks.emit(f"command:{command}", {
"platform": source.platform.value if source.platform else "",
@@ -830,6 +865,9 @@ class GatewayRunner:
if command == "resume":
return await self._handle_resume_command(event)
if command == "rollback":
return await self._handle_rollback_command(event)
# Skill slash commands: /skill-name loads the skill and sends to agent
if command:
@@ -931,7 +969,7 @@ class GatewayRunner:
_hyg_cfg_path = _hermes_home / "config.yaml"
if _hyg_cfg_path.exists():
import yaml as _hyg_yaml
with open(_hyg_cfg_path) as _hyg_f:
with open(_hyg_cfg_path, encoding="utf-8") as _hyg_f:
_hyg_data = _hyg_yaml.safe_load(_hyg_f) or {}
# Resolve model name (same logic as run_sync)
@@ -1400,6 +1438,7 @@ class GatewayRunner:
"`/resume [name]` — Resume a previously-named session",
"`/usage` — Show token usage for this session",
"`/insights [days]` — Show usage insights and analytics",
"`/rollback [number]` — List or restore filesystem checkpoints",
"`/reload-mcp` — Reload MCP servers from config",
"`/update` — Update Hermes Agent to the latest version",
"`/help` — Show this message",
@@ -1434,7 +1473,7 @@ class GatewayRunner:
current_provider = "openrouter"
try:
if config_path.exists():
with open(config_path) as f:
with open(config_path, encoding="utf-8") as f:
cfg = yaml.safe_load(f) or {}
model_cfg = cfg.get("model", {})
if isinstance(model_cfg, str):
@@ -1525,14 +1564,14 @@ class GatewayRunner:
try:
user_config = {}
if config_path.exists():
with open(config_path) as f:
with open(config_path, encoding="utf-8") as f:
user_config = yaml.safe_load(f) or {}
if "model" not in user_config or not isinstance(user_config["model"], dict):
user_config["model"] = {}
user_config["model"]["default"] = new_model
if provider_changed:
user_config["model"]["provider"] = target_provider
with open(config_path, 'w') as f:
with open(config_path, 'w', encoding="utf-8") as f:
yaml.dump(user_config, f, default_flow_style=False, sort_keys=False)
except Exception as e:
return f"⚠️ Failed to save model change: {e}"
@@ -1569,7 +1608,7 @@ class GatewayRunner:
config_path = _hermes_home / 'config.yaml'
try:
if config_path.exists():
with open(config_path) as f:
with open(config_path, encoding="utf-8") as f:
cfg = yaml.safe_load(f) or {}
model_cfg = cfg.get("model", {})
if isinstance(model_cfg, dict):
@@ -1618,7 +1657,7 @@ class GatewayRunner:
try:
if config_path.exists():
with open(config_path, 'r') as f:
with open(config_path, 'r', encoding="utf-8") as f:
config = yaml.safe_load(f) or {}
personalities = config.get("agent", {}).get("personalities", {})
else:
@@ -1647,7 +1686,7 @@ class GatewayRunner:
if "agent" not in config or not isinstance(config.get("agent"), dict):
config["agent"] = {}
config["agent"]["system_prompt"] = new_prompt
with open(config_path, 'w') as f:
with open(config_path, 'w', encoding="utf-8") as f:
yaml.dump(config, f, default_flow_style=False, sort_keys=False)
except Exception as e:
return f"⚠️ Failed to save personality change: {e}"
@@ -1731,10 +1770,10 @@ class GatewayRunner:
config_path = _hermes_home / 'config.yaml'
user_config = {}
if config_path.exists():
with open(config_path) as f:
with open(config_path, encoding="utf-8") as f:
user_config = yaml.safe_load(f) or {}
user_config[env_key] = chat_id
with open(config_path, 'w') as f:
with open(config_path, 'w', encoding="utf-8") as f:
yaml.dump(user_config, f, default_flow_style=False)
# Also set in the current environment so it takes effect immediately
os.environ[env_key] = str(chat_id)
@@ -1746,6 +1785,65 @@ class GatewayRunner:
f"Cron jobs and cross-platform messages will be delivered here."
)
async def _handle_rollback_command(self, event: MessageEvent) -> str:
"""Handle /rollback command — list or restore filesystem checkpoints."""
from tools.checkpoint_manager import CheckpointManager, format_checkpoint_list
# Read checkpoint config from config.yaml
cp_cfg = {}
try:
import yaml as _y
_cfg_path = _hermes_home / "config.yaml"
if _cfg_path.exists():
with open(_cfg_path, encoding="utf-8") as _f:
_data = _y.safe_load(_f) or {}
cp_cfg = _data.get("checkpoints", {})
if isinstance(cp_cfg, bool):
cp_cfg = {"enabled": cp_cfg}
except Exception:
pass
if not cp_cfg.get("enabled", False):
return (
"Checkpoints are not enabled.\n"
"Enable in config.yaml:\n```\ncheckpoints:\n enabled: true\n```"
)
mgr = CheckpointManager(
enabled=True,
max_snapshots=cp_cfg.get("max_snapshots", 50),
)
cwd = os.getenv("MESSAGING_CWD", str(Path.home()))
arg = event.get_command_args().strip()
if not arg:
checkpoints = mgr.list_checkpoints(cwd)
return format_checkpoint_list(checkpoints, cwd)
# Restore by number or hash
checkpoints = mgr.list_checkpoints(cwd)
if not checkpoints:
return f"No checkpoints found for {cwd}"
target_hash = None
try:
idx = int(arg) - 1
if 0 <= idx < len(checkpoints):
target_hash = checkpoints[idx]["hash"]
else:
return f"Invalid checkpoint number. Use 1-{len(checkpoints)}."
except ValueError:
target_hash = arg
result = mgr.restore(cwd, target_hash)
if result["success"]:
return (
f"✅ Restored to checkpoint {result['restored_to']}: {result['reason']}\n"
f"A pre-rollback snapshot was saved automatically."
)
return f"{result['error']}"
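The argument handling in `_handle_rollback_command` (a 1-based number, otherwise a hash) can be sketched as a pure function. This is an illustration only: `resolve_checkpoint_arg` is a hypothetical name, and the checkpoint entries are assumed to be dicts with a `"hash"` key, as the code above indexes them.

```python
from typing import Dict, List, Optional, Tuple


def resolve_checkpoint_arg(
    arg: str, checkpoints: List[Dict[str, str]]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (target_hash, error_message); exactly one of the two is None."""
    try:
        idx = int(arg) - 1  # numbers shown to the user are 1-based
    except ValueError:
        # Not an integer: treat the argument as a checkpoint hash.
        return arg, None
    if 0 <= idx < len(checkpoints):
        return checkpoints[idx]["hash"], None
    return None, f"Invalid checkpoint number. Use 1-{len(checkpoints)}."
```

One consequence of trying `int()` first is that a hash consisting only of digits would be interpreted as a list position rather than a hash.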
async def _handle_compress_command(self, event: MessageEvent) -> str:
"""Handle /compress command -- manually compress conversation context."""
source = event.source
@@ -2307,6 +2405,12 @@ class GatewayRunner:
Runs as an asyncio task. Stays silent when nothing changed.
Auto-removes when the process exits or is killed.
Notification mode (from ``display.background_process_notifications``):
- ``all`` running-output updates + final message
- ``result`` final completion message only
- ``error`` final message only when exit code != 0
- ``off`` no messages at all
"""
from tools.process_registry import process_registry
@@ -2315,8 +2419,21 @@ class GatewayRunner:
session_key = watcher.get("session_key", "")
platform_name = watcher.get("platform", "")
chat_id = watcher.get("chat_id", "")
notify_mode = self._load_background_notifications_mode()
logger.debug("Process watcher started: %s (every %ss)", session_id, interval)
logger.debug("Process watcher started: %s (every %ss, notify=%s)",
session_id, interval, notify_mode)
if notify_mode == "off":
# Still wait for the process to exit so we can log it, but don't
# push any messages to the user.
while True:
await asyncio.sleep(interval)
session = process_registry.get(session_id)
if session is None or session.exited:
break
logger.debug("Process watcher ended (silent): %s", session_id)
return
last_output_len = 0
while True:
@@ -2331,27 +2448,31 @@ class GatewayRunner:
last_output_len = current_output_len
if session.exited:
# Process finished -- deliver final update
new_output = session.output_buffer[-1000:] if session.output_buffer else ""
message_text = (
f"[Background process {session_id} finished with exit code {session.exit_code}~ "
f"Here's the final output:\n{new_output}]"
# Decide whether to notify based on mode
should_notify = (
notify_mode in ("all", "result")
or (notify_mode == "error" and session.exit_code not in (0, None))
)
# Try to deliver to the originating platform
adapter = None
for p, a in self.adapters.items():
if p.value == platform_name:
adapter = a
break
if adapter and chat_id:
try:
await adapter.send(chat_id, message_text)
except Exception as e:
logger.error("Watcher delivery error: %s", e)
if should_notify:
new_output = session.output_buffer[-1000:] if session.output_buffer else ""
message_text = (
f"[Background process {session_id} finished with exit code {session.exit_code}~ "
f"Here's the final output:\n{new_output}]"
)
adapter = None
for p, a in self.adapters.items():
if p.value == platform_name:
adapter = a
break
if adapter and chat_id:
try:
await adapter.send(chat_id, message_text)
except Exception as e:
logger.error("Watcher delivery error: %s", e)
break
elif has_new_output:
# New output available -- deliver status update
elif has_new_output and notify_mode == "all":
# New output available -- deliver status update (only in "all" mode)
new_output = session.output_buffer[-500:] if session.output_buffer else ""
message_text = (
f"[Background process {session_id} is still running~ "
@@ -2402,6 +2523,8 @@ class GatewayRunner:
Platform.DISCORD: "hermes-discord",
Platform.WHATSAPP: "hermes-whatsapp",
Platform.SLACK: "hermes-slack",
Platform.SIGNAL: "hermes-signal",
Platform.HOMEASSISTANT: "hermes-homeassistant",
}
# Try to load platform_toolsets from config
@@ -2410,7 +2533,7 @@ class GatewayRunner:
config_path = _hermes_home / 'config.yaml'
if config_path.exists():
import yaml
with open(config_path, 'r') as f:
with open(config_path, 'r', encoding="utf-8") as f:
user_config = yaml.safe_load(f) or {}
platform_toolsets_config = user_config.get("platform_toolsets", {})
except Exception as e:
@@ -2423,6 +2546,8 @@ class GatewayRunner:
Platform.DISCORD: "discord",
Platform.WHATSAPP: "whatsapp",
Platform.SLACK: "slack",
Platform.SIGNAL: "signal",
Platform.HOMEASSISTANT: "homeassistant",
}.get(source.platform, "telegram")
# Use config override if present (list of toolsets), otherwise hardcoded default
@@ -2440,7 +2565,7 @@ class GatewayRunner:
_tp_cfg_path = _hermes_home / "config.yaml"
if _tp_cfg_path.exists():
import yaml as _tp_yaml
with open(_tp_cfg_path) as _tp_f:
with open(_tp_cfg_path, encoding="utf-8") as _tp_f:
_tp_data = _tp_yaml.safe_load(_tp_f) or {}
_progress_cfg = _tp_data.get("display", {})
except Exception:
@@ -2531,6 +2656,8 @@ class GatewayRunner:
# Background task to send progress messages
# Accumulates tool lines into a single message that gets edited
_progress_metadata = {"thread_id": source.thread_id} if source.thread_id else None
async def send_progress_messages():
if not progress_queue:
return
@@ -2560,15 +2687,15 @@ class GatewayRunner:
# Platform doesn't support editing — stop trying,
# send just this new line as a separate message
can_edit = False
await adapter.send(chat_id=source.chat_id, content=msg)
await adapter.send(chat_id=source.chat_id, content=msg, metadata=_progress_metadata)
else:
if can_edit:
# First tool: send all accumulated text as new message
full_text = "\n".join(progress_lines)
result = await adapter.send(chat_id=source.chat_id, content=full_text)
result = await adapter.send(chat_id=source.chat_id, content=full_text, metadata=_progress_metadata)
else:
# Editing unsupported: send just this line
result = await adapter.send(chat_id=source.chat_id, content=msg)
result = await adapter.send(chat_id=source.chat_id, content=msg, metadata=_progress_metadata)
if result.success and result.message_id:
progress_msg_id = result.message_id
@@ -2658,7 +2785,7 @@ class GatewayRunner:
import yaml as _y
_cfg_path = _hermes_home / "config.yaml"
if _cfg_path.exists():
with open(_cfg_path) as _f:
with open(_cfg_path, encoding="utf-8") as _f:
_cfg = _y.safe_load(_f) or {}
_model_cfg = _cfg.get("model", {})
if isinstance(_model_cfg, str):
@@ -3140,7 +3267,7 @@ def main():
config = None
if args.config:
import json
with open(args.config) as f:
with open(args.config, encoding="utf-8") as f:
data = json.load(f)
config = GatewayConfig.from_dict(data)
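The notification modes documented in the process watcher docstring earlier in this file's diff reduce to two small predicates. The sketch below restates them outside the watcher loop; the function names are hypothetical, and the `off` mode is included only for completeness since the watcher returns early in that case.

```python
from typing import Optional


def should_notify_on_exit(notify_mode: str, exit_code: Optional[int]) -> bool:
    """Final message: always for all/result, only on failure for error."""
    return notify_mode in ("all", "result") or (
        notify_mode == "error" and exit_code not in (0, None)
    )


def should_send_progress(notify_mode: str, has_new_output: bool) -> bool:
    """Running-output updates are sent only in 'all' mode."""
    return has_new_output and notify_mode == "all"
```

Note that an exit code of `None` is treated like success in `error` mode, matching the `not in (0, None)` check in the diff.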
+19 -5
@@ -272,8 +272,8 @@ class SessionEntry:
if data.get("platform"):
try:
platform = Platform(data["platform"])
except ValueError:
pass
except ValueError as e:
logger.debug("Unknown platform value %r: %s", data["platform"], e)
return cls(
session_key=data["session_key"],
@@ -353,12 +353,26 @@ class SessionStore:
def _save(self) -> None:
"""Save sessions index to disk (kept for session key -> ID mapping)."""
import tempfile
self.sessions_dir.mkdir(parents=True, exist_ok=True)
sessions_file = self.sessions_dir / "sessions.json"
data = {key: entry.to_dict() for key, entry in self._entries.items()}
with open(sessions_file, "w", encoding="utf-8") as f:
json.dump(data, f, indent=2)
fd, tmp_path = tempfile.mkstemp(
dir=str(self.sessions_dir), suffix=".tmp", prefix=".sessions_"
)
try:
with os.fdopen(fd, "w", encoding="utf-8") as f:
json.dump(data, f, indent=2)
f.flush()
os.fsync(f.fileno())
os.replace(tmp_path, sessions_file)
except BaseException:
try:
os.unlink(tmp_path)
except OSError as e:
logger.debug("Could not remove temp file %s: %s", tmp_path, e)
raise
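The new `_save` follows the usual write-to-temp-then-rename pattern. A standalone sketch of the same pattern is given below, assuming a generic JSON payload; `atomic_json_write` is a hypothetical name, not a function in this repository.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def atomic_json_write(path: Path, data: Any) -> None:
    """Write JSON so readers never observe a partially written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live in the same directory as the target so that
    # os.replace() is a rename on one filesystem rather than a copy.
    fd, tmp_path = tempfile.mkstemp(
        dir=str(path.parent), suffix=".tmp", prefix=".tmp_"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file on any failure, then re-raise.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

If the process dies mid-write, the old `sessions.json` is left intact and only a stray temp file remains.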
def _generate_session_key(self, source: SessionSource) -> str:
"""Generate a session key from a source."""
+54 -7
@@ -23,6 +23,7 @@ import stat
import base64
import hashlib
import subprocess
import threading
import time
import uuid
import webbrowser
@@ -44,6 +45,10 @@ try:
import fcntl
except Exception:
fcntl = None
try:
import msvcrt
except Exception:
msvcrt = None
# =============================================================================
# Constants
@@ -103,6 +108,14 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
auth_type="oauth_external",
inference_base_url=DEFAULT_CODEX_BASE_URL,
),
"nous-api": ProviderConfig(
id="nous-api",
name="Nous Portal (API Key)",
auth_type="api_key",
inference_base_url="https://inference-api.nousresearch.com/v1",
api_key_env_vars=("NOUS_API_KEY",),
base_url_env_var="NOUS_BASE_URL",
),
"zai": ProviderConfig(
id="zai",
name="Z.AI / GLM",
@@ -299,31 +312,64 @@ def _auth_lock_path() -> Path:
return _auth_file_path().with_suffix(".lock")
_auth_lock_holder = threading.local()
@contextmanager
def _auth_store_lock(timeout_seconds: float = AUTH_LOCK_TIMEOUT_SECONDS):
"""Cross-process advisory lock for auth.json reads+writes."""
"""Cross-process advisory lock for auth.json reads+writes. Reentrant."""
# Reentrant: if this thread already holds the lock, just yield.
if getattr(_auth_lock_holder, "depth", 0) > 0:
_auth_lock_holder.depth += 1
try:
yield
finally:
_auth_lock_holder.depth -= 1
return
lock_path = _auth_lock_path()
lock_path.parent.mkdir(parents=True, exist_ok=True)
with lock_path.open("a+") as lock_file:
if fcntl is None:
if fcntl is None and msvcrt is None:
_auth_lock_holder.depth = 1
try:
yield
return
finally:
_auth_lock_holder.depth = 0
return
# On Windows, msvcrt.locking needs the file to have content and the
# file pointer at position 0. Ensure the lock file has at least 1 byte.
if msvcrt and (not lock_path.exists() or lock_path.stat().st_size == 0):
lock_path.write_text(" ", encoding="utf-8")
with lock_path.open("r+" if msvcrt else "a+") as lock_file:
deadline = time.time() + max(1.0, timeout_seconds)
while True:
try:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
if fcntl:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
else:
lock_file.seek(0)
msvcrt.locking(lock_file.fileno(), msvcrt.LK_NBLCK, 1)
break
except BlockingIOError:
except (BlockingIOError, OSError, PermissionError):
if time.time() >= deadline:
raise TimeoutError("Timed out waiting for auth store lock")
time.sleep(0.05)
_auth_lock_holder.depth = 1
try:
yield
finally:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)
_auth_lock_holder.depth = 0
if fcntl:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)
elif msvcrt:
try:
lock_file.seek(0)
msvcrt.locking(lock_file.fileno(), msvcrt.LK_UNLCK, 1)
except (OSError, IOError):
pass
def _load_auth_store(auth_file: Optional[Path] = None) -> Dict[str, Any]:
@@ -475,6 +521,7 @@ def resolve_provider(
# Normalize provider aliases
_PROVIDER_ALIASES = {
"nous_api": "nous-api", "nousapi": "nous-api", "nous-portal-api": "nous-api",
"glm": "zai", "z-ai": "zai", "z.ai": "zai", "zhipu": "zai",
"kimi": "kimi-coding", "moonshot": "kimi-coding",
"minimax-china": "minimax-cn", "minimax_cn": "minimax-cn",
+44 -13
@@ -36,6 +36,28 @@ def cprint(text: str):
_pt_print(_PT_ANSI(text))
# =========================================================================
# Skin-aware color helpers
# =========================================================================
def _skin_color(key: str, fallback: str) -> str:
"""Get a color from the active skin, or return fallback."""
try:
from hermes_cli.skin_engine import get_active_skin
return get_active_skin().get_color(key, fallback)
except Exception:
return fallback
def _skin_branding(key: str, fallback: str) -> str:
"""Get a branding string from the active skin, or return fallback."""
try:
from hermes_cli.skin_engine import get_active_skin
return get_active_skin().get_branding(key, fallback)
except Exception:
return fallback
# =========================================================================
# ASCII Art & Branding
# =========================================================================
@@ -217,18 +239,24 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
layout_table.add_column("left", justify="center")
layout_table.add_column("right", justify="left")
# Resolve skin colors once for the entire banner
accent = _skin_color("banner_accent", "#FFBF00")
dim = _skin_color("banner_dim", "#B8860B")
text = _skin_color("banner_text", "#FFF8DC")
session_color = _skin_color("session_border", "#8B8682")
left_lines = ["", HERMES_CADUCEUS, ""]
model_short = model.split("/")[-1] if "/" in model else model
if len(model_short) > 28:
model_short = model_short[:25] + "..."
ctx_str = f" [dim #B8860B]·[/] [dim #B8860B]{_format_context_length(context_length)} context[/]" if context_length else ""
left_lines.append(f"[#FFBF00]{model_short}[/]{ctx_str} [dim #B8860B]·[/] [dim #B8860B]Nous Research[/]")
left_lines.append(f"[dim #B8860B]{cwd}[/]")
ctx_str = f" [dim {dim}]·[/] [dim {dim}]{_format_context_length(context_length)} context[/]" if context_length else ""
left_lines.append(f"[{accent}]{model_short}[/]{ctx_str} [dim {dim}]·[/] [dim {dim}]Nous Research[/]")
left_lines.append(f"[dim {dim}]{cwd}[/]")
if session_id:
left_lines.append(f"[dim #8B8682]Session: {session_id}[/]")
left_lines.append(f"[dim {session_color}]Session: {session_id}[/]")
left_content = "\n".join(left_lines)
right_lines = ["[bold #FFBF00]Available Tools[/]"]
right_lines = [f"[bold {accent}]Available Tools[/]"]
toolsets_dict: Dict[str, list] = {}
for tool in tools:
@@ -256,7 +284,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
if name in disabled_tools:
colored_names.append(f"[red]{name}[/]")
else:
colored_names.append(f"[#FFF8DC]{name}[/]")
colored_names.append(f"[{text}]{name}[/]")
tools_str = ", ".join(colored_names)
if len(", ".join(sorted(tool_names))) > 45:
@@ -275,7 +303,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
elif name in disabled_tools:
colored_names.append(f"[red]{name}[/]")
else:
colored_names.append(f"[#FFF8DC]{name}[/]")
colored_names.append(f"[{text}]{name}[/]")
tools_str = ", ".join(colored_names)
right_lines.append(f"[dim #B8860B]{toolset}:[/] {tools_str}")
@@ -306,7 +334,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
)
right_lines.append("")
right_lines.append("[bold #FFBF00]Available Skills[/]")
right_lines.append(f"[bold {accent}]Available Skills[/]")
skills_by_category = get_available_skills()
total_skills = sum(len(s) for s in skills_by_category.values())
@@ -320,9 +348,9 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
skills_str = ", ".join(skill_names)
if len(skills_str) > 50:
skills_str = skills_str[:47] + "..."
right_lines.append(f"[dim #B8860B]{category}:[/] [#FFF8DC]{skills_str}[/]")
right_lines.append(f"[dim {dim}]{category}:[/] [{text}]{skills_str}[/]")
else:
right_lines.append("[dim #B8860B]No skills installed[/]")
right_lines.append(f"[dim {dim}]No skills installed[/]")
right_lines.append("")
mcp_connected = sum(1 for s in mcp_status if s["connected"]) if mcp_status else 0
@@ -330,7 +358,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
if mcp_connected:
summary_parts.append(f"{mcp_connected} MCP servers")
summary_parts.append("/help for commands")
right_lines.append(f"[dim #B8860B]{' · '.join(summary_parts)}[/]")
right_lines.append(f"[dim {dim}]{' · '.join(summary_parts)}[/]")
# Update check — show if behind origin/main
try:
@@ -347,10 +375,13 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
right_content = "\n".join(right_lines)
layout_table.add_row(left_content, right_content)
agent_name = _skin_branding("agent_name", "Hermes Agent")
title_color = _skin_color("banner_title", "#FFD700")
border_color = _skin_color("banner_border", "#CD7F32")
outer_panel = Panel(
layout_table,
title=f"[bold #FFD700]Hermes Agent {VERSION}[/]",
border_style="#CD7F32",
title=f"[bold {title_color}]{agent_name} {VERSION}[/]",
border_style=border_color,
padding=(0, 2),
)
+4 -1
@@ -292,9 +292,12 @@ def _convert_to_png(path: Path) -> bool:
["convert", str(tmp), "png:" + str(path)],
capture_output=True, timeout=5,
)
tmp.unlink(missing_ok=True)
if r.returncode == 0 and path.exists() and path.stat().st_size > 0:
tmp.unlink(missing_ok=True)
return True
else:
# Convert failed — restore the original file
tmp.rename(path)
except FileNotFoundError:
logger.debug("ImageMagick not installed — cannot convert BMP to PNG")
if tmp.exists() and not path.exists():
+2
@@ -39,6 +39,8 @@ COMMANDS = {
"/insights": "Show usage insights and analytics (last 30 days)",
"/paste": "Check clipboard for an image and attach it",
"/reload-mcp": "Reload MCP servers from config.yaml",
"/rollback": "List or restore filesystem checkpoints (usage: /rollback [number])",
"/skin": "Show or change the display skin/theme",
"/quit": "Exit the CLI (also: /exit, /q)",
}
+89 -26
@@ -14,8 +14,9 @@ This module provides:
import os
import platform
import sys
import stat
import subprocess
import sys
from pathlib import Path
from typing import Dict, Any, Optional, List, Tuple
@@ -62,7 +63,9 @@ def ensure_hermes_home():
DEFAULT_CONFIG = {
"model": "anthropic/claude-opus-4.6",
"toolsets": ["hermes-cli"],
"max_turns": 100,
"agent": {
"max_turns": 90,
},
"terminal": {
"backend": "local",
@@ -88,6 +91,14 @@ DEFAULT_CONFIG = {
"record_sessions": False, # Auto-record browser sessions as WebM videos
},
# Filesystem checkpoints — automatic snapshots before destructive file ops.
# When enabled, the agent takes a snapshot of the working directory once per
# conversation turn (on first write_file/patch call). Use /rollback to restore.
"checkpoints": {
"enabled": False,
"max_snapshots": 50, # Max checkpoints to keep per directory
},
"compression": {
"enabled": True,
"threshold": 0.85,
@@ -111,8 +122,9 @@ DEFAULT_CONFIG = {
"display": {
"compact": False,
"personality": "kawaii",
"resume_display": "full", # "full" (show previous messages) | "minimal" (one-liner only)
"bell_on_complete": False, # Play terminal bell (\a) when agent finishes a response
"resume_display": "full",
"bell_on_complete": False,
"skin": "default",
},
# Text-to-speech configuration
@@ -170,7 +182,7 @@ DEFAULT_CONFIG = {
"command_allowlist": [],
# Config schema version - bump this when adding new required fields
"_config_version": 5,
"_config_version": 6,
}
# =============================================================================
@@ -195,6 +207,22 @@ REQUIRED_ENV_VARS = {}
# Optional environment variables that enhance functionality
OPTIONAL_ENV_VARS = {
# ── Provider (handled in provider selection, not shown in checklists) ──
"NOUS_API_KEY": {
"description": "Nous Portal API key (direct API key access to Nous inference)",
"prompt": "Nous Portal API key",
"url": "https://portal.nousresearch.com",
"password": True,
"category": "provider",
"advanced": True,
},
"NOUS_BASE_URL": {
"description": "Nous Portal base URL override",
"prompt": "Nous Portal base URL (leave empty for default)",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
"OPENROUTER_API_KEY": {
"description": "OpenRouter API key (for vision, web scraping helpers, and MoA)",
"prompt": "OpenRouter API key",
@@ -748,6 +776,23 @@ def _deep_merge(base: dict, override: dict) -> dict:
return result
def _normalize_max_turns_config(config: Dict[str, Any]) -> Dict[str, Any]:
"""Normalize legacy root-level max_turns into agent.max_turns."""
config = dict(config)
agent_config = dict(config.get("agent") or {})
if "max_turns" in config and "max_turns" not in agent_config:
agent_config["max_turns"] = config["max_turns"]
if "max_turns" not in agent_config:
agent_config["max_turns"] = DEFAULT_CONFIG["agent"]["max_turns"]
config["agent"] = agent_config
config.pop("max_turns", None)
return config
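The precedence rules of `_normalize_max_turns_config` are easiest to see on sample inputs. The sketch below restates the function in self-contained form; `DEFAULT_MAX_TURNS` is a stand-in for `DEFAULT_CONFIG["agent"]["max_turns"]` (90 in this diff), and the shortened function name is hypothetical.

```python
from typing import Any, Dict

DEFAULT_MAX_TURNS = 90  # stand-in for DEFAULT_CONFIG["agent"]["max_turns"]


def normalize_max_turns(config: Dict[str, Any]) -> Dict[str, Any]:
    """Move a legacy root-level max_turns under agent.max_turns."""
    config = dict(config)  # shallow copy: the caller's dict is not mutated
    agent = dict(config.get("agent") or {})
    # A legacy root-level value is used only if agent.max_turns is absent.
    if "max_turns" in config and "max_turns" not in agent:
        agent["max_turns"] = config["max_turns"]
    agent.setdefault("max_turns", DEFAULT_MAX_TURNS)
    config["agent"] = agent
    config.pop("max_turns", None)
    return config


legacy = {"model": "m", "max_turns": 40}
migrated = normalize_max_turns(legacy)
```

Here `migrated` carries `agent.max_turns == 40` and no root-level key, while `legacy` itself is left unchanged. When both locations are set, the nested `agent.max_turns` wins.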
def load_config() -> Dict[str, Any]:
"""Load configuration from ~/.hermes/config.yaml."""
import copy
@@ -757,14 +802,21 @@ def load_config() -> Dict[str, Any]:
if config_path.exists():
try:
with open(config_path) as f:
with open(config_path, encoding="utf-8") as f:
user_config = yaml.safe_load(f) or {}
if "max_turns" in user_config:
agent_user_config = dict(user_config.get("agent") or {})
if agent_user_config.get("max_turns") is None:
agent_user_config["max_turns"] = user_config["max_turns"]
user_config["agent"] = agent_user_config
user_config.pop("max_turns", None)
config = _deep_merge(config, user_config)
except Exception as e:
print(f"Warning: Failed to load config: {e}")
return config
return _normalize_max_turns_config(config)
_COMMENTED_SECTIONS = """
@@ -799,23 +851,27 @@ _COMMENTED_SECTIONS = """
def save_config(config: Dict[str, Any]):
"""Save configuration to ~/.hermes/config.yaml."""
from utils import atomic_yaml_write
ensure_hermes_home()
config_path = get_config_path()
with open(config_path, 'w') as f:
yaml.dump(config, f, default_flow_style=False, sort_keys=False)
# Append commented-out sections for features that are off by default
# or only relevant when explicitly configured. Skip sections the
# user has already uncommented and configured.
sections = []
sec = config.get("security", {})
if not sec or sec.get("redact_secrets") is None:
sections.append("security")
fb = config.get("fallback_model", {})
if not fb or not (fb.get("provider") and fb.get("model")):
sections.append("fallback")
if sections:
f.write(_COMMENTED_SECTIONS)
normalized = _normalize_max_turns_config(config)
# Build optional commented-out sections for features that are off by
# default or only relevant when explicitly configured.
sections = []
sec = normalized.get("security", {})
if not sec or sec.get("redact_secrets") is None:
sections.append("security")
fb = normalized.get("fallback_model", {})
if not fb or not (fb.get("provider") and fb.get("model")):
sections.append("fallback")
atomic_yaml_write(
config_path,
normalized,
extra_content=_COMMENTED_SECTIONS if sections else None,
)
def load_env() -> Dict[str, str]:
@@ -869,6 +925,13 @@ def save_env_value(key: str, value: str):
with open(env_path, 'w', **write_kw) as f:
f.writelines(lines)
# Restrict .env permissions to owner-only (contains API keys)
if not _IS_WINDOWS:
try:
os.chmod(env_path, stat.S_IRUSR | stat.S_IWUSR)
except OSError:
pass
def get_env_value(key: str) -> Optional[str]:
"""Get a value from ~/.hermes/.env or environment."""
@@ -932,7 +995,7 @@ def show_config():
print()
print(color("◆ Model", Colors.CYAN, Colors.BOLD))
print(f" Model: {config.get('model', 'not set')}")
print(f" Max turns: {config.get('max_turns', 100)}")
print(f" Max turns: {config.get('agent', {}).get('max_turns', DEFAULT_CONFIG['agent']['max_turns'])}")
print(f" Toolsets: {', '.join(config.get('toolsets', ['all']))}")
# Terminal
@@ -1077,7 +1140,7 @@ def set_config_value(key: str, value: str):
user_config = {}
if config_path.exists():
try:
with open(config_path) as f:
with open(config_path, encoding="utf-8") as f:
user_config = yaml.safe_load(f) or {}
except Exception:
user_config = {}
@@ -1105,7 +1168,7 @@ def set_config_value(key: str, value: str):
# Write only user config back (not the full merged defaults)
ensure_hermes_home()
with open(config_path, 'w') as f:
with open(config_path, 'w', encoding="utf-8") as f:
yaml.dump(user_config, f, default_flow_style=False, sort_keys=False)
# Keep .env in sync for keys that terminal_tool reads directly from env vars.
+52 -3
@@ -489,6 +489,7 @@ def cmd_chat(args):
"query": args.query,
"resume": getattr(args, "resume", None),
"worktree": getattr(args, "worktree", False),
"checkpoints": getattr(args, "checkpoints", False),
}
# Filter out None values
kwargs = {k: v for k, v in kwargs.items() if v is not None}
@@ -1777,6 +1778,44 @@ def cmd_update(args):
sys.exit(1)
def _coalesce_session_name_args(argv: list) -> list:
"""Join unquoted multi-word session names after -c/--continue and -r/--resume.
When a user types ``hermes -c Pokemon Agent Dev`` without quoting the
session name, argparse sees three separate tokens. This function merges
them into a single argument so argparse receives
``['-c', 'Pokemon Agent Dev']`` instead.
Tokens are collected after the flag until we hit another flag (``-*``)
or a known top-level subcommand.
"""
_SUBCOMMANDS = {
"chat", "model", "gateway", "setup", "whatsapp", "login", "logout",
"status", "cron", "doctor", "config", "pairing", "skills", "tools",
"sessions", "insights", "version", "update", "uninstall",
}
_SESSION_FLAGS = {"-c", "--continue", "-r", "--resume"}
result = []
i = 0
while i < len(argv):
token = argv[i]
if token in _SESSION_FLAGS:
result.append(token)
i += 1
# Collect subsequent non-flag, non-subcommand tokens as one name
parts: list = []
while i < len(argv) and not argv[i].startswith("-") and argv[i] not in _SUBCOMMANDS:
parts.append(argv[i])
i += 1
if parts:
result.append(" ".join(parts))
else:
result.append(token)
i += 1
return result
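The behavior described in the docstring of `_coalesce_session_name_args` can be checked on a few argument lists. This is a compact restatement for illustration, assuming an abbreviated subcommand set; the names `coalesce`, `SUBCOMMANDS`, and `SESSION_FLAGS` are hypothetical.

```python
from typing import List

SUBCOMMANDS = {"chat", "model", "gateway"}  # abbreviated for this sketch
SESSION_FLAGS = {"-c", "--continue", "-r", "--resume"}


def coalesce(argv: List[str]) -> List[str]:
    """Merge unquoted multi-word session names into one token."""
    result: List[str] = []
    i = 0
    while i < len(argv):
        token = argv[i]
        result.append(token)
        i += 1
        if token in SESSION_FLAGS:
            # Collect words until the next flag or known subcommand.
            parts: List[str] = []
            while (
                i < len(argv)
                and not argv[i].startswith("-")
                and argv[i] not in SUBCOMMANDS
            ):
                parts.append(argv[i])
                i += 1
            if parts:
                result.append(" ".join(parts))
    return result


merged = coalesce(["-c", "Pokemon", "Agent", "Dev"])
```

Here `merged` is `["-c", "Pokemon Agent Dev"]`. A side effect of stopping at subcommand names is that an unquoted session name containing a word such as `model` is cut off at that word, so quoting remains the safer option for such names.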
def main():
"""Main entry point for hermes CLI."""
parser = argparse.ArgumentParser(
@@ -1889,6 +1928,12 @@ For more help on a command:
default=False,
help="Run in an isolated git worktree (for parallel agents on the same repo)"
)
chat_parser.add_argument(
"--checkpoints",
action="store_true",
default=False,
help="Enable filesystem checkpoints before destructive file operations (use /rollback to restore)"
)
chat_parser.set_defaults(func=cmd_chat)
# =========================================================================
@@ -2356,12 +2401,12 @@ For more help on a command:
if not data:
print(f"Session '{args.session_id}' not found.")
return
with open(args.output, "w") as f:
with open(args.output, "w", encoding="utf-8") as f:
f.write(_json.dumps(data, ensure_ascii=False) + "\n")
print(f"Exported 1 session to {args.output}")
else:
sessions = db.export_all(source=args.source)
with open(args.output, "w") as f:
with open(args.output, "w", encoding="utf-8") as f:
for s in sessions:
f.write(_json.dumps(s, ensure_ascii=False) + "\n")
print(f"Exported {len(sessions)} sessions to {args.output}")
@@ -2515,7 +2560,11 @@ For more help on a command:
# =========================================================================
# Parse and execute
# =========================================================================
args = parser.parse_args()
# Pre-process argv so unquoted multi-word session names after -c / -r
# are merged into a single token before argparse sees them.
# e.g. ``hermes -c Pokemon Agent Dev`` → ``hermes -c 'Pokemon Agent Dev'``
_processed_argv = _coalesce_session_name_args(sys.argv[1:])
args = parser.parse_args(_processed_argv)
# Handle --version flag
if args.version:
+54 -14
@@ -516,7 +516,8 @@ def setup_model_provider(config: dict):
keep_label = None # No provider configured — don't show "Keep current"
provider_choices = [
"Login with Nous Portal (Nous Research subscription)",
"Nous Portal API key (direct API key access)",
"Login with Nous Portal (Nous Research subscription — OAuth)",
"Login with OpenAI Codex",
"OpenRouter API key (100+ models, pay-per-use)",
"Custom OpenAI-compatible endpoint (self-hosted / VLLM / etc.)",
@@ -529,7 +530,7 @@ def setup_model_provider(config: dict):
provider_choices.append(keep_label)
# Default to "Keep current" if a provider exists, otherwise OpenRouter (most common)
default_provider = len(provider_choices) - 1 if has_any_provider else 2
default_provider = len(provider_choices) - 1 if has_any_provider else 3
if not has_any_provider:
print_warning("An inference provider is required for Hermes to work.")
@@ -541,7 +542,37 @@ def setup_model_provider(config: dict):
selected_provider = None # "nous", "openai-codex", "openrouter", "custom", or None (keep)
nous_models = [] # populated if Nous login succeeds
if provider_idx == 0: # Nous Portal
if provider_idx == 0: # Nous Portal API Key (direct)
selected_provider = "nous-api"
print()
print_header("Nous Portal API Key")
print_info("Use a Nous Portal API key for direct access to Nous inference.")
print_info("Get your API key at: https://portal.nousresearch.com")
print()
existing_key = get_env_value("NOUS_API_KEY")
if existing_key:
print_info(f"Current: {existing_key[:8]}... (configured)")
if prompt_yes_no("Update Nous API key?", False):
api_key = prompt(" Nous API key", password=True)
if api_key:
save_env_value("NOUS_API_KEY", api_key)
print_success("Nous API key updated")
else:
api_key = prompt(" Nous API key", password=True)
if api_key:
save_env_value("NOUS_API_KEY", api_key)
print_success("Nous API key saved")
else:
print_warning("Skipped - agent won't work without an API key")
# Clear custom endpoint vars if switching
if existing_custom:
save_env_value("OPENAI_BASE_URL", "")
save_env_value("OPENAI_API_KEY", "")
_update_config_for_provider("nous-api", "https://inference-api.nousresearch.com/v1")
elif provider_idx == 1: # Nous Portal
selected_provider = "nous"
print()
print_header("Nous Portal Login")
@@ -581,7 +612,7 @@ def setup_model_provider(config: dict):
print_info("You can try again later with: hermes model")
selected_provider = None
elif provider_idx == 1: # OpenAI Codex
elif provider_idx == 2: # OpenAI Codex
selected_provider = "openai-codex"
print()
print_header("OpenAI Codex Login")
@@ -605,7 +636,7 @@ def setup_model_provider(config: dict):
print_info("You can try again later with: hermes model")
selected_provider = None
elif provider_idx == 2: # OpenRouter
elif provider_idx == 3: # OpenRouter
selected_provider = "openrouter"
print()
print_header("OpenRouter API Key")
@@ -655,7 +686,7 @@ def setup_model_provider(config: dict):
except Exception as e:
logger.debug("Could not save provider to config.yaml: %s", e)
elif provider_idx == 3: # Custom endpoint
elif provider_idx == 4: # Custom endpoint
selected_provider = "custom"
print()
print_header("Custom OpenAI-Compatible Endpoint")
@@ -706,7 +737,7 @@ def setup_model_provider(config: dict):
print_success("Custom endpoint configured")
elif provider_idx == 4: # Z.AI / GLM
elif provider_idx == 5: # Z.AI / GLM
selected_provider = "zai"
print()
print_header("Z.AI / GLM API Key")
@@ -760,7 +791,7 @@ def setup_model_provider(config: dict):
save_env_value("OPENAI_API_KEY", "")
_update_config_for_provider("zai", zai_base_url)
elif provider_idx == 5: # Kimi / Moonshot
elif provider_idx == 6: # Kimi / Moonshot
selected_provider = "kimi-coding"
print()
print_header("Kimi / Moonshot API Key")
@@ -792,7 +823,7 @@ def setup_model_provider(config: dict):
save_env_value("OPENAI_API_KEY", "")
_update_config_for_provider("kimi-coding", pconfig.inference_base_url)
elif provider_idx == 6: # MiniMax
elif provider_idx == 7: # MiniMax
selected_provider = "minimax"
print()
print_header("MiniMax API Key")
@@ -824,7 +855,7 @@ def setup_model_provider(config: dict):
save_env_value("OPENAI_API_KEY", "")
_update_config_for_provider("minimax", pconfig.inference_base_url)
elif provider_idx == 7: # MiniMax China
elif provider_idx == 8: # MiniMax China
selected_provider = "minimax-cn"
print()
print_header("MiniMax China API Key")
@@ -856,12 +887,12 @@ def setup_model_provider(config: dict):
save_env_value("OPENAI_API_KEY", "")
_update_config_for_provider("minimax-cn", pconfig.inference_base_url)
# else: provider_idx == 8 (Keep current) — only shown when a provider already exists
# else: provider_idx == 9 (Keep current) — only shown when a provider already exists
# ── OpenRouter API Key for tools (if not already set) ──
# Tools (vision, web, MoA) use OpenRouter independently of the main provider.
# Prompt for OpenRouter key if not set and a non-OpenRouter provider was chosen.
if selected_provider in ("nous", "openai-codex", "custom", "zai", "kimi-coding", "minimax", "minimax-cn") and not get_env_value("OPENROUTER_API_KEY"):
if selected_provider in ("nous", "nous-api", "openai-codex", "custom", "zai", "kimi-coding", "minimax", "minimax-cn") and not get_env_value("OPENROUTER_API_KEY"):
print()
print_header("OpenRouter API Key (for tools)")
print_info("Tools like vision analysis, web search, and MoA use OpenRouter")
@@ -914,6 +945,14 @@ def setup_model_provider(config: dict):
if custom:
config['model'] = custom
save_env_value("LLM_MODEL", custom)
elif selected_provider == "nous-api":
# Nous API key provider — prompt for model manually
print_info("Enter a model name available on Nous inference API.")
print_info("Examples: anthropic/claude-opus-4.6, deepseek/deepseek-r1")
custom = prompt(f" Model name (Enter to keep '{current_model}')")
if custom:
config['model'] = custom
save_env_value("LLM_MODEL", custom)
elif selected_provider == "openai-codex":
from hermes_cli.codex_models import get_codex_model_ids
codex_models = get_codex_model_ids()
@@ -1309,7 +1348,7 @@ def setup_agent_settings(config: dict):
# ── Max Iterations ──
print_header("Agent Settings")
current_max = get_env_value('HERMES_MAX_ITERATIONS') or '90'
current_max = get_env_value('HERMES_MAX_ITERATIONS') or str(config.get('agent', {}).get('max_turns', 90))
print_info("Maximum tool-calling iterations per conversation.")
print_info("Higher = more complex tasks, but costs more tokens.")
print_info("Recommended: 30-60 for most tasks, 100+ for open exploration.")
@@ -1319,7 +1358,8 @@ def setup_agent_settings(config: dict):
max_iter = int(max_iter_str)
if max_iter > 0:
save_env_value("HERMES_MAX_ITERATIONS", str(max_iter))
config['max_turns'] = max_iter
config.setdefault('agent', {})['max_turns'] = max_iter
config.pop('max_turns', None)
print_success(f"Max iterations set to {max_iter}")
except ValueError:
print_warning("Invalid number, keeping current value")
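Reviewer note: the last two hunks move `max_turns` from the top level of the config into an `agent` section. A minimal sketch of that read/write pattern follows. The helper names `resolve_max_turns` and `store_max_turns` are hypothetical, and `os.environ` stands in for the project's `get_env_value`, which may read from a different source.

```python
import os
from typing import Any, Dict


def resolve_max_turns(config: Dict[str, Any], default: int = 90) -> int:
    """Environment variable first, then agent.max_turns, then the default."""
    env_value = os.environ.get("HERMES_MAX_ITERATIONS")  # stand-in for get_env_value
    if env_value:
        return int(env_value)
    return int(config.get("agent", {}).get("max_turns", default))


def store_max_turns(config: Dict[str, Any], value: int) -> None:
    """Write the nested key and drop the legacy top-level key, as in the diff."""
    config.setdefault("agent", {})["max_turns"] = value
    config.pop("max_turns", None)
```

Note that a legacy top-level `max_turns` is ignored on read in this sketch, which mirrors the new default expression in the diff; it is only cleaned up when a new value is stored.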
+630
@@ -0,0 +1,630 @@
"""Hermes CLI skin/theme engine.
A data-driven skin system that lets users customize the CLI's visual appearance.
Skins are defined as YAML files in ~/.hermes/skins/ or as built-in presets.
No code changes are needed to add a new skin.
SKIN YAML SCHEMA
================
All fields except ``name`` are optional. Missing values inherit from the ``default`` skin.
.. code-block:: yaml
# Required: skin identity
name: mytheme # Unique skin name (lowercase, hyphens ok)
description: Short description # Shown in /skin listing
# Colors: hex values for Rich markup (banner, UI, response box)
colors:
banner_border: "#CD7F32" # Panel border color
banner_title: "#FFD700" # Panel title text color
banner_accent: "#FFBF00" # Section headers (Available Tools, etc.)
banner_dim: "#B8860B" # Dim/muted text (separators, labels)
banner_text: "#FFF8DC" # Body text (tool names, skill names)
ui_accent: "#FFBF00" # General UI accent
ui_label: "#4dd0e1" # UI labels
ui_ok: "#4caf50" # Success indicators
ui_error: "#ef5350" # Error indicators
ui_warn: "#ffa726" # Warning indicators
prompt: "#FFF8DC" # Prompt text color
input_rule: "#CD7F32" # Input area horizontal rule
response_border: "#FFD700" # Response box border (ANSI)
session_label: "#DAA520" # Session label color
session_border: "#8B8682" # Session ID dim color
# Spinner: customize the animated spinner during API calls
spinner:
waiting_faces: # Faces shown while waiting for API
- "(⚔)"
- "(⛨)"
thinking_faces: # Faces shown during reasoning
- "(⌁)"
- "(<>)"
thinking_verbs: # Verbs for spinner messages
- "forging"
- "plotting"
wings: # Optional left/right spinner decorations
- ["⟪⚔", "⚔⟫"] # Each entry is [left, right] pair
- ["⟪▲", "▲⟫"]
# Branding: text strings used throughout the CLI
branding:
agent_name: "Hermes Agent" # Banner title, status display
welcome: "Welcome message" # Shown at CLI startup
goodbye: "Goodbye! ⚕" # Shown on exit
response_label: " ⚕ Hermes " # Response box header label
prompt_symbol: " " # Input prompt symbol
help_header: "(^_^)? Commands" # /help header text
# Tool prefix: character for tool output lines (default: ┊)
tool_prefix: ""
USAGE
=====
.. code-block:: python
from hermes_cli.skin_engine import get_active_skin, list_skins, set_active_skin
skin = get_active_skin()
print(skin.colors["banner_title"]) # "#FFD700"
print(skin.get_branding("agent_name")) # "Hermes Agent"
set_active_skin("ares") # Switch to built-in ares skin
set_active_skin("mytheme") # Switch to user skin from ~/.hermes/skins/
BUILT-IN SKINS
==============
- ``default`` Classic Hermes gold/kawaii (the current look)
- ``ares`` Crimson/bronze war-god theme with custom spinner wings
- ``mono`` Clean grayscale monochrome
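- ``slate`` Cool blue developer-focused theme
- ``poseidon`` Deep blue/seafoam ocean-god theme with custom spinner wings
- ``sisyphus`` Austere grayscale theme with custom spinner wings
- ``charizard`` Burnt orange/ember volcanic theme with custom spinner wings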
USER SKINS
==========
Drop a YAML file in ``~/.hermes/skins/<name>.yaml`` following the schema above.
Activate with ``/skin <name>`` in the CLI or ``display.skin: <name>`` in config.yaml.
"""
import logging
import os
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
logger = logging.getLogger(__name__)
# =============================================================================
# Skin data structure
# =============================================================================
@dataclass
class SkinConfig:
"""Complete skin configuration."""
name: str
description: str = ""
colors: Dict[str, str] = field(default_factory=dict)
spinner: Dict[str, Any] = field(default_factory=dict)
branding: Dict[str, str] = field(default_factory=dict)
tool_prefix: str = ""
banner_logo: str = "" # Rich-markup ASCII art logo (replaces HERMES_AGENT_LOGO)
banner_hero: str = "" # Rich-markup hero art (replaces HERMES_CADUCEUS)
def get_color(self, key: str, fallback: str = "") -> str:
"""Get a color value with fallback."""
return self.colors.get(key, fallback)
def get_spinner_list(self, key: str) -> List[str]:
"""Get a spinner list (faces, verbs, etc.)."""
return self.spinner.get(key, [])
def get_spinner_wings(self) -> List[Tuple[str, str]]:
"""Get spinner wing pairs, or empty list if none."""
raw = self.spinner.get("wings", [])
result = []
for pair in raw:
if isinstance(pair, (list, tuple)) and len(pair) == 2:
result.append((str(pair[0]), str(pair[1])))
return result
def get_branding(self, key: str, fallback: str = "") -> str:
"""Get a branding value with fallback."""
return self.branding.get(key, fallback)
# =============================================================================
# Built-in skin definitions
# =============================================================================
_BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"default": {
"name": "default",
"description": "Classic Hermes — gold and kawaii",
"colors": {
"banner_border": "#CD7F32",
"banner_title": "#FFD700",
"banner_accent": "#FFBF00",
"banner_dim": "#B8860B",
"banner_text": "#FFF8DC",
"ui_accent": "#FFBF00",
"ui_label": "#4dd0e1",
"ui_ok": "#4caf50",
"ui_error": "#ef5350",
"ui_warn": "#ffa726",
"prompt": "#FFF8DC",
"input_rule": "#CD7F32",
"response_border": "#FFD700",
"session_label": "#DAA520",
"session_border": "#8B8682",
},
"spinner": {
# Empty = use hardcoded defaults in display.py
},
"branding": {
"agent_name": "Hermes Agent",
"welcome": "Welcome to Hermes Agent! Type your message or /help for commands.",
"goodbye": "Goodbye! ⚕",
"response_label": " ⚕ Hermes ",
"prompt_symbol": " ",
"help_header": "(^_^)? Available Commands",
},
"tool_prefix": "",
},
"ares": {
"name": "ares",
"description": "War-god theme — crimson and bronze",
"colors": {
"banner_border": "#9F1C1C",
"banner_title": "#C7A96B",
"banner_accent": "#DD4A3A",
"banner_dim": "#6B1717",
"banner_text": "#F1E6CF",
"ui_accent": "#DD4A3A",
"ui_label": "#C7A96B",
"ui_ok": "#4caf50",
"ui_error": "#ef5350",
"ui_warn": "#ffa726",
"prompt": "#F1E6CF",
"input_rule": "#9F1C1C",
"response_border": "#C7A96B",
"session_label": "#C7A96B",
"session_border": "#6E584B",
},
"spinner": {
"waiting_faces": ["(⚔)", "(⛨)", "(▲)", "(<>)", "(/)"],
"thinking_faces": ["(⚔)", "(⛨)", "(▲)", "(⌁)", "(<>)"],
"thinking_verbs": [
"forging", "marching", "sizing the field", "holding the line",
"hammering plans", "tempering steel", "plotting impact", "raising the shield",
],
"wings": [
["⟪⚔", "⚔⟫"],
["⟪▲", "▲⟫"],
["⟪╸", "╺⟫"],
["⟪⛨", "⛨⟫"],
],
},
"branding": {
"agent_name": "Ares Agent",
"welcome": "Welcome to Ares Agent! Type your message or /help for commands.",
"goodbye": "Farewell, warrior! ⚔",
"response_label": " ⚔ Ares ",
"prompt_symbol": " ",
"help_header": "(⚔) Available Commands",
},
"tool_prefix": "",
"banner_logo": """[bold #A3261F] █████╗ ██████╗ ███████╗███████╗ █████╗ ██████╗ ███████╗███╗ ██╗████████╗[/]
[bold #B73122]██╔══██╗██╔══██╗██╔════╝██╔════╝ ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝[/]
[#C93C24]███████║██████╔╝█████╗ ███████╗█████╗███████║██║ ███╗█████╗ ██╔██╗ ██║ ██║[/]
[#D84A28]██╔══██║██╔══██╗██╔══╝ ╚════██║╚════╝██╔══██║██║ ██║██╔══╝ ██║╚██╗██║ ██║[/]
[#E15A2D]██║ ██║██║ ██║███████╗███████║ ██║ ██║╚██████╔╝███████╗██║ ╚████║ ██║[/]
[#EB6C32]╚═╝ ╚═╝╚═╝ ╚═╝╚══════╝╚══════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═╝[/]""",
"banner_hero": """[#9F1C1C]⠀⣤⣤⠀[/]
[#9F1C1C]⠀⠀⠀⠀⠀⠀⠀⠀⢀⣴⣿⠟⠻⣿⣦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#C7A96B]⠀⠀⠀⠀⠀⠀⠀⣠⣾⡿⠋⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀[/]
[#C7A96B]⠀⠀⠀⠀⠀⢀⣾⡿⠋⠀⠀⢠⡄⠀⠀⠙⢿⣷⡀⠀⠀⠀⠀⠀[/]
[#DD4A3A]⠀⠀⠀⠀⣰⣿⠟⠀⠀⣰⣿⣿⣆⠀⠀⠀⠻⣿⣆⠀⠀⠀⠀[/]
[#DD4A3A]⠀⠀⠀⢰⣿⠏⠀⠀⢀⣾⡿⠉⢿⣷⡀⠀⠀⠹⣿⡆⠀⠀⠀[/]
[#9F1C1C]⠀⠀⠀⣿⡟⠀⠀⣠⣿⠟⠀⠀⠻⣿⣄⠀⠀⢻⣿⠀⠀⠀[/]
[#9F1C1C]⠀⠀⠀⣿⡇⠀⠀⠙⠋⠀⠀⚔⠀⠀⠙⠋⠀⠀⢸⣿⠀⠀⠀[/]
[#6B1717]⠀⢿⣧⠀⠀⣼⡿⠀⠀⠀[/]
[#6B1717]⠀⠀⠀⠘⢿⣷⣄⠀⠀⣠⣾⡿⠃⠀⠀⠀[/]
[#C7A96B]⠀⠀⠀⠀⠈⠻⣿⣷⣦⣤⣀⣀⣤⣤⣶⣿⠿⠋⠀⠀⠀⠀[/]
[#C7A96B]⠀⠀⠀⠀⠉⠛⠿⠿⠿⠿⠛⠉⠀⠀⠀⠀⠀⠀⠀[/]
[#DD4A3A]⠀⚔⠀[/]
[dim #6B1717]war god online[/]""",
},
"mono": {
"name": "mono",
"description": "Monochrome — clean grayscale",
"colors": {
"banner_border": "#555555",
"banner_title": "#e6edf3",
"banner_accent": "#aaaaaa",
"banner_dim": "#444444",
"banner_text": "#c9d1d9",
"ui_accent": "#aaaaaa",
"ui_label": "#888888",
"ui_ok": "#888888",
"ui_error": "#cccccc",
"ui_warn": "#999999",
"prompt": "#c9d1d9",
"input_rule": "#444444",
"response_border": "#aaaaaa",
"session_label": "#888888",
"session_border": "#555555",
},
"spinner": {},
"branding": {
"agent_name": "Hermes Agent",
"welcome": "Welcome to Hermes Agent! Type your message or /help for commands.",
"goodbye": "Goodbye! ⚕",
"response_label": " ⚕ Hermes ",
"prompt_symbol": " ",
"help_header": "[?] Available Commands",
},
"tool_prefix": "",
},
"slate": {
"name": "slate",
"description": "Cool blue — developer-focused",
"colors": {
"banner_border": "#4169e1",
"banner_title": "#7eb8f6",
"banner_accent": "#8EA8FF",
"banner_dim": "#4b5563",
"banner_text": "#c9d1d9",
"ui_accent": "#7eb8f6",
"ui_label": "#8EA8FF",
"ui_ok": "#63D0A6",
"ui_error": "#F7A072",
"ui_warn": "#e6a855",
"prompt": "#c9d1d9",
"input_rule": "#4169e1",
"response_border": "#7eb8f6",
"session_label": "#7eb8f6",
"session_border": "#4b5563",
},
"spinner": {},
"branding": {
"agent_name": "Hermes Agent",
"welcome": "Welcome to Hermes Agent! Type your message or /help for commands.",
"goodbye": "Goodbye! ⚕",
"response_label": " ⚕ Hermes ",
"prompt_symbol": " ",
"help_header": "(^_^)? Available Commands",
},
"tool_prefix": "",
},
"poseidon": {
"name": "poseidon",
"description": "Ocean-god theme — deep blue and seafoam",
"colors": {
"banner_border": "#2A6FB9",
"banner_title": "#A9DFFF",
"banner_accent": "#5DB8F5",
"banner_dim": "#153C73",
"banner_text": "#EAF7FF",
"ui_accent": "#5DB8F5",
"ui_label": "#A9DFFF",
"ui_ok": "#4caf50",
"ui_error": "#ef5350",
"ui_warn": "#ffa726",
"prompt": "#EAF7FF",
"input_rule": "#2A6FB9",
"response_border": "#5DB8F5",
"session_label": "#A9DFFF",
"session_border": "#496884",
},
"spinner": {
"waiting_faces": ["(≈)", "(Ψ)", "(∿)", "(◌)", "(◠)"],
"thinking_faces": ["(Ψ)", "(∿)", "(≈)", "(⌁)", "(◌)"],
"thinking_verbs": [
"charting currents", "sounding the depth", "reading foam lines",
"steering the trident", "tracking undertow", "plotting sea lanes",
"calling the swell", "measuring pressure",
],
"wings": [
["⟪≈", "≈⟫"],
["⟪Ψ", "Ψ⟫"],
["⟪∿", "∿⟫"],
["⟪◌", "◌⟫"],
],
},
"branding": {
"agent_name": "Poseidon Agent",
"welcome": "Welcome to Poseidon Agent! Type your message or /help for commands.",
"goodbye": "Fair winds! Ψ",
"response_label": " Ψ Poseidon ",
"prompt_symbol": "Ψ ",
"help_header": "(Ψ) Available Commands",
},
"tool_prefix": "",
"banner_logo": """[bold #B8E8FF]██████╗ ██████╗ ███████╗██╗██████╗ ███████╗ ██████╗ ███╗ ██╗ █████╗ ██████╗ ███████╗███╗ ██╗████████╗[/]
[bold #97D6FF]██╔══██╗██╔═══██╗██╔════╝██║██╔══██╗██╔════╝██╔═══██╗████╗ ██║ ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝[/]
[#75C1F6]██████╔╝██║ ██║███████╗██║██║ ██║█████╗ ██║ ██║██╔██╗ ██║█████╗███████║██║ ███╗█████╗ ██╔██╗ ██║ ██║[/]
[#4FA2E0]██╔═══╝ ██║ ██║╚════██║██║██║ ██║██╔══╝ ██║ ██║██║╚██╗██║╚════╝██╔══██║██║ ██║██╔══╝ ██║╚██╗██║ ██║[/]
[#2E7CC7]██║ ╚██████╔╝███████║██║██████╔╝███████╗╚██████╔╝██║ ╚████║ ██║ ██║╚██████╔╝███████╗██║ ╚████║ ██║[/]
[#1B4F95]╚═╝ ╚═════╝ ╚══════╝╚═╝╚═════╝ ╚══════╝ ╚═════╝ ╚═╝ ╚═══╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═╝[/]""",
"banner_hero": """[#2A6FB9]⠀⢀⣀⡀⠀⠀⠀⠀⠀⠀[/]
[#5DB8F5]⠀⣠⣾⣿⣷⣄⠀⠀⠀⠀⠀⠀[/]
[#5DB8F5]⠀⢠⣿⠏⠀Ψ⠀⠹⣿⡄⠀⠀⠀⠀⠀⠀⠀[/]
[#A9DFFF]⠀⣿⡟⠀⠀⢻⣿⠀⠀⠀⠀⠀[/]
[#A9DFFF]⠀⠀⠀≈≈≈≈≈⣿⡇⠀⠀⠀⠀⠀⢸⣿≈≈≈≈≈⠀⠀⠀[/]
[#5DB8F5]⠀⣿⡇⠀⠀⢸⣿⠀⠀⠀⠀⠀[/]
[#2A6FB9]⠀⢿⣧⠀⠀⣼⡿⠀⠀⠀⠀⠀[/]
[#2A6FB9]⠀⠀⠀⠀⠀⠀⠘⢿⣷⣄⣀⣠⣾⡿⠃⠀⠀⠀⠀⠀⠀⠀[/]
[#153C73]⠀⠈⠻⣿⣿⡿⠟⠁⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#153C73]⠀⠈⠁⠀[/]
[#5DB8F5]⠀⠀⠀⠀⠀≈≈≈≈≈≈≈≈≈≈≈≈≈≈≈⠀⠀⠀⠀⠀[/]
[#A9DFFF]⠀⠀⠀⠀⠀⠀≈≈≈≈≈≈≈≈≈≈≈≈≈⠀⠀⠀⠀⠀⠀[/]
[dim #153C73]deep waters hold[/]""",
},
"sisyphus": {
"name": "sisyphus",
"description": "Sisyphean theme — austere grayscale with persistence",
"colors": {
"banner_border": "#B7B7B7",
"banner_title": "#F5F5F5",
"banner_accent": "#E7E7E7",
"banner_dim": "#4A4A4A",
"banner_text": "#D3D3D3",
"ui_accent": "#E7E7E7",
"ui_label": "#D3D3D3",
"ui_ok": "#919191",
"ui_error": "#E7E7E7",
"ui_warn": "#B7B7B7",
"prompt": "#F5F5F5",
"input_rule": "#656565",
"response_border": "#B7B7B7",
"session_label": "#919191",
"session_border": "#656565",
},
"spinner": {
"waiting_faces": ["(◉)", "(◌)", "(◬)", "(⬤)", "(::)"],
"thinking_faces": ["(◉)", "(◬)", "(◌)", "(○)", "(●)"],
"thinking_verbs": [
"finding traction", "measuring the grade", "resetting the boulder",
"counting the ascent", "testing leverage", "setting the shoulder",
"pushing uphill", "enduring the loop",
],
"wings": [
["⟪◉", "◉⟫"],
["⟪◬", "◬⟫"],
["⟪◌", "◌⟫"],
["⟪⬤", "⬤⟫"],
],
},
"branding": {
"agent_name": "Sisyphus Agent",
"welcome": "Welcome to Sisyphus Agent! Type your message or /help for commands.",
"goodbye": "The boulder waits. ◉",
"response_label": " ◉ Sisyphus ",
"prompt_symbol": " ",
"help_header": "(◉) Available Commands",
},
"tool_prefix": "",
"banner_logo": """[bold #F5F5F5]███████╗██╗███████╗██╗ ██╗██████╗ ██╗ ██╗██╗ ██╗███████╗ █████╗ ██████╗ ███████╗███╗ ██╗████████╗[/]
[bold #E7E7E7]██╔════╝██║██╔════╝╚██╗ ██╔╝██╔══██╗██║ ██║██║ ██║██╔════╝ ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝[/]
[#D7D7D7]███████╗██║███████╗ ╚████╔╝ ██████╔╝███████║██║ ██║███████╗█████╗███████║██║ ███╗█████╗ ██╔██╗ ██║ ██║[/]
[#BFBFBF]╚════██║██║╚════██║ ╚██╔╝ ██╔═══╝ ██╔══██║██║ ██║╚════██║╚════╝██╔══██║██║ ██║██╔══╝ ██║╚██╗██║ ██║[/]
[#8F8F8F]███████║██║███████║ ██║ ██║ ██║ ██║╚██████╔╝███████║ ██║ ██║╚██████╔╝███████╗██║ ╚████║ ██║[/]
[#626262]╚══════╝╚═╝╚══════╝ ╚═╝ ╚═╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═╝[/]""",
"banner_hero": """[#B7B7B7]⠀⢀⣀⣀⣀⡀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#D3D3D3]⠀⠀⠀⠀⠀⣠⣾⣿⣿⣿⣿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#E7E7E7]⠀⠀⠀⠀⠀⣾⣿⣿⣿⣿⣿⣿⣿⣷⠀⠀⠀⠀⠀⠀⠀[/]
[#F5F5F5]⠀⠀⠀⠀⠀⢸⣿⣿⣿⣿⣿⣿⣿⣿⣿⡇⠀⠀⠀⠀⠀⠀[/]
[#E7E7E7]⠀⠀⠀⠀⠀⣿⣿⣿⣿⣿⣿⣿⣿⣿⠀⠀⠀⠀⠀⠀⠀[/]
[#D3D3D3]⠀⠀⠀⠀⠀⠘⢿⣿⣿⣿⣿⣿⡿⠃⠀⠀⠀⠀⠀⠀⠀[/]
[#B7B7B7]⠀⠙⠿⣿⠿⠋⠀⠀⠀⠀⠀[/]
[#919191][/]
[#656565]⠀⣰⡄⠀[/]
[#656565]⠀⣰⣿⣿⣆⠀⠀⠀⠀[/]
[#4A4A4A]⠀⣰⣿⣿⣿⣿⣆⠀⠀⠀⠀⠀⠀[/]
[#4A4A4A]⠀⠀⠀⠀⠀⣀⣴⣿⣿⣿⣿⣿⣿⣦⣀⠀⠀⠀⠀⠀⠀[/]
[#656565]⠀⠀⠀━━━━━━━━━━━━━━━━━━━━━━━⠀⠀⠀[/]
[dim #4A4A4A]the boulder[/]""",
},
"charizard": {
"name": "charizard",
"description": "Volcanic theme — burnt orange and ember",
"colors": {
"banner_border": "#C75B1D",
"banner_title": "#FFD39A",
"banner_accent": "#F29C38",
"banner_dim": "#7A3511",
"banner_text": "#FFF0D4",
"ui_accent": "#F29C38",
"ui_label": "#FFD39A",
"ui_ok": "#4caf50",
"ui_error": "#ef5350",
"ui_warn": "#ffa726",
"prompt": "#FFF0D4",
"input_rule": "#C75B1D",
"response_border": "#F29C38",
"session_label": "#FFD39A",
"session_border": "#6C4724",
},
"spinner": {
"waiting_faces": ["(✦)", "(▲)", "(◇)", "(<>)", "(🔥)"],
"thinking_faces": ["(✦)", "(▲)", "(◇)", "(⌁)", "(🔥)"],
"thinking_verbs": [
"banking into the draft", "measuring burn", "reading the updraft",
"tracking ember fall", "setting wing angle", "holding the flame core",
"plotting a hot landing", "coiling for lift",
],
"wings": [
["⟪✦", "✦⟫"],
["⟪▲", "▲⟫"],
["⟪◌", "◌⟫"],
["⟪◇", "◇⟫"],
],
},
"branding": {
"agent_name": "Charizard Agent",
"welcome": "Welcome to Charizard Agent! Type your message or /help for commands.",
"goodbye": "Flame out! ✦",
"response_label": " ✦ Charizard ",
"prompt_symbol": " ",
"help_header": "(✦) Available Commands",
},
"tool_prefix": "",
"banner_logo": """[bold #FFF0D4] ██████╗██╗ ██╗ █████╗ ██████╗ ██╗███████╗ █████╗ ██████╗ ██████╗ █████╗ ██████╗ ███████╗███╗ ██╗████████╗[/]
[bold #FFD39A]██╔════╝██║ ██║██╔══██╗██╔══██╗██║╚══███╔╝██╔══██╗██╔══██╗██╔══██╗ ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝[/]
[#F29C38]██║ ███████║███████║██████╔╝██║ ███╔╝ ███████║██████╔╝██║ ██║█████╗███████║██║ ███╗█████╗ ██╔██╗ ██║ ██║[/]
[#E2832B]██║ ██╔══██║██╔══██║██╔══██╗██║ ███╔╝ ██╔══██║██╔══██╗██║ ██║╚════╝██╔══██║██║ ██║██╔══╝ ██║╚██╗██║ ██║[/]
[#C75B1D]╚██████╗██║ ██║██║ ██║██║ ██║██║███████╗██║ ██║██║ ██║██████╔╝ ██║ ██║╚██████╔╝███████╗██║ ╚████║ ██║[/]
[#7A3511] ╚═════╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═╝[/]""",
"banner_hero": """[#FFD39A]⠀⣀⣤⠶⠶⠶⣤⣀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#F29C38]⠀⣴⠟⠁⠀⠀⠈⠻⣦⠀⠀⠀⠀⠀⠀[/]
[#F29C38]⠀⣼⠏⠀⠀✦⠀⠀⠹⣧⠀⠀⠀⠀⠀[/]
[#E2832B]⠀⠀⠀⠀⢰⡟⠀⠀⣀⣤⣤⣤⣀⠀⠀⠀⢻⡆⠀⠀⠀⠀[/]
[#E2832B]⠀⠀⣠⡾⠛⠁⣠⣾⠟⠉⠀⠉⠻⣷⣄⠀⠈⠛⢷⣄⠀⠀[/]
[#C75B1D]⠀⣼⠟⠀⢀⣾⠟⠁⠀⠀⠀⠀⠀⠈⠻⣷⡀⠀⠻⣧⠀[/]
[#C75B1D]⢸⡟⠀⠀⣿⡟⠀⠀🔥⠀⠀⢻⣿⠀⠀⢻⡇[/]
[#7A3511]⠀⠻⣦⡀⠘⢿⣧⡀⠀⠀⠀⠀⠀⢀⣼⡿⠃⢀⣴⠟⠀[/]
[#7A3511]⠀⠀⠈⠻⣦⣀⠙⢿⣷⣤⣤⣤⣾⡿⠋⣀⣴⠟⠁⠀⠀[/]
[#C75B1D]⠀⠀⠀⠀⠈⠙⠛⠶⠤⠭⠭⠤⠶⠛⠋⠁⠀⠀⠀⠀[/]
[#F29C38]⠀⣰⡿⢿⣆⠀⠀⠀[/]
[#F29C38]⠀⣼⡟⠀⠀⢻⣧⠀⠀⠀[/]
[dim #7A3511]tail flame lit[/]""",
},
}
# =============================================================================
# Skin loading and management
# =============================================================================
_active_skin: Optional[SkinConfig] = None
_active_skin_name: str = "default"
def _skins_dir() -> Path:
"""User skins directory."""
home = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
return home / "skins"
def _load_skin_from_yaml(path: Path) -> Optional[Dict[str, Any]]:
"""Load a skin definition from a YAML file."""
try:
import yaml
with open(path, "r", encoding="utf-8") as f:
data = yaml.safe_load(f)
if isinstance(data, dict) and "name" in data:
return data
except Exception as e:
logger.debug("Failed to load skin from %s: %s", path, e)
return None
def _build_skin_config(data: Dict[str, Any]) -> SkinConfig:
"""Build a SkinConfig from a raw dict (built-in or loaded from YAML)."""
# Start with default values as base for missing keys
default = _BUILTIN_SKINS["default"]
colors = dict(default.get("colors", {}))
colors.update(data.get("colors", {}))
spinner = dict(default.get("spinner", {}))
spinner.update(data.get("spinner", {}))
branding = dict(default.get("branding", {}))
branding.update(data.get("branding", {}))
return SkinConfig(
name=data.get("name", "unknown"),
description=data.get("description", ""),
colors=colors,
spinner=spinner,
branding=branding,
tool_prefix=data.get("tool_prefix", default.get("tool_prefix", "")),
banner_logo=data.get("banner_logo", ""),
banner_hero=data.get("banner_hero", ""),
)
def list_skins() -> List[Dict[str, str]]:
"""List all available skins (built-in + user-installed).
Returns list of {"name": ..., "description": ..., "source": "builtin"|"user"}.
"""
result = []
for name, data in _BUILTIN_SKINS.items():
result.append({
"name": name,
"description": data.get("description", ""),
"source": "builtin",
})
skins_path = _skins_dir()
if skins_path.is_dir():
for f in sorted(skins_path.glob("*.yaml")):
data = _load_skin_from_yaml(f)
if data:
skin_name = data.get("name", f.stem)
# Skip if it shadows a built-in
if any(s["name"] == skin_name for s in result):
continue
result.append({
"name": skin_name,
"description": data.get("description", ""),
"source": "user",
})
return result
def load_skin(name: str) -> SkinConfig:
"""Load a skin by name. Checks user skins first, then built-in."""
# Check user skins directory
skins_path = _skins_dir()
user_file = skins_path / f"{name}.yaml"
if user_file.is_file():
data = _load_skin_from_yaml(user_file)
if data:
return _build_skin_config(data)
# Check built-in skins
if name in _BUILTIN_SKINS:
return _build_skin_config(_BUILTIN_SKINS[name])
# Fallback to default
logger.warning("Skin '%s' not found, using default", name)
return _build_skin_config(_BUILTIN_SKINS["default"])
def get_active_skin() -> SkinConfig:
"""Get the currently active skin config (cached)."""
global _active_skin
if _active_skin is None:
_active_skin = load_skin(_active_skin_name)
return _active_skin
def set_active_skin(name: str) -> SkinConfig:
"""Switch the active skin. Returns the new SkinConfig."""
global _active_skin, _active_skin_name
_active_skin_name = name
_active_skin = load_skin(name)
return _active_skin
def get_active_skin_name() -> str:
"""Get the name of the currently active skin."""
return _active_skin_name
def init_skin_from_config(config: dict) -> None:
"""Initialize the active skin from CLI config at startup.
Call this once during CLI init with the loaded config dict.
"""
display = config.get("display", {})
skin_name = display.get("skin", "default")
if isinstance(skin_name, str) and skin_name.strip():
set_active_skin(skin_name.strip())
else:
set_active_skin("default")
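Reviewer note: `_build_skin_config` merges each section of a skin over the `default` skin, so a user skin only needs to list the keys it changes. The sketch below reproduces that merge on plain dicts so it can be run without the module or PyYAML. `DEFAULT_SKIN` is a trimmed stand-in and `merge_skin` is a hypothetical name; neither is part of the diff.

```python
from typing import Any, Dict

# Trimmed stand-in for the built-in default skin (values copied from the diff).
DEFAULT_SKIN: Dict[str, Any] = {
    "colors": {"banner_title": "#FFD700", "ui_ok": "#4caf50"},
    "branding": {"agent_name": "Hermes Agent", "goodbye": "Goodbye! ⚕"},
}


def merge_skin(user: Dict[str, Any]) -> Dict[str, Any]:
    """Overlay a partial user skin on the default, one section at a time."""
    merged: Dict[str, Any] = {"name": user.get("name", "unknown")}
    for section in ("colors", "branding"):
        values = dict(DEFAULT_SKIN[section])    # copy so the default stays untouched
        values.update(user.get(section, {}))    # user keys win
        merged[section] = values
    return merged


skin = merge_skin({"name": "mytheme", "colors": {"banner_title": "#00FFAA"}})
print(skin["colors"]["banner_title"])  # value overridden by the user skin
print(skin["colors"]["ui_ok"])         # value inherited from the default
```

The merge is shallow per section, which is enough here because every section in the schema is a flat mapping.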
+2 -2
@@ -263,7 +263,7 @@ def show_status(args):
if jobs_file.exists():
import json
try:
with open(jobs_file) as f:
with open(jobs_file, encoding="utf-8") as f:
data = json.load(f)
jobs = data.get("jobs", [])
enabled_jobs = [j for j in jobs if j.get("enabled", True)]
@@ -283,7 +283,7 @@ def show_status(args):
if sessions_file.exists():
import json
try:
with open(sessions_file) as f:
with open(sessions_file, encoding="utf-8") as f:
data = json.load(f)
print(f" Active: {len(data)} session(s)")
except Exception:
+3
@@ -7,3 +7,6 @@ without risk of circular imports.
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
OPENROUTER_MODELS_URL = f"{OPENROUTER_BASE_URL}/models"
OPENROUTER_CHAT_URL = f"{OPENROUTER_BASE_URL}/chat/completions"
NOUS_API_BASE_URL = "https://inference-api.nousresearch.com/v1"
NOUS_API_CHAT_URL = f"{NOUS_API_BASE_URL}/chat/completions"
+44 -5
@@ -16,6 +16,7 @@ Key design decisions:
import json
import os
import re
import sqlite3
import time
from pathlib import Path
@@ -490,12 +491,16 @@ class SessionDB:
msg_id = cursor.lastrowid
# Update counters
is_tool_related = role == "tool" or tool_calls is not None
if is_tool_related:
# Count actual tool calls from the tool_calls list (not from tool responses).
# A single assistant message can contain multiple parallel tool calls.
num_tool_calls = 0
if tool_calls is not None:
num_tool_calls = len(tool_calls) if isinstance(tool_calls, list) else 1
if num_tool_calls > 0:
self._conn.execute(
"""UPDATE sessions SET message_count = message_count + 1,
tool_call_count = tool_call_count + 1 WHERE id = ?""",
(session_id,),
tool_call_count = tool_call_count + ? WHERE id = ?""",
(num_tool_calls, session_id),
)
else:
self._conn.execute(
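Reviewer note: the hunk above changes the counter from "one per tool-related message" to "one per tool call in the message". The counting rule can be isolated as a small function for testing; `count_tool_calls` is a hypothetical name, and the body restates the logic from the diff.

```python
from typing import Any, Optional


def count_tool_calls(tool_calls: Optional[Any]) -> int:
    """Number of tool calls carried by one message, following the diff's rule."""
    if tool_calls is None:
        return 0
    # A list holds one entry per (possibly parallel) call; anything else counts once.
    return len(tool_calls) if isinstance(tool_calls, list) else 1
```

Under this rule a `role == "tool"` response with no `tool_calls` adds nothing to `tool_call_count`, and an empty list also counts as zero.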
@@ -553,6 +558,32 @@ class SessionDB:
# Search
# =========================================================================
@staticmethod
def _sanitize_fts5_query(query: str) -> str:
"""Sanitize user input for safe use in FTS5 MATCH queries.
FTS5 has its own query syntax where characters like ``"``, ``(``, ``)``,
``+``, ``*``, ``{``, ``}`` and bare boolean operators (``AND``, ``OR``,
``NOT``) have special meaning. Passing raw user input directly to
MATCH can cause ``sqlite3.OperationalError``.
Strategy: strip characters that are only meaningful as FTS5 operators
and would otherwise cause syntax errors. This preserves normal keyword
search while preventing crashes on inputs like ``C++``, ``"unterminated``,
or ``hello AND``.
"""
# Remove FTS5-special characters that are not useful in keyword search
sanitized = re.sub(r'[+{}()"^]', " ", query)
# Collapse repeated * (e.g. "***") into a single one, and remove
# leading * (prefix-only matching requires at least one char before *)
sanitized = re.sub(r"\*+", "*", sanitized)
sanitized = re.sub(r"(^|\s)\*", r"\1", sanitized)
# Remove dangling boolean operators at start/end that would cause
# syntax errors (e.g. "hello AND" or "OR world")
sanitized = re.sub(r"(?i)^(AND|OR|NOT)\b\s*", "", sanitized.strip())
sanitized = re.sub(r"(?i)\s+(AND|OR|NOT)\s*$", "", sanitized.strip())
return sanitized.strip()
def search_messages(
self,
query: str,
@@ -576,6 +607,10 @@ class SessionDB:
if not query or not query.strip():
return []
query = self._sanitize_fts5_query(query)
if not query:
return []
if source_filter is None:
source_filter = ["cli", "telegram", "discord", "whatsapp", "slack"]
@@ -615,7 +650,11 @@ class SessionDB:
LIMIT ? OFFSET ?
"""
cursor = self._conn.execute(sql, params)
try:
cursor = self._conn.execute(sql, params)
except sqlite3.OperationalError:
# FTS5 query syntax error despite sanitization — return empty
return []
matches = [dict(row) for row in cursor.fetchall()]
# Add surrounding context (1 message before + after each match)
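Reviewer note: the sanitizing steps from `_sanitize_fts5_query` can be tried in isolation. The block below is a standalone copy of the regex sequence shown in the diff (the name `sanitize_fts5_query` without the leading underscore is only for this sketch), run against the inputs named in the docstring.

```python
import re


def sanitize_fts5_query(query: str) -> str:
    """Standalone copy of the sanitizing steps shown in the diff."""
    # Replace characters that only act as FTS5 operators with spaces.
    sanitized = re.sub(r'[+{}()"^]', " ", query)
    # Collapse runs of * and drop any * that has no preceding term.
    sanitized = re.sub(r"\*+", "*", sanitized)
    sanitized = re.sub(r"(^|\s)\*", r"\1", sanitized)
    # Drop a boolean operator left dangling at the start or the end.
    sanitized = re.sub(r"(?i)^(AND|OR|NOT)\b\s*", "", sanitized.strip())
    sanitized = re.sub(r"(?i)\s+(AND|OR|NOT)\s*$", "", sanitized.strip())
    return sanitized.strip()


for raw in ["C++", '"unterminated', "hello AND", "***", "deploy*"]:
    print(repr(raw), "->", repr(sanitize_fts5_query(raw)))
```

Because the operator patterns are case-insensitive, a trailing lowercase word such as `and` is also dropped. Operators left dangling in the middle of a query are not handled by these steps, which is presumably why the diff also wraps the `MATCH` query in a `try`/`except sqlite3.OperationalError`.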
Binary file not shown. Size: 28 KiB
Binary file not shown. Size: 870 B
Binary file not shown. Size: 2.5 KiB
Binary file not shown. Size: 7.9 KiB
Binary file not shown. Size: 29 KiB
Binary file not shown. Size: 134 KiB

+4 -1
@@ -19,7 +19,10 @@
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&family=JetBrains+Mono:wght@400;500&display=swap" rel="stylesheet">
<link rel="stylesheet" href="style.css">
<link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='.9em' font-size='90'>⚕</text></svg>">
<link rel="icon" type="image/x-icon" href="favicon.ico">
<link rel="icon" type="image/png" sizes="32x32" href="favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="favicon-16x16.png">
<link rel="apple-touch-icon" sizes="180x180" href="apple-touch-icon.png">
</head>
<body>
<!-- Ambient glow effects -->
+9 -1
@@ -266,6 +266,7 @@ def handle_function_call(
function_args: Dict[str, Any],
task_id: Optional[str] = None,
user_task: Optional[str] = None,
enabled_tools: Optional[List[str]] = None,
) -> str:
"""
Main function call dispatcher that routes calls to the tool registry.
@@ -275,6 +276,10 @@ def handle_function_call(
function_args: Arguments for the function.
task_id: Unique identifier for terminal/browser session isolation.
user_task: The user's original task (for browser_snapshot context).
enabled_tools: Tool names enabled for this session. When provided,
execute_code uses this list to determine which sandbox
tools to generate. Falls back to the process-global
``_last_resolved_tool_names`` for backward compat.
Returns:
Function result as a JSON string.
@@ -284,10 +289,13 @@ def handle_function_call(
return json.dumps({"error": f"{function_name} must be handled by the agent loop"})
if function_name == "execute_code":
# Prefer the caller-provided list so subagents can't overwrite
# the parent's tool set via the process-global.
sandbox_enabled = enabled_tools if enabled_tools is not None else _last_resolved_tool_names
return registry.dispatch(
function_name, function_args,
task_id=task_id,
enabled_tools=_last_resolved_tool_names,
enabled_tools=sandbox_enabled,
)
return registry.dispatch(
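Reviewer note: the fix relies on `is not None` rather than truthiness when choosing between the caller's list and the process-global. A minimal sketch of that selection follows; `pick_sandbox_tools` is a hypothetical helper, and the tool names in the stand-in global are made up for the example.

```python
from typing import List, Optional

# Stand-in for the process-global named in the diff; contents are illustrative.
_last_resolved_tool_names: List[str] = ["terminal", "web_search"]


def pick_sandbox_tools(enabled_tools: Optional[List[str]] = None) -> List[str]:
    """Prefer the caller-provided list; fall back to the global only when it is None."""
    return enabled_tools if enabled_tools is not None else _last_resolved_tool_names
```

With this check, an explicitly empty list means "no sandbox tools" and is respected, whereas `enabled_tools or _last_resolved_tool_names` would silently fall back to the global.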
+2
@@ -0,0 +1,2 @@
Optional migration workflows for importing user state and customizations from
other agent systems into Hermes Agent.
@@ -0,0 +1,281 @@
---
name: openclaw-migration
description: Migrate a user's OpenClaw customization footprint into Hermes Agent. Imports Hermes-compatible memories, SOUL.md, command allowlists, user skills, and selected workspace assets from ~/.openclaw, then reports exactly what could not be migrated and why.
version: 1.0.0
author: Hermes Agent (Nous Research)
license: MIT
metadata:
hermes:
tags: [Migration, OpenClaw, Hermes, Memory, Persona, Import]
related_skills: [hermes-agent]
---
# OpenClaw -> Hermes Migration
Use this skill when a user wants to move their OpenClaw setup into Hermes Agent with minimal manual cleanup.
## What this skill does
It uses `scripts/openclaw_to_hermes.py` to:
- import `SOUL.md` into the Hermes home directory as `SOUL.md`
- transform OpenClaw `MEMORY.md` and `USER.md` into Hermes memory entries
- merge OpenClaw command approval patterns into Hermes `command_allowlist`
- migrate Hermes-compatible messaging settings such as `TELEGRAM_ALLOWED_USERS` and `MESSAGING_CWD`
- copy OpenClaw skills into `~/.hermes/skills/openclaw-imports/`
- optionally copy the OpenClaw workspace instructions file into a chosen Hermes workspace
- mirror compatible workspace assets such as `workspace/tts/` into `~/.hermes/tts/`
- archive non-secret docs that do not have a direct Hermes destination
- produce a structured report listing migrated items, conflicts, skipped items, and reasons
## Path resolution
The helper script lives in this skill directory at:
- `scripts/openclaw_to_hermes.py`
When this skill is installed from the Skills Hub, the normal location is:
- `~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py`
Do not guess a shorter path like `~/.hermes/skills/openclaw-migration/...`.
Before running the helper:
1. Prefer the installed path under `~/.hermes/skills/migration/openclaw-migration/`.
2. If that path fails, inspect the installed skill directory and resolve the script relative to the installed `SKILL.md`.
3. Only use `find` as a fallback if the installed location is missing or the skill was moved manually.
4. When calling the terminal tool, do not pass `workdir: "~"`. Use an absolute directory such as the user's home directory, or omit `workdir` entirely.
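The resolution order above could be sketched roughly as follows. `resolve_helper` is a hypothetical helper, not part of the skill, and it uses a directory walk in place of the `find` fallback; treat it as one possible reading of the steps, not the required implementation.

```python
from pathlib import Path
from typing import Optional

SCRIPT_REL = Path("scripts") / "openclaw_to_hermes.py"


def resolve_helper(hermes_home: Path) -> Optional[Path]:
    """Return the helper script path, or None if it cannot be located."""
    # Step 1: prefer the normal Skills Hub install location.
    preferred = hermes_home / "skills" / "migration" / "openclaw-migration" / SCRIPT_REL
    if preferred.is_file():
        return preferred
    # Steps 2-3: otherwise locate the installed SKILL.md and resolve relative to it.
    skills_root = hermes_home / "skills"
    if skills_root.is_dir():
        for skill_md in sorted(skills_root.rglob("SKILL.md")):
            if skill_md.parent.name == "openclaw-migration":
                candidate = skill_md.parent / SCRIPT_REL
                if candidate.is_file():
                    return candidate
    return None
```

Calling it with `Path.home() / ".hermes"` yields an absolute path, which also avoids passing `~` to the terminal tool.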
With `--migrate-secrets`, the helper script also imports a small allowlisted set of Hermes-compatible secrets, currently:
- `TELEGRAM_BOT_TOKEN`
## Default workflow
1. Inspect first with a dry run.
2. Present a simple summary of what can be migrated, what cannot be migrated, and what would be archived.
3. If the `clarify` tool is available, use it for user decisions instead of asking for a free-form prose reply.
4. If the dry run finds imported skill directory conflicts, ask how those should be handled before executing.
5. Ask the user to choose between the two supported migration modes before executing.
6. Ask for a target workspace path only if the user wants the workspace instructions file brought over.
7. Execute the migration with the matching preset and flags.
8. Summarize the results, especially:
- what was migrated
- what was archived for manual review
- what was skipped and why
## User interaction protocol
Hermes CLI supports the `clarify` tool for interactive prompts, but it is limited to:
- one choice at a time
- up to 4 predefined choices
- an automatic `Other` free-text option
It does **not** support true multi-select checkboxes in a single prompt.
For every `clarify` call:
- always include a non-empty `question`
- include `choices` only for real selectable prompts
- keep `choices` to 2-4 plain string options
- never emit placeholder or truncated options such as `...`
- never pad or stylize choices with extra whitespace
- never include fake form fields in the question such as `enter directory here`, blank lines to fill in, or underscores like `_____`
- for open-ended path questions, ask only the plain sentence; the user types in the normal CLI prompt below the panel
If a `clarify` call returns an error, inspect the error text, correct the payload, and retry once with a valid `question` and clean choices.
When `clarify` is available and the dry run reveals any required user decision, your **next action must be a `clarify` tool call**.
Do not end the turn with a normal assistant message such as:
- "Let me present the choices"
- "What would you like to do?"
- "Here are the options"
If a user decision is required, collect it via `clarify` before producing more prose.
If multiple unresolved decisions remain, do not insert an explanatory assistant message between them. After one `clarify` response is received, your next action should usually be the next required `clarify` call.
Treat `workspace-agents` as an unresolved decision whenever the dry run reports:
- `kind="workspace-agents"`
- `status="skipped"`
- reason containing `No workspace target was provided`
In that case, you must ask about workspace instructions before execution. Do not silently treat that as a decision to skip.
Because `clarify` cannot present multi-select prompts, use this simplified decision flow:
1. For `SOUL.md` conflicts, use `clarify` with choices such as:
- `keep existing`
- `overwrite with backup`
- `review first`
2. If the dry run shows one or more `kind="skill"` items with `status="conflict"`, use `clarify` with choices such as:
- `keep existing skills`
- `overwrite conflicting skills with backup`
- `import conflicting skills under renamed folders`
3. For workspace instructions, use `clarify` with choices such as:
- `skip workspace instructions`
- `copy to a workspace path`
- `decide later`
4. If the user chooses to copy workspace instructions, ask a follow-up open-ended `clarify` question requesting an **absolute path**.
5. If the user chooses `skip workspace instructions` or `decide later`, proceed without `--workspace-target`.
6. For migration mode, use `clarify` with these 3 choices:
- `user-data only`
- `full compatible migration`
- `cancel`
7. `user-data only` means: migrate user data and compatible config, but do **not** import allowlisted secrets.
8. `full compatible migration` means: migrate the same compatible user data plus the allowlisted secrets when present.
9. If `clarify` is not available, ask the same question in normal text, but still constrain the answer to `user-data only`, `full compatible migration`, or `cancel`.
Execution gate:
- Do not execute while a `workspace-agents` skip caused by `No workspace target was provided` remains unresolved.
- The only valid ways to resolve it are:
- user explicitly chooses `skip workspace instructions`
- user explicitly chooses `decide later`
- user provides a workspace path after choosing `copy to a workspace path`
- Absence of a workspace target in the dry run is not itself permission to execute.
- Do not execute while any required `clarify` decision remains unresolved.
Use these exact `clarify` payload shapes as the default pattern:
- `{"question":"Your existing SOUL.md conflicts with the imported one. What should I do?","choices":["keep existing","overwrite with backup","review first"]}`
- `{"question":"One or more imported OpenClaw skills already exist in Hermes. How should I handle those skill conflicts?","choices":["keep existing skills","overwrite conflicting skills with backup","import conflicting skills under renamed folders"]}`
- `{"question":"Choose migration mode: migrate only user data, or run the full compatible migration including allowlisted secrets?","choices":["user-data only","full compatible migration","cancel"]}`
- `{"question":"Do you want to copy the OpenClaw workspace instructions file into a Hermes workspace?","choices":["skip workspace instructions","copy to a workspace path","decide later"]}`
- `{"question":"Please provide an absolute path where the workspace instructions should be copied."}`
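As a rough illustration of the payload rules in this section, a pre-flight check might look like the sketch below. `validate_clarify_payload` is a hypothetical helper, not part of the `clarify` tool, and it covers only the rules that are easy to check mechanically.

```python
from typing import Any, Dict, List


def validate_clarify_payload(payload: Dict[str, Any]) -> List[str]:
    """Return a list of rule violations; an empty list means the payload looks clean."""
    problems: List[str] = []
    question = payload.get("question")
    if not isinstance(question, str) or not question.strip():
        problems.append("question must be a non-empty string")
    elif "___" in question:
        problems.append("question must not contain fake form fields")
    if "choices" in payload:
        choices = payload["choices"]
        if not isinstance(choices, list) or not 2 <= len(choices) <= 4:
            problems.append("choices must be a list of 2-4 options")
        else:
            for choice in choices:
                if not isinstance(choice, str) or not choice.strip():
                    problems.append("choices must be non-empty strings")
                elif choice != choice.strip():
                    problems.append(f"choice {choice!r} is padded with whitespace")
                elif "..." in choice:
                    problems.append(f"choice {choice!r} looks like a placeholder")
    return problems
```

An open-ended path question simply omits `choices`, which this check accepts.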
## Decision-to-command mapping
Map user decisions to command flags exactly:
- If the user chooses `keep existing` for `SOUL.md`, do **not** add `--overwrite`.
- If the user chooses `overwrite with backup`, add `--overwrite`.
- If the user chooses `review first`, stop before execution and review the relevant files.
- If the user chooses `keep existing skills`, add `--skill-conflict skip`.
- If the user chooses `overwrite conflicting skills with backup`, add `--skill-conflict overwrite`.
- If the user chooses `import conflicting skills under renamed folders`, add `--skill-conflict rename`.
- If the user chooses `user-data only`, execute with `--preset user-data` and do **not** add `--migrate-secrets`.
- If the user chooses `full compatible migration`, execute with `--preset full --migrate-secrets`.
- Only add `--workspace-target` if the user explicitly provided an absolute workspace path.
- If the user chooses `skip workspace instructions` or `decide later`, do not add `--workspace-target`.
Before executing, restate the exact command plan in plain language and make sure it matches the user's choices.
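The mapping above can be expressed as a small function, sketched here under the assumption that the decisions arrive as the exact choice strings. `build_command` and its parameter names are hypothetical; only the flags themselves come from this skill.

```python
from pathlib import PurePosixPath
from typing import List, Optional

SKILL_CONFLICT_FLAGS = {
    "keep existing skills": "skip",
    "overwrite conflicting skills with backup": "overwrite",
    "import conflicting skills under renamed folders": "rename",
}


def build_command(
    script: str,
    mode: str,
    soul_choice: Optional[str] = None,
    skill_choice: Optional[str] = None,
    workspace_target: Optional[str] = None,
) -> Optional[List[str]]:
    """Return the argv for an execute run, or None when execution must not start."""
    if mode == "cancel" or soul_choice == "review first":
        return None  # stop before execution
    argv = ["python3", script, "--execute"]
    if mode == "user-data only":
        argv += ["--preset", "user-data"]
    elif mode == "full compatible migration":
        argv += ["--preset", "full", "--migrate-secrets"]
    else:
        raise ValueError(f"unknown migration mode: {mode!r}")
    if soul_choice == "overwrite with backup":
        argv.append("--overwrite")
    if skill_choice is not None:
        argv += ["--skill-conflict", SKILL_CONFLICT_FLAGS[skill_choice]]
    if workspace_target is not None:
        # Only an explicit absolute path is accepted (POSIX-style check).
        if not PurePosixPath(workspace_target).is_absolute():
            raise ValueError("workspace target must be an absolute path")
        argv += ["--workspace-target", workspace_target]
    return argv
```

Returning a list rather than a shell string keeps the plan easy to restate to the user flag by flag.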
## Post-run reporting rules
After execution, treat the script's JSON output as the source of truth.
1. Base all counts on `report.summary`.
2. Only list an item under "Successfully Migrated" if its `status` is exactly `migrated`.
3. Do not claim a conflict was resolved unless the report shows that item as `migrated`.
4. Do not say `SOUL.md` was overwritten unless the report item for `kind="soul"` has `status="migrated"`.
5. If `report.summary.conflict > 0`, include a conflict section instead of silently implying success.
6. If counts and listed items disagree, fix the list to match the report before responding.
7. Include the `output_dir` path from the report when available so the user can inspect `report.json`, `summary.md`, backups, and archived files.
8. For memory or user-profile overflow, do not say the entries were archived unless the report explicitly shows an archive path. If `details.overflow_file` exists, say the full overflow list was exported there.
9. If a skill was imported under a renamed folder, report the final destination and mention `details.renamed_from`.
10. If `report.skill_conflict_mode` is present, use it as the source of truth for the selected imported-skill conflict policy.
11. If an item has `status="skipped"`, do not describe it as overwritten, backed up, migrated, or resolved.
12. If `kind="soul"` has `status="skipped"` with reason `Target already matches source`, say it was left unchanged and do not mention a backup.
13. If a renamed imported skill has an empty `details.backup`, do not imply the existing Hermes skill was renamed or backed up. Say only that the imported copy was placed in the new destination and reference `details.renamed_from` as the pre-existing folder that remained in place.
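A minimal sketch of how a few of these rules might be applied to the parsed report follows. The `items`, `reason`, and `summary.migrated` keys are assumptions about the JSON layout (the text above confirms only `summary.conflict`, `kind`, `status`, `details`, and `output_dir`), and `summarize_report` is a hypothetical helper.

```python
from typing import Any, Dict, List


def summarize_report(report: Dict[str, Any]) -> List[str]:
    """Build summary lines that never claim more than the report states."""
    items = report.get("items", [])      # assumed key for the per-item list
    summary = report.get("summary", {})
    migrated = [i for i in items if i.get("status") == "migrated"]
    conflicts = [i for i in items if i.get("status") == "conflict"]
    migrated_count = summary.get("migrated", 0)  # assumed key; counts come from summary
    if migrated_count != len(migrated):
        raise ValueError("summary count and item list disagree; re-read the report")
    lines = [f"Successfully Migrated ({migrated_count}):"]
    for item in migrated:
        line = f"- {item.get('kind')}"
        renamed_from = item.get("details", {}).get("renamed_from")
        if renamed_from:
            line += f" (renamed_from: {renamed_from})"
        lines.append(line)
    if summary.get("conflict", 0) > 0:
        lines.append(f"Conflicts ({summary['conflict']}):")
        for item in conflicts:
            lines.append(f"- {item.get('kind')}: {item.get('reason', 'no reason given')}")
    if report.get("output_dir"):
        lines.append(f"Report directory: {report['output_dir']}")
    return lines
```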
## Migration presets
Prefer these two presets in normal use:
- `user-data`
- `full`
`user-data` includes:
- `soul`
- `workspace-agents`
- `memory`
- `user-profile`
- `messaging-settings`
- `command-allowlist`
- `skills`
- `tts-assets`
- `archive`
`full` includes everything in `user-data` plus:
- `secret-settings`
The helper script still supports category-level `--include` / `--exclude`, but treat that as an advanced fallback rather than the default UX.
## Commands
Dry run with full discovery:
```bash
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py
```
When using the terminal tool, prefer an absolute invocation pattern such as:
```json
{"command":"python3 /home/USER/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py","workdir":"/home/USER"}
```
Dry run with the user-data preset:
```bash
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py --preset user-data
```
Execute a user-data migration:
```bash
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py --execute --preset user-data --skill-conflict skip
```
Execute a full compatible migration:
```bash
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py --execute --preset full --migrate-secrets --skill-conflict skip
```
Execute with workspace instructions included:
```bash
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py --execute --preset user-data --skill-conflict rename --workspace-target "/absolute/workspace/path"
```
Do not use `$PWD` or the home directory as the workspace target by default. Ask for an explicit workspace path first.
## Important rules
1. Run a dry run before writing unless the user explicitly says to proceed immediately.
2. Do not migrate secrets by default. Tokens, auth blobs, device credentials, and raw gateway config should stay out of Hermes unless the user explicitly asks for secret migration.
3. Do not silently overwrite non-empty Hermes targets unless the user explicitly wants that. The helper script will preserve backups when overwriting is enabled.
4. Always give the user the skipped-items report. That report is part of the migration, not an optional extra.
5. Prefer the primary OpenClaw workspace (`~/.openclaw/workspace/`) over `workspace.default/`. Only use the default workspace as fallback when the primary files are missing.
6. Even in secret-migration mode, only migrate secrets with a clean Hermes destination. Unsupported auth blobs must still be reported as skipped.
7. If the dry run shows a large asset copy, a conflicting `SOUL.md`, or overflowed memory entries, call those out separately before execution.
8. Default to `user-data only` if the user is unsure.
9. Only include `workspace-agents` when the user has explicitly provided a destination workspace path.
10. Treat category-level `--include` / `--exclude` as an advanced escape hatch, not the normal flow.
11. Do not end the dry-run summary with a vague “What would you like to do?” if `clarify` is available. Use structured follow-up prompts instead.
12. Do not use an open-ended `clarify` prompt when a real choice prompt would work. Prefer selectable choices first, then free text only for absolute paths or file review requests.
13. After a dry run, never stop after summarizing if there is still an unresolved decision. Use `clarify` immediately for the highest-priority blocking decision.
14. Priority order for follow-up questions:
- `SOUL.md` conflict
- imported skill conflicts
- migration mode
- workspace instructions destination
15. Do not promise to present choices later in the same message. Present them by actually calling `clarify`.
16. After the migration-mode answer, explicitly check whether `workspace-agents` is still unresolved. If it is, your next action must be the workspace-instructions `clarify` call.
17. After any `clarify` answer, if another required decision remains, do not narrate what was just decided. Ask the next required question immediately.
## Expected result
After a successful run, the user should have:
- Hermes persona state imported
- Hermes memory files populated with converted OpenClaw knowledge
- OpenClaw skills available under `~/.hermes/skills/openclaw-imports/`
- a migration report showing any conflicts, omissions, or unsupported data
File diff suppressed because it is too large
+4 -1
@@ -46,7 +46,10 @@ cron = ["croniter"]
slack = ["slack-bolt>=1.18.0", "slack-sdk>=3.27.0"]
cli = ["simple-term-menu"]
tts-premium = ["elevenlabs"]
pty = ["ptyprocess>=0.7.0"]
pty = [
"ptyprocess>=0.7.0; sys_platform != 'win32'",
"pywinpty>=2.0.0; sys_platform == 'win32'",
]
honcho = ["honcho-ai>=2.0.1"]
mcp = ["mcp>=1.2.0"]
homeassistant = ["aiohttp>=3.9.0"]
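The environment markers in the `pty` extra mean that only one of the two packages is installed on a given platform, so calling code has to pick the matching module at import time. A minimal sketch, assuming the code imports the backend lazily (`pty_backend_module` is a hypothetical helper; `pywinpty` installs a module named `winpty`):

```python
import sys


def pty_backend_module(platform: str = sys.platform) -> str:
    """Return the module name that the `pty` extra provides on this platform."""
    # pywinpty is imported as `winpty`; ptyprocess is imported as `ptyprocess`.
    return "winpty" if platform == "win32" else "ptyprocess"
```

A caller could then use `importlib.import_module(pty_backend_module())` and handle `ImportError` when the extra is not installed.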
+179 -47
@@ -172,6 +172,7 @@ class AIAgent:
provider_data_collection: str = None,
session_id: str = None,
tool_progress_callback: callable = None,
thinking_callback: callable = None,
clarify_callback: callable = None,
step_callback: callable = None,
max_tokens: int = None,
@@ -184,6 +185,8 @@ class AIAgent:
honcho_session_key: str = None,
iteration_budget: "IterationBudget" = None,
fallback_model: Dict[str, Any] = None,
checkpoints_enabled: bool = False,
checkpoint_max_snapshots: int = 50,
):
"""
Initialize the AI Agent.
@@ -256,6 +259,7 @@ class AIAgent:
self.api_mode = "chat_completions"
self.tool_progress_callback = tool_progress_callback
self.thinking_callback = thinking_callback
self.clarify_callback = clarify_callback
self.step_callback = step_callback
self._last_reported_tool = None # Track for "new tool" mode
@@ -484,6 +488,13 @@ class AIAgent:
# Cached system prompt -- built once per session, only rebuilt on compression
self._cached_system_prompt: Optional[str] = None
# Filesystem checkpoint manager (transparent — not a tool)
from tools.checkpoint_manager import CheckpointManager
self._checkpoint_mgr = CheckpointManager(
enabled=checkpoints_enabled,
max_snapshots=checkpoint_max_snapshots,
)
# SQLite session store (optional -- provided by CLI or gateway)
self._session_db = session_db
if self._session_db:
@@ -1431,6 +1442,34 @@ class AIAgent:
return "\n\n".join(prompt_parts)
def _repair_tool_call(self, tool_name: str) -> str | None:
"""Attempt to repair a mismatched tool name before aborting.
1. Try lowercase
2. Try normalized (lowercase + hyphens/spaces -> underscores)
3. Try fuzzy match (difflib, cutoff=0.7)
Returns the repaired name if found in valid_tool_names, else None.
"""
from difflib import get_close_matches
# 1. Lowercase
lowered = tool_name.lower()
if lowered in self.valid_tool_names:
return lowered
# 2. Normalize
normalized = lowered.replace("-", "_").replace(" ", "_")
if normalized in self.valid_tool_names:
return normalized
# 3. Fuzzy match
matches = get_close_matches(lowered, self.valid_tool_names, n=1, cutoff=0.7)
if matches:
return matches[0]
return None
def _invalidate_system_prompt(self):
"""
Invalidate the cached system prompt, forcing a rebuild on the next turn.
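The three repair steps in `_repair_tool_call` can be tried outside the class. The sketch below is a standalone restatement with the valid names passed in as a parameter; the tool names in the usage lines are made up for illustration.

```python
from difflib import get_close_matches
from typing import Iterable, Optional


def repair_tool_name(tool_name: str, valid_tool_names: Iterable[str]) -> Optional[str]:
    """Return a valid tool name close to `tool_name`, or None if nothing matches."""
    valid = set(valid_tool_names)
    # 1. Lowercase
    lowered = tool_name.lower()
    if lowered in valid:
        return lowered
    # 2. Normalize hyphens and spaces to underscores
    normalized = lowered.replace("-", "_").replace(" ", "_")
    if normalized in valid:
        return normalized
    # 3. Fuzzy match with the same cutoff as the method above
    matches = get_close_matches(lowered, sorted(valid), n=1, cutoff=0.7)
    return matches[0] if matches else None


tools = ["web_search", "terminal", "read_file"]
by_case = repair_tool_name("TERMINAL", tools)
by_normalizing = repair_tool_name("Read-File", tools)
by_fuzzy = repair_tool_name("web_serch", tools)
unknown = repair_tool_name("xyz123", tools)
```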
@@ -2689,6 +2728,8 @@ class AIAgent:
except json.JSONDecodeError as e:
logging.warning(f"Unexpected JSON error after validation: {e}")
function_args = {}
if not isinstance(function_args, dict):
function_args = {}
if not self.quiet_mode:
args_str = json.dumps(function_args, ensure_ascii=False)
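The added `isinstance` guard matters because valid JSON is not necessarily an object: a model can emit a list, a string, or `null` as tool arguments. A minimal sketch of the same idea as a standalone function (`parse_tool_args` is a hypothetical name):

```python
import json
from typing import Any, Dict


def parse_tool_args(raw: str) -> Dict[str, Any]:
    """Parse tool-call arguments, falling back to {} for anything that is not a JSON object."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}
```

Without the guard, a later call such as `function_args.get("path", "")` would raise `AttributeError` on a list or string.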
@@ -2702,6 +2743,18 @@ class AIAgent:
except Exception as cb_err:
logging.debug(f"Tool progress callback error: {cb_err}")
# Checkpoint: snapshot working dir before file-mutating tools
if function_name in ("write_file", "patch") and self._checkpoint_mgr.enabled:
try:
file_path = function_args.get("path", "")
if file_path:
work_dir = self._checkpoint_mgr.get_working_dir_for_path(file_path)
self._checkpoint_mgr.ensure_checkpoint(
work_dir, f"before {function_name}"
)
except Exception:
pass # never block tool execution
tool_start_time = time.time()
if function_name == "todo":
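The internals of `CheckpointManager` are not shown in this diff. The class below is an illustrative stand-in, not the real implementation, and it assumes that deduplication is tracked per working directory and reset by `new_turn()`. It is meant only to show how a per-turn guard keeps repeated `write_file` or `patch` calls from producing multiple snapshots.

```python
from typing import List, Set, Tuple


class TurnCheckpointer:
    """Hypothetical stand-in for a per-turn deduplicating checkpoint manager."""

    def __init__(self, enabled: bool = False, max_snapshots: int = 50) -> None:
        self.enabled = enabled
        self.max_snapshots = max_snapshots
        self.snapshots: List[Tuple[str, str]] = []  # (work_dir, reason)
        self._seen_this_turn: Set[str] = set()

    def new_turn(self) -> None:
        """Reset the dedup set so the next iteration can snapshot again."""
        self._seen_this_turn.clear()

    def ensure_checkpoint(self, work_dir: str, reason: str) -> bool:
        """Record at most one snapshot per directory per turn; return True if taken."""
        if not self.enabled or work_dir in self._seen_this_turn:
            return False
        self._seen_this_turn.add(work_dir)
        self.snapshots.append((work_dir, reason))
        while len(self.snapshots) > self.max_snapshots:
            self.snapshots.pop(0)  # drop the oldest snapshot
        return True
```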
@@ -2814,7 +2867,10 @@ class AIAgent:
spinner.start()
_spinner_result = None
try:
function_result = handle_function_call(function_name, function_args, effective_task_id)
function_result = handle_function_call(
function_name, function_args, effective_task_id,
enabled_tools=list(self.valid_tool_names) if self.valid_tool_names else None,
)
_spinner_result = function_result
except Exception as tool_error:
function_result = f"Error executing tool '{function_name}': {tool_error}"
@@ -2825,7 +2881,10 @@ class AIAgent:
spinner.stop(cute_msg)
else:
try:
function_result = handle_function_call(function_name, function_args, effective_task_id)
function_result = handle_function_call(
function_name, function_args, effective_task_id,
enabled_tools=list(self.valid_tool_names) if self.valid_tool_names else None,
)
except Exception as tool_error:
function_result = f"Error executing tool '{function_name}': {tool_error}"
logger.error("handle_function_call raised for %s: %s", function_name, tool_error, exc_info=True)
@@ -3042,6 +3101,8 @@ class AIAgent:
self._invalid_tool_retries = 0
self._invalid_json_retries = 0
self._empty_content_retries = 0
self._incomplete_scratchpad_retries = 0
self._codex_incomplete_retries = 0
self._last_content_with_tools = None
self._turns_since_memory = 0
self._iters_since_skill = 0
@@ -3206,11 +3267,16 @@ class AIAgent:
final_response = None
interrupted = False
codex_ack_continuations = 0
length_continue_retries = 0
truncated_response_prefix = ""
# Clear any stale interrupt state at start
self.clear_interrupt()
while api_call_count < self.max_iterations and self.iteration_budget.remaining > 0:
# Reset per-turn checkpoint dedup so each iteration can take one snapshot
self._checkpoint_mgr.new_turn()
# Check for interrupt request (e.g., user sent new message)
if self._interrupt_requested:
interrupted = True
@@ -3254,7 +3320,7 @@ class AIAgent:
api_messages = []
for msg in messages:
api_msg = msg.copy()
# For ALL assistant messages, pass reasoning back to the API
# This ensures multi-turn reasoning context is preserved
if msg.get("role") == "assistant":
@@ -3262,7 +3328,7 @@ class AIAgent:
if reasoning_text:
# Add reasoning_content for API compatibility (Moonshot AI, Novita, OpenRouter)
api_msg["reasoning_content"] = reasoning_text
# Remove 'reasoning' field - it's for trajectory storage only
# We've copied it to 'reasoning_content' for the API above
if "reasoning" in api_msg:
@@ -3273,7 +3339,7 @@ class AIAgent:
# Keep 'reasoning_details' - OpenRouter uses this for multi-turn reasoning context
# The signature field helps maintain reasoning continuity
api_messages.append(api_msg)
# Build the final system message: cached prompt + ephemeral system prompt.
# The ephemeral part is appended here (not baked into the cached prompt)
# so it stays out of the session DB and logs.
@@ -3286,21 +3352,21 @@ class AIAgent:
effective_system = (effective_system + "\n\n" + self.ephemeral_system_prompt).strip()
if effective_system:
api_messages = [{"role": "system", "content": effective_system}] + api_messages
# Inject ephemeral prefill messages right after the system prompt
# but before conversation history. Same API-call-time-only pattern.
if self.prefill_messages:
sys_offset = 1 if effective_system else 0
for idx, pfm in enumerate(self.prefill_messages):
api_messages.insert(sys_offset + idx, pfm.copy())
# Apply Anthropic prompt caching for Claude models via OpenRouter.
# Auto-detected: if model name contains "claude" and base_url is OpenRouter,
# inject cache_control breakpoints (system + last 3 messages) to reduce
# input token costs by ~75% on multi-turn conversations.
if self._use_prompt_caching:
api_messages = apply_anthropic_cache_control(api_messages, cache_ttl=self._cache_ttl)
# Safety net: strip orphaned tool results / add stubs for missing
# results before sending to the API. The compressor handles this
# during compression, but orphans can also sneak in from session
@@ -3323,9 +3389,13 @@ class AIAgent:
# Animated thinking spinner in quiet mode
face = random.choice(KawaiiSpinner.KAWAII_THINKING)
verb = random.choice(KawaiiSpinner.THINKING_VERBS)
spinner_type = random.choice(['brain', 'sparkle', 'pulse', 'moon', 'star'])
thinking_spinner = KawaiiSpinner(f"{face} {verb}...", spinner_type=spinner_type)
thinking_spinner.start()
if self.thinking_callback:
# CLI TUI mode: use prompt_toolkit widget instead of raw spinner
self.thinking_callback(f"{face} {verb}...")
else:
spinner_type = random.choice(['brain', 'sparkle', 'pulse', 'moon', 'star'])
thinking_spinner = KawaiiSpinner(f"{face} {verb}...", spinner_type=spinner_type)
thinking_spinner.start()
# Log request details if verbose
if self.verbose_logging:
@@ -3340,6 +3410,8 @@ class AIAgent:
max_compression_attempts = 3
codex_auth_retry_attempted = False
nous_auth_retry_attempted = False
restart_with_compressed_messages = False
restart_with_length_continuation = False
finish_reason = "stop"
response = None # Guard against UnboundLocalError if all retries fail
@@ -3362,6 +3434,8 @@ class AIAgent:
if thinking_spinner:
thinking_spinner.stop("")
thinking_spinner = None
if self.thinking_callback:
self.thinking_callback("")
if not self.quiet_mode:
print(f"{self.log_prefix}⏱️ API call completed in {api_duration:.2f}s")
@@ -3402,6 +3476,8 @@ class AIAgent:
if thinking_spinner:
thinking_spinner.stop(f"(´;ω;`) oops, retrying...")
thinking_spinner = None
if self.thinking_callback:
self.thinking_callback("")
# This is often rate limiting or provider returning malformed response
retry_count += 1
@@ -3486,19 +3562,60 @@ class AIAgent:
finish_reason = "stop"
else:
finish_reason = response.choices[0].finish_reason
# Handle "length" finish_reason - response was truncated
if finish_reason == "length":
print(f"{self.log_prefix}⚠️ Response truncated (finish_reason='length') - model hit max output tokens")
if self.api_mode == "chat_completions":
assistant_message = response.choices[0].message
if not assistant_message.tool_calls:
length_continue_retries += 1
interim_msg = self._build_assistant_message(assistant_message, finish_reason)
messages.append(interim_msg)
self._log_msg_to_db(interim_msg)
if assistant_message.content:
truncated_response_prefix += assistant_message.content
if length_continue_retries < 3:
print(
f"{self.log_prefix}↻ Requesting continuation "
f"({length_continue_retries}/3)..."
)
continue_msg = {
"role": "user",
"content": (
"[System: Your previous response was truncated by the output "
"length limit. Continue exactly where you left off. Do not "
"restart or repeat prior text. Finish the answer directly.]"
),
}
messages.append(continue_msg)
self._log_msg_to_db(continue_msg)
self._session_messages = messages
self._save_session_log(messages)
restart_with_length_continuation = True
break
partial_response = self._strip_think_blocks(truncated_response_prefix).strip()
self._cleanup_task_resources(effective_task_id)
self._persist_session(messages, conversation_history)
return {
"final_response": partial_response or None,
"messages": messages,
"api_calls": api_call_count,
"completed": False,
"partial": True,
"error": "Response remained truncated after 3 continuation attempts",
}
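The continuation flow added here can be reduced to a small loop for illustration. In the sketch below, `call_model` is a hypothetical callable returning `(content, finish_reason)`, and the session logging, tool-call check, and think-block stripping from the real code are left out.

```python
from typing import Callable, Dict, List, Optional, Tuple

CONTINUE_PROMPT = (
    "[System: Your previous response was truncated by the output "
    "length limit. Continue exactly where you left off. Do not "
    "restart or repeat prior text. Finish the answer directly.]"
)


def complete_with_continuation(
    call_model: Callable[[List[Dict[str, str]]], Tuple[str, str]],
    messages: List[Dict[str, str]],
    max_attempts: int = 3,
) -> Dict[str, Optional[object]]:
    """Stitch truncated responses together, giving up after `max_attempts` truncations."""
    prefix = ""
    for attempt in range(1, max_attempts + 1):
        content, finish_reason = call_model(messages)
        if finish_reason != "length":
            return {"final_response": prefix + content, "partial": False}
        prefix += content
        messages.append({"role": "assistant", "content": content})
        if attempt < max_attempts:
            messages.append({"role": "user", "content": CONTINUE_PROMPT})
    return {"final_response": prefix or None, "partial": True}
```

Accumulating the prefix separately from `messages` is what lets the final answer be returned as one string even though the history holds several partial assistant turns.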
# If we have prior messages, roll back to last complete state
if len(messages) > 1:
print(f"{self.log_prefix} ⏪ Rolling back to last complete assistant turn")
rolled_back_messages = self._get_messages_up_to_last_assistant(messages)
self._cleanup_task_resources(effective_task_id)
self._persist_session(messages, conversation_history)
return {
"final_response": None,
"messages": rolled_back_messages,
@@ -3571,6 +3688,8 @@ class AIAgent:
if thinking_spinner:
thinking_spinner.stop("")
thinking_spinner = None
if self.thinking_callback:
self.thinking_callback("")
api_elapsed = time.time() - api_start_time
print(f"{self.log_prefix}⚡ Interrupted during API call.")
self._persist_session(messages, conversation_history)
@@ -3583,6 +3702,8 @@ class AIAgent:
if thinking_spinner:
thinking_spinner.stop(f"(╥_╥) error, retrying...")
thinking_spinner = None
if self.thinking_callback:
self.thinking_callback("")
status_code = getattr(api_error, "status_code", None)
if (
@@ -3665,7 +3786,8 @@ class AIAgent:
if len(messages) < original_len:
print(f"{self.log_prefix} 🗜️ Compressed {original_len} → {len(messages)} messages, retrying...")
time.sleep(2) # Brief pause between compression retries
continue # Retry with compressed messages
restart_with_compressed_messages = True
break
else:
print(f"{self.log_prefix}❌ Payload too large and cannot compress further.")
logging.error(f"{self.log_prefix}413 payload too large. Cannot compress further.")
@@ -3733,7 +3855,8 @@ class AIAgent:
if len(messages) < original_len:
print(f"{self.log_prefix} 🗜️ Compressed {original_len} → {len(messages)} messages, retrying...")
time.sleep(2) # Brief pause between compression retries
continue # Retry with compressed messages or new tier
restart_with_compressed_messages = True
break
else:
# Can't compress further and already at minimum tier
print(f"{self.log_prefix}❌ Context length exceeded and cannot compress further.")
@@ -3820,6 +3943,14 @@ class AIAgent:
if interrupted:
break
if restart_with_compressed_messages:
api_call_count -= 1
self.iteration_budget.refund()
continue
if restart_with_length_continuation:
continue
# Guard: if all retries exhausted without a successful response
# (e.g. repeated context-length errors that exhausted retry_count),
# the `response` variable is still None. Break out cleanly.
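The switch from `continue` to a flag plus `break` is easier to see in a reduced model. A `continue` inside the inner retry loop re-sends the request that was already built, whereas setting a flag, breaking, and continuing the outer loop rebuilds the request from the compressed history. The sketch below simulates this with integers; `run_agent_loop` and its parameters are hypothetical, and it assumes `limit` is non-negative.

```python
from typing import Dict


def run_agent_loop(history_len: int, limit: int, max_iterations: int) -> Dict[str, int]:
    """Simulate outer-loop restarts after compression, refunding the API call count."""
    api_call_count = 0
    compressions = 0
    while api_call_count < max_iterations:
        api_call_count += 1
        restart_with_compressed_messages = False
        for _retry in range(3):              # inner retry loop
            if history_len > limit:          # simulated "payload too large" error
                history_len //= 2            # simulated compression
                compressions += 1
                restart_with_compressed_messages = True
                break                        # leave the retry loop entirely
            break                            # simulated successful call
        if restart_with_compressed_messages:
            api_call_count -= 1              # the failed attempt is refunded
            continue                         # outer loop rebuilds the request
        return {"api_calls": api_call_count, "compressions": compressions, "history_len": history_len}
    return {"api_calls": api_call_count, "compressions": compressions, "history_len": history_len}


result = run_agent_loop(history_len=40, limit=10, max_iterations=5)
```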
@@ -3964,39 +4095,37 @@ class AIAgent:
logging.debug(f"Tool call: {tc.function.name} with args: {tc.function.arguments[:200]}...")
# Validate tool call names - detect model hallucinations
# Repair mismatched tool names before validating
for tc in assistant_message.tool_calls:
if tc.function.name not in self.valid_tool_names:
repaired = self._repair_tool_call(tc.function.name)
if repaired:
print(f"{self.log_prefix}🔧 Auto-repaired tool name: '{tc.function.name}' -> '{repaired}'")
tc.function.name = repaired
invalid_tool_calls = [
tc.function.name for tc in assistant_message.tool_calls
tc.function.name for tc in assistant_message.tool_calls
if tc.function.name not in self.valid_tool_names
]
if invalid_tool_calls:
# Track retries for invalid tool calls
if not hasattr(self, '_invalid_tool_retries'):
self._invalid_tool_retries = 0
self._invalid_tool_retries += 1
invalid_preview = invalid_tool_calls[0][:80] + "..." if len(invalid_tool_calls[0]) > 80 else invalid_tool_calls[0]
print(f"{self.log_prefix}⚠️ Invalid tool call detected: '{invalid_preview}'")
print(f"{self.log_prefix} Valid tools: {sorted(self.valid_tool_names)}")
if self._invalid_tool_retries < 3:
print(f"{self.log_prefix}🔄 Retrying API call ({self._invalid_tool_retries}/3)...")
# Don't add anything to messages, just retry the API call
continue
else:
print(f"{self.log_prefix}❌ Max retries (3) for invalid tool calls exceeded. Stopping as partial.")
# Return partial result - don't include the bad tool call in messages
self._invalid_tool_retries = 0
self._persist_session(messages, conversation_history)
return {
"final_response": None,
"messages": messages,
"api_calls": api_call_count,
"completed": False,
"partial": True,
"error": f"Model generated invalid tool call: {invalid_preview}"
}
# Return helpful error to model — model can self-correct next turn
available = ", ".join(sorted(self.valid_tool_names))
invalid_name = invalid_tool_calls[0]
invalid_preview = invalid_name[:80] + "..." if len(invalid_name) > 80 else invalid_name
print(f"{self.log_prefix}⚠️ Unknown tool '{invalid_preview}' — sending error to model for self-correction")
assistant_msg = self._build_assistant_message(assistant_message, finish_reason)
messages.append(assistant_msg)
self._log_msg_to_db(assistant_msg)
for tc in assistant_message.tool_calls:
if tc.function.name not in self.valid_tool_names:
content = f"Tool '{tc.function.name}' does not exist. Available tools: {available}"
else:
content = f"Skipped: another tool call in this turn used an invalid name. Please retry this tool call."
messages.append({
"role": "tool",
"tool_call_id": tc.id,
"content": content,
})
continue
# Reset retry counter on successful tool call validation
if hasattr(self, '_invalid_tool_retries'):
self._invalid_tool_retries = 0
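The new behaviour answers every tool call in the turn with a `tool` message, so the conversation stays well-formed and the model can correct itself. A standalone sketch of that message construction, with tool calls simplified to hypothetical `(id, name)` tuples:

```python
from typing import Dict, Iterable, List, Tuple


def build_invalid_tool_results(
    tool_calls: Iterable[Tuple[str, str]],
    valid_tool_names: Iterable[str],
) -> List[Dict[str, str]]:
    """Return one tool-role message per call: an error for unknown names, a skip notice otherwise."""
    valid = set(valid_tool_names)
    available = ", ".join(sorted(valid))
    results = []
    for call_id, name in tool_calls:
        if name not in valid:
            content = f"Tool '{name}' does not exist. Available tools: {available}"
        else:
            content = (
                "Skipped: another tool call in this turn used an invalid name. "
                "Please retry this tool call."
            )
        results.append({"role": "tool", "tool_call_id": call_id, "content": content})
    return results


results = build_invalid_tool_results(
    [("call_1", "serch_web"), ("call_2", "terminal")],
    ["terminal", "web_search"],
)
```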
@@ -4210,6 +4339,9 @@ class AIAgent:
continue
codex_ack_continuations = 0
if truncated_response_prefix:
final_response = truncated_response_prefix + final_response
# Strip <think> blocks from user-facing response (keep raw in messages for trajectory)
final_response = self._strip_think_blocks(final_response).strip()
+215
@@ -0,0 +1,215 @@
---
name: pokemon-player
description: Play Pokemon games autonomously via headless emulation. Starts a game server, reads structured game state from RAM, makes strategic decisions, and sends button inputs — all from the terminal.
tags: [gaming, pokemon, emulator, pyboy, gameplay, gameboy]
---
# Pokemon Player
Play Pokemon games via headless emulation using the `pokemon-agent` package.
## When to Use
- User says "play pokemon", "start pokemon", "pokemon game"
- User asks about Pokemon Red, Blue, Yellow, FireRed, etc.
- User wants to watch an AI play Pokemon
- User references a ROM file (.gb, .gbc, .gba)
## Startup Procedure
### 1. First-time setup (clone, venv, install)
The repo is NousResearch/pokemon-agent on GitHub. Clone it, then
set up a Python 3.10+ virtual environment. Use uv (preferred for speed)
to create the venv and install the package in editable mode with the
pyboy extra. If uv is not available, fall back to python3 -m venv + pip.
On this machine it is already set up at /home/teknium/pokemon-agent
with a venv ready — just cd there and source .venv/bin/activate.
You also need a ROM file. Ask the user for theirs. On this machine
one exists at roms/pokemon_red.gb inside that directory.
NEVER download or provide ROM files — always ask the user.
### 2. Start the game server
From inside the pokemon-agent directory with the venv activated, run
pokemon-agent serve with --rom pointing to the ROM and --port 9876.
Run it in the background with &.
To resume from a saved game, add --load-state with the save name.
Wait 4 seconds for startup, then verify with GET /health.
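Instead of a fixed sleep, the health check can be polled until the server answers. The sketch below assumes the server listens on `localhost:9876` as described and that any HTTP 200 from `/health` means ready; the response body format is not specified here, so it is ignored. `wait_for_health` and its `opener` parameter are hypothetical.

```python
import time
import urllib.request
from typing import Callable


def wait_for_health(
    base_url: str = "http://localhost:9876",
    timeout: float = 10.0,
    interval: float = 0.5,
    opener: Callable = urllib.request.urlopen,
) -> bool:
    """Poll GET /health until it returns 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with opener(f"{base_url}/health", timeout=2) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass  # connection refused or HTTP error: server not ready yet
        time.sleep(interval)
    return False
```

The `opener` argument exists only so the function can be exercised without a running server.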
### 3. Set up live dashboard for user to watch
Use an SSH reverse tunnel via localhost.run so the user can view
the dashboard in their browser. Connect with ssh, forwarding local
port 9876 to remote port 80 on nokey@localhost.run. Redirect output
to a log file, wait 10 seconds, then grep the log for the .lhr.life
URL. Give the user the URL with /dashboard/ appended.
The tunnel URL changes each time — give the user the new one if restarted.
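The "grep the log for the .lhr.life URL" step can also be done in Python. This is a minimal sketch that assumes the tunnel log contains a plain `https://<subdomain>.lhr.life` URL somewhere in its text; the exact log wording is not specified here, and `dashboard_url_from_log` is a hypothetical helper name.

```python
import re

# Assumed URL shape: https://<subdomain>.lhr.life
LHR_URL = re.compile(r"https://[A-Za-z0-9-]+\.lhr\.life")


def dashboard_url_from_log(log_text: str):
    """Return the most recent tunnel URL with /dashboard/ appended, or None."""
    matches = LHR_URL.findall(log_text)
    if not matches:
        return None
    # The last match is the newest URL if the tunnel reconnected.
    return matches[-1] + "/dashboard/"
```

If it returns None, wait a few more seconds and read the log again before telling the user the tunnel failed.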
## Save and Load
### When to save
- Every 15-20 turns of gameplay
- ALWAYS before gym battles, rival encounters, or risky fights
- Before entering a new town or dungeon
- Before any action you are unsure about
### How to save
POST /save with a descriptive name. Good examples:
before_brock, route1_start, mt_moon_entrance, got_cut
### How to load
POST /load with the save name.
### List available saves
GET /saves returns all saved states.
### Loading on server startup
Use --load-state flag when starting the server to auto-load a save.
This is faster than loading via the API after startup.
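A save or load call might be built like the sketch below. The request body is an assumption: this section only says to POST with a save name, so the JSON shape `{"name": ...}` is a guess to check against the pokemon-agent API before relying on it. `BASE_URL` and `build_state_request` are hypothetical names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9876"  # assumption: server started with --port 9876


def build_state_request(name: str, endpoint: str = "/save") -> urllib.request.Request:
    """Build (but do not send) a POST for /save or /load with a save name."""
    # Keep names descriptive and simple, e.g. before_brock, mt_moon_entrance.
    if not name or not name.replace("_", "").isalnum():
        raise ValueError("use letters, digits and underscores, e.g. before_brock")
    body = json.dumps({"name": name}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        BASE_URL + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Send the request with `urllib.request.urlopen(build_state_request("before_brock"))`, and pass `endpoint="/load"` to restore the same save.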
## The Gameplay Loop
### Step 1: OBSERVE — check state AND take a screenshot
GET /state for position, HP, battle, dialog.
GET /screenshot and save to /tmp/pokemon.png, then use vision_analyze.
Always do BOTH — RAM state gives numbers, vision gives spatial awareness.
### Step 2: ORIENT
- Dialog/text on screen → advance it
- In battle → fight or run
- Party hurt → head to Pokemon Center
- Near objective → navigate carefully
### Step 3: DECIDE
Priority: dialog > battle > heal > story objective > training > explore
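The priority order can be written as a small decision function. This is a sketch only: the state keys (`dialog_active`, `in_battle`, `hp_fraction`, `objective`, `underleveled`) and the 30% heal threshold are hypothetical, not the actual fields returned by GET /state.

```python
# Order taken from the priority line above.
PRIORITY = ["dialog", "battle", "heal", "story", "training", "explore"]


def decide(state: dict) -> str:
    """Return the highest-priority mode whose condition holds."""
    checks = {
        "dialog": bool(state.get("dialog_active")),
        "battle": bool(state.get("in_battle")),
        "heal": state.get("hp_fraction", 1.0) < 0.3,  # assumed threshold
        "story": bool(state.get("objective")),
        "training": bool(state.get("underleveled")),
        "explore": True,  # fallback when nothing else applies
    }
    for mode in PRIORITY:
        if checks[mode]:
            return mode
    return "explore"
```

For example, a state with both a dialog box and a battle flag resolves to `dialog`, since text must be advanced before any battle input registers.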
### Step 4: ACT — move 2-4 steps max, then re-check
POST /action with a SHORT action list (2-4 actions, not 10-15).
### Step 5: VERIFY — screenshot after every move sequence
Take a screenshot and use vision_analyze to confirm you moved where
intended. This is the MOST IMPORTANT step. Without vision you WILL get lost.
### Step 6: RECORD progress to memory with PKM: prefix
### Step 7: SAVE periodically
## Action Reference
- press_a — confirm, talk, select
- press_b — cancel, close menu
- press_start — open game menu
- walk_up/down/left/right — move one tile
- hold_b_N — hold B for N frames (use for speeding through text)
- wait_60 — wait about 1 second (60 frames)
- a_until_dialog_end — press A repeatedly until dialog clears
## Critical Tips from Experience
### USE VISION CONSTANTLY
- Take a screenshot every 2-4 movement steps
- The RAM state tells you position and HP but NOT what is around you
- Ledges, fences, signs, building doors, NPCs — only visible via screenshot
- Ask the vision model specific questions: "what is one tile north of me?"
- When stuck, always screenshot before trying random directions
### Warp Transitions Need Extra Wait Time
When walking through a door or stairs, the screen fades to black during
the map transition. You MUST wait for it to complete. Add 2-3 wait_60
actions after any door/stair warp. Without waiting, the position reads
as stale and you will think you are still in the old map.
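One way to make the wait hard to forget is to build warp action lists through a helper. A minimal sketch, using only action names from the Action Reference; `with_warp_wait` is a hypothetical helper, and how the list is wrapped into the POST /action body is not shown because the payload shape is not documented here.

```python
def with_warp_wait(actions, waits=3):
    """Return the actions followed by `waits` wait_60 actions for the map fade."""
    if waits < 2:
        raise ValueError("use at least 2 wait_60 actions after a warp")
    return list(actions) + ["wait_60"] * waits
```

For instance, `with_warp_wait(["walk_up"])` yields `["walk_up", "wait_60", "wait_60", "wait_60"]`; read position from /state only after those waits have run.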
### Building Exit Trap
When you exit a building, you appear directly IN FRONT of the door.
If you walk north, you go right back inside. ALWAYS sidestep first
by walking left or right 2 tiles, then proceed in your intended direction.
### Dialog Handling
Gen 1 text scrolls slowly, letter by letter. To speed through dialog,
hold B for 120 frames (holding B makes text display at max speed), then
press A to advance to the next line. Repeat as needed.
The a_until_dialog_end action checks the RAM dialog flag, but this flag
does not catch ALL text states. If dialog seems stuck, use the manual
hold_b + press_a pattern instead and verify via screenshot.
### Ledges Are One-Way
Ledges (small cliff edges) can only be jumped DOWN (south), never climbed
UP (north). If blocked by a ledge going north, you must go left or right
to find the gap around it. Use vision to identify which direction the
gap is. Ask the vision model explicitly.
### Navigation Strategy
- Move 2-4 steps at a time, then screenshot to check position
- When entering a new area, screenshot immediately to orient
- Ask the vision model "which direction to [destination]?"
- If stuck for 3+ attempts, screenshot and re-evaluate completely
- Do not spam 10-15 movements — you will overshoot or get stuck
### Running from Wild Battles
On the battle menu, RUN is bottom-right. To reach it from the default
cursor position (FIGHT, top-left): press down then right to move cursor
to RUN, then press A. Wrap with hold_b to speed through text/animations.
### Battling (FIGHT)
On the battle menu FIGHT is top-left (default cursor position).
Press A to enter move selection, A again to use the first move.
Then hold B to speed through attack animations and text.
## Battle Strategy
### Decision Tree
1. Want to catch? → Weaken then throw Poke Ball
2. Wild you don't need? → RUN
3. Type advantage? → Use super-effective move
4. No advantage? → Use strongest STAB move
5. Low HP? → Switch or use Potion
### Gen 1 Type Chart (key matchups)
- Water beats Fire, Ground, Rock
- Fire beats Grass, Bug, Ice
- Grass beats Water, Ground, Rock
- Electric beats Water, Flying
- Ground beats Fire, Electric, Rock, Poison
- Psychic beats Fighting, Poison (dominant in Gen 1!)
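The matchups above can be kept as a lookup table for picking a move. This sketch encodes only the key matchups listed in this section, not the full Gen 1 chart; the function names are hypothetical.

```python
# Attacking type -> defending types it beats (only the key matchups listed above).
SUPER_EFFECTIVE = {
    "Water": {"Fire", "Ground", "Rock"},
    "Fire": {"Grass", "Bug", "Ice"},
    "Grass": {"Water", "Ground", "Rock"},
    "Electric": {"Water", "Flying"},
    "Ground": {"Fire", "Electric", "Rock", "Poison"},
    "Psychic": {"Fighting", "Poison"},
}


def is_super_effective(attack_type: str, defender_type: str) -> bool:
    """True if the attacking type beats the defender in the table above."""
    return defender_type in SUPER_EFFECTIVE.get(attack_type, set())


def attackers_for(defender_type: str) -> list:
    """List the attacking types that beat a given defending type."""
    return sorted(a for a, beaten in SUPER_EFFECTIVE.items() if defender_type in beaten)
```

`attackers_for("Rock")` returns `["Grass", "Ground", "Water"]`, which matches the advice for Brock in the milestones list.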
### Gen 1 Quirks
- Special stat = both offense AND defense for special moves
- Psychic type is overpowered (a bug makes Psychic-types immune to Ghost moves)
- Critical hit rate is based on the Pokemon's base Speed stat
- Wrap/Bind prevent opponent from acting
- Focus Energy bug: REDUCES crit rate instead of raising it
## Memory Conventions
| Prefix | Purpose | Example |
|--------|---------|---------|
| PKM:OBJECTIVE | Current goal | Get Parcel from Viridian Mart |
| PKM:MAP | Navigation knowledge | Viridian: mart is northeast |
| PKM:STRATEGY | Battle/team plans | Need Grass type before Misty |
| PKM:PROGRESS | Milestone tracker | Beat rival, heading to Viridian |
| PKM:STUCK | Stuck situations | Ledge at y=28 go right to bypass |
| PKM:TEAM | Team notes | Squirtle Lv6, Tackle + Tail Whip |
## Progression Milestones
- Choose starter
- Deliver Parcel from Viridian Mart, receive Pokedex
- Boulder Badge — Brock (Rock) → use Water/Grass
- Cascade Badge — Misty (Water) → use Grass/Electric
- Thunder Badge — Lt. Surge (Electric) → use Ground
- Rainbow Badge — Erika (Grass) → use Fire/Ice/Flying
- Soul Badge — Koga (Poison) → use Ground/Psychic
- Marsh Badge — Sabrina (Psychic) → hardest gym
- Volcano Badge — Blaine (Fire) → use Water/Ground
- Earth Badge — Giovanni (Ground) → use Water/Grass/Ice
- Elite Four → Champion!
## Stopping Play
1. Save the game with a descriptive name via POST /save
2. Update memory with PKM:PROGRESS
3. Tell user: "Game saved as [name]! Say 'play pokemon' to resume."
4. Kill the server and tunnel background processes
## Pitfalls
- NEVER download or provide ROM files
- Do NOT send more than 4-5 actions without checking vision
- Always sidestep after exiting buildings before going north
- Always add wait_60 x2-3 after door/stair warps
- Dialog detection via RAM is unreliable — verify with screenshots
- Save BEFORE risky encounters
- The tunnel URL changes each time you restart it
@@ -1098,7 +1098,7 @@ Please see the ocifs docs.
The path should start with https://.
This must be publically accessible.
This must be publicly accessible.
Now that you know how to load datasets, you can learn more on how to load your specific dataset format into your target output format dataset formats docs.
@@ -0,0 +1,302 @@
---
name: hermes-atropos-environments
description: Build, test, and debug Hermes Agent RL environments for Atropos training. Covers the HermesAgentBaseEnv interface, reward functions, agent loop integration, evaluation with tools, wandb logging, and the three CLI modes (serve/process/evaluate). Use when creating, reviewing, or fixing RL environments in the hermes-agent repo.
version: 1.1.0
author: Hermes Agent
license: MIT
metadata:
hermes:
tags: [atropos, rl, environments, training, reinforcement-learning, reward-functions]
related_skills: [axolotl, grpo-rl-training, trl-fine-tuning, lm-evaluation-harness]
---
# Hermes Agent Atropos Environments
Guide for building RL environments in the hermes-agent repo that integrate with the Atropos training framework.
## Architecture Overview
```
Atropos BaseEnv (atroposlib/envs/base.py)
    └── HermesAgentBaseEnv (environments/hermes_base_env.py)
            ├── Handles agent loop orchestration
            ├── Handles tool resolution per group
            ├── Handles ToolContext for reward verification
            └── YOUR ENVIRONMENT (environments/your_env.py)
                    Only implements: setup, get_next_item, format_prompt,
                    compute_reward, evaluate, wandb_log
```
Hermes environments are special because they run a **multi-turn agent loop with tool calling** — not just single-turn completions. The base env handles the loop; you implement the task and scoring.
## File Locations
| File | Purpose |
|------|---------|
| `environments/hermes_base_env.py` | Base class with agent loop + tool resolution |
| `environments/agent_loop.py` | `HermesAgentLoop` + `AgentResult` dataclass |
| `environments/tool_context.py` | `ToolContext` for reward verification |
| `environments/tool_call_parsers.py` | Phase 2 tool call parsers (hermes, mistral, etc.) |
| `environments/your_env.py` | Your environment implementation |
## Inference Setup — Ask the User First
**IMPORTANT:** Before running any test, evaluation, or data generation command, always ask the user how they want to handle inference. Do NOT assume OpenRouter or any specific endpoint. Present these options:
1. **OpenRouter** — Ask which model they want to use (e.g., `anthropic/claude-sonnet-4.5`, `google/gemini-2.5-pro`, `meta-llama/llama-3.3-70b-instruct`, etc.). Requires `OPENROUTER_API_KEY` in environment.
2. **Self-hosted VLLM endpoint** — Ask for their base URL (e.g., `http://localhost:8000/v1`) and model name. Set `--openai.server_type vllm`.
3. **Other OpenAI-compatible API** — Ask for the base URL, model name, and any required API key. Set `--openai.server_type openai` and `--openai.health_check false`.
4. **Local Atropos training server** — For `serve` mode with a live training loop. Default `http://localhost:8000/v1`.
Once the user tells you their setup, use those values in all CLI commands for that session. Example prompt:
> "Before I run this, how would you like to handle inference?
> 1. OpenRouter (I'll need your preferred model, e.g. claude-sonnet-4.5)
> 2. A self-hosted VLLM endpoint (give me the URL and model name)
> 3. Another OpenAI-compatible API (give me the URL, model, and any auth details)
> 4. Local Atropos training server (serve mode)"
### Key flags by provider:
| Provider | `--openai.server_type` | `--openai.health_check` | `--openai.api_key` |
|----------|----------------------|------------------------|-------------------|
| OpenRouter | `openai` | `false` | `$OPENROUTER_API_KEY` |
| VLLM (self-hosted) | `vllm` | (default) | (not needed) |
| Other OpenAI-compatible | `openai` | `false` | As needed |
| Local Atropos | (default) | (default) | (not needed) |
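The table can be turned into a small helper that assembles the `--openai.*` arguments once the user has chosen a provider. This is a sketch: the provider keys (`openrouter`, `vllm`, `openai-compatible`, `atropos-local`) are hypothetical labels, and only flags named in this document are emitted.

```python
# Per-provider overrides, copied from the table above. Empty means "use defaults".
PROVIDER_FLAGS = {
    "openrouter": {"server_type": "openai", "health_check": "false"},
    "vllm": {"server_type": "vllm"},
    "openai-compatible": {"server_type": "openai", "health_check": "false"},
    "atropos-local": {},
}


def inference_args(provider: str, base_url: str, model_name: str, api_key=None) -> list:
    """Build the --openai.* CLI arguments for the user's chosen provider."""
    overrides = PROVIDER_FLAGS[provider]  # KeyError on an unknown provider label
    args = ["--openai.base_url", base_url, "--openai.model_name", model_name]
    for key, value in overrides.items():
        args += [f"--openai.{key}", value]
    if api_key:
        args += ["--openai.api_key", api_key]
    return args
```

For OpenRouter, pass `api_key=os.environ["OPENROUTER_API_KEY"]` rather than hardcoding the key.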
## Required Methods
### 1. `setup()` — Load dataset and initialize state
```python
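async def setup(self) -> None:
    """Called once at startup. Load datasets, initialize state."""
    import random  # needed for the shuffle below

    # Try HuggingFace first, fall back to built-in samples
    try:
        from datasets import load_dataset
        ds = load_dataset("your/dataset", split="test")
        self._items = [...]
    except Exception:
        self._items = list(BUILTIN_SAMPLES)  # copy so the shuffle leaves the constant alone
    # Always split into train/eval
    random.shuffle(self._items)
    # Hold out ~10% (at least 20) for eval, but never more than half,
    # so a small built-in sample set still leaves training items
    eval_size = min(max(20, int(len(self._items) * 0.1)), len(self._items) // 2)
    self._eval_items = self._items[:eval_size]
    self._items = self._items[eval_size:]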
```
### 2. `get_next_item()` — Return next training item
```python
async def get_next_item(self) -> dict:
    """Return next item, cycling through dataset."""
    item = self._items[self._index % len(self._items)]
    self._index += 1
    return item
```
### 3. `format_prompt(item)` — Convert item to user message
```python
def format_prompt(self, item: dict) -> str:
    """Convert a dataset item into the user-facing prompt."""
    return f"Research this question: {item['question']}"
```
### 4. `compute_reward(item, result, ctx)` — Score the rollout
**CRITICAL**: `result` is an `AgentResult`, NOT a dict. It has these attributes:
- `result.messages` — List of message dicts (OpenAI format)
- `result.turns_used` — Number of LLM calls made
- `result.finished_naturally` — True if model stopped voluntarily
- `result.tool_errors` — List of ToolError objects
**AgentResult does NOT have**: `final_response`, `tool_calls`, `tools_used`.
You must extract these from `result.messages`:
```python
async def compute_reward(self, item, result: AgentResult, ctx: ToolContext) -> float:
    # Extract final response (last assistant message with content)
    final_response = ""
    tools_used = []
    for msg in reversed(result.messages):
        if msg.get("role") == "assistant" and msg.get("content") and not final_response:
            final_response = msg["content"]
        if msg.get("role") == "assistant" and msg.get("tool_calls"):
            for tc in msg["tool_calls"]:
                fn = tc.get("function", {}) if isinstance(tc, dict) else {}
                name = fn.get("name", "")
                if name:
                    tools_used.append(name)
    # Score using LLM judge, heuristic, or ToolContext verification
    correctness = await self._llm_judge(item, final_response)
    return correctness
```
`ctx` (ToolContext) gives you terminal/file access to the agent's sandbox for verification:
```python
# Run tests in the agent's sandbox (inside compute_reward; a separate
# name avoids shadowing the `result` argument)
test_result = ctx.terminal("pytest /workspace/test.py")
return 1.0 if test_result["exit_code"] == 0 else 0.0
```
### 5. `evaluate()` — Periodic evaluation with full agent loop
**MUST use the full agent loop with tools**, not single-turn chat_completion.
The whole point of hermes-agent environments is agentic evaluation:
```python
async def evaluate(self, *args, **kwargs) -> None:
    import time, uuid
    from environments.agent_loop import HermesAgentLoop
    from environments.tool_context import ToolContext
    start_time = time.time()
    tools, valid_names = self._resolve_tools_for_group()
    samples = []
    for item in self._eval_items[:self.config.eval_size]:
        task_id = str(uuid.uuid4())
        messages = []
        if self.config.system_prompt:
            messages.append({"role": "system", "content": self.config.system_prompt})
        messages.append({"role": "user", "content": self.format_prompt(item)})
        agent = HermesAgentLoop(
            server=self.server,
            tool_schemas=tools,
            valid_tool_names=valid_names,
            max_turns=self.config.max_agent_turns,
            task_id=task_id,
            temperature=0.0,  # Deterministic for eval
            max_tokens=self.config.max_token_length,
            extra_body=self.config.extra_body,
        )
        result = await agent.run(messages)
        ctx = ToolContext(task_id)
        try:
            reward = await self.compute_reward(item, result, ctx)
        finally:
            ctx.cleanup()
        samples.append({"prompt": ..., "response": ..., "reward": reward})
    eval_metrics = {"eval/mean_reward": ...}
    await self.evaluate_log(metrics=eval_metrics, samples=samples,
                            start_time=start_time, end_time=time.time())
```
### 6. `wandb_log()` — Custom metrics logging
Always call `super().wandb_log()` at the end:
```python
async def wandb_log(self, wandb_metrics=None):
    if wandb_metrics is None:
        wandb_metrics = {}
    if self._reward_buffer:
        n = len(self._reward_buffer)
        wandb_metrics["train/mean_reward"] = sum(self._reward_buffer) / n
        self._reward_buffer.clear()
    await super().wandb_log(wandb_metrics)  # MUST call super
```
**Pitfall**: `compute_reward` appends to metric buffers. During eval, this pollutes training metrics. Roll back buffer entries added during eval.
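One possible way to do the rollback is to record each buffer's length before eval and truncate back afterwards. The sketch below assumes the metric buffers are plain lists that only grow by appending, and that no training rollouts append to them while eval is running; `preserve_buffers` is a hypothetical helper, not part of the base env.

```python
from contextlib import contextmanager


@contextmanager
def preserve_buffers(*buffers):
    """Drop anything appended to the given lists inside the with-block."""
    lengths = [len(buf) for buf in buffers]
    try:
        yield
    finally:
        # Truncate each list back to its length on entry.
        for buf, n in zip(buffers, lengths):
            del buf[n:]
```

Inside `evaluate()`, this would wrap the reward call, for example `with preserve_buffers(self._reward_buffer): reward = await self.compute_reward(item, result, ctx)`.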
## Config Class
Always create a custom config subclass with Pydantic Field descriptors. Key inherited fields you can tune: `enabled_toolsets`, `max_agent_turns`, `agent_temperature`, `system_prompt`, `terminal_backend`, `group_size`, `steps_per_eval`, `total_steps`.
## config_init() — Default Configuration
Classmethod returning `(YourEnvConfig, [APIServerConfig(...)])`. Set server_type to "openai" for OpenRouter/external APIs. Load API key from environment variable.
## Three CLI Modes
```bash
# SERVE — Full training loop (connects to Atropos API server)
python environments/my_env.py serve --openai.base_url http://localhost:8000/v1
# PROCESS — Offline data generation (saves JSONL)
python environments/my_env.py process --env.total_steps 10 --env.group_size 1 \
--env.use_wandb false --env.data_path_to_save_groups output.jsonl \
--openai.base_url "<USER_BASE_URL>" \
--openai.model_name "<USER_MODEL>" \
--openai.server_type <USER_SERVER_TYPE> --openai.health_check false
# EVALUATE — Standalone eval (runs setup + evaluate only)
python environments/my_env.py evaluate --env.eval_size 20 \
--env.data_dir_to_save_evals /tmp/eval_results \
--openai.base_url "<USER_BASE_URL>" \
--openai.model_name "<USER_MODEL>" \
--openai.server_type <USER_SERVER_TYPE> --openai.health_check false
```
Config priority: CLI args > YAML file > config_init() defaults.
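The precedence rule can be pictured as a layered lookup. This sketch only illustrates the ordering with plain dicts; it is not how Atropos merges configuration internally, and `resolve_config` is a hypothetical name.

```python
from collections import ChainMap


def resolve_config(cli_args: dict, yaml_values: dict, defaults: dict) -> dict:
    """Merge settings so that CLI beats YAML, and YAML beats config_init() defaults."""
    # Treat None as "flag not given on the command line".
    given = {k: v for k, v in cli_args.items() if v is not None}
    # ChainMap looks keys up left to right, so earlier maps win.
    return dict(ChainMap(given, yaml_values, defaults))
```

So a `--env.group_size 1` on the command line wins over a `group_size` set in YAML, while fields absent from both fall back to the `config_init()` defaults.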
## Common Pitfalls
1. **AgentResult has .messages, not .final_response** — Extract the final response by iterating reversed(result.messages) looking for the last assistant message with content.
2. **evaluate() must use HermesAgentLoop, not chat_completion** — Single-turn chat_completion has no tools. The whole point of hermes-agent benchmarks is agentic evaluation with tool use.
3. **Don't call _llm_judge twice** — If compute_reward already calls it, extract the score from the buffer instead of calling judge separately in evaluate().
4. **Eval pollutes training buffers** — compute_reward appends to metric buffers. During eval, roll back buffer entries to keep training metrics clean.
5. **Always set health_check=false for OpenRouter** — OpenRouter has no /health endpoint.
6. **Set data_dir_to_save_evals in evaluate mode** — Without it, results aren't saved.
7. **default_toolsets class variable vs enabled_toolsets config** — The class variable is a hint; the config field is what actually controls tool resolution.
8. **Tool call parsing in messages** — Tool calls are dicts with `{"function": {"name": ..., "arguments": ...}}`. Always check `isinstance(tc, dict)`.
9. **ToolContext.cleanup()** — Always call in a finally block to release sandbox resources.
10. **server_type must be "openai" for external APIs** — Without it, Atropos assumes a local VLLM server.
11. **Always ask the user for their inference setup** — Never hardcode or assume a specific provider/model. See the "Inference Setup" section above.
## Reward Function Patterns
### LLM Judge (for open-ended tasks)
Use `self.server.chat_completion()` with a scoring prompt. Parse JSON response for score float. Always include a heuristic fallback (keyword overlap) for when the judge call fails.
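The parse-then-fallback step might look like the sketch below. It assumes the judge was prompted to reply with a JSON object containing a `score` key; that key name and both function names are hypothetical. The actual judge call is left out, since its exact signature is not shown here.

```python
import json
import re


def keyword_overlap(response: str, reference: str) -> float:
    """Heuristic fallback: fraction of reference words that appear in the response."""
    ref_words = set(re.findall(r"\w+", reference.lower()))
    if not ref_words:
        return 0.0
    resp_words = set(re.findall(r"\w+", response.lower()))
    return len(ref_words & resp_words) / len(ref_words)


def score_from_judge(raw_judge_output, response: str, reference: str) -> float:
    """Parse the judge's JSON score; fall back to keyword overlap if parsing fails."""
    try:
        score = float(json.loads(raw_judge_output)["score"])
    except (TypeError, ValueError, KeyError):
        # Covers None, non-JSON text, wrong JSON shape, and a missing key.
        return keyword_overlap(response, reference)
    return min(1.0, max(0.0, score))  # clamp to [0, 1]
```

Pass `None` as `raw_judge_output` when the judge request itself raised, so the heuristic still produces a score.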
### Binary Verification (for code/terminal tasks)
Use `ctx.terminal("pytest test.py -q")` to run tests in the agent's sandbox. Return 1.0 for pass, 0.0 for fail.
### Multi-Signal (combine multiple indicators)
Weight correctness (0.6) + tool usage (0.2) + efficiency (0.2) + optional bonuses. Clamp to [0, 1].
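A sketch of the weighting follows. The 0.6/0.2/0.2 weights come from the line above; how tool usage and efficiency are measured is an assumption here (tool usage is 1.0 if any tool was called, efficiency is the unused fraction of the turn budget), and `multi_signal_reward` is a hypothetical name.

```python
def multi_signal_reward(correctness: float, used_tools: bool,
                        turns_used: int, max_turns: int, bonus: float = 0.0) -> float:
    """Combine correctness, tool usage and efficiency, then clamp to [0, 1]."""
    tool_usage = 1.0 if used_tools else 0.0
    if max_turns > 0:
        # Assumed definition: fewer turns used means higher efficiency.
        efficiency = 1.0 - min(turns_used, max_turns) / max_turns
    else:
        efficiency = 0.0
    raw = 0.6 * correctness + 0.2 * tool_usage + 0.2 * efficiency + bonus
    return min(1.0, max(0.0, raw))
```

`turns_used` can come straight from `result.turns_used` and `max_turns` from `self.config.max_agent_turns`.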
## Testing Your Environment
1. **Import test**: `python -c "from environments.my_env import MyEnv; print('OK')"`
2. **Ask the user for inference setup** (see "Inference Setup" section above)
3. **Process mode** (1 item): Verify JSONL output has valid tokens, masks, scores
4. **Evaluate mode**: Verify full agent loop runs with tools, metrics logged correctly
5. **Check reward range**: Scores should be in [0, 1], not all identical
## Minimum Implementation Checklist
```python
class MyEnv(HermesAgentBaseEnv):
    name = "my-env"
    env_config_cls = MyEnvConfig

    @classmethod
    def config_init(cls): ...                                # Default server + env config
    async def setup(self): ...                               # Load dataset + train/eval split
    async def get_next_item(self): ...                       # Cycle through training items
    def format_prompt(self, item): ...                       # Item → user message string
    async def compute_reward(self, item, result, ctx): ...   # Score rollout
    async def evaluate(self, *args, **kwargs): ...           # Full agent loop eval
    async def wandb_log(self, wandb_metrics=None): ...       # Custom metrics + super()

if __name__ == "__main__":
    MyEnv.cli()
```
@@ -0,0 +1,59 @@
# AgentResult Fields Reference
`AgentResult` is defined in `environments/agent_loop.py` as a dataclass.
## Fields
| Field | Type | Description |
|-------|------|-------------|
| `messages` | `List[Dict[str, Any]]` | Full conversation history in OpenAI message format |
| `managed_state` | `Optional[Dict]` | ManagedServer.get_state() if Phase 2, else None |
| `turns_used` | `int` | Number of LLM calls made during the loop |
| `finished_naturally` | `bool` | True if model stopped calling tools on its own |
| `reasoning_per_turn` | `List[Optional[str]]` | Extracted reasoning content per turn |
| `tool_errors` | `List[ToolError]` | Tool errors encountered during the loop |
## ToolError Fields
| Field | Type | Description |
|-------|------|-------------|
| `turn` | `int` | Which turn the error occurred |
| `tool_name` | `str` | Name of the tool that failed |
| `arguments` | `str` | Arguments passed to the tool |
| `error` | `str` | Error message |
| `tool_result` | `str` | The result returned to the model |
## Extracting Data from Messages
Messages follow OpenAI format. Common patterns:
```python
# Get final assistant response
final_response = ""  # stays empty if no assistant message has content
for msg in reversed(result.messages):
    if msg.get("role") == "assistant" and msg.get("content"):
        final_response = msg["content"]
        break

# Get all tool names used
tools = []
for msg in result.messages:
    if msg.get("role") == "assistant" and msg.get("tool_calls"):
        for tc in msg["tool_calls"]:
            fn = tc.get("function", {}) if isinstance(tc, dict) else {}
            tools.append(fn.get("name", ""))

# Get tool results
for msg in result.messages:
    if msg.get("role") == "tool":
        tool_output = msg.get("content", "")
        call_id = msg.get("tool_call_id", "")
```
## Fields that DO NOT EXIST
These are common mistakes — AgentResult does NOT have:
- `final_response` — extract from messages
- `tool_calls` — extract from messages
- `tools_used` — extract from messages
- `output` — extract from messages
- `response` — extract from messages
@@ -0,0 +1,65 @@
# Atropos BaseEnv Reference
Source: `atroposlib/envs/base.py` (~2124 lines)
## Abstract Methods (MUST implement)
| Method | Signature | Description |
|--------|-----------|-------------|
| `get_next_item()` | `async def get_next_item(self) -> Item` | Return next item for trajectory. Return None to pause. |
| `evaluate()` | `async def evaluate(self, *args, **kwargs)` | Called every steps_per_eval steps. |
| `setup()` | `async def setup(self)` | Called once at start. Load datasets, init models. |
| `collect_trajectory()` | `async def collect_trajectory(self, item) -> Tuple[Optional[ScoredDataItem], List[Item]]` | Single rollout. Or override collect_trajectories instead. |
## Overridable Methods
| Method | Default Behavior | Override When |
|--------|-----------------|---------------|
| `collect_trajectories()` | Runs collect_trajectory group_size times in parallel | Batch generation, MCTS, coupled rollouts |
| `wandb_log()` | Logs completion lengths, rollout table, perf stats | Add custom metrics (always call super) |
| `config_init()` | Returns (env_config_cls(), ServerBaseline()) | Custom defaults + server configs |
| `postprocess_histories()` | Passthrough | Final processing before sending to trainer |
| `save_checkpoint()` | Saves JSON to checkpoint_dir | Custom serialization |
| `cleanup()` | No-op | Release resources after each rollout |
## ScoredDataGroup Structure
```python
ScoredDataGroup = TypedDict with:
    tokens: List[List[int]]            # Token IDs per rollout
    masks: List[List[int]]             # -100=prompt, token_id=completion
    scores: List[float]                # Score per rollout
    advantages: Optional[...]          # Per-token advantages
    ref_logprobs: Optional[...]        # Reference model logprobs
    messages: Optional[...]            # OpenAI-format messages
    inference_logprobs: Optional[...]  # Inference logprobs
```
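To make the mask convention concrete, here is a sketch for the simplest case of one prompt followed by one completion. Multi-turn agent rollouts interleave masked and unmasked spans and are built by the base env, so this illustrates the convention only; `build_masks` is a hypothetical helper.

```python
def build_masks(tokens: list, prompt_len: int) -> list:
    """Mask prompt positions with -100 and keep completion token IDs as-is."""
    if not 0 <= prompt_len <= len(tokens):
        raise ValueError("prompt_len must be between 0 and len(tokens)")
    return [-100] * prompt_len + list(tokens[prompt_len:])
```

For `tokens = [5, 6, 7, 8]` with a two-token prompt, the mask is `[-100, -100, 7, 8]`, the same length as `tokens`.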
## BaseEnvConfig Key Fields
| Field | Default | Description |
|-------|---------|-------------|
| `group_size` | 4 | Responses grouped for scoring |
| `steps_per_eval` | 100 | Steps between evaluations |
| `max_token_length` | 2048 | Max token length for generations |
| `total_steps` | 1000 | Total training steps |
| `use_wandb` | True | Enable wandb logging |
| `tokenizer_name` | DeepHermes-3 | Tokenizer for token encoding |
| `ensure_scores_are_not_same` | True | Skip groups with identical scores |
| `worker_timeout` | 600 | Task timeout seconds |
## Data Flow
```
env_manager() → add_train_workers() → handle_env()
→ collect_trajectories() → postprocess_histories()
→ handle_send_to_api() → training server
```
## Atropos Environment Statistics (82 environments analyzed)
- 95% implement setup, collect_trajectories, evaluate, get_next_item
- 76% override wandb_log
- 54% have custom config class
- Most use collect_trajectories (plural), not collect_trajectory (singular)
- Common reward patterns: LLM-judge (~40), regex-extract (~35), code-exec (~12)
@@ -0,0 +1,199 @@
# Usage Patterns — Testing Environments and Evaluating Models
## Pattern 1: Test Your Environment Works (process mode)
Use `process` mode to verify your environment runs end-to-end before
committing. This generates trajectories without needing an Atropos
training server.
**Before running:** Ask the user for their inference setup (see SKILL.md "Inference Setup" section). Replace `<BASE_URL>`, `<MODEL>`, and `<SERVER_TYPE>` below with their chosen values.
### Step 1: Run 1 trajectory
```bash
cd ~/.hermes/hermes-agent
source .venv/bin/activate
python environments/your_env.py process \
--env.total_steps 1 \
--env.group_size 1 \
--env.use_wandb false \
--env.data_path_to_save_groups /tmp/test_output.jsonl \
--openai.base_url "<BASE_URL>" \
--openai.model_name "<MODEL>" \
--openai.server_type <SERVER_TYPE> \
--openai.health_check false
```
### Step 2: Verify the output
```python
import json
for line in open("/tmp/test_output.jsonl"):
    data = json.loads(line)
    print(f"Scores: {data.get('scores', [])}")
    print(f"Token sequences: {len(data.get('tokens', []))}")
    # Check messages include tool calls
    for msg_list in data.get("messages", []):
        roles = [m.get("role") for m in msg_list]
        print(f"Roles: {roles}")
        for m in reversed(msg_list):
            if m.get("role") == "assistant" and m.get("content"):
                print(f"Response: {m['content'][:200]}...")
                break
```
### What to check:
- **Scores are not all 0.0** — if so, compute_reward is broken
- **Scores are in [0, 1]** — not negative, not >1
- **Messages include "tool" role entries** — agent used tools
- **Token sequences are non-empty**
- **An HTML visualization is generated** next to the .jsonl
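Most of these checks can be automated per JSONL record. The sketch below uses the `scores`, `tokens`, and `messages` keys shown in the verification script; the HTML visualization check is left out because its filename is not specified here, and `check_group` is a hypothetical helper name.

```python
def check_group(data: dict) -> list:
    """Return a list of problems found in one parsed JSONL record."""
    problems = []
    scores = data.get("scores") or []
    if not scores or all(s == 0.0 for s in scores):
        problems.append("scores missing or all 0.0 (compute_reward may be broken)")
    if any(s < 0 or s > 1 for s in scores):
        problems.append("score outside [0, 1]")
    tokens = data.get("tokens") or []
    if not tokens or any(len(seq) == 0 for seq in tokens):
        problems.append("empty token sequence")
    used_tool = any(
        m.get("role") == "tool"
        for msg_list in data.get("messages") or []
        for m in msg_list
    )
    if not used_tool:
        problems.append("no tool-role messages (agent did not use tools)")
    return problems
```

Run it on each `json.loads(line)` from the output file; an empty list means the record passed every check.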
### Common failures:
- `'AgentResult' object has no attribute 'X'` — accessing a field that doesn't exist. See agentresult-fields.md.
- Score always 0.0 — reward function erroring silently
- Score always 1.0 — verification too lenient or not running
## Pattern 2: Evaluate a Model (evaluate mode)
Use `evaluate` mode to benchmark a model on your environment's eval
split. This runs the full agent loop with tools for each eval item.
### Step 1: Run evaluation
```bash
python environments/your_env.py evaluate \
--env.eval_size 20 \
--env.use_wandb false \
--env.data_dir_to_save_evals /tmp/eval_results \
--openai.base_url "<BASE_URL>" \
--openai.model_name "<MODEL>" \
--openai.server_type <SERVER_TYPE> \
--openai.health_check false
```
### Step 2: Read results
Stdout shows a lighteval-compatible table:
```
Evaluation Results: your-env_eval
|Metric | Value|
|mean correctness| 0.850 |
|mean reward | 0.920 |
|mean tool calls | 4.300 |
|n items | 20 |
Evaluation completed in 367 seconds
```
JSON results saved to the eval directory:
```python
import json
data = json.load(open("/tmp/eval_results/metrics.json"))
for metric, value in data["results"]["all"].items():
    print(f"{metric}: {value}")
```
### Step 3: Compare models
Run evaluate with different models and compare the metrics.json files.
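A small diff helper makes the comparison easier to read. It assumes each metrics.json has the `results` then `all` layout used in the reading example above; `compare_metrics` and the directory names in the usage note are hypothetical.

```python
def compare_metrics(run_a: dict, run_b: dict) -> dict:
    """Return {metric: value_b - value_a} for numeric metrics present in both runs."""
    a, b = run_a["results"]["all"], run_b["results"]["all"]
    deltas = {}
    for name, value_a in a.items():
        value_b = b.get(name)
        if isinstance(value_a, (int, float)) and isinstance(value_b, (int, float)):
            deltas[name] = value_b - value_a
    return deltas
```

Load each file with `json.load(open(".../metrics.json"))`, using a separate `--env.data_dir_to_save_evals` directory per model so the runs do not overwrite each other.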
### What to check:
- **"data_dir_to_save_evals is not set"** — you forgot the flag, results won't be saved
- **Tool usage rate = 0** — evaluate() is using chat_completion instead of HermesAgentLoop
- **All scores identical** — judge failing, falling back to heuristic
- **Very slow** — each item runs a full agent loop (~30-90s). Use `--env.eval_size 5` for quick checks.
## Pattern 3: Generate Training Data (process mode, larger scale)
Generate trajectory data for offline training or analysis:
```bash
python environments/your_env.py process \
--env.total_steps 50 \
--env.group_size 4 \
--env.use_wandb false \
--env.data_path_to_save_groups data/trajectories.jsonl \
--openai.base_url "<BASE_URL>" \
--openai.model_name "<MODEL>" \
--openai.server_type <SERVER_TYPE> \
--openai.health_check false
```
### Analyze the distribution:
```python
import json
scores = []
for line in open("data/trajectories.jsonl"):
    data = json.loads(line)
    scores.extend(data.get("scores", []))
print(f"Total: {len(scores)}, Mean: {sum(scores)/len(scores):.3f}")
for bucket in [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]:
    count = sum(1 for s in scores if abs(s - bucket) < 0.1)
    print(f"  {bucket:.1f}: {'█' * count} ({count})")
```
### What to check:
- **Score distribution has variance** — RL needs score variance. All-same scores are useless.
## Pattern 4: Full RL Training (serve mode)
For actual RL training with Atropos:
```bash
# Terminal 1: Start Atropos API server
run-api
# Terminal 2: Start your environment
python environments/your_env.py serve \
--config environments/your_env/default.yaml
```
For Phase 2 with VLLM:
```bash
# Terminal 1: VLLM server
python -m vllm.entrypoints.openai.api_server --model your-model --port 8000
# Terminal 2: Atropos API
run-api
# Terminal 3: Environment
python environments/your_env.py serve \
--openai.base_url http://localhost:8000/v1 \
--openai.model_name your-model \
--openai.server_type vllm
```
## Pattern 5: Quick Smoke Test
Verify imports and config before spending money on API calls:
```python
from environments.your_env import YourEnv
print(f"Name: {YourEnv.name}")
cfg, servers = YourEnv.config_init()
print(f"Toolsets: {cfg.enabled_toolsets}")
print(f"Server: {servers[0].model_name}")
print("All imports OK")
```
## Timing Expectations
| Mode | Items | Time per item | Total |
|------|-------|--------------|-------|
| process (1 item) | 1 | 30-90s | ~1 min |
| evaluate (5 items) | 5 | 30-90s | ~5 min |
| evaluate (20 items) | 20 | 30-90s | ~15-30 min |
| process (50 items) | 50 | 30-90s | ~30-75 min |
Times are for cloud APIs with Claude Sonnet-class models. Local models may be faster or slower depending on hardware.
@@ -1,7 +1,7 @@
---
name: duckduckgo-search
description: Free web search via DuckDuckGo when Firecrawl is unavailable. No API key needed. Use ddgs CLI or Python library to find URLs, then web_extract for content.
version: 1.1.0
description: Free web search via DuckDuckGo — text, news, images, videos. No API key needed. Use the Python DDGS library or CLI to search, then web_extract for full content.
version: 1.2.0
author: gamedevCloudy
license: MIT
metadata:
@@ -10,17 +10,11 @@ metadata:
related_skills: [arxiv]
---
# DuckDuckGo Search (Firecrawl Fallback)
# DuckDuckGo Search
Free web search using DuckDuckGo. **No API key required.**
## When to Use This
Use this skill ONLY when the `web_search` tool is not available (i.e., `FIRECRAWL_API_KEY` is not set). If `web_search` works, prefer it — it returns richer results with built-in content extraction.
Signs you need this fallback:
- `web_search` tool is not listed in your available tools
- `web_search` returns an error about missing FIRECRAWL_API_KEY
Preferred when `web_search` tool is unavailable or unsuitable (no `FIRECRAWL_API_KEY` set). Can also be used as a standalone search tool.
## Setup
@@ -29,14 +23,109 @@ Signs you need this fallback:
pip install ddgs
```
## Web Search (Primary Use Case)
## Python API (Primary)
### Via Terminal (ddgs CLI)
Use the `DDGS` class in `execute_code` for structured results with typed fields.
**Important:** `max_results` must always be passed as a **keyword argument** — positional usage raises an error on all methods.
### Text Search
Best for: general research, companies, documentation.
```python
from ddgs import DDGS
with DDGS() as ddgs:
    for r in ddgs.text("python async programming", max_results=5):
        print(r["title"])
        print(r["href"])
        print(r.get("body", "")[:200])
        print()
```
Returns: `title`, `href`, `body`
### News Search
Best for: current events, breaking news, latest updates.
```python
from ddgs import DDGS
with DDGS() as ddgs:
for r in ddgs.news("AI regulation 2026", max_results=5):
print(r["date"], "-", r["title"])
print(r.get("source", ""), "|", r["url"])
print(r.get("body", "")[:200])
print()
```
Returns: `date`, `title`, `body`, `url`, `image`, `source`
### Image Search
Best for: visual references, product images, diagrams.
```python
from ddgs import DDGS
with DDGS() as ddgs:
for r in ddgs.images("semiconductor chip", max_results=5):
print(r["title"])
print(r["image"]) # direct image URL
print(r.get("thumbnail", ""))
print(r.get("source", ""))
print()
```
Returns: `title`, `image`, `thumbnail`, `url`, `height`, `width`, `source`
### Video Search
Best for: tutorials, demos, explainers.
```python
from ddgs import DDGS
with DDGS() as ddgs:
for r in ddgs.videos("FastAPI tutorial", max_results=5):
print(r["title"])
print(r.get("content", "")) # video URL
print(r.get("duration", "")) # e.g. "26:03"
print(r.get("provider", "")) # YouTube, etc.
print(r.get("published", ""))
print()
```
Returns: `title`, `content`, `description`, `duration`, `provider`, `published`, `statistics`, `uploader`
### Quick Reference
| Method | Use When | Key Fields |
|--------|----------|------------|
| `text()` | General research, companies | title, href, body |
| `news()` | Current events, updates | date, title, source, body, url |
| `images()` | Visuals, diagrams | title, image, thumbnail, url |
| `videos()` | Tutorials, demos | title, content, duration, provider |
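Because each method stores its main link under a different key, a small lookup can make downstream code uniform. The following is a minimal sketch based only on the fields listed above; `LINK_FIELD`, `result_link`, and `summarize` are hypothetical names, and the mapping may need adjusting if a ddgs version returns different fields.

```python
# Hypothetical mapping: the field that carries the main link for each DDGS method,
# taken from the "Returns" lists above.
LINK_FIELD = {
    "text": "href",
    "news": "url",
    "images": "image",
    "videos": "content",
}


def result_link(method, result):
    """Return the main link of one result dict, or "" when the field is absent."""
    return result.get(LINK_FIELD[method], "")


def summarize(method, results):
    """Reduce raw results to (title, link) pairs, using .get() for every field."""
    return [(r.get("title", ""), result_link(method, r)) for r in results]
```

With real results this would be called as `summarize("news", ddgs.news("query", max_results=5))` inside a `with DDGS() as ddgs:` block.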
## CLI (Alternative)
Use the `ddgs` command via terminal when you don't need structured field access.
```bash
# Basic search — returns titles, URLs, and snippets
# Text search
ddgs text -k "python async programming" -m 5
# News search
ddgs news -k "artificial intelligence" -m 5
# Image search
ddgs images -k "landscape photography" -m 10
# Video search
ddgs videos -k "python tutorial" -m 5
# With region filter
ddgs text -k "best restaurants" -m 5 -r us-en
@@ -47,16 +136,6 @@ ddgs text -k "latest AI news" -m 5 -t w
ddgs text -k "fastapi tutorial" -m 5 -o json
```
### Via Python (in execute_code)
```python
from hermes_tools import terminal
# Search and get results
result = terminal("ddgs text -k 'python web framework comparison' -m 5")
print(result["output"])
```
### CLI Flags
| Flag | Description | Example |
@@ -68,44 +147,39 @@ print(result["output"])
| `-s` | Safe search | `-s off` |
| `-o` | Output format | `-o json` |
## Other Search Types
## Workflow: Search then Extract
```bash
# Image search
ddgs images -k "landscape photography" -m 10
# News search
ddgs news -k "artificial intelligence" -m 5
# Video search
ddgs videos -k "python tutorial" -m 5
```
## Workflow: Search → Extract
DuckDuckGo finds URLs. To get full page content, follow up with `web_extract`:
DuckDuckGo returns titles, URLs, and snippets — not full page content. To get full content, follow up with `web_extract`:
1. **Search** with ddgs to find relevant URLs
2. **Extract** content using the `web_extract` tool (if available) or curl
```bash
# Step 1: Find URLs
ddgs text -k "fastapi tutorial" -m 3
```python
from ddgs import DDGS
# Step 2: Extract full content from a result URL
# (use web_extract tool if available, otherwise curl)
curl -s "https://example.com/article" | head -200
with DDGS() as ddgs:
results = list(ddgs.text("fastapi deployment guide", max_results=3))
for r in results:
print(r["title"], "->", r["href"])
# Then use web_extract tool on the best URL
```
## Limitations
- **Rate limiting**: DuckDuckGo may throttle after many rapid requests. Add `sleep 1` between searches if needed.
- **No content extraction**: ddgs only returns titles, URLs, and snippets, not full page content. Use `web_extract` or curl for that.
- **Rate limiting**: DuckDuckGo may throttle after many rapid requests. Add a short delay between searches if needed.
- **No content extraction**: ddgs returns snippets, not full page content. Use `web_extract` or curl for that.
- **Results quality**: Generally good but less configurable than Firecrawl's search.
- **Availability**: DuckDuckGo may block requests from some cloud IPs. If searches return empty, try different keywords or add a short delay.
- **Availability**: DuckDuckGo may block requests from some cloud IPs. If searches return empty, try different keywords or wait a few seconds.
- **Field variability**: Return fields may vary between results or ddgs versions. Use `.get()` for optional fields to avoid KeyError.
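The rate-limiting and availability notes above suggest retrying after a short delay when a search comes back empty. One way to sketch that is shown below. `search_with_retry` is a hypothetical helper, the attempt count and delay are arbitrary assumptions, and it only handles empty results: it does not catch exceptions, since the exception types ddgs may raise are not covered here.

```python
import time


def search_with_retry(search_fn, query, max_results=5, attempts=3, delay=2.0, sleep=time.sleep):
    """Call search_fn until it returns results or attempts run out.

    search_fn is any callable shaped like ddgs.text / ddgs.news;
    max_results is always forwarded as a keyword argument.
    """
    for attempt in range(attempts):
        results = list(search_fn(query, max_results=max_results))
        if results:
            return results
        # Wait before the next try, but not after the final attempt.
        if attempt < attempts - 1:
            sleep(delay)
    return []


# Intended usage (requires `pip install ddgs` and network access):
#   from ddgs import DDGS
#   with DDGS() as ddgs:
#       hits = search_with_retry(ddgs.text, "fastapi deployment guide", max_results=3)
```

The `sleep` parameter is injectable so the delay can be replaced with a no-op when testing.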
## Pitfalls
- **Don't confuse `-k` and `-m`**: `-k` is for keywords (the query), `-m` is for max results count.
- **`max_results` is keyword-only**: `ddgs.text("query", 5)` raises an error. Use `ddgs.text("query", max_results=5)`.
- **Don't confuse `-k` and `-m`** (CLI): `-k` is for keywords, `-m` is for max results count.
- **Package name**: The package is `ddgs` (formerly `duckduckgo-search`). Install with `pip install ddgs`.
- **Empty results**: If ddgs returns nothing, it may be rate-limited. Wait a few seconds and retry.
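To avoid the positional `max_results` pitfall, a thin wrapper can accept the count positionally and forward it by keyword. This is a sketch under assumptions: `text_search` and `FakeClient` are hypothetical, and `FakeClient` only imitates the keyword-only behaviour with Python's `*` syntax (which raises `TypeError`); the exact error raised by the real library may differ.

```python
def text_search(client, query, limit=5):
    """Forward `limit` as the max_results keyword, never positionally."""
    return list(client.text(query, max_results=limit))


class FakeClient:
    """Stand-in for DDGS whose text() accepts max_results by keyword only."""

    def text(self, query, *, max_results=None):
        # Produce placeholder results so the wrapper can be exercised offline.
        return [{"title": f"{query} #{i}"} for i in range(max_results or 0)]
```

With the real library, the same wrapper would be called as `text_search(ddgs, "query", 5)` inside a `with DDGS() as ddgs:` block.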
## Validated With
Smoke-tested with `ddgs==9.11.2` on Python 3.13. All four methods (text, news, images, videos) confirmed working with keyword `max_results`.
@@ -0,0 +1,198 @@
"""Tests for configurable background process notification modes.
The gateway process watcher pushes status updates to users' chats when
background terminal commands run. ``display.background_process_notifications``
controls verbosity: off | result | error | all (default).
Contributed by @PeterFile (PR #593), reimplemented on current main.
"""
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock, patch
import pytest
from gateway.config import GatewayConfig, Platform
from gateway.run import GatewayRunner
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
class _FakeRegistry:
"""Return pre-canned sessions, then None once exhausted."""
def __init__(self, sessions):
self._sessions = list(sessions)
def get(self, session_id):
if self._sessions:
return self._sessions.pop(0)
return None
def _build_runner(monkeypatch, tmp_path, mode: str) -> GatewayRunner:
"""Create a GatewayRunner with a fake config for the given mode."""
(tmp_path / "config.yaml").write_text(
f"display:\n background_process_notifications: {mode}\n",
encoding="utf-8",
)
import gateway.run as gateway_run
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
runner = GatewayRunner(GatewayConfig())
adapter = SimpleNamespace(send=AsyncMock())
runner.adapters[Platform.TELEGRAM] = adapter
return runner
def _watcher_dict(session_id="proc_test"):
return {
"session_id": session_id,
"check_interval": 0,
"platform": "telegram",
"chat_id": "123",
}
# ---------------------------------------------------------------------------
# _load_background_notifications_mode unit tests
# ---------------------------------------------------------------------------
class TestLoadBackgroundNotificationsMode:
def test_defaults_to_all(self, monkeypatch, tmp_path):
import gateway.run as gw
monkeypatch.setattr(gw, "_hermes_home", tmp_path)
monkeypatch.delenv("HERMES_BACKGROUND_NOTIFICATIONS", raising=False)
assert GatewayRunner._load_background_notifications_mode() == "all"
def test_reads_config_yaml(self, monkeypatch, tmp_path):
(tmp_path / "config.yaml").write_text(
"display:\n background_process_notifications: error\n"
)
import gateway.run as gw
monkeypatch.setattr(gw, "_hermes_home", tmp_path)
monkeypatch.delenv("HERMES_BACKGROUND_NOTIFICATIONS", raising=False)
assert GatewayRunner._load_background_notifications_mode() == "error"
def test_env_var_overrides_config(self, monkeypatch, tmp_path):
(tmp_path / "config.yaml").write_text(
"display:\n background_process_notifications: error\n"
)
import gateway.run as gw
monkeypatch.setattr(gw, "_hermes_home", tmp_path)
monkeypatch.setenv("HERMES_BACKGROUND_NOTIFICATIONS", "off")
assert GatewayRunner._load_background_notifications_mode() == "off"
def test_false_value_maps_to_off(self, monkeypatch, tmp_path):
(tmp_path / "config.yaml").write_text(
"display:\n background_process_notifications: false\n"
)
import gateway.run as gw
monkeypatch.setattr(gw, "_hermes_home", tmp_path)
monkeypatch.delenv("HERMES_BACKGROUND_NOTIFICATIONS", raising=False)
assert GatewayRunner._load_background_notifications_mode() == "off"
def test_invalid_value_defaults_to_all(self, monkeypatch, tmp_path):
(tmp_path / "config.yaml").write_text(
"display:\n background_process_notifications: banana\n"
)
import gateway.run as gw
monkeypatch.setattr(gw, "_hermes_home", tmp_path)
monkeypatch.delenv("HERMES_BACKGROUND_NOTIFICATIONS", raising=False)
assert GatewayRunner._load_background_notifications_mode() == "all"
# ---------------------------------------------------------------------------
# _run_process_watcher integration tests
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
@pytest.mark.parametrize(
("mode", "sessions", "expected_calls", "expected_fragment"),
[
# all mode: running output → sends update
(
"all",
[
SimpleNamespace(output_buffer="building...\n", exited=False, exit_code=None),
None, # process disappears → watcher exits
],
1,
"is still running",
),
# result mode: running output → no update
(
"result",
[
SimpleNamespace(output_buffer="building...\n", exited=False, exit_code=None),
None,
],
0,
None,
),
# off mode: exited process → no notification
(
"off",
[SimpleNamespace(output_buffer="done\n", exited=True, exit_code=0)],
0,
None,
),
# result mode: exited → notifies
(
"result",
[SimpleNamespace(output_buffer="done\n", exited=True, exit_code=0)],
1,
"finished with exit code 0",
),
# error mode: exit 0 → no notification
(
"error",
[SimpleNamespace(output_buffer="done\n", exited=True, exit_code=0)],
0,
None,
),
# error mode: exit 1 → notifies
(
"error",
[SimpleNamespace(output_buffer="traceback\n", exited=True, exit_code=1)],
1,
"finished with exit code 1",
),
# all mode: exited → notifies
(
"all",
[SimpleNamespace(output_buffer="ok\n", exited=True, exit_code=0)],
1,
"finished with exit code 0",
),
],
)
async def test_run_process_watcher_respects_notification_mode(
monkeypatch, tmp_path, mode, sessions, expected_calls, expected_fragment
):
import tools.process_registry as pr_module
monkeypatch.setattr(pr_module, "process_registry", _FakeRegistry(sessions))
# Patch asyncio.sleep to avoid real delays
async def _instant_sleep(*_a, **_kw):
pass
monkeypatch.setattr(asyncio, "sleep", _instant_sleep)
runner = _build_runner(monkeypatch, tmp_path, mode)
adapter = runner.adapters[Platform.TELEGRAM]
await runner._run_process_watcher(_watcher_dict())
assert adapter.send.await_count == expected_calls, (
f"mode={mode}: expected {expected_calls} sends, got {adapter.send.await_count}"
)
if expected_fragment is not None:
sent_message = adapter.send.await_args.args[1]
assert expected_fragment in sent_message
+24
@@ -160,3 +160,27 @@ class TestMirrorToSession:
result = mirror_to_session("telegram", "123", "msg")
assert result is False
class TestAppendToSqlite:
def test_connection_is_closed_after_use(self, tmp_path):
"""Verify _append_to_sqlite closes the SessionDB connection."""
from gateway.mirror import _append_to_sqlite
mock_db = MagicMock()
with patch("hermes_state.SessionDB", return_value=mock_db):
_append_to_sqlite("sess_1", {"role": "assistant", "content": "hello"})
mock_db.append_message.assert_called_once()
mock_db.close.assert_called_once()
def test_connection_closed_even_on_error(self, tmp_path):
"""Verify connection is closed even when append_message raises."""
from gateway.mirror import _append_to_sqlite
mock_db = MagicMock()
mock_db.append_message.side_effect = Exception("db error")
with patch("hermes_state.SessionDB", return_value=mock_db):
_append_to_sqlite("sess_1", {"role": "assistant", "content": "hello"})
mock_db.close.assert_called_once()
+33 -1
@@ -34,7 +34,7 @@ def _ensure_telegram_mock():
_ensure_telegram_mock()
from gateway.platforms.telegram import TelegramAdapter, _escape_mdv2 # noqa: E402
from gateway.platforms.telegram import TelegramAdapter, _escape_mdv2, _strip_mdv2 # noqa: E402
# ---------------------------------------------------------------------------
@@ -360,3 +360,35 @@ class TestFormatMessageComplex:
assert "Header" in result
assert "block" in result
assert "url.com" in result
# =========================================================================
# _strip_mdv2 — plaintext fallback
# =========================================================================
class TestStripMdv2:
def test_removes_escape_backslashes(self):
assert _strip_mdv2(r"hello\.world\!") == "hello.world!"
def test_removes_bold_markers(self):
assert _strip_mdv2("*bold text*") == "bold text"
def test_removes_italic_markers(self):
assert _strip_mdv2("_italic text_") == "italic text"
def test_removes_both_bold_and_italic(self):
result = _strip_mdv2("*bold* and _italic_")
assert result == "bold and italic"
def test_preserves_snake_case(self):
assert _strip_mdv2("my_variable_name") == "my_variable_name"
def test_preserves_multi_underscore_identifier(self):
assert _strip_mdv2("some_func_call here") == "some_func_call here"
def test_plain_text_unchanged(self):
assert _strip_mdv2("plain text") == "plain text"
def test_empty_string(self):
assert _strip_mdv2("") == ""
@@ -0,0 +1,113 @@
"""Tests for _coalesce_session_name_args — multi-word session name merging."""
import pytest
from hermes_cli.main import _coalesce_session_name_args
class TestCoalesceSessionNameArgs:
"""Ensure unquoted multi-word session names are merged into one token."""
# ── -c / --continue ──────────────────────────────────────────────────
def test_continue_multiword_unquoted(self):
"""hermes -c Pokemon Agent Dev → -c 'Pokemon Agent Dev'"""
assert _coalesce_session_name_args(
["-c", "Pokemon", "Agent", "Dev"]
) == ["-c", "Pokemon Agent Dev"]
def test_continue_long_form_multiword(self):
"""hermes --continue Pokemon Agent Dev"""
assert _coalesce_session_name_args(
["--continue", "Pokemon", "Agent", "Dev"]
) == ["--continue", "Pokemon Agent Dev"]
def test_continue_single_word(self):
"""hermes -c MyProject (no merging needed)"""
assert _coalesce_session_name_args(["-c", "MyProject"]) == [
"-c",
"MyProject",
]
def test_continue_already_quoted(self):
"""hermes -c 'Pokemon Agent Dev' (shell already merged)"""
assert _coalesce_session_name_args(
["-c", "Pokemon Agent Dev"]
) == ["-c", "Pokemon Agent Dev"]
def test_continue_bare_flag(self):
"""hermes -c (no name — means 'continue latest')"""
assert _coalesce_session_name_args(["-c"]) == ["-c"]
def test_continue_followed_by_flag(self):
"""hermes -c -w (no name consumed, -w stays separate)"""
assert _coalesce_session_name_args(["-c", "-w"]) == ["-c", "-w"]
def test_continue_multiword_then_flag(self):
"""hermes -c my project -w"""
assert _coalesce_session_name_args(
["-c", "my", "project", "-w"]
) == ["-c", "my project", "-w"]
def test_continue_multiword_then_subcommand(self):
"""hermes -c my project chat -q hello"""
assert _coalesce_session_name_args(
["-c", "my", "project", "chat", "-q", "hello"]
) == ["-c", "my project", "chat", "-q", "hello"]
# ── -r / --resume ────────────────────────────────────────────────────
def test_resume_multiword(self):
"""hermes -r My Session Name"""
assert _coalesce_session_name_args(
["-r", "My", "Session", "Name"]
) == ["-r", "My Session Name"]
def test_resume_long_form_multiword(self):
"""hermes --resume My Session Name"""
assert _coalesce_session_name_args(
["--resume", "My", "Session", "Name"]
) == ["--resume", "My Session Name"]
def test_resume_multiword_then_flag(self):
"""hermes -r My Session -w"""
assert _coalesce_session_name_args(
["-r", "My", "Session", "-w"]
) == ["-r", "My Session", "-w"]
# ── combined flags ───────────────────────────────────────────────────
def test_worktree_and_continue_multiword(self):
"""hermes -w -c Pokemon Agent Dev (the original failing case)"""
assert _coalesce_session_name_args(
["-w", "-c", "Pokemon", "Agent", "Dev"]
) == ["-w", "-c", "Pokemon Agent Dev"]
def test_continue_multiword_and_worktree(self):
"""hermes -c Pokemon Agent Dev -w (order reversed)"""
assert _coalesce_session_name_args(
["-c", "Pokemon", "Agent", "Dev", "-w"]
) == ["-c", "Pokemon Agent Dev", "-w"]
# ── passthrough (no session flags) ───────────────────────────────────
def test_no_session_flags_passthrough(self):
"""hermes -w chat -q hello (nothing to merge)"""
result = _coalesce_session_name_args(["-w", "chat", "-q", "hello"])
assert result == ["-w", "chat", "-q", "hello"]
def test_empty_argv(self):
assert _coalesce_session_name_args([]) == []
# ── subcommand boundary ──────────────────────────────────────────────
def test_stops_at_sessions_subcommand(self):
"""hermes -c my project sessions list → stops before 'sessions'"""
assert _coalesce_session_name_args(
["-c", "my", "project", "sessions", "list"]
) == ["-c", "my project", "sessions", "list"]
def test_stops_at_setup_subcommand(self):
"""hermes -c my setup → 'setup' is a subcommand, not part of name"""
assert _coalesce_session_name_args(
["-c", "my", "setup"]
) == ["-c", "my", "setup"]
+1 -1
@@ -12,7 +12,7 @@ EXPECTED_COMMANDS = {
"/personality", "/clear", "/history", "/new", "/reset", "/retry",
"/undo", "/save", "/config", "/cron", "/skills", "/platforms",
"/verbose", "/compress", "/title", "/usage", "/insights", "/paste",
"/reload-mcp", "/quit",
"/reload-mcp", "/rollback", "/skin", "/quit",
}
+89 -4
@@ -2,7 +2,11 @@
import os
from pathlib import Path
from unittest.mock import patch
from unittest.mock import patch, MagicMock
import yaml
from hermes_cli.config import (
DEFAULT_CONFIG,
@@ -41,22 +45,44 @@ class TestLoadConfigDefaults:
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
config = load_config()
assert config["model"] == DEFAULT_CONFIG["model"]
assert config["max_turns"] == DEFAULT_CONFIG["max_turns"]
assert config["agent"]["max_turns"] == DEFAULT_CONFIG["agent"]["max_turns"]
assert "max_turns" not in config
assert "terminal" in config
assert config["terminal"]["backend"] == "local"
def test_legacy_root_level_max_turns_migrates_to_agent_config(self, tmp_path):
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
config_path = tmp_path / "config.yaml"
config_path.write_text("max_turns: 42\n")
config = load_config()
assert config["agent"]["max_turns"] == 42
assert "max_turns" not in config
class TestSaveAndLoadRoundtrip:
def test_roundtrip(self, tmp_path):
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
config = load_config()
config["model"] = "test/custom-model"
config["max_turns"] = 42
config["agent"]["max_turns"] = 42
save_config(config)
reloaded = load_config()
assert reloaded["model"] == "test/custom-model"
assert reloaded["max_turns"] == 42
assert reloaded["agent"]["max_turns"] == 42
saved = yaml.safe_load((tmp_path / "config.yaml").read_text())
assert saved["agent"]["max_turns"] == 42
assert "max_turns" not in saved
def test_save_config_normalizes_legacy_root_level_max_turns(self, tmp_path):
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
save_config({"model": "test/custom-model", "max_turns": 37})
saved = yaml.safe_load((tmp_path / "config.yaml").read_text())
assert saved["agent"]["max_turns"] == 37
assert "max_turns" not in saved
def test_nested_values_preserved(self, tmp_path):
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
@@ -66,3 +92,62 @@ class TestSaveAndLoadRoundtrip:
reloaded = load_config()
assert reloaded["terminal"]["timeout"] == 999
class TestSaveConfigAtomicity:
"""Verify save_config uses atomic writes (tempfile + os.replace)."""
def test_no_partial_write_on_crash(self, tmp_path):
"""If save_config crashes mid-write, the previous file stays intact."""
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
# Write an initial config
config = load_config()
config["model"] = "original-model"
save_config(config)
config_path = tmp_path / "config.yaml"
assert config_path.exists()
# Simulate a crash during yaml.dump by making atomic_yaml_write's
# yaml.dump raise after the temp file is created but before replace.
with patch("utils.yaml.dump", side_effect=OSError("disk full")):
try:
config["model"] = "should-not-persist"
save_config(config)
except OSError:
pass
# Original file must still be intact
reloaded = load_config()
assert reloaded["model"] == "original-model"
def test_no_leftover_temp_files(self, tmp_path):
"""Failed writes must clean up their temp files."""
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
config = load_config()
save_config(config)
with patch("utils.yaml.dump", side_effect=OSError("disk full")):
try:
save_config(config)
except OSError:
pass
# No .tmp files should remain
tmp_files = list(tmp_path.glob(".*config*.tmp"))
assert tmp_files == []
def test_atomic_write_creates_valid_yaml(self, tmp_path):
"""The written file must be valid YAML matching the input."""
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
config = load_config()
config["model"] = "test/atomic-model"
config["agent"]["max_turns"] = 77
save_config(config)
# Read raw YAML to verify it's valid and correct
config_path = tmp_path / "config.yaml"
with open(config_path) as f:
raw = yaml.safe_load(f)
assert raw["model"] == "test/atomic-model"
assert raw["agent"]["max_turns"] == 77
+232
@@ -0,0 +1,232 @@
"""Tests for hermes_cli.skin_engine — the data-driven skin/theme system."""
import json
import os
import pytest
from pathlib import Path
from unittest.mock import patch
@pytest.fixture(autouse=True)
def reset_skin_state():
"""Reset skin engine state between tests."""
from hermes_cli import skin_engine
skin_engine._active_skin = None
skin_engine._active_skin_name = "default"
yield
skin_engine._active_skin = None
skin_engine._active_skin_name = "default"
class TestSkinConfig:
def test_default_skin_has_required_fields(self):
from hermes_cli.skin_engine import load_skin
skin = load_skin("default")
assert skin.name == "default"
assert skin.tool_prefix == ""
assert "banner_title" in skin.colors
assert "banner_border" in skin.colors
assert "agent_name" in skin.branding
def test_get_color_with_fallback(self):
from hermes_cli.skin_engine import load_skin
skin = load_skin("default")
assert skin.get_color("banner_title") == "#FFD700"
assert skin.get_color("nonexistent", "#000") == "#000"
def test_get_branding_with_fallback(self):
from hermes_cli.skin_engine import load_skin
skin = load_skin("default")
assert skin.get_branding("agent_name") == "Hermes Agent"
assert skin.get_branding("nonexistent", "fallback") == "fallback"
def test_get_spinner_list_empty_for_default(self):
from hermes_cli.skin_engine import load_skin
skin = load_skin("default")
# Default skin has no custom spinner config
assert skin.get_spinner_list("waiting_faces") == []
assert skin.get_spinner_list("thinking_verbs") == []
def test_get_spinner_wings_empty_for_default(self):
from hermes_cli.skin_engine import load_skin
skin = load_skin("default")
assert skin.get_spinner_wings() == []
class TestBuiltinSkins:
def test_ares_skin_loads(self):
from hermes_cli.skin_engine import load_skin
skin = load_skin("ares")
assert skin.name == "ares"
assert skin.tool_prefix == ""
assert skin.get_color("banner_border") == "#9F1C1C"
assert skin.get_branding("agent_name") == "Ares Agent"
def test_ares_has_spinner_customization(self):
from hermes_cli.skin_engine import load_skin
skin = load_skin("ares")
assert len(skin.get_spinner_list("waiting_faces")) > 0
assert len(skin.get_spinner_list("thinking_faces")) > 0
assert len(skin.get_spinner_list("thinking_verbs")) > 0
wings = skin.get_spinner_wings()
assert len(wings) > 0
assert isinstance(wings[0], tuple)
assert len(wings[0]) == 2
def test_mono_skin_loads(self):
from hermes_cli.skin_engine import load_skin
skin = load_skin("mono")
assert skin.name == "mono"
assert skin.get_color("banner_title") == "#e6edf3"
def test_slate_skin_loads(self):
from hermes_cli.skin_engine import load_skin
skin = load_skin("slate")
assert skin.name == "slate"
assert skin.get_color("banner_title") == "#7eb8f6"
def test_unknown_skin_falls_back_to_default(self):
from hermes_cli.skin_engine import load_skin
skin = load_skin("nonexistent_skin_xyz")
assert skin.name == "default"
def test_all_builtin_skins_have_complete_colors(self):
from hermes_cli.skin_engine import _BUILTIN_SKINS, _build_skin_config
required_keys = ["banner_border", "banner_title", "banner_accent",
"banner_dim", "banner_text", "ui_accent"]
for name, data in _BUILTIN_SKINS.items():
skin = _build_skin_config(data)
for key in required_keys:
assert key in skin.colors, f"Skin '{name}' missing color '{key}'"
class TestSkinManagement:
def test_set_active_skin(self):
from hermes_cli.skin_engine import set_active_skin, get_active_skin, get_active_skin_name
skin = set_active_skin("ares")
assert skin.name == "ares"
assert get_active_skin_name() == "ares"
assert get_active_skin().name == "ares"
def test_get_active_skin_defaults(self):
from hermes_cli.skin_engine import get_active_skin
skin = get_active_skin()
assert skin.name == "default"
def test_list_skins_includes_builtins(self):
from hermes_cli.skin_engine import list_skins
skins = list_skins()
names = [s["name"] for s in skins]
assert "default" in names
assert "ares" in names
assert "mono" in names
assert "slate" in names
for s in skins:
assert "source" in s
assert s["source"] == "builtin"
def test_init_skin_from_config(self):
from hermes_cli.skin_engine import init_skin_from_config, get_active_skin_name
init_skin_from_config({"display": {"skin": "ares"}})
assert get_active_skin_name() == "ares"
def test_init_skin_from_empty_config(self):
from hermes_cli.skin_engine import init_skin_from_config, get_active_skin_name
init_skin_from_config({})
assert get_active_skin_name() == "default"
class TestUserSkins:
def test_load_user_skin_from_yaml(self, tmp_path, monkeypatch):
from hermes_cli.skin_engine import load_skin, _skins_dir
# Create a user skin YAML
skins_dir = tmp_path / "skins"
skins_dir.mkdir()
skin_file = skins_dir / "custom.yaml"
skin_data = {
"name": "custom",
"description": "A custom test skin",
"colors": {"banner_title": "#FF0000"},
"branding": {"agent_name": "Custom Agent"},
"tool_prefix": "",
}
import yaml
skin_file.write_text(yaml.dump(skin_data))
# Patch skins dir
monkeypatch.setattr("hermes_cli.skin_engine._skins_dir", lambda: skins_dir)
skin = load_skin("custom")
assert skin.name == "custom"
assert skin.get_color("banner_title") == "#FF0000"
assert skin.get_branding("agent_name") == "Custom Agent"
assert skin.tool_prefix == ""
# Should inherit defaults for unspecified colors
assert skin.get_color("banner_border") == "#CD7F32" # from default
def test_list_skins_includes_user_skins(self, tmp_path, monkeypatch):
from hermes_cli.skin_engine import list_skins
skins_dir = tmp_path / "skins"
skins_dir.mkdir()
import yaml
(skins_dir / "pirate.yaml").write_text(yaml.dump({
"name": "pirate",
"description": "Arr matey",
}))
monkeypatch.setattr("hermes_cli.skin_engine._skins_dir", lambda: skins_dir)
skins = list_skins()
names = [s["name"] for s in skins]
assert "pirate" in names
pirate = [s for s in skins if s["name"] == "pirate"][0]
assert pirate["source"] == "user"
class TestDisplayIntegration:
def test_get_skin_tool_prefix_default(self):
from agent.display import get_skin_tool_prefix
assert get_skin_tool_prefix() == ""
def test_get_skin_tool_prefix_custom(self):
from hermes_cli.skin_engine import set_active_skin
from agent.display import get_skin_tool_prefix
set_active_skin("ares")
assert get_skin_tool_prefix() == ""
def test_get_skin_faces_default(self):
from agent.display import get_skin_faces, KawaiiSpinner
faces = get_skin_faces("waiting_faces", KawaiiSpinner.KAWAII_WAITING)
# Default skin has no custom faces, so should return the default list
assert faces == KawaiiSpinner.KAWAII_WAITING
def test_get_skin_faces_ares(self):
from hermes_cli.skin_engine import set_active_skin
from agent.display import get_skin_faces, KawaiiSpinner
set_active_skin("ares")
faces = get_skin_faces("waiting_faces", KawaiiSpinner.KAWAII_WAITING)
assert "(⚔)" in faces
def test_get_skin_verbs_default(self):
from agent.display import get_skin_verbs, KawaiiSpinner
verbs = get_skin_verbs()
assert verbs == KawaiiSpinner.THINKING_VERBS
def test_get_skin_verbs_ares(self):
from hermes_cli.skin_engine import set_active_skin
from agent.display import get_skin_verbs
set_active_skin("ares")
verbs = get_skin_verbs()
assert "forging" in verbs
def test_tool_message_uses_skin_prefix(self):
from hermes_cli.skin_engine import set_active_skin
from agent.display import get_cute_tool_message
set_active_skin("ares")
msg = get_cute_tool_message("terminal", {"command": "ls"}, 0.5)
assert msg.startswith("")
assert "" not in msg
def test_tool_message_default_prefix(self):
from agent.display import get_cute_tool_message
msg = get_cute_tool_message("terminal", {"command": "ls"}, 0.5)
assert msg.startswith("")
+675
@@ -0,0 +1,675 @@
from __future__ import annotations
import importlib.util
import json
import sys
from pathlib import Path
SCRIPT_PATH = (
Path(__file__).resolve().parents[2]
/ "optional-skills"
/ "migration"
/ "openclaw-migration"
/ "scripts"
/ "openclaw_to_hermes.py"
)
def load_module():
spec = importlib.util.spec_from_file_location("openclaw_to_hermes", SCRIPT_PATH)
module = importlib.util.module_from_spec(spec)
assert spec.loader is not None
sys.modules[spec.name] = module
spec.loader.exec_module(module)
return module
def load_skills_guard():
spec = importlib.util.spec_from_file_location(
"skills_guard_local",
Path(__file__).resolve().parents[2] / "tools" / "skills_guard.py",
)
module = importlib.util.module_from_spec(spec)
assert spec.loader is not None
sys.modules[spec.name] = module
spec.loader.exec_module(module)
return module
def test_extract_markdown_entries_promotes_heading_context():
mod = load_module()
text = """# MEMORY.md - Long-Term Memory
## Tyler Williams
- Founder of VANTA Research
- Timezone: America/Los_Angeles
### Active Projects
- Hermes Agent
"""
entries = mod.extract_markdown_entries(text)
assert "Tyler Williams: Founder of VANTA Research" in entries
assert "Tyler Williams: Timezone: America/Los_Angeles" in entries
assert "Tyler Williams > Active Projects: Hermes Agent" in entries
def test_merge_entries_respects_limit_and_reports_overflow():
mod = load_module()
existing = ["alpha"]
incoming = ["beta", "gamma is too long"]
merged, stats, overflowed = mod.merge_entries(existing, incoming, limit=12)
assert merged == ["alpha", "beta"]
assert stats["added"] == 1
assert stats["overflowed"] == 1
assert overflowed == ["gamma is too long"]
def test_resolve_selected_options_supports_include_and_exclude():
mod = load_module()
selected = mod.resolve_selected_options(["memory,skills", "user-profile"], ["skills"])
assert selected == {"memory", "user-profile"}
def test_resolve_selected_options_supports_presets():
mod = load_module()
user_data = mod.resolve_selected_options(preset="user-data")
full = mod.resolve_selected_options(preset="full")
assert "secret-settings" not in user_data
assert "secret-settings" in full
assert user_data < full
def test_resolve_selected_options_rejects_unknown_values():
mod = load_module()
try:
mod.resolve_selected_options(["memory,unknown-option"], None)
except ValueError as exc:
assert "unknown-option" in str(exc)
else:
raise AssertionError("Expected ValueError for unknown migration option")
def test_resolve_selected_options_rejects_unknown_preset():
mod = load_module()
try:
mod.resolve_selected_options(preset="everything")
except ValueError as exc:
assert "everything" in str(exc)
else:
raise AssertionError("Expected ValueError for unknown migration preset")
def test_migrator_copies_skill_and_merges_allowlist(tmp_path: Path):
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
(source / "workspace" / "skills" / "demo-skill").mkdir(parents=True)
(source / "workspace" / "skills" / "demo-skill" / "SKILL.md").write_text(
"---\nname: demo-skill\ndescription: demo\n---\n\nbody\n",
encoding="utf-8",
)
(source / "exec-approvals.json").write_text(
json.dumps(
{
"agents": {
"*": {
"allowlist": [
{"pattern": "/usr/bin/*"},
{"pattern": "/home/test/**"},
]
}
}
}
),
encoding="utf-8",
)
(target / "config.yaml").write_text("command_allowlist:\n - /usr/bin/*\n", encoding="utf-8")
migrator = mod.Migrator(
source_root=source,
target_root=target,
execute=True,
workspace_target=None,
overwrite=False,
migrate_secrets=False,
output_dir=target / "migration-report",
)
report = migrator.migrate()
imported_skill = target / "skills" / mod.SKILL_CATEGORY_DIRNAME / "demo-skill" / "SKILL.md"
assert imported_skill.exists()
assert "/home/test/**" in (target / "config.yaml").read_text(encoding="utf-8")
assert report["summary"]["migrated"] >= 2
def test_migrator_optionally_imports_supported_secrets_and_messaging_settings(tmp_path: Path):
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
(source / "credentials").mkdir(parents=True)
(source / "openclaw.json").write_text(
json.dumps(
{
"agents": {"defaults": {"workspace": "/tmp/openclaw-workspace"}},
"channels": {"telegram": {"botToken": "123:abc"}},
}
),
encoding="utf-8",
)
(source / "credentials" / "telegram-default-allowFrom.json").write_text(
json.dumps({"allowFrom": ["111", "222"]}),
encoding="utf-8",
)
target.mkdir()
migrator = mod.Migrator(
source_root=source,
target_root=target,
execute=True,
workspace_target=None,
overwrite=False,
migrate_secrets=True,
output_dir=target / "migration-report",
)
migrator.migrate()
env_text = (target / ".env").read_text(encoding="utf-8")
assert "MESSAGING_CWD=/tmp/openclaw-workspace" in env_text
assert "TELEGRAM_ALLOWED_USERS=111,222" in env_text
assert "TELEGRAM_BOT_TOKEN=123:abc" in env_text
def test_migrator_can_execute_only_selected_categories(tmp_path: Path):
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
(source / "workspace" / "skills" / "demo-skill").mkdir(parents=True)
(source / "workspace" / "skills" / "demo-skill" / "SKILL.md").write_text(
"---\nname: demo-skill\ndescription: demo\n---\n\nbody\n",
encoding="utf-8",
)
(source / "workspace" / "MEMORY.md").write_text(
"# Memory\n\n- keep me\n",
encoding="utf-8",
)
(target / "config.yaml").write_text("command_allowlist: []\n", encoding="utf-8")
migrator = mod.Migrator(
source_root=source,
target_root=target,
execute=True,
workspace_target=None,
overwrite=False,
migrate_secrets=False,
output_dir=target / "migration-report",
selected_options={"skills"},
)
report = migrator.migrate()
imported_skill = target / "skills" / mod.SKILL_CATEGORY_DIRNAME / "demo-skill" / "SKILL.md"
assert imported_skill.exists()
assert not (target / "memories" / "MEMORY.md").exists()
assert report["selection"]["selected"] == ["skills"]
skipped_items = [item for item in report["items"] if item["status"] == "skipped"]
assert any(item["kind"] == "memory" and item["reason"] == "Not selected for this run" for item in skipped_items)
def test_migrator_records_preset_in_report(tmp_path: Path):
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
(target / "config.yaml").write_text("command_allowlist: []\n", encoding="utf-8")
migrator = mod.Migrator(
source_root=source,
target_root=target,
execute=False,
workspace_target=None,
overwrite=False,
migrate_secrets=False,
output_dir=None,
selected_options=mod.MIGRATION_PRESETS["user-data"],
preset_name="user-data",
)
report = migrator.build_report()
assert report["preset"] == "user-data"
assert report["selection"]["preset"] == "user-data"
assert report["skill_conflict_mode"] == "skip"
assert report["selection"]["skill_conflict_mode"] == "skip"
def test_migrator_exports_full_overflow_entries(tmp_path: Path):
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
(target / "config.yaml").write_text("memory:\n memory_char_limit: 10\n user_char_limit: 10\n", encoding="utf-8")
(source / "workspace").mkdir(parents=True)
(source / "workspace" / "MEMORY.md").write_text(
"# Memory\n\n- alpha\n- beta\n- gamma\n",
encoding="utf-8",
)
migrator = mod.Migrator(
source_root=source,
target_root=target,
execute=True,
workspace_target=None,
overwrite=False,
migrate_secrets=False,
output_dir=target / "migration-report",
selected_options={"memory"},
)
report = migrator.migrate()
memory_item = next(item for item in report["items"] if item["kind"] == "memory")
overflow_file = Path(memory_item["details"]["overflow_file"])
assert overflow_file.exists()
text = overflow_file.read_text(encoding="utf-8")
assert "alpha" in text or "beta" in text or "gamma" in text
def test_migrator_can_rename_conflicting_imported_skill(tmp_path: Path):
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
source_skill = source / "workspace" / "skills" / "demo-skill"
source_skill.mkdir(parents=True)
(source_skill / "SKILL.md").write_text(
"---\nname: demo-skill\ndescription: demo\n---\n\nbody\n",
encoding="utf-8",
)
existing_skill = target / "skills" / mod.SKILL_CATEGORY_DIRNAME / "demo-skill"
existing_skill.mkdir(parents=True)
(existing_skill / "SKILL.md").write_text(
"---\nname: demo-skill\ndescription: existing\n---\n\nexisting\n",
encoding="utf-8",
)
migrator = mod.Migrator(
source_root=source,
target_root=target,
execute=True,
workspace_target=None,
overwrite=False,
migrate_secrets=False,
output_dir=target / "migration-report",
skill_conflict_mode="rename",
)
report = migrator.migrate()
renamed_skill = target / "skills" / mod.SKILL_CATEGORY_DIRNAME / "demo-skill-imported" / "SKILL.md"
assert renamed_skill.exists()
assert existing_skill.joinpath("SKILL.md").read_text(encoding="utf-8").endswith("existing\n")
imported_items = [item for item in report["items"] if item["kind"] == "skill" and item["status"] == "migrated"]
assert any(item["details"].get("renamed_from", "").endswith("/demo-skill") for item in imported_items)
def test_migrator_can_overwrite_conflicting_imported_skill_with_backup(tmp_path: Path):
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
source_skill = source / "workspace" / "skills" / "demo-skill"
source_skill.mkdir(parents=True)
(source_skill / "SKILL.md").write_text(
"---\nname: demo-skill\ndescription: imported\n---\n\nfresh\n",
encoding="utf-8",
)
existing_skill = target / "skills" / mod.SKILL_CATEGORY_DIRNAME / "demo-skill"
existing_skill.mkdir(parents=True)
(existing_skill / "SKILL.md").write_text(
"---\nname: demo-skill\ndescription: existing\n---\n\nexisting\n",
encoding="utf-8",
)
migrator = mod.Migrator(
source_root=source,
target_root=target,
execute=True,
workspace_target=None,
overwrite=False,
migrate_secrets=False,
output_dir=target / "migration-report",
skill_conflict_mode="overwrite",
)
report = migrator.migrate()
assert existing_skill.joinpath("SKILL.md").read_text(encoding="utf-8").endswith("fresh\n")
backup_items = [item for item in report["items"] if item["kind"] == "skill" and item["status"] == "migrated"]
assert any(item["details"].get("backup") for item in backup_items)
def test_discord_settings_migrated(tmp_path: Path):
"""Discord bot token and allowlist migrate to .env."""
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
source.mkdir()
(source / "openclaw.json").write_text(
json.dumps({
"channels": {
"discord": {
"token": "discord-bot-token-123",
"allowFrom": ["111222333", "444555666"],
}
}
}),
encoding="utf-8",
)
migrator = mod.Migrator(
source_root=source, target_root=target, execute=True,
workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
selected_options={"discord-settings"},
)
report = migrator.migrate()
env_text = (target / ".env").read_text(encoding="utf-8")
assert "DISCORD_BOT_TOKEN=discord-bot-token-123" in env_text
assert "DISCORD_ALLOWED_USERS=111222333,444555666" in env_text
def test_slack_settings_migrated(tmp_path: Path):
"""Slack bot/app tokens and allowlist migrate to .env."""
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
source.mkdir()
(source / "openclaw.json").write_text(
json.dumps({
"channels": {
"slack": {
"botToken": "xoxb-slack-bot",
"appToken": "xapp-slack-app",
"allowFrom": ["U111", "U222"],
}
}
}),
encoding="utf-8",
)
migrator = mod.Migrator(
source_root=source, target_root=target, execute=True,
workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
selected_options={"slack-settings"},
)
report = migrator.migrate()
env_text = (target / ".env").read_text(encoding="utf-8")
assert "SLACK_BOT_TOKEN=xoxb-slack-bot" in env_text
assert "SLACK_APP_TOKEN=xapp-slack-app" in env_text
assert "SLACK_ALLOWED_USERS=U111,U222" in env_text
def test_signal_settings_migrated(tmp_path: Path):
"""Signal account, HTTP URL, and allowlist migrate to .env."""
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
source.mkdir()
(source / "openclaw.json").write_text(
json.dumps({
"channels": {
"signal": {
"account": "+15551234567",
"httpUrl": "http://localhost:8080",
"allowFrom": ["+15559876543"],
}
}
}),
encoding="utf-8",
)
migrator = mod.Migrator(
source_root=source, target_root=target, execute=True,
workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
selected_options={"signal-settings"},
)
report = migrator.migrate()
env_text = (target / ".env").read_text(encoding="utf-8")
assert "SIGNAL_ACCOUNT=+15551234567" in env_text
assert "SIGNAL_HTTP_URL=http://localhost:8080" in env_text
assert "SIGNAL_ALLOWED_USERS=+15559876543" in env_text
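The three messaging tests above share one pattern: a key under `channels.<platform>` maps to one `.env` variable, and list values are joined with commas. A table-driven sketch of that mapping follows. The source keys and variable names are the ones asserted in the tests; the helper names `channel_env_sketch` and `render_env` are hypothetical, and the real migrator may structure this differently.

```python
from typing import Any, Dict

# (channel, source key) -> target variable, taken from the assertions above.
CHANNEL_ENV_MAP = {
    ("discord", "token"): "DISCORD_BOT_TOKEN",
    ("discord", "allowFrom"): "DISCORD_ALLOWED_USERS",
    ("slack", "botToken"): "SLACK_BOT_TOKEN",
    ("slack", "appToken"): "SLACK_APP_TOKEN",
    ("slack", "allowFrom"): "SLACK_ALLOWED_USERS",
    ("signal", "account"): "SIGNAL_ACCOUNT",
    ("signal", "httpUrl"): "SIGNAL_HTTP_URL",
    ("signal", "allowFrom"): "SIGNAL_ALLOWED_USERS",
}


def channel_env_sketch(config: Dict[str, Any]) -> Dict[str, str]:
    """Collect the .env values implied by the channels section of a config."""
    env: Dict[str, str] = {}
    channels = config.get("channels") or {}
    for (channel, key), env_name in CHANNEL_ENV_MAP.items():
        value = (channels.get(channel) or {}).get(key)
        if not value:
            continue  # missing, empty string, or empty list
        if isinstance(value, list):
            value = ",".join(str(item) for item in value)
        env[env_name] = str(value)
    return env


def render_env(env: Dict[str, str]) -> str:
    """Render NAME=value lines in a stable order."""
    return "".join(f"{name}={value}\n" for name, value in sorted(env.items()))
```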
def test_model_config_migrated(tmp_path: Path):
"""Default model setting migrates to config.yaml."""
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
source.mkdir()
(source / "openclaw.json").write_text(
json.dumps({
"agents": {"defaults": {"model": "anthropic/claude-sonnet-4"}}
}),
encoding="utf-8",
)
# config.yaml must exist for YAML merge to work
(target / "config.yaml").write_text("model: openrouter/auto\n", encoding="utf-8")
migrator = mod.Migrator(
source_root=source, target_root=target, execute=True,
workspace_target=None, overwrite=True, migrate_secrets=False, output_dir=None,
selected_options={"model-config"},
)
report = migrator.migrate()
config_text = (target / "config.yaml").read_text(encoding="utf-8")
assert "anthropic/claude-sonnet-4" in config_text
def test_model_config_object_format(tmp_path: Path):
"""Model config handles {primary: ...} object format."""
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
source.mkdir()
(source / "openclaw.json").write_text(
json.dumps({
"agents": {"defaults": {"model": {"primary": "openai/gpt-4o"}}}
}),
encoding="utf-8",
)
(target / "config.yaml").write_text("model: old-model\n", encoding="utf-8")
migrator = mod.Migrator(
source_root=source, target_root=target, execute=True,
workspace_target=None, overwrite=True, migrate_secrets=False, output_dir=None,
selected_options={"model-config"},
)
report = migrator.migrate()
config_text = (target / "config.yaml").read_text(encoding="utf-8")
assert "openai/gpt-4o" in config_text
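The two model-config tests imply that `agents.defaults.model` may be either a plain string or an object with a `primary` key. One way to normalise both shapes is sketched below; `extract_default_model_sketch` is a hypothetical helper, not a function from the migration script.

```python
from typing import Any, Dict, Optional


def extract_default_model_sketch(config: Dict[str, Any]) -> Optional[str]:
    """Return the default model name from either supported shape, else None."""
    defaults = (config.get("agents") or {}).get("defaults") or {}
    model = defaults.get("model")
    if isinstance(model, str):
        return model.strip() or None
    if isinstance(model, dict):
        primary = model.get("primary")
        if isinstance(primary, str):
            return primary.strip() or None
    return None
```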
def test_tts_config_migrated(tmp_path: Path):
"""TTS provider and voice settings migrate to config.yaml."""
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
source.mkdir()
(source / "openclaw.json").write_text(
json.dumps({
"messages": {
"tts": {
"provider": "elevenlabs",
"elevenlabs": {
"voiceId": "custom-voice-id",
"modelId": "eleven_turbo_v2",
},
}
}
}),
encoding="utf-8",
)
(target / "config.yaml").write_text("tts:\n provider: edge\n", encoding="utf-8")
migrator = mod.Migrator(
source_root=source, target_root=target, execute=True,
workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
selected_options={"tts-config"},
)
report = migrator.migrate()
config_text = (target / "config.yaml").read_text(encoding="utf-8")
assert "elevenlabs" in config_text
assert "custom-voice-id" in config_text
def test_shared_skills_migrated(tmp_path: Path):
"""Shared skills from ~/.openclaw/skills/ are migrated."""
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
# Create a shared skill (not in workspace/skills/)
(source / "skills" / "my-shared-skill").mkdir(parents=True)
(source / "skills" / "my-shared-skill" / "SKILL.md").write_text(
"---\nname: my-shared-skill\ndescription: shared\n---\n\nbody\n",
encoding="utf-8",
)
migrator = mod.Migrator(
source_root=source, target_root=target, execute=True,
workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
selected_options={"shared-skills"},
)
report = migrator.migrate()
imported = target / "skills" / mod.SKILL_CATEGORY_DIRNAME / "my-shared-skill" / "SKILL.md"
assert imported.exists()
def test_daily_memory_merged(tmp_path: Path):
"""Daily memory notes from workspace/memory/*.md are merged into MEMORY.md."""
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
mem_dir = source / "workspace" / "memory"
mem_dir.mkdir(parents=True)
(mem_dir / "2026-03-01.md").write_text(
"# March 1 Notes\n\n- User prefers dark mode\n- Timezone: PST\n",
encoding="utf-8",
)
(mem_dir / "2026-03-02.md").write_text(
"# March 2 Notes\n\n- Working on migration project\n",
encoding="utf-8",
)
migrator = mod.Migrator(
source_root=source, target_root=target, execute=True,
workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
selected_options={"daily-memory"},
)
report = migrator.migrate()
mem_path = target / "memories" / "MEMORY.md"
assert mem_path.exists()
content = mem_path.read_text(encoding="utf-8")
assert "dark mode" in content
assert "migration project" in content
def test_provider_keys_require_migrate_secrets_flag(tmp_path: Path):
"""Provider keys migration is double-gated: needs option + --migrate-secrets."""
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
target.mkdir()
source.mkdir()
(source / "openclaw.json").write_text(
json.dumps({
"models": {
"providers": {
"openrouter": {
"apiKey": "sk-or-test-key",
"baseUrl": "https://openrouter.ai/api/v1",
}
}
}
}),
encoding="utf-8",
)
# Without --migrate-secrets: should skip
migrator = mod.Migrator(
source_root=source, target_root=target, execute=True,
workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
selected_options={"provider-keys"},
)
report = migrator.migrate()
env_path = target / ".env"
if env_path.exists():
assert "sk-or-test-key" not in env_path.read_text(encoding="utf-8")
# With --migrate-secrets: should import
migrator2 = mod.Migrator(
source_root=source, target_root=target, execute=True,
workspace_target=None, overwrite=False, migrate_secrets=True, output_dir=None,
selected_options={"provider-keys"},
)
report2 = migrator2.migrate()
env_text = (target / ".env").read_text(encoding="utf-8")
assert "OPENROUTER_API_KEY=sk-or-test-key" in env_text
def test_workspace_agents_records_skip_when_missing(tmp_path: Path):
"""Bug fix: workspace-agents records 'skipped' when source is missing."""
mod = load_module()
source = tmp_path / ".openclaw"
target = tmp_path / ".hermes"
source.mkdir()
target.mkdir()
migrator = mod.Migrator(
source_root=source, target_root=target, execute=True,
workspace_target=tmp_path / "workspace", overwrite=False, migrate_secrets=False, output_dir=None,
selected_options={"workspace-agents"},
)
report = migrator.migrate()
wa_items = [i for i in report["items"] if i["kind"] == "workspace-agents"]
assert len(wa_items) == 1
assert wa_items[0]["status"] == "skipped"
def test_skill_installs_cleanly_under_skills_guard():
skills_guard = load_skills_guard()
result = skills_guard.scan_skill(
SCRIPT_PATH.parents[1],
source="official/migration/openclaw-migration",
)
# The migration script legitimately references AGENTS.md (migrating
# workspace instructions), which triggers a false-positive
# agent_config_mod finding. Any of the three verdicts is therefore
# accepted here; the per-finding check below is what rejects a *real* threat.
assert result.verdict in ("safe", "caution", "dangerous"), f"Unexpected verdict: {result.verdict}"
# All findings should be the known false-positive for AGENTS.md
for f in result.findings:
assert f.pattern_id == "agent_config_mod", f"Unexpected finding: {f}"
+49
@@ -234,6 +234,55 @@ class TestHTTP413Compression:
mock_compress.assert_called_once()
assert result["completed"] is True
def test_context_length_retry_rebuilds_request_after_compression(self, agent):
"""Retry must send the compressed transcript, not the stale oversized payload."""
err_400 = Exception(
"Error code: 400 - {'error': {'message': "
"\"This endpoint's maximum context length is 128000 tokens. "
"Please reduce the length of the messages.\"}}"
)
err_400.status_code = 400
ok_resp = _mock_response(content="Recovered after real compression", finish_reason="stop")
request_payloads = []
def _side_effect(**kwargs):
request_payloads.append(kwargs)
if len(request_payloads) == 1:
raise err_400
return ok_resp
agent.client.chat.completions.create.side_effect = _side_effect
prefill = [
{"role": "user", "content": "previous question"},
{"role": "assistant", "content": "previous answer"},
]
with (
patch.object(agent, "_compress_context") as mock_compress,
patch.object(agent, "_persist_session"),
patch.object(agent, "_save_trajectory"),
patch.object(agent, "_cleanup_task_resources"),
):
mock_compress.return_value = (
[{"role": "user", "content": "compressed summary"}],
"compressed prompt",
)
result = agent.run_conversation("hello", conversation_history=prefill)
assert result["completed"] is True
assert len(request_payloads) == 2
assert len(request_payloads[1]["messages"]) < len(request_payloads[0]["messages"])
assert request_payloads[1]["messages"][0] == {
"role": "system",
"content": "compressed prompt",
}
assert request_payloads[1]["messages"][1] == {
"role": "user",
"content": "compressed summary",
}
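The property this test pins down is that the retry payload is rebuilt from the compressed transcript instead of reusing the request that just failed. A stripped-down sketch of such a loop is shown here. `send_with_compression_retry`, its parameters, and the string match on the error text are assumptions for illustration and do not reflect the agent's real control flow.

```python
from typing import Any, Callable, Dict, List, Tuple

Message = Dict[str, str]


def send_with_compression_retry(
    send: Callable[..., Any],
    compress: Callable[[List[Message], str], Tuple[List[Message], str]],
    system_prompt: str,
    messages: List[Message],
    max_retries: int = 1,
) -> Any:
    for attempt in range(max_retries + 1):
        # Rebuild the payload on every attempt so a retry never reuses stale data.
        payload = [{"role": "system", "content": system_prompt}] + list(messages)
        try:
            return send(messages=payload)
        except Exception as exc:
            too_long = "maximum context length" in str(exc)
            if not too_long or attempt == max_retries:
                raise
            # compress() returns (messages, system_prompt), as in the mock above.
            messages, system_prompt = compress(messages, system_prompt)
    raise RuntimeError("max_retries must be >= 0")
```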
def test_413_cannot_compress_further(self, agent):
"""When compression can't reduce messages, return partial result."""
err_413 = _make_413_error()
+33 -8
@@ -3,15 +3,15 @@ that only manifest at runtime (not in mocked unit tests)."""
import os
import sys
-from unittest.mock import patch
+from unittest.mock import MagicMock, patch
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
-def _make_cli(env_overrides=None, **kwargs):
+def _make_cli(env_overrides=None, config_overrides=None, **kwargs):
"""Create a HermesCLI instance with minimal mocking."""
-import cli as _cli_mod
-from cli import HermesCLI
+import importlib
_clean_config = {
"model": {
"default": "anthropic/claude-opus-4.6",
@@ -22,13 +22,34 @@ def _make_cli(env_overrides=None, **kwargs):
"agent": {},
"terminal": {"env_type": "local"},
}
if config_overrides:
_clean_config.update(config_overrides)
clean_env = {"LLM_MODEL": "", "HERMES_MAX_ITERATIONS": ""}
if env_overrides:
clean_env.update(env_overrides)
-with patch("cli.get_tool_definitions", return_value=[]), \
-patch.dict("os.environ", clean_env, clear=False), \
-patch.dict(_cli_mod.__dict__, {"CLI_CONFIG": _clean_config}):
-return HermesCLI(**kwargs)
+prompt_toolkit_stubs = {
+"prompt_toolkit": MagicMock(),
+"prompt_toolkit.history": MagicMock(),
+"prompt_toolkit.styles": MagicMock(),
+"prompt_toolkit.patch_stdout": MagicMock(),
+"prompt_toolkit.application": MagicMock(),
+"prompt_toolkit.layout": MagicMock(),
+"prompt_toolkit.layout.processors": MagicMock(),
+"prompt_toolkit.filters": MagicMock(),
+"prompt_toolkit.layout.dimension": MagicMock(),
+"prompt_toolkit.layout.menus": MagicMock(),
+"prompt_toolkit.widgets": MagicMock(),
+"prompt_toolkit.key_binding": MagicMock(),
+"prompt_toolkit.completion": MagicMock(),
+"prompt_toolkit.formatted_text": MagicMock(),
+}
+with patch.dict(sys.modules, prompt_toolkit_stubs), \
+patch.dict("os.environ", clean_env, clear=False):
+import cli as _cli_mod
+_cli_mod = importlib.reload(_cli_mod)
+with patch.object(_cli_mod, "get_tool_definitions", return_value=[]), \
+patch.dict(_cli_mod.__dict__, {"CLI_CONFIG": _clean_config}):
+return _cli_mod.HermesCLI(**kwargs)
class TestMaxTurnsResolution:
@@ -53,6 +74,10 @@ class TestMaxTurnsResolution:
cli_obj = _make_cli(env_overrides={"HERMES_MAX_ITERATIONS": "42"})
assert cli_obj.max_turns == 42
def test_legacy_root_max_turns_is_used_when_agent_key_exists_without_value(self):
cli_obj = _make_cli(config_overrides={"agent": {}, "max_turns": 77})
assert cli_obj.max_turns == 77
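The `max_turns` tests in this class describe a fallback chain that must never yield `None`. A sketch of one such chain is given below. The precedence order (explicit argument, then `agent.max_turns`, then the legacy root `max_turns`, then `HERMES_MAX_ITERATIONS`) and the default of 60 are assumptions for illustration; only the legacy-root and environment cases are confirmed by the tests shown here.

```python
import os
from typing import Any, Dict, Mapping, Optional

DEFAULT_MAX_TURNS = 60  # hypothetical default for this sketch


def resolve_max_turns_sketch(
    cli_arg: Optional[int],
    config: Dict[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Return the first configured value in the assumed precedence order."""
    env = os.environ if env is None else env
    if cli_arg is not None:
        return int(cli_arg)
    agent_cfg = config.get("agent") or {}
    if agent_cfg.get("max_turns"):
        return int(agent_cfg["max_turns"])
    # Legacy location: a root-level key, used when "agent" exists but is empty.
    if config.get("max_turns"):
        return int(config["max_turns"])
    if env.get("HERMES_MAX_ITERATIONS"):
        return int(env["HERMES_MAX_ITERATIONS"])
    return DEFAULT_MAX_TURNS
```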
def test_max_turns_never_none_for_agent(self):
"""The value passed to AIAgent must never be None (causes TypeError in run_conversation)."""
cli = _make_cli()
+85
@@ -0,0 +1,85 @@
"""Tests for agent/display.py — build_tool_preview()."""
import pytest
from agent.display import build_tool_preview
class TestBuildToolPreview:
"""Tests for build_tool_preview defensive handling and normal operation."""
def test_none_args_returns_none(self):
"""PR #453: None args should not crash, should return None."""
assert build_tool_preview("terminal", None) is None
def test_empty_dict_returns_none(self):
"""Empty dict has no keys to preview."""
assert build_tool_preview("terminal", {}) is None
def test_known_tool_with_primary_arg(self):
"""Known tool with its primary arg should return a preview string."""
result = build_tool_preview("terminal", {"command": "ls -la"})
assert result is not None
assert "ls -la" in result
def test_web_search_preview(self):
result = build_tool_preview("web_search", {"query": "hello world"})
assert result is not None
assert "hello world" in result
def test_read_file_preview(self):
result = build_tool_preview("read_file", {"path": "/tmp/test.py", "offset": 1})
assert result is not None
assert "/tmp/test.py" in result
def test_unknown_tool_with_fallback_key(self):
"""Unknown tool but with a recognized fallback key should still preview."""
result = build_tool_preview("custom_tool", {"query": "test query"})
assert result is not None
assert "test query" in result
def test_unknown_tool_no_matching_key(self):
"""Unknown tool with no recognized keys should return None."""
result = build_tool_preview("custom_tool", {"foo": "bar"})
assert result is None
def test_long_value_truncated(self):
"""Preview should truncate long values."""
long_cmd = "a" * 100
result = build_tool_preview("terminal", {"command": long_cmd}, max_len=40)
assert result is not None
assert len(result) <= 43 # max_len + "..."
def test_process_tool_with_none_args(self):
"""Process tool special case should also handle None args."""
assert build_tool_preview("process", None) is None
def test_process_tool_normal(self):
result = build_tool_preview("process", {"action": "poll", "session_id": "abc123"})
assert result is not None
assert "poll" in result
def test_todo_tool_read(self):
result = build_tool_preview("todo", {"merge": False})
assert result is not None
assert "reading" in result
def test_todo_tool_with_todos(self):
result = build_tool_preview("todo", {"todos": [{"id": "1", "content": "test", "status": "pending"}]})
assert result is not None
assert "1 task" in result
def test_memory_tool_add(self):
result = build_tool_preview("memory", {"action": "add", "target": "user", "content": "test note"})
assert result is not None
assert "user" in result
def test_session_search_preview(self):
result = build_tool_preview("session_search", {"query": "find something"})
assert result is not None
assert "find something" in result
def test_false_like_args_zero(self):
"""Non-dict falsy values should return None, not crash."""
assert build_tool_preview("terminal", 0) is None
assert build_tool_preview("terminal", "") is None
assert build_tool_preview("terminal", []) is None
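The defensive handling these tests describe (non-dict or empty arguments return `None`, long values are truncated with a trailing `...`) can be sketched as follows. This is not the code in `agent/display.py`: the lookup tables and the name `preview_sketch` are assumptions, and the special cases for `process`, `todo`, and `memory` are left out.

```python
from typing import Any, Optional

# Hypothetical lookup tables for illustration.
PRIMARY_ARGS = {"terminal": "command", "web_search": "query", "read_file": "path"}
FALLBACK_KEYS = ("query", "command", "path")


def preview_sketch(tool_name: str, args: Any, max_len: int = 40) -> Optional[str]:
    # Guard first: None, 0, "", [] and {} all mean there is nothing to preview.
    if not isinstance(args, dict) or not args:
        return None
    key = PRIMARY_ARGS.get(tool_name)
    if key is None or key not in args:
        key = next((k for k in FALLBACK_KEYS if k in args), None)
    if key is None:
        return None
    value = str(args[key])
    if len(value) > max_len:
        value = value[:max_len] + "..."
    return value
```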
+86 -1
@@ -94,13 +94,50 @@ class TestMessageStorage:
session = db.get_session("s1")
assert session["message_count"] == 2
-def test_tool_message_increments_tool_count(self, db):
+def test_tool_response_does_not_increment_tool_count(self, db):
"""Tool responses (role=tool) should not increment tool_call_count.
Only assistant messages with tool_calls should count.
"""
db.create_session(session_id="s1", source="cli")
db.append_message("s1", role="tool", content="result", tool_name="web_search")
session = db.get_session("s1")
assert session["tool_call_count"] == 0
def test_assistant_tool_calls_increment_by_count(self, db):
"""An assistant message with N tool_calls should increment by N."""
db.create_session(session_id="s1", source="cli")
tool_calls = [
{"id": "call_1", "function": {"name": "web_search", "arguments": "{}"}},
]
db.append_message("s1", role="assistant", content="", tool_calls=tool_calls)
session = db.get_session("s1")
assert session["tool_call_count"] == 1
def test_tool_call_count_matches_actual_calls(self, db):
"""tool_call_count should equal the number of tool calls made, not messages."""
db.create_session(session_id="s1", source="cli")
# Assistant makes 2 parallel tool calls in one message
tool_calls = [
{"id": "call_1", "function": {"name": "ha_call_service", "arguments": "{}"}},
{"id": "call_2", "function": {"name": "ha_call_service", "arguments": "{}"}},
]
db.append_message("s1", role="assistant", content="", tool_calls=tool_calls)
# Two tool responses come back
db.append_message("s1", role="tool", content="ok", tool_name="ha_call_service")
db.append_message("s1", role="tool", content="ok", tool_name="ha_call_service")
session = db.get_session("s1")
# Should be 2 (the actual number of tool calls), not 3
assert session["tool_call_count"] == 2, (
f"Expected 2 tool calls but got {session['tool_call_count']}. "
"tool responses are double-counted and multi-call messages are under-counted"
)
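The counting rule stated in these tests (count the entries in each assistant message's `tool_calls`, ignore `role="tool"` responses) reduces to a few lines. The helper below is a hypothetical illustration that works on plain message dicts; the real `SessionDB` keeps the counter in its database instead.

```python
from typing import Any, Dict, List


def count_tool_calls_sketch(messages: List[Dict[str, Any]]) -> int:
    """Count tool calls issued by the assistant, not tool responses."""
    total = 0
    for message in messages:
        if message.get("role") != "assistant":
            continue  # tool responses and user turns never add to the count
        total += len(message.get("tool_calls") or [])
    return total
```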
def test_tool_calls_serialization(self, db):
db.create_session(session_id="s1", source="cli")
tool_calls = [{"id": "call_1", "function": {"name": "web_search", "arguments": "{}"}}]
@@ -179,6 +216,54 @@ class TestFTS5Search:
assert isinstance(results[0]["context"], list)
assert len(results[0]["context"]) > 0
def test_search_special_chars_do_not_crash(self, db):
"""FTS5 special characters in queries must not raise OperationalError."""
db.create_session(session_id="s1", source="cli")
db.append_message("s1", role="user", content="How do I use C++ templates?")
# Each of these previously caused sqlite3.OperationalError
dangerous_queries = [
'C++', # + is the FTS5 phrase-concatenation operator
'"unterminated', # unbalanced double-quote
'(problem', # unbalanced parenthesis
'hello AND', # dangling boolean operator
'***', # repeated wildcard
'{test}', # curly braces (column reference)
'OR hello', # leading boolean operator
'a AND OR b', # adjacent operators
]
for query in dangerous_queries:
# Must not raise — should return list (possibly empty)
results = db.search_messages(query)
assert isinstance(results, list), f"Query {query!r} did not return a list"
def test_search_sanitized_query_still_finds_content(self, db):
"""Sanitization must not break normal keyword search."""
db.create_session(session_id="s1", source="cli")
db.append_message("s1", role="user", content="Learning C++ templates today")
# "C++" sanitized to "C" should still match "C++"
results = db.search_messages("C++")
# The word "C" appears in the content, so FTS5 should find it
assert isinstance(results, list)
def test_sanitize_fts5_query_strips_dangerous_chars(self):
"""Unit test for _sanitize_fts5_query static method."""
from hermes_state import SessionDB
s = SessionDB._sanitize_fts5_query
assert s('hello world') == 'hello world'
assert '+' not in s('C++')
assert '"' not in s('"unterminated')
assert '(' not in s('(problem')
assert '{' not in s('{test}')
# Dangling operators removed
assert s('hello AND') == 'hello'
assert s('OR world') == 'world'
# Leading bare * removed
assert s('***') == ''
# Valid prefix kept
assert s('deploy*') == 'deploy*'
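A sanitizer that satisfies the assertions above might look like the sketch below. It is not the body of `SessionDB._sanitize_fts5_query`, which this diff does not show; the regular expressions and the rule for adjacent operators are assumptions chosen to meet the listed cases.

```python
import re

_OPERATORS = {"AND", "OR", "NOT"}


def sanitize_fts5_query_sketch(query: str) -> str:
    # Replace characters that have syntactic meaning in FTS5 queries.
    cleaned = re.sub(r'[+"(){}^]', " ", query)
    # Collapse runs of '*' and keep a '*' only when it directly follows a word
    # character, i.e. when it forms a prefix query such as "deploy*".
    cleaned = re.sub(r"\*+", "*", cleaned)
    cleaned = re.sub(r"(?<!\w)\*", " ", cleaned)
    tokens = []
    for token in cleaned.split():
        # Drop an operator that starts the query or follows another operator.
        if token in _OPERATORS and (not tokens or tokens[-1] in _OPERATORS):
            continue
        tokens.append(token)
    # Drop operators left dangling at the end.
    while tokens and tokens[-1] in _OPERATORS:
        tokens.pop()
    return " ".join(tokens)
```

An empty result means the query held nothing searchable, so a caller would likely return an empty list without running a `MATCH` at all.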
# =========================================================================
# Session search and listing
+37 -2
@@ -601,7 +601,10 @@ class TestExecuteToolCalls:
messages = []
with patch("run_agent.handle_function_call", return_value="search result") as mock_hfc:
agent._execute_tool_calls(mock_msg, messages, "task-1")
-mock_hfc.assert_called_once_with("web_search", {"q": "test"}, "task-1")
+# enabled_tools passes the agent's own valid_tool_names
+args, kwargs = mock_hfc.call_args
+assert args[:3] == ("web_search", {"q": "test"}, "task-1")
+assert set(kwargs.get("enabled_tools", [])) == agent.valid_tool_names
assert len(messages) == 1
assert messages[0]["role"] == "tool"
assert "search result" in messages[0]["content"]
@@ -627,7 +630,9 @@ class TestExecuteToolCalls:
with patch("run_agent.handle_function_call", return_value="ok") as mock_hfc:
agent._execute_tool_calls(mock_msg, messages, "task-1")
# Invalid JSON args should fall back to empty dict
-mock_hfc.assert_called_once_with("web_search", {}, "task-1")
+args, kwargs = mock_hfc.call_args
+assert args[:3] == ("web_search", {}, "task-1")
+assert set(kwargs.get("enabled_tools", [])) == agent.valid_tool_names
assert len(messages) == 1
assert messages[0]["role"] == "tool"
assert messages[0]["tool_call_id"] == "c1"
@@ -829,6 +834,36 @@ class TestRunConversation:
assert result["final_response"] == "All done"
assert result["completed"] is True
@pytest.mark.parametrize(
("first_content", "second_content", "expected_final"),
[
("Part 1 ", "Part 2", "Part 1 Part 2"),
("<think>internal reasoning</think>", "Recovered final answer", "Recovered final answer"),
],
)
def test_length_finish_reason_requests_continuation(
self, agent, first_content, second_content, expected_final
):
self._setup_agent(agent)
first = _mock_response(content=first_content, finish_reason="length")
second = _mock_response(content=second_content, finish_reason="stop")
agent.client.chat.completions.create.side_effect = [first, second]
with (
patch.object(agent, "_persist_session"),
patch.object(agent, "_save_trajectory"),
patch.object(agent, "_cleanup_task_resources"),
):
result = agent.run_conversation("hello")
assert result["completed"] is True
assert result["api_calls"] == 2
assert result["final_response"] == expected_final
second_call_messages = agent.client.chat.completions.create.call_args_list[1].kwargs["messages"]
assert second_call_messages[-1]["role"] == "user"
assert "truncated by the output length limit" in second_call_messages[-1]["content"]
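The parametrised test above covers two things: a response cut off with `finish_reason="length"` triggers a follow-up request, and `<think>` blocks are excluded from the final text. A self-contained sketch of that loop follows. The function name, the `send` callback shape, and the exact wording of the continuation prompt are assumptions; only the phrase "truncated by the output length limit" comes from the test.

```python
import re
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]

CONTINUE_PROMPT = (
    "Your previous response was truncated by the output length limit. "
    "Continue exactly where you left off."
)


def complete_with_continuation(
    send: Callable[[List[Message]], Tuple[str, str]],
    messages: List[Message],
    max_continuations: int = 3,
) -> str:
    """Call send() until the model stops for a reason other than length."""
    parts: List[str] = []
    history = list(messages)
    for _ in range(max_continuations + 1):
        content, finish_reason = send(history)
        # Keep only the visible text; reasoning blocks are not part of the answer.
        parts.append(re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL))
        if finish_reason != "length":
            break
        history = history + [
            {"role": "assistant", "content": content},
            {"role": "user", "content": CONTINUE_PROMPT},
        ]
    return "".join(parts)
```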
class TestRetryExhaustion:
"""Regression: retry_count > max_retries was dead code (off-by-one).
+23
@@ -158,6 +158,29 @@ def test_custom_endpoint_auto_provider_prefers_openai_key(monkeypatch):
assert resolved["api_key"] == "sk-vllm-key"
def test_resolve_runtime_provider_nous_api(monkeypatch):
"""Nous Portal API key provider resolves via the api_key path."""
monkeypatch.setattr(rp, "resolve_provider", lambda *a, **k: "nous-api")
monkeypatch.setattr(
rp,
"resolve_api_key_provider_credentials",
lambda pid: {
"provider": "nous-api",
"api_key": "nous-test-key",
"base_url": "https://inference-api.nousresearch.com/v1",
"source": "NOUS_API_KEY",
},
)
resolved = rp.resolve_runtime_provider(requested="nous-api")
assert resolved["provider"] == "nous-api"
assert resolved["api_mode"] == "chat_completions"
assert resolved["base_url"] == "https://inference-api.nousresearch.com/v1"
assert resolved["api_key"] == "nous-test-key"
assert resolved["requested_provider"] == "nous-api"
def test_resolve_requested_provider_precedence(monkeypatch):
monkeypatch.setenv("HERMES_INFERENCE_PROVIDER", "nous")
monkeypatch.setattr(rp, "_get_model_config", lambda: {"provider": "openai-codex"})
+1 -1
@@ -136,7 +136,7 @@ class TestToolsetConsistency:
def test_hermes_platforms_share_core_tools(self):
"""All hermes-* platform toolsets should have the same tools."""
-platforms = ["hermes-cli", "hermes-telegram", "hermes-discord", "hermes-whatsapp", "hermes-slack"]
+platforms = ["hermes-cli", "hermes-telegram", "hermes-discord", "hermes-whatsapp", "hermes-slack", "hermes-signal", "hermes-homeassistant"]
tool_sets = [set(TOOLSETS[p]["tools"]) for p in platforms]
# All platform toolsets should be identical
for ts in tool_sets[1:]:
+385
@@ -0,0 +1,385 @@
"""Tests for tools/checkpoint_manager.py — CheckpointManager."""
import os
import json
import shutil
import pytest
from pathlib import Path
from unittest.mock import patch
from tools.checkpoint_manager import (
CheckpointManager,
_shadow_repo_path,
_init_shadow_repo,
_run_git,
_git_env,
_dir_file_count,
format_checkpoint_list,
DEFAULT_EXCLUDES,
CHECKPOINT_BASE,
)
# =========================================================================
# Fixtures
# =========================================================================
@pytest.fixture()
def work_dir(tmp_path):
"""Temporary working directory."""
d = tmp_path / "project"
d.mkdir()
(d / "main.py").write_text("print('hello')\\n")
(d / "README.md").write_text("# Project\\n")
return d
@pytest.fixture()
def checkpoint_base(tmp_path):
"""Isolated checkpoint base — never writes to ~/.hermes/."""
return tmp_path / "checkpoints"
@pytest.fixture()
def mgr(work_dir, checkpoint_base, monkeypatch):
"""CheckpointManager with redirected checkpoint base."""
monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
return CheckpointManager(enabled=True, max_snapshots=50)
@pytest.fixture()
def disabled_mgr(checkpoint_base, monkeypatch):
"""Disabled CheckpointManager."""
monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
return CheckpointManager(enabled=False)
# =========================================================================
# Shadow repo path
# =========================================================================
class TestShadowRepoPath:
def test_deterministic(self, work_dir, checkpoint_base, monkeypatch):
monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
p1 = _shadow_repo_path(str(work_dir))
p2 = _shadow_repo_path(str(work_dir))
assert p1 == p2
def test_different_dirs_different_paths(self, tmp_path, checkpoint_base, monkeypatch):
monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
p1 = _shadow_repo_path(str(tmp_path / "a"))
p2 = _shadow_repo_path(str(tmp_path / "b"))
assert p1 != p2
def test_under_checkpoint_base(self, work_dir, checkpoint_base, monkeypatch):
monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
p = _shadow_repo_path(str(work_dir))
assert str(p).startswith(str(checkpoint_base))
# =========================================================================
# Shadow repo init
# =========================================================================
class TestShadowRepoInit:
def test_creates_git_repo(self, work_dir, checkpoint_base, monkeypatch):
monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
shadow = _shadow_repo_path(str(work_dir))
err = _init_shadow_repo(shadow, str(work_dir))
assert err is None
assert (shadow / "HEAD").exists()
def test_no_git_in_project_dir(self, work_dir, checkpoint_base, monkeypatch):
monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
shadow = _shadow_repo_path(str(work_dir))
_init_shadow_repo(shadow, str(work_dir))
assert not (work_dir / ".git").exists()
def test_has_exclude_file(self, work_dir, checkpoint_base, monkeypatch):
monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
shadow = _shadow_repo_path(str(work_dir))
_init_shadow_repo(shadow, str(work_dir))
exclude = shadow / "info" / "exclude"
assert exclude.exists()
content = exclude.read_text()
assert "node_modules/" in content
assert ".env" in content
def test_has_workdir_file(self, work_dir, checkpoint_base, monkeypatch):
monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
shadow = _shadow_repo_path(str(work_dir))
_init_shadow_repo(shadow, str(work_dir))
workdir_file = shadow / "HERMES_WORKDIR"
assert workdir_file.exists()
assert str(work_dir.resolve()) in workdir_file.read_text()
def test_idempotent(self, work_dir, checkpoint_base, monkeypatch):
monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
shadow = _shadow_repo_path(str(work_dir))
err1 = _init_shadow_repo(shadow, str(work_dir))
err2 = _init_shadow_repo(shadow, str(work_dir))
assert err1 is None
assert err2 is None
# =========================================================================
# CheckpointManager — disabled
# =========================================================================
class TestDisabledManager:
def test_ensure_checkpoint_returns_false(self, disabled_mgr, work_dir):
assert disabled_mgr.ensure_checkpoint(str(work_dir)) is False
def test_new_turn_works(self, disabled_mgr):
disabled_mgr.new_turn() # should not raise
# =========================================================================
# CheckpointManager — taking checkpoints
# =========================================================================
class TestTakeCheckpoint:
def test_first_checkpoint(self, mgr, work_dir):
result = mgr.ensure_checkpoint(str(work_dir), "initial")
assert result is True
def test_dedup_same_turn(self, mgr, work_dir):
r1 = mgr.ensure_checkpoint(str(work_dir), "first")
r2 = mgr.ensure_checkpoint(str(work_dir), "second")
assert r1 is True
assert r2 is False # dedup'd
def test_new_turn_resets_dedup(self, mgr, work_dir):
r1 = mgr.ensure_checkpoint(str(work_dir), "turn 1")
assert r1 is True
mgr.new_turn()
# Modify a file so there's something to commit
(work_dir / "main.py").write_text("print('modified')\\n")
r2 = mgr.ensure_checkpoint(str(work_dir), "turn 2")
assert r2 is True
def test_no_changes_skips_commit(self, mgr, work_dir):
# First checkpoint
mgr.ensure_checkpoint(str(work_dir), "initial")
mgr.new_turn()
# No file changes — should return False (nothing to commit)
r = mgr.ensure_checkpoint(str(work_dir), "no changes")
assert r is False
def test_skip_root_dir(self, mgr):
r = mgr.ensure_checkpoint("/", "root")
assert r is False
def test_skip_home_dir(self, mgr):
r = mgr.ensure_checkpoint(str(Path.home()), "home")
assert r is False
# =========================================================================
# CheckpointManager — listing checkpoints
# =========================================================================
class TestListCheckpoints:
def test_empty_when_no_checkpoints(self, mgr, work_dir):
result = mgr.list_checkpoints(str(work_dir))
assert result == []
def test_list_after_take(self, mgr, work_dir):
mgr.ensure_checkpoint(str(work_dir), "test checkpoint")
result = mgr.list_checkpoints(str(work_dir))
assert len(result) == 1
assert result[0]["reason"] == "test checkpoint"
assert "hash" in result[0]
assert "short_hash" in result[0]
assert "timestamp" in result[0]
def test_multiple_checkpoints_ordered(self, mgr, work_dir):
mgr.ensure_checkpoint(str(work_dir), "first")
mgr.new_turn()
(work_dir / "main.py").write_text("v2\\n")
mgr.ensure_checkpoint(str(work_dir), "second")
mgr.new_turn()
(work_dir / "main.py").write_text("v3\\n")
mgr.ensure_checkpoint(str(work_dir), "third")
result = mgr.list_checkpoints(str(work_dir))
assert len(result) == 3
# Most recent first
assert result[0]["reason"] == "third"
assert result[2]["reason"] == "first"
# =========================================================================
# CheckpointManager — restoring
# =========================================================================
class TestRestore:
def test_restore_to_previous(self, mgr, work_dir):
# Write original content
(work_dir / "main.py").write_text("original\\n")
mgr.ensure_checkpoint(str(work_dir), "original state")
mgr.new_turn()
# Modify the file
(work_dir / "main.py").write_text("modified\\n")
# Get the checkpoint hash
checkpoints = mgr.list_checkpoints(str(work_dir))
assert len(checkpoints) == 1
# Restore
result = mgr.restore(str(work_dir), checkpoints[0]["hash"])
assert result["success"] is True
# File should be back to original
assert (work_dir / "main.py").read_text() == "original\\n"
def test_restore_invalid_hash(self, mgr, work_dir):
mgr.ensure_checkpoint(str(work_dir), "initial")
result = mgr.restore(str(work_dir), "deadbeef1234")
assert result["success"] is False
def test_restore_no_checkpoints(self, mgr, work_dir):
result = mgr.restore(str(work_dir), "abc123")
assert result["success"] is False
def test_restore_creates_pre_rollback_snapshot(self, mgr, work_dir):
(work_dir / "main.py").write_text("v1\\n")
mgr.ensure_checkpoint(str(work_dir), "v1")
mgr.new_turn()
(work_dir / "main.py").write_text("v2\\n")
checkpoints = mgr.list_checkpoints(str(work_dir))
mgr.restore(str(work_dir), checkpoints[0]["hash"])
# Should now have 2 checkpoints: original + pre-rollback
all_cps = mgr.list_checkpoints(str(work_dir))
assert len(all_cps) >= 2
assert "pre-rollback" in all_cps[0]["reason"]
# =========================================================================
# CheckpointManager — working dir resolution
# =========================================================================
class TestWorkingDirResolution:
def test_resolves_git_project_root(self, tmp_path):
mgr = CheckpointManager(enabled=True)
project = tmp_path / "myproject"
project.mkdir()
(project / ".git").mkdir()
subdir = project / "src"
subdir.mkdir()
filepath = subdir / "main.py"
filepath.write_text("x\\n")
result = mgr.get_working_dir_for_path(str(filepath))
assert result == str(project)
def test_resolves_pyproject_root(self, tmp_path):
mgr = CheckpointManager(enabled=True)
project = tmp_path / "pyproj"
project.mkdir()
(project / "pyproject.toml").write_text("[project]\\n")
subdir = project / "src"
subdir.mkdir()
result = mgr.get_working_dir_for_path(str(subdir / "file.py"))
assert result == str(project)
def test_falls_back_to_parent(self, tmp_path):
mgr = CheckpointManager(enabled=True)
filepath = tmp_path / "random" / "file.py"
filepath.parent.mkdir(parents=True)
filepath.write_text("x\\n")
result = mgr.get_working_dir_for_path(str(filepath))
assert result == str(filepath.parent)
# =========================================================================
# Git env isolation
# =========================================================================
class TestGitEnvIsolation:
def test_sets_git_dir(self, tmp_path):
shadow = tmp_path / "shadow"
env = _git_env(shadow, str(tmp_path / "work"))
assert env["GIT_DIR"] == str(shadow)
def test_sets_work_tree(self, tmp_path):
shadow = tmp_path / "shadow"
work = tmp_path / "work"
env = _git_env(shadow, str(work))
assert env["GIT_WORK_TREE"] == str(work.resolve())
def test_clears_index_file(self, tmp_path, monkeypatch):
monkeypatch.setenv("GIT_INDEX_FILE", "/some/index")
shadow = tmp_path / "shadow"
env = _git_env(shadow, str(tmp_path))
assert "GIT_INDEX_FILE" not in env
# =========================================================================
# format_checkpoint_list
# =========================================================================
class TestFormatCheckpointList:
def test_empty_list(self):
result = format_checkpoint_list([], "/some/dir")
assert "No checkpoints" in result
def test_formats_entries(self):
cps = [
{"hash": "abc123", "short_hash": "abc1", "timestamp": "2026-03-09T21:15:00-07:00", "reason": "before write_file"},
{"hash": "def456", "short_hash": "def4", "timestamp": "2026-03-09T21:10:00-07:00", "reason": "before patch"},
]
result = format_checkpoint_list(cps, "/home/user/project")
assert "abc1" in result
assert "def4" in result
assert "before write_file" in result
assert "/rollback" in result
# =========================================================================
# File count guard
# =========================================================================
class TestDirFileCount:
def test_counts_files(self, work_dir):
count = _dir_file_count(str(work_dir))
assert count >= 2 # main.py + README.md
def test_nonexistent_dir(self, tmp_path):
count = _dir_file_count(str(tmp_path / "nonexistent"))
assert count == 0
# =========================================================================
# Error resilience
# =========================================================================
class TestErrorResilience:
def test_no_git_installed(self, work_dir, checkpoint_base, monkeypatch):
monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
mgr = CheckpointManager(enabled=True)
# Mock git not found
monkeypatch.setattr("shutil.which", lambda x: None)
mgr._git_available = None # reset lazy probe
result = mgr.ensure_checkpoint(str(work_dir), "test")
assert result is False
def test_checkpoint_failure_does_not_raise(self, mgr, work_dir, monkeypatch):
"""Checkpoint failures should never raise — they're silently logged."""
def broken_run_git(*args, **kwargs):
raise OSError("git exploded")
monkeypatch.setattr("tools.checkpoint_manager._run_git", broken_run_git)
# Should not raise
result = mgr.ensure_checkpoint(str(work_dir), "test")
assert result is False
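The dedup and error-resilience tests above pin down a small contract: at most one snapshot per directory per turn, a reset on `new_turn()`, and `False` instead of an exception when the snapshot fails. A minimal sketch of that contract follows. `TurnDedup` and its `take_snapshot` callback are hypothetical names used for illustration only; this is not the actual `CheckpointManager` implementation.

```python
from typing import Callable, List, Set


class TurnDedup:
    """Hypothetical helper: allow one snapshot attempt per directory per turn."""

    def __init__(self, take_snapshot: Callable[[str], bool]) -> None:
        self._take_snapshot = take_snapshot
        self._seen_this_turn: Set[str] = set()

    def new_turn(self) -> None:
        # Forget which directories were snapshotted in the previous turn.
        self._seen_this_turn.clear()

    def ensure_checkpoint(self, working_dir: str) -> bool:
        if working_dir in self._seen_this_turn:
            return False  # already handled this turn
        self._seen_this_turn.add(working_dir)
        try:
            return bool(self._take_snapshot(working_dir))
        except Exception:
            # Snapshot failures are swallowed so the caller never crashes.
            return False


calls: List[str] = []
dedup = TurnDedup(lambda d: calls.append(d) is None)  # stand-in snapshot function

first = dedup.ensure_checkpoint("/tmp/project")   # snapshot taken
second = dedup.ensure_checkpoint("/tmp/project")  # skipped: same turn
dedup.new_turn()
third = dedup.ensure_checkpoint("/tmp/project")   # snapshot taken again
```

Marking the directory as seen before attempting the snapshot means a failing snapshot is not retried within the same turn. That is a design choice of this sketch, not something the tests require.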
+45
@@ -558,6 +558,51 @@ class TestConvertToPng:
assert result is True
assert dest.exists() and dest.stat().st_size > 0
def test_imagemagick_failure_preserves_original(self, tmp_path):
"""When ImageMagick convert fails, the original file must not be lost."""
dest = tmp_path / "img.png"
original_data = FAKE_BMP
dest.write_bytes(original_data)
def fake_run_fail(cmd, **kw):
# Simulate convert failing without producing output
return MagicMock(returncode=1)
with patch.dict(sys.modules, {"PIL": None, "PIL.Image": None}):
with patch("hermes_cli.clipboard.subprocess.run", side_effect=fake_run_fail):
_convert_to_png(dest)
# Original file must still exist with original content
assert dest.exists(), "Original file was lost after failed conversion"
assert dest.read_bytes() == original_data
def test_imagemagick_not_installed_preserves_original(self, tmp_path):
"""When ImageMagick is not installed, the original file must not be lost."""
dest = tmp_path / "img.png"
original_data = FAKE_BMP
dest.write_bytes(original_data)
with patch.dict(sys.modules, {"PIL": None, "PIL.Image": None}):
with patch("hermes_cli.clipboard.subprocess.run", side_effect=FileNotFoundError):
_convert_to_png(dest)
assert dest.exists(), "Original file was lost when ImageMagick not installed"
assert dest.read_bytes() == original_data
def test_imagemagick_timeout_preserves_original(self, tmp_path):
"""When ImageMagick times out, the original file must not be lost."""
import subprocess
dest = tmp_path / "img.png"
original_data = FAKE_BMP
dest.write_bytes(original_data)
with patch.dict(sys.modules, {"PIL": None, "PIL.Image": None}):
with patch("hermes_cli.clipboard.subprocess.run", side_effect=subprocess.TimeoutExpired("convert", 5)):
_convert_to_png(dest)
assert dest.exists(), "Original file was lost after timeout"
assert dest.read_bytes() == original_data
# ── has_clipboard_image dispatch ─────────────────────────────────────────
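The three tests above require that a failed, missing, or timed-out converter never destroys the original file. One common way to get that guarantee is to convert into a temporary file and swap it in only on success. The sketch below assumes a converter invoked as `command + [src, dst]`; `convert_in_place` is a hypothetical name and is not the actual `_convert_to_png` implementation.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import List


def convert_in_place(path: Path, command: List[str], timeout: float = 5.0) -> bool:
    """Run `command + [src, tmp]`; replace `path` only if output was produced."""
    fd, tmp_name = tempfile.mkstemp(suffix=path.suffix, dir=str(path.parent))
    os.close(fd)
    tmp = Path(tmp_name)
    try:
        result = subprocess.run(
            command + [str(path), str(tmp)],
            capture_output=True,
            timeout=timeout,
        )
        if result.returncode == 0 and tmp.exists() and tmp.stat().st_size > 0:
            os.replace(tmp, path)  # atomic swap on the same filesystem
            return True
    except (subprocess.TimeoutExpired, OSError):
        # OSError covers FileNotFoundError (converter not installed).
        pass
    finally:
        if tmp.exists():
            tmp.unlink()
    return False


with tempfile.TemporaryDirectory() as d:
    img = Path(d) / "img.png"
    img.write_bytes(b"BM-original")
    # Stand-in for a converter that exits non-zero without writing output.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    ok = convert_in_place(img, failing)
    preserved = img.read_bytes() == b"BM-original"
```

Because the original is touched only by the final `os.replace`, every failure path leaves it byte-for-byte intact.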
+351 -1
@@ -12,17 +12,21 @@ Run with: python -m pytest tests/test_code_execution.py -v
"""
import json
import os
import sys
import time
import threading
import unittest
-from unittest.mock import patch
+from unittest.mock import patch, MagicMock
from tools.code_execution_tool import (
SANDBOX_ALLOWED_TOOLS,
execute_code,
generate_hermes_tools_module,
check_sandbox_requirements,
build_execute_code_schema,
EXECUTE_CODE_SCHEMA,
_TOOL_DOC_LINES,
)
@@ -393,5 +397,351 @@ class TestStubSchemaDrift(unittest.TestCase):
self.assertIn("mode", src)
# ---------------------------------------------------------------------------
# build_execute_code_schema
# ---------------------------------------------------------------------------
class TestBuildExecuteCodeSchema(unittest.TestCase):
"""Tests for build_execute_code_schema — the dynamic schema generator."""
def test_default_includes_all_tools(self):
schema = build_execute_code_schema()
desc = schema["description"]
for name, _ in _TOOL_DOC_LINES:
self.assertIn(name, desc, f"Default schema should mention '{name}'")
def test_schema_structure(self):
schema = build_execute_code_schema()
self.assertEqual(schema["name"], "execute_code")
self.assertIn("parameters", schema)
self.assertIn("code", schema["parameters"]["properties"])
self.assertEqual(schema["parameters"]["required"], ["code"])
def test_subset_only_lists_enabled_tools(self):
enabled = {"terminal", "read_file"}
schema = build_execute_code_schema(enabled)
desc = schema["description"]
self.assertIn("terminal(", desc)
self.assertIn("read_file(", desc)
self.assertNotIn("web_search(", desc)
self.assertNotIn("web_extract(", desc)
self.assertNotIn("write_file(", desc)
def test_single_tool(self):
schema = build_execute_code_schema({"terminal"})
desc = schema["description"]
self.assertIn("terminal(", desc)
self.assertNotIn("web_search(", desc)
def test_import_examples_prefer_web_search_and_terminal(self):
enabled = {"web_search", "terminal", "read_file"}
schema = build_execute_code_schema(enabled)
code_desc = schema["parameters"]["properties"]["code"]["description"]
self.assertIn("web_search", code_desc)
self.assertIn("terminal", code_desc)
def test_import_examples_fallback_when_no_preferred(self):
"""When neither web_search nor terminal are enabled, falls back to
sorted first two tools."""
enabled = {"read_file", "write_file", "patch"}
schema = build_execute_code_schema(enabled)
code_desc = schema["parameters"]["properties"]["code"]["description"]
# Should use sorted first 2: patch, read_file
self.assertIn("patch", code_desc)
self.assertIn("read_file", code_desc)
def test_empty_set_produces_valid_description(self):
"""build_execute_code_schema(set()) must not produce 'import , ...'
in the code property description."""
schema = build_execute_code_schema(set())
code_desc = schema["parameters"]["properties"]["code"]["description"]
self.assertNotIn("import , ...", code_desc,
"Empty enabled set produces broken import syntax in description")
def test_real_scenario_all_sandbox_tools_disabled(self):
"""Reproduce the exact code path from model_tools.py:231-234.
Scenario: user runs `hermes tools code_execution` (only code_execution
toolset enabled). tools_to_include = {"execute_code"}.
model_tools.py does:
sandbox_enabled = SANDBOX_ALLOWED_TOOLS & tools_to_include
dynamic_schema = build_execute_code_schema(sandbox_enabled)
SANDBOX_ALLOWED_TOOLS = {web_search, web_extract, read_file, write_file,
search_files, patch, terminal}
tools_to_include = {"execute_code"}
intersection = empty set
"""
# Simulate model_tools.py:233
tools_to_include = {"execute_code"}
sandbox_enabled = SANDBOX_ALLOWED_TOOLS & tools_to_include
self.assertEqual(sandbox_enabled, set(),
"Intersection should be empty when only execute_code is enabled")
schema = build_execute_code_schema(sandbox_enabled)
code_desc = schema["parameters"]["properties"]["code"]["description"]
self.assertNotIn("import , ...", code_desc,
"Bug: broken import syntax sent to the model")
def test_real_scenario_only_vision_enabled(self):
"""Another real path: user runs `hermes tools code_execution,vision`.
tools_to_include = {"execute_code", "vision_analyze"}
SANDBOX_ALLOWED_TOOLS has neither, so intersection is empty.
"""
tools_to_include = {"execute_code", "vision_analyze"}
sandbox_enabled = SANDBOX_ALLOWED_TOOLS & tools_to_include
self.assertEqual(sandbox_enabled, set())
schema = build_execute_code_schema(sandbox_enabled)
code_desc = schema["parameters"]["properties"]["code"]["description"]
self.assertNotIn("import , ...", code_desc)
def test_description_mentions_limits(self):
schema = build_execute_code_schema()
desc = schema["description"]
self.assertIn("5-minute timeout", desc)
self.assertIn("50KB", desc)
self.assertIn("50 tool calls", desc)
def test_description_mentions_helpers(self):
schema = build_execute_code_schema()
desc = schema["description"]
self.assertIn("json_parse", desc)
self.assertIn("shell_quote", desc)
self.assertIn("retry", desc)
def test_none_defaults_to_all_tools(self):
schema_none = build_execute_code_schema(None)
schema_all = build_execute_code_schema(SANDBOX_ALLOWED_TOOLS)
self.assertEqual(schema_none["description"], schema_all["description"])
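The import-example tests above describe a selection rule: prefer `web_search` and `terminal` when enabled, otherwise fall back to the first two tools in sorted order, and never emit the broken `import , ...` form for an empty set. A minimal sketch of that rule follows. `import_example` and `PREFERRED` are hypothetical names, and the exact wording of the real description string is an assumption.

```python
from typing import Set

# Tools the tests expect to be preferred in the example import line.
PREFERRED = ("web_search", "terminal")


def import_example(enabled: Set[str]) -> str:
    """Build an example import line for the `code` parameter description."""
    names = sorted(enabled)
    # Preferred tools first; otherwise the first two enabled tools, sorted.
    picked = [n for n in PREFERRED if n in enabled] or names[:2]
    if not picked:
        # Empty set: avoid producing "import , ...".
        return "from hermes_tools import ..."
    return "from hermes_tools import " + ", ".join(picked) + ", ..."


preferred_line = import_example({"web_search", "terminal", "read_file"})
fallback_line = import_example({"read_file", "write_file", "patch"})
empty_line = import_example(set())
```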
# ---------------------------------------------------------------------------
# Environment variable filtering (security critical)
# ---------------------------------------------------------------------------
@unittest.skipIf(sys.platform == "win32", "UDS not available on Windows")
class TestEnvVarFiltering(unittest.TestCase):
"""Verify that execute_code filters environment variables correctly.
The child process should NOT receive API keys, tokens, or secrets.
It should receive safe vars like PATH, HOME, LANG, etc.
"""
def _get_child_env(self, extra_env=None):
"""Run a script that dumps its environment and return the env dict."""
code = (
"import os, json\n"
"print(json.dumps(dict(os.environ)))\n"
)
env_backup = os.environ.copy()
try:
if extra_env:
os.environ.update(extra_env)
with patch("model_tools.handle_function_call", return_value='{}'), \
patch("tools.code_execution_tool._load_config",
return_value={"timeout": 10, "max_tool_calls": 50}):
raw = execute_code(code, task_id="test-env",
enabled_tools=list(SANDBOX_ALLOWED_TOOLS))
finally:
os.environ.clear()
os.environ.update(env_backup)
result = json.loads(raw)
self.assertEqual(result["status"], "success", result.get("error", ""))
return json.loads(result["output"].strip())
def test_api_keys_excluded(self):
child_env = self._get_child_env({
"OPENAI_API_KEY": "sk-secret123",
"ANTHROPIC_API_KEY": "sk-ant-secret",
"FIRECRAWL_API_KEY": "fc-secret",
})
self.assertNotIn("OPENAI_API_KEY", child_env)
self.assertNotIn("ANTHROPIC_API_KEY", child_env)
self.assertNotIn("FIRECRAWL_API_KEY", child_env)
def test_tokens_excluded(self):
child_env = self._get_child_env({
"GITHUB_TOKEN": "ghp_secret",
"MODAL_TOKEN_ID": "tok-123",
"MODAL_TOKEN_SECRET": "tok-sec",
})
self.assertNotIn("GITHUB_TOKEN", child_env)
self.assertNotIn("MODAL_TOKEN_ID", child_env)
self.assertNotIn("MODAL_TOKEN_SECRET", child_env)
def test_password_vars_excluded(self):
child_env = self._get_child_env({
"DB_PASSWORD": "hunter2",
"MY_PASSWD": "secret",
"AUTH_CREDENTIAL": "cred",
})
self.assertNotIn("DB_PASSWORD", child_env)
self.assertNotIn("MY_PASSWD", child_env)
self.assertNotIn("AUTH_CREDENTIAL", child_env)
def test_path_included(self):
child_env = self._get_child_env()
self.assertIn("PATH", child_env)
def test_home_included(self):
child_env = self._get_child_env()
self.assertIn("HOME", child_env)
def test_hermes_rpc_socket_injected(self):
child_env = self._get_child_env()
self.assertIn("HERMES_RPC_SOCKET", child_env)
def test_pythondontwritebytecode_set(self):
child_env = self._get_child_env()
self.assertEqual(child_env.get("PYTHONDONTWRITEBYTECODE"), "1")
def test_timezone_injected_when_set(self):
env_backup = os.environ.copy()
try:
os.environ["HERMES_TIMEZONE"] = "America/New_York"
child_env = self._get_child_env()
self.assertEqual(child_env.get("TZ"), "America/New_York")
finally:
os.environ.clear()
os.environ.update(env_backup)
def test_timezone_not_set_when_empty(self):
env_backup = os.environ.copy()
try:
os.environ.pop("HERMES_TIMEZONE", None)
child_env = self._get_child_env()
if "TZ" in child_env:
self.assertNotEqual(child_env["TZ"], "")
finally:
os.environ.clear()
os.environ.update(env_backup)
# ---------------------------------------------------------------------------
# execute_code edge cases
# ---------------------------------------------------------------------------
class TestExecuteCodeEdgeCases(unittest.TestCase):
def test_windows_returns_error(self):
"""On Windows (or when SANDBOX_AVAILABLE is False), returns error JSON."""
with patch("tools.code_execution_tool.SANDBOX_AVAILABLE", False):
result = json.loads(execute_code("print('hi')", task_id="test"))
self.assertIn("error", result)
self.assertIn("Windows", result["error"])
def test_whitespace_only_code(self):
result = json.loads(execute_code(" \n\t ", task_id="test"))
self.assertIn("error", result)
self.assertIn("No code", result["error"])
@unittest.skipIf(sys.platform == "win32", "UDS not available on Windows")
def test_none_enabled_tools_uses_all(self):
"""When enabled_tools is None, all sandbox tools should be available."""
code = (
"from hermes_tools import terminal, web_search, read_file\n"
"print('all imports ok')\n"
)
with patch("model_tools.handle_function_call",
return_value=json.dumps({"ok": True})):
result = json.loads(execute_code(code, task_id="test-none",
enabled_tools=None))
self.assertEqual(result["status"], "success")
self.assertIn("all imports ok", result["output"])
@unittest.skipIf(sys.platform == "win32", "UDS not available on Windows")
def test_empty_enabled_tools_uses_all(self):
"""When enabled_tools is [] (empty), all sandbox tools should be available."""
code = (
"from hermes_tools import terminal, web_search\n"
"print('imports ok')\n"
)
with patch("model_tools.handle_function_call",
return_value=json.dumps({"ok": True})):
result = json.loads(execute_code(code, task_id="test-empty",
enabled_tools=[]))
self.assertEqual(result["status"], "success")
self.assertIn("imports ok", result["output"])
@unittest.skipIf(sys.platform == "win32", "UDS not available on Windows")
def test_nonoverlapping_tools_fallback(self):
"""When enabled_tools has no overlap with SANDBOX_ALLOWED_TOOLS,
should fall back to all allowed tools."""
code = (
"from hermes_tools import terminal\n"
"print('fallback ok')\n"
)
with patch("model_tools.handle_function_call",
return_value=json.dumps({"ok": True})):
result = json.loads(execute_code(
code, task_id="test-nonoverlap",
enabled_tools=["vision_analyze", "browser_snapshot"],
))
self.assertEqual(result["status"], "success")
self.assertIn("fallback ok", result["output"])
# ---------------------------------------------------------------------------
# _load_config
# ---------------------------------------------------------------------------
class TestLoadConfig(unittest.TestCase):
def test_returns_empty_dict_when_cli_config_unavailable(self):
from tools.code_execution_tool import _load_config
with patch.dict("sys.modules", {"cli": None}):
result = _load_config()
self.assertIsInstance(result, dict)
def test_returns_code_execution_section(self):
from tools.code_execution_tool import _load_config
mock_cli = MagicMock()
mock_cli.CLI_CONFIG = {"code_execution": {"timeout": 120, "max_tool_calls": 10}}
with patch.dict("sys.modules", {"cli": mock_cli}):
result = _load_config()
self.assertIsInstance(result, dict)
# ---------------------------------------------------------------------------
# Interrupt event
# ---------------------------------------------------------------------------
@unittest.skipIf(sys.platform == "win32", "UDS not available on Windows")
class TestInterruptHandling(unittest.TestCase):
def test_interrupt_event_stops_execution(self):
"""When _interrupt_event is set, execute_code should stop the script."""
code = "import time; time.sleep(60); print('should not reach')"
def set_interrupt_after_delay():
import time as _t
_t.sleep(1)
from tools.terminal_tool import _interrupt_event
_interrupt_event.set()
t = threading.Thread(target=set_interrupt_after_delay, daemon=True)
t.start()
try:
with patch("model_tools.handle_function_call",
return_value=json.dumps({"ok": True})), \
patch("tools.code_execution_tool._load_config",
return_value={"timeout": 30, "max_tool_calls": 50}):
result = json.loads(execute_code(
code, task_id="test-interrupt",
enabled_tools=list(SANDBOX_ALLOWED_TOOLS),
))
self.assertEqual(result["status"], "interrupted")
self.assertIn("interrupted", result["output"])
finally:
from tools.terminal_tool import _interrupt_event
_interrupt_event.clear()
t.join(timeout=3)
if __name__ == "__main__":
unittest.main()
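The `TestEnvVarFiltering` cases above describe the child environment by its observable properties: secrets are dropped, basic variables such as `PATH` and `HOME` survive, and `HERMES_RPC_SOCKET`, `PYTHONDONTWRITEBYTECODE`, and (when `HERMES_TIMEZONE` is set) `TZ` are injected. A minimal sketch of a filter with those properties follows. The prefix and marker lists and the name `filter_child_env` are assumptions for illustration; the real lists in `tools/code_execution_tool.py` may differ.

```python
from typing import Dict, Mapping

# Assumed lists, for illustration only.
SAFE_PREFIXES = ("PATH", "HOME", "USER", "LANG", "LC_", "TERM", "TMPDIR", "SHELL")
SECRET_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "PASSWD", "CREDENTIAL", "AUTH")


def filter_child_env(parent: Mapping[str, str], rpc_socket: str) -> Dict[str, str]:
    """Return a reduced environment for the sandboxed child process."""
    child: Dict[str, str] = {}
    for name, value in parent.items():
        upper = name.upper()
        if any(marker in upper for marker in SECRET_MARKERS):
            continue  # looks like a secret: never forward
        if any(upper.startswith(prefix) for prefix in SAFE_PREFIXES):
            child[name] = value  # allowlisted: forward as-is
    child["HERMES_RPC_SOCKET"] = rpc_socket
    child["PYTHONDONTWRITEBYTECODE"] = "1"
    tz = parent.get("HERMES_TIMEZONE", "").strip()
    if tz:
        child["TZ"] = tz
    return child


parent_env = {
    "PATH": "/usr/bin",
    "HOME": "/home/user",
    "OPENAI_API_KEY": "sk-secret123",
    "GITHUB_TOKEN": "ghp_secret",
    "DB_PASSWORD": "hunter2",
    "HERMES_TIMEZONE": "America/New_York",
}
child_env = filter_child_env(parent_env, "/tmp/hermes.sock")
```

Checking the secret markers before the allowlist means a variable such as `HOME_API_KEY` is dropped even though it starts with a safe prefix.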
+1 -1
@@ -295,6 +295,6 @@ def check_dangerous_command(command: str, env_type: str,
elif choice == "always":
approve_session(session_key, pattern_key)
approve_permanent(pattern_key)
-save_permanent_allowlist(load_permanent_allowlist() | {pattern_key})
+save_permanent_allowlist(_permanent_approved)
return {"approved": True, "message": None}
+9 -9
@@ -1615,10 +1615,10 @@ def _cleanup_old_screenshots(screenshots_dir, max_age_hours=24):
try:
if f.stat().st_mtime < cutoff:
f.unlink()
-except Exception:
-pass
-except Exception:
-pass # Non-critical — don't fail the screenshot operation
+except Exception as e:
+logger.debug("Failed to clean old screenshot %s: %s", f, e)
+except Exception as e:
+logger.debug("Screenshot cleanup error (non-critical): %s", e)
def _cleanup_old_recordings(max_age_hours=72):
@@ -1634,10 +1634,10 @@ def _cleanup_old_recordings(max_age_hours=72):
try:
if f.stat().st_mtime < cutoff:
f.unlink()
-except Exception:
-pass
-except Exception:
-pass
+except Exception as e:
+logger.debug("Failed to clean old recording %s: %s", f, e)
+except Exception as e:
+logger.debug("Recording cleanup error (non-critical): %s", e)
# ============================================================================
@@ -1749,7 +1749,7 @@ def cleanup_browser(task_id: Optional[str] = None) -> None:
os.kill(daemon_pid, signal.SIGTERM)
logger.debug("Killed daemon pid %s for %s", daemon_pid, session_name)
except (ProcessLookupError, ValueError, PermissionError, OSError):
-pass
+logger.debug("Could not kill daemon pid for %s (already dead or inaccessible)", session_name)
shutil.rmtree(socket_dir, ignore_errors=True)
logger.debug("Removed task %s from active sessions", task_id)
+441
@@ -0,0 +1,441 @@
"""
Checkpoint Manager: transparent filesystem snapshots via shadow git repos.
Creates automatic snapshots of working directories before file-mutating
operations (write_file, patch), triggered once per conversation turn.
Provides rollback to any previous checkpoint.
This is NOT a tool; the LLM never sees it. It's transparent infrastructure
controlled by the ``checkpoints`` config flag or ``--checkpoints`` CLI flag.
Architecture:
    ~/.hermes/checkpoints/{sha256(abs_dir)[:16]}/    shadow git repo
        HEAD, refs/, objects/                        standard git internals
        HERMES_WORKDIR                               original dir path
        info/exclude                                 default excludes
The shadow repo uses GIT_DIR + GIT_WORK_TREE so no git state leaks
into the user's project directory.
"""
import hashlib
import logging
import os
import shutil
import subprocess
import time
from pathlib import Path
from typing import Dict, List, Optional, Set
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
CHECKPOINT_BASE = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes")) / "checkpoints"
DEFAULT_EXCLUDES = [
"node_modules/",
"dist/",
"build/",
".env",
".env.*",
".env.local",
".env.*.local",
"__pycache__/",
"*.pyc",
"*.pyo",
".DS_Store",
"*.log",
".cache/",
".next/",
".nuxt/",
"coverage/",
".pytest_cache/",
".venv/",
"venv/",
".git/",
]
# Git subprocess timeout (seconds).
_GIT_TIMEOUT: int = max(10, min(60, int(os.getenv("HERMES_CHECKPOINT_TIMEOUT", "30"))))
# Max files to snapshot — skip huge directories to avoid slowdowns.
_MAX_FILES = 50_000
# ---------------------------------------------------------------------------
# Shadow repo helpers
# ---------------------------------------------------------------------------
def _shadow_repo_path(working_dir: str) -> Path:
"""Deterministic shadow repo path: sha256(abs_path)[:16]."""
abs_path = str(Path(working_dir).resolve())
dir_hash = hashlib.sha256(abs_path.encode()).hexdigest()[:16]
return CHECKPOINT_BASE / dir_hash
def _git_env(shadow_repo: Path, working_dir: str) -> dict:
"""Build env dict that redirects git to the shadow repo."""
env = os.environ.copy()
env["GIT_DIR"] = str(shadow_repo)
env["GIT_WORK_TREE"] = str(Path(working_dir).resolve())
env.pop("GIT_INDEX_FILE", None)
env.pop("GIT_NAMESPACE", None)
env.pop("GIT_ALTERNATE_OBJECT_DIRECTORIES", None)
return env
def _run_git(
    args: List[str],
    shadow_repo: Path,
    working_dir: str,
    timeout: int = _GIT_TIMEOUT,
) -> tuple:
    """Run a git command against the shadow repo. Returns (ok, stdout, stderr)."""
    env = _git_env(shadow_repo, working_dir)
    try:
        result = subprocess.run(
            ["git"] + args,
            capture_output=True,
            text=True,
            timeout=timeout,
            env=env,
            cwd=str(Path(working_dir).resolve()),
        )
        return result.returncode == 0, result.stdout.strip(), result.stderr.strip()
    except subprocess.TimeoutExpired:
        return False, "", f"git timed out after {timeout}s: git {' '.join(args)}"
    except FileNotFoundError:
        return False, "", "git not found"
    except Exception as exc:
        return False, "", str(exc)
def _init_shadow_repo(shadow_repo: Path, working_dir: str) -> Optional[str]:
    """Initialise shadow repo if needed. Returns error string or None."""
    if (shadow_repo / "HEAD").exists():
        return None
    shadow_repo.mkdir(parents=True, exist_ok=True)
    ok, _, err = _run_git(["init"], shadow_repo, working_dir)
    if not ok:
        return f"Shadow repo init failed: {err}"
    _run_git(["config", "user.email", "hermes@local"], shadow_repo, working_dir)
    _run_git(["config", "user.name", "Hermes Checkpoint"], shadow_repo, working_dir)
    info_dir = shadow_repo / "info"
    info_dir.mkdir(exist_ok=True)
    (info_dir / "exclude").write_text(
        "\n".join(DEFAULT_EXCLUDES) + "\n", encoding="utf-8"
    )
    (shadow_repo / "HERMES_WORKDIR").write_text(
        str(Path(working_dir).resolve()) + "\n", encoding="utf-8"
    )
    logger.debug("Initialised checkpoint repo at %s for %s", shadow_repo, working_dir)
    return None
def _dir_file_count(path: str) -> int:
    """Quick file count estimate (stops early if over _MAX_FILES)."""
    count = 0
    try:
        for _ in Path(path).rglob("*"):
            count += 1
            if count > _MAX_FILES:
                return count
    except (PermissionError, OSError):
        pass
    return count
# ---------------------------------------------------------------------------
# CheckpointManager
# ---------------------------------------------------------------------------
class CheckpointManager:
    """Manages automatic filesystem checkpoints.

    Designed to be owned by AIAgent. Call ``new_turn()`` at the start of
    each conversation turn and ``ensure_checkpoint(dir, reason)`` before
    any file-mutating tool call. The manager deduplicates so at most one
    snapshot is taken per directory per turn.

    Parameters
    ----------
    enabled : bool
        Master switch (from config / CLI flag).
    max_snapshots : int
        List at most this many checkpoints per directory. Older commits
        are not deleted from the shadow repo.
    """

    def __init__(self, enabled: bool = False, max_snapshots: int = 50):
        self.enabled = enabled
        self.max_snapshots = max_snapshots
        self._checkpointed_dirs: Set[str] = set()
        self._git_available: Optional[bool] = None  # lazy probe

    # ------------------------------------------------------------------
    # Turn lifecycle
    # ------------------------------------------------------------------
    def new_turn(self) -> None:
        """Reset per-turn dedup. Call at the start of each agent iteration."""
        self._checkpointed_dirs.clear()
    # ------------------------------------------------------------------
    # Public API
    # ------------------------------------------------------------------
    def ensure_checkpoint(self, working_dir: str, reason: str = "auto") -> bool:
        """Take a checkpoint if enabled and not already done this turn.

        Returns True if a checkpoint was taken, False otherwise.
        Never raises; all errors are logged at debug level.
        """
        if not self.enabled:
            return False
        # Lazy git probe
        if self._git_available is None:
            self._git_available = shutil.which("git") is not None
            if not self._git_available:
                logger.debug("Checkpoints disabled: git not found")
        if not self._git_available:
            return False
        abs_dir = str(Path(working_dir).resolve())
        # Skip root, home, and other overly broad directories
        if abs_dir in ("/", str(Path.home())):
            logger.debug("Checkpoint skipped: directory too broad (%s)", abs_dir)
            return False
        # Already checkpointed this turn?
        if abs_dir in self._checkpointed_dirs:
            return False
        self._checkpointed_dirs.add(abs_dir)
        try:
            return self._take(abs_dir, reason)
        except Exception as e:
            logger.debug("Checkpoint failed (non-fatal): %s", e)
            return False
    def list_checkpoints(self, working_dir: str) -> List[Dict]:
        """List available checkpoints for a directory.

        Returns a list of dicts with keys: hash, short_hash, timestamp, reason.
        Most recent first.
        """
        abs_dir = str(Path(working_dir).resolve())
        shadow = _shadow_repo_path(abs_dir)
        if not (shadow / "HEAD").exists():
            return []
        # Walk HEAD's history, newest first, capped at max_snapshots.
        ok, stdout, _ = _run_git(
            ["log", "--format=%H|%h|%aI|%s", "-n", str(self.max_snapshots)],
            shadow, abs_dir,
        )
        if not ok or not stdout:
            return []
        results = []
        for line in stdout.splitlines():
            # maxsplit=3 keeps any "|" inside the commit subject intact.
            parts = line.split("|", 3)
            if len(parts) == 4:
                results.append({
                    "hash": parts[0],
                    "short_hash": parts[1],
                    "timestamp": parts[2],
                    "reason": parts[3],
                })
        return results
    def restore(self, working_dir: str, commit_hash: str) -> Dict:
        """Restore files to a checkpoint state.

        Uses ``git checkout <hash> -- .`` which restores tracked files
        without moving HEAD, so the operation is safe and reversible.

        Returns dict with success/error info.
        """
        abs_dir = str(Path(working_dir).resolve())
        shadow = _shadow_repo_path(abs_dir)
        if not (shadow / "HEAD").exists():
            return {"success": False, "error": "No checkpoints exist for this directory"}
        # Verify the commit exists
        ok, _, err = _run_git(
            ["cat-file", "-t", commit_hash], shadow, abs_dir,
        )
        if not ok:
            return {"success": False, "error": f"Checkpoint '{commit_hash}' not found"}
        # Take a checkpoint of current state before restoring (so you can undo the undo)
        self._take(abs_dir, f"pre-rollback snapshot (restoring to {commit_hash[:8]})")
        # Restore
        ok, stdout, err = _run_git(
            ["checkout", commit_hash, "--", "."],
            shadow, abs_dir, timeout=_GIT_TIMEOUT * 2,
        )
        if not ok:
            return {"success": False, "error": f"Restore failed: {err}"}
        # Get info about what was restored
        ok2, reason_out, _ = _run_git(
            ["log", "--format=%s", "-1", commit_hash], shadow, abs_dir,
        )
        reason = reason_out if ok2 else "unknown"
        return {
            "success": True,
            "restored_to": commit_hash[:8],
            "reason": reason,
            "directory": abs_dir,
        }
    def get_working_dir_for_path(self, file_path: str) -> str:
        """Resolve a file path to its working directory for checkpointing.

        Walks up from the file's parent to find a reasonable project root
        (directory containing .git, pyproject.toml, package.json, etc.).
        Falls back to the file's parent directory.
        """
        path = Path(file_path).resolve()
        if path.is_dir():
            candidate = path
        else:
            candidate = path.parent
        # Walk up looking for project root markers
        markers = {".git", "pyproject.toml", "package.json", "Cargo.toml",
                   "go.mod", "Makefile", "pom.xml", ".hg", "Gemfile"}
        check = candidate
        while check != check.parent:
            if any((check / m).exists() for m in markers):
                return str(check)
            check = check.parent
        # No project root found; use the file's parent
        return str(candidate)
    # ------------------------------------------------------------------
    # Internal
    # ------------------------------------------------------------------
    def _take(self, working_dir: str, reason: str) -> bool:
        """Take a snapshot. Returns True on success."""
        shadow = _shadow_repo_path(working_dir)
        # Init if needed
        err = _init_shadow_repo(shadow, working_dir)
        if err:
            logger.debug("Checkpoint init failed: %s", err)
            return False
        # Quick size guard: don't try to snapshot enormous directories
        if _dir_file_count(working_dir) > _MAX_FILES:
            logger.debug("Checkpoint skipped: >%d files in %s", _MAX_FILES, working_dir)
            return False
        # Stage everything
        ok, _, err = _run_git(
            ["add", "-A"], shadow, working_dir, timeout=_GIT_TIMEOUT * 2,
        )
        if not ok:
            logger.debug("Checkpoint git-add failed: %s", err)
            return False
        # Check if there's anything to commit
        # (`git diff --cached --quiet` exits 0 when the index matches HEAD)
        ok_diff, _, _ = _run_git(
            ["diff", "--cached", "--quiet"], shadow, working_dir,
        )
        if ok_diff:
            # No changes to commit
            logger.debug("Checkpoint skipped: no changes in %s", working_dir)
            return False
        # Commit
        ok, _, err = _run_git(
            ["commit", "-m", reason, "--allow-empty-message"],
            shadow, working_dir, timeout=_GIT_TIMEOUT * 2,
        )
        if not ok:
            logger.debug("Checkpoint commit failed: %s", err)
            return False
        logger.debug("Checkpoint taken in %s: %s", working_dir, reason)
        # Prune old snapshots
        self._prune(shadow, working_dir)
        return True
    def _prune(self, shadow_repo: Path, working_dir: str) -> None:
        """Log when the repo holds more than max_snapshots commits.

        No history is deleted here; ``list_checkpoints`` caps what is shown
        at ``max_snapshots``, so the limit applies to the log view only.
        """
        ok, stdout, _ = _run_git(
            ["rev-list", "--count", "HEAD"], shadow_repo, working_dir,
        )
        if not ok:
            return
        try:
            count = int(stdout)
        except ValueError:
            return
        if count <= self.max_snapshots:
            return
        # For simplicity, we don't actually prune: git's pack mechanism
        # handles this efficiently, and the objects are small. The log
        # listing is already limited by max_snapshots.
        # Full pruning would require rebase --onto or filter-branch which
        # is fragile for a background feature. We just limit the log view.
        logger.debug("Checkpoint repo has %d commits (limit %d)", count, self.max_snapshots)
def format_checkpoint_list(checkpoints: List[Dict], directory: str) -> str:
    """Format checkpoint list for display to user."""
    if not checkpoints:
        return f"No checkpoints found for {directory}"
    lines = [f"📸 Checkpoints for {directory}:\n"]
    for i, cp in enumerate(checkpoints, 1):
        # Parse ISO timestamp to something readable
        ts = cp["timestamp"]
        if "T" in ts:
            ts = ts.split("T")[1].split("+")[0].split("-")[0][:5]  # HH:MM
            date = cp["timestamp"].split("T")[0]
            ts = f"{date} {ts}"
        lines.append(f" {i}. {cp['short_hash']} {ts} {cp['reason']}")
    lines.append("\nUse /rollback <number> to restore, e.g. /rollback 1")
    return "\n".join(lines)
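The timestamp handling in format_checkpoint_list slices the ISO string by hand. A minimal alternative sketch, assuming the timestamps are the strict ISO 8601 author dates that git's `%aI` format produces, is to let the standard library parse them. The helper name `short_timestamp` is hypothetical and not part of the file above.

```python
from datetime import datetime


def short_timestamp(iso_ts: str) -> str:
    """Render an ISO 8601 timestamp as 'YYYY-MM-DD HH:MM' (local to its offset).

    Hypothetical helper: falls back to the raw string when parsing fails,
    so a malformed value never breaks the checkpoint listing.
    """
    try:
        return datetime.fromisoformat(iso_ts).strftime("%Y-%m-%d %H:%M")
    except ValueError:
        return iso_ts
```

Compared with chained `split()` calls, this does not depend on where a `+` or `-` appears in the string, at the cost of requiring input that `datetime.fromisoformat` accepts.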
+31 -21
@@ -311,6 +311,7 @@ def _rpc_server_loop(
sys.stderr.close()
sys.stdout, sys.stderr = _real_stdout, _real_stderr
except Exception as exc:
logger.error("Tool call failed in sandbox: %s", exc, exc_info=True)
result = json.dumps({"error": str(exc)})
tool_call_counter[0] += 1
@@ -327,15 +328,15 @@ def _rpc_server_loop(
conn.sendall((result + "\n").encode())
except socket.timeout:
pass
except OSError:
pass
logger.debug("RPC listener socket timeout")
except OSError as e:
logger.debug("RPC listener socket error: %s", e, exc_info=True)
finally:
if conn:
try:
conn.close()
except OSError:
pass
except OSError as e:
logger.debug("RPC conn close error: %s", e)
# ---------------------------------------------------------------------------
@@ -397,9 +398,9 @@ def execute_code(
try:
# Write the auto-generated hermes_tools module
tools_src = generate_hermes_tools_module(
list(sandbox_tools) if enabled_tools else list(SANDBOX_ALLOWED_TOOLS)
)
# sandbox_tools is already the correct set (intersection with session
# tools, or SANDBOX_ALLOWED_TOOLS as fallback — see lines above).
tools_src = generate_hermes_tools_module(list(sandbox_tools))
with open(os.path.join(tmpdir, "hermes_tools.py"), "w") as f:
f.write(tools_src)
@@ -472,8 +473,8 @@ def execute_code(
keep = max_bytes - total
chunks.append(data[:keep])
total += len(data)
except (ValueError, OSError):
pass
except (ValueError, OSError) as e:
logger.debug("Error reading process output: %s", e, exc_info=True)
stdout_reader = threading.Thread(
target=_drain, args=(proc.stdout, stdout_chunks, MAX_STDOUT_BYTES), daemon=True
@@ -511,7 +512,7 @@ def execute_code(
duration = round(time.monotonic() - exec_start, 2)
# Wait for RPC thread to finish
server_sock.close()
server_sock.close() # break accept() so thread exits promptly
rpc_thread.join(timeout=3)
# Build response
@@ -547,15 +548,19 @@ def execute_code(
finally:
# Cleanup temp dir and socket
try:
server_sock.close()
except Exception as e:
logger.debug("Server socket close error: %s", e)
try:
import shutil
shutil.rmtree(tmpdir, ignore_errors=True)
except Exception as e:
logger.debug("Could not clean temp dir: %s", e)
logger.debug("Could not clean temp dir: %s", e, exc_info=True)
try:
os.unlink(sock_path)
except OSError:
pass
except OSError as e:
logger.debug("Could not remove socket file: %s", e, exc_info=True)
def _kill_process_group(proc, escalate: bool = False):
@@ -565,11 +570,12 @@ def _kill_process_group(proc, escalate: bool = False):
proc.terminate()
else:
os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
except (ProcessLookupError, PermissionError):
except (ProcessLookupError, PermissionError) as e:
logger.debug("Could not kill process group: %s", e, exc_info=True)
try:
proc.kill()
except Exception as e:
logger.debug("Could not kill process: %s", e)
except Exception as e2:
logger.debug("Could not kill process: %s", e2, exc_info=True)
if escalate:
# Give the process 5s to exit after SIGTERM, then SIGKILL
@@ -581,11 +587,12 @@ def _kill_process_group(proc, escalate: bool = False):
proc.kill()
else:
os.killpg(os.getpgid(proc.pid), signal.SIGKILL)
except (ProcessLookupError, PermissionError):
except (ProcessLookupError, PermissionError) as e:
logger.debug("Could not kill process group with SIGKILL: %s", e, exc_info=True)
try:
proc.kill()
except Exception as e:
logger.debug("Could not kill process: %s", e)
except Exception as e2:
logger.debug("Could not kill process: %s", e2, exc_info=True)
def _load_config() -> dict:
@@ -647,7 +654,10 @@ def build_execute_code_schema(enabled_sandbox_tools: set = None) -> dict:
import_examples = [n for n in ("web_search", "terminal") if n in enabled_sandbox_tools]
if not import_examples:
import_examples = sorted(enabled_sandbox_tools)[:2]
import_str = ", ".join(import_examples) + ", ..."
if import_examples:
import_str = ", ".join(import_examples) + ", ..."
else:
import_str = "..."
description = (
"Run a Python script that can call Hermes tools programmatically. "
+15 -14
@@ -20,6 +20,7 @@ import contextlib
import io
import json
import logging
logger = logging.getLogger(__name__)
import os
import sys
import time
@@ -107,8 +108,8 @@ def _build_child_progress_callback(task_index: int, parent_agent, task_count: in
short = (preview[:55] + "...") if preview and len(preview) > 55 else (preview or "")
try:
spinner.print_above(f" {prefix}├─ 💭 \"{short}\"")
except Exception:
pass
except Exception as e:
logger.debug("Spinner print_above failed: %s", e)
# Don't relay thinking to gateway (too noisy for chat)
return
@@ -129,8 +130,8 @@ def _build_child_progress_callback(task_index: int, parent_agent, task_count: in
line += f" \"{short}\""
try:
spinner.print_above(line)
except Exception:
pass
except Exception as e:
logger.debug("Spinner print_above failed: %s", e)
if parent_cb:
_batch.append(tool_name)
@@ -138,8 +139,8 @@ def _build_child_progress_callback(task_index: int, parent_agent, task_count: in
summary = ", ".join(_batch)
try:
parent_cb("subagent_progress", f"🔀 {prefix}{summary}")
except Exception:
pass
except Exception as e:
logger.debug("Parent callback failed: %s", e)
_batch.clear()
def _flush():
@@ -148,8 +149,8 @@ def _build_child_progress_callback(task_index: int, parent_agent, task_count: in
summary = ", ".join(_batch)
try:
parent_cb("subagent_progress", f"🔀 {prefix}{summary}")
except Exception:
pass
except Exception as e:
logger.debug("Parent callback flush failed: %s", e)
_batch.clear()
_callback._flush = _flush
@@ -241,8 +242,8 @@ def _run_single_child(
if child_progress_cb and hasattr(child_progress_cb, '_flush'):
try:
child_progress_cb._flush()
except Exception:
pass
except Exception as e:
logger.debug("Progress callback flush failed: %s", e)
duration = round(time.monotonic() - child_start, 2)
@@ -287,8 +288,8 @@ def _run_single_child(
if hasattr(parent_agent, '_active_children'):
try:
parent_agent._active_children.remove(child)
except (ValueError, UnboundLocalError):
pass
except (ValueError, UnboundLocalError) as e:
logger.debug("Could not remove child from active_children: %s", e)
def delegate_task(
@@ -425,8 +426,8 @@ def delegate_task(
if spinner_ref and remaining > 0:
try:
spinner_ref.update_text(f"🔀 {remaining} task{'s' if remaining != 1 else ''} remaining")
except Exception:
pass
except Exception as e:
logger.debug("Spinner update_text failed: %s", e)
# Restore stdout/stderr in case redirect_stdout race left them as devnull
sys.stdout = _saved_stdout
+10 -2
@@ -59,8 +59,16 @@ class BaseEnvironment(ABC):
# Shared helpers (eliminate duplication across backends)
# ------------------------------------------------------------------
def _prepare_command(self, command: str) -> str:
"""Transform sudo commands if SUDO_PASSWORD is available."""
def _prepare_command(self, command: str) -> tuple[str, str | None]:
"""Transform sudo commands if SUDO_PASSWORD is available.
Returns:
(transformed_command, sudo_stdin); see _transform_sudo_command
for the full contract. Callers that drive a subprocess directly
should prepend sudo_stdin (when not None) to any stdin_data they
pass to Popen. Callers that embed stdin via heredoc (modal,
daytona) handle sudo_stdin in their own execute() method.
"""
from tools.terminal_tool import _transform_sudo_command
return _transform_sudo_command(command)
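The docstring above asks callers that drive a subprocess directly to prepend sudo_stdin to any stdin_data. The Docker, local, Singularity and SSH backends in this diff each repeat the same three-branch merge; a minimal sketch of that rule as one function follows. The name `merge_stdin` is hypothetical and does not appear in the diff.

```python
from typing import Optional


def merge_stdin(sudo_stdin: Optional[str], stdin_data: Optional[str]) -> Optional[str]:
    """Combine the sudo password line with caller-supplied stdin.

    Hypothetical helper mirroring the merge used by the backends:
    the password (if any) goes first so that `sudo -S` reads it before
    the wrapped command sees the remaining input.
    """
    if sudo_stdin is None:
        return stdin_data
    if stdin_data is None:
        return sudo_stdin
    return sudo_stdin + stdin_data
```

Returning None when both inputs are None lets the caller keep its existing `subprocess.DEVNULL` branch unchanged.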
+17 -4
@@ -6,6 +6,7 @@ and resumed on next creation, preserving the filesystem across sessions.
"""
import logging
import time
import math
import shlex
import threading
@@ -142,10 +143,9 @@ class DaytonaEnvironment(BaseEnvironment):
t = threading.Thread(target=_run, daemon=True)
t.start()
# Wait for timeout + generous buffer for network/SDK overhead
deadline = timeout + 10
deadline = time.monotonic() + timeout + 10
while t.is_alive():
t.join(timeout=0.2)
deadline -= 0.2
if is_interrupted():
with self._lock:
try:
@@ -156,7 +156,7 @@ class DaytonaEnvironment(BaseEnvironment):
"output": "[Command interrupted - Daytona sandbox stopped]",
"returncode": 130,
}
if deadline <= 0:
if time.monotonic() > deadline:
# Shell timeout didn't fire and SDK is hung — force stop
with self._lock:
try:
@@ -181,7 +181,20 @@ class DaytonaEnvironment(BaseEnvironment):
marker = f"HERMES_EOF_{uuid.uuid4().hex[:8]}"
command = f"{command} << '{marker}'\n{stdin_data}\n{marker}"
exec_command = self._prepare_command(command)
exec_command, sudo_stdin = self._prepare_command(command)
# Daytona sandboxes execute commands via the Daytona SDK and cannot
# pipe subprocess stdin directly the way a local Popen can. When a
# sudo password is present, use a shell-level pipe from printf so that
# the password feeds sudo -S without appearing as an echo argument
# embedded in the shell string. The password is still visible in the
# remote sandbox's command line, but it is not exposed on the user's
# local machine — which is the primary threat being mitigated.
if sudo_stdin is not None:
import shlex
exec_command = (
f"printf '%s\\n' {shlex.quote(sudo_stdin.rstrip())} | {exec_command}"
)
effective_cwd = cwd or self.cwd or None
effective_timeout = timeout or self.timeout
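The first hunk in this file replaces a countdown that subtracted 0.2 per loop iteration with an absolute deadline from time.monotonic(). The countdown drifts whenever an iteration takes longer than the nominal poll interval; comparing against a fixed deadline does not. A minimal sketch of the pattern, with a hypothetical function name and poll interval:

```python
import threading
import time


def wait_with_deadline(worker: threading.Thread, timeout: float, poll: float = 0.02) -> bool:
    """Poll a worker thread until it finishes or an absolute deadline passes.

    Hypothetical helper. Returns True if the worker finished in time,
    False if it is still running after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while worker.is_alive():
        worker.join(timeout=poll)
        # Only report a timeout if the worker is genuinely still running.
        if worker.is_alive() and time.monotonic() > deadline:
            return False
    return True
```

time.monotonic() is used rather than time.time() because it is unaffected by system clock adjustments.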
+13 -5
@@ -193,10 +193,18 @@ class DockerEnvironment(BaseEnvironment):
def execute(self, command: str, cwd: str = "", *,
timeout: int | None = None,
stdin_data: str | None = None) -> dict:
exec_command = self._prepare_command(command)
exec_command, sudo_stdin = self._prepare_command(command)
work_dir = cwd or self.cwd
effective_timeout = timeout or self.timeout
# Merge sudo password (if any) with caller-supplied stdin_data.
if sudo_stdin is not None and stdin_data is not None:
effective_stdin = sudo_stdin + stdin_data
elif sudo_stdin is not None:
effective_stdin = sudo_stdin
else:
effective_stdin = stdin_data
# docker exec -w doesn't expand ~, so prepend a cd into the command
if work_dir == "~" or work_dir.startswith("~/"):
exec_command = f"cd {work_dir} && {exec_command}"
@@ -204,7 +212,7 @@ class DockerEnvironment(BaseEnvironment):
assert self._inner.container_id, "Container not started"
cmd = [self._inner.config.executable, "exec"]
if stdin_data is not None:
if effective_stdin is not None:
cmd.append("-i")
cmd.extend(["-w", work_dir])
for key in self._inner.config.forward_env:
@@ -219,12 +227,12 @@ class DockerEnvironment(BaseEnvironment):
proc = subprocess.Popen(
cmd,
stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
stdin=subprocess.PIPE if stdin_data else subprocess.DEVNULL,
stdin=subprocess.PIPE if effective_stdin else subprocess.DEVNULL,
text=True,
)
if stdin_data:
if effective_stdin:
try:
proc.stdin.write(stdin_data)
proc.stdin.write(effective_stdin)
proc.stdin.close()
except Exception:
pass
+15 -4
@@ -161,7 +161,18 @@ class LocalEnvironment(BaseEnvironment):
work_dir = cwd or self.cwd or os.getcwd()
effective_timeout = timeout or self.timeout
exec_command = self._prepare_command(command)
exec_command, sudo_stdin = self._prepare_command(command)
# Merge the sudo password (if any) with caller-supplied stdin_data.
# sudo -S reads exactly one line (the password) then passes the rest
# of stdin to the child, so prepending is safe even when stdin_data
# is also present.
if sudo_stdin is not None and stdin_data is not None:
effective_stdin = sudo_stdin + stdin_data
elif sudo_stdin is not None:
effective_stdin = sudo_stdin
else:
effective_stdin = stdin_data
try:
# The fence wrapper uses bash syntax (semicolons, $?, printf).
@@ -195,14 +206,14 @@ class LocalEnvironment(BaseEnvironment):
errors="replace",
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
stdin=subprocess.PIPE if stdin_data is not None else subprocess.DEVNULL,
stdin=subprocess.PIPE if effective_stdin is not None else subprocess.DEVNULL,
preexec_fn=None if _IS_WINDOWS else os.setsid,
)
if stdin_data is not None:
if effective_stdin is not None:
def _write_stdin():
try:
proc.stdin.write(stdin_data)
proc.stdin.write(effective_stdin)
proc.stdin.close()
except (BrokenPipeError, OSError):
pass
+18 -1
View File
@@ -106,7 +106,20 @@ class ModalEnvironment(BaseEnvironment):
marker = f"HERMES_EOF_{uuid.uuid4().hex[:8]}"
command = f"{command} << '{marker}'\n{stdin_data}\n{marker}"
exec_command = self._prepare_command(command)
exec_command, sudo_stdin = self._prepare_command(command)
# Modal sandboxes execute commands via the Modal SDK and cannot pipe
# subprocess stdin directly the way a local Popen can. When a sudo
# password is present, use a shell-level pipe from printf so that the
# password feeds sudo -S without appearing as an echo argument embedded
# in the shell string. The password is still visible in the remote
# sandbox's command line, but it is not exposed on the user's local
# machine — which is the primary threat being mitigated.
if sudo_stdin is not None:
import shlex
exec_command = (
f"printf '%s\\n' {shlex.quote(sudo_stdin.rstrip())} | {exec_command}"
)
# Run in a background thread so we can poll for interrupts
result_holder = {"value": None, "error": None}
@@ -137,6 +150,10 @@ class ModalEnvironment(BaseEnvironment):
def cleanup(self):
"""Snapshot the filesystem (if persistent) then stop the sandbox."""
# Check if _inner was ever set (init may have failed)
if not hasattr(self, '_inner') or self._inner is None:
return
if self._persistent:
try:
sandbox = getattr(self._inner, 'deployment', None)
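The sudo hunk above feeds the password to the command through a shell-level printf pipe, quoting it with shlex.quote. A minimal sketch of that string construction follows; the function name `pipe_password` is hypothetical, and unlike the diff (which calls `.rstrip()`), this sketch strips only trailing newlines so a password ending in a space is preserved. As the comment in the diff notes, the password remains visible in the remote command line.

```python
import shlex


def pipe_password(exec_command: str, sudo_stdin: str) -> str:
    """Prefix a command with a printf pipe that supplies one password line.

    Hypothetical helper. shlex.quote makes the secret a single shell word,
    so quotes or spaces in it cannot alter the command.
    """
    secret = sudo_stdin.rstrip("\n")
    return f"printf '%s\\n' {shlex.quote(secret)} | {exec_command}"
```

Passing the secret as a printf argument with a fixed `'%s\n'` format means any percent signs or backslashes inside the password are printed literally rather than interpreted.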
+12 -4
@@ -228,7 +228,15 @@ class SingularityEnvironment(BaseEnvironment):
effective_timeout = timeout or self.timeout
work_dir = cwd or self.cwd
exec_command = self._prepare_command(command)
exec_command, sudo_stdin = self._prepare_command(command)
# Merge sudo password (if any) with caller-supplied stdin_data.
if sudo_stdin is not None and stdin_data is not None:
effective_stdin = sudo_stdin + stdin_data
elif sudo_stdin is not None:
effective_stdin = sudo_stdin
else:
effective_stdin = stdin_data
# apptainer exec --pwd doesn't expand ~, so prepend a cd into the command
if work_dir == "~" or work_dir.startswith("~/"):
@@ -245,12 +253,12 @@ class SingularityEnvironment(BaseEnvironment):
proc = subprocess.Popen(
cmd,
stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
stdin=subprocess.PIPE if stdin_data else subprocess.DEVNULL,
stdin=subprocess.PIPE if effective_stdin else subprocess.DEVNULL,
text=True,
)
if stdin_data:
if effective_stdin:
try:
proc.stdin.write(stdin_data)
proc.stdin.write(effective_stdin)
proc.stdin.close()
except Exception:
pass
+13 -5
@@ -69,15 +69,23 @@ class SSHEnvironment(BaseEnvironment):
timeout: int | None = None,
stdin_data: str | None = None) -> dict:
work_dir = cwd or self.cwd
exec_command = self._prepare_command(command)
exec_command, sudo_stdin = self._prepare_command(command)
wrapped = f'cd {work_dir} && {exec_command}'
effective_timeout = timeout or self.timeout
# Merge sudo password (if any) with caller-supplied stdin_data.
if sudo_stdin is not None and stdin_data is not None:
effective_stdin = sudo_stdin + stdin_data
elif sudo_stdin is not None:
effective_stdin = sudo_stdin
else:
effective_stdin = stdin_data
cmd = self._build_ssh_command()
cmd.extend(["bash", "-c", wrapped])
try:
kwargs = self._build_run_kwargs(timeout, stdin_data)
kwargs = self._build_run_kwargs(timeout, effective_stdin)
# Remove timeout from kwargs -- we handle it in the poll loop
kwargs.pop("timeout", None)
@@ -87,13 +95,13 @@ class SSHEnvironment(BaseEnvironment):
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
stdin=subprocess.PIPE if stdin_data else subprocess.DEVNULL,
stdin=subprocess.PIPE if effective_stdin else subprocess.DEVNULL,
text=True,
)
if stdin_data:
if effective_stdin:
try:
proc.stdin.write(stdin_data)
proc.stdin.write(effective_stdin)
proc.stdin.close()
except Exception:
pass
+39 -42
@@ -962,37 +962,35 @@ class ShellFileOperations(FileOperations):
# rg match lines: "file:lineno:content" (colon separator)
# rg context lines: "file-lineno-content" (dash separator)
# rg group seps: "--"
# Note: on Windows, paths contain drive letters (e.g. C:\path),
# so naive split(":") breaks. Use regex to handle both platforms.
_match_re = re.compile(r'^([A-Za-z]:)?(.*?):(\d+):(.*)$')
_ctx_re = re.compile(r'^([A-Za-z]:)?(.*?)-(\d+)-(.*)$')
matches = []
for line in result.stdout.strip().split('\n'):
if not line or line == "--":
continue
# Try match line first (colon-separated: file:line:content)
parts = line.split(':', 2)
if len(parts) >= 3:
try:
matches.append(SearchMatch(
path=parts[0],
line_number=int(parts[1]),
content=parts[2][:500]
))
continue
except ValueError:
pass
m = _match_re.match(line)
if m:
matches.append(SearchMatch(
path=(m.group(1) or '') + m.group(2),
line_number=int(m.group(3)),
content=m.group(4)[:500]
))
continue
# Try context line (dash-separated: file-line-content)
# Only attempt if context was requested to avoid false positives
if context > 0:
parts = line.split('-', 2)
if len(parts) >= 3:
try:
matches.append(SearchMatch(
path=parts[0],
line_number=int(parts[1]),
content=parts[2][:500]
))
except ValueError:
pass
m = _ctx_re.match(line)
if m:
matches.append(SearchMatch(
path=(m.group(1) or '') + m.group(2),
line_number=int(m.group(3)),
content=m.group(4)[:500]
))
total = len(matches)
page = matches[offset:offset + limit]
@@ -1059,34 +1057,33 @@ class ShellFileOperations(FileOperations):
# grep match lines: "file:lineno:content" (colon)
# grep context lines: "file-lineno-content" (dash)
# grep group seps: "--"
# Note: on Windows, paths contain drive letters (e.g. C:\path),
# so naive split(":") breaks. Use regex to handle both platforms.
_match_re = re.compile(r'^([A-Za-z]:)?(.*?):(\d+):(.*)$')
_ctx_re = re.compile(r'^([A-Za-z]:)?(.*?)-(\d+)-(.*)$')
matches = []
for line in result.stdout.strip().split('\n'):
if not line or line == "--":
continue
parts = line.split(':', 2)
if len(parts) >= 3:
try:
matches.append(SearchMatch(
path=parts[0],
line_number=int(parts[1]),
content=parts[2][:500]
))
continue
except ValueError:
pass
m = _match_re.match(line)
if m:
matches.append(SearchMatch(
path=(m.group(1) or '') + m.group(2),
line_number=int(m.group(3)),
content=m.group(4)[:500]
))
continue
if context > 0:
parts = line.split('-', 2)
if len(parts) >= 3:
try:
matches.append(SearchMatch(
path=parts[0],
line_number=int(parts[1]),
content=parts[2][:500]
))
except ValueError:
pass
m = _ctx_re.match(line)
if m:
matches.append(SearchMatch(
path=(m.group(1) or '') + m.group(2),
line_number=int(m.group(3)),
content=m.group(4)[:500]
))
total = len(matches)
page = matches[offset:offset + limit]
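Both hunks in this file swap `split(':', 2)` for a regex so that Windows drive letters survive parsing. A minimal standalone sketch of the match-line case, reusing the pattern from the diff, is shown below; `ParsedLine` and `parse_match_line` are hypothetical names standing in for SearchMatch and the inline loop body.

```python
import re
from typing import NamedTuple, Optional


class ParsedLine(NamedTuple):
    path: str
    line_number: int
    content: str


# Optional drive letter, lazy path, ":<lineno>:", then the matched text.
_MATCH_RE = re.compile(r'^([A-Za-z]:)?(.*?):(\d+):(.*)$')


def parse_match_line(line: str) -> Optional[ParsedLine]:
    """Parse one 'file:lineno:content' line; return None if it does not fit."""
    m = _MATCH_RE.match(line)
    if m is None:
        return None
    return ParsedLine(
        path=(m.group(1) or "") + m.group(2),
        line_number=int(m.group(3)),
        content=m.group(4)[:500],  # same 500-character cap as the diff
    )
```

Because the path group is lazy, the first `:<digits>:` sequence decides where the path ends, so colons later in the content are left alone. The dash-separated context pattern in the diff works the same way and can misparse a path that itself contains `-<digits>-`.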
+1
@@ -91,6 +91,7 @@ def _get_file_ops(task_id: str = "default") -> ShellFileOperations:
"container_memory": config.get("container_memory", 5120),
"container_disk": config.get("container_disk", 51200),
"container_persistent": config.get("container_persistent", True),
"docker_volumes": config.get("docker_volumes", []),
}
terminal_env = _create_environment(
env_type=env_type,
+5 -2
@@ -148,11 +148,14 @@ class ProcessRegistry:
if use_pty:
# Try PTY mode for interactive CLI tools
try:
import ptyprocess
if _IS_WINDOWS:
from winpty import PtyProcess as _PtyProcessCls
else:
from ptyprocess import PtyProcess as _PtyProcessCls
user_shell = _find_shell()
pty_env = os.environ | (env_vars or {})
pty_env["PYTHONUNBUFFERED"] = "1"
pty_proc = ptyprocess.PtyProcess.spawn(
pty_proc = _PtyProcessCls.spawn(
[user_shell, "-lic", command],
cwd=session.cwd,
env=pty_env,
+41 -8
@@ -37,6 +37,7 @@ import logging
import os
import re
import shutil
import tempfile
from pathlib import Path
from typing import Dict, Any, Optional
@@ -190,6 +191,38 @@ def _validate_file_path(file_path: str) -> Optional[str]:
return None
def _atomic_write_text(file_path: Path, content: str, encoding: str = "utf-8") -> None:
"""
Atomically write text content to a file.
Uses a temporary file in the same directory and os.replace() to ensure
the target file is never left in a partially-written state if the process
crashes or is interrupted.
Args:
file_path: Target file path
content: Content to write
encoding: Text encoding (default: utf-8)
"""
file_path.parent.mkdir(parents=True, exist_ok=True)
fd, temp_path = tempfile.mkstemp(
dir=str(file_path.parent),
prefix=f".{file_path.name}.tmp.",
suffix="",
)
try:
with os.fdopen(fd, "w", encoding=encoding) as f:
f.write(content)
os.replace(temp_path, file_path)
except Exception:
# Clean up temp file on error
try:
os.unlink(temp_path)
except OSError:
pass
raise
# =============================================================================
# Core actions
# =============================================================================
@@ -218,9 +251,9 @@ def _create_skill(name: str, content: str, category: str = None) -> Dict[str, An
skill_dir = _resolve_skill_dir(name, category)
skill_dir.mkdir(parents=True, exist_ok=True)
# Write SKILL.md
# Write SKILL.md atomically
skill_md = skill_dir / "SKILL.md"
skill_md.write_text(content, encoding="utf-8")
_atomic_write_text(skill_md, content)
# Security scan — roll back on block
scan_error = _security_scan_skill(skill_dir)
@@ -256,13 +289,13 @@ def _edit_skill(name: str, content: str) -> Dict[str, Any]:
skill_md = existing["path"] / "SKILL.md"
# Back up original content for rollback
original_content = skill_md.read_text(encoding="utf-8") if skill_md.exists() else None
skill_md.write_text(content, encoding="utf-8")
_atomic_write_text(skill_md, content)
# Security scan — roll back on block
scan_error = _security_scan_skill(existing["path"])
if scan_error:
if original_content is not None:
skill_md.write_text(original_content, encoding="utf-8")
_atomic_write_text(skill_md, original_content)
return {"success": False, "error": scan_error}
return {
@@ -342,12 +375,12 @@ def _patch_skill(
}
original_content = content # for rollback
target.write_text(new_content, encoding="utf-8")
_atomic_write_text(target, new_content)
# Security scan — roll back on block
scan_error = _security_scan_skill(skill_dir)
if scan_error:
target.write_text(original_content, encoding="utf-8")
_atomic_write_text(target, original_content)
return {"success": False, "error": scan_error}
replacements = count if replace_all else 1
@@ -394,13 +427,13 @@ def _write_file(name: str, file_path: str, file_content: str) -> Dict[str, Any]:
target.parent.mkdir(parents=True, exist_ok=True)
# Back up for rollback
original_content = target.read_text(encoding="utf-8") if target.exists() else None
target.write_text(file_content, encoding="utf-8")
_atomic_write_text(target, file_content)
# Security scan — roll back on block
scan_error = _security_scan_skill(existing["path"])
if scan_error:
if original_content is not None:
target.write_text(original_content, encoding="utf-8")
_atomic_write_text(target, original_content)
else:
target.unlink(missing_ok=True)
return {"success": False, "error": scan_error}
+13 -2
@@ -63,6 +63,7 @@ Usage:
"""
import json
import logging
import os
import re
import sys
@@ -71,6 +72,8 @@ from typing import Dict, Any, List, Optional, Tuple
import yaml
logger = logging.getLogger(__name__)
# All skills live in ~/.hermes/skills/ (seeded from bundled skills/ on install).
# This is the single source of truth -- agent edits, hub installs, and bundled
@@ -269,7 +272,11 @@ def _find_all_skills() -> List[Dict[str, Any]]:
"category": category,
})
except Exception:
except (UnicodeDecodeError, PermissionError) as e:
logger.warning("Failed to read skill file %s: %s", skill_md, e)
continue
except Exception as e:
logger.warning("Error parsing skill %s: %s", skill_md, e, exc_info=True)
continue
return skills
@@ -308,7 +315,11 @@ def _load_category_description(category_dir: Path) -> Optional[str]:
description = description[:MAX_DESCRIPTION_LENGTH - 3] + "..."
return description if description else None
except Exception:
except (UnicodeDecodeError, PermissionError) as e:
logger.debug("Failed to read category description %s: %s", desc_file, e)
return None
except Exception as e:
logger.warning("Error parsing category description %s: %s", desc_file, e, exc_info=True)
return None
+80 -48
@@ -29,6 +29,7 @@ Usage:
import json
import logging
import os
import platform
import signal
import sys
import time
@@ -83,8 +84,8 @@ def _check_disk_usage_warning():
if f.is_file():
try:
total_bytes += f.stat().st_size
except OSError:
pass
except OSError as e:
logger.debug("Could not stat file %s: %s", f, e)
total_gb = total_bytes / (1024 ** 3)
@@ -192,23 +193,35 @@ def _prompt_for_sudo_password(timeout_seconds: int = 45) -> str:
result = {"password": None, "done": False}
def read_password_thread():
"""Read password from /dev/tty with echo disabled."""
"""Read password with echo disabled. Uses msvcrt on Windows, /dev/tty on Unix."""
tty_fd = None
old_attrs = None
try:
import termios
tty_fd = os.open("/dev/tty", os.O_RDONLY)
old_attrs = termios.tcgetattr(tty_fd)
new_attrs = termios.tcgetattr(tty_fd)
new_attrs[3] = new_attrs[3] & ~termios.ECHO
termios.tcsetattr(tty_fd, termios.TCSAFLUSH, new_attrs)
chars = []
while True:
b = os.read(tty_fd, 1)
if not b or b in (b"\n", b"\r"):
break
chars.append(b)
result["password"] = b"".join(chars).decode("utf-8", errors="replace")
if platform.system() == "Windows":
import msvcrt
chars = []
while True:
c = msvcrt.getwch()
if c in ("\r", "\n"):
break
if c == "\x03":
raise KeyboardInterrupt
chars.append(c)
result["password"] = "".join(chars)
else:
import termios
tty_fd = os.open("/dev/tty", os.O_RDONLY)
old_attrs = termios.tcgetattr(tty_fd)
new_attrs = termios.tcgetattr(tty_fd)
new_attrs[3] = new_attrs[3] & ~termios.ECHO
termios.tcsetattr(tty_fd, termios.TCSAFLUSH, new_attrs)
chars = []
while True:
b = os.read(tty_fd, 1)
if not b or b in (b"\n", b"\r"):
break
chars.append(b)
result["password"] = b"".join(chars).decode("utf-8", errors="replace")
except (EOFError, KeyboardInterrupt, OSError):
result["password"] = ""
except Exception:
@@ -218,13 +231,13 @@ def _prompt_for_sudo_password(timeout_seconds: int = 45) -> str:
try:
import termios as _termios
_termios.tcsetattr(tty_fd, _termios.TCSAFLUSH, old_attrs)
except Exception:
pass
except Exception as e:
logger.debug("Failed to restore terminal attributes: %s", e)
if tty_fd is not None:
try:
os.close(tty_fd)
except Exception:
pass
except Exception as e:
logger.debug("Failed to close tty fd: %s", e)
result["done"] = True
try:
@@ -278,32 +291,50 @@ def _prompt_for_sudo_password(timeout_seconds: int = 45) -> str:
del os.environ["HERMES_SPINNER_PAUSE"]
def _transform_sudo_command(command: str) -> str:
def _transform_sudo_command(command: str) -> tuple[str, str | None]:
"""
Transform sudo commands to use -S flag if SUDO_PASSWORD is available.
This is a shared helper used by all execution environments to provide
consistent sudo handling across local, SSH, and container environments.
If SUDO_PASSWORD is set (via env, config, or interactive prompt):
'sudo apt install curl' -> password piped via sudo -S
Returns:
(transformed_command, sudo_stdin) where:
- transformed_command has every bare ``sudo`` replaced with
``sudo -S -p ''`` so sudo reads its password from stdin.
- sudo_stdin is the password string with a trailing newline that the
caller must prepend to the process's stdin stream. sudo -S reads
exactly one line (the password) and passes the rest of stdin to the
child command, so prepending is safe even when the caller also has
its own stdin_data to pipe.
- If no password is available, sudo_stdin is None and the command is
returned unchanged so it fails gracefully with
"sudo: a password is required".
Callers that drive a subprocess directly (local, ssh, docker, singularity)
should prepend sudo_stdin to their stdin_data and pass the merged bytes to
Popen's stdin pipe.
Callers that cannot pipe subprocess stdin (modal, daytona) must embed the
password in the command string themselves; see their execute() methods for
how they handle the non-None sudo_stdin case.
If SUDO_PASSWORD is not set and in interactive mode (HERMES_INTERACTIVE=1):
Prompts user for password with 45s timeout, caches for session.
If SUDO_PASSWORD is not set and NOT interactive:
Command runs as-is (fails gracefully with "sudo: a password is required").
"""
global _cached_sudo_password
import re
# Check if command even contains sudo
if not re.search(r'\bsudo\b', command):
return command # No sudo in command, return as-is
return command, None # No sudo in command, nothing to do
# Try to get password from: env var -> session cache -> interactive prompt
sudo_password = os.getenv("SUDO_PASSWORD", "") or _cached_sudo_password
if not sudo_password:
# No password configured - check if we're in interactive mode
if os.getenv("HERMES_INTERACTIVE"):
@@ -311,21 +342,21 @@ def _transform_sudo_command(command: str) -> str:
sudo_password = _prompt_for_sudo_password(timeout_seconds=45)
if sudo_password:
_cached_sudo_password = sudo_password # Cache for session
if not sudo_password:
return command # No password, let it fail gracefully
return command, None # No password, let it fail gracefully
def replace_sudo(match):
# Replace 'sudo' with password-piped version
# The -S flag makes sudo read password from stdin
# The -p '' suppresses the password prompt
# Use shlex.quote() to prevent shell injection via password content
import shlex
return f"echo {shlex.quote(sudo_password)} | sudo -S -p ''"
# Replace bare 'sudo' with 'sudo -S -p ""'.
# The password is returned as sudo_stdin and must be written to the
# process's stdin pipe by the caller — it never appears in any
# command-line argument or shell string.
return "sudo -S -p ''"
# Match 'sudo' at word boundaries (not 'visudo' or 'sudoers')
# This handles: sudo, sudo -flag, etc.
return re.sub(r'\bsudo\b', replace_sudo, command)
transformed = re.sub(r'\bsudo\b', replace_sudo, command)
# Trailing newline is required: sudo -S reads one line for the password.
return transformed, sudo_password + "\n"
# Environment classes now live in tools/environments/
@@ -424,7 +455,8 @@ def _get_env_config() -> Dict[str, Any]:
# SSH is excluded since /home/ paths are valid on remote machines.
cwd = os.getenv("TERMINAL_CWD", default_cwd)
if env_type in ("modal", "docker", "singularity", "daytona") and cwd:
host_prefixes = ("/Users/", "C:\\", "C:/")
# Host paths that won't exist inside containers
host_prefixes = ("/Users/", "/home/", "C:\\", "C:/")
if any(cwd.startswith(p) for p in host_prefixes) and cwd != default_cwd:
logger.info("Ignoring TERMINAL_CWD=%r for %s backend "
"(host path won't exist in sandbox). Using %r instead.",
@@ -658,8 +690,8 @@ def get_active_environments_info() -> Dict[str, Any]:
try:
size = sum(f.stat().st_size for f in Path(path).rglob('*') if f.is_file())
total_size += size
except OSError:
pass
except OSError as e:
logger.debug("Could not stat path %s: %s", path, e)
info["total_disk_usage_mb"] = round(total_size / (1024 * 1024), 2)
return info
@@ -686,8 +718,8 @@ def cleanup_all_environments():
try:
shutil.rmtree(path, ignore_errors=True)
logger.info("Removed orphaned: %s", path)
except OSError:
pass
except OSError as e:
logger.debug("Failed to remove orphaned path %s: %s", path, e)
if cleaned > 0:
logger.info("Cleaned %d environments", cleaned)
+28 -5
@@ -83,7 +83,11 @@ def _load_tts_config() -> Dict[str, Any]:
from hermes_cli.config import load_config
config = load_config()
return config.get("tts", {})
except Exception:
except ImportError:
logger.debug("hermes_cli.config not available, using default TTS config")
return {}
except Exception as e:
logger.warning("Failed to load TTS config: %s", e, exc_info=True)
return {}
@@ -115,15 +119,23 @@ def _convert_to_opus(mp3_path: str) -> Optional[str]:
ogg_path = mp3_path.rsplit(".", 1)[0] + ".ogg"
try:
subprocess.run(
result = subprocess.run(
["ffmpeg", "-i", mp3_path, "-acodec", "libopus",
"-ac", "1", "-b:a", "64k", "-vbr", "off", ogg_path, "-y"],
capture_output=True, timeout=30,
)
if result.returncode != 0:
logger.warning("ffmpeg conversion failed with return code %d: %s",
result.returncode, result.stderr.decode('utf-8', errors='ignore')[:200])
return None
if os.path.exists(ogg_path) and os.path.getsize(ogg_path) > 0:
return ogg_path
except subprocess.TimeoutExpired:
logger.warning("ffmpeg OGG conversion timed out after 30s")
except FileNotFoundError:
logger.warning("ffmpeg not found in PATH")
except Exception as e:
logger.warning("ffmpeg OGG conversion failed: %s", e)
logger.warning("ffmpeg OGG conversion failed: %s", e, exc_info=True)
return None
@@ -369,10 +381,21 @@ def text_to_speech_tool(
"voice_compatible": voice_compatible,
}, ensure_ascii=False)
except Exception as e:
error_msg = f"TTS generation failed ({provider}): {e}"
except ValueError as e:
# Configuration errors (missing API keys, etc.)
error_msg = f"TTS configuration error ({provider}): {e}"
logger.error("%s", error_msg)
return json.dumps({"success": False, "error": error_msg}, ensure_ascii=False)
except FileNotFoundError as e:
# Missing dependencies or files
error_msg = f"TTS dependency missing ({provider}): {e}"
logger.error("%s", error_msg, exc_info=True)
return json.dumps({"success": False, "error": error_msg}, ensure_ascii=False)
except Exception as e:
# Unexpected errors
error_msg = f"TTS generation failed ({provider}): {e}"
logger.error("%s", error_msg, exc_info=True)
return json.dumps({"success": False, "error": error_msg}, ensure_ascii=False)
# ===========================================================================
+48
@@ -6,6 +6,8 @@ import tempfile
from pathlib import Path
from typing import Any, Union
import yaml
def atomic_json_write(path: Union[str, Path], data: Any, *, indent: int = 2) -> None:
"""Write JSON data to a file atomically.
@@ -39,3 +41,49 @@ def atomic_json_write(path: Union[str, Path], data: Any, *, indent: int = 2) ->
except OSError:
pass
raise
def atomic_yaml_write(
path: Union[str, Path],
data: Any,
*,
default_flow_style: bool = False,
sort_keys: bool = False,
extra_content: str | None = None,
) -> None:
"""Write YAML data to a file atomically.
Uses temp file + fsync + os.replace to ensure the target file is never
left in a partially-written state. If the process crashes mid-write,
the previous version of the file remains intact.
Args:
path: Target file path (will be created or overwritten).
data: YAML-serializable data to write.
default_flow_style: YAML flow style (default False).
sort_keys: Whether to sort dict keys (default False).
extra_content: Optional string to append after the YAML dump
(e.g. commented-out sections for user reference).
"""
path = Path(path)
path.parent.mkdir(parents=True, exist_ok=True)
fd, tmp_path = tempfile.mkstemp(
dir=str(path.parent),
prefix=f".{path.stem}_",
suffix=".tmp",
)
try:
with os.fdopen(fd, "w", encoding="utf-8") as f:
yaml.dump(data, f, default_flow_style=default_flow_style, sort_keys=sort_keys)
if extra_content:
f.write(extra_content)
f.flush()
os.fsync(f.fileno())
os.replace(tmp_path, path)
except BaseException:
try:
os.unlink(tmp_path)
except OSError:
pass
raise
+3
@@ -24,6 +24,7 @@ These are commands you run from your shell.
| `hermes chat --toolsets "web,terminal"` / `-t` | Use specific toolsets |
| `hermes chat --verbose` | Enable verbose/debug output |
| `hermes --worktree` / `-w` | Start in an isolated git worktree (for parallel agents) |
| `hermes --checkpoints` | Enable filesystem checkpoints before destructive file operations |
### Provider & Model Management
@@ -202,6 +203,8 @@ These work in messaging platforms (Telegram, Discord, Slack, WhatsApp) but not t
| `/sethome` | Set this chat as the home channel |
| `/status` | Show session info |
| `/reload-mcp` | Reload MCP servers from config |
| `/rollback` | List filesystem checkpoints for the current directory |
| `/rollback <N>` | Restore files to checkpoint #N |
| `/update` | Update Hermes Agent to the latest version |
---
+10
@@ -663,6 +663,16 @@ browser:
record_sessions: false # Auto-record browser sessions as WebM videos to ~/.hermes/browser_recordings/
```
## Checkpoints
Automatic filesystem snapshots before destructive file operations. See the [Checkpoints feature page](/docs/user-guide/features/checkpoints) for details.
```yaml
checkpoints:
enabled: false # Enable automatic checkpoints (also: hermes --checkpoints)
max_snapshots: 50 # Max checkpoints to keep per directory
```
## Delegation
Configure subagent behavior for the delegate tool:
@@ -0,0 +1,97 @@
# Filesystem Checkpoints
Hermes can automatically snapshot your working directory before making file changes, giving you a safety net to roll back if something goes wrong.
## How It Works
When enabled, Hermes takes **one snapshot per conversation turn**, immediately before the turn's first file-modifying operation (`write_file` or `patch`). Each snapshot is a point-in-time backup that you can restore later.
Under the hood, checkpoints use a **shadow git repository** stored at `~/.hermes/checkpoints/`. This is completely separate from your project's git — no `.git` directory is created in your project, and your own git history is never touched.
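To make the separation concrete, here is a minimal sketch of driving a repository that lives outside the project through git's `GIT_DIR` and `GIT_WORK_TREE` environment variables. The function names, the commit identity, and the directory layout are assumptions for illustration; this is not Hermes's actual implementation.

```python
import os
import subprocess
from pathlib import Path


def shadow_git_env(shadow_dir: Path, work_tree: Path) -> dict:
    """Build an environment that points git at a repository outside the project."""
    env = dict(os.environ)
    env["GIT_DIR"] = str(shadow_dir)        # where objects and refs are stored
    env["GIT_WORK_TREE"] = str(work_tree)   # the project files being tracked
    return env


def snapshot(shadow_dir: Path, work_tree: Path, reason: str) -> bool:
    """Commit the work tree into the shadow repo. Returns False if nothing changed."""
    env = shadow_git_env(shadow_dir, work_tree)
    if not (shadow_dir / "HEAD").exists():
        shadow_dir.mkdir(parents=True, exist_ok=True)
        subprocess.run(["git", "init", "--quiet"], env=env, cwd=work_tree, check=True)

    subprocess.run(["git", "add", "--all"], env=env, cwd=work_tree, check=True)
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        env=env, cwd=work_tree, check=True, capture_output=True, text=True,
    )
    if not status.stdout.strip():
        return False  # nothing staged, so no commit is created

    subprocess.run(
        ["git",
         "-c", "user.name=checkpoint",             # hypothetical identity
         "-c", "user.email=checkpoint@localhost",
         "commit", "--quiet", "-m", reason],
        env=env, cwd=work_tree, check=True,
    )
    return True
```

Because every git call receives the overridden environment, nothing is written under the project directory: the project's own `.git`, if it has one, is neither read nor modified.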
## Enabling Checkpoints
### Per-session (CLI flag)
```bash
hermes --checkpoints
```
### Permanently (config.yaml)
```yaml
# ~/.hermes/config.yaml
checkpoints:
enabled: true
max_snapshots: 50 # max checkpoints per directory (default: 50)
```
## Rolling Back
Use the `/rollback` slash command:
```
/rollback # List all available checkpoints
/rollback 1 # Restore to checkpoint #1 (most recent)
/rollback 3 # Restore to checkpoint #3 (further back)
/rollback abc1234 # Restore by git commit hash
```
Example output:
```
📸 Checkpoints for /home/user/project:
1. abc1234 2026-03-10 14:22 before write_file
2. def5678 2026-03-10 14:15 before patch
3. ghi9012 2026-03-10 14:08 before write_file
Use /rollback <number> to restore, e.g. /rollback 1
```
When you restore, Hermes automatically takes a **pre-rollback snapshot** first — so you can always undo your undo.
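As an illustration of the two argument forms, the sketch below resolves a `/rollback` argument against a newest-first checkpoint list. The `Checkpoint` type, the function name, and the prefix-matching rule for hashes are assumptions made for this example, not the command's real implementation.

```python
from typing import List, NamedTuple


class Checkpoint(NamedTuple):
    commit: str      # abbreviated commit hash
    timestamp: str
    reason: str


def resolve_rollback_target(arg: str, checkpoints: List[Checkpoint]) -> Checkpoint:
    """Map a /rollback argument to a checkpoint (list is ordered newest first)."""
    arg = arg.strip()
    if not arg:
        raise ValueError("expected a checkpoint number or commit hash")

    # Numbers are 1-based positions in the listing: 1 is the most recent.
    if arg.isdigit() and 1 <= int(arg) <= len(checkpoints):
        return checkpoints[int(arg) - 1]

    # Otherwise treat the argument as a commit hash (or an unambiguous prefix).
    matches = [c for c in checkpoints if c.commit.startswith(arg)]
    if len(matches) != 1:
        raise ValueError(f"no unique checkpoint matches {arg!r}")
    return matches[0]


listing = [
    Checkpoint("abc1234", "2026-03-10 14:22", "before write_file"),
    Checkpoint("def5678", "2026-03-10 14:15", "before patch"),
    Checkpoint("ghi9012", "2026-03-10 14:08", "before write_file"),
]
```

With the listing above, `resolve_rollback_target("3", listing)` and `resolve_rollback_target("ghi9012", listing)` select the same checkpoint.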
## What Gets Checkpointed
Checkpoints capture the entire working directory (the project root), excluding common large/sensitive patterns:
- `node_modules/`, `dist/`, `build/`
- `.env`, `.env.*`
- `__pycache__/`, `*.pyc`
- `.venv/`, `venv/`
- `.git/`
- `.DS_Store`, `*.log`
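The list above can be read as a set of directory names plus a set of file-name globs. The sketch below checks a relative file path against both. How Hermes actually applies the exclusions (for example through a git exclude file) is not stated here, so the function and constant names are purely illustrative.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Directory names taken from the list above.
EXCLUDED_DIRS = {"node_modules", "dist", "build", "__pycache__", ".venv", "venv", ".git"}
# File-name globs taken from the list above.
EXCLUDED_FILES = (".env", ".env.*", "*.pyc", ".DS_Store", "*.log")


def is_excluded(relative_file_path: str) -> bool:
    """Return True if a file path (relative to the project root) would be skipped."""
    path = PurePosixPath(relative_file_path)
    # Any excluded directory among the file's ancestors excludes the file.
    if any(part in EXCLUDED_DIRS for part in path.parts[:-1]):
        return True
    # Otherwise compare the file name against each glob, case-sensitively.
    return any(fnmatchcase(path.name, pattern) for pattern in EXCLUDED_FILES)
```

Under these rules `node_modules/react/index.js` and `.env.local` are skipped, while `src/environment.py` is kept.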
## Performance
Checkpoints are designed to be lightweight:
- **Once per turn** — only the first file operation triggers a snapshot, not every write
- **Skips large directories** — directories with >50,000 files are skipped automatically
- **Skips when nothing changed** — if no files were modified since the last checkpoint, no commit is created
- **Non-blocking** — if a checkpoint fails for any reason, the file operation proceeds normally
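The rules above can be sketched as a small gate that sits in front of the snapshot call. `TurnCheckpointer`, `exceeds_file_limit`, and the `take_snapshot` callback are hypothetical names; only the 50,000-file threshold and the once-per-turn, skip-on-failure behaviour come from the list above.

```python
import os
from pathlib import Path
from typing import Callable, Set

MAX_FILES = 50_000  # threshold quoted in the list above


def exceeds_file_limit(root: Path, limit: int = MAX_FILES) -> bool:
    """Count files under root, stopping as soon as the limit is passed."""
    count = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        count += len(filenames)
        if count > limit:
            return True
    return False


class TurnCheckpointer:
    """Allow at most one snapshot per project root per conversation turn."""

    def __init__(self, take_snapshot: Callable[[Path], None], file_limit: int = MAX_FILES):
        self._take_snapshot = take_snapshot
        self._file_limit = file_limit
        self._seen_this_turn: Set[Path] = set()

    def new_turn(self) -> None:
        self._seen_this_turn.clear()

    def ensure_checkpoint(self, root: Path) -> bool:
        """Return True if a snapshot was taken; never raise to the caller."""
        if root in self._seen_this_turn:
            return False                      # already handled this turn
        self._seen_this_turn.add(root)
        if exceeds_file_limit(root, self._file_limit):
            return False                      # directory too large, skip
        try:
            self._take_snapshot(root)
        except Exception:
            return False                      # failure must not block the file operation
        return True
```

A caller would invoke `ensure_checkpoint(root)` before each file-modifying operation and `new_turn()` when the next conversation turn begins, then carry on with the write regardless of the return value.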
## How It Determines the Project Root
When you write to a file like `src/components/Button.tsx`, Hermes walks up the directory tree looking for project markers (`.git`, `pyproject.toml`, `package.json`, `Cargo.toml`, etc.) to find the project root. This ensures the entire project is checkpointed, not just the file's parent directory.
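The upward walk can be sketched in a few lines. The marker tuple below lists only the four markers named above (the real list is longer), and falling back to the file's parent directory when no marker is found is an assumption of this example.

```python
from pathlib import Path

# Only the markers named in the text; the actual list contains more entries.
PROJECT_MARKERS = (".git", "pyproject.toml", "package.json", "Cargo.toml")


def find_project_root(file_path: Path) -> Path:
    """Walk up from the file's directory until a project marker is found."""
    start = file_path.resolve().parent
    for directory in (start, *start.parents):
        if any((directory / marker).exists() for marker in PROJECT_MARKERS):
            return directory
    # Assumed fallback: no marker anywhere above, so use the file's own directory.
    return start
```

For a project with `package.json` at its top level, `find_project_root(Path("src/components/Button.tsx"))` returns the directory containing `package.json` rather than `src/components`.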
## Platforms
Checkpoints work on both:
- **CLI** — uses your current working directory
- **Gateway** (Telegram, Discord, etc.) — uses `MESSAGING_CWD`
The `/rollback` command is available on all platforms.
## FAQ
**Does this conflict with my project's git?**
No. Checkpoints use a completely separate shadow git repository via `GIT_DIR` environment variables. Your project's `.git/` is never touched.
**How much disk space do checkpoints use?**
Usually very little. Git stores each file version once and compresses its objects, so snapshots that share most of their files add little data. Old checkpoints are pruned when `max_snapshots` is exceeded.
**Can I checkpoint without git installed?**
No. Git must be available on your PATH; if it is not installed, checkpoints are silently disabled.
**Can I roll back across sessions?**
Yes! Checkpoints persist in `~/.hermes/checkpoints/` and survive across sessions. You can roll back to a checkpoint from yesterday.
+2 -2
@@ -5,7 +5,7 @@ import type * as Preset from '@docusaurus/preset-classic';
const config: Config = {
title: 'Hermes Agent',
tagline: 'The self-improving AI agent',
favicon: 'img/favicon.svg',
favicon: 'img/favicon.ico',
url: 'https://hermes-agent.nousresearch.com',
baseUrl: '/docs/',
@@ -53,7 +53,7 @@ const config: Config = {
title: 'Hermes Agent',
logo: {
alt: 'Hermes Agent',
src: 'img/favicon.svg',
src: 'img/logo.png',
},
items: [
{
Binary file not shown. Size: 28 KiB
Binary file not shown. Size: 870 B
Binary file not shown. Size: 2.5 KiB
Some files were not shown because too many files have changed in this diff.