Compare commits
111 Commits
| SHA1 | Author | Date | |
|---|---|---|---|
| d6ab35c1a3 | |||
| cea78c5e27 | |||
| d04b9f4dc5 | |||
| 8eefbef91c | |||
| e590caf8d8 | |||
| 46b95ee694 | |||
| 0fdeffe6c4 | |||
| cc4ead999a | |||
| 60cba55d82 | |||
| 1caee06b22 | |||
| a6eaf0f41f | |||
| fadad820dd | |||
| e8b19b5826 | |||
| 9ea2209a43 | |||
| 87af622df4 | |||
| 2c21c4b897 | |||
| 771969f747 | |||
| e9742e202f | |||
| a2ea85924a | |||
| 8318a519e6 | |||
| 8ef3c815e7 | |||
| de07aa7c40 | |||
| 928bb16da1 | |||
| 441f498d6f | |||
| a630ca15de | |||
| 52e3580cd4 | |||
| 694a3ebdd5 | |||
| 2a062e2f45 | |||
| 49ec1c9e8f | |||
| 4bd579f915 | |||
| e4adb67ed8 | |||
| ff09cad879 | |||
| 580e6ba2ff | |||
| d6d5a43d3a | |||
| d723208b1b | |||
| b0a5fe8974 | |||
| 899dfdcfb9 | |||
| 8f0b07ed29 | |||
| f16f2912cf | |||
| af748539f8 | |||
| 695c017411 | |||
| 5e6c7bc205 | |||
| e8cec55fad | |||
| 67fc6bc4e9 | |||
| cbca0225f6 | |||
| 36ac91c902 | |||
| a2902fbad5 | |||
| d03de749a1 | |||
| c3dec1dcda | |||
| 4945240fc3 | |||
| f6bc620d39 | |||
| b4b46d1b67 | |||
| c1775de56f | |||
| de6750ed23 | |||
| c0ffd6b704 | |||
| 8b9de366f2 | |||
| 60d3f79c72 | |||
| 6f3a673aba | |||
| ab6a6338c4 | |||
| 1ec8c1fcaa | |||
| 739eb6702e | |||
| 1aa7badb3c | |||
| ee4008431a | |||
| 88f8bcde38 | |||
| 2285615010 | |||
| 805ce8177b | |||
| bdce33e239 | |||
| 9be8d88ccc | |||
| 6ab3ebf195 | |||
| 0a628c1aef | |||
| 36328a996f | |||
| 4bc32dc0f1 | |||
| 4de5e017f1 | |||
| 3e352f8a0d | |||
| 28ae5db9b0 | |||
| d5811c887a | |||
| 975fd86dc4 | |||
| 0ff7fe3ee2 | |||
| b9d55d5719 | |||
| ab7dc22984 | |||
| bf8350ac18 | |||
| a5c6348d41 | |||
| 320f881e0b | |||
| d8df91dfa8 | |||
| 7b1f40dd00 | |||
| 86eed141af | |||
| c6df39955c | |||
| 19459b7623 | |||
| b0b19fdeb1 | |||
| 8c26a057a3 | |||
| ae4644f495 | |||
| 70cffa4d3b | |||
| ee7d8c56c7 | |||
| 40bc7216e1 | |||
| 5cdcb9e26f | |||
| ce7e7fef30 | |||
| 86caa8539c | |||
| 53b4b7651a | |||
| a857321463 | |||
| 33cfe1515d | |||
| 3b43f7267a | |||
| 1755a9e38a | |||
| 566aeaeefa | |||
| 7a0544ab57 | |||
| 453e0677d6 | |||
| 32dbd31b9a | |||
| 81986022b7 | |||
| dcba291d45 | |||
| 48e65631f6 | |||
| 14a11d24b4 | |||
| 71c0cd00e5 |
@@ -31,7 +31,8 @@ hermes-agent/
|
||||
│ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
|
||||
│ ├── commands.py # Slash command definitions + SlashCommandCompleter
|
||||
│ ├── callbacks.py # Terminal callbacks (clarify, sudo, approval)
|
||||
│ └── setup.py # Interactive setup wizard
|
||||
│ ├── setup.py # Interactive setup wizard
|
||||
│ └── skin_engine.py # Skin/theme engine — CLI visual customization
|
||||
├── tools/ # Tool implementations (one file per tool)
|
||||
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
|
||||
│ ├── approval.py # Dangerous command detection
|
||||
@@ -121,6 +122,7 @@ Messages follow OpenAI format: `{"role": "system/user/assistant/tool", ...}`. Re
|
||||
- **Rich** for banner/panels, **prompt_toolkit** for input with autocomplete
|
||||
- **KawaiiSpinner** (`agent/display.py`) — animated faces during API calls, `┊` activity feed for tool results
|
||||
- `load_cli_config()` in cli.py merges hardcoded defaults + user config YAML
|
||||
- **Skin engine** (`hermes_cli/skin_engine.py`) — data-driven CLI theming; initialized from `display.skin` config key at startup; skins customize banner colors, spinner faces/verbs/wings, tool prefix, response box, branding text
|
||||
- `process_command()` is a method on `HermesCLI` (not in commands.py)
|
||||
- Skill slash commands: `agent/skill_commands.py` scans `~/.hermes/skills/`, injects as **user message** (not system prompt) to preserve prompt caching
|
||||
|
||||
@@ -195,6 +197,94 @@ The registry handles schema collection, dispatch, availability checking, and err
|
||||
|
||||
---
|
||||
|
||||
## Skin/Theme System
|
||||
|
||||
The skin engine (`hermes_cli/skin_engine.py`) provides data-driven CLI visual customization. Skins are **pure data** — no code changes needed to add a new skin.
|
||||
|
||||
### Architecture
|
||||
|
||||
```
|
||||
hermes_cli/skin_engine.py # SkinConfig dataclass, built-in skins, YAML loader
|
||||
~/.hermes/skins/*.yaml # User-installed custom skins (drop-in)
|
||||
```
|
||||
|
||||
- `init_skin_from_config()` — called at CLI startup, reads `display.skin` from config
|
||||
- `get_active_skin()` — returns cached `SkinConfig` for the current skin
|
||||
- `set_active_skin(name)` — switches skin at runtime (used by `/skin` command)
|
||||
- `load_skin(name)` — loads from user skins first, then built-ins, then falls back to default
|
||||
- Missing skin values inherit from the `default` skin automatically
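
The lookup order listed for `load_skin(name)` can be sketched roughly as follows. This is an illustration under assumptions, not the engine's code: `find_skin`, `user_skins`, and `builtin_skins` are hypothetical names, and plain dicts stand in for the YAML files under `~/.hermes/skins/` and for the built-in table.

```python
def find_skin(name, user_skins, builtin_skins):
    """Return (source, data) for a skin name.

    Hypothetical helper: user skins win over built-ins, and an unknown
    name falls back to the built-in "default" skin.
    """
    if name in user_skins:
        return "user", user_skins[name]
    if name in builtin_skins:
        return "builtin", builtin_skins[name]
    return "fallback", builtin_skins["default"]


# Placeholder data, invented for the example.
builtins = {"default": {"tool_prefix": "┊"}, "mono": {"tool_prefix": "|"}}
users = {"mono": {"tool_prefix": ">"}}

print(find_skin("mono", users, builtins))     # user copy shadows the built-in
print(find_skin("missing", users, builtins))  # unknown name uses the default
```

Checking user skins first means a file dropped into the skins directory can shadow a built-in of the same name, which matches the order given in the list above.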
|
||||
|
||||
### What skins customize
|
||||
|
||||
| Element | Skin Key | Used By |
|---------|----------|---------|
| Banner panel border | `colors.banner_border` | `banner.py` |
| Banner panel title | `colors.banner_title` | `banner.py` |
| Banner section headers | `colors.banner_accent` | `banner.py` |
| Banner dim text | `colors.banner_dim` | `banner.py` |
| Banner body text | `colors.banner_text` | `banner.py` |
| Response box border | `colors.response_border` | `cli.py` |
| Spinner faces (waiting) | `spinner.waiting_faces` | `display.py` |
| Spinner faces (thinking) | `spinner.thinking_faces` | `display.py` |
| Spinner verbs | `spinner.thinking_verbs` | `display.py` |
| Spinner wings (optional) | `spinner.wings` | `display.py` |
| Tool output prefix | `tool_prefix` | `display.py` |
| Agent name | `branding.agent_name` | `banner.py`, `cli.py` |
| Welcome message | `branding.welcome` | `cli.py` |
| Response box label | `branding.response_label` | `cli.py` |
| Prompt symbol | `branding.prompt_symbol` | `cli.py` |
|
||||
|
||||
### Built-in skins
|
||||
|
||||
- `default` — Classic Hermes gold/kawaii (the current look)
|
||||
- `ares` — Crimson/bronze war-god theme with custom spinner wings
|
||||
- `mono` — Clean grayscale monochrome
|
||||
- `slate` — Cool blue developer-focused theme
|
||||
|
||||
### Adding a built-in skin
|
||||
|
||||
Add to `_BUILTIN_SKINS` dict in `hermes_cli/skin_engine.py`:
|
||||
|
||||
```python
"mytheme": {
    "name": "mytheme",
    "description": "Short description",
    "colors": { ... },
    "spinner": { ... },
    "branding": { ... },
    "tool_prefix": "┊",
},
```
|
||||
|
||||
### User skins (YAML)
|
||||
|
||||
Users create `~/.hermes/skins/<name>.yaml`:
|
||||
|
||||
```yaml
name: cyberpunk
description: Neon-soaked terminal theme

colors:
  banner_border: "#FF00FF"
  banner_title: "#00FFFF"
  banner_accent: "#FF1493"

spinner:
  thinking_verbs: ["jacking in", "decrypting", "uploading"]
  wings:
    - ["⟨⚡", "⚡⟩"]

branding:
  agent_name: "Cyber Agent"
  response_label: " ⚡ Cyber "

tool_prefix: "▏"
```
|
||||
|
||||
Activate with `/skin cyberpunk` or `display.skin: cyberpunk` in config.yaml.
|
||||
|
||||
---
|
||||
|
||||
## Important Policies
|
||||
|
||||
### Prompt Caching Must Not Break
|
||||
@@ -210,6 +300,17 @@ Cache-breaking forces dramatically higher costs. The ONLY time we alter context
|
||||
- **CLI**: Uses current directory (`.` → `os.getcwd()`)
|
||||
- **Messaging**: Uses `MESSAGING_CWD` env var (default: home directory)
|
||||
|
||||
### Background Process Notifications (Gateway)
|
||||
|
||||
When `terminal(background=true, check_interval=...)` is used, the gateway runs a watcher that
|
||||
pushes status updates to the user's chat. Control verbosity with `display.background_process_notifications`
|
||||
in config.yaml (or `HERMES_BACKGROUND_NOTIFICATIONS` env var):
|
||||
|
||||
- `all` — running-output updates + final message (default)
|
||||
- `result` — only the final completion message
|
||||
- `error` — only the final message when exit code != 0
|
||||
- `off` — no watcher messages at all
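
As a rough sketch of how the four verbosity levels could gate watcher messages, the function below maps a mode and an event to a send/skip decision. The name `should_send` and its parameters are hypothetical, and treating an unrecognized mode like `all` is an assumption rather than documented gateway behavior.

```python
from typing import Optional


def should_send(mode: str, is_final: bool, exit_code: Optional[int] = None) -> bool:
    """Decide whether a watcher message is pushed to the chat (illustrative only)."""
    if mode == "off":
        return False                      # no watcher messages at all
    if mode == "result":
        return is_final                   # only the final completion message
    if mode == "error":
        # only the final message, and only when the process failed
        return is_final and exit_code not in (None, 0)
    return True                           # "all": running updates + final message


# A running-output update is dropped under "result" but kept under "all".
print(should_send("result", is_final=False))
print(should_send("all", is_final=False))
```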
|
||||
|
||||
---
|
||||
|
||||
## Known Pitfalls
|
||||
|
||||
@@ -139,7 +139,8 @@ hermes-agent/
|
||||
│ ├── commands.py # Slash command definitions + autocomplete
|
||||
│ ├── callbacks.py # Interactive callbacks (clarify, sudo, approval)
|
||||
│ ├── doctor.py # Diagnostics
|
||||
│ └── skills_hub.py # Skills Hub CLI + /skills slash command
|
||||
│ ├── skills_hub.py # Skills Hub CLI + /skills slash command
|
||||
│ └── skin_engine.py # Skin/theme engine — data-driven CLI visual customization
|
||||
│
|
||||
├── tools/ # Tool implementations (self-registering)
|
||||
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
|
||||
@@ -375,6 +376,56 @@ If the field is omitted or empty, the skill loads on all platforms (backward com
|
||||
|
||||
---
|
||||
|
||||
## Adding a Skin / Theme
|
||||
|
||||
Hermes uses a data-driven skin system — no code changes needed to add a new skin.
|
||||
|
||||
**Option A: User skin (YAML file)**
|
||||
|
||||
Create `~/.hermes/skins/<name>.yaml`:
|
||||
|
||||
```yaml
name: mytheme
description: Short description of the theme

colors:
  banner_border: "#HEX"     # Panel border color
  banner_title: "#HEX"      # Panel title color
  banner_accent: "#HEX"     # Section header color
  banner_dim: "#HEX"        # Muted/dim text color
  banner_text: "#HEX"       # Body text color
  response_border: "#HEX"   # Response box border

spinner:
  waiting_faces: ["(⚔)", "(⛨)"]
  thinking_faces: ["(⚔)", "(⌁)"]
  thinking_verbs: ["forging", "plotting"]
  wings:                    # Optional left/right decorations
    - ["⟪⚔", "⚔⟫"]

branding:
  agent_name: "My Agent"
  welcome: "Welcome message"
  response_label: " ⚔ Agent "
  prompt_symbol: "⚔ ❯ "

tool_prefix: "╎"            # Tool output line prefix
```
|
||||
|
||||
All fields are optional — missing values inherit from the default skin.
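
One plausible way to implement "missing values inherit from the default skin" is a recursive merge of the user's partial skin over the defaults. The sketch below is an assumption about the mechanism, not code from `skin_engine.py`; `merge_skin` and `DEFAULT_SKIN` are hypothetical names, and the default values are placeholders.

```python
from copy import deepcopy

# Placeholder defaults for illustration only.
DEFAULT_SKIN = {
    "colors": {"banner_border": "#CD7F32", "banner_title": "#FFD700"},
    "tool_prefix": "┊",
}


def merge_skin(default: dict, override: dict) -> dict:
    """Return a new dict: `override` layered over `default`, section by section."""
    merged = deepcopy(default)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested section (colors, spinner, branding): merge key by key.
            merged[key] = merge_skin(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# A skin that sets a single color keeps every other default.
partial = {"colors": {"banner_border": "#FF00FF"}}
print(merge_skin(DEFAULT_SKIN, partial))
```

Merging per section rather than replacing whole sections is what lets a skin file override one color without restating the rest.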
|
||||
|
||||
**Option B: Built-in skin**
|
||||
|
||||
Add to `_BUILTIN_SKINS` dict in `hermes_cli/skin_engine.py`. Use the same schema as above but as a Python dict. Built-in skins ship with the package and are always available.
|
||||
|
||||
**Activating:**
|
||||
- CLI: `/skin mytheme` (switches the skin at runtime)
- Config: set `display.skin: mytheme` in config.yaml (flow style: `display: { skin: mytheme }`)
|
||||
|
||||
See `hermes_cli/skin_engine.py` for the full schema and existing skins as examples.
|
||||
|
||||
---
|
||||
|
||||
## Cross-Platform Compatibility
|
||||
|
||||
Hermes runs on Linux, macOS, and Windows. When writing code that touches the OS:
|
||||
|
||||
@@ -5,8 +5,8 @@ Used by AIAgent._execute_tool_calls for CLI feedback.
|
||||
"""
|
||||
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import random
|
||||
import sys
|
||||
import threading
|
||||
import time
|
||||
@@ -15,6 +15,49 @@ import time
|
||||
_RED = "\033[31m"
|
||||
_RESET = "\033[0m"
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# Skin-aware helpers (lazy import to avoid circular deps)
|
||||
# =========================================================================
|
||||
|
||||
def _get_skin():
    """Get the active skin config, or None if not available."""
    try:
        from hermes_cli.skin_engine import get_active_skin
        return get_active_skin()
    except Exception:
        return None


def get_skin_faces(key: str, default: list) -> list:
    """Get spinner face list from active skin, falling back to default."""
    skin = _get_skin()
    if skin:
        faces = skin.get_spinner_list(key)
        if faces:
            return faces
    return default


def get_skin_verbs() -> list:
    """Get thinking verbs from active skin."""
    skin = _get_skin()
    if skin:
        verbs = skin.get_spinner_list("thinking_verbs")
        if verbs:
            return verbs
    return KawaiiSpinner.THINKING_VERBS


def get_skin_tool_prefix() -> str:
    """Get tool output prefix character from active skin."""
    skin = _get_skin()
    if skin:
        return skin.tool_prefix
    return "┊"
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# Tool preview (one-line summary of a tool call's primary argument)
|
||||
@@ -22,6 +65,8 @@ _RESET = "\033[0m"
|
||||
|
||||
def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str:
|
||||
"""Build a short preview of a tool call's primary argument for display."""
|
||||
if not args:
|
||||
return None
|
||||
primary_args = {
|
||||
"terminal": "command", "web_search": "query", "web_extract": "urls",
|
||||
"read_file": "path", "write_file": "path", "patch": "path",
|
||||
@@ -163,6 +208,7 @@ class KawaiiSpinner:
|
||||
self.frame_idx = 0
|
||||
self.start_time = None
|
||||
self.last_line_len = 0
|
||||
self._last_flush_time = 0.0 # Rate-limit flushes for patch_stdout compat
|
||||
# Capture stdout NOW, before any redirect_stdout(devnull) from
|
||||
# child agents can replace sys.stdout with a black hole.
|
||||
self._out = sys.stdout
|
||||
@@ -177,15 +223,34 @@ class KawaiiSpinner:
|
||||
pass
|
||||
|
||||
def _animate(self):
|
||||
# Cache skin wings at start (avoid per-frame imports)
|
||||
skin = _get_skin()
|
||||
wings = skin.get_spinner_wings() if skin else []
|
||||
|
||||
while self.running:
|
||||
if os.getenv("HERMES_SPINNER_PAUSE"):
|
||||
time.sleep(0.1)
|
||||
continue
|
||||
frame = self.spinner_frames[self.frame_idx % len(self.spinner_frames)]
|
||||
elapsed = time.time() - self.start_time
|
||||
line = f" {frame} {self.message} ({elapsed:.1f}s)"
|
||||
if wings:
|
||||
left, right = wings[self.frame_idx % len(wings)]
|
||||
line = f" {left} {frame} {self.message} {right} ({elapsed:.1f}s)"
|
||||
else:
|
||||
line = f" {frame} {self.message} ({elapsed:.1f}s)"
|
||||
pad = max(self.last_line_len - len(line), 0)
|
||||
self._write(f"\r{line}{' ' * pad}", end='', flush=True)
|
||||
# Rate-limit flush() calls to avoid spinner spam under
|
||||
# prompt_toolkit's patch_stdout. Each flush() pushes a queue
|
||||
# item that may trigger a separate run_in_terminal() call; if
|
||||
# items are processed one-at-a-time the \r overwrite is lost
|
||||
# and every frame appears on its own line. By flushing at
|
||||
# most every 0.4s we guarantee multiple \r-frames are batched
|
||||
# into a single write, so the terminal collapses them correctly.
|
||||
now = time.time()
|
||||
should_flush = (now - self._last_flush_time) >= 0.4
|
||||
self._write(f"\r{line}{' ' * pad}", end='', flush=should_flush)
|
||||
if should_flush:
|
||||
self._last_flush_time = now
|
||||
self.last_line_len = len(line)
|
||||
self.frame_idx += 1
|
||||
time.sleep(0.12)
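
The flush rate-limiting described in the comment above can be isolated into a small standalone sketch. `RateLimitedWriter` is a hypothetical class written for illustration; it is not part of `KawaiiSpinner`, and the injectable `clock` parameter exists only so the behavior can be checked without sleeping.

```python
import io
import time


class RateLimitedWriter:
    """Write every frame, but flush at most once per `min_interval` seconds."""

    def __init__(self, stream, min_interval=0.4, clock=time.monotonic):
        self._stream = stream
        self._min_interval = min_interval
        self._clock = clock
        self._last_flush = None   # None means "never flushed yet"
        self.flush_count = 0

    def write_frame(self, line: str) -> bool:
        """Write one carriage-return frame; return True if it was flushed."""
        self._stream.write("\r" + line)
        now = self._clock()
        due = self._last_flush is None or (now - self._last_flush) >= self._min_interval
        if due:
            self._stream.flush()
            self._last_flush = now
            self.flush_count += 1
        return due


# Frames arrive every 0.12s; only the first and the one at 0.48s are flushed.
ticks = iter([0.0, 0.12, 0.24, 0.36, 0.48])
writer = RateLimitedWriter(io.StringIO(), clock=lambda: next(ticks))
print([writer.write_frame(f"frame {i}") for i in range(5)])
```

Frames written between flushes stay in the stream's buffer, so the next flush delivers several `\r` overwrites in a single write, which is the batching the comment relies on.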
|
||||
@@ -300,7 +365,7 @@ def _detect_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]
|
||||
if exit_code is not None and exit_code != 0:
|
||||
return True, f" [exit {exit_code}]"
|
||||
except (json.JSONDecodeError, TypeError, AttributeError):
|
||||
pass
|
||||
logger.debug("Could not parse terminal result as JSON for exit code check")
|
||||
return False, ""
|
||||
|
||||
# Memory-specific: distinguish "full" from real errors
|
||||
@@ -310,7 +375,7 @@ def _detect_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]
|
||||
if data.get("success") is False and "exceed the limit" in data.get("error", ""):
|
||||
return True, " [full]"
|
||||
except (json.JSONDecodeError, TypeError, AttributeError):
|
||||
pass
|
||||
logger.debug("Could not parse memory result as JSON for capacity check")
|
||||
|
||||
# Generic heuristic for non-terminal tools
|
||||
lower = result[:500].lower()
|
||||
@@ -332,6 +397,7 @@ def get_cute_tool_message(
|
||||
"""
|
||||
dur = f"{duration:.1f}s"
|
||||
is_failure, failure_suffix = _detect_tool_failure(tool_name, result)
|
||||
skin_prefix = get_skin_tool_prefix()
|
||||
|
||||
def _trunc(s, n=40):
|
||||
s = str(s)
|
||||
@@ -342,7 +408,9 @@ def get_cute_tool_message(
|
||||
return ("..." + p[-(n-3):]) if len(p) > n else p
|
||||
|
||||
def _wrap(line: str) -> str:
|
||||
"""Append failure suffix when the tool failed."""
|
||||
"""Apply skin tool prefix and failure suffix."""
|
||||
if skin_prefix != "┊":
|
||||
line = line.replace("┊", skin_prefix, 1)
|
||||
if not is_failure:
|
||||
return line
|
||||
return f"{line}{failure_suffix}"
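
The prefix swap in `_wrap` relies on `str.replace` with a count of 1, so only the leading `┊` marker is rewritten. A standalone sketch of that behavior follows; `apply_prefix` is a hypothetical helper and the sample line is invented for the example.

```python
DEFAULT_PREFIX = "┊"


def apply_prefix(line: str, skin_prefix: str, failure_suffix: str = "") -> str:
    """Swap the first default prefix for the skin's prefix, then append any suffix."""
    if skin_prefix != DEFAULT_PREFIX:
        # count=1: later occurrences of the marker in the line are left alone
        line = line.replace(DEFAULT_PREFIX, skin_prefix, 1)
    return f"{line}{failure_suffix}"


print(apply_prefix("┊ $ echo ┊ done  0.3s", "╎", " [exit 1]"))
```

Limiting the replacement to the first occurrence keeps any `┊` that happens to appear inside the command or path text intact.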
|
||||
|
||||
@@ -159,8 +159,8 @@ def _read_skill_description(skill_file: Path, max_chars: int = 60) -> str:
|
||||
if len(desc) > max_chars:
|
||||
desc = desc[:max_chars - 3] + "..."
|
||||
return desc
|
||||
except Exception:
|
||||
pass
|
||||
except Exception as e:
|
||||
logger.debug("Failed to read skill description from %s: %s", skill_file, e)
|
||||
return ""
|
||||
|
||||
|
||||
|
||||
@@ -10,7 +10,6 @@ the first 6 and last 4 characters for debuggability.
|
||||
import logging
|
||||
import os
|
||||
import re
|
||||
from typing import Optional
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@@ -606,7 +606,7 @@ class BatchRunner:
|
||||
# Create batches
|
||||
self.batches = self._create_batches()
|
||||
|
||||
print(f"📊 Batch Runner Initialized")
|
||||
print("📊 Batch Runner Initialized")
|
||||
print(f" Dataset: {self.dataset_file} ({len(self.dataset)} prompts)")
|
||||
print(f" Batch size: {self.batch_size}")
|
||||
print(f" Total batches: {len(self.batches)}")
|
||||
@@ -826,7 +826,7 @@ class BatchRunner:
|
||||
print("=" * 70)
|
||||
print(f" Original dataset size: {len(self.dataset):,} prompts")
|
||||
print(f" Already completed: {len(skipped_indices):,} prompts")
|
||||
print(f" ─────────────────────────────────────────")
|
||||
print(" ─────────────────────────────────────────")
|
||||
print(f" 🎯 RESUMING WITH: {len(filtered_entries):,} prompts")
|
||||
print(f" New batches created: {len(batches_to_process)}")
|
||||
print("=" * 70 + "\n")
|
||||
@@ -888,7 +888,7 @@ class BatchRunner:
|
||||
]
|
||||
|
||||
print(f"✅ Created {len(tasks)} batch tasks")
|
||||
print(f"🚀 Starting parallel batch processing...\n")
|
||||
print("🚀 Starting parallel batch processing...\n")
|
||||
|
||||
# Use rich Progress for better visual tracking with persistent bottom bar
|
||||
# redirect_stdout/stderr lets rich manage all output so progress bar stays clean
|
||||
@@ -1057,7 +1057,7 @@ class BatchRunner:
|
||||
print(f"✅ Total trajectories in merged file: {total_entries - filtered_entries}")
|
||||
print(f"✅ Total batch files merged: {batch_files_found}")
|
||||
print(f"⏱️ Total duration: {round(time.time() - start_time, 2)}s")
|
||||
print(f"\n📈 Tool Usage Statistics:")
|
||||
print("\n📈 Tool Usage Statistics:")
|
||||
print("-" * 70)
|
||||
|
||||
if total_tool_stats:
|
||||
@@ -1084,7 +1084,7 @@ class BatchRunner:
|
||||
# Print reasoning coverage stats
|
||||
total_discarded = sum(r.get("discarded_no_reasoning", 0) for r in results)
|
||||
|
||||
print(f"\n🧠 Reasoning Coverage:")
|
||||
print("\n🧠 Reasoning Coverage:")
|
||||
print("-" * 70)
|
||||
total_turns = total_reasoning_stats["total_assistant_turns"]
|
||||
with_reasoning = total_reasoning_stats["turns_with_reasoning"]
|
||||
@@ -1101,8 +1101,8 @@ class BatchRunner:
|
||||
print(f" 🚫 Samples discarded (zero reasoning): {total_discarded:,}")
|
||||
|
||||
print(f"\n💾 Results saved to: {self.output_dir}")
|
||||
print(f" - Trajectories: trajectories.jsonl (combined)")
|
||||
print(f" - Individual batches: batch_*.jsonl (for debugging)")
|
||||
print(" - Trajectories: trajectories.jsonl (combined)")
|
||||
print(" - Individual batches: batch_*.jsonl (for debugging)")
|
||||
print(f" - Statistics: {self.stats_file.name}")
|
||||
print(f" - Checkpoint: {self.checkpoint_file.name}")
|
||||
|
||||
@@ -1238,7 +1238,7 @@ def main(
|
||||
with open(prefill_messages_file, 'r', encoding='utf-8') as f:
|
||||
prefill_messages = json.load(f)
|
||||
if not isinstance(prefill_messages, list):
|
||||
print(f"❌ Error: prefill_messages_file must contain a JSON array of messages")
|
||||
print("❌ Error: prefill_messages_file must contain a JSON array of messages")
|
||||
return
|
||||
print(f"💬 Loaded {len(prefill_messages)} prefill messages from {prefill_messages_file}")
|
||||
except Exception as e:
|
||||
|
||||
@@ -11,6 +11,7 @@ model:
|
||||
|
||||
# Inference provider selection:
|
||||
# "auto" - Use Nous Portal if logged in, otherwise OpenRouter/env vars (default)
|
||||
# "nous-api" - Use Nous Portal via API key (requires: NOUS_API_KEY)
|
||||
# "openrouter" - Always use OpenRouter API key from OPENROUTER_API_KEY
|
||||
# "nous" - Always use Nous Portal (requires: hermes login)
|
||||
# "zai" - Use z.ai / ZhipuAI GLM models (requires: GLM_API_KEY)
|
||||
@@ -402,11 +403,13 @@ agent:
|
||||
# discord: [web, vision, skills, todo]
|
||||
#
|
||||
# If not set, defaults are:
|
||||
# cli: hermes-cli (everything + cronjob management)
|
||||
# telegram: hermes-telegram (terminal, file, web, vision, image, tts, browser, skills, todo, cronjob, messaging)
|
||||
# discord: hermes-discord (same as telegram)
|
||||
# whatsapp: hermes-whatsapp (same as telegram)
|
||||
# slack: hermes-slack (same as telegram)
|
||||
# cli: hermes-cli (everything + cronjob management)
|
||||
# telegram: hermes-telegram (terminal, file, web, vision, image, tts, browser, skills, todo, cronjob, messaging)
|
||||
# discord: hermes-discord (same as telegram)
|
||||
# whatsapp: hermes-whatsapp (same as telegram)
|
||||
# slack: hermes-slack (same as telegram)
|
||||
# signal: hermes-signal (same as telegram)
|
||||
# homeassistant: hermes-homeassistant (same as telegram)
|
||||
#
|
||||
platform_toolsets:
|
||||
cli: [hermes-cli]
|
||||
@@ -414,6 +417,8 @@ platform_toolsets:
|
||||
discord: [hermes-discord]
|
||||
whatsapp: [hermes-whatsapp]
|
||||
slack: [hermes-slack]
|
||||
signal: [hermes-signal]
|
||||
homeassistant: [hermes-homeassistant]
|
||||
|
||||
# ─────────────────────────────────────────────────────────────────────────────
|
||||
# Available toolsets (use these names in platform_toolsets or the toolsets list)
|
||||
@@ -651,7 +656,57 @@ display:
|
||||
# Toggle at runtime with /verbose in the CLI
|
||||
tool_progress: all
|
||||
|
||||
# Background process notifications (gateway/messaging only).
|
||||
# Controls how chatty the process watcher is when you use
|
||||
# terminal(background=true, check_interval=...) from Telegram/Discord/etc.
|
||||
# off: No watcher messages at all
|
||||
# result: Only the final completion message
|
||||
# error: Only the final message when exit code != 0
|
||||
# all: Running output updates + final message (default)
|
||||
background_process_notifications: all
|
||||
|
||||
# Play terminal bell when agent finishes a response.
|
||||
# Useful for long-running tasks — your terminal will ding when the agent is done.
|
||||
# Works over SSH. Most terminals can be configured to flash the taskbar or play a sound.
|
||||
bell_on_complete: false
|
||||
|
||||
# ───────────────────────────────────────────────────────────────────────────
|
||||
# Skin / Theme
|
||||
# ───────────────────────────────────────────────────────────────────────────
|
||||
# Customize CLI visual appearance — banner colors, spinner faces, tool prefix,
|
||||
# response box label, and branding text. Change at runtime with /skin <name>.
|
||||
#
|
||||
# Built-in skins:
|
||||
# default — Classic Hermes gold/kawaii
|
||||
# ares — Crimson/bronze war-god theme with spinner wings
|
||||
# mono — Clean grayscale monochrome
|
||||
# slate — Cool blue developer-focused
|
||||
#
|
||||
# Custom skins: drop a YAML file in ~/.hermes/skins/<name>.yaml
|
||||
# Schema (all fields optional, missing values inherit from default):
|
||||
#
|
||||
# name: my-theme
|
||||
# description: Short description
|
||||
# colors:
|
||||
# banner_border: "#HEX" # Panel border
|
||||
# banner_title: "#HEX" # Panel title
|
||||
# banner_accent: "#HEX" # Section headers (Available Tools, etc.)
|
||||
# banner_dim: "#HEX" # Dim/muted text
|
||||
# banner_text: "#HEX" # Body text (tool names, skill names)
|
||||
# ui_accent: "#HEX" # UI accent color
|
||||
# response_border: "#HEX" # Response box border color
|
||||
# spinner:
|
||||
# waiting_faces: ["(⚔)", "(⛨)"] # Faces shown while waiting
|
||||
# thinking_faces: ["(⚔)", "(⌁)"] # Faces shown while thinking
|
||||
# thinking_verbs: ["forging", "plotting"] # Verbs for spinner messages
|
||||
# wings: # Optional left/right spinner decorations
|
||||
# - ["⟪⚔", "⚔⟫"]
|
||||
# - ["⟪▲", "▲⟫"]
|
||||
# branding:
|
||||
# agent_name: "My Agent" # Banner title and branding
|
||||
# welcome: "Welcome message" # Shown at CLI startup
|
||||
# response_label: " ⚔ Agent " # Response box header label
|
||||
# prompt_symbol: "⚔ ❯ " # Prompt symbol
|
||||
# tool_prefix: "╎" # Tool output line prefix (default: ┊)
|
||||
#
|
||||
skin: default
|
||||
|
||||
@@ -19,6 +19,7 @@ import sys
|
||||
import json
|
||||
import atexit
|
||||
import uuid
|
||||
import textwrap
|
||||
from pathlib import Path
|
||||
from datetime import datetime
|
||||
from typing import List, Dict, Any, Optional
|
||||
@@ -45,6 +46,11 @@ from prompt_toolkit.widgets import TextArea
|
||||
from prompt_toolkit.key_binding import KeyBindings
|
||||
from prompt_toolkit import print_formatted_text as _pt_print
|
||||
from prompt_toolkit.formatted_text import ANSI as _PT_ANSI
|
||||
try:
|
||||
from prompt_toolkit.cursor_shapes import CursorShape
|
||||
_STEADY_CURSOR = CursorShape.BLOCK # Non-blinking block cursor
|
||||
except (ImportError, AttributeError):
|
||||
_STEADY_CURSOR = None
|
||||
import threading
|
||||
import queue
|
||||
|
||||
@@ -196,6 +202,7 @@ def load_cli_config() -> Dict[str, Any]:
|
||||
"display": {
|
||||
"compact": False,
|
||||
"resume_display": "full",
|
||||
"skin": "default",
|
||||
},
|
||||
"clarify": {
|
||||
"timeout": 120, # Seconds to wait for a clarify answer before auto-proceeding
|
||||
@@ -250,8 +257,13 @@ def load_cli_config() -> Dict[str, Any]:
|
||||
if key not in defaults and key != "model":
|
||||
defaults[key] = file_config[key]
|
||||
|
||||
# Handle root-level max_turns (backwards compat) - copy to agent.max_turns
|
||||
if "max_turns" in file_config and "agent" not in file_config:
|
||||
# Handle legacy root-level max_turns (backwards compat) - copy to
|
||||
# agent.max_turns whenever the nested key is missing.
|
||||
agent_file_config = file_config.get("agent")
|
||||
if "max_turns" in file_config and not (
|
||||
isinstance(agent_file_config, dict)
|
||||
and agent_file_config.get("max_turns") is not None
|
||||
):
|
||||
defaults["agent"]["max_turns"] = file_config["max_turns"]
|
||||
except Exception as e:
|
||||
logger.warning("Failed to load cli-config.yaml: %s", e)
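
The revised backward-compatibility rule for `max_turns` can be read as a small precedence function: a nested `agent.max_turns` wins, otherwise a legacy root-level `max_turns` is used, otherwise the built-in default. The sketch below is illustrative; `resolve_max_turns` and the default of 60 are hypothetical and not taken from `cli.py`.

```python
def resolve_max_turns(file_config: dict, default: int = 60) -> int:
    """Pick max_turns from a parsed config dict (illustrative precedence only)."""
    agent = file_config.get("agent")
    nested = agent.get("max_turns") if isinstance(agent, dict) else None
    if nested is not None:
        return nested                      # explicit agent.max_turns wins
    if "max_turns" in file_config:
        return file_config["max_turns"]    # legacy root-level key
    return default


# An "agent" section without max_turns no longer hides the legacy root key.
print(resolve_max_turns({"max_turns": 5, "agent": {"verbose": True}}))
```

Under the earlier condition (`"agent" not in file_config`), the root-level value in this example would have been ignored as soon as any `agent` section existed.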
|
||||
@@ -377,6 +389,13 @@ def load_cli_config() -> Dict[str, Any]:
|
||||
# Load configuration at module startup
|
||||
CLI_CONFIG = load_cli_config()
|
||||
|
||||
# Initialize the skin engine from config
|
||||
try:
|
||||
from hermes_cli.skin_engine import init_skin_from_config
|
||||
init_skin_from_config(CLI_CONFIG)
|
||||
except Exception:
|
||||
pass # Skin engine is optional — default skin used if unavailable
|
||||
|
||||
from rich.console import Console
|
||||
from rich.panel import Panel
|
||||
from rich.table import Table
|
||||
@@ -695,6 +714,8 @@ class ChatConsole:
|
||||
def print(self, *args, **kwargs):
|
||||
self._buffer.seek(0)
|
||||
self._buffer.truncate()
|
||||
# Read terminal width at render time so panels adapt to current size
|
||||
self._inner.width = shutil.get_terminal_size((80, 24)).columns
|
||||
self._inner.print(*args, **kwargs)
|
||||
output = self._buffer.getvalue()
|
||||
for line in output.rstrip("\n").split("\n"):
|
||||
@@ -828,25 +849,43 @@ def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dic
|
||||
layout_table.add_column("right", justify="left")
|
||||
|
||||
# Build left content: caduceus + model info
|
||||
left_lines = ["", HERMES_CADUCEUS, ""]
|
||||
# Resolve skin colors for the banner
|
||||
try:
|
||||
from hermes_cli.skin_engine import get_active_skin
|
||||
_bskin = get_active_skin()
|
||||
_accent = _bskin.get_color("banner_accent", "#FFBF00")
|
||||
_dim = _bskin.get_color("banner_dim", "#B8860B")
|
||||
_text = _bskin.get_color("banner_text", "#FFF8DC")
|
||||
_session_c = _bskin.get_color("session_border", "#8B8682")
|
||||
_title_c = _bskin.get_color("banner_title", "#FFD700")
|
||||
_border_c = _bskin.get_color("banner_border", "#CD7F32")
|
||||
_agent_name = _bskin.get_branding("agent_name", "Hermes Agent")
|
||||
except Exception:
|
||||
_bskin = None
|
||||
_accent, _dim, _text = "#FFBF00", "#B8860B", "#FFF8DC"
|
||||
_session_c, _title_c, _border_c = "#8B8682", "#FFD700", "#CD7F32"
|
||||
_agent_name = "Hermes Agent"
|
||||
|
||||
_hero = _bskin.banner_hero if hasattr(_bskin, 'banner_hero') and _bskin.banner_hero else HERMES_CADUCEUS
|
||||
left_lines = ["", _hero, ""]
|
||||
|
||||
# Shorten model name for display
|
||||
model_short = model.split("/")[-1] if "/" in model else model
|
||||
if len(model_short) > 28:
|
||||
model_short = model_short[:25] + "..."
|
||||
|
||||
ctx_str = f" [dim #B8860B]·[/] [dim #B8860B]{_format_context_length(context_length)} context[/]" if context_length else ""
|
||||
left_lines.append(f"[#FFBF00]{model_short}[/]{ctx_str} [dim #B8860B]·[/] [dim #B8860B]Nous Research[/]")
|
||||
left_lines.append(f"[dim #B8860B]{cwd}[/]")
|
||||
ctx_str = f" [dim {_dim}]·[/] [dim {_dim}]{_format_context_length(context_length)} context[/]" if context_length else ""
|
||||
left_lines.append(f"[{_accent}]{model_short}[/]{ctx_str} [dim {_dim}]·[/] [dim {_dim}]Nous Research[/]")
|
||||
left_lines.append(f"[dim {_dim}]{cwd}[/]")
|
||||
|
||||
# Add session ID if provided
|
||||
if session_id:
|
||||
left_lines.append(f"[dim #8B8682]Session: {session_id}[/]")
|
||||
left_lines.append(f"[dim {_session_c}]Session: {session_id}[/]")
|
||||
left_content = "\n".join(left_lines)
|
||||
|
||||
# Build right content: tools list grouped by toolset
|
||||
right_lines = []
|
||||
right_lines.append("[bold #FFBF00]Available Tools[/]")
|
||||
right_lines.append(f"[bold {_accent}]Available Tools[/]")
|
||||
|
||||
# Group tools by toolset (include all possible tools, both enabled and disabled)
|
||||
toolsets_dict = {}
|
||||
@@ -883,7 +922,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dic
|
||||
if name in disabled_tools:
|
||||
colored_names.append(f"[red]{name}[/]")
|
||||
else:
|
||||
colored_names.append(f"[#FFF8DC]{name}[/]")
|
||||
colored_names.append(f"[{_text}]{name}[/]")
|
||||
|
||||
tools_str = ", ".join(colored_names)
|
||||
# Truncate if too long (accounting for markup)
|
||||
@@ -905,18 +944,18 @@ def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dic
|
||||
elif name in disabled_tools:
|
||||
colored_names.append(f"[red]{name}[/]")
|
||||
else:
|
||||
colored_names.append(f"[#FFF8DC]{name}[/]")
|
||||
colored_names.append(f"[{_text}]{name}[/]")
|
||||
tools_str = ", ".join(colored_names)
|
||||
|
||||
right_lines.append(f"[dim #B8860B]{toolset}:[/] {tools_str}")
|
||||
right_lines.append(f"[dim {_dim}]{toolset}:[/] {tools_str}")
|
||||
|
||||
if remaining_toolsets > 0:
|
||||
right_lines.append(f"[dim #B8860B](and {remaining_toolsets} more toolsets...)[/]")
|
||||
right_lines.append(f"[dim {_dim}](and {remaining_toolsets} more toolsets...)[/]")
|
||||
|
||||
right_lines.append("")
|
||||
|
||||
# Add skills section
|
||||
right_lines.append("[bold #FFBF00]Available Skills[/]")
|
||||
right_lines.append(f"[bold {_accent}]Available Skills[/]")
|
||||
skills_by_category = _get_available_skills()
|
||||
total_skills = sum(len(s) for s in skills_by_category.values())
|
||||
|
||||
@@ -932,12 +971,12 @@ def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dic
|
||||
# Truncate if still too long
|
||||
if len(skills_str) > 50:
|
||||
skills_str = skills_str[:47] + "..."
|
||||
right_lines.append(f"[dim #B8860B]{category}:[/] [#FFF8DC]{skills_str}[/]")
|
||||
right_lines.append(f"[dim {_dim}]{category}:[/] [{_text}]{skills_str}[/]")
|
||||
else:
|
||||
right_lines.append("[dim #B8860B]No skills installed[/]")
|
||||
right_lines.append(f"[dim {_dim}]No skills installed[/]")
|
||||
|
||||
right_lines.append("")
|
||||
right_lines.append(f"[dim #B8860B]{len(tools)} tools · {total_skills} skills · /help for commands[/]")
|
||||
right_lines.append(f"[dim {_dim}]{len(tools)} tools · {total_skills} skills · /help for commands[/]")
|
||||
|
||||
right_content = "\n".join(right_lines)
|
||||
|
||||
@@ -947,16 +986,17 @@ def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dic
|
||||
# Wrap in a panel with the title
|
||||
outer_panel = Panel(
|
||||
layout_table,
|
||||
title=f"[bold #FFD700]Hermes Agent {VERSION}[/]",
|
||||
border_style="#CD7F32",
|
||||
title=f"[bold {_title_c}]{_agent_name} {VERSION}[/]",
|
||||
border_style=_border_c,
|
||||
padding=(0, 2),
|
||||
)
|
||||
|
||||
# Print the big HERMES-AGENT logo — skip if terminal is too narrow
|
||||
# Print the big logo — use skin's custom logo if available
|
||||
console.print()
|
||||
term_width = shutil.get_terminal_size().columns
|
||||
if term_width >= 95:
|
||||
console.print(HERMES_AGENT_LOGO)
|
||||
_logo = getattr(_bskin, 'banner_logo', None) or HERMES_AGENT_LOGO
|
||||
console.print(_logo)
|
||||
console.print()
|
||||
|
||||
# Print the panel with caduceus and info
|
||||
@@ -1045,6 +1085,7 @@ class HermesCLI:
|
||||
verbose: bool = False,
|
||||
compact: bool = False,
|
||||
resume: str = None,
|
||||
checkpoints: bool = False,
|
||||
):
|
||||
"""
|
||||
Initialize the Hermes CLI.
|
||||
@@ -1126,6 +1167,13 @@ class HermesCLI:
|
||||
if invalid:
|
||||
self.console.print(f"[bold red]Warning: Unknown toolsets: {', '.join(invalid)}[/]")
|
||||
|
||||
# Filesystem checkpoints: CLI flag > config
|
||||
cp_cfg = CLI_CONFIG.get("checkpoints", {})
|
||||
if isinstance(cp_cfg, bool):
|
||||
cp_cfg = {"enabled": cp_cfg}
|
||||
self.checkpoints_enabled = checkpoints or cp_cfg.get("enabled", False)
|
||||
self.checkpoint_max_snapshots = cp_cfg.get("max_snapshots", 50)
|
||||
|
||||
# Ephemeral system prompt: env var takes precedence, then config
|
||||
self.system_prompt = (
|
||||
os.getenv("HERMES_EPHEMERAL_SYSTEM_PROMPT", "")
|
||||
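The checkpoint settings in the hunk above follow a "CLI flag > config" precedence and accept either a bare boolean or a mapping under `checkpoints`. A minimal standalone sketch of that resolution is below. The helper name `resolve_checkpoint_settings` and the handling of a missing (`None`) section are assumptions for illustration, not part of this diff.

```python
def resolve_checkpoint_settings(cli_flag, cfg_value):
    """Return (enabled, max_snapshots) from a CLI flag and a config value.

    Hypothetical helper: mirrors the precedence shown in the diff, where the
    CLI flag can only turn checkpoints on and the config supplies defaults.
    """
    # A bare boolean in config is shorthand for {"enabled": <bool>}.
    if isinstance(cfg_value, bool):
        cfg_value = {"enabled": cfg_value}
    # Assumption: a missing or None section behaves like an empty mapping.
    cfg = cfg_value or {}
    enabled = bool(cli_flag or cfg.get("enabled", False))
    max_snapshots = cfg.get("max_snapshots", 50)
    return enabled, max_snapshots


if __name__ == "__main__":
    print(resolve_checkpoint_settings(False, True))
    print(resolve_checkpoint_settings(True, {"max_snapshots": 10}))
```

Because the two sources are combined with `or`, a `False` CLI flag cannot switch off checkpoints that the config enables; only the config can do that.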
@@ -1187,6 +1235,7 @@ class HermesCLI:
|
||||
# History file for persistent input recall across sessions
|
||||
self._history_file = Path.home() / ".hermes_history"
|
||||
self._last_invalidate: float = 0.0 # throttle UI repaints
|
||||
self._spinner_text: str = "" # thinking spinner text for TUI
|
||||
|
||||
def _invalidate(self, min_interval: float = 0.25) -> None:
|
||||
"""Throttled UI repaint — prevents terminal blinking on slow/SSH connections."""
|
||||
@@ -1250,6 +1299,11 @@ class HermesCLI:
|
||||
|
||||
return changed
|
||||
|
||||
def _on_thinking(self, text: str) -> None:
|
||||
"""Called by agent when thinking starts/stops. Updates TUI spinner."""
|
||||
self._spinner_text = text or ""
|
||||
self._invalidate()
|
||||
|
||||
def _ensure_runtime_credentials(self) -> bool:
|
||||
"""
|
||||
Ensure runtime credentials are resolved before agent use.
|
||||
@@ -1388,6 +1442,9 @@ class HermesCLI:
|
||||
clarify_callback=self._clarify_callback,
|
||||
honcho_session_key=self.session_id,
|
||||
fallback_model=self._fallback_model,
|
||||
thinking_callback=self._on_thinking,
|
||||
checkpoints_enabled=self.checkpoints_enabled,
|
||||
checkpoint_max_snapshots=self.checkpoint_max_snapshots,
|
||||
)
|
||||
# Apply any pending title now that the session exists in the DB
|
||||
if self._pending_title and self._session_db:
|
||||
@@ -1657,6 +1714,55 @@ class HermesCLI:
|
||||
self._image_counter -= 1
|
||||
return False
|
||||
|
||||
def _handle_rollback_command(self, command: str):
|
||||
"""Handle /rollback — list or restore filesystem checkpoints."""
|
||||
from tools.checkpoint_manager import CheckpointManager, format_checkpoint_list
|
||||
|
||||
if not hasattr(self, 'agent') or not self.agent:
|
||||
print(" No active agent session.")
|
||||
return
|
||||
|
||||
mgr = self.agent._checkpoint_mgr
|
||||
if not mgr.enabled:
|
||||
print(" Checkpoints are not enabled.")
|
||||
print(" Enable with: hermes --checkpoints")
|
||||
print(" Or in config.yaml: checkpoints: { enabled: true }")
|
||||
return
|
||||
|
||||
cwd = os.getenv("TERMINAL_CWD", os.getcwd())
|
||||
parts = command.split(maxsplit=1)
|
||||
arg = parts[1].strip() if len(parts) > 1 else ""
|
||||
|
||||
if not arg:
|
||||
# List checkpoints
|
||||
checkpoints = mgr.list_checkpoints(cwd)
|
||||
print(format_checkpoint_list(checkpoints, cwd))
|
||||
else:
|
||||
# Restore by number or hash
|
||||
checkpoints = mgr.list_checkpoints(cwd)
|
||||
if not checkpoints:
|
||||
print(f" No checkpoints found for {cwd}")
|
||||
return
|
||||
|
||||
target_hash = None
|
||||
try:
|
||||
idx = int(arg) - 1 # 1-indexed for user
|
||||
if 0 <= idx < len(checkpoints):
|
||||
target_hash = checkpoints[idx]["hash"]
|
||||
else:
|
||||
print(f" Invalid checkpoint number. Use 1-{len(checkpoints)}.")
|
||||
return
|
||||
except ValueError:
|
||||
# Try as a git hash
|
||||
target_hash = arg
|
||||
|
||||
result = mgr.restore(cwd, target_hash)
|
||||
if result["success"]:
|
||||
print(f" ✅ Restored to checkpoint {result['restored_to']}: {result['reason']}")
|
||||
print(" A pre-rollback snapshot was saved automatically.")
|
||||
else:
|
||||
print(f" ❌ {result['error']}")
|
||||
|
||||
def _handle_paste_command(self):
|
||||
"""Handle /paste — explicitly check clipboard for an image.
|
||||
|
||||
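The `/rollback` handler above treats its argument as a 1-indexed checkpoint number first and falls back to a git hash when the argument is not an integer. That resolution step might be isolated as follows. The function name `resolve_checkpoint_target` and the `(hash, error)` return shape are hypothetical, chosen only to make the logic testable on its own.

```python
def resolve_checkpoint_target(arg, checkpoints):
    """Map a /rollback argument to a checkpoint hash.

    Returns (target_hash, error_message); exactly one of the two is None.
    `checkpoints` is assumed to be a list of dicts with a "hash" key,
    as in the handler shown in the diff.
    """
    try:
        idx = int(arg) - 1  # users count from 1
    except ValueError:
        # Not a number: pass it through as a (possibly abbreviated) hash.
        return arg, None
    if 0 <= idx < len(checkpoints):
        return checkpoints[idx]["hash"], None
    return None, f"Invalid checkpoint number. Use 1-{len(checkpoints)}."


if __name__ == "__main__":
    cps = [{"hash": "a1b2c3d"}, {"hash": "e4f5a6b"}]
    print(resolve_checkpoint_target("2", cps))
    print(resolve_checkpoint_target("e4f5", cps))
    print(resolve_checkpoint_target("9", cps))
```

One consequence of trying `int()` first is that an abbreviated hash made only of digits is interpreted as a checkpoint number rather than as a hash.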
@@ -2666,6 +2772,10 @@ class HermesCLI:
|
||||
self._handle_paste_command()
|
||||
elif cmd_lower == "/reload-mcp":
|
||||
self._reload_mcp()
|
||||
elif cmd_lower.startswith("/rollback"):
|
||||
self._handle_rollback_command(cmd_original)
|
||||
elif cmd_lower.startswith("/skin"):
|
||||
self._handle_skin_command(cmd_original)
|
||||
else:
|
||||
# Check for skill slash commands (/gif-search, /axolotl, etc.)
|
||||
base_cmd = cmd_lower.split()[0]
|
||||
@@ -2685,6 +2795,43 @@ class HermesCLI:
|
||||
|
||||
return True
|
||||
|
||||
def _handle_skin_command(self, cmd: str):
|
||||
"""Handle /skin [name] — show or change the display skin."""
|
||||
try:
|
||||
from hermes_cli.skin_engine import list_skins, set_active_skin, get_active_skin_name
|
||||
except ImportError:
|
||||
print("Skin engine not available.")
|
||||
return
|
||||
|
||||
parts = cmd.strip().split(maxsplit=1)
|
||||
if len(parts) < 2 or not parts[1].strip():
|
||||
# Show current skin and list available
|
||||
current = get_active_skin_name()
|
||||
skins = list_skins()
|
||||
print(f"\n Current skin: {current}")
|
||||
print(" Available skins:")
|
||||
for s in skins:
|
||||
marker = " ●" if s["name"] == current else " "
|
||||
source = f" ({s['source']})" if s["source"] == "user" else ""
|
||||
print(f" {marker} {s['name']}{source} — {s['description']}")
|
||||
print("\n Usage: /skin <name>")
|
||||
print(" Custom skins: drop a YAML file in ~/.hermes/skins/\n")
|
||||
return
|
||||
|
||||
new_skin = parts[1].strip().lower()
|
||||
available = {s["name"] for s in list_skins()}
|
||||
if new_skin not in available:
|
||||
print(f" Unknown skin: {new_skin}")
|
||||
print(f" Available: {', '.join(sorted(available))}")
|
||||
return
|
||||
|
||||
set_active_skin(new_skin)
|
||||
if save_config_value("display.skin", new_skin):
|
||||
print(f" Skin set to: {new_skin} (saved)")
|
||||
else:
|
||||
print(f" Skin set to: {new_skin}")
|
||||
print(" Note: banner colors will update on next session start.")
|
||||
|
||||
def _toggle_verbose(self):
|
||||
"""Cycle tool progress mode: off → new → all → verbose → off."""
|
||||
cycle = ["off", "new", "all", "verbose"]
|
||||
@@ -2933,8 +3080,16 @@ class HermesCLI:
|
||||
# Trigger prompt_toolkit repaint from this (non-main) thread
|
||||
self._invalidate()
|
||||
|
||||
# Poll in 1-second ticks so the countdown refreshes in the UI.
|
||||
# Each tick triggers an invalidate() to repaint the hint line.
|
||||
# Poll for the user's response. The countdown in the hint line
|
||||
# updates on each invalidate — but frequent repaints cause visible
|
||||
# flicker in some terminals (Kitty, ghostty). We only refresh the
|
||||
# countdown every 5 s; selection changes (↑/↓) trigger instant
|
||||
# repaints via the key bindings.
|
||||
_last_countdown_refresh = _time.monotonic()
|
||||
while True:
|
||||
try:
|
||||
result = response_queue.get(timeout=1)
|
||||
@@ -2944,8 +3099,14 @@ class HermesCLI:
|
||||
remaining = self._clarify_deadline - _time.monotonic()
|
||||
if remaining <= 0:
|
||||
break
|
||||
# Repaint so the countdown updates
|
||||
self._invalidate()
|
||||
# Only repaint every 5 s for the countdown — avoids flicker
|
||||
now = _time.monotonic()
|
||||
if now - _last_countdown_refresh >= 5.0:
|
||||
_last_countdown_refresh = now
|
||||
self._invalidate()
|
||||
|
||||
# Timed out — tear down the UI and let the agent decide
|
||||
self._clarify_state = None
|
||||
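The countdown change above keeps polling the queue once per second but repaints only when at least 5 s have passed since the last repaint. A small sketch of that throttle in isolation is shown here. The class name `RepaintThrottle` and the injectable `clock` parameter are assumptions added so the behaviour can be checked without waiting in real time.

```python
import time


class RepaintThrottle:
    """Report when at least `interval` seconds have passed since the last hit."""

    def __init__(self, interval=5.0, clock=time.monotonic):
        self._interval = interval
        self._clock = clock          # injectable so tests can fake time
        self._last = clock()

    def due(self):
        """Return True (and restart the interval) if a repaint is due."""
        now = self._clock()
        if now - self._last >= self._interval:
            self._last = now
            return True
        return False


if __name__ == "__main__":
    fake_now = [0.0]
    throttle = RepaintThrottle(5.0, clock=lambda: fake_now[0])
    for tick in range(1, 12):        # simulate eleven 1-second polls
        fake_now[0] = float(tick)
        if throttle.due():
            print(f"repaint at t={tick}s")
```

Using a monotonic clock, as the diff does, keeps the interval correct even if the wall clock is adjusted while the prompt is waiting.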
@@ -3025,6 +3186,9 @@ class HermesCLI:
|
||||
|
||||
self._invalidate()
|
||||
|
||||
# Same throttled countdown as _clarify_callback — repaint only
|
||||
# every 5 s to avoid flicker in Kitty / ghostty / etc.
|
||||
_last_countdown_refresh = _time.monotonic()
|
||||
while True:
|
||||
try:
|
||||
result = response_queue.get(timeout=1)
|
||||
@@ -3036,11 +3200,16 @@ class HermesCLI:
|
||||
remaining = self._approval_deadline - _time.monotonic()
|
||||
if remaining <= 0:
|
||||
break
|
||||
self._invalidate()
|
||||
now = _time.monotonic()
|
||||
if now - _last_countdown_refresh >= 5.0:
|
||||
_last_countdown_refresh = now
|
||||
self._invalidate()
|
||||
|
||||
self._approval_state = None
|
||||
self._approval_deadline = 0
|
||||
self._invalidate()
|
||||
return "deny"
|
||||
|
||||
def chat(self, message, images: list = None) -> Optional[str]:
|
||||
"""
|
||||
Send a message to the agent and get a response.
|
||||
@@ -3079,8 +3248,7 @@ class HermesCLI:
|
||||
# Add user message to history
|
||||
self.conversation_history.append({"role": "user", "content": message})
|
||||
|
||||
w = shutil.get_terminal_size().columns
|
||||
_cprint(f"{_GOLD}{'─' * w}{_RST}")
|
||||
_cprint(f"{_GOLD}{'─' * 40}{_RST}")
|
||||
print(flush=True)
|
||||
|
||||
try:
|
||||
@@ -3155,15 +3323,25 @@ class HermesCLI:
|
||||
response = response + "\n\n---\n_[Interrupted - processing new message]_"
|
||||
|
||||
if response:
|
||||
w = shutil.get_terminal_size().columns
|
||||
label = " ⚕ Hermes "
|
||||
fill = w - 2 - len(label) # 2 for ╭ and ╮
|
||||
top = f"{_GOLD}╭─{label}{'─' * max(fill - 1, 0)}╮{_RST}"
|
||||
bot = f"{_GOLD}╰{'─' * (w - 2)}╯{_RST}"
|
||||
# Use a Rich Panel for the response box — adapts to terminal
|
||||
# width at render time instead of hard-coding border length.
|
||||
try:
|
||||
from hermes_cli.skin_engine import get_active_skin
|
||||
_skin = get_active_skin()
|
||||
label = _skin.get_branding("response_label", "⚕ Hermes")
|
||||
_resp_color = _skin.get_color("response_border", "#CD7F32")
|
||||
except Exception:
|
||||
label = "⚕ Hermes"
|
||||
_resp_color = "#CD7F32"
|
||||
|
||||
# Render box + response as a single _cprint call so
|
||||
# nothing can interleave between the box borders.
|
||||
_cprint(f"\n{top}\n{response}\n\n{bot}")
|
||||
_chat_console = ChatConsole()
|
||||
_chat_console.print(Panel(
|
||||
response,
|
||||
title=f"[bold]{label}[/bold]",
|
||||
title_align="left",
|
||||
border_style=_resp_color,
|
||||
padding=(1, 2),
|
||||
))
|
||||
|
||||
# Play terminal bell when agent finishes (if enabled).
|
||||
# Works over SSH — the bell propagates to the user's terminal.
|
||||
@@ -3228,7 +3406,15 @@ class HermesCLI:
|
||||
if self._preload_resumed_session():
|
||||
self._display_resumed_history()
|
||||
|
||||
self.console.print("[#FFF8DC]Welcome to Hermes Agent! Type your message or /help for commands.[/]")
|
||||
try:
|
||||
from hermes_cli.skin_engine import get_active_skin
|
||||
_welcome_skin = get_active_skin()
|
||||
_welcome_text = _welcome_skin.get_branding("welcome", "Welcome to Hermes Agent! Type your message or /help for commands.")
|
||||
_welcome_color = _welcome_skin.get_color("banner_text", "#FFF8DC")
|
||||
except Exception:
|
||||
_welcome_text = "Welcome to Hermes Agent! Type your message or /help for commands."
|
||||
_welcome_color = "#FFF8DC"
|
||||
self.console.print(f"[{_welcome_color}]{_welcome_text}[/]")
|
||||
self.console.print()
|
||||
|
||||
# State for async operation
|
||||
@@ -3616,6 +3802,8 @@ class HermesCLI:
|
||||
return "type password (hidden), Enter to skip"
|
||||
if cli_ref._approval_state:
|
||||
return ""
|
||||
if cli_ref._clarify_freetext:
|
||||
return "type your answer here and press Enter"
|
||||
if cli_ref._clarify_state:
|
||||
return ""
|
||||
if cli_ref._agent_running:
|
||||
@@ -3666,6 +3854,20 @@ class HermesCLI:
|
||||
# right up against the top rule of the input area
|
||||
return 1 if cli_ref._agent_running else 0
|
||||
|
||||
def get_spinner_text():
|
||||
txt = cli_ref._spinner_text
|
||||
if not txt:
|
||||
return []
|
||||
return [('class:hint', f' {txt}')]
|
||||
|
||||
def get_spinner_height():
|
||||
return 1 if cli_ref._spinner_text else 0
|
||||
|
||||
spinner_widget = Window(
|
||||
content=FormattedTextControl(get_spinner_text),
|
||||
height=get_spinner_height,
|
||||
)
|
||||
|
||||
spacer = Window(
|
||||
content=FormattedTextControl(get_hint_text),
|
||||
height=get_hint_height,
|
||||
@@ -3673,6 +3875,32 @@ class HermesCLI:
|
||||
|
||||
# --- Clarify tool: dynamic display widget for questions + choices ---
|
||||
|
||||
def _panel_box_width(title: str, content_lines: list[str], min_width: int = 46, max_width: int = 76) -> int:
|
||||
"""Choose a stable panel width wide enough for the title and content."""
|
||||
term_cols = shutil.get_terminal_size((100, 20)).columns
|
||||
longest = max([len(title)] + [len(line) for line in content_lines] + [min_width - 4])
|
||||
inner = min(max(longest + 4, min_width - 2), max_width - 2, max(24, term_cols - 6))
|
||||
return inner + 2 # account for the single leading/trailing spaces inside borders
|
||||
|
||||
def _wrap_panel_text(text: str, width: int, subsequent_indent: str = "") -> list[str]:
|
||||
wrapped = textwrap.wrap(
|
||||
text,
|
||||
width=max(8, width),
|
||||
break_long_words=False,
|
||||
break_on_hyphens=False,
|
||||
subsequent_indent=subsequent_indent,
|
||||
)
|
||||
return wrapped or [""]
|
||||
|
||||
def _append_panel_line(lines, border_style: str, content_style: str, text: str, box_width: int) -> None:
|
||||
inner_width = max(0, box_width - 2)
|
||||
lines.append((border_style, "│ "))
|
||||
lines.append((content_style, text.ljust(inner_width)))
|
||||
lines.append((border_style, " │\n"))
|
||||
|
||||
def _append_blank_panel_line(lines, border_style: str, box_width: int) -> None:
|
||||
lines.append((border_style, "│" + (" " * box_width) + "│\n"))
|
||||
|
||||
def _get_clarify_display():
|
||||
"""Build styled text for the clarify question/choices panel."""
|
||||
state = cli_ref._clarify_state
|
||||
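The panel helpers above all share one invariant: every row, including the title row and the bottom border, has exactly `box_width` characters between the two border glyphs. The sketch below reproduces that arithmetic with plain strings instead of prompt_toolkit style tuples. `render_box` is a hypothetical name, and the sketch assumes the title fits within `box_width - 3` characters and that no single word is longer than the inner width.

```python
import textwrap


def render_box(title, body, box_width=40):
    """Return the rows of a bordered panel as plain strings."""
    inner = max(0, box_width - 2)  # text area between the padding spaces
    rows = [
        # "╭─ " + title + " " + filler: 3 + len(title) + 1 + filler columns
        "╭─ " + title + " " + "─" * max(0, box_width - len(title) - 3) + "╮",
        "│" + " " * box_width + "│",
    ]
    wrapped = textwrap.wrap(
        body,
        width=max(8, inner),
        break_long_words=False,
        break_on_hyphens=False,
    ) or [""]
    for line in wrapped:
        # One padding space on each side of the left-justified text.
        rows.append("│ " + line.ljust(inner) + " │")
    rows.append("│" + " " * box_width + "│")
    rows.append("╰" + "─" * box_width + "╯")
    return rows


if __name__ == "__main__":
    for row in render_box("Hermes needs your input",
                          "Which file should be edited first?"):
        print(row)
```

Note that `len()` counts code points rather than terminal cells, so titles containing wide glyphs such as emoji can still render one or two columns off in some terminals.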
@@ -3682,43 +3910,62 @@ class HermesCLI:
|
||||
question = state["question"]
|
||||
choices = state.get("choices") or []
|
||||
selected = state.get("selected", 0)
|
||||
preview_lines = _wrap_panel_text(question, 60)
|
||||
for i, choice in enumerate(choices):
|
||||
prefix = "❯ " if i == selected and not cli_ref._clarify_freetext else " "
|
||||
preview_lines.extend(_wrap_panel_text(f"{prefix}{choice}", 60, subsequent_indent=" "))
|
||||
other_label = (
|
||||
"❯ Other (type below)" if cli_ref._clarify_freetext
|
||||
else "❯ Other (type your answer)" if selected == len(choices)
|
||||
else " Other (type your answer)"
|
||||
)
|
||||
preview_lines.extend(_wrap_panel_text(other_label, 60, subsequent_indent=" "))
|
||||
box_width = _panel_box_width("Hermes needs your input", preview_lines)
|
||||
inner_text_width = max(8, box_width - 2)
|
||||
|
||||
lines = []
|
||||
# Box top border
|
||||
lines.append(('class:clarify-border', '╭─ '))
|
||||
lines.append(('class:clarify-title', 'Hermes needs your input'))
|
||||
lines.append(('class:clarify-border', ' ─────────────────────────────╮\n'))
|
||||
lines.append(('class:clarify-border', '│\n'))
|
||||
lines.append(('class:clarify-border', ' ' + ('─' * max(0, box_width - len("Hermes needs your input") - 3)) + '╮\n'))
|
||||
_append_blank_panel_line(lines, 'class:clarify-border', box_width)
|
||||
|
||||
# Question text
|
||||
lines.append(('class:clarify-border', '│ '))
|
||||
lines.append(('class:clarify-question', question))
|
||||
lines.append(('', '\n'))
|
||||
lines.append(('class:clarify-border', '│\n'))
|
||||
for wrapped in _wrap_panel_text(question, inner_text_width):
|
||||
_append_panel_line(lines, 'class:clarify-border', 'class:clarify-question', wrapped, box_width)
|
||||
_append_blank_panel_line(lines, 'class:clarify-border', box_width)
|
||||
|
||||
if cli_ref._clarify_freetext and not choices:
|
||||
guidance = "Type your answer in the prompt below, then press Enter."
|
||||
for wrapped in _wrap_panel_text(guidance, inner_text_width):
|
||||
_append_panel_line(lines, 'class:clarify-border', 'class:clarify-choice', wrapped, box_width)
|
||||
_append_blank_panel_line(lines, 'class:clarify-border', box_width)
|
||||
|
||||
if choices:
|
||||
# Multiple-choice mode: show selectable options
|
||||
for i, choice in enumerate(choices):
|
||||
lines.append(('class:clarify-border', '│ '))
|
||||
if i == selected and not cli_ref._clarify_freetext:
|
||||
lines.append(('class:clarify-selected', f'❯ {choice}'))
|
||||
else:
|
||||
lines.append(('class:clarify-choice', f' {choice}'))
|
||||
lines.append(('', '\n'))
|
||||
style = 'class:clarify-selected' if i == selected and not cli_ref._clarify_freetext else 'class:clarify-choice'
|
||||
prefix = '❯ ' if i == selected and not cli_ref._clarify_freetext else ' '
|
||||
wrapped_lines = _wrap_panel_text(f"{prefix}{choice}", inner_text_width, subsequent_indent=" ")
|
||||
for wrapped in wrapped_lines:
|
||||
_append_panel_line(lines, 'class:clarify-border', style, wrapped, box_width)
|
||||
|
||||
# "Other" option (5th line, only shown when choices exist)
|
||||
other_idx = len(choices)
|
||||
lines.append(('class:clarify-border', '│ '))
|
||||
if selected == other_idx and not cli_ref._clarify_freetext:
|
||||
lines.append(('class:clarify-selected', '❯ Other (type your answer)'))
|
||||
other_style = 'class:clarify-selected'
|
||||
other_label = '❯ Other (type your answer)'
|
||||
elif cli_ref._clarify_freetext:
|
||||
lines.append(('class:clarify-active-other', '❯ Other (type below)'))
|
||||
other_style = 'class:clarify-active-other'
|
||||
other_label = '❯ Other (type below)'
|
||||
else:
|
||||
lines.append(('class:clarify-choice', ' Other (type your answer)'))
|
||||
lines.append(('', '\n'))
|
||||
other_style = 'class:clarify-choice'
|
||||
other_label = ' Other (type your answer)'
|
||||
for wrapped in _wrap_panel_text(other_label, inner_text_width, subsequent_indent=" "):
|
||||
_append_panel_line(lines, 'class:clarify-border', other_style, wrapped, box_width)
|
||||
|
||||
lines.append(('class:clarify-border', '│\n'))
|
||||
lines.append(('class:clarify-border', '╰──────────────────────────────────────────────────╯\n'))
|
||||
_append_blank_panel_line(lines, 'class:clarify-border', box_width)
|
||||
lines.append(('class:clarify-border', '╰' + ('─' * box_width) + '╯\n'))
|
||||
return lines
|
||||
|
||||
clarify_widget = ConditionalContainer(
|
||||
@@ -3735,16 +3982,18 @@ class HermesCLI:
|
||||
state = cli_ref._sudo_state
|
||||
if not state:
|
||||
return []
|
||||
title = '🔐 Sudo Password Required'
|
||||
body = 'Enter password below (hidden), or press Enter to skip'
|
||||
box_width = _panel_box_width(title, [body])
|
||||
inner = max(0, box_width - 2)
|
||||
lines = []
|
||||
lines.append(('class:sudo-border', '╭─ '))
|
||||
lines.append(('class:sudo-title', '🔐 Sudo Password Required'))
|
||||
lines.append(('class:sudo-border', ' ──────────────────────────╮\n'))
|
||||
lines.append(('class:sudo-border', '│\n'))
|
||||
lines.append(('class:sudo-border', '│ '))
|
||||
lines.append(('class:sudo-text', 'Enter password below (hidden), or press Enter to skip'))
|
||||
lines.append(('', '\n'))
|
||||
lines.append(('class:sudo-border', '│\n'))
|
||||
lines.append(('class:sudo-border', '╰──────────────────────────────────────────────────╯\n'))
|
||||
lines.append(('class:sudo-title', title))
|
||||
lines.append(('class:sudo-border', ' ' + ('─' * max(0, box_width - len(title) - 3)) + '╮\n'))
|
||||
_append_blank_panel_line(lines, 'class:sudo-border', box_width)
|
||||
_append_panel_line(lines, 'class:sudo-border', 'class:sudo-text', body, box_width)
|
||||
_append_blank_panel_line(lines, 'class:sudo-border', box_width)
|
||||
lines.append(('class:sudo-border', '╰' + ('─' * box_width) + '╯\n'))
|
||||
return lines
|
||||
|
||||
sudo_widget = ConditionalContainer(
|
||||
@@ -3773,29 +4022,32 @@ class HermesCLI:
|
||||
"always": "Add to permanent allowlist",
|
||||
"deny": "Deny",
|
||||
}
|
||||
preview_lines = _wrap_panel_text(description, 60)
|
||||
preview_lines.extend(_wrap_panel_text(cmd_display, 60))
|
||||
for i, choice in enumerate(choices):
|
||||
prefix = '❯ ' if i == selected else ' '
|
||||
preview_lines.extend(_wrap_panel_text(f"{prefix}{choice_labels.get(choice, choice)}", 60, subsequent_indent=" "))
|
||||
box_width = _panel_box_width("⚠️ Dangerous Command", preview_lines)
|
||||
inner_text_width = max(8, box_width - 2)
|
||||
|
||||
lines = []
|
||||
lines.append(('class:approval-border', '╭─ '))
|
||||
lines.append(('class:approval-title', '⚠️ Dangerous Command'))
|
||||
lines.append(('class:approval-border', ' ───────────────────────────────╮\n'))
|
||||
lines.append(('class:approval-border', '│\n'))
|
||||
lines.append(('class:approval-border', '│ '))
|
||||
lines.append(('class:approval-desc', description))
|
||||
lines.append(('', '\n'))
|
||||
lines.append(('class:approval-border', '│ '))
|
||||
lines.append(('class:approval-cmd', cmd_display))
|
||||
lines.append(('', '\n'))
|
||||
lines.append(('class:approval-border', '│\n'))
|
||||
lines.append(('class:approval-border', ' ' + ('─' * max(0, box_width - len("⚠️ Dangerous Command") - 3)) + '╮\n'))
|
||||
_append_blank_panel_line(lines, 'class:approval-border', box_width)
|
||||
for wrapped in _wrap_panel_text(description, inner_text_width):
|
||||
_append_panel_line(lines, 'class:approval-border', 'class:approval-desc', wrapped, box_width)
|
||||
for wrapped in _wrap_panel_text(cmd_display, inner_text_width):
|
||||
_append_panel_line(lines, 'class:approval-border', 'class:approval-cmd', wrapped, box_width)
|
||||
_append_blank_panel_line(lines, 'class:approval-border', box_width)
|
||||
for i, choice in enumerate(choices):
|
||||
lines.append(('class:approval-border', '│ '))
|
||||
label = choice_labels.get(choice, choice)
|
||||
if i == selected:
|
||||
lines.append(('class:approval-selected', f'❯ {label}'))
|
||||
else:
|
||||
lines.append(('class:approval-choice', f' {label}'))
|
||||
lines.append(('', '\n'))
|
||||
lines.append(('class:approval-border', '│\n'))
|
||||
lines.append(('class:approval-border', '╰──────────────────────────────────────────────────────╯\n'))
|
||||
style = 'class:approval-selected' if i == selected else 'class:approval-choice'
|
||||
prefix = '❯ ' if i == selected else ' '
|
||||
for wrapped in _wrap_panel_text(f"{prefix}{label}", inner_text_width, subsequent_indent=" "):
|
||||
_append_panel_line(lines, 'class:approval-border', style, wrapped, box_width)
|
||||
_append_blank_panel_line(lines, 'class:approval-border', box_width)
|
||||
lines.append(('class:approval-border', '╰' + ('─' * box_width) + '╯\n'))
|
||||
return lines
|
||||
|
||||
approval_widget = ConditionalContainer(
|
||||
@@ -3848,6 +4100,7 @@ class HermesCLI:
|
||||
sudo_widget,
|
||||
approval_widget,
|
||||
clarify_widget,
|
||||
spinner_widget,
|
||||
spacer,
|
||||
input_rule_top,
|
||||
image_bar,
|
||||
@@ -3902,6 +4155,7 @@ class HermesCLI:
|
||||
style=style,
|
||||
full_screen=False,
|
||||
mouse_support=False,
|
||||
**({'cursor': _STEADY_CURSOR} if _STEADY_CURSOR is not None else {}),
|
||||
)
|
||||
self._app = app # Store reference for clarify_callback
|
||||
|
||||
@@ -3970,6 +4224,7 @@ class HermesCLI:
|
||||
self.chat(user_input, images=submit_images or None)
|
||||
finally:
|
||||
self._agent_running = False
|
||||
self._spinner_text = ""
|
||||
app.invalidate() # Refresh status line
|
||||
|
||||
except Exception as e:
|
||||
@@ -4030,6 +4285,7 @@ def main(
|
||||
resume: str = None,
|
||||
worktree: bool = False,
|
||||
w: bool = False,
|
||||
checkpoints: bool = False,
|
||||
):
|
||||
"""
|
||||
Hermes Agent CLI - Interactive AI Assistant
|
||||
@@ -4134,6 +4390,7 @@ def main(
|
||||
verbose=verbose,
|
||||
compact=compact,
|
||||
resume=resume,
|
||||
checkpoints=checkpoints,
|
||||
)
|
||||
|
||||
# Inject worktree context into agent's system prompt
|
||||
|
||||
@@ -26,7 +26,7 @@ except ImportError:
|
||||
# Configuration
|
||||
# =============================================================================
|
||||
|
||||
HERMES_DIR = Path.home() / ".hermes"
|
||||
HERMES_DIR = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
|
||||
CRON_DIR = HERMES_DIR / "cron"
|
||||
JOBS_FILE = CRON_DIR / "jobs.json"
|
||||
OUTPUT_DIR = CRON_DIR / "output"
|
||||
|
||||
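The one-line change above lets a `HERMES_HOME` environment variable relocate the whole `.hermes` directory, with the cron paths derived from it. A minimal sketch of the lookup follows, assuming a hypothetical helper `hermes_dir` that accepts an explicit mapping so it can be exercised without touching the real environment.

```python
import os
from pathlib import Path


def hermes_dir(environ=None):
    """Return the Hermes home directory, honouring HERMES_HOME if set."""
    env = os.environ if environ is None else environ
    # Falls back to ~/.hermes when the variable is absent.
    return Path(env.get("HERMES_HOME", Path.home() / ".hermes"))


if __name__ == "__main__":
    base = hermes_dir({"HERMES_HOME": "/tmp/hermes-test"})
    print(base / "cron" / "jobs.json")
    print(hermes_dir({}))
```

As in the diff, a variable that is set but empty is not treated as "unset": `Path("")` resolves to the current directory, so callers may want to guard against that case.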
@@ -0,0 +1,89 @@
|
||||
# ============================================================================
|
||||
# Hermes Agent — Example Skin Template
|
||||
# ============================================================================
|
||||
#
|
||||
# Copy this file to ~/.hermes/skins/<name>.yaml to create a custom skin.
|
||||
# All fields except name are optional; missing values inherit from the default skin.
|
||||
# Activate with: /skin <name> or display.skin: <name> in config.yaml
|
||||
#
|
||||
# See hermes_cli/skin_engine.py for the full schema reference.
|
||||
# ============================================================================
|
||||
|
||||
# Required: unique skin name (used in /skin command and config)
|
||||
name: example
|
||||
description: An example custom skin — copy and modify this template
|
||||
|
||||
# ── Colors ──────────────────────────────────────────────────────────────────
|
||||
# Hex color values for Rich markup. These control the CLI's visual palette.
|
||||
colors:
|
||||
# Banner panel (the startup welcome box)
|
||||
banner_border: "#CD7F32" # Panel border
|
||||
banner_title: "#FFD700" # Panel title text
|
||||
banner_accent: "#FFBF00" # Section headers (Available Tools, Skills, etc.)
|
||||
banner_dim: "#B8860B" # Dim/muted text (separators, model info)
|
||||
banner_text: "#FFF8DC" # Body text (tool names, skill names)
|
||||
|
||||
# UI elements
|
||||
ui_accent: "#FFBF00" # General accent color
|
||||
ui_label: "#4dd0e1" # Labels
|
||||
ui_ok: "#4caf50" # Success indicators
|
||||
ui_error: "#ef5350" # Error indicators
|
||||
ui_warn: "#ffa726" # Warning indicators
|
||||
|
||||
# Input area
|
||||
prompt: "#FFF8DC" # Prompt text color
|
||||
input_rule: "#CD7F32" # Horizontal rule around input
|
||||
|
||||
# Response box
|
||||
response_border: "#FFD700" # Response box border (hex color)
|
||||
|
||||
# Session display
|
||||
session_label: "#DAA520" # Session label
|
||||
session_border: "#8B8682" # Session ID dim color
|
||||
|
||||
# ── Spinner ─────────────────────────────────────────────────────────────────
|
||||
# Customize the animated spinner shown during API calls and tool execution.
|
||||
spinner:
|
||||
# Faces shown while waiting for the API response
|
||||
waiting_faces:
|
||||
- "(。◕‿◕。)"
|
||||
- "(◕‿◕✿)"
|
||||
- "٩(◕‿◕。)۶"
|
||||
|
||||
# Faces shown during extended thinking/reasoning
|
||||
thinking_faces:
|
||||
- "(。•́︿•̀。)"
|
||||
- "(◔_◔)"
|
||||
- "(¬‿¬)"
|
||||
|
||||
# Verbs used in spinner messages (e.g., "pondering your request...")
|
||||
thinking_verbs:
|
||||
- "pondering"
|
||||
- "contemplating"
|
||||
- "musing"
|
||||
- "ruminating"
|
||||
|
||||
# Optional: left/right decorations around the spinner
|
||||
# Each entry is a [left, right] pair. Omit entirely for no wings.
|
||||
# wings:
|
||||
# - ["⟪⚔", "⚔⟫"]
|
||||
# - ["⟪▲", "▲⟫"]
|
||||
|
||||
# ── Branding ────────────────────────────────────────────────────────────────
|
||||
# Text strings used throughout the CLI interface.
|
||||
branding:
|
||||
agent_name: "Hermes Agent" # Banner title, about display
|
||||
welcome: "Welcome! Type your message or /help for commands."
|
||||
goodbye: "Goodbye! ⚕" # Exit message
|
||||
response_label: " ⚕ Hermes " # Response box header label
|
||||
prompt_symbol: "❯ " # Input prompt symbol
|
||||
help_header: "(^_^)? Available Commands" # /help header text
|
||||
|
||||
# ── Tool Output ─────────────────────────────────────────────────────────────
|
||||
# Character used as the prefix for tool output lines.
|
||||
# Default is "┊" (thin dotted vertical line). Some alternatives:
|
||||
# "╎" (light triple dash vertical)
|
||||
# "▏" (left one-eighth block)
|
||||
# "│" (box drawing light vertical)
|
||||
# "┃" (box drawing heavy vertical)
|
||||
tool_prefix: "┊"
|
||||
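The template above says that missing values inherit from the default skin. One way such inheritance could work is a recursive merge of the user's mapping over a defaults mapping, sketched below with plain dictionaries. This is an illustration only: `merge_skin` and `DEFAULT_SKIN` are hypothetical, and the actual logic in `hermes_cli/skin_engine.py` is not shown in this diff and may differ.

```python
# Hypothetical defaults, using a few values that appear in the template.
DEFAULT_SKIN = {
    "colors": {"banner_border": "#CD7F32", "banner_title": "#FFD700"},
    "branding": {"agent_name": "Hermes Agent"},
    "tool_prefix": "┊",
}


def merge_skin(defaults, overrides):
    """Return a new dict where `overrides` wins and gaps fall back to `defaults`."""
    merged = {}
    for key, default in defaults.items():
        value = overrides.get(key)
        if isinstance(default, dict):
            # Merge nested sections (colors, branding, ...) key by key.
            nested = value if isinstance(value, dict) else {}
            merged[key] = merge_skin(default, nested)
        else:
            merged[key] = default if value is None else value
    # Keep keys that only the user skin defines (e.g. name, description).
    for key, value in overrides.items():
        merged.setdefault(key, value)
    return merged


if __name__ == "__main__":
    user_skin = {"name": "example", "colors": {"banner_title": "#00FF00"}}
    print(merge_skin(DEFAULT_SKIN, user_skin))
```

With this shape, a user skin that overrides a single color still gets every other color, the branding strings, and the tool prefix from the defaults.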
@@ -29,6 +29,10 @@ env:
|
||||
wandb_name: "terminal-bench-2"
|
||||
ensure_scores_are_not_same: false
|
||||
data_dir_to_save_evals: "environments/benchmarks/evals/terminal-bench-2"
|
||||
# CRITICAL: Limit concurrent Modal sandbox creations to avoid deadlocks.
|
||||
# Modal's blocking calls (App.lookup, etc.) deadlock when too many sandboxes
|
||||
# are created simultaneously inside thread pool workers via asyncio.run().
|
||||
max_concurrent_tasks: 8
|
||||
|
||||
openai:
|
||||
base_url: "https://openrouter.ai/api/v1"
|
||||
|
||||
@@ -118,6 +118,15 @@ class TerminalBench2EvalConfig(HermesAgentEnvConfig):
|
||||
"Tasks exceeding this are scored as FAIL. Default 30 minutes.",
|
||||
)
|
||||
|
||||
# --- Concurrency control ---
|
||||
max_concurrent_tasks: int = Field(
|
||||
default=8,
|
||||
description="Maximum number of tasks to run concurrently. "
|
||||
"Limits concurrent Modal sandbox creations to avoid async/threading deadlocks. "
|
||||
"Modal has internal limits and creating too many sandboxes simultaneously "
|
||||
"causes blocking calls to deadlock inside the thread pool.",
|
||||
)
|
||||
|
||||
|
||||
# Tasks that cannot run properly on Modal and are excluded from scoring.
|
||||
MODAL_INCOMPATIBLE_TASKS = {
|
||||
@@ -430,7 +439,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
|
||||
}
|
||||
|
||||
# --- 2. Register per-task Modal image override ---
|
||||
register_task_env_overrides(task_id, {"modal_image": modal_image})
|
||||
register_task_env_overrides(task_id, {"modal_image": modal_image, "cwd": "/app"})
|
||||
logger.info(
|
||||
"Task %s: registered image override for task_id %s",
|
||||
task_name, task_id[:8],
|
||||
@@ -733,12 +742,23 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
|
||||
print(f" Tool thread pool: {self.config.tool_pool_size}")
|
||||
print(f" Terminal timeout: {self.config.terminal_timeout}s/cmd")
|
||||
print(f" Terminal lifetime: {self.config.terminal_lifetime}s (auto: task_timeout + 120)")
|
||||
print(f" Max concurrent tasks: {self.config.max_concurrent_tasks}")
|
||||
print(f"{'='*60}\n")
|
||||
|
||||
# Semaphore to limit concurrent Modal sandbox creations.
|
||||
# Without this, all 86 tasks fire simultaneously, each creating a Modal
|
||||
# sandbox via asyncio.run() inside a thread pool worker. Modal's blocking
|
||||
# calls (App.lookup, etc.) deadlock when too many are created at once.
|
||||
semaphore = asyncio.Semaphore(self.config.max_concurrent_tasks)
|
||||
|
||||
async def _eval_with_semaphore(item):
|
||||
async with semaphore:
|
||||
return await self._eval_with_timeout(item)
|
||||
|
||||
# Fire all tasks with wall-clock timeout, track live accuracy on the bar
|
||||
total_tasks = len(self.all_eval_items)
|
||||
eval_tasks = [
|
||||
asyncio.ensure_future(self._eval_with_timeout(item))
|
||||
asyncio.ensure_future(_eval_with_semaphore(item))
|
||||
for item in self.all_eval_items
|
||||
]
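The semaphore wrapper in this hunk still schedules every task up front but lets only a bounded number run at once. A minimal, self-contained sketch of that pattern follows. It assumes a hypothetical `fake_eval` coroutine in place of `_eval_with_timeout` and an arbitrary limit of 3 instead of the configured `max_concurrent_tasks`; the in-flight counter exists only to make the cap observable.

```python
import asyncio


async def run_bounded(items, limit):
    """Run fake_eval over items with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def fake_eval(item):
        # Hypothetical stand-in for the real per-task evaluation.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return item * 2

    async def eval_with_semaphore(item):
        # Tasks beyond the limit wait here until a slot frees up.
        async with semaphore:
            return await fake_eval(item)

    tasks = [asyncio.ensure_future(eval_with_semaphore(i)) for i in items]
    results = await asyncio.gather(*tasks)
    return results, peak


results, peak = asyncio.run(run_bounded(range(10), limit=3))
```

All ten tasks are created immediately, so progress tracking over the task list keeps working; only the body guarded by `async with semaphore` is throttled.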
|
||||
|
||||
|
||||
@@ -356,10 +356,19 @@ class WebResearchEnv(HermesAgentBaseEnv):
|
||||
efficiency_weight * efficiency — penalizes wasteful tool usage
|
||||
+ diversity_bonus — source diversity (≥2 distinct domains)
|
||||
"""
|
||||
final_response: str = result.final_response or ""
|
||||
tools_used: list[str] = [
|
||||
tc.tool_name for tc in (result.tool_calls or [])
|
||||
] if hasattr(result, "tool_calls") and result.tool_calls else []
|
||||
# Extract final response from messages (last assistant message with content)
|
||||
final_response = ""
|
||||
tools_used: list[str] = []
|
||||
for msg in reversed(result.messages):
|
||||
if msg.get("role") == "assistant" and msg.get("content") and not final_response:
|
||||
final_response = msg["content"]
|
||||
# Collect tool names from tool call messages
|
||||
if msg.get("role") == "assistant" and msg.get("tool_calls"):
|
||||
for tc in msg["tool_calls"]:
|
||||
fn = tc.get("function", {}) if isinstance(tc, dict) else {}
|
||||
name = fn.get("name", "")
|
||||
if name:
|
||||
tools_used.append(name)
|
||||
tool_call_count: int = result.turns_used or len(tools_used)
|
||||
|
||||
cfg = self.config
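The new extraction logic in this hunk walks the message list backwards, takes the last assistant message with content as the final response, and collects tool names from assistant `tool_calls` entries. The sketch below isolates that logic as a standalone function, assuming OpenAI-style message dicts; the function name and the sample conversation are illustrative and not part of the change.

```python
from typing import Any, Dict, List, Tuple


def extract_response_and_tools(
    messages: List[Dict[str, Any]],
) -> Tuple[str, List[str]]:
    """Return (final assistant text, tool names) from a chat transcript."""
    final_response = ""
    tools_used: List[str] = []
    for msg in reversed(messages):
        if msg.get("role") != "assistant":
            continue
        # The first assistant message with content seen while walking
        # backwards is the last one chronologically.
        if msg.get("content") and not final_response:
            final_response = msg["content"]
        for tc in msg.get("tool_calls") or []:
            fn = tc.get("function", {}) if isinstance(tc, dict) else {}
            name = fn.get("name", "")
            if name:
                tools_used.append(name)
    return final_response, tools_used


# Illustrative transcript (hypothetical data).
messages = [
    {"role": "user", "content": "Who wrote Dune?"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"function": {"name": "web_search", "arguments": "{}"}},
    ]},
    {"role": "tool", "content": "search results"},
    {"role": "assistant", "content": "Frank Herbert."},
]
final, tools = extract_response_and_tools(messages)
```

Because the loop runs over `reversed(messages)`, tool names come out in reverse chronological order; that does not matter for counting or set-style checks, but it would for anything order-sensitive.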
|
||||
@@ -416,8 +425,16 @@ class WebResearchEnv(HermesAgentBaseEnv):
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
async def evaluate(self, *args, **kwargs) -> None:
|
||||
"""Run evaluation on the held-out split using the agent loop."""
|
||||
"""Run evaluation on the held-out split using the full agent loop with tools.
|
||||
|
||||
Each eval item runs through the same agent loop as training —
|
||||
the model can use web_search, web_extract, etc. to research answers.
|
||||
This measures actual agentic research capability, not just knowledge.
|
||||
"""
|
||||
import time
|
||||
import uuid
|
||||
from environments.agent_loop import HermesAgentLoop
|
||||
from environments.tool_context import ToolContext
|
||||
|
||||
items = self._eval_items
|
||||
if not items:
|
||||
@@ -427,43 +444,88 @@ class WebResearchEnv(HermesAgentBaseEnv):
|
||||
eval_size = min(self.config.eval_size, len(items))
|
||||
eval_items = items[:eval_size]
|
||||
|
||||
logger.info(f"Running eval on {len(eval_items)} questions...")
|
||||
logger.info(f"Running eval on {len(eval_items)} questions (with agent loop + tools)...")
|
||||
start_time = time.time()
|
||||
samples = []
|
||||
|
||||
for item in eval_items:
|
||||
# Resolve tools once for all eval items
|
||||
tools, valid_names = self._resolve_tools_for_group()
|
||||
|
||||
for i, item in enumerate(eval_items):
|
||||
task_id = str(uuid.uuid4())
|
||||
logger.info(f"Eval [{i+1}/{len(eval_items)}]: {item['question'][:80]}...")
|
||||
|
||||
try:
|
||||
# Use the base env's agent loop for eval (same as training)
|
||||
prompt = self.format_prompt(item)
|
||||
completion = await self.server.chat_completion(
|
||||
messages=[
|
||||
{"role": "system", "content": self.config.system_prompt or ""},
|
||||
{"role": "user", "content": prompt},
|
||||
],
|
||||
n=1,
|
||||
# Build messages
|
||||
messages: List[Dict[str, Any]] = []
|
||||
if self.config.system_prompt:
|
||||
messages.append({"role": "system", "content": self.config.system_prompt})
|
||||
messages.append({"role": "user", "content": self.format_prompt(item)})
|
||||
|
||||
# Run the full agent loop with tools
|
||||
agent = HermesAgentLoop(
|
||||
server=self.server,
|
||||
tool_schemas=tools,
|
||||
valid_tool_names=valid_names,
|
||||
max_turns=self.config.max_agent_turns,
|
||||
task_id=task_id,
|
||||
temperature=0.0, # Deterministic for eval
|
||||
max_tokens=self.config.max_token_length,
|
||||
temperature=0.0,
|
||||
split="eval",
|
||||
extra_body=self.config.extra_body,
|
||||
)
|
||||
result = await agent.run(messages)
|
||||
|
||||
response_content = (
|
||||
completion.choices[0].message.content if completion.choices else ""
|
||||
)
|
||||
# Extract final response and tool usage from messages
|
||||
final_response = ""
|
||||
tool_call_count = 0
|
||||
for msg in reversed(result.messages):
|
||||
if msg.get("role") == "assistant" and msg.get("content") and not final_response:
|
||||
final_response = msg["content"]
|
||||
if msg.get("role") == "assistant" and msg.get("tool_calls"):
|
||||
tool_call_count += len(msg["tool_calls"])
|
||||
|
||||
# Score the response
|
||||
correctness = await self._llm_judge(
|
||||
question=item["question"],
|
||||
expected=item["answer"],
|
||||
model_answer=response_content,
|
||||
# Compute reward (includes LLM judge for correctness)
|
||||
# Record the correctness buffer length so we can extract the
|
||||
# correctness score without calling judge twice, and avoid
|
||||
# polluting training metric buffers with eval data.
|
||||
buf_len = len(self._correctness_buffer)
|
||||
ctx = ToolContext(task_id)
|
||||
try:
|
||||
reward = await self.compute_reward(item, result, ctx)
|
||||
finally:
|
||||
ctx.cleanup()
|
||||
|
||||
# Extract correctness from the buffer (compute_reward appended it)
|
||||
# then remove eval entries from training buffers
|
||||
correctness = (
|
||||
self._correctness_buffer[buf_len]
|
||||
if len(self._correctness_buffer) > buf_len
|
||||
else 0.0
|
||||
)
|
||||
# Roll back buffers to avoid polluting training metrics
|
||||
for buf in (
|
||||
self._reward_buffer, self._correctness_buffer,
|
||||
self._tool_usage_buffer, self._efficiency_buffer,
|
||||
self._diversity_buffer,
|
||||
):
|
||||
if len(buf) > buf_len:
|
||||
buf.pop()
|
||||
|
||||
samples.append({
|
||||
"prompt": item["question"],
|
||||
"response": response_content,
|
||||
"response": final_response[:500],
|
||||
"expected": item["answer"],
|
||||
"correctness": correctness,
|
||||
"reward": reward,
|
||||
"tool_calls": tool_call_count,
|
||||
"turns": result.turns_used,
|
||||
})
|
||||
|
||||
logger.info(
|
||||
f" → correctness={correctness:.2f}, reward={reward:.3f}, "
|
||||
f"tools={tool_call_count}, turns={result.turns_used}"
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Eval error on item: {e}")
|
||||
samples.append({
|
||||
@@ -471,20 +533,33 @@ class WebResearchEnv(HermesAgentBaseEnv):
|
||||
"response": f"ERROR: {e}",
|
||||
"expected": item["answer"],
|
||||
"correctness": 0.0,
|
||||
"reward": 0.0,
|
||||
"tool_calls": 0,
|
||||
"turns": 0,
|
||||
})
|
||||
|
||||
end_time = time.time()
|
||||
|
||||
# Compute metrics
|
||||
# Compute aggregate metrics
|
||||
correctness_scores = [s["correctness"] for s in samples]
|
||||
rewards = [s["reward"] for s in samples]
|
||||
tool_counts = [s["tool_calls"] for s in samples]
|
||||
n = len(samples)
|
||||
|
||||
eval_metrics = {
|
||||
"eval/mean_correctness": (
|
||||
sum(correctness_scores) / len(correctness_scores)
|
||||
if correctness_scores else 0.0
|
||||
),
|
||||
"eval/n_items": len(samples),
|
||||
"eval/mean_correctness": sum(correctness_scores) / n if n else 0.0,
|
||||
"eval/mean_reward": sum(rewards) / n if n else 0.0,
|
||||
"eval/mean_tool_calls": sum(tool_counts) / n if n else 0.0,
|
||||
"eval/tool_usage_rate": sum(1 for t in tool_counts if t > 0) / n if n else 0.0,
|
||||
"eval/n_items": n,
|
||||
}
|
||||
|
||||
logger.info(
|
||||
f"Eval complete — correctness={eval_metrics['eval/mean_correctness']:.3f}, "
|
||||
f"reward={eval_metrics['eval/mean_reward']:.3f}, "
|
||||
f"tool_usage={eval_metrics['eval/tool_usage_rate']:.0%}"
|
||||
)
|
||||
|
||||
await self.evaluate_log(
|
||||
metrics=eval_metrics,
|
||||
samples=samples,
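The aggregate metrics in this hunk all divide by the same sample count and fall back to 0.0 when there are no samples. A small sketch of that computation as a pure function, assuming each sample dict carries the `correctness`, `reward`, and `tool_calls` keys used above; the function name and sample values are hypothetical.

```python
from typing import Dict, List, Union


def aggregate_eval_metrics(samples: List[dict]) -> Dict[str, Union[int, float]]:
    """Compute mean correctness, reward, tool calls, and tool usage rate."""
    n = len(samples)
    correctness = [s["correctness"] for s in samples]
    rewards = [s["reward"] for s in samples]
    tool_counts = [s["tool_calls"] for s in samples]
    return {
        "eval/mean_correctness": sum(correctness) / n if n else 0.0,
        "eval/mean_reward": sum(rewards) / n if n else 0.0,
        "eval/mean_tool_calls": sum(tool_counts) / n if n else 0.0,
        # Fraction of samples in which at least one tool call was made.
        "eval/tool_usage_rate": sum(1 for t in tool_counts if t > 0) / n if n else 0.0,
        "eval/n_items": n,
    }


samples = [
    {"correctness": 1.0, "reward": 0.5, "tool_calls": 2},
    {"correctness": 0.0, "reward": 0.0, "tool_calls": 0},
]
metrics = aggregate_eval_metrics(samples)
empty = aggregate_eval_metrics([])
```

Errored items are appended with zeros in the hunk above, so they pull every mean down rather than being excluded from the denominator.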
|
||||
|
||||
@@ -270,7 +270,7 @@ def load_gateway_config() -> GatewayConfig:
|
||||
gateway_config_path = Path.home() / ".hermes" / "gateway.json"
|
||||
if gateway_config_path.exists():
|
||||
try:
|
||||
with open(gateway_config_path, "r") as f:
|
||||
with open(gateway_config_path, "r", encoding="utf-8") as f:
|
||||
data = json.load(f)
|
||||
config = GatewayConfig.from_dict(data)
|
||||
except Exception as e:
|
||||
@@ -283,7 +283,7 @@ def load_gateway_config() -> GatewayConfig:
|
||||
import yaml
|
||||
config_yaml_path = Path.home() / ".hermes" / "config.yaml"
|
||||
if config_yaml_path.exists():
|
||||
with open(config_yaml_path) as f:
|
||||
with open(config_yaml_path, encoding="utf-8") as f:
|
||||
yaml_cfg = yaml.safe_load(f) or {}
|
||||
sr = yaml_cfg.get("session_reset")
|
||||
if sr and isinstance(sr, dict):
|
||||
@@ -441,5 +441,5 @@ def save_gateway_config(config: GatewayConfig) -> None:
|
||||
gateway_config_path = Path.home() / ".hermes" / "gateway.json"
|
||||
gateway_config_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
with open(gateway_config_path, "w") as f:
|
||||
with open(gateway_config_path, "w", encoding="utf-8") as f:
|
||||
json.dump(config.to_dict(), f, indent=2)
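The `encoding="utf-8"` additions matter because `open()` in text mode otherwise uses a locale-dependent default, which is often not UTF-8 on Windows. The sketch below round-trips a config containing non-ASCII text through a temporary file. The helper names are hypothetical, and `ensure_ascii=False` is an assumption added here so that the file actually contains non-ASCII bytes; the change itself keeps `json.dump`'s default.

```python
import json
import tempfile
from pathlib import Path


def save_config(path: Path, data: dict) -> None:
    """Write JSON with an explicit encoding, independent of the locale."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False is an illustration-only choice (see lead-in).
        json.dump(data, f, indent=2, ensure_ascii=False)


def load_config(path: Path) -> dict:
    """Read JSON back using the same explicit encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "gateway.json"
    save_config(cfg_path, {"greeting": "héllo ✓"})
    loaded = load_config(cfg_path)
```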
|
||||
|
||||
@@ -111,6 +111,7 @@ def _append_to_jsonl(session_id: str, message: dict) -> None:
|
||||
|
||||
def _append_to_sqlite(session_id: str, message: dict) -> None:
|
||||
"""Append a message to the SQLite session database."""
|
||||
db = None
|
||||
try:
|
||||
from hermes_state import SessionDB
|
||||
db = SessionDB()
|
||||
@@ -121,3 +122,6 @@ def _append_to_sqlite(session_id: str, message: dict) -> None:
|
||||
)
|
||||
except Exception as e:
|
||||
logger.debug("Mirror SQLite write failed: %s", e)
|
||||
finally:
|
||||
if db is not None:
|
||||
db.close()
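Initialising `db = None` before the `try` and closing in `finally` guarantees the handle is released whether the write succeeds or fails, and avoids a `NameError` if construction itself raises. A sketch of the same shape using the standard-library `sqlite3` module in place of `SessionDB`, whose API is not shown in this diff; the table layout and function name are hypothetical.

```python
import os
import sqlite3
import tempfile


def append_message(db_path: str, session_id: str, content: str) -> bool:
    """Insert one row; always close the connection, even on failure."""
    conn = None
    try:
        conn = sqlite3.connect(db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS messages (session_id TEXT, content TEXT)"
        )
        conn.execute("INSERT INTO messages VALUES (?, ?)", (session_id, content))
        conn.commit()
        return True
    except sqlite3.Error:
        return False
    finally:
        # conn stays None if connect() raised, so guard before closing.
        if conn is not None:
            conn.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sessions.db")
    ok = append_message(path, "s1", "hello")
    check = sqlite3.connect(path)
    rows = check.execute("SELECT session_id, content FROM messages").fetchall()
    check.close()
```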
|
||||
|
||||
@@ -413,11 +413,12 @@ class BasePlatformAdapter(ABC):
|
||||
"""
|
||||
return SendResult(success=False, error="Not supported")
|
||||
|
||||
async def send_typing(self, chat_id: str) -> None:
|
||||
async def send_typing(self, chat_id: str, metadata=None) -> None:
|
||||
"""
|
||||
Send a typing indicator.
|
||||
|
||||
Override in subclasses if the platform supports it.
|
||||
metadata: optional dict with platform-specific context (e.g. thread_id for Slack).
|
||||
"""
|
||||
pass
|
||||
|
||||
@@ -620,7 +621,7 @@ class BasePlatformAdapter(ABC):
|
||||
|
||||
return media, cleaned
|
||||
|
||||
async def _keep_typing(self, chat_id: str, interval: float = 2.0) -> None:
|
||||
async def _keep_typing(self, chat_id: str, interval: float = 2.0, metadata=None) -> None:
|
||||
"""
|
||||
Continuously send typing indicator until cancelled.
|
||||
|
||||
@@ -629,7 +630,7 @@ class BasePlatformAdapter(ABC):
|
||||
"""
|
||||
try:
|
||||
while True:
|
||||
await self.send_typing(chat_id)
|
||||
await self.send_typing(chat_id, metadata=metadata)
|
||||
await asyncio.sleep(interval)
|
||||
except asyncio.CancelledError:
|
||||
pass # Normal cancellation when handler completes
|
||||
@@ -687,7 +688,8 @@ class BasePlatformAdapter(ABC):
|
||||
self._active_sessions[session_key] = interrupt_event
|
||||
|
||||
# Start continuous typing indicator (refreshes every 2 seconds)
|
||||
typing_task = asyncio.create_task(self._keep_typing(event.source.chat_id))
|
||||
_thread_metadata = {"thread_id": event.source.thread_id} if event.source.thread_id else None
|
||||
typing_task = asyncio.create_task(self._keep_typing(event.source.chat_id, metadata=_thread_metadata))
|
||||
|
||||
try:
|
||||
# Call the handler (this can take a while with tool calls)
|
||||
@@ -711,7 +713,8 @@ class BasePlatformAdapter(ABC):
|
||||
result = await self.send(
|
||||
chat_id=event.source.chat_id,
|
||||
content=text_content,
|
||||
reply_to=event.message_id
|
||||
reply_to=event.message_id,
|
||||
metadata=_thread_metadata,
|
||||
)
|
||||
|
||||
# Log send failures (don't raise - user already saw tool progress)
|
||||
@@ -721,7 +724,8 @@ class BasePlatformAdapter(ABC):
|
||||
fallback_result = await self.send(
|
||||
chat_id=event.source.chat_id,
|
||||
content=f"(Response formatting failed, plain text:)\n\n{text_content[:3500]}",
|
||||
reply_to=event.message_id
|
||||
reply_to=event.message_id,
|
||||
metadata=_thread_metadata,
|
||||
)
|
||||
if not fallback_result.success:
|
||||
print(f"[{self.name}] Fallback send also failed: {fallback_result.error}")
|
||||
@@ -743,12 +747,14 @@ class BasePlatformAdapter(ABC):
|
||||
chat_id=event.source.chat_id,
|
||||
animation_url=image_url,
|
||||
caption=alt_text if alt_text else None,
|
||||
metadata=_thread_metadata,
|
||||
)
|
||||
else:
|
||||
img_result = await self.send_image(
|
||||
chat_id=event.source.chat_id,
|
||||
image_url=image_url,
|
||||
caption=alt_text if alt_text else None,
|
||||
metadata=_thread_metadata,
|
||||
)
|
||||
if not img_result.success:
|
||||
logger.error("[%s] Failed to send image: %s", self.name, img_result.error)
|
||||
@@ -769,21 +775,25 @@ class BasePlatformAdapter(ABC):
|
||||
media_result = await self.send_voice(
|
||||
chat_id=event.source.chat_id,
|
||||
audio_path=media_path,
|
||||
metadata=_thread_metadata,
|
||||
)
|
||||
elif ext in _VIDEO_EXTS:
|
||||
media_result = await self.send_video(
|
||||
chat_id=event.source.chat_id,
|
||||
video_path=media_path,
|
||||
metadata=_thread_metadata,
|
||||
)
|
||||
elif ext in _IMAGE_EXTS:
|
||||
media_result = await self.send_image_file(
|
||||
chat_id=event.source.chat_id,
|
||||
image_path=media_path,
|
||||
metadata=_thread_metadata,
|
||||
)
|
||||
else:
|
||||
media_result = await self.send_document(
|
||||
chat_id=event.source.chat_id,
|
||||
file_path=media_path,
|
||||
metadata=_thread_metadata,
|
||||
)
|
||||
|
||||
if not media_result.success:
|
||||
|
||||
@@ -359,7 +359,7 @@ class DiscordAdapter(BasePlatformAdapter):
|
||||
print(f"[{self.name}] Failed to send image attachment, falling back to URL: {e}")
|
||||
return await super().send_image(chat_id, image_url, caption, reply_to)
|
||||
|
||||
async def send_typing(self, chat_id: str) -> None:
|
||||
async def send_typing(self, chat_id: str, metadata=None) -> None:
|
||||
"""Send typing indicator."""
|
||||
if self._client:
|
||||
try:
|
||||
|
||||
@@ -419,7 +419,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):
|
||||
except Exception as e:
|
||||
return SendResult(success=False, error=str(e))
|
||||
|
||||
async def send_typing(self, chat_id: str) -> None:
|
||||
async def send_typing(self, chat_id: str, metadata=None) -> None:
|
||||
"""No typing indicator for Home Assistant."""
|
||||
pass
|
||||
|
||||
|
||||
@@ -104,6 +104,20 @@ def _is_audio_ext(ext: str) -> bool:
|
||||
return ext.lower() in (".mp3", ".wav", ".ogg", ".m4a", ".aac")
|
||||
|
||||
|
||||
_EXT_TO_MIME = {
|
||||
".jpg": "image/jpeg", ".jpeg": "image/jpeg", ".png": "image/png",
|
||||
".gif": "image/gif", ".webp": "image/webp",
|
||||
".ogg": "audio/ogg", ".mp3": "audio/mpeg", ".wav": "audio/wav",
|
||||
".m4a": "audio/mp4", ".aac": "audio/aac",
|
||||
".mp4": "video/mp4", ".pdf": "application/pdf", ".zip": "application/zip",
|
||||
}
|
||||
|
||||
|
||||
def _ext_to_mime(ext: str) -> str:
|
||||
"""Map file extension to MIME type."""
|
||||
return _EXT_TO_MIME.get(ext.lower(), "application/octet-stream")
|
||||
|
||||
|
||||
def _render_mentions(text: str, mentions: list) -> str:
|
||||
"""Replace Signal mention placeholders (\\uFFFC) with readable @identifiers.
|
||||
|
||||
@@ -404,9 +418,8 @@ class SignalAdapter(BasePlatformAdapter):
|
||||
|
||||
# Process attachments
|
||||
attachments_data = data_message.get("attachments", [])
|
||||
image_paths = []
|
||||
audio_path = None
|
||||
document_paths = []
|
||||
media_urls = []
|
||||
media_types = []
|
||||
|
||||
if attachments_data and not getattr(self, "ignore_attachments", False):
|
||||
for att in attachments_data:
|
||||
@@ -420,12 +433,10 @@ class SignalAdapter(BasePlatformAdapter):
|
||||
try:
|
||||
cached_path, ext = await self._fetch_attachment(att_id)
|
||||
if cached_path:
|
||||
if _is_image_ext(ext):
|
||||
image_paths.append(cached_path)
|
||||
elif _is_audio_ext(ext):
|
||||
audio_path = cached_path
|
||||
else:
|
||||
document_paths.append(cached_path)
|
||||
# Use contentType from Signal if available, else map from extension
|
||||
content_type = att.get("contentType") or _ext_to_mime(ext)
|
||||
media_urls.append(cached_path)
|
||||
media_types.append(content_type)
|
||||
except Exception:
|
||||
logger.exception("Signal: failed to fetch attachment %s", att_id)
|
||||
|
||||
@@ -440,12 +451,13 @@ class SignalAdapter(BasePlatformAdapter):
|
||||
chat_id_alt=group_id if is_group else None,
|
||||
)
|
||||
|
||||
# Determine message type
|
||||
# Determine message type from media
|
||||
msg_type = MessageType.TEXT
|
||||
if audio_path:
|
||||
msg_type = MessageType.VOICE
|
||||
elif image_paths:
|
||||
msg_type = MessageType.IMAGE
|
||||
if media_types:
|
||||
if any(mt.startswith("audio/") for mt in media_types):
|
||||
msg_type = MessageType.VOICE
|
||||
elif any(mt.startswith("image/") for mt in media_types):
|
||||
msg_type = MessageType.IMAGE
|
||||
|
||||
# Parse timestamp from envelope data (milliseconds since epoch)
|
||||
ts_ms = envelope_data.get("timestamp", 0)
|
||||
@@ -462,9 +474,8 @@ class SignalAdapter(BasePlatformAdapter):
|
||||
source=source,
|
||||
text=text or "",
|
||||
message_type=msg_type,
|
||||
image_paths=image_paths,
|
||||
audio_path=audio_path,
|
||||
document_paths=document_paths,
|
||||
media_urls=media_urls,
|
||||
media_types=media_types,
|
||||
timestamp=timestamp,
|
||||
)
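With this change the Signal adapter no longer sorts attachments into separate image, audio, and document lists; it records a MIME type per attachment and derives the message type from those. The sketch below shows the mapping and the precedence rule (audio wins over image, anything else stays text). It uses plain strings in place of the `MessageType` enum and a trimmed copy of the extension table, both of which are simplifications for illustration.

```python
from typing import List

# Trimmed copy of the extension table, for illustration only.
_EXT_TO_MIME = {
    ".jpg": "image/jpeg",
    ".png": "image/png",
    ".ogg": "audio/ogg",
    ".pdf": "application/pdf",
}


def ext_to_mime(ext: str) -> str:
    """Map a file extension to a MIME type, case-insensitively."""
    return _EXT_TO_MIME.get(ext.lower(), "application/octet-stream")


def classify(media_types: List[str]) -> str:
    """Pick a message type: any audio -> voice, else any image -> image."""
    if any(mt.startswith("audio/") for mt in media_types):
        return "voice"
    if any(mt.startswith("image/") for mt in media_types):
        return "image"
    return "text"


mixed = classify([ext_to_mime(".PNG"), ext_to_mime(".ogg")])
docs_only = classify([ext_to_mime(".pdf")])
unknown = ext_to_mime(".xyz")
```

A message with only document attachments keeps the text type under this rule; the attachments still travel in `media_urls` and `media_types`.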
|
||||
|
||||
@@ -546,16 +557,16 @@ class SignalAdapter(BasePlatformAdapter):
|
||||
async def send(
|
||||
self,
|
||||
chat_id: str,
|
||||
text: str,
|
||||
reply_to_message_id: Optional[str] = None,
|
||||
**kwargs,
|
||||
content: str,
|
||||
reply_to: Optional[str] = None,
|
||||
metadata: Optional[Dict[str, Any]] = None,
|
||||
) -> SendResult:
|
||||
"""Send a text message."""
|
||||
await self._stop_typing_indicator(chat_id)
|
||||
|
||||
params: Dict[str, Any] = {
|
||||
"account": self.account,
|
||||
"message": text,
|
||||
"message": content,
|
||||
}
|
||||
|
||||
if chat_id.startswith("group:"):
|
||||
@@ -569,7 +580,7 @@ class SignalAdapter(BasePlatformAdapter):
|
||||
return SendResult(success=True)
|
||||
return SendResult(success=False, error="RPC send failed")
|
||||
|
||||
async def send_typing(self, chat_id: str) -> None:
|
||||
async def send_typing(self, chat_id: str, metadata=None) -> None:
|
||||
"""Send a typing indicator."""
|
||||
params: Dict[str, Any] = {
|
||||
"account": self.account,
|
||||
|
||||
@@ -185,7 +185,7 @@ class SlackAdapter(BasePlatformAdapter):
|
||||
except Exception as e:
|
||||
return SendResult(success=False, error=str(e))
|
||||
|
||||
async def send_typing(self, chat_id: str) -> None:
|
||||
async def send_typing(self, chat_id: str, metadata=None) -> None:
|
||||
"""Slack doesn't have a direct typing indicator API for bots."""
|
||||
pass
|
||||
|
||||
|
||||
@@ -86,6 +86,9 @@ def _strip_mdv2(text: str) -> str:
|
||||
cleaned = re.sub(r'\\([_*\[\]()~`>#\+\-=|{}.!\\])', r'\1', text)
|
||||
# Remove MarkdownV2 bold markers that format_message converted from **bold**
|
||||
cleaned = re.sub(r'\*([^*]+)\*', r'\1', cleaned)
|
||||
# Remove MarkdownV2 italic markers that format_message converted from *italic*
|
||||
# Use lookarounds ((?<!\w) and (?!\w)) so underscores inside snake_case names like my_variable_name are left alone
|
||||
cleaned = re.sub(r'(?<!\w)_([^_]+)_(?!\w)', r'\1', cleaned)
|
||||
return cleaned
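The italic-stripping pattern added here only matches an underscore pair that is not flanked by word characters, so identifiers with internal underscores pass through unchanged. A short sketch isolating that one substitution; the function name and sample strings are illustrative.

```python
import re


def strip_italic_markers(text: str) -> str:
    """Remove _italic_ markers without touching snake_case identifiers."""
    # (?<!\w) and (?!\w) require that the opening underscore is not
    # preceded, and the closing one not followed, by a word character.
    return re.sub(r'(?<!\w)_([^_]+)_(?!\w)', r'\1', text)


plain = strip_italic_markers("_note_: set my_variable_name first")
untouched = strip_italic_markers("a_b_c")
```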
|
||||
|
||||
|
||||
@@ -286,6 +289,7 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
audio_path: str,
|
||||
caption: Optional[str] = None,
|
||||
reply_to: Optional[str] = None,
|
||||
metadata: Optional[Dict[str, Any]] = None,
|
||||
) -> SendResult:
|
||||
"""Send audio as a native Telegram voice message or audio file."""
|
||||
if not self._bot:
|
||||
@@ -299,19 +303,23 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
with open(audio_path, "rb") as audio_file:
|
||||
# .ogg / .opus files -> send as voice (round playable bubble)
|
||||
if audio_path.endswith(".ogg") or audio_path.endswith(".opus"):
|
||||
_voice_thread = metadata.get("thread_id") if metadata else None
|
||||
msg = await self._bot.send_voice(
|
||||
chat_id=int(chat_id),
|
||||
voice=audio_file,
|
||||
caption=caption[:1024] if caption else None,
|
||||
reply_to_message_id=int(reply_to) if reply_to else None,
|
||||
message_thread_id=int(_voice_thread) if _voice_thread else None,
|
||||
)
|
||||
else:
|
||||
# .mp3 and others -> send as audio file
|
||||
_audio_thread = metadata.get("thread_id") if metadata else None
|
||||
msg = await self._bot.send_audio(
|
||||
chat_id=int(chat_id),
|
||||
audio=audio_file,
|
||||
caption=caption[:1024] if caption else None,
|
||||
reply_to_message_id=int(reply_to) if reply_to else None,
|
||||
message_thread_id=int(_audio_thread) if _audio_thread else None,
|
||||
)
|
||||
return SendResult(success=True, message_id=str(msg.message_id))
|
||||
except Exception as e:
|
||||
@@ -352,6 +360,7 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
image_url: str,
|
||||
caption: Optional[str] = None,
|
||||
reply_to: Optional[str] = None,
|
||||
metadata: Optional[Dict[str, Any]] = None,
|
||||
) -> SendResult:
|
||||
"""Send an image natively as a Telegram photo.
|
||||
|
||||
@@ -363,11 +372,13 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
|
||||
try:
|
||||
# Telegram can send photos directly from URLs (up to ~5MB)
|
||||
_photo_thread = metadata.get("thread_id") if metadata else None
|
||||
msg = await self._bot.send_photo(
|
||||
chat_id=int(chat_id),
|
||||
photo=image_url,
|
||||
caption=caption[:1024] if caption else None, # Telegram caption limit
|
||||
reply_to_message_id=int(reply_to) if reply_to else None,
|
||||
message_thread_id=int(_photo_thread) if _photo_thread else None,
|
||||
)
|
||||
return SendResult(success=True, message_id=str(msg.message_id))
|
||||
except Exception as e:
|
||||
@@ -398,17 +409,20 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
animation_url: str,
|
||||
caption: Optional[str] = None,
|
||||
reply_to: Optional[str] = None,
|
||||
metadata: Optional[Dict[str, Any]] = None,
|
||||
) -> SendResult:
|
||||
"""Send an animated GIF natively as a Telegram animation (auto-plays inline)."""
|
||||
if not self._bot:
|
||||
return SendResult(success=False, error="Not connected")
|
||||
|
||||
try:
|
||||
_anim_thread = metadata.get("thread_id") if metadata else None
|
||||
msg = await self._bot.send_animation(
|
||||
chat_id=int(chat_id),
|
||||
animation=animation_url,
|
||||
caption=caption[:1024] if caption else None,
|
||||
reply_to_message_id=int(reply_to) if reply_to else None,
|
||||
message_thread_id=int(_anim_thread) if _anim_thread else None,
|
||||
)
|
||||
return SendResult(success=True, message_id=str(msg.message_id))
|
||||
except Exception as e:
|
||||
@@ -416,13 +430,15 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
# Fallback: try as a regular photo
|
||||
return await self.send_image(chat_id, animation_url, caption, reply_to)
|
||||
|
||||
async def send_typing(self, chat_id: str) -> None:
|
||||
async def send_typing(self, chat_id: str, metadata: Optional[Dict[str, Any]] = None) -> None:
|
||||
"""Send typing indicator."""
|
||||
if self._bot:
|
||||
try:
|
||||
_typing_thread = metadata.get("thread_id") if metadata else None
|
||||
await self._bot.send_chat_action(
|
||||
chat_id=int(chat_id),
|
||||
action="typing"
|
||||
action="typing",
|
||||
message_thread_id=int(_typing_thread) if _typing_thread else None,
|
||||
)
|
||||
except Exception:
|
||||
pass # Ignore typing indicator failures
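Each Telegram send method in this diff repeats the same two steps: read `thread_id` from the optional `metadata` dict, then pass it on as an integer `message_thread_id` or as `None`. The sketch below captures that conversion in one helper. The helper name is hypothetical and is not part of the change, which inlines the logic at each call site.

```python
from typing import Any, Dict, Optional


def thread_kwargs(metadata: Optional[Dict[str, Any]]) -> Dict[str, Optional[int]]:
    """Build the message_thread_id keyword argument from optional metadata."""
    thread_id = metadata.get("thread_id") if metadata else None
    # A missing, empty, or falsy thread_id means "no thread".
    return {"message_thread_id": int(thread_id) if thread_id else None}


no_meta = thread_kwargs(None)
with_thread = thread_kwargs({"thread_id": "42"})
empty_meta = thread_kwargs({})
```

A call site could then spread the result into the bot call, for example `send_photo(..., **thread_kwargs(metadata))`, assuming the installed bot library accepts `message_thread_id` on that method.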
|
||||
|
||||
@@ -493,7 +493,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
|
||||
file_name or os.path.basename(file_path),
|
||||
)
|
||||
|
||||
async def send_typing(self, chat_id: str) -> None:
|
||||
async def send_typing(self, chat_id: str, metadata=None) -> None:
|
||||
"""Send typing indicator via bridge."""
|
||||
if not self._running:
|
||||
return
|
||||
|
||||
@@ -48,7 +48,7 @@ _config_path = _hermes_home / 'config.yaml'
|
||||
if _config_path.exists():
|
||||
try:
|
||||
import yaml as _yaml
|
||||
with open(_config_path) as _f:
|
||||
with open(_config_path, encoding="utf-8") as _f:
|
||||
_cfg = _yaml.safe_load(_f) or {}
|
||||
# Top-level simple values (fallback only — don't override .env)
|
||||
for _key, _val in _cfg.items():
|
||||
@@ -316,7 +316,7 @@ class GatewayRunner:
|
||||
import yaml as _y
|
||||
cfg_path = _hermes_home / "config.yaml"
|
||||
if cfg_path.exists():
|
||||
with open(cfg_path) as _f:
|
||||
with open(cfg_path, encoding="utf-8") as _f:
|
||||
cfg = _y.safe_load(_f) or {}
|
||||
file_path = cfg.get("prefill_messages_file", "")
|
||||
except Exception:
|
||||
@@ -354,7 +354,7 @@ class GatewayRunner:
|
||||
import yaml as _y
|
||||
cfg_path = _hermes_home / "config.yaml"
|
||||
if cfg_path.exists():
|
||||
with open(cfg_path) as _f:
|
||||
with open(cfg_path, encoding="utf-8") as _f:
|
||||
cfg = _y.safe_load(_f) or {}
|
||||
return (cfg.get("agent", {}).get("system_prompt", "") or "").strip()
|
||||
except Exception:
|
||||
@@ -375,7 +375,7 @@ class GatewayRunner:
|
||||
import yaml as _y
|
||||
cfg_path = _hermes_home / "config.yaml"
|
||||
if cfg_path.exists():
|
||||
with open(cfg_path) as _f:
|
||||
with open(cfg_path, encoding="utf-8") as _f:
|
||||
cfg = _y.safe_load(_f) or {}
|
||||
effort = str(cfg.get("agent", {}).get("reasoning_effort", "") or "").strip()
|
||||
except Exception:
|
||||
@@ -391,6 +391,41 @@ class GatewayRunner:
|
||||
logger.warning("Unknown reasoning_effort '%s', using default (medium)", effort)
|
||||
return None
|
||||
|
||||
@staticmethod
|
||||
def _load_background_notifications_mode() -> str:
|
||||
"""Load background process notification mode from config or env var.
|
||||
|
||||
Modes:
|
||||
- ``all`` — push running-output updates *and* the final message (default)
|
||||
- ``result`` — only the final completion message (regardless of exit code)
|
||||
- ``error`` — only the final message when exit code is non-zero
|
||||
- ``off`` — no watcher messages at all
|
||||
"""
|
||||
mode = os.getenv("HERMES_BACKGROUND_NOTIFICATIONS", "")
|
||||
if not mode:
|
||||
try:
|
||||
import yaml as _y
|
||||
cfg_path = _hermes_home / "config.yaml"
|
||||
if cfg_path.exists():
|
||||
with open(cfg_path, encoding="utf-8") as _f:
|
||||
cfg = _y.safe_load(_f) or {}
|
||||
raw = cfg.get("display", {}).get("background_process_notifications")
|
||||
if raw is False:
|
||||
mode = "off"
|
||||
elif raw not in (None, ""):
|
||||
mode = str(raw)
|
||||
except Exception:
|
||||
pass
|
||||
mode = (mode or "all").strip().lower()
|
||||
valid = {"all", "result", "error", "off"}
|
||||
if mode not in valid:
|
||||
logger.warning(
|
||||
"Unknown background_process_notifications '%s', defaulting to 'all'",
|
||||
mode,
|
||||
)
|
||||
return "all"
|
||||
return mode
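The resolution order in `_load_background_notifications_mode` is: environment variable first, then the config value (with a literal `false` mapped to `off`), then the default `all`, with anything unrecognised also falling back to `all`. The sketch below reproduces that order as a pure function so the edge cases are easy to check. The function name and the idea of passing both values in as arguments are illustrative; the real method reads the environment and `config.yaml` itself and logs a warning on unknown values.

```python
from typing import Any

VALID_MODES = {"all", "result", "error", "off"}


def resolve_mode(env_value: str, config_value: Any) -> str:
    """Resolve the notification mode from an env value and a config value."""
    mode = env_value
    if not mode:
        if config_value is False:
            # YAML `false` is treated as an explicit opt-out.
            mode = "off"
        elif config_value not in (None, ""):
            mode = str(config_value)
    mode = (mode or "all").strip().lower()
    return mode if mode in VALID_MODES else "all"


from_false = resolve_mode("", False)
env_wins = resolve_mode("ERROR", "off")
from_true = resolve_mode("", True)
```

Note the asymmetry: YAML `false` becomes `off`, while YAML `true` becomes the string `"true"`, which is not a valid mode and therefore resolves to `all`.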
|
||||
|
||||
@staticmethod
|
||||
def _load_provider_routing() -> dict:
|
||||
"""Load OpenRouter provider routing preferences from config.yaml."""
|
||||
@@ -398,7 +433,7 @@ class GatewayRunner:
|
||||
import yaml as _y
|
||||
cfg_path = _hermes_home / "config.yaml"
|
||||
if cfg_path.exists():
|
||||
with open(cfg_path) as _f:
|
||||
with open(cfg_path, encoding="utf-8") as _f:
|
||||
cfg = _y.safe_load(_f) or {}
|
||||
return cfg.get("provider_routing", {}) or {}
|
||||
except Exception:
|
||||
@@ -416,7 +451,7 @@ class GatewayRunner:
|
||||
import yaml as _y
|
||||
cfg_path = _hermes_home / "config.yaml"
|
||||
if cfg_path.exists():
|
||||
with open(cfg_path) as _f:
|
||||
with open(cfg_path, encoding="utf-8") as _f:
|
||||
cfg = _y.safe_load(_f) or {}
|
||||
fb = cfg.get("fallback_model", {}) or {}
|
||||
if fb.get("provider") and fb.get("model"):
|
||||
@@ -771,7 +806,7 @@ class GatewayRunner:
|
||||
_known_commands = {"new", "reset", "help", "status", "stop", "model",
|
||||
"personality", "retry", "undo", "sethome", "set-home",
|
||||
"compress", "usage", "insights", "reload-mcp", "reload_mcp",
|
||||
"update", "title", "resume", "provider"}
|
||||
"update", "title", "resume", "provider", "rollback"}
|
||||
if command and command in _known_commands:
|
||||
await self.hooks.emit(f"command:{command}", {
|
||||
"platform": source.platform.value if source.platform else "",
|
||||
@@ -830,6 +865,9 @@ class GatewayRunner:
|
||||
|
||||
if command == "resume":
|
||||
return await self._handle_resume_command(event)
|
||||
|
||||
if command == "rollback":
|
||||
return await self._handle_rollback_command(event)
|
||||
|
||||
# Skill slash commands: /skill-name loads the skill and sends to agent
|
||||
if command:
|
||||
@@ -931,7 +969,7 @@ class GatewayRunner:
|
||||
_hyg_cfg_path = _hermes_home / "config.yaml"
|
||||
if _hyg_cfg_path.exists():
|
||||
import yaml as _hyg_yaml
|
||||
with open(_hyg_cfg_path) as _hyg_f:
|
||||
with open(_hyg_cfg_path, encoding="utf-8") as _hyg_f:
|
||||
_hyg_data = _hyg_yaml.safe_load(_hyg_f) or {}
|
||||
|
||||
# Resolve model name (same logic as run_sync)
|
||||
@@ -1400,6 +1438,7 @@ class GatewayRunner:
|
||||
"`/resume [name]` — Resume a previously-named session",
|
||||
"`/usage` — Show token usage for this session",
|
||||
"`/insights [days]` — Show usage insights and analytics",
|
||||
"`/rollback [number]` — List or restore filesystem checkpoints",
|
||||
"`/reload-mcp` — Reload MCP servers from config",
|
||||
"`/update` — Update Hermes Agent to the latest version",
|
||||
"`/help` — Show this message",
|
||||
@@ -1434,7 +1473,7 @@ class GatewayRunner:
|
||||
current_provider = "openrouter"
|
||||
try:
|
||||
if config_path.exists():
|
||||
with open(config_path) as f:
|
||||
with open(config_path, encoding="utf-8") as f:
|
||||
cfg = yaml.safe_load(f) or {}
|
||||
model_cfg = cfg.get("model", {})
|
||||
if isinstance(model_cfg, str):
|
||||
@@ -1525,14 +1564,14 @@ class GatewayRunner:
|
||||
try:
|
||||
user_config = {}
|
||||
if config_path.exists():
|
||||
with open(config_path) as f:
|
||||
with open(config_path, encoding="utf-8") as f:
|
||||
user_config = yaml.safe_load(f) or {}
|
||||
if "model" not in user_config or not isinstance(user_config["model"], dict):
|
||||
user_config["model"] = {}
|
||||
user_config["model"]["default"] = new_model
|
||||
if provider_changed:
|
||||
user_config["model"]["provider"] = target_provider
|
||||
with open(config_path, 'w') as f:
|
||||
with open(config_path, 'w', encoding="utf-8") as f:
|
||||
yaml.dump(user_config, f, default_flow_style=False, sort_keys=False)
|
||||
except Exception as e:
|
||||
return f"⚠️ Failed to save model change: {e}"
|
||||
@@ -1569,7 +1608,7 @@ class GatewayRunner:
|
||||
config_path = _hermes_home / 'config.yaml'
|
||||
try:
|
||||
if config_path.exists():
|
||||
with open(config_path) as f:
|
||||
with open(config_path, encoding="utf-8") as f:
|
||||
cfg = yaml.safe_load(f) or {}
|
||||
model_cfg = cfg.get("model", {})
|
||||
if isinstance(model_cfg, dict):
|
||||
@@ -1618,7 +1657,7 @@ class GatewayRunner:
|
||||
|
||||
try:
|
||||
if config_path.exists():
|
||||
with open(config_path, 'r') as f:
|
||||
with open(config_path, 'r', encoding="utf-8") as f:
|
||||
config = yaml.safe_load(f) or {}
|
||||
personalities = config.get("agent", {}).get("personalities", {})
|
||||
else:
|
||||
@@ -1647,7 +1686,7 @@ class GatewayRunner:
|
||||
if "agent" not in config or not isinstance(config.get("agent"), dict):
|
||||
config["agent"] = {}
|
||||
config["agent"]["system_prompt"] = new_prompt
|
||||
with open(config_path, 'w') as f:
|
||||
with open(config_path, 'w', encoding="utf-8") as f:
|
||||
yaml.dump(config, f, default_flow_style=False, sort_keys=False)
|
||||
except Exception as e:
|
||||
return f"⚠️ Failed to save personality change: {e}"
|
||||
@@ -1731,10 +1770,10 @@ class GatewayRunner:
|
||||
config_path = _hermes_home / 'config.yaml'
|
||||
user_config = {}
|
||||
if config_path.exists():
|
||||
with open(config_path) as f:
|
||||
with open(config_path, encoding="utf-8") as f:
|
||||
user_config = yaml.safe_load(f) or {}
|
||||
user_config[env_key] = chat_id
|
||||
with open(config_path, 'w') as f:
|
||||
with open(config_path, 'w', encoding="utf-8") as f:
|
||||
yaml.dump(user_config, f, default_flow_style=False)
|
||||
# Also set in the current environment so it takes effect immediately
|
||||
os.environ[env_key] = str(chat_id)
|
||||
@@ -1746,6 +1785,65 @@ class GatewayRunner:
|
||||
f"Cron jobs and cross-platform messages will be delivered here."
|
||||
)
|
||||
|
||||
async def _handle_rollback_command(self, event: MessageEvent) -> str:
|
||||
"""Handle /rollback command — list or restore filesystem checkpoints."""
|
||||
from tools.checkpoint_manager import CheckpointManager, format_checkpoint_list
|
||||
|
||||
# Read checkpoint config from config.yaml
|
||||
cp_cfg = {}
|
||||
try:
|
||||
import yaml as _y
|
||||
_cfg_path = _hermes_home / "config.yaml"
|
||||
if _cfg_path.exists():
|
||||
with open(_cfg_path, encoding="utf-8") as _f:
|
||||
_data = _y.safe_load(_f) or {}
|
||||
cp_cfg = _data.get("checkpoints", {})
|
||||
if isinstance(cp_cfg, bool):
|
||||
cp_cfg = {"enabled": cp_cfg}
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
if not cp_cfg.get("enabled", False):
|
||||
return (
|
||||
"Checkpoints are not enabled.\n"
|
||||
"Enable in config.yaml:\n```\ncheckpoints:\n enabled: true\n```"
|
||||
)
|
||||
|
||||
mgr = CheckpointManager(
|
||||
enabled=True,
|
||||
max_snapshots=cp_cfg.get("max_snapshots", 50),
|
||||
)
|
||||
|
||||
cwd = os.getenv("MESSAGING_CWD", str(Path.home()))
|
||||
arg = event.get_command_args().strip()
|
||||
|
||||
if not arg:
|
||||
checkpoints = mgr.list_checkpoints(cwd)
|
||||
return format_checkpoint_list(checkpoints, cwd)
|
||||
|
||||
# Restore by number or hash
|
||||
checkpoints = mgr.list_checkpoints(cwd)
|
||||
if not checkpoints:
|
||||
return f"No checkpoints found for {cwd}"
|
||||
|
||||
target_hash = None
|
||||
try:
|
||||
idx = int(arg) - 1
|
||||
if 0 <= idx < len(checkpoints):
|
||||
target_hash = checkpoints[idx]["hash"]
|
||||
else:
|
||||
return f"Invalid checkpoint number. Use 1-{len(checkpoints)}."
|
||||
except ValueError:
|
||||
target_hash = arg
|
||||
|
||||
result = mgr.restore(cwd, target_hash)
|
||||
if result["success"]:
|
||||
return (
|
||||
f"✅ Restored to checkpoint {result['restored_to']}: {result['reason']}\n"
|
||||
f"A pre-rollback snapshot was saved automatically."
|
||||
)
|
||||
return f"❌ {result['error']}"
|
||||
|
||||
async def _handle_compress_command(self, event: MessageEvent) -> str:
|
||||
"""Handle /compress command -- manually compress conversation context."""
|
||||
source = event.source
|
||||
@@ -2307,6 +2405,12 @@ class GatewayRunner:
|
||||
|
||||
Runs as an asyncio task. Stays silent when nothing changed.
|
||||
Auto-removes when the process exits or is killed.
|
||||
|
||||
Notification mode (from ``display.background_process_notifications``):
|
||||
- ``all`` — running-output updates + final message
|
||||
- ``result`` — final completion message only
|
||||
- ``error`` — final message only when exit code != 0
|
||||
- ``off`` — no messages at all
|
||||
"""
|
||||
from tools.process_registry import process_registry
|
||||
|
||||
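The four notification modes listed in the docstring above reduce to two small predicates. The sketch below is illustrative only: the function names `should_notify_on_exit` and `should_stream_output` are hypothetical and do not appear in the gateway code, which inlines the same conditions.

```python
from typing import Optional


def should_notify_on_exit(notify_mode: str, exit_code: Optional[int]) -> bool:
    """Return True if a final message should be sent when the process exits."""
    # "all" and "result" always send the final message; "error" sends it only
    # for a known non-zero exit code; "off" never sends anything.
    return notify_mode in ("all", "result") or (
        notify_mode == "error" and exit_code not in (0, None)
    )


def should_stream_output(notify_mode: str) -> bool:
    """Return True if running-output updates should be sent (only in 'all')."""
    return notify_mode == "all"


for mode in ("all", "result", "error", "off"):
    print(mode, should_notify_on_exit(mode, 0), should_notify_on_exit(mode, 1),
          should_stream_output(mode))
```

Note that in `error` mode an unknown exit code (`None`) is treated like success, so no message is sent.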
@@ -2315,8 +2419,21 @@ class GatewayRunner:
|
||||
session_key = watcher.get("session_key", "")
|
||||
platform_name = watcher.get("platform", "")
|
||||
chat_id = watcher.get("chat_id", "")
|
||||
notify_mode = self._load_background_notifications_mode()
|
||||
|
||||
logger.debug("Process watcher started: %s (every %ss)", session_id, interval)
|
||||
logger.debug("Process watcher started: %s (every %ss, notify=%s)",
|
||||
session_id, interval, notify_mode)
|
||||
|
||||
if notify_mode == "off":
|
||||
# Still wait for the process to exit so we can log it, but don't
|
||||
# push any messages to the user.
|
||||
while True:
|
||||
await asyncio.sleep(interval)
|
||||
session = process_registry.get(session_id)
|
||||
if session is None or session.exited:
|
||||
break
|
||||
logger.debug("Process watcher ended (silent): %s", session_id)
|
||||
return
|
||||
|
||||
last_output_len = 0
|
||||
while True:
|
||||
@@ -2331,27 +2448,31 @@ class GatewayRunner:
|
||||
last_output_len = current_output_len
|
||||
|
||||
if session.exited:
|
||||
# Process finished -- deliver final update
|
||||
new_output = session.output_buffer[-1000:] if session.output_buffer else ""
|
||||
message_text = (
|
||||
f"[Background process {session_id} finished with exit code {session.exit_code}~ "
|
||||
f"Here's the final output:\n{new_output}]"
|
||||
# Decide whether to notify based on mode
|
||||
should_notify = (
|
||||
notify_mode in ("all", "result")
|
||||
or (notify_mode == "error" and session.exit_code not in (0, None))
|
||||
)
|
||||
# Try to deliver to the originating platform
|
||||
adapter = None
|
||||
for p, a in self.adapters.items():
|
||||
if p.value == platform_name:
|
||||
adapter = a
|
||||
break
|
||||
if adapter and chat_id:
|
||||
try:
|
||||
await adapter.send(chat_id, message_text)
|
||||
except Exception as e:
|
||||
logger.error("Watcher delivery error: %s", e)
|
||||
if should_notify:
|
||||
new_output = session.output_buffer[-1000:] if session.output_buffer else ""
|
||||
message_text = (
|
||||
f"[Background process {session_id} finished with exit code {session.exit_code}~ "
|
||||
f"Here's the final output:\n{new_output}]"
|
||||
)
|
||||
adapter = None
|
||||
for p, a in self.adapters.items():
|
||||
if p.value == platform_name:
|
||||
adapter = a
|
||||
break
|
||||
if adapter and chat_id:
|
||||
try:
|
||||
await adapter.send(chat_id, message_text)
|
||||
except Exception as e:
|
||||
logger.error("Watcher delivery error: %s", e)
|
||||
break
|
||||
|
||||
elif has_new_output:
|
||||
# New output available -- deliver status update
|
||||
elif has_new_output and notify_mode == "all":
|
||||
# New output available -- deliver status update (only in "all" mode)
|
||||
new_output = session.output_buffer[-500:] if session.output_buffer else ""
|
||||
message_text = (
|
||||
f"[Background process {session_id} is still running~ "
|
||||
@@ -2402,6 +2523,8 @@ class GatewayRunner:
|
||||
Platform.DISCORD: "hermes-discord",
|
||||
Platform.WHATSAPP: "hermes-whatsapp",
|
||||
Platform.SLACK: "hermes-slack",
|
||||
Platform.SIGNAL: "hermes-signal",
|
||||
Platform.HOMEASSISTANT: "hermes-homeassistant",
|
||||
}
|
||||
|
||||
# Try to load platform_toolsets from config
|
||||
@@ -2410,7 +2533,7 @@ class GatewayRunner:
|
||||
config_path = _hermes_home / 'config.yaml'
|
||||
if config_path.exists():
|
||||
import yaml
|
||||
with open(config_path, 'r') as f:
|
||||
with open(config_path, 'r', encoding="utf-8") as f:
|
||||
user_config = yaml.safe_load(f) or {}
|
||||
platform_toolsets_config = user_config.get("platform_toolsets", {})
|
||||
except Exception as e:
|
||||
@@ -2423,6 +2546,8 @@ class GatewayRunner:
|
||||
Platform.DISCORD: "discord",
|
||||
Platform.WHATSAPP: "whatsapp",
|
||||
Platform.SLACK: "slack",
|
||||
Platform.SIGNAL: "signal",
|
||||
Platform.HOMEASSISTANT: "homeassistant",
|
||||
}.get(source.platform, "telegram")
|
||||
|
||||
# Use config override if present (list of toolsets), otherwise hardcoded default
|
||||
@@ -2440,7 +2565,7 @@ class GatewayRunner:
|
||||
_tp_cfg_path = _hermes_home / "config.yaml"
|
||||
if _tp_cfg_path.exists():
|
||||
import yaml as _tp_yaml
|
||||
with open(_tp_cfg_path) as _tp_f:
|
||||
with open(_tp_cfg_path, encoding="utf-8") as _tp_f:
|
||||
_tp_data = _tp_yaml.safe_load(_tp_f) or {}
|
||||
_progress_cfg = _tp_data.get("display", {})
|
||||
except Exception:
|
||||
@@ -2531,6 +2656,8 @@ class GatewayRunner:
|
||||
|
||||
# Background task to send progress messages
|
||||
# Accumulates tool lines into a single message that gets edited
|
||||
_progress_metadata = {"thread_id": source.thread_id} if source.thread_id else None
|
||||
|
||||
async def send_progress_messages():
|
||||
if not progress_queue:
|
||||
return
|
||||
@@ -2560,15 +2687,15 @@ class GatewayRunner:
|
||||
# Platform doesn't support editing — stop trying,
|
||||
# send just this new line as a separate message
|
||||
can_edit = False
|
||||
await adapter.send(chat_id=source.chat_id, content=msg)
|
||||
await adapter.send(chat_id=source.chat_id, content=msg, metadata=_progress_metadata)
|
||||
else:
|
||||
if can_edit:
|
||||
# First tool: send all accumulated text as new message
|
||||
full_text = "\n".join(progress_lines)
|
||||
result = await adapter.send(chat_id=source.chat_id, content=full_text)
|
||||
result = await adapter.send(chat_id=source.chat_id, content=full_text, metadata=_progress_metadata)
|
||||
else:
|
||||
# Editing unsupported: send just this line
|
||||
result = await adapter.send(chat_id=source.chat_id, content=msg)
|
||||
result = await adapter.send(chat_id=source.chat_id, content=msg, metadata=_progress_metadata)
|
||||
if result.success and result.message_id:
|
||||
progress_msg_id = result.message_id
|
||||
|
||||
@@ -2658,7 +2785,7 @@ class GatewayRunner:
|
||||
import yaml as _y
|
||||
_cfg_path = _hermes_home / "config.yaml"
|
||||
if _cfg_path.exists():
|
||||
with open(_cfg_path) as _f:
|
||||
with open(_cfg_path, encoding="utf-8") as _f:
|
||||
_cfg = _y.safe_load(_f) or {}
|
||||
_model_cfg = _cfg.get("model", {})
|
||||
if isinstance(_model_cfg, str):
|
||||
@@ -3140,7 +3267,7 @@ def main():
|
||||
config = None
|
||||
if args.config:
|
||||
import json
|
||||
with open(args.config) as f:
|
||||
with open(args.config, encoding="utf-8") as f:
|
||||
data = json.load(f)
|
||||
config = GatewayConfig.from_dict(data)
|
||||
|
||||
|
||||
@@ -272,8 +272,8 @@ class SessionEntry:
|
||||
if data.get("platform"):
|
||||
try:
|
||||
platform = Platform(data["platform"])
|
||||
except ValueError:
|
||||
pass
|
||||
except ValueError as e:
|
||||
logger.debug("Unknown platform value %r: %s", data["platform"], e)
|
||||
|
||||
return cls(
|
||||
session_key=data["session_key"],
|
||||
@@ -353,12 +353,26 @@ class SessionStore:
|
||||
|
||||
def _save(self) -> None:
|
||||
"""Save sessions index to disk (kept for session key -> ID mapping)."""
|
||||
import tempfile
|
||||
self.sessions_dir.mkdir(parents=True, exist_ok=True)
|
||||
sessions_file = self.sessions_dir / "sessions.json"
|
||||
|
||||
|
||||
data = {key: entry.to_dict() for key, entry in self._entries.items()}
|
||||
with open(sessions_file, "w", encoding="utf-8") as f:
|
||||
json.dump(data, f, indent=2)
|
||||
fd, tmp_path = tempfile.mkstemp(
|
||||
dir=str(self.sessions_dir), suffix=".tmp", prefix=".sessions_"
|
||||
)
|
||||
try:
|
||||
with os.fdopen(fd, "w", encoding="utf-8") as f:
|
||||
json.dump(data, f, indent=2)
|
||||
f.flush()
|
||||
os.fsync(f.fileno())
|
||||
os.replace(tmp_path, sessions_file)
|
||||
except BaseException:
|
||||
try:
|
||||
os.unlink(tmp_path)
|
||||
except OSError as e:
|
||||
logger.debug("Could not remove temp file %s: %s", tmp_path, e)
|
||||
raise
|
||||
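The write-to-temp-then-replace pattern used by `_save` above can be tried in isolation. This is a minimal sketch under assumptions: `atomic_json_write` is a hypothetical standalone helper, not a function from this repository, and cleanup errors are silently ignored here rather than logged.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_json_write(target: Path, data: dict) -> None:
    """Write JSON to a temp file in the same directory, then swap it into place."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem as the target so that
    # os.replace can swap it in as a single rename.
    fd, tmp_path = tempfile.mkstemp(
        dir=str(target.parent), suffix=".tmp", prefix=".sessions_"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, target)
    except BaseException:
        # Do not leave a stray temp file behind if anything failed.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise


with tempfile.TemporaryDirectory() as d:
    path = Path(d) / "sessions.json"
    atomic_json_write(path, {"a": 1})
    atomic_json_write(path, {"a": 2})
    loaded = json.loads(path.read_text(encoding="utf-8"))
    leftovers = [p.name for p in Path(d).iterdir() if p.name != "sessions.json"]
    print(loaded, leftovers)
```

A reader that opens the target at any moment sees either the old file or the new one, never a partially written file.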
|
||||
def _generate_session_key(self, source: SessionSource) -> str:
|
||||
"""Generate a session key from a source."""
|
||||
|
||||
@@ -23,6 +23,7 @@ import stat
|
||||
import base64
|
||||
import hashlib
|
||||
import subprocess
|
||||
import threading
|
||||
import time
|
||||
import uuid
|
||||
import webbrowser
|
||||
@@ -44,6 +45,10 @@ try:
|
||||
import fcntl
|
||||
except Exception:
|
||||
fcntl = None
|
||||
try:
|
||||
import msvcrt
|
||||
except Exception:
|
||||
msvcrt = None
|
||||
|
||||
# =============================================================================
|
||||
# Constants
|
||||
@@ -103,6 +108,14 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
|
||||
auth_type="oauth_external",
|
||||
inference_base_url=DEFAULT_CODEX_BASE_URL,
|
||||
),
|
||||
"nous-api": ProviderConfig(
|
||||
id="nous-api",
|
||||
name="Nous Portal (API Key)",
|
||||
auth_type="api_key",
|
||||
inference_base_url="https://inference-api.nousresearch.com/v1",
|
||||
api_key_env_vars=("NOUS_API_KEY",),
|
||||
base_url_env_var="NOUS_BASE_URL",
|
||||
),
|
||||
"zai": ProviderConfig(
|
||||
id="zai",
|
||||
name="Z.AI / GLM",
|
||||
@@ -299,31 +312,64 @@ def _auth_lock_path() -> Path:
|
||||
return _auth_file_path().with_suffix(".lock")
|
||||
|
||||
|
||||
_auth_lock_holder = threading.local()
|
||||
|
||||
@contextmanager
|
||||
def _auth_store_lock(timeout_seconds: float = AUTH_LOCK_TIMEOUT_SECONDS):
|
||||
"""Cross-process advisory lock for auth.json reads+writes."""
|
||||
"""Cross-process advisory lock for auth.json reads+writes. Reentrant."""
|
||||
# Reentrant: if this thread already holds the lock, just yield.
|
||||
if getattr(_auth_lock_holder, "depth", 0) > 0:
|
||||
_auth_lock_holder.depth += 1
|
||||
try:
|
||||
yield
|
||||
finally:
|
||||
_auth_lock_holder.depth -= 1
|
||||
return
|
||||
|
||||
lock_path = _auth_lock_path()
|
||||
lock_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
with lock_path.open("a+") as lock_file:
|
||||
if fcntl is None:
|
||||
if fcntl is None and msvcrt is None:
|
||||
_auth_lock_holder.depth = 1
|
||||
try:
|
||||
yield
|
||||
return
|
||||
finally:
|
||||
_auth_lock_holder.depth = 0
|
||||
return
|
||||
|
||||
# On Windows, msvcrt.locking needs the file to have content and the
|
||||
# file pointer at position 0. Ensure the lock file has at least 1 byte.
|
||||
if msvcrt and (not lock_path.exists() or lock_path.stat().st_size == 0):
|
||||
lock_path.write_text(" ", encoding="utf-8")
|
||||
|
||||
with lock_path.open("r+" if msvcrt else "a+") as lock_file:
|
||||
deadline = time.time() + max(1.0, timeout_seconds)
|
||||
while True:
|
||||
try:
|
||||
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
|
||||
if fcntl:
|
||||
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
|
||||
else:
|
||||
lock_file.seek(0)
|
||||
msvcrt.locking(lock_file.fileno(), msvcrt.LK_NBLCK, 1)
|
||||
break
|
||||
except BlockingIOError:
|
||||
except (BlockingIOError, OSError, PermissionError):
|
||||
if time.time() >= deadline:
|
||||
raise TimeoutError("Timed out waiting for auth store lock")
|
||||
time.sleep(0.05)
|
||||
|
||||
_auth_lock_holder.depth = 1
|
||||
try:
|
||||
yield
|
||||
finally:
|
||||
fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)
|
||||
_auth_lock_holder.depth = 0
|
||||
if fcntl:
|
||||
fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)
|
||||
elif msvcrt:
|
||||
try:
|
||||
lock_file.seek(0)
|
||||
msvcrt.locking(lock_file.fileno(), msvcrt.LK_UNLCK, 1)
|
||||
except (OSError, IOError):
|
||||
pass
|
||||
|
||||
|
||||
def _load_auth_store(auth_file: Optional[Path] = None) -> Dict[str, Any]:
|
||||
@@ -475,6 +521,7 @@ def resolve_provider(
|
||||
|
||||
# Normalize provider aliases
|
||||
_PROVIDER_ALIASES = {
|
||||
"nous_api": "nous-api", "nousapi": "nous-api", "nous-portal-api": "nous-api",
|
||||
"glm": "zai", "z-ai": "zai", "z.ai": "zai", "zhipu": "zai",
|
||||
"kimi": "kimi-coding", "moonshot": "kimi-coding",
|
||||
"minimax-china": "minimax-cn", "minimax_cn": "minimax-cn",
|
||||
|
||||
@@ -36,6 +36,28 @@ def cprint(text: str):
|
||||
_pt_print(_PT_ANSI(text))
|
||||
|
||||
|
||||
# =========================================================================
# Skin-aware color helpers
# =========================================================================

def _skin_color(key: str, fallback: str) -> str:
    """Get a color from the active skin, or return fallback."""
    try:
        from hermes_cli.skin_engine import get_active_skin
        return get_active_skin().get_color(key, fallback)
    except Exception:
        return fallback


def _skin_branding(key: str, fallback: str) -> str:
    """Get a branding string from the active skin, or return fallback."""
    try:
        from hermes_cli.skin_engine import get_active_skin
        return get_active_skin().get_branding(key, fallback)
    except Exception:
        return fallback
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# ASCII Art & Branding
|
||||
# =========================================================================
|
||||
@@ -217,18 +239,24 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
|
||||
layout_table.add_column("left", justify="center")
|
||||
layout_table.add_column("right", justify="left")
|
||||
|
||||
# Resolve skin colors once for the entire banner
|
||||
accent = _skin_color("banner_accent", "#FFBF00")
|
||||
dim = _skin_color("banner_dim", "#B8860B")
|
||||
text = _skin_color("banner_text", "#FFF8DC")
|
||||
session_color = _skin_color("session_border", "#8B8682")
|
||||
|
||||
left_lines = ["", HERMES_CADUCEUS, ""]
|
||||
model_short = model.split("/")[-1] if "/" in model else model
|
||||
if len(model_short) > 28:
|
||||
model_short = model_short[:25] + "..."
|
||||
ctx_str = f" [dim #B8860B]·[/] [dim #B8860B]{_format_context_length(context_length)} context[/]" if context_length else ""
|
||||
left_lines.append(f"[#FFBF00]{model_short}[/]{ctx_str} [dim #B8860B]·[/] [dim #B8860B]Nous Research[/]")
|
||||
left_lines.append(f"[dim #B8860B]{cwd}[/]")
|
||||
ctx_str = f" [dim {dim}]·[/] [dim {dim}]{_format_context_length(context_length)} context[/]" if context_length else ""
|
||||
left_lines.append(f"[{accent}]{model_short}[/]{ctx_str} [dim {dim}]·[/] [dim {dim}]Nous Research[/]")
|
||||
left_lines.append(f"[dim {dim}]{cwd}[/]")
|
||||
if session_id:
|
||||
left_lines.append(f"[dim #8B8682]Session: {session_id}[/]")
|
||||
left_lines.append(f"[dim {session_color}]Session: {session_id}[/]")
|
||||
left_content = "\n".join(left_lines)
|
||||
|
||||
right_lines = ["[bold #FFBF00]Available Tools[/]"]
|
||||
right_lines = [f"[bold {accent}]Available Tools[/]"]
|
||||
toolsets_dict: Dict[str, list] = {}
|
||||
|
||||
for tool in tools:
|
||||
@@ -256,7 +284,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
|
||||
if name in disabled_tools:
|
||||
colored_names.append(f"[red]{name}[/]")
|
||||
else:
|
||||
colored_names.append(f"[#FFF8DC]{name}[/]")
|
||||
colored_names.append(f"[{text}]{name}[/]")
|
||||
|
||||
tools_str = ", ".join(colored_names)
|
||||
if len(", ".join(sorted(tool_names))) > 45:
|
||||
@@ -275,7 +303,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
|
||||
elif name in disabled_tools:
|
||||
colored_names.append(f"[red]{name}[/]")
|
||||
else:
|
||||
colored_names.append(f"[#FFF8DC]{name}[/]")
|
||||
colored_names.append(f"[{text}]{name}[/]")
|
||||
tools_str = ", ".join(colored_names)
|
||||
|
||||
right_lines.append(f"[dim #B8860B]{toolset}:[/] {tools_str}")
|
||||
@@ -306,7 +334,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
|
||||
)
|
||||
|
||||
right_lines.append("")
|
||||
right_lines.append("[bold #FFBF00]Available Skills[/]")
|
||||
right_lines.append(f"[bold {accent}]Available Skills[/]")
|
||||
skills_by_category = get_available_skills()
|
||||
total_skills = sum(len(s) for s in skills_by_category.values())
|
||||
|
||||
@@ -320,9 +348,9 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
|
||||
skills_str = ", ".join(skill_names)
|
||||
if len(skills_str) > 50:
|
||||
skills_str = skills_str[:47] + "..."
|
||||
right_lines.append(f"[dim #B8860B]{category}:[/] [#FFF8DC]{skills_str}[/]")
|
||||
right_lines.append(f"[dim {dim}]{category}:[/] [{text}]{skills_str}[/]")
|
||||
else:
|
||||
right_lines.append("[dim #B8860B]No skills installed[/]")
|
||||
right_lines.append(f"[dim {dim}]No skills installed[/]")
|
||||
|
||||
right_lines.append("")
|
||||
mcp_connected = sum(1 for s in mcp_status if s["connected"]) if mcp_status else 0
|
||||
@@ -330,7 +358,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
|
||||
if mcp_connected:
|
||||
summary_parts.append(f"{mcp_connected} MCP servers")
|
||||
summary_parts.append("/help for commands")
|
||||
right_lines.append(f"[dim #B8860B]{' · '.join(summary_parts)}[/]")
|
||||
right_lines.append(f"[dim {dim}]{' · '.join(summary_parts)}[/]")
|
||||
|
||||
# Update check — show if behind origin/main
|
||||
try:
|
||||
@@ -347,10 +375,13 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
|
||||
right_content = "\n".join(right_lines)
|
||||
layout_table.add_row(left_content, right_content)
|
||||
|
||||
agent_name = _skin_branding("agent_name", "Hermes Agent")
|
||||
title_color = _skin_color("banner_title", "#FFD700")
|
||||
border_color = _skin_color("banner_border", "#CD7F32")
|
||||
outer_panel = Panel(
|
||||
layout_table,
|
||||
title=f"[bold #FFD700]Hermes Agent {VERSION}[/]",
|
||||
border_style="#CD7F32",
|
||||
title=f"[bold {title_color}]{agent_name} {VERSION}[/]",
|
||||
border_style=border_color,
|
||||
padding=(0, 2),
|
||||
)
|
||||
|
||||
|
||||
@@ -292,9 +292,12 @@ def _convert_to_png(path: Path) -> bool:
|
||||
["convert", str(tmp), "png:" + str(path)],
|
||||
capture_output=True, timeout=5,
|
||||
)
|
||||
tmp.unlink(missing_ok=True)
|
||||
if r.returncode == 0 and path.exists() and path.stat().st_size > 0:
|
||||
tmp.unlink(missing_ok=True)
|
||||
return True
|
||||
else:
|
||||
# Convert failed — restore the original file
|
||||
tmp.rename(path)
|
||||
except FileNotFoundError:
|
||||
logger.debug("ImageMagick not installed — cannot convert BMP to PNG")
|
||||
if tmp.exists() and not path.exists():
|
||||
|
||||
@@ -39,6 +39,8 @@ COMMANDS = {
|
||||
"/insights": "Show usage insights and analytics (last 30 days)",
|
||||
"/paste": "Check clipboard for an image and attach it",
|
||||
"/reload-mcp": "Reload MCP servers from config.yaml",
|
||||
"/rollback": "List or restore filesystem checkpoints (usage: /rollback [number])",
|
||||
"/skin": "Show or change the display skin/theme",
|
||||
"/quit": "Exit the CLI (also: /exit, /q)",
|
||||
}
|
||||
|
||||
|
||||
@@ -14,8 +14,9 @@ This module provides:
|
||||
|
||||
import os
|
||||
import platform
|
||||
import sys
|
||||
import stat
|
||||
import subprocess
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any, Optional, List, Tuple
|
||||
|
||||
@@ -62,7 +63,9 @@ def ensure_hermes_home():
|
||||
DEFAULT_CONFIG = {
|
||||
"model": "anthropic/claude-opus-4.6",
|
||||
"toolsets": ["hermes-cli"],
|
||||
"max_turns": 100,
|
||||
"agent": {
|
||||
"max_turns": 90,
|
||||
},
|
||||
|
||||
"terminal": {
|
||||
"backend": "local",
|
||||
@@ -88,6 +91,14 @@ DEFAULT_CONFIG = {
|
||||
"record_sessions": False, # Auto-record browser sessions as WebM videos
|
||||
},
|
||||
|
||||
# Filesystem checkpoints — automatic snapshots before destructive file ops.
|
||||
# When enabled, the agent takes a snapshot of the working directory once per
|
||||
# conversation turn (on first write_file/patch call). Use /rollback to restore.
|
||||
"checkpoints": {
|
||||
"enabled": False,
|
||||
"max_snapshots": 50, # Max checkpoints to keep per directory
|
||||
},
|
||||
|
||||
"compression": {
|
||||
"enabled": True,
|
||||
"threshold": 0.85,
|
||||
@@ -111,8 +122,9 @@ DEFAULT_CONFIG = {
|
||||
"display": {
|
||||
"compact": False,
|
||||
"personality": "kawaii",
|
||||
"resume_display": "full", # "full" (show previous messages) | "minimal" (one-liner only)
|
||||
"bell_on_complete": False, # Play terminal bell (\a) when agent finishes a response
|
||||
"resume_display": "full",
|
||||
"bell_on_complete": False,
|
||||
"skin": "default",
|
||||
},
|
||||
|
||||
# Text-to-speech configuration
|
||||
@@ -170,7 +182,7 @@ DEFAULT_CONFIG = {
|
||||
"command_allowlist": [],
|
||||
|
||||
# Config schema version - bump this when adding new required fields
|
||||
"_config_version": 5,
|
||||
"_config_version": 6,
|
||||
}
|
||||
|
||||
# =============================================================================
|
||||
@@ -195,6 +207,22 @@ REQUIRED_ENV_VARS = {}
|
||||
# Optional environment variables that enhance functionality
|
||||
OPTIONAL_ENV_VARS = {
|
||||
# ── Provider (handled in provider selection, not shown in checklists) ──
|
||||
"NOUS_API_KEY": {
|
||||
"description": "Nous Portal API key (direct API key access to Nous inference)",
|
||||
"prompt": "Nous Portal API key",
|
||||
"url": "https://portal.nousresearch.com",
|
||||
"password": True,
|
||||
"category": "provider",
|
||||
"advanced": True,
|
||||
},
|
||||
"NOUS_BASE_URL": {
|
||||
"description": "Nous Portal base URL override",
|
||||
"prompt": "Nous Portal base URL (leave empty for default)",
|
||||
"url": None,
|
||||
"password": False,
|
||||
"category": "provider",
|
||||
"advanced": True,
|
||||
},
|
||||
"OPENROUTER_API_KEY": {
|
||||
"description": "OpenRouter API key (for vision, web scraping helpers, and MoA)",
|
||||
"prompt": "OpenRouter API key",
|
||||
@@ -748,6 +776,23 @@ def _deep_merge(base: dict, override: dict) -> dict:
|
||||
return result
|
||||
|
||||
|
||||
def _normalize_max_turns_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Normalize legacy root-level max_turns into agent.max_turns."""
    config = dict(config)
    agent_config = dict(config.get("agent") or {})

    if "max_turns" in config and "max_turns" not in agent_config:
        agent_config["max_turns"] = config["max_turns"]

    if "max_turns" not in agent_config:
        agent_config["max_turns"] = DEFAULT_CONFIG["agent"]["max_turns"]

    config["agent"] = agent_config
    config.pop("max_turns", None)
    return config
|
||||
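The effect of the normalization above is easiest to see on three inputs: a legacy config, a config that sets both keys, and an empty one. The sketch copies the function body from the hunk and uses a stand-in `DEFAULT_CONFIG` holding only the one key it reads; the sample values (`40`, `12`, `"m"`) are made up for illustration.

```python
from typing import Any, Dict

# Stand-in for the real DEFAULT_CONFIG; only the key read below is included.
DEFAULT_CONFIG = {"agent": {"max_turns": 90}}


def normalize_max_turns_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Move a legacy root-level max_turns into agent.max_turns."""
    config = dict(config)
    agent_config = dict(config.get("agent") or {})

    # A root-level value is used only when agent.max_turns is absent.
    if "max_turns" in config and "max_turns" not in agent_config:
        agent_config["max_turns"] = config["max_turns"]

    if "max_turns" not in agent_config:
        agent_config["max_turns"] = DEFAULT_CONFIG["agent"]["max_turns"]

    config["agent"] = agent_config
    config.pop("max_turns", None)
    return config


legacy = normalize_max_turns_config({"model": "m", "max_turns": 40})
both = normalize_max_turns_config({"max_turns": 40, "agent": {"max_turns": 12}})
empty = normalize_max_turns_config({})
print(legacy)
print(both)
print(empty)
```

When both keys are present, `agent.max_turns` wins and the root-level value is dropped.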
|
||||
|
||||
|
||||
def load_config() -> Dict[str, Any]:
|
||||
"""Load configuration from ~/.hermes/config.yaml."""
|
||||
import copy
|
||||
@@ -757,14 +802,21 @@ def load_config() -> Dict[str, Any]:
|
||||
|
||||
if config_path.exists():
|
||||
try:
|
||||
with open(config_path) as f:
|
||||
with open(config_path, encoding="utf-8") as f:
|
||||
user_config = yaml.safe_load(f) or {}
|
||||
|
||||
|
||||
if "max_turns" in user_config:
|
||||
agent_user_config = dict(user_config.get("agent") or {})
|
||||
if agent_user_config.get("max_turns") is None:
|
||||
agent_user_config["max_turns"] = user_config["max_turns"]
|
||||
user_config["agent"] = agent_user_config
|
||||
user_config.pop("max_turns", None)
|
||||
|
||||
config = _deep_merge(config, user_config)
|
||||
except Exception as e:
|
||||
print(f"Warning: Failed to load config: {e}")
|
||||
|
||||
return config
|
||||
return _normalize_max_turns_config(config)
|
||||
|
||||
|
||||
_COMMENTED_SECTIONS = """
|
||||
@@ -799,23 +851,27 @@ _COMMENTED_SECTIONS = """
|
||||
|
||||
def save_config(config: Dict[str, Any]):
|
||||
"""Save configuration to ~/.hermes/config.yaml."""
|
||||
from utils import atomic_yaml_write
|
||||
|
||||
ensure_hermes_home()
|
||||
config_path = get_config_path()
|
||||
|
||||
with open(config_path, 'w') as f:
|
||||
yaml.dump(config, f, default_flow_style=False, sort_keys=False)
|
||||
# Append commented-out sections for features that are off by default
|
||||
# or only relevant when explicitly configured. Skip sections the
|
||||
# user has already uncommented and configured.
|
||||
sections = []
|
||||
sec = config.get("security", {})
|
||||
if not sec or sec.get("redact_secrets") is None:
|
||||
sections.append("security")
|
||||
fb = config.get("fallback_model", {})
|
||||
if not fb or not (fb.get("provider") and fb.get("model")):
|
||||
sections.append("fallback")
|
||||
if sections:
|
||||
f.write(_COMMENTED_SECTIONS)
|
||||
normalized = _normalize_max_turns_config(config)
|
||||
|
||||
# Build optional commented-out sections for features that are off by
|
||||
# default or only relevant when explicitly configured.
|
||||
sections = []
|
||||
sec = normalized.get("security", {})
|
||||
if not sec or sec.get("redact_secrets") is None:
|
||||
sections.append("security")
|
||||
fb = normalized.get("fallback_model", {})
|
||||
if not fb or not (fb.get("provider") and fb.get("model")):
|
||||
sections.append("fallback")
|
||||
|
||||
atomic_yaml_write(
|
||||
config_path,
|
||||
normalized,
|
||||
extra_content=_COMMENTED_SECTIONS if sections else None,
|
||||
)
|
||||
|
||||
|
||||
def load_env() -> Dict[str, str]:
|
||||
@@ -869,6 +925,13 @@ def save_env_value(key: str, value: str):
|
||||
with open(env_path, 'w', **write_kw) as f:
|
||||
f.writelines(lines)
|
||||
|
||||
# Restrict .env permissions to owner-only (contains API keys)
|
||||
if not _IS_WINDOWS:
|
||||
try:
|
||||
os.chmod(env_path, stat.S_IRUSR | stat.S_IWUSR)
|
||||
except OSError:
|
||||
pass
|
||||
|
||||
|
||||
def get_env_value(key: str) -> Optional[str]:
|
||||
"""Get a value from ~/.hermes/.env or environment."""
|
||||
@@ -932,7 +995,7 @@ def show_config():
|
||||
print()
|
||||
print(color("◆ Model", Colors.CYAN, Colors.BOLD))
|
||||
print(f" Model: {config.get('model', 'not set')}")
|
||||
print(f" Max turns: {config.get('max_turns', 100)}")
|
||||
print(f" Max turns: {config.get('agent', {}).get('max_turns', DEFAULT_CONFIG['agent']['max_turns'])}")
|
||||
print(f" Toolsets: {', '.join(config.get('toolsets', ['all']))}")
|
||||
|
||||
# Terminal
|
||||
@@ -1077,7 +1140,7 @@ def set_config_value(key: str, value: str):
|
||||
user_config = {}
|
||||
if config_path.exists():
|
||||
try:
|
||||
with open(config_path) as f:
|
||||
with open(config_path, encoding="utf-8") as f:
|
||||
user_config = yaml.safe_load(f) or {}
|
||||
except Exception:
|
||||
user_config = {}
|
||||
@@ -1105,7 +1168,7 @@ def set_config_value(key: str, value: str):
|
||||
|
||||
# Write only user config back (not the full merged defaults)
|
||||
ensure_hermes_home()
|
||||
with open(config_path, 'w') as f:
|
||||
with open(config_path, 'w', encoding="utf-8") as f:
|
||||
yaml.dump(user_config, f, default_flow_style=False, sort_keys=False)
|
||||
|
||||
# Keep .env in sync for keys that terminal_tool reads directly from env vars.
|
||||
|
||||
@@ -489,6 +489,7 @@ def cmd_chat(args):
|
||||
"query": args.query,
|
||||
"resume": getattr(args, "resume", None),
|
||||
"worktree": getattr(args, "worktree", False),
|
||||
"checkpoints": getattr(args, "checkpoints", False),
|
||||
}
|
||||
# Filter out None values
|
||||
kwargs = {k: v for k, v in kwargs.items() if v is not None}
|
||||
@@ -1777,6 +1778,44 @@ def cmd_update(args):
|
||||
sys.exit(1)
|
||||
|
||||
|
||||
def _coalesce_session_name_args(argv: list) -> list:
    """Join unquoted multi-word session names after -c/--continue and -r/--resume.

    When a user types ``hermes -c Pokemon Agent Dev`` without quoting the
    session name, argparse sees three separate tokens. This function merges
    them into a single argument so argparse receives
    ``['-c', 'Pokemon Agent Dev']`` instead.

    Tokens are collected after the flag until we hit another flag (``-*``)
    or a known top-level subcommand.
    """
    _SUBCOMMANDS = {
        "chat", "model", "gateway", "setup", "whatsapp", "login", "logout",
        "status", "cron", "doctor", "config", "pairing", "skills", "tools",
        "sessions", "insights", "version", "update", "uninstall",
    }
    _SESSION_FLAGS = {"-c", "--continue", "-r", "--resume"}

    result = []
    i = 0
    while i < len(argv):
        token = argv[i]
        if token in _SESSION_FLAGS:
            result.append(token)
            i += 1
            # Collect subsequent non-flag, non-subcommand tokens as one name
            parts: list = []
            while i < len(argv) and not argv[i].startswith("-") and argv[i] not in _SUBCOMMANDS:
                parts.append(argv[i])
                i += 1
            if parts:
                result.append(" ".join(parts))
        else:
            result.append(token)
            i += 1
    return result
|
||||
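A few sample calls make the merging rule above concrete. The sketch is a condensed copy of the helper so that it runs standalone; the subcommand set is shortened for illustration, and the session names are invented.

```python
# Shortened subcommand set, for illustration only.
_SUBCOMMANDS = {"chat", "model", "gateway", "sessions"}
_SESSION_FLAGS = {"-c", "--continue", "-r", "--resume"}


def coalesce_session_name_args(argv: list) -> list:
    """Merge the tokens that follow a session flag into one argument."""
    result = []
    i = 0
    while i < len(argv):
        token = argv[i]
        result.append(token)
        i += 1
        if token in _SESSION_FLAGS:
            # Collect tokens until the next flag or known subcommand.
            parts = []
            while (i < len(argv) and not argv[i].startswith("-")
                   and argv[i] not in _SUBCOMMANDS):
                parts.append(argv[i])
                i += 1
            if parts:
                result.append(" ".join(parts))
    return result


merged = coalesce_session_name_args(["-c", "Pokemon", "Agent", "Dev", "--worktree"])
bare = coalesce_session_name_args(["-c"])
clipped = coalesce_session_name_args(["-r", "my", "chat", "notes"])
print(merged)   # ['-c', 'Pokemon Agent Dev', '--worktree']
print(bare)     # ['-c']
print(clipped)  # ['-r', 'my', 'chat', 'notes']
```

The last call shows a limit of this approach: an unquoted session name that contains a subcommand word such as `chat` is cut at that word, so names like that still need quoting.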
|
||||
|
||||
def main():
|
||||
"""Main entry point for hermes CLI."""
|
||||
parser = argparse.ArgumentParser(
|
||||
@@ -1889,6 +1928,12 @@ For more help on a command:
|
||||
default=False,
|
||||
help="Run in an isolated git worktree (for parallel agents on the same repo)"
|
||||
)
|
||||
chat_parser.add_argument(
|
||||
"--checkpoints",
|
||||
action="store_true",
|
||||
default=False,
|
||||
help="Enable filesystem checkpoints before destructive file operations (use /rollback to restore)"
|
||||
)
|
||||
chat_parser.set_defaults(func=cmd_chat)
|
||||
|
||||
# =========================================================================
|
||||
@@ -2356,12 +2401,12 @@ For more help on a command:
|
||||
if not data:
|
||||
print(f"Session '{args.session_id}' not found.")
|
||||
return
|
||||
- with open(args.output, "w") as f:
+ with open(args.output, "w", encoding="utf-8") as f:
|
||||
f.write(_json.dumps(data, ensure_ascii=False) + "\n")
|
||||
print(f"Exported 1 session to {args.output}")
|
||||
else:
|
||||
sessions = db.export_all(source=args.source)
|
||||
- with open(args.output, "w") as f:
+ with open(args.output, "w", encoding="utf-8") as f:
|
||||
for s in sessions:
|
||||
f.write(_json.dumps(s, ensure_ascii=False) + "\n")
|
||||
print(f"Exported {len(sessions)} sessions to {args.output}")
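
Because the export calls `json.dumps(..., ensure_ascii=False)`, non-ASCII characters are written verbatim, so the file encoding has to be pinned instead of being left to the platform default. A minimal sketch of the same write pattern follows; the `export_jsonl` and `read_jsonl` helpers and the sample records are hypothetical and exist only for illustration.

```python
import json
import os
import tempfile
from typing import Any, Dict, Iterable, List


def export_jsonl(records: Iterable[Dict[str, Any]], path: str) -> int:
    """Write one JSON object per line as UTF-8 and return the record count."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            # ensure_ascii=False keeps characters such as "é" unescaped.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


def read_jsonl(path: str) -> List[Dict[str, Any]]:
    """Read the file back using the same explicit encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


sample = [{"id": "a1", "title": "Pokémon Agent ⚕"}, {"id": "b2", "title": "plain"}]
tmp_path = os.path.join(tempfile.mkdtemp(), "sessions.jsonl")
written = export_jsonl(sample, tmp_path)
restored = read_jsonl(tmp_path)
```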
|
||||
@@ -2515,7 +2560,11 @@ For more help on a command:
|
||||
# =========================================================================
|
||||
# Parse and execute
|
||||
# =========================================================================
|
||||
- args = parser.parse_args()
+ # Pre-process argv so unquoted multi-word session names after -c / -r
+ # are merged into a single token before argparse sees them.
+ # e.g. ``hermes -c Pokemon Agent Dev`` → ``hermes -c 'Pokemon Agent Dev'``
+ _processed_argv = _coalesce_session_name_args(sys.argv[1:])
+ args = parser.parse_args(_processed_argv)
|
||||
|
||||
# Handle --version flag
|
||||
if args.version:
|
||||
|
||||
@@ -516,7 +516,8 @@ def setup_model_provider(config: dict):
|
||||
keep_label = None # No provider configured — don't show "Keep current"
|
||||
|
||||
provider_choices = [
|
- "Login with Nous Portal (Nous Research subscription)",
+ "Nous Portal API key (direct API key access)",
+ "Login with Nous Portal (Nous Research subscription — OAuth)",
|
||||
"Login with OpenAI Codex",
|
||||
"OpenRouter API key (100+ models, pay-per-use)",
|
||||
"Custom OpenAI-compatible endpoint (self-hosted / VLLM / etc.)",
|
||||
@@ -529,7 +530,7 @@ def setup_model_provider(config: dict):
|
||||
provider_choices.append(keep_label)
|
||||
|
||||
# Default to "Keep current" if a provider exists, otherwise OpenRouter (most common)
|
||||
- default_provider = len(provider_choices) - 1 if has_any_provider else 2
+ default_provider = len(provider_choices) - 1 if has_any_provider else 3
|
||||
|
||||
if not has_any_provider:
|
||||
print_warning("An inference provider is required for Hermes to work.")
|
||||
@@ -541,7 +542,37 @@ def setup_model_provider(config: dict):
|
||||
selected_provider = None # "nous", "openai-codex", "openrouter", "custom", or None (keep)
|
||||
nous_models = [] # populated if Nous login succeeds
|
||||
|
||||
- if provider_idx == 0: # Nous Portal
+ if provider_idx == 0: # Nous Portal API Key (direct)
|
||||
selected_provider = "nous-api"
|
||||
print()
|
||||
print_header("Nous Portal API Key")
|
||||
print_info("Use a Nous Portal API key for direct access to Nous inference.")
|
||||
print_info("Get your API key at: https://portal.nousresearch.com")
|
||||
print()
|
||||
|
||||
existing_key = get_env_value("NOUS_API_KEY")
|
||||
if existing_key:
|
||||
print_info(f"Current: {existing_key[:8]}... (configured)")
|
||||
if prompt_yes_no("Update Nous API key?", False):
|
||||
api_key = prompt(" Nous API key", password=True)
|
||||
if api_key:
|
||||
save_env_value("NOUS_API_KEY", api_key)
|
||||
print_success("Nous API key updated")
|
||||
else:
|
||||
api_key = prompt(" Nous API key", password=True)
|
||||
if api_key:
|
||||
save_env_value("NOUS_API_KEY", api_key)
|
||||
print_success("Nous API key saved")
|
||||
else:
|
||||
print_warning("Skipped - agent won't work without an API key")
|
||||
|
||||
# Clear custom endpoint vars if switching
|
||||
if existing_custom:
|
||||
save_env_value("OPENAI_BASE_URL", "")
|
||||
save_env_value("OPENAI_API_KEY", "")
|
||||
_update_config_for_provider("nous-api", "https://inference-api.nousresearch.com/v1")
|
||||
|
||||
elif provider_idx == 1: # Nous Portal
|
||||
selected_provider = "nous"
|
||||
print()
|
||||
print_header("Nous Portal Login")
|
||||
@@ -581,7 +612,7 @@ def setup_model_provider(config: dict):
|
||||
print_info("You can try again later with: hermes model")
|
||||
selected_provider = None
|
||||
|
||||
- elif provider_idx == 1: # OpenAI Codex
+ elif provider_idx == 2: # OpenAI Codex
|
||||
selected_provider = "openai-codex"
|
||||
print()
|
||||
print_header("OpenAI Codex Login")
|
||||
@@ -605,7 +636,7 @@ def setup_model_provider(config: dict):
|
||||
print_info("You can try again later with: hermes model")
|
||||
selected_provider = None
|
||||
|
||||
- elif provider_idx == 2: # OpenRouter
+ elif provider_idx == 3: # OpenRouter
|
||||
selected_provider = "openrouter"
|
||||
print()
|
||||
print_header("OpenRouter API Key")
|
||||
@@ -655,7 +686,7 @@ def setup_model_provider(config: dict):
|
||||
except Exception as e:
|
||||
logger.debug("Could not save provider to config.yaml: %s", e)
|
||||
|
||||
- elif provider_idx == 3: # Custom endpoint
+ elif provider_idx == 4: # Custom endpoint
|
||||
selected_provider = "custom"
|
||||
print()
|
||||
print_header("Custom OpenAI-Compatible Endpoint")
|
||||
@@ -706,7 +737,7 @@ def setup_model_provider(config: dict):
|
||||
|
||||
print_success("Custom endpoint configured")
|
||||
|
||||
- elif provider_idx == 4: # Z.AI / GLM
+ elif provider_idx == 5: # Z.AI / GLM
|
||||
selected_provider = "zai"
|
||||
print()
|
||||
print_header("Z.AI / GLM API Key")
|
||||
@@ -760,7 +791,7 @@ def setup_model_provider(config: dict):
|
||||
save_env_value("OPENAI_API_KEY", "")
|
||||
_update_config_for_provider("zai", zai_base_url)
|
||||
|
||||
- elif provider_idx == 5: # Kimi / Moonshot
+ elif provider_idx == 6: # Kimi / Moonshot
|
||||
selected_provider = "kimi-coding"
|
||||
print()
|
||||
print_header("Kimi / Moonshot API Key")
|
||||
@@ -792,7 +823,7 @@ def setup_model_provider(config: dict):
|
||||
save_env_value("OPENAI_API_KEY", "")
|
||||
_update_config_for_provider("kimi-coding", pconfig.inference_base_url)
|
||||
|
||||
- elif provider_idx == 6: # MiniMax
+ elif provider_idx == 7: # MiniMax
|
||||
selected_provider = "minimax"
|
||||
print()
|
||||
print_header("MiniMax API Key")
|
||||
@@ -824,7 +855,7 @@ def setup_model_provider(config: dict):
|
||||
save_env_value("OPENAI_API_KEY", "")
|
||||
_update_config_for_provider("minimax", pconfig.inference_base_url)
|
||||
|
||||
- elif provider_idx == 7: # MiniMax China
+ elif provider_idx == 8: # MiniMax China
|
||||
selected_provider = "minimax-cn"
|
||||
print()
|
||||
print_header("MiniMax China API Key")
|
||||
@@ -856,12 +887,12 @@ def setup_model_provider(config: dict):
|
||||
save_env_value("OPENAI_API_KEY", "")
|
||||
_update_config_for_provider("minimax-cn", pconfig.inference_base_url)
|
||||
|
||||
- # else: provider_idx == 8 (Keep current) — only shown when a provider already exists
+ # else: provider_idx == 9 (Keep current) — only shown when a provider already exists
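
Adding one menu entry at the top shifts every hard-coded `provider_idx` comparison and the default index by one. As a point of comparison only, and not what this change does, the sketch below derives indices from provider keys so that an insertion does not require renumbering. The `PROVIDERS` table, `provider_index`, and `default_choice` are hypothetical names, the list is abbreviated to the first five entries, and the labels are shortened.

```python
from typing import List, Tuple

# Hypothetical (key, label) pairs in the same order as the first five menu entries.
PROVIDERS: List[Tuple[str, str]] = [
    ("nous-api", "Nous Portal API key"),
    ("nous", "Login with Nous Portal"),
    ("openai-codex", "Login with OpenAI Codex"),
    ("openrouter", "OpenRouter API key"),
    ("custom", "Custom OpenAI-compatible endpoint"),
]


def provider_index(key: str) -> int:
    """Return the menu position of a provider key."""
    for i, (candidate, _label) in enumerate(PROVIDERS):
        if candidate == key:
            return i
    raise KeyError(key)


def default_choice(choices: List[str], has_any_provider: bool) -> int:
    """Last entry ("Keep current") if a provider exists, otherwise OpenRouter."""
    if has_any_provider:
        return len(choices) - 1
    return provider_index("openrouter")


labels = [label for _key, label in PROVIDERS]
fresh_default = default_choice(labels, has_any_provider=False)
keep_default = default_choice(labels + ["Keep current"], has_any_provider=True)
```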
|
||||
|
||||
# ── OpenRouter API Key for tools (if not already set) ──
|
||||
# Tools (vision, web, MoA) use OpenRouter independently of the main provider.
|
||||
# Prompt for OpenRouter key if not set and a non-OpenRouter provider was chosen.
|
||||
- if selected_provider in ("nous", "openai-codex", "custom", "zai", "kimi-coding", "minimax", "minimax-cn") and not get_env_value("OPENROUTER_API_KEY"):
+ if selected_provider in ("nous", "nous-api", "openai-codex", "custom", "zai", "kimi-coding", "minimax", "minimax-cn") and not get_env_value("OPENROUTER_API_KEY"):
|
||||
print()
|
||||
print_header("OpenRouter API Key (for tools)")
|
||||
print_info("Tools like vision analysis, web search, and MoA use OpenRouter")
|
||||
@@ -914,6 +945,14 @@ def setup_model_provider(config: dict):
|
||||
if custom:
|
||||
config['model'] = custom
|
||||
save_env_value("LLM_MODEL", custom)
|
||||
elif selected_provider == "nous-api":
|
||||
# Nous API key provider — prompt for model manually
|
||||
print_info("Enter a model name available on Nous inference API.")
|
||||
print_info("Examples: anthropic/claude-opus-4.6, deepseek/deepseek-r1")
|
||||
custom = prompt(f" Model name (Enter to keep '{current_model}')")
|
||||
if custom:
|
||||
config['model'] = custom
|
||||
save_env_value("LLM_MODEL", custom)
|
||||
elif selected_provider == "openai-codex":
|
||||
from hermes_cli.codex_models import get_codex_model_ids
|
||||
codex_models = get_codex_model_ids()
|
||||
@@ -1309,7 +1348,7 @@ def setup_agent_settings(config: dict):
|
||||
# ── Max Iterations ──
|
||||
print_header("Agent Settings")
|
||||
|
||||
- current_max = get_env_value('HERMES_MAX_ITERATIONS') or '90'
+ current_max = get_env_value('HERMES_MAX_ITERATIONS') or str(config.get('agent', {}).get('max_turns', 90))
|
||||
print_info("Maximum tool-calling iterations per conversation.")
|
||||
print_info("Higher = more complex tasks, but costs more tokens.")
|
||||
print_info("Recommended: 30-60 for most tasks, 100+ for open exploration.")
|
||||
@@ -1319,7 +1358,8 @@ def setup_agent_settings(config: dict):
|
||||
max_iter = int(max_iter_str)
|
||||
if max_iter > 0:
|
||||
save_env_value("HERMES_MAX_ITERATIONS", str(max_iter))
|
||||
- config['max_turns'] = max_iter
+ config.setdefault('agent', {})['max_turns'] = max_iter
+ config.pop('max_turns', None)
|
||||
print_success(f"Max iterations set to {max_iter}")
|
||||
except ValueError:
|
||||
print_warning("Invalid number, keeping current value")
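
The two hunks above move the setting from a top-level `max_turns` key to `agent.max_turns` and drop the legacy key on write. A small sketch of that read/write pair on a plain dict is shown here; the helper names `read_max_turns` and `set_max_turns` are illustrative assumptions, and the environment-variable lookup from the real code is left out.

```python
from typing import Any, Dict


def read_max_turns(config: Dict[str, Any], default: int = 90) -> int:
    """Read the nested value, falling back to the default when it is absent."""
    return int(config.get("agent", {}).get("max_turns", default))


def set_max_turns(config: Dict[str, Any], max_iter: int) -> Dict[str, Any]:
    """Store the value under agent.max_turns and remove the legacy top-level key."""
    config.setdefault("agent", {})["max_turns"] = max_iter
    config.pop("max_turns", None)
    return config


legacy = {"max_turns": 50, "agent": {"verbose": True}}
before = read_max_turns(legacy)   # legacy key is ignored on read
migrated = set_max_turns(legacy, 60)
after = read_max_turns(migrated)
```

Note that in this sketch, as in the diff, a value stored only under the legacy top-level key is not consulted when reading.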
|
||||
|
||||
@@ -0,0 +1,630 @@
|
||||
"""Hermes CLI skin/theme engine.
|
||||
|
||||
A data-driven skin system that lets users customize the CLI's visual appearance.
|
||||
Skins are defined as YAML files in ~/.hermes/skins/ or as built-in presets.
|
||||
No code changes are needed to add a new skin.
|
||||
|
||||
SKIN YAML SCHEMA
|
||||
================
|
||||
|
||||
All fields except ``name`` are optional. Missing values inherit from the ``default`` skin.
|
||||
|
||||
.. code-block:: yaml
|
||||
|
||||
# Skin identity (name is required; description is optional)
|
||||
name: mytheme # Unique skin name (lowercase, hyphens ok)
|
||||
description: Short description # Shown in /skin listing
|
||||
|
||||
# Colors: hex values for Rich markup (banner, UI, response box)
|
||||
colors:
|
||||
banner_border: "#CD7F32" # Panel border color
|
||||
banner_title: "#FFD700" # Panel title text color
|
||||
banner_accent: "#FFBF00" # Section headers (Available Tools, etc.)
|
||||
banner_dim: "#B8860B" # Dim/muted text (separators, labels)
|
||||
banner_text: "#FFF8DC" # Body text (tool names, skill names)
|
||||
ui_accent: "#FFBF00" # General UI accent
|
||||
ui_label: "#4dd0e1" # UI labels
|
||||
ui_ok: "#4caf50" # Success indicators
|
||||
ui_error: "#ef5350" # Error indicators
|
||||
ui_warn: "#ffa726" # Warning indicators
|
||||
prompt: "#FFF8DC" # Prompt text color
|
||||
input_rule: "#CD7F32" # Input area horizontal rule
|
||||
response_border: "#FFD700" # Response box border (ANSI)
|
||||
session_label: "#DAA520" # Session label color
|
||||
session_border: "#8B8682" # Session ID dim color
|
||||
|
||||
# Spinner: customize the animated spinner during API calls
|
||||
spinner:
|
||||
waiting_faces: # Faces shown while waiting for API
|
||||
- "(⚔)"
|
||||
- "(⛨)"
|
||||
thinking_faces: # Faces shown during reasoning
|
||||
- "(⌁)"
|
||||
- "(<>)"
|
||||
thinking_verbs: # Verbs for spinner messages
|
||||
- "forging"
|
||||
- "plotting"
|
||||
wings: # Optional left/right spinner decorations
|
||||
- ["⟪⚔", "⚔⟫"] # Each entry is [left, right] pair
|
||||
- ["⟪▲", "▲⟫"]
|
||||
|
||||
# Branding: text strings used throughout the CLI
|
||||
branding:
|
||||
agent_name: "Hermes Agent" # Banner title, status display
|
||||
welcome: "Welcome message" # Shown at CLI startup
|
||||
goodbye: "Goodbye! ⚕" # Shown on exit
|
||||
response_label: " ⚕ Hermes " # Response box header label
|
||||
prompt_symbol: "❯ " # Input prompt symbol
|
||||
help_header: "(^_^)? Commands" # /help header text
|
||||
|
||||
# Tool prefix: character for tool output lines (default: ┊)
|
||||
tool_prefix: "┊"
|
||||
|
||||
USAGE
|
||||
=====
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from hermes_cli.skin_engine import get_active_skin, list_skins, set_active_skin
|
||||
|
||||
skin = get_active_skin()
|
||||
print(skin.colors["banner_title"]) # "#FFD700"
|
||||
print(skin.get_branding("agent_name")) # "Hermes Agent"
|
||||
|
||||
set_active_skin("ares") # Switch to built-in ares skin
|
||||
set_active_skin("mytheme") # Switch to user skin from ~/.hermes/skins/
|
||||
|
||||
BUILT-IN SKINS
|
||||
==============
|
||||
|
||||
- ``default`` — Classic Hermes gold/kawaii (the current look)
|
||||
- ``ares`` — Crimson/bronze war-god theme with custom spinner wings
|
||||
- ``mono`` — Clean grayscale monochrome
|
||||
- ``slate`` — Cool blue developer-focused theme
- ``poseidon`` - Deep blue and seafoam ocean-god theme
- ``sisyphus`` - Austere grayscale theme
- ``charizard`` - Burnt orange and ember volcanic theme
|
||||
|
||||
USER SKINS
|
||||
==========
|
||||
|
||||
Drop a YAML file in ``~/.hermes/skins/<name>.yaml`` following the schema above.
|
||||
Activate with ``/skin <name>`` in the CLI or ``display.skin: <name>`` in config.yaml.
|
||||
"""
|
||||
|
||||
import logging
import os
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple

logger = logging.getLogger(__name__)


# =============================================================================
# Skin data structure
# =============================================================================

@dataclass
class SkinConfig:
    """Complete skin configuration."""
    name: str
    description: str = ""
    colors: Dict[str, str] = field(default_factory=dict)
    spinner: Dict[str, Any] = field(default_factory=dict)
    branding: Dict[str, str] = field(default_factory=dict)
    tool_prefix: str = "┊"
    banner_logo: str = ""  # Rich-markup ASCII art logo (replaces HERMES_AGENT_LOGO)
    banner_hero: str = ""  # Rich-markup hero art (replaces HERMES_CADUCEUS)

    def get_color(self, key: str, fallback: str = "") -> str:
        """Get a color value with fallback."""
        return self.colors.get(key, fallback)

    def get_spinner_list(self, key: str) -> List[str]:
        """Get a spinner list (faces, verbs, etc.)."""
        return self.spinner.get(key, [])

    def get_spinner_wings(self) -> List[Tuple[str, str]]:
        """Get spinner wing pairs, or empty list if none."""
        raw = self.spinner.get("wings", [])
        result = []
        for pair in raw:
            if isinstance(pair, (list, tuple)) and len(pair) == 2:
                result.append((str(pair[0]), str(pair[1])))
        return result

    def get_branding(self, key: str, fallback: str = "") -> str:
        """Get a branding value with fallback."""
        return self.branding.get(key, fallback)
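
`get_spinner_wings` silently drops any entry that is not a two-element list or tuple, which matters for hand-written YAML. The standalone sketch below mirrors that filter outside the class; the function name `normalize_wings` and the sample input are illustrative assumptions.

```python
from typing import Any, List, Tuple


def normalize_wings(raw: List[Any]) -> List[Tuple[str, str]]:
    """Keep only [left, right] pairs and coerce both sides to strings."""
    result: List[Tuple[str, str]] = []
    for pair in raw:
        if isinstance(pair, (list, tuple)) and len(pair) == 2:
            result.append((str(pair[0]), str(pair[1])))
    return result


# A string, a one-element list, and a three-element list are all skipped.
wings = normalize_wings([["⟪⚔", "⚔⟫"], "oops", ["only-left"], (1, 2), ["a", "b", "c"]])
```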
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Built-in skin definitions
|
||||
# =============================================================================
|
||||
|
||||
_BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
|
||||
"default": {
|
||||
"name": "default",
|
||||
"description": "Classic Hermes — gold and kawaii",
|
||||
"colors": {
|
||||
"banner_border": "#CD7F32",
|
||||
"banner_title": "#FFD700",
|
||||
"banner_accent": "#FFBF00",
|
||||
"banner_dim": "#B8860B",
|
||||
"banner_text": "#FFF8DC",
|
||||
"ui_accent": "#FFBF00",
|
||||
"ui_label": "#4dd0e1",
|
||||
"ui_ok": "#4caf50",
|
||||
"ui_error": "#ef5350",
|
||||
"ui_warn": "#ffa726",
|
||||
"prompt": "#FFF8DC",
|
||||
"input_rule": "#CD7F32",
|
||||
"response_border": "#FFD700",
|
||||
"session_label": "#DAA520",
|
||||
"session_border": "#8B8682",
|
||||
},
|
||||
"spinner": {
|
||||
# Empty = use hardcoded defaults in display.py
|
||||
},
|
||||
"branding": {
|
||||
"agent_name": "Hermes Agent",
|
||||
"welcome": "Welcome to Hermes Agent! Type your message or /help for commands.",
|
||||
"goodbye": "Goodbye! ⚕",
|
||||
"response_label": " ⚕ Hermes ",
|
||||
"prompt_symbol": "❯ ",
|
||||
"help_header": "(^_^)? Available Commands",
|
||||
},
|
||||
"tool_prefix": "┊",
|
||||
},
|
||||
"ares": {
|
||||
"name": "ares",
|
||||
"description": "War-god theme — crimson and bronze",
|
||||
"colors": {
|
||||
"banner_border": "#9F1C1C",
|
||||
"banner_title": "#C7A96B",
|
||||
"banner_accent": "#DD4A3A",
|
||||
"banner_dim": "#6B1717",
|
||||
"banner_text": "#F1E6CF",
|
||||
"ui_accent": "#DD4A3A",
|
||||
"ui_label": "#C7A96B",
|
||||
"ui_ok": "#4caf50",
|
||||
"ui_error": "#ef5350",
|
||||
"ui_warn": "#ffa726",
|
||||
"prompt": "#F1E6CF",
|
||||
"input_rule": "#9F1C1C",
|
||||
"response_border": "#C7A96B",
|
||||
"session_label": "#C7A96B",
|
||||
"session_border": "#6E584B",
|
||||
},
|
||||
"spinner": {
|
||||
"waiting_faces": ["(⚔)", "(⛨)", "(▲)", "(<>)", "(/)"],
|
||||
"thinking_faces": ["(⚔)", "(⛨)", "(▲)", "(⌁)", "(<>)"],
|
||||
"thinking_verbs": [
|
||||
"forging", "marching", "sizing the field", "holding the line",
|
||||
"hammering plans", "tempering steel", "plotting impact", "raising the shield",
|
||||
],
|
||||
"wings": [
|
||||
["⟪⚔", "⚔⟫"],
|
||||
["⟪▲", "▲⟫"],
|
||||
["⟪╸", "╺⟫"],
|
||||
["⟪⛨", "⛨⟫"],
|
||||
],
|
||||
},
|
||||
"branding": {
|
||||
"agent_name": "Ares Agent",
|
||||
"welcome": "Welcome to Ares Agent! Type your message or /help for commands.",
|
||||
"goodbye": "Farewell, warrior! ⚔",
|
||||
"response_label": " ⚔ Ares ",
|
||||
"prompt_symbol": "⚔ ❯ ",
|
||||
"help_header": "(⚔) Available Commands",
|
||||
},
|
||||
"tool_prefix": "╎",
|
||||
"banner_logo": """[bold #A3261F] █████╗ ██████╗ ███████╗███████╗ █████╗ ██████╗ ███████╗███╗ ██╗████████╗[/]
|
||||
[bold #B73122]██╔══██╗██╔══██╗██╔════╝██╔════╝ ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝[/]
|
||||
[#C93C24]███████║██████╔╝█████╗ ███████╗█████╗███████║██║ ███╗█████╗ ██╔██╗ ██║ ██║[/]
|
||||
[#D84A28]██╔══██║██╔══██╗██╔══╝ ╚════██║╚════╝██╔══██║██║ ██║██╔══╝ ██║╚██╗██║ ██║[/]
|
||||
[#E15A2D]██║ ██║██║ ██║███████╗███████║ ██║ ██║╚██████╔╝███████╗██║ ╚████║ ██║[/]
|
||||
[#EB6C32]╚═╝ ╚═╝╚═╝ ╚═╝╚══════╝╚══════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═╝[/]""",
|
||||
"banner_hero": """[#9F1C1C]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣤⣤⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#9F1C1C]⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣴⣿⠟⠻⣿⣦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#C7A96B]⠀⠀⠀⠀⠀⠀⠀⣠⣾⡿⠋⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#C7A96B]⠀⠀⠀⠀⠀⢀⣾⡿⠋⠀⠀⢠⡄⠀⠀⠙⢿⣷⡀⠀⠀⠀⠀⠀[/]
|
||||
[#DD4A3A]⠀⠀⠀⠀⣰⣿⠟⠀⠀⠀⣰⣿⣿⣆⠀⠀⠀⠻⣿⣆⠀⠀⠀⠀[/]
|
||||
[#DD4A3A]⠀⠀⠀⢰⣿⠏⠀⠀⢀⣾⡿⠉⢿⣷⡀⠀⠀⠹⣿⡆⠀⠀⠀[/]
|
||||
[#9F1C1C]⠀⠀⠀⣿⡟⠀⠀⣠⣿⠟⠀⠀⠀⠻⣿⣄⠀⠀⢻⣿⠀⠀⠀[/]
|
||||
[#9F1C1C]⠀⠀⠀⣿⡇⠀⠀⠙⠋⠀⠀⚔⠀⠀⠙⠋⠀⠀⢸⣿⠀⠀⠀[/]
|
||||
[#6B1717]⠀⠀⠀⢿⣧⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣼⡿⠀⠀⠀[/]
|
||||
[#6B1717]⠀⠀⠀⠘⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⣠⣾⡿⠃⠀⠀⠀[/]
|
||||
[#C7A96B]⠀⠀⠀⠀⠈⠻⣿⣷⣦⣤⣀⣀⣤⣤⣶⣿⠿⠋⠀⠀⠀⠀[/]
|
||||
[#C7A96B]⠀⠀⠀⠀⠀⠀⠀⠉⠛⠿⠿⠿⠿⠛⠉⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#DD4A3A]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⚔⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[dim #6B1717]⠀⠀⠀⠀⠀⠀⠀⠀war god online⠀⠀⠀⠀⠀⠀⠀⠀[/]""",
|
||||
},
|
||||
"mono": {
|
||||
"name": "mono",
|
||||
"description": "Monochrome — clean grayscale",
|
||||
"colors": {
|
||||
"banner_border": "#555555",
|
||||
"banner_title": "#e6edf3",
|
||||
"banner_accent": "#aaaaaa",
|
||||
"banner_dim": "#444444",
|
||||
"banner_text": "#c9d1d9",
|
||||
"ui_accent": "#aaaaaa",
|
||||
"ui_label": "#888888",
|
||||
"ui_ok": "#888888",
|
||||
"ui_error": "#cccccc",
|
||||
"ui_warn": "#999999",
|
||||
"prompt": "#c9d1d9",
|
||||
"input_rule": "#444444",
|
||||
"response_border": "#aaaaaa",
|
||||
"session_label": "#888888",
|
||||
"session_border": "#555555",
|
||||
},
|
||||
"spinner": {},
|
||||
"branding": {
|
||||
"agent_name": "Hermes Agent",
|
||||
"welcome": "Welcome to Hermes Agent! Type your message or /help for commands.",
|
||||
"goodbye": "Goodbye! ⚕",
|
||||
"response_label": " ⚕ Hermes ",
|
||||
"prompt_symbol": "❯ ",
|
||||
"help_header": "[?] Available Commands",
|
||||
},
|
||||
"tool_prefix": "┊",
|
||||
},
|
||||
"slate": {
|
||||
"name": "slate",
|
||||
"description": "Cool blue — developer-focused",
|
||||
"colors": {
|
||||
"banner_border": "#4169e1",
|
||||
"banner_title": "#7eb8f6",
|
||||
"banner_accent": "#8EA8FF",
|
||||
"banner_dim": "#4b5563",
|
||||
"banner_text": "#c9d1d9",
|
||||
"ui_accent": "#7eb8f6",
|
||||
"ui_label": "#8EA8FF",
|
||||
"ui_ok": "#63D0A6",
|
||||
"ui_error": "#F7A072",
|
||||
"ui_warn": "#e6a855",
|
||||
"prompt": "#c9d1d9",
|
||||
"input_rule": "#4169e1",
|
||||
"response_border": "#7eb8f6",
|
||||
"session_label": "#7eb8f6",
|
||||
"session_border": "#4b5563",
|
||||
},
|
||||
"spinner": {},
|
||||
"branding": {
|
||||
"agent_name": "Hermes Agent",
|
||||
"welcome": "Welcome to Hermes Agent! Type your message or /help for commands.",
|
||||
"goodbye": "Goodbye! ⚕",
|
||||
"response_label": " ⚕ Hermes ",
|
||||
"prompt_symbol": "❯ ",
|
||||
"help_header": "(^_^)? Available Commands",
|
||||
},
|
||||
"tool_prefix": "┊",
|
||||
},
|
||||
"poseidon": {
|
||||
"name": "poseidon",
|
||||
"description": "Ocean-god theme — deep blue and seafoam",
|
||||
"colors": {
|
||||
"banner_border": "#2A6FB9",
|
||||
"banner_title": "#A9DFFF",
|
||||
"banner_accent": "#5DB8F5",
|
||||
"banner_dim": "#153C73",
|
||||
"banner_text": "#EAF7FF",
|
||||
"ui_accent": "#5DB8F5",
|
||||
"ui_label": "#A9DFFF",
|
||||
"ui_ok": "#4caf50",
|
||||
"ui_error": "#ef5350",
|
||||
"ui_warn": "#ffa726",
|
||||
"prompt": "#EAF7FF",
|
||||
"input_rule": "#2A6FB9",
|
||||
"response_border": "#5DB8F5",
|
||||
"session_label": "#A9DFFF",
|
||||
"session_border": "#496884",
|
||||
},
|
||||
"spinner": {
|
||||
"waiting_faces": ["(≈)", "(Ψ)", "(∿)", "(◌)", "(◠)"],
|
||||
"thinking_faces": ["(Ψ)", "(∿)", "(≈)", "(⌁)", "(◌)"],
|
||||
"thinking_verbs": [
|
||||
"charting currents", "sounding the depth", "reading foam lines",
|
||||
"steering the trident", "tracking undertow", "plotting sea lanes",
|
||||
"calling the swell", "measuring pressure",
|
||||
],
|
||||
"wings": [
|
||||
["⟪≈", "≈⟫"],
|
||||
["⟪Ψ", "Ψ⟫"],
|
||||
["⟪∿", "∿⟫"],
|
||||
["⟪◌", "◌⟫"],
|
||||
],
|
||||
},
|
||||
"branding": {
|
||||
"agent_name": "Poseidon Agent",
|
||||
"welcome": "Welcome to Poseidon Agent! Type your message or /help for commands.",
|
||||
"goodbye": "Fair winds! Ψ",
|
||||
"response_label": " Ψ Poseidon ",
|
||||
"prompt_symbol": "Ψ ❯ ",
|
||||
"help_header": "(Ψ) Available Commands",
|
||||
},
|
||||
"tool_prefix": "│",
|
||||
"banner_logo": """[bold #B8E8FF]██████╗ ██████╗ ███████╗██╗██████╗ ███████╗ ██████╗ ███╗ ██╗ █████╗ ██████╗ ███████╗███╗ ██╗████████╗[/]
|
||||
[bold #97D6FF]██╔══██╗██╔═══██╗██╔════╝██║██╔══██╗██╔════╝██╔═══██╗████╗ ██║ ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝[/]
|
||||
[#75C1F6]██████╔╝██║ ██║███████╗██║██║ ██║█████╗ ██║ ██║██╔██╗ ██║█████╗███████║██║ ███╗█████╗ ██╔██╗ ██║ ██║[/]
|
||||
[#4FA2E0]██╔═══╝ ██║ ██║╚════██║██║██║ ██║██╔══╝ ██║ ██║██║╚██╗██║╚════╝██╔══██║██║ ██║██╔══╝ ██║╚██╗██║ ██║[/]
|
||||
[#2E7CC7]██║ ╚██████╔╝███████║██║██████╔╝███████╗╚██████╔╝██║ ╚████║ ██║ ██║╚██████╔╝███████╗██║ ╚████║ ██║[/]
|
||||
[#1B4F95]╚═╝ ╚═════╝ ╚══════╝╚═╝╚═════╝ ╚══════╝ ╚═════╝ ╚═╝ ╚═══╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═╝[/]""",
|
||||
"banner_hero": """[#2A6FB9]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣀⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#5DB8F5]⠀⠀⠀⠀⠀⠀⠀⠀⠀⣠⣾⣿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#5DB8F5]⠀⠀⠀⠀⠀⠀⠀⢠⣿⠏⠀Ψ⠀⠹⣿⡄⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#A9DFFF]⠀⠀⠀⠀⠀⠀⠀⣿⡟⠀⠀⠀⠀⠀⢻⣿⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#A9DFFF]⠀⠀⠀≈≈≈≈≈⣿⡇⠀⠀⠀⠀⠀⢸⣿≈≈≈≈≈⠀⠀⠀[/]
|
||||
[#5DB8F5]⠀⠀⠀⠀⠀⠀⠀⣿⡇⠀⠀⠀⠀⠀⢸⣿⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#2A6FB9]⠀⠀⠀⠀⠀⠀⠀⢿⣧⠀⠀⠀⠀⠀⣼⡿⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#2A6FB9]⠀⠀⠀⠀⠀⠀⠀⠘⢿⣷⣄⣀⣠⣾⡿⠃⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#153C73]⠀⠀⠀⠀⠀⠀⠀⠀⠈⠻⣿⣿⡿⠟⠁⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#153C73]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#5DB8F5]⠀⠀⠀⠀⠀≈≈≈≈≈≈≈≈≈≈≈≈≈≈≈⠀⠀⠀⠀⠀[/]
|
||||
[#A9DFFF]⠀⠀⠀⠀⠀⠀≈≈≈≈≈≈≈≈≈≈≈≈≈⠀⠀⠀⠀⠀⠀[/]
|
||||
[dim #153C73]⠀⠀⠀⠀⠀⠀⠀deep waters hold⠀⠀⠀⠀⠀⠀⠀[/]""",
|
||||
},
|
||||
"sisyphus": {
|
||||
"name": "sisyphus",
|
||||
"description": "Sisyphean theme — austere grayscale with persistence",
|
||||
"colors": {
|
||||
"banner_border": "#B7B7B7",
|
||||
"banner_title": "#F5F5F5",
|
||||
"banner_accent": "#E7E7E7",
|
||||
"banner_dim": "#4A4A4A",
|
||||
"banner_text": "#D3D3D3",
|
||||
"ui_accent": "#E7E7E7",
|
||||
"ui_label": "#D3D3D3",
|
||||
"ui_ok": "#919191",
|
||||
"ui_error": "#E7E7E7",
|
||||
"ui_warn": "#B7B7B7",
|
||||
"prompt": "#F5F5F5",
|
||||
"input_rule": "#656565",
|
||||
"response_border": "#B7B7B7",
|
||||
"session_label": "#919191",
|
||||
"session_border": "#656565",
|
||||
},
|
||||
"spinner": {
|
||||
"waiting_faces": ["(◉)", "(◌)", "(◬)", "(⬤)", "(::)"],
|
||||
"thinking_faces": ["(◉)", "(◬)", "(◌)", "(○)", "(●)"],
|
||||
"thinking_verbs": [
|
||||
"finding traction", "measuring the grade", "resetting the boulder",
|
||||
"counting the ascent", "testing leverage", "setting the shoulder",
|
||||
"pushing uphill", "enduring the loop",
|
||||
],
|
||||
"wings": [
|
||||
["⟪◉", "◉⟫"],
|
||||
["⟪◬", "◬⟫"],
|
||||
["⟪◌", "◌⟫"],
|
||||
["⟪⬤", "⬤⟫"],
|
||||
],
|
||||
},
|
||||
"branding": {
|
||||
"agent_name": "Sisyphus Agent",
|
||||
"welcome": "Welcome to Sisyphus Agent! Type your message or /help for commands.",
|
||||
"goodbye": "The boulder waits. ◉",
|
||||
"response_label": " ◉ Sisyphus ",
|
||||
"prompt_symbol": "◉ ❯ ",
|
||||
"help_header": "(◉) Available Commands",
|
||||
},
|
||||
"tool_prefix": "│",
|
||||
"banner_logo": """[bold #F5F5F5]███████╗██╗███████╗██╗ ██╗██████╗ ██╗ ██╗██╗ ██╗███████╗ █████╗ ██████╗ ███████╗███╗ ██╗████████╗[/]
|
||||
[bold #E7E7E7]██╔════╝██║██╔════╝╚██╗ ██╔╝██╔══██╗██║ ██║██║ ██║██╔════╝ ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝[/]
|
||||
[#D7D7D7]███████╗██║███████╗ ╚████╔╝ ██████╔╝███████║██║ ██║███████╗█████╗███████║██║ ███╗█████╗ ██╔██╗ ██║ ██║[/]
|
||||
[#BFBFBF]╚════██║██║╚════██║ ╚██╔╝ ██╔═══╝ ██╔══██║██║ ██║╚════██║╚════╝██╔══██║██║ ██║██╔══╝ ██║╚██╗██║ ██║[/]
|
||||
[#8F8F8F]███████║██║███████║ ██║ ██║ ██║ ██║╚██████╔╝███████║ ██║ ██║╚██████╔╝███████╗██║ ╚████║ ██║[/]
|
||||
[#626262]╚══════╝╚═╝╚══════╝ ╚═╝ ╚═╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═╝[/]""",
|
||||
"banner_hero": """[#B7B7B7]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣀⣀⣀⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#D3D3D3]⠀⠀⠀⠀⠀⠀⠀⣠⣾⣿⣿⣿⣿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#E7E7E7]⠀⠀⠀⠀⠀⠀⣾⣿⣿⣿⣿⣿⣿⣿⣷⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#F5F5F5]⠀⠀⠀⠀⠀⢸⣿⣿⣿⣿⣿⣿⣿⣿⣿⡇⠀⠀⠀⠀⠀⠀[/]
|
||||
[#E7E7E7]⠀⠀⠀⠀⠀⠀⣿⣿⣿⣿⣿⣿⣿⣿⣿⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#D3D3D3]⠀⠀⠀⠀⠀⠀⠘⢿⣿⣿⣿⣿⣿⡿⠃⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#B7B7B7]⠀⠀⠀⠀⠀⠀⠀⠀⠙⠿⣿⠿⠋⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#919191]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#656565]⠀⠀⠀⠀⠀⠀⠀⠀⠀⣰⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#656565]⠀⠀⠀⠀⠀⠀⠀⠀⣰⣿⣿⣆⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#4A4A4A]⠀⠀⠀⠀⠀⠀⠀⣰⣿⣿⣿⣿⣆⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#4A4A4A]⠀⠀⠀⠀⠀⣀⣴⣿⣿⣿⣿⣿⣿⣦⣀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#656565]⠀⠀⠀━━━━━━━━━━━━━━━━━━━━━━━⠀⠀⠀[/]
|
||||
[dim #4A4A4A]⠀⠀⠀⠀⠀⠀⠀⠀⠀the boulder⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]""",
|
||||
},
|
||||
"charizard": {
|
||||
"name": "charizard",
|
||||
"description": "Volcanic theme — burnt orange and ember",
|
||||
"colors": {
|
||||
"banner_border": "#C75B1D",
|
||||
"banner_title": "#FFD39A",
|
||||
"banner_accent": "#F29C38",
|
||||
"banner_dim": "#7A3511",
|
||||
"banner_text": "#FFF0D4",
|
||||
"ui_accent": "#F29C38",
|
||||
"ui_label": "#FFD39A",
|
||||
"ui_ok": "#4caf50",
|
||||
"ui_error": "#ef5350",
|
||||
"ui_warn": "#ffa726",
|
||||
"prompt": "#FFF0D4",
|
||||
"input_rule": "#C75B1D",
|
||||
"response_border": "#F29C38",
|
||||
"session_label": "#FFD39A",
|
||||
"session_border": "#6C4724",
|
||||
},
|
||||
"spinner": {
|
||||
"waiting_faces": ["(✦)", "(▲)", "(◇)", "(<>)", "(🔥)"],
|
||||
"thinking_faces": ["(✦)", "(▲)", "(◇)", "(⌁)", "(🔥)"],
|
||||
"thinking_verbs": [
|
||||
"banking into the draft", "measuring burn", "reading the updraft",
|
||||
"tracking ember fall", "setting wing angle", "holding the flame core",
|
||||
"plotting a hot landing", "coiling for lift",
|
||||
],
|
||||
"wings": [
|
||||
["⟪✦", "✦⟫"],
|
||||
["⟪▲", "▲⟫"],
|
||||
["⟪◌", "◌⟫"],
|
||||
["⟪◇", "◇⟫"],
|
||||
],
|
||||
},
|
||||
"branding": {
|
||||
"agent_name": "Charizard Agent",
|
||||
"welcome": "Welcome to Charizard Agent! Type your message or /help for commands.",
|
||||
"goodbye": "Flame out! ✦",
|
||||
"response_label": " ✦ Charizard ",
|
||||
"prompt_symbol": "✦ ❯ ",
|
||||
"help_header": "(✦) Available Commands",
|
||||
},
|
||||
"tool_prefix": "│",
|
||||
"banner_logo": """[bold #FFF0D4] ██████╗██╗ ██╗ █████╗ ██████╗ ██╗███████╗ █████╗ ██████╗ ██████╗ █████╗ ██████╗ ███████╗███╗ ██╗████████╗[/]
|
||||
[bold #FFD39A]██╔════╝██║ ██║██╔══██╗██╔══██╗██║╚══███╔╝██╔══██╗██╔══██╗██╔══██╗ ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝[/]
|
||||
[#F29C38]██║ ███████║███████║██████╔╝██║ ███╔╝ ███████║██████╔╝██║ ██║█████╗███████║██║ ███╗█████╗ ██╔██╗ ██║ ██║[/]
|
||||
[#E2832B]██║ ██╔══██║██╔══██║██╔══██╗██║ ███╔╝ ██╔══██║██╔══██╗██║ ██║╚════╝██╔══██║██║ ██║██╔══╝ ██║╚██╗██║ ██║[/]
|
||||
[#C75B1D]╚██████╗██║ ██║██║ ██║██║ ██║██║███████╗██║ ██║██║ ██║██████╔╝ ██║ ██║╚██████╔╝███████╗██║ ╚████║ ██║[/]
|
||||
[#7A3511] ╚═════╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═╝[/]""",
|
||||
"banner_hero": """[#FFD39A]⠀⠀⠀⠀⠀⠀⠀⠀⣀⣤⠶⠶⠶⣤⣀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#F29C38]⠀⠀⠀⠀⠀⠀⣴⠟⠁⠀⠀⠀⠀⠈⠻⣦⠀⠀⠀⠀⠀⠀[/]
|
||||
[#F29C38]⠀⠀⠀⠀⠀⣼⠏⠀⠀⠀✦⠀⠀⠀⠀⠹⣧⠀⠀⠀⠀⠀[/]
|
||||
[#E2832B]⠀⠀⠀⠀⢰⡟⠀⠀⣀⣤⣤⣤⣀⠀⠀⠀⢻⡆⠀⠀⠀⠀[/]
|
||||
[#E2832B]⠀⠀⣠⡾⠛⠁⣠⣾⠟⠉⠀⠉⠻⣷⣄⠀⠈⠛⢷⣄⠀⠀[/]
|
||||
[#C75B1D]⠀⣼⠟⠀⢀⣾⠟⠁⠀⠀⠀⠀⠀⠈⠻⣷⡀⠀⠻⣧⠀[/]
|
||||
[#C75B1D]⢸⡟⠀⠀⣿⡟⠀⠀⠀🔥⠀⠀⠀⠀⢻⣿⠀⠀⢻⡇[/]
|
||||
[#7A3511]⠀⠻⣦⡀⠘⢿⣧⡀⠀⠀⠀⠀⠀⢀⣼⡿⠃⢀⣴⠟⠀[/]
|
||||
[#7A3511]⠀⠀⠈⠻⣦⣀⠙⢿⣷⣤⣤⣤⣾⡿⠋⣀⣴⠟⠁⠀⠀[/]
|
||||
[#C75B1D]⠀⠀⠀⠀⠈⠙⠛⠶⠤⠭⠭⠤⠶⠛⠋⠁⠀⠀⠀⠀[/]
|
||||
[#F29C38]⠀⠀⠀⠀⠀⠀⠀⠀⣰⡿⢿⣆⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[#F29C38]⠀⠀⠀⠀⠀⠀⠀⣼⡟⠀⠀⢻⣧⠀⠀⠀⠀⠀⠀⠀⠀[/]
|
||||
[dim #7A3511]⠀⠀⠀⠀⠀⠀⠀tail flame lit⠀⠀⠀⠀⠀⠀⠀⠀[/]""",
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Skin loading and management
|
||||
# =============================================================================
|
||||
|
||||
_active_skin: Optional[SkinConfig] = None
_active_skin_name: str = "default"


def _skins_dir() -> Path:
    """User skins directory."""
    home = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
    return home / "skins"


def _load_skin_from_yaml(path: Path) -> Optional[Dict[str, Any]]:
    """Load a skin definition from a YAML file."""
    try:
        import yaml
        with open(path, "r", encoding="utf-8") as f:
            data = yaml.safe_load(f)
        if isinstance(data, dict) and "name" in data:
            return data
    except Exception as e:
        logger.debug("Failed to load skin from %s: %s", path, e)
    return None


def _build_skin_config(data: Dict[str, Any]) -> SkinConfig:
    """Build a SkinConfig from a raw dict (built-in or loaded from YAML)."""
    # Start with default values as base for missing keys.
    # ``or {}`` guards against a section that is present but empty in YAML
    # (parsed as None), which would otherwise make dict.update() raise.
    default = _BUILTIN_SKINS["default"]
    colors = dict(default.get("colors", {}))
    colors.update(data.get("colors") or {})
    spinner = dict(default.get("spinner", {}))
    spinner.update(data.get("spinner") or {})
    branding = dict(default.get("branding", {}))
    branding.update(data.get("branding") or {})

    return SkinConfig(
        name=data.get("name", "unknown"),
        description=data.get("description", ""),
        colors=colors,
        spinner=spinner,
        branding=branding,
        tool_prefix=data.get("tool_prefix", default.get("tool_prefix", "┊")),
        banner_logo=data.get("banner_logo", ""),
        banner_hero=data.get("banner_hero", ""),
    )
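
The inheritance rule in `_build_skin_config` is a shallow, per-section merge: each of `colors`, `spinner`, and `branding` starts as a copy of the default section and is then updated with the skin's own keys. The sketch below reproduces that rule on plain dicts. `DEFAULT`, `merge_section`, and the sample skin are made up for illustration, and the sketch treats an empty section (parsed from YAML as `None`) as an empty mapping.

```python
from typing import Any, Dict

# Hypothetical, trimmed stand-in for the built-in default skin.
DEFAULT: Dict[str, Any] = {
    "colors": {"banner_title": "#FFD700", "ui_ok": "#4caf50"},
    "spinner": {"thinking_verbs": ["pondering", "computing"]},
    "branding": {"agent_name": "Hermes Agent"},
}


def merge_section(default: Dict[str, Any], data: Dict[str, Any], section: str) -> Dict[str, Any]:
    """Copy the default section, then overlay the skin's keys on top of it."""
    merged = dict(default.get(section, {}))
    merged.update(data.get(section) or {})
    return merged


user_skin = {
    "name": "mytheme",
    "colors": {"banner_title": "#00FF00"},
    "spinner": {"thinking_verbs": ["forging"]},
    "branding": None,  # e.g. a bare "branding:" key in the YAML file
}

colors = merge_section(DEFAULT, user_skin, "colors")
spinner = merge_section(DEFAULT, user_skin, "spinner")
branding = merge_section(DEFAULT, user_skin, "branding")
```

Because the merge is shallow, a list such as `thinking_verbs` replaces the default list as a whole; entries are not appended to it.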
|
||||
|
||||
|
||||
def list_skins() -> List[Dict[str, str]]:
|
||||
"""List all available skins (built-in + user-installed).
|
||||
|
||||
Returns list of {"name": ..., "description": ..., "source": "builtin"|"user"}.
|
||||
"""
|
||||
result = []
|
||||
for name, data in _BUILTIN_SKINS.items():
|
||||
result.append({
|
||||
"name": name,
|
||||
"description": data.get("description", ""),
|
||||
"source": "builtin",
|
||||
})
|
||||
|
||||
skins_path = _skins_dir()
|
||||
if skins_path.is_dir():
|
||||
for f in sorted(skins_path.glob("*.yaml")):
|
||||
data = _load_skin_from_yaml(f)
|
||||
if data:
|
||||
skin_name = data.get("name", f.stem)
|
||||
# Skip if the name is already listed (a built-in or an earlier user skin)
|
||||
if any(s["name"] == skin_name for s in result):
|
||||
continue
|
||||
result.append({
|
||||
"name": skin_name,
|
||||
"description": data.get("description", ""),
|
||||
"source": "user",
|
||||
})
|
||||
|
||||
return result
|
||||
|
||||
|
||||
def load_skin(name: str) -> SkinConfig:
|
||||
"""Load a skin by name. Checks user skins first, then built-in."""
|
||||
# Check user skins directory
|
||||
skins_path = _skins_dir()
|
||||
user_file = skins_path / f"{name}.yaml"
|
||||
if user_file.is_file():
|
||||
data = _load_skin_from_yaml(user_file)
|
||||
if data:
|
||||
return _build_skin_config(data)
|
||||
|
||||
# Check built-in skins
|
||||
if name in _BUILTIN_SKINS:
|
||||
return _build_skin_config(_BUILTIN_SKINS[name])
|
||||
|
||||
# Fallback to default
|
||||
logger.warning("Skin '%s' not found, using default", name)
|
||||
return _build_skin_config(_BUILTIN_SKINS["default"])
|
||||
|
||||
|
||||
def get_active_skin() -> SkinConfig:
|
||||
"""Get the currently active skin config (cached)."""
|
||||
global _active_skin
|
||||
if _active_skin is None:
|
||||
_active_skin = load_skin(_active_skin_name)
|
||||
return _active_skin
|
||||
|
||||
|
||||
def set_active_skin(name: str) -> SkinConfig:
|
||||
"""Switch the active skin. Returns the new SkinConfig."""
|
||||
global _active_skin, _active_skin_name
|
||||
_active_skin_name = name
|
||||
_active_skin = load_skin(name)
|
||||
return _active_skin
|
||||
|
||||
|
||||
def get_active_skin_name() -> str:
|
||||
"""Get the name of the currently active skin."""
|
||||
return _active_skin_name
|
||||
|
||||
|
||||
def init_skin_from_config(config: dict) -> None:
|
||||
"""Initialize the active skin from CLI config at startup.
|
||||
|
||||
Call this once during CLI init with the loaded config dict.
|
||||
"""
|
||||
display = config.get("display", {})
|
||||
skin_name = display.get("skin", "default")
|
||||
if isinstance(skin_name, str) and skin_name.strip():
|
||||
set_active_skin(skin_name.strip())
|
||||
else:
|
||||
set_active_skin("default")
|
||||
@@ -263,7 +263,7 @@ def show_status(args):
|
||||
if jobs_file.exists():
|
||||
import json
|
||||
try:
|
||||
with open(jobs_file) as f:
|
||||
with open(jobs_file, encoding="utf-8") as f:
|
||||
data = json.load(f)
|
||||
jobs = data.get("jobs", [])
|
||||
enabled_jobs = [j for j in jobs if j.get("enabled", True)]
|
||||
@@ -283,7 +283,7 @@ def show_status(args):
|
||||
if sessions_file.exists():
|
||||
import json
|
||||
try:
|
||||
with open(sessions_file) as f:
|
||||
with open(sessions_file, encoding="utf-8") as f:
|
||||
data = json.load(f)
|
||||
print(f" Active: {len(data)} session(s)")
|
||||
except Exception:
|
||||
|
||||
@@ -7,3 +7,6 @@ without risk of circular imports.
|
||||
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
|
||||
OPENROUTER_MODELS_URL = f"{OPENROUTER_BASE_URL}/models"
|
||||
OPENROUTER_CHAT_URL = f"{OPENROUTER_BASE_URL}/chat/completions"
|
||||
|
||||
NOUS_API_BASE_URL = "https://inference-api.nousresearch.com/v1"
|
||||
NOUS_API_CHAT_URL = f"{NOUS_API_BASE_URL}/chat/completions"
|
||||
|
||||
@@ -16,6 +16,7 @@ Key design decisions:
|
||||
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import sqlite3
|
||||
import time
|
||||
from pathlib import Path
|
||||
@@ -490,12 +491,16 @@ class SessionDB:
|
||||
msg_id = cursor.lastrowid
|
||||
|
||||
# Update counters
|
||||
is_tool_related = role == "tool" or tool_calls is not None
|
||||
if is_tool_related:
|
||||
# Count actual tool calls from the tool_calls list (not from tool responses).
|
||||
# A single assistant message can contain multiple parallel tool calls.
|
||||
num_tool_calls = 0
|
||||
if tool_calls is not None:
|
||||
num_tool_calls = len(tool_calls) if isinstance(tool_calls, list) else 1
|
||||
if num_tool_calls > 0:
|
||||
self._conn.execute(
|
||||
"""UPDATE sessions SET message_count = message_count + 1,
|
||||
tool_call_count = tool_call_count + 1 WHERE id = ?""",
|
||||
(session_id,),
|
||||
tool_call_count = tool_call_count + ? WHERE id = ?""",
|
||||
(num_tool_calls, session_id),
|
||||
)
|
||||
else:
|
||||
self._conn.execute(
|
||||
@@ -553,6 +558,32 @@ class SessionDB:
|
||||
# Search
|
||||
# =========================================================================
|
||||
|
||||
@staticmethod
|
||||
def _sanitize_fts5_query(query: str) -> str:
|
||||
"""Sanitize user input for safe use in FTS5 MATCH queries.
|
||||
|
||||
FTS5 has its own query syntax where characters like ``"``, ``(``, ``)``,
|
||||
``+``, ``*``, ``{``, ``}`` and bare boolean operators (``AND``, ``OR``,
|
||||
``NOT``) have special meaning. Passing raw user input directly to
|
||||
MATCH can cause ``sqlite3.OperationalError``.
|
||||
|
||||
Strategy: strip characters that are only meaningful as FTS5 operators
|
||||
and would otherwise cause syntax errors. This preserves normal keyword
|
||||
search while preventing crashes on inputs like ``C++``, ``"unterminated``,
|
||||
or ``hello AND``.
|
||||
"""
|
||||
# Remove FTS5-special characters that are not useful in keyword search
|
||||
sanitized = re.sub(r'[+{}()"^]', " ", query)
|
||||
# Collapse repeated * (e.g. "***") into a single one, and remove
|
||||
# leading * (prefix-only matching requires at least one char before *)
|
||||
sanitized = re.sub(r"\*+", "*", sanitized)
|
||||
sanitized = re.sub(r"(^|\s)\*", r"\1", sanitized)
|
||||
# Remove dangling boolean operators at start/end that would cause
|
||||
# syntax errors (e.g. "hello AND" or "OR world")
|
||||
sanitized = re.sub(r"(?i)^(AND|OR|NOT)\b\s*", "", sanitized.strip())
|
||||
sanitized = re.sub(r"(?i)\s+(AND|OR|NOT)\s*$", "", sanitized.strip())
|
||||
return sanitized.strip()
|
||||
|
||||
def search_messages(
|
||||
self,
|
||||
query: str,
|
||||
@@ -576,6 +607,10 @@ class SessionDB:
|
||||
if not query or not query.strip():
|
||||
return []
|
||||
|
||||
query = self._sanitize_fts5_query(query)
|
||||
if not query:
|
||||
return []
|
||||
|
||||
if source_filter is None:
|
||||
source_filter = ["cli", "telegram", "discord", "whatsapp", "slack"]
|
||||
|
||||
@@ -615,7 +650,11 @@ class SessionDB:
|
||||
LIMIT ? OFFSET ?
|
||||
"""
|
||||
|
||||
cursor = self._conn.execute(sql, params)
|
||||
try:
|
||||
cursor = self._conn.execute(sql, params)
|
||||
except sqlite3.OperationalError:
|
||||
# FTS5 query syntax error despite sanitization — return empty
|
||||
return []
|
||||
matches = [dict(row) for row in cursor.fetchall()]
|
||||
|
||||
# Add surrounding context (1 message before + after each match)
|
||||
|
||||
|
@@ -19,7 +19,10 @@
|
||||
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&family=JetBrains+Mono:wght@400;500&display=swap" rel="stylesheet">
|
||||
|
||||
<link rel="stylesheet" href="style.css">
|
||||
<link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='.9em' font-size='90'>⚕</text></svg>">
|
||||
<link rel="icon" type="image/x-icon" href="favicon.ico">
|
||||
<link rel="icon" type="image/png" sizes="32x32" href="favicon-32x32.png">
|
||||
<link rel="icon" type="image/png" sizes="16x16" href="favicon-16x16.png">
|
||||
<link rel="apple-touch-icon" sizes="180x180" href="apple-touch-icon.png">
|
||||
</head>
|
||||
<body>
|
||||
<!-- Ambient glow effects -->
|
||||
|
||||
@@ -266,6 +266,7 @@ def handle_function_call(
|
||||
function_args: Dict[str, Any],
|
||||
task_id: Optional[str] = None,
|
||||
user_task: Optional[str] = None,
|
||||
enabled_tools: Optional[List[str]] = None,
|
||||
) -> str:
|
||||
"""
|
||||
Main function call dispatcher that routes calls to the tool registry.
|
||||
@@ -275,6 +276,10 @@ def handle_function_call(
|
||||
function_args: Arguments for the function.
|
||||
task_id: Unique identifier for terminal/browser session isolation.
|
||||
user_task: The user's original task (for browser_snapshot context).
|
||||
enabled_tools: Tool names enabled for this session. When provided,
|
||||
execute_code uses this list to determine which sandbox
|
||||
tools to generate. Falls back to the process-global
|
||||
``_last_resolved_tool_names`` for backward compat.
|
||||
|
||||
Returns:
|
||||
Function result as a JSON string.
|
||||
@@ -284,10 +289,13 @@ def handle_function_call(
|
||||
return json.dumps({"error": f"{function_name} must be handled by the agent loop"})
|
||||
|
||||
if function_name == "execute_code":
|
||||
# Prefer the caller-provided list so subagents can't overwrite
|
||||
# the parent's tool set via the process-global.
|
||||
sandbox_enabled = enabled_tools if enabled_tools is not None else _last_resolved_tool_names
|
||||
return registry.dispatch(
|
||||
function_name, function_args,
|
||||
task_id=task_id,
|
||||
enabled_tools=_last_resolved_tool_names,
|
||||
enabled_tools=sandbox_enabled,
|
||||
)
|
||||
|
||||
return registry.dispatch(
|
||||
|
||||
@@ -0,0 +1,2 @@
|
||||
Optional migration workflows for importing user state and customizations from
|
||||
other agent systems into Hermes Agent.
|
||||
@@ -0,0 +1,281 @@
|
||||
---
|
||||
name: openclaw-migration
|
||||
description: Migrate a user's OpenClaw customization footprint into Hermes Agent. Imports Hermes-compatible memories, SOUL.md, command allowlists, user skills, and selected workspace assets from ~/.openclaw, then reports exactly what could not be migrated and why.
|
||||
version: 1.0.0
|
||||
author: Hermes Agent (Nous Research)
|
||||
license: MIT
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [Migration, OpenClaw, Hermes, Memory, Persona, Import]
|
||||
related_skills: [hermes-agent]
|
||||
---
|
||||
|
||||
# OpenClaw -> Hermes Migration
|
||||
|
||||
Use this skill when a user wants to move their OpenClaw setup into Hermes Agent with minimal manual cleanup.
|
||||
|
||||
## What this skill does
|
||||
|
||||
It uses `scripts/openclaw_to_hermes.py` to:
|
||||
|
||||
- import `SOUL.md` into the Hermes home directory as `SOUL.md`
|
||||
- transform OpenClaw `MEMORY.md` and `USER.md` into Hermes memory entries
|
||||
- merge OpenClaw command approval patterns into Hermes `command_allowlist`
|
||||
- migrate Hermes-compatible messaging settings such as `TELEGRAM_ALLOWED_USERS` and `MESSAGING_CWD`
|
||||
- copy OpenClaw skills into `~/.hermes/skills/openclaw-imports/`
|
||||
- optionally copy the OpenClaw workspace instructions file into a chosen Hermes workspace
|
||||
- mirror compatible workspace assets such as `workspace/tts/` into `~/.hermes/tts/`
|
||||
- archive non-secret docs that do not have a direct Hermes destination
|
||||
- produce a structured report listing migrated items, conflicts, skipped items, and reasons
|
||||
|
||||
## Path resolution
|
||||
|
||||
The helper script lives in this skill directory at:
|
||||
|
||||
- `scripts/openclaw_to_hermes.py`
|
||||
|
||||
When this skill is installed from the Skills Hub, the normal location is:
|
||||
|
||||
- `~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py`
|
||||
|
||||
Do not guess a shorter path like `~/.hermes/skills/openclaw-migration/...`.
|
||||
|
||||
Before running the helper:
|
||||
|
||||
1. Prefer the installed path under `~/.hermes/skills/migration/openclaw-migration/`.
|
||||
2. If that path fails, inspect the installed skill directory and resolve the script relative to the installed `SKILL.md`.
|
||||
3. Only use `find` as a fallback if the installed location is missing or the skill was moved manually.
|
||||
4. When calling the terminal tool, do not pass `workdir: "~"`. Use an absolute directory such as the user's home directory, or omit `workdir` entirely.
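
The resolution order above can be sketched in Python. This is only an illustration: `resolve_helper` and its `hermes_home` parameter are hypothetical names, and the fallback simply looks for a `SKILL.md` inside a directory named `openclaw-migration` rather than shelling out to `find`.

```python
from pathlib import Path
from typing import Optional

# Location of the helper relative to the skill directory (from this document).
SCRIPT_REL = Path("scripts") / "openclaw_to_hermes.py"


def resolve_helper(hermes_home: Path) -> Optional[Path]:
    """Return the helper script path, preferring the Skills Hub install location."""
    preferred = hermes_home / "skills" / "migration" / "openclaw-migration" / SCRIPT_REL
    if preferred.is_file():
        return preferred

    # Fallback: the skill may have been moved manually, so look for its
    # SKILL.md and resolve the script relative to that file.
    skills_root = hermes_home / "skills"
    if skills_root.is_dir():
        for skill_md in sorted(skills_root.rglob("SKILL.md")):
            if skill_md.parent.name == "openclaw-migration":
                candidate = skill_md.parent / SCRIPT_REL
                if candidate.is_file():
                    return candidate
    return None
```

The function returns an absolute path whenever `hermes_home` is absolute, which fits the rule about not passing `workdir: "~"` to the terminal tool.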
|
||||
|
||||
With `--migrate-secrets`, the helper script also imports a small allowlisted set of Hermes-compatible secrets, currently:
|
||||
|
||||
- `TELEGRAM_BOT_TOKEN`
|
||||
|
||||
## Default workflow
|
||||
|
||||
1. Inspect first with a dry run.
|
||||
2. Present a simple summary of what can be migrated, what cannot be migrated, and what would be archived.
|
||||
3. If the `clarify` tool is available, use it for user decisions instead of asking for a free-form prose reply.
|
||||
4. If the dry run finds imported skill directory conflicts, ask how those should be handled before executing.
|
||||
5. Ask the user to choose between the two supported migration modes before executing.
|
||||
6. Ask for a target workspace path only if the user wants the workspace instructions file brought over.
|
||||
7. Execute the migration with the matching preset and flags.
|
||||
8. Summarize the results, especially:
|
||||
- what was migrated
|
||||
- what was archived for manual review
|
||||
- what was skipped and why
|
||||
|
||||
## User interaction protocol
|
||||
|
||||
Hermes CLI supports the `clarify` tool for interactive prompts, but it is limited to:
|
||||
|
||||
- one choice at a time
|
||||
- up to 4 predefined choices
|
||||
- an automatic `Other` free-text option
|
||||
|
||||
It does **not** support true multi-select checkboxes in a single prompt.
|
||||
|
||||
For every `clarify` call:
|
||||
|
||||
- always include a non-empty `question`
|
||||
- include `choices` only for real selectable prompts
|
||||
- keep `choices` to 2-4 plain string options
|
||||
- never emit placeholder or truncated options such as `...`
|
||||
- never pad or stylize choices with extra whitespace
|
||||
- never include fake form fields in the question such as `enter directory here`, blank lines to fill in, or underscores like `_____`
|
||||
- for open-ended path questions, ask only the plain sentence; the user types in the normal CLI prompt below the panel
|
||||
|
||||
If a `clarify` call returns an error, inspect the error text, correct the payload, and retry once with a valid `question` and clean choices.
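
As a rough illustration of the payload rules above, the following sketch checks a `clarify` payload before it is sent. The function name `clarify_payload_errors` is hypothetical, and the checks are an interpretation of the listed rules, not the validation that the `clarify` tool itself performs.

```python
from typing import Any, Dict, List


def clarify_payload_errors(payload: Dict[str, Any]) -> List[str]:
    """Return a list of rule violations; an empty list means the payload looks clean."""
    errors: List[str] = []

    question = payload.get("question")
    if not isinstance(question, str) or not question.strip():
        errors.append("question must be a non-empty string")
    elif "___" in question:
        errors.append("question must not contain fake form fields")

    # `choices` is optional: open-ended path questions omit it entirely.
    if "choices" in payload:
        choices = payload["choices"]
        if not isinstance(choices, list) or not 2 <= len(choices) <= 4:
            errors.append("choices must be a list of 2-4 options")
        else:
            for choice in choices:
                if not isinstance(choice, str) or not choice:
                    errors.append("each choice must be a non-empty string")
                elif choice != choice.strip():
                    errors.append(f"choice has padding whitespace: {choice!r}")
                elif choice.endswith(("...", "\u2026")):
                    errors.append(f"choice looks truncated or placeholder: {choice!r}")
    return errors
```

Running this check before each call could catch a malformed payload early, leaving the single retry for errors that the tool reports.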
|
||||
|
||||
When `clarify` is available and the dry run reveals any required user decision, your **next action must be a `clarify` tool call**.
|
||||
Do not end the turn with a normal assistant message such as:
|
||||
|
||||
- "Let me present the choices"
|
||||
- "What would you like to do?"
|
||||
- "Here are the options"
|
||||
|
||||
If a user decision is required, collect it via `clarify` before producing more prose.
|
||||
If multiple unresolved decisions remain, do not insert an explanatory assistant message between them. After one `clarify` response is received, your next action should usually be the next required `clarify` call.
|
||||
|
||||
Treat `workspace-agents` as an unresolved decision whenever the dry run reports:
|
||||
|
||||
- `kind="workspace-agents"`
|
||||
- `status="skipped"`
|
||||
- reason containing `No workspace target was provided`
|
||||
|
||||
In that case, you must ask about workspace instructions before execution. Do not silently treat that as a decision to skip.
|
||||
|
||||
Because `clarify` handles only one single-select question at a time, use this simplified decision flow:
|
||||
|
||||
1. For `SOUL.md` conflicts, use `clarify` with choices such as:
|
||||
- `keep existing`
|
||||
- `overwrite with backup`
|
||||
- `review first`
|
||||
2. If the dry run shows one or more `kind="skill"` items with `status="conflict"`, use `clarify` with choices such as:
|
||||
- `keep existing skills`
|
||||
- `overwrite conflicting skills with backup`
|
||||
- `import conflicting skills under renamed folders`
|
||||
3. For workspace instructions, use `clarify` with choices such as:
|
||||
- `skip workspace instructions`
|
||||
- `copy to a workspace path`
|
||||
- `decide later`
|
||||
4. If the user chooses to copy workspace instructions, ask a follow-up open-ended `clarify` question requesting an **absolute path**.
|
||||
5. If the user chooses `skip workspace instructions` or `decide later`, proceed without `--workspace-target`.
|
||||
6. For migration mode, use `clarify` with these 3 choices:
|
||||
- `user-data only`
|
||||
- `full compatible migration`
|
||||
- `cancel`
|
||||
7. `user-data only` means: migrate user data and compatible config, but do **not** import allowlisted secrets.
|
||||
8. `full compatible migration` means: migrate the same compatible user data plus the allowlisted secrets when present.
|
||||
9. If `clarify` is not available, ask the same question in normal text, but still constrain the answer to `user-data only`, `full compatible migration`, or `cancel`.
|
||||
|
||||
Execution gate:
|
||||
|
||||
- Do not execute while a `workspace-agents` skip caused by `No workspace target was provided` remains unresolved.
|
||||
- The only valid ways to resolve it are:
|
||||
- user explicitly chooses `skip workspace instructions`
|
||||
- user explicitly chooses `decide later`
|
||||
- user provides a workspace path after choosing `copy to a workspace path`
|
||||
- Absence of a workspace target in the dry run is not itself permission to execute.
|
||||
- Do not execute while any required `clarify` decision remains unresolved.
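
The workspace part of the execution gate can be sketched as a small predicate. It assumes the dry-run report items are available as a list of dicts with `kind`, `status`, and `reason` keys; the function name and the list shape are assumptions, since this document only names the individual fields.

```python
from typing import Any, Dict, List, Optional

# Choices that resolve the decision without a workspace path.
RESOLVING_CHOICES = {"skip workspace instructions", "decide later"}


def workspace_decision_pending(
    items: List[Dict[str, Any]],
    user_choice: Optional[str] = None,
    workspace_path: Optional[str] = None,
) -> bool:
    """Return True while the workspace-instructions decision still blocks execution."""
    flagged = any(
        item.get("kind") == "workspace-agents"
        and item.get("status") == "skipped"
        and "No workspace target was provided" in (item.get("reason") or "")
        for item in items
    )
    if not flagged:
        return False
    if user_choice in RESOLVING_CHOICES:
        return False
    # Choosing to copy only resolves the gate once a path has been given.
    if user_choice == "copy to a workspace path" and workspace_path:
        return False
    return True
```

Note that the predicate stays `True` when no choice has been made, matching the rule that a missing workspace target is not itself permission to execute.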
|
||||
|
||||
Use these exact `clarify` payload shapes as the default pattern:
|
||||
|
||||
- `{"question":"Your existing SOUL.md conflicts with the imported one. What should I do?","choices":["keep existing","overwrite with backup","review first"]}`
|
||||
- `{"question":"One or more imported OpenClaw skills already exist in Hermes. How should I handle those skill conflicts?","choices":["keep existing skills","overwrite conflicting skills with backup","import conflicting skills under renamed folders"]}`
|
||||
- `{"question":"Choose migration mode: migrate only user data, or run the full compatible migration including allowlisted secrets?","choices":["user-data only","full compatible migration","cancel"]}`
|
||||
- `{"question":"Do you want to copy the OpenClaw workspace instructions file into a Hermes workspace?","choices":["skip workspace instructions","copy to a workspace path","decide later"]}`
|
||||
- `{"question":"Please provide an absolute path where the workspace instructions should be copied."}`
|
||||
|
||||
## Decision-to-command mapping
|
||||
|
||||
Map user decisions to command flags exactly:
|
||||
|
||||
- If the user chooses `keep existing` for `SOUL.md`, do **not** add `--overwrite`.
|
||||
- If the user chooses `overwrite with backup`, add `--overwrite`.
|
||||
- If the user chooses `review first`, stop before execution and review the relevant files.
|
||||
- If the user chooses `keep existing skills`, add `--skill-conflict skip`.
|
||||
- If the user chooses `overwrite conflicting skills with backup`, add `--skill-conflict overwrite`.
|
||||
- If the user chooses `import conflicting skills under renamed folders`, add `--skill-conflict rename`.
|
||||
- If the user chooses `user-data only`, execute with `--preset user-data` and do **not** add `--migrate-secrets`.
|
||||
- If the user chooses `full compatible migration`, execute with `--preset full --migrate-secrets`.
|
||||
- Only add `--workspace-target` if the user explicitly provided an absolute workspace path.
|
||||
- If the user chooses `skip workspace instructions` or `decide later`, do not add `--workspace-target`.
|
||||
|
||||
Before executing, restate the exact command plan in plain language and make sure it matches the user's choices.
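
The mapping above can be written as a small lookup. This is a minimal sketch: `build_flags` and its parameter names are hypothetical, and only the flags and choice strings come from this document.

```python
import os
from typing import List, Optional

SKILL_CONFLICT_FLAGS = {
    "keep existing skills": "skip",
    "overwrite conflicting skills with backup": "overwrite",
    "import conflicting skills under renamed folders": "rename",
}


def build_flags(
    mode: str,
    soul_choice: Optional[str] = None,
    skill_choice: Optional[str] = None,
    workspace_path: Optional[str] = None,
) -> List[str]:
    """Translate the user's clarify answers into helper-script flags."""
    if mode == "cancel" or soul_choice == "review first":
        raise ValueError("user decisions do not permit execution yet")

    flags = ["--execute"]
    if mode == "user-data only":
        flags += ["--preset", "user-data"]  # never add --migrate-secrets here
    elif mode == "full compatible migration":
        flags += ["--preset", "full", "--migrate-secrets"]
    else:
        raise ValueError(f"unknown migration mode: {mode!r}")

    if soul_choice == "overwrite with backup":
        flags.append("--overwrite")
    if skill_choice is not None:
        flags += ["--skill-conflict", SKILL_CONFLICT_FLAGS[skill_choice]]
    if workspace_path is not None:
        # Only an explicit absolute path may become --workspace-target.
        if not os.path.isabs(workspace_path):
            raise ValueError("workspace path must be absolute")
        flags += ["--workspace-target", workspace_path]
    return flags
```

For example, `build_flags("user-data only", skill_choice="keep existing skills")` yields `--execute --preset user-data --skill-conflict skip`, the same flag set as the user-data command in the Commands section.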
|
||||
|
||||
## Post-run reporting rules
|
||||
|
||||
After execution, treat the script's JSON output as the source of truth.
|
||||
|
||||
1. Base all counts on `report.summary`.
|
||||
2. Only list an item under "Successfully Migrated" if its `status` is exactly `migrated`.
|
||||
3. Do not claim a conflict was resolved unless the report shows that item as `migrated`.
|
||||
4. Do not say `SOUL.md` was overwritten unless the report item for `kind="soul"` has `status="migrated"`.
|
||||
5. If `report.summary.conflict > 0`, include a conflict section instead of silently implying success.
|
||||
6. If counts and listed items disagree, fix the list to match the report before responding.
|
||||
7. Include the `output_dir` path from the report when available so the user can inspect `report.json`, `summary.md`, backups, and archived files.
|
||||
8. For memory or user-profile overflow, do not say the entries were archived unless the report explicitly shows an archive path. If `details.overflow_file` exists, say the full overflow list was exported there.
|
||||
9. If a skill was imported under a renamed folder, report the final destination and mention `details.renamed_from`.
|
||||
10. If `report.skill_conflict_mode` is present, use it as the source of truth for the selected imported-skill conflict policy.
|
||||
11. If an item has `status="skipped"`, do not describe it as overwritten, backed up, migrated, or resolved.
|
||||
12. If `kind="soul"` has `status="skipped"` with reason `Target already matches source`, say it was left unchanged and do not mention a backup.
|
||||
13. If a renamed imported skill has an empty `details.backup`, do not imply the existing Hermes skill was renamed or backed up. Say only that the imported copy was placed in the new destination and reference `details.renamed_from` as the pre-existing folder that remained in place.
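
A few of these rules lend themselves to a mechanical check before writing the summary. The sketch below assumes the report is a dict with a `summary` mapping of status counts and an item list under an `items` key; the `items` key name and the `summarize_report` helper are assumptions, while `summary.conflict`, `status`, and `output_dir` come from the rules above.

```python
from typing import Any, Dict, List


def summarize_report(report: Dict[str, Any]) -> Dict[str, Any]:
    """Group report items by exact status and flag count mismatches."""
    summary = report.get("summary", {})
    items: List[Dict[str, Any]] = report.get("items", [])  # assumed key name

    def with_status(status: str) -> List[Dict[str, Any]]:
        # Exact match only: a "conflict" item is never listed as migrated.
        return [item for item in items if item.get("status") == status]

    sections = {
        "migrated": with_status("migrated"),
        "conflict": with_status("conflict"),
        "skipped": with_status("skipped"),
    }
    # Rule 6: counts in report.summary win; report any section that disagrees.
    mismatches = [
        status
        for status, listed in sections.items()
        if status in summary and summary[status] != len(listed)
    ]
    return {
        "sections": sections,
        "show_conflict_section": summary.get("conflict", 0) > 0,
        "count_mismatches": mismatches,
        "output_dir": report.get("output_dir"),
    }
```

A non-empty `count_mismatches` list signals that the item lists need to be reconciled with `report.summary` before responding.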
|
||||
|
||||
## Migration presets
|
||||
|
||||
Prefer these two presets in normal use:
|
||||
|
||||
- `user-data`
|
||||
- `full`
|
||||
|
||||
`user-data` includes:
|
||||
|
||||
- `soul`
|
||||
- `workspace-agents`
|
||||
- `memory`
|
||||
- `user-profile`
|
||||
- `messaging-settings`
|
||||
- `command-allowlist`
|
||||
- `skills`
|
||||
- `tts-assets`
|
||||
- `archive`
|
||||
|
||||
`full` includes everything in `user-data` plus:
|
||||
|
||||
- `secret-settings`
|
||||
|
||||
The helper script still supports category-level `--include` / `--exclude`, but treat that as an advanced fallback rather than the default UX.
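
The relationship between the two presets can be pictured as plain data. This sketch only mirrors the category lists above; the `PRESETS` table and `categories_for` function are hypothetical names and are not taken from the helper script.

```python
from typing import Dict, Tuple

USER_DATA: Tuple[str, ...] = (
    "soul",
    "workspace-agents",
    "memory",
    "user-profile",
    "messaging-settings",
    "command-allowlist",
    "skills",
    "tts-assets",
    "archive",
)

# `full` is everything in `user-data` plus the secrets category.
PRESETS: Dict[str, Tuple[str, ...]] = {
    "user-data": USER_DATA,
    "full": USER_DATA + ("secret-settings",),
}


def categories_for(preset: str) -> Tuple[str, ...]:
    """Return the migration categories covered by a preset name."""
    try:
        return PRESETS[preset]
    except KeyError:
        raise ValueError(f"unknown preset: {preset!r}") from None
```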
|
||||
|
||||
## Commands
|
||||
|
||||
Dry run with full discovery:
|
||||
|
||||
```bash
|
||||
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py
|
||||
```
|
||||
|
||||
When using the terminal tool, prefer an absolute invocation pattern such as:
|
||||
|
||||
```json
|
||||
{"command":"python3 /home/USER/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py","workdir":"/home/USER"}
|
||||
```
|
||||
|
||||
Dry run with the user-data preset:
|
||||
|
||||
```bash
|
||||
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py --preset user-data
|
||||
```
|
||||
|
||||
Execute a user-data migration:
|
||||
|
||||
```bash
|
||||
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py --execute --preset user-data --skill-conflict skip
|
||||
```
|
||||
|
||||
Execute a full compatible migration:
|
||||
|
||||
```bash
|
||||
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py --execute --preset full --migrate-secrets --skill-conflict skip
|
||||
```
|
||||
|
||||
Execute with workspace instructions included:
|
||||
|
||||
```bash
|
||||
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py --execute --preset user-data --skill-conflict rename --workspace-target "/absolute/workspace/path"
|
||||
```
|
||||
|
||||
Do not use `$PWD` or the home directory as the workspace target by default. Ask for an explicit workspace path first.
|
||||
|
||||
## Important rules
|
||||
|
||||
1. Run a dry run before writing unless the user explicitly says to proceed immediately.
|
||||
2. Do not migrate secrets by default. Tokens, auth blobs, device credentials, and raw gateway config should stay out of Hermes unless the user explicitly asks for secret migration.
|
||||
3. Do not silently overwrite non-empty Hermes targets unless the user explicitly wants that. The helper script will preserve backups when overwriting is enabled.
|
||||
4. Always give the user the skipped-items report. That report is part of the migration, not an optional extra.
|
||||
5. Prefer the primary OpenClaw workspace (`~/.openclaw/workspace/`) over `workspace.default/`. Only use the default workspace as fallback when the primary files are missing.
|
||||
6. Even in secret-migration mode, only migrate secrets with a clean Hermes destination. Unsupported auth blobs must still be reported as skipped.
|
||||
7. If the dry run shows a large asset copy, a conflicting `SOUL.md`, or overflowed memory entries, call those out separately before execution.
|
||||
8. Default to `user-data only` if the user is unsure.
|
||||
9. Only include `workspace-agents` when the user has explicitly provided a destination workspace path.
|
||||
10. Treat category-level `--include` / `--exclude` as an advanced escape hatch, not the normal flow.
|
||||
11. Do not end the dry-run summary with a vague “What would you like to do?” if `clarify` is available. Use structured follow-up prompts instead.
|
||||
12. Do not use an open-ended `clarify` prompt when a real choice prompt would work. Prefer selectable choices first, then free text only for absolute paths or file review requests.
|
||||
13. After a dry run, never stop after summarizing if there is still an unresolved decision. Use `clarify` immediately for the highest-priority blocking decision.
|
||||
14. Priority order for follow-up questions:
|
||||
- `SOUL.md` conflict
|
||||
- imported skill conflicts
|
||||
- migration mode
|
||||
- workspace instructions destination
|
||||
15. Do not promise to present choices later in the same message. Present them by actually calling `clarify`.
|
||||
16. After the migration-mode answer, explicitly check whether `workspace-agents` is still unresolved. If it is, your next action must be the workspace-instructions `clarify` call.
|
||||
17. After any `clarify` answer, if another required decision remains, do not narrate what was just decided. Ask the next required question immediately.
|
||||
|
||||
## Expected result
|
||||
|
||||
After a successful run, the user should have:
|
||||
|
||||
- Hermes persona state imported
|
||||
- Hermes memory files populated with converted OpenClaw knowledge
|
||||
- OpenClaw skills available under `~/.hermes/skills/openclaw-imports/`
|
||||
- a migration report showing any conflicts, omissions, or unsupported data
|
||||
@@ -46,7 +46,10 @@ cron = ["croniter"]
|
||||
slack = ["slack-bolt>=1.18.0", "slack-sdk>=3.27.0"]
|
||||
cli = ["simple-term-menu"]
|
||||
tts-premium = ["elevenlabs"]
|
||||
pty = ["ptyprocess>=0.7.0"]
|
||||
pty = [
|
||||
"ptyprocess>=0.7.0; sys_platform != 'win32'",
|
||||
"pywinpty>=2.0.0; sys_platform == 'win32'",
|
||||
]
|
||||
honcho = ["honcho-ai>=2.0.1"]
|
||||
mcp = ["mcp>=1.2.0"]
|
||||
homeassistant = ["aiohttp>=3.9.0"]
|
||||
|
||||
@@ -172,6 +172,7 @@ class AIAgent:
|
||||
provider_data_collection: str = None,
|
||||
session_id: str = None,
|
||||
tool_progress_callback: callable = None,
|
||||
thinking_callback: callable = None,
|
||||
clarify_callback: callable = None,
|
||||
step_callback: callable = None,
|
||||
max_tokens: int = None,
|
||||
@@ -184,6 +185,8 @@ class AIAgent:
|
||||
honcho_session_key: str = None,
|
||||
iteration_budget: "IterationBudget" = None,
|
||||
fallback_model: Dict[str, Any] = None,
|
||||
checkpoints_enabled: bool = False,
|
||||
checkpoint_max_snapshots: int = 50,
|
||||
):
|
||||
"""
|
||||
Initialize the AI Agent.
|
||||
@@ -256,6 +259,7 @@ class AIAgent:
|
||||
self.api_mode = "chat_completions"
|
||||
|
||||
self.tool_progress_callback = tool_progress_callback
|
||||
self.thinking_callback = thinking_callback
|
||||
self.clarify_callback = clarify_callback
|
||||
self.step_callback = step_callback
|
||||
self._last_reported_tool = None # Track for "new tool" mode
|
||||
@@ -484,6 +488,13 @@ class AIAgent:
|
||||
# Cached system prompt -- built once per session, only rebuilt on compression
|
||||
self._cached_system_prompt: Optional[str] = None
|
||||
|
||||
# Filesystem checkpoint manager (transparent — not a tool)
|
||||
from tools.checkpoint_manager import CheckpointManager
|
||||
self._checkpoint_mgr = CheckpointManager(
|
||||
enabled=checkpoints_enabled,
|
||||
max_snapshots=checkpoint_max_snapshots,
|
||||
)
|
||||
|
||||
# SQLite session store (optional -- provided by CLI or gateway)
|
||||
self._session_db = session_db
|
||||
if self._session_db:
|
||||
@@ -1431,6 +1442,34 @@ class AIAgent:
|
||||
|
||||
return "\n\n".join(prompt_parts)
|
||||
|
||||
def _repair_tool_call(self, tool_name: str) -> str | None:
|
||||
"""Attempt to repair a mismatched tool name before aborting.
|
||||
|
||||
1. Try lowercase
|
||||
2. Try normalized (lowercase + hyphens/spaces -> underscores)
|
||||
3. Try fuzzy match (difflib, cutoff=0.7)
|
||||
|
||||
Returns the repaired name if found in valid_tool_names, else None.
|
||||
"""
|
||||
from difflib import get_close_matches
|
||||
|
||||
# 1. Lowercase
|
||||
lowered = tool_name.lower()
|
||||
if lowered in self.valid_tool_names:
|
||||
return lowered
|
||||
|
||||
# 2. Normalize
|
||||
normalized = lowered.replace("-", "_").replace(" ", "_")
|
||||
if normalized in self.valid_tool_names:
|
||||
return normalized
|
||||
|
||||
# 3. Fuzzy match
|
||||
matches = get_close_matches(lowered, self.valid_tool_names, n=1, cutoff=0.7)
|
||||
if matches:
|
||||
return matches[0]
|
||||
|
||||
return None
|
||||
|
||||
def _invalidate_system_prompt(self):
|
||||
"""
|
||||
Invalidate the cached system prompt, forcing a rebuild on the next turn.
|
||||
@@ -2689,6 +2728,8 @@ class AIAgent:
|
||||
except json.JSONDecodeError as e:
|
||||
logging.warning(f"Unexpected JSON error after validation: {e}")
|
||||
function_args = {}
|
||||
if not isinstance(function_args, dict):
|
||||
function_args = {}
|
||||
|
||||
if not self.quiet_mode:
|
||||
args_str = json.dumps(function_args, ensure_ascii=False)
|
||||
@@ -2702,6 +2743,18 @@ class AIAgent:
|
||||
except Exception as cb_err:
|
||||
logging.debug(f"Tool progress callback error: {cb_err}")
|
||||
|
||||
# Checkpoint: snapshot working dir before file-mutating tools
|
||||
if function_name in ("write_file", "patch") and self._checkpoint_mgr.enabled:
|
||||
try:
|
||||
file_path = function_args.get("path", "")
|
||||
if file_path:
|
||||
work_dir = self._checkpoint_mgr.get_working_dir_for_path(file_path)
|
||||
self._checkpoint_mgr.ensure_checkpoint(
|
||||
work_dir, f"before {function_name}"
|
||||
)
|
||||
except Exception:
|
||||
pass # never block tool execution
|
||||
|
||||
tool_start_time = time.time()
|
||||
|
||||
if function_name == "todo":
|
||||
@@ -2814,7 +2867,10 @@ class AIAgent:
|
||||
spinner.start()
|
||||
_spinner_result = None
|
||||
try:
|
||||
function_result = handle_function_call(function_name, function_args, effective_task_id)
|
||||
function_result = handle_function_call(
|
||||
function_name, function_args, effective_task_id,
|
||||
enabled_tools=list(self.valid_tool_names) if self.valid_tool_names else None,
|
||||
)
|
||||
_spinner_result = function_result
|
||||
except Exception as tool_error:
|
||||
function_result = f"Error executing tool '{function_name}': {tool_error}"
|
||||
@@ -2825,7 +2881,10 @@ class AIAgent:
|
||||
spinner.stop(cute_msg)
|
||||
else:
|
||||
try:
|
||||
function_result = handle_function_call(function_name, function_args, effective_task_id)
|
||||
function_result = handle_function_call(
|
||||
function_name, function_args, effective_task_id,
|
||||
enabled_tools=list(self.valid_tool_names) if self.valid_tool_names else None,
|
||||
)
|
||||
except Exception as tool_error:
|
||||
function_result = f"Error executing tool '{function_name}': {tool_error}"
|
||||
logger.error("handle_function_call raised for %s: %s", function_name, tool_error, exc_info=True)
|
||||
@@ -3042,6 +3101,8 @@ class AIAgent:
|
||||
self._invalid_tool_retries = 0
|
||||
self._invalid_json_retries = 0
|
||||
self._empty_content_retries = 0
|
||||
self._incomplete_scratchpad_retries = 0
|
||||
self._codex_incomplete_retries = 0
|
||||
self._last_content_with_tools = None
|
||||
self._turns_since_memory = 0
|
||||
self._iters_since_skill = 0
|
||||
@@ -3206,11 +3267,16 @@ class AIAgent:
|
||||
final_response = None
|
||||
interrupted = False
|
||||
codex_ack_continuations = 0
|
||||
length_continue_retries = 0
|
||||
truncated_response_prefix = ""
|
||||
|
||||
# Clear any stale interrupt state at start
|
||||
self.clear_interrupt()
|
||||
|
||||
while api_call_count < self.max_iterations and self.iteration_budget.remaining > 0:
|
||||
# Reset per-turn checkpoint dedup so each iteration can take one snapshot
|
||||
self._checkpoint_mgr.new_turn()
|
||||
|
||||
# Check for interrupt request (e.g., user sent new message)
|
||||
if self._interrupt_requested:
|
||||
interrupted = True
|
||||
@@ -3254,7 +3320,7 @@ class AIAgent:
|
||||
api_messages = []
|
||||
for msg in messages:
|
||||
api_msg = msg.copy()
|
||||
|
||||
|
||||
# For ALL assistant messages, pass reasoning back to the API
|
||||
# This ensures multi-turn reasoning context is preserved
|
||||
if msg.get("role") == "assistant":
|
||||
@@ -3262,7 +3328,7 @@ class AIAgent:
|
||||
if reasoning_text:
|
||||
# Add reasoning_content for API compatibility (Moonshot AI, Novita, OpenRouter)
|
||||
api_msg["reasoning_content"] = reasoning_text
|
||||
|
||||
|
||||
# Remove 'reasoning' field - it's for trajectory storage only
|
||||
# We've copied it to 'reasoning_content' for the API above
|
||||
if "reasoning" in api_msg:
|
||||
@@ -3273,7 +3339,7 @@ class AIAgent:
|
||||
# Keep 'reasoning_details' - OpenRouter uses this for multi-turn reasoning context
|
||||
# The signature field helps maintain reasoning continuity
|
||||
api_messages.append(api_msg)
|
||||
|
||||
|
||||
# Build the final system message: cached prompt + ephemeral system prompt.
|
||||
# The ephemeral part is appended here (not baked into the cached prompt)
|
||||
# so it stays out of the session DB and logs.
|
||||
@@ -3286,21 +3352,21 @@ class AIAgent:
|
||||
effective_system = (effective_system + "\n\n" + self.ephemeral_system_prompt).strip()
|
||||
if effective_system:
|
||||
api_messages = [{"role": "system", "content": effective_system}] + api_messages
|
||||
|
||||
|
||||
# Inject ephemeral prefill messages right after the system prompt
|
||||
# but before conversation history. Same API-call-time-only pattern.
|
||||
if self.prefill_messages:
|
||||
sys_offset = 1 if effective_system else 0
|
||||
for idx, pfm in enumerate(self.prefill_messages):
|
||||
api_messages.insert(sys_offset + idx, pfm.copy())
|
||||
|
||||
|
||||
# Apply Anthropic prompt caching for Claude models via OpenRouter.
|
||||
# Auto-detected: if model name contains "claude" and base_url is OpenRouter,
|
||||
# inject cache_control breakpoints (system + last 3 messages) to reduce
|
||||
# input token costs by ~75% on multi-turn conversations.
|
||||
if self._use_prompt_caching:
|
||||
api_messages = apply_anthropic_cache_control(api_messages, cache_ttl=self._cache_ttl)
|
||||
|
||||
|
||||
# Safety net: strip orphaned tool results / add stubs for missing
|
||||
# results before sending to the API. The compressor handles this
|
||||
# during compression, but orphans can also sneak in from session
|
||||
@@ -3323,9 +3389,13 @@ class AIAgent:
|
||||
# Animated thinking spinner in quiet mode
|
||||
face = random.choice(KawaiiSpinner.KAWAII_THINKING)
|
||||
verb = random.choice(KawaiiSpinner.THINKING_VERBS)
|
||||
spinner_type = random.choice(['brain', 'sparkle', 'pulse', 'moon', 'star'])
|
||||
thinking_spinner = KawaiiSpinner(f"{face} {verb}...", spinner_type=spinner_type)
|
||||
thinking_spinner.start()
|
||||
if self.thinking_callback:
|
||||
# CLI TUI mode: use prompt_toolkit widget instead of raw spinner
|
||||
self.thinking_callback(f"{face} {verb}...")
|
||||
else:
|
||||
spinner_type = random.choice(['brain', 'sparkle', 'pulse', 'moon', 'star'])
|
||||
thinking_spinner = KawaiiSpinner(f"{face} {verb}...", spinner_type=spinner_type)
|
||||
thinking_spinner.start()
|
||||
|
||||
# Log request details if verbose
|
||||
if self.verbose_logging:
|
||||
@@ -3340,6 +3410,8 @@ class AIAgent:
|
||||
max_compression_attempts = 3
|
||||
codex_auth_retry_attempted = False
|
||||
nous_auth_retry_attempted = False
|
||||
restart_with_compressed_messages = False
|
||||
restart_with_length_continuation = False
|
||||
|
||||
finish_reason = "stop"
|
||||
response = None # Guard against UnboundLocalError if all retries fail
|
||||
@@ -3362,6 +3434,8 @@ class AIAgent:
|
||||
if thinking_spinner:
|
||||
thinking_spinner.stop("")
|
||||
thinking_spinner = None
|
||||
if self.thinking_callback:
|
||||
self.thinking_callback("")
|
||||
|
||||
if not self.quiet_mode:
|
||||
print(f"{self.log_prefix}⏱️ API call completed in {api_duration:.2f}s")
|
||||
@@ -3402,6 +3476,8 @@ class AIAgent:
|
||||
if thinking_spinner:
|
||||
thinking_spinner.stop(f"(´;ω;`) oops, retrying...")
|
||||
thinking_spinner = None
|
||||
if self.thinking_callback:
|
||||
self.thinking_callback("")
|
||||
|
||||
# This is often rate limiting or provider returning malformed response
|
||||
retry_count += 1
|
||||
@@ -3486,19 +3562,60 @@ class AIAgent:
|
||||
finish_reason = "stop"
|
||||
else:
|
||||
finish_reason = response.choices[0].finish_reason
|
||||
|
||||
# Handle "length" finish_reason - response was truncated
|
||||
|
||||
if finish_reason == "length":
|
||||
print(f"{self.log_prefix}⚠️ Response truncated (finish_reason='length') - model hit max output tokens")
|
||||
|
||||
|
||||
if self.api_mode == "chat_completions":
|
||||
assistant_message = response.choices[0].message
|
||||
if not assistant_message.tool_calls:
|
||||
length_continue_retries += 1
|
||||
interim_msg = self._build_assistant_message(assistant_message, finish_reason)
|
||||
messages.append(interim_msg)
|
||||
self._log_msg_to_db(interim_msg)
|
||||
if assistant_message.content:
|
||||
truncated_response_prefix += assistant_message.content
|
||||
|
||||
if length_continue_retries < 3:
|
||||
print(
|
||||
f"{self.log_prefix}↻ Requesting continuation "
|
||||
f"({length_continue_retries}/3)..."
|
||||
)
|
||||
continue_msg = {
|
||||
"role": "user",
|
||||
"content": (
|
||||
"[System: Your previous response was truncated by the output "
|
||||
"length limit. Continue exactly where you left off. Do not "
|
||||
"restart or repeat prior text. Finish the answer directly.]"
|
||||
),
|
||||
}
|
||||
messages.append(continue_msg)
|
||||
self._log_msg_to_db(continue_msg)
|
||||
self._session_messages = messages
|
||||
self._save_session_log(messages)
|
||||
restart_with_length_continuation = True
|
||||
break
|
||||
|
||||
partial_response = self._strip_think_blocks(truncated_response_prefix).strip()
|
||||
self._cleanup_task_resources(effective_task_id)
|
||||
self._persist_session(messages, conversation_history)
|
||||
return {
|
||||
"final_response": partial_response or None,
|
||||
"messages": messages,
|
||||
"api_calls": api_call_count,
|
||||
"completed": False,
|
||||
"partial": True,
|
||||
"error": "Response remained truncated after 3 continuation attempts",
|
||||
}
|
||||
|
||||
# If we have prior messages, roll back to last complete state
|
||||
if len(messages) > 1:
|
||||
print(f"{self.log_prefix} ⏪ Rolling back to last complete assistant turn")
|
||||
rolled_back_messages = self._get_messages_up_to_last_assistant(messages)
|
||||
|
||||
|
||||
self._cleanup_task_resources(effective_task_id)
|
||||
self._persist_session(messages, conversation_history)
|
||||
|
||||
|
||||
return {
|
||||
"final_response": None,
|
||||
"messages": rolled_back_messages,
|
||||
@@ -3571,6 +3688,8 @@ class AIAgent:
|
||||
if thinking_spinner:
|
||||
thinking_spinner.stop("")
|
||||
thinking_spinner = None
|
||||
if self.thinking_callback:
|
||||
self.thinking_callback("")
|
||||
api_elapsed = time.time() - api_start_time
|
||||
print(f"{self.log_prefix}⚡ Interrupted during API call.")
|
||||
self._persist_session(messages, conversation_history)
|
||||
@@ -3583,6 +3702,8 @@ class AIAgent:
|
||||
if thinking_spinner:
|
||||
thinking_spinner.stop(f"(╥_╥) error, retrying...")
|
||||
thinking_spinner = None
|
||||
if self.thinking_callback:
|
||||
self.thinking_callback("")
|
||||
|
||||
status_code = getattr(api_error, "status_code", None)
|
||||
if (
|
||||
@@ -3665,7 +3786,8 @@ class AIAgent:
|
||||
if len(messages) < original_len:
|
||||
print(f"{self.log_prefix} 🗜️ Compressed {original_len} → {len(messages)} messages, retrying...")
|
||||
time.sleep(2) # Brief pause between compression retries
|
||||
continue # Retry with compressed messages
|
||||
restart_with_compressed_messages = True
|
||||
break
|
||||
else:
|
||||
print(f"{self.log_prefix}❌ Payload too large and cannot compress further.")
|
||||
logging.error(f"{self.log_prefix}413 payload too large. Cannot compress further.")
|
||||
@@ -3733,7 +3855,8 @@ class AIAgent:
|
||||
if len(messages) < original_len:
|
||||
print(f"{self.log_prefix} 🗜️ Compressed {original_len} → {len(messages)} messages, retrying...")
|
||||
time.sleep(2) # Brief pause between compression retries
|
||||
continue # Retry with compressed messages or new tier
|
||||
restart_with_compressed_messages = True
|
||||
break
|
||||
else:
|
||||
# Can't compress further and already at minimum tier
|
||||
print(f"{self.log_prefix}❌ Context length exceeded and cannot compress further.")
|
||||
@@ -3820,6 +3943,14 @@ class AIAgent:
|
||||
if interrupted:
|
||||
break
|
||||
|
||||
if restart_with_compressed_messages:
|
||||
api_call_count -= 1
|
||||
self.iteration_budget.refund()
|
||||
continue
|
||||
|
||||
if restart_with_length_continuation:
|
||||
continue
|
||||
|
||||
# Guard: if all retries exhausted without a successful response
|
||||
# (e.g. repeated context-length errors that exhausted retry_count),
|
||||
# the `response` variable is still None. Break out cleanly.
|
||||
@@ -3964,39 +4095,37 @@ class AIAgent:
|
||||
logging.debug(f"Tool call: {tc.function.name} with args: {tc.function.arguments[:200]}...")
|
||||
|
||||
# Validate tool call names - detect model hallucinations
|
||||
# Repair mismatched tool names before validating
|
||||
for tc in assistant_message.tool_calls:
|
||||
if tc.function.name not in self.valid_tool_names:
|
||||
repaired = self._repair_tool_call(tc.function.name)
|
||||
if repaired:
|
||||
print(f"{self.log_prefix}🔧 Auto-repaired tool name: '{tc.function.name}' -> '{repaired}'")
|
||||
tc.function.name = repaired
|
||||
invalid_tool_calls = [
|
||||
tc.function.name for tc in assistant_message.tool_calls
|
||||
tc.function.name for tc in assistant_message.tool_calls
|
||||
if tc.function.name not in self.valid_tool_names
|
||||
]
|
||||
|
||||
if invalid_tool_calls:
|
||||
# Track retries for invalid tool calls
|
||||
if not hasattr(self, '_invalid_tool_retries'):
|
||||
self._invalid_tool_retries = 0
|
||||
self._invalid_tool_retries += 1
|
||||
|
||||
invalid_preview = invalid_tool_calls[0][:80] + "..." if len(invalid_tool_calls[0]) > 80 else invalid_tool_calls[0]
|
||||
print(f"{self.log_prefix}⚠️ Invalid tool call detected: '{invalid_preview}'")
|
||||
print(f"{self.log_prefix} Valid tools: {sorted(self.valid_tool_names)}")
|
||||
|
||||
if self._invalid_tool_retries < 3:
|
||||
print(f"{self.log_prefix}🔄 Retrying API call ({self._invalid_tool_retries}/3)...")
|
||||
# Don't add anything to messages, just retry the API call
|
||||
continue
|
||||
else:
|
||||
print(f"{self.log_prefix}❌ Max retries (3) for invalid tool calls exceeded. Stopping as partial.")
|
||||
# Return partial result - don't include the bad tool call in messages
|
||||
self._invalid_tool_retries = 0
|
||||
self._persist_session(messages, conversation_history)
|
||||
return {
|
||||
"final_response": None,
|
||||
"messages": messages,
|
||||
"api_calls": api_call_count,
|
||||
"completed": False,
|
||||
"partial": True,
|
||||
"error": f"Model generated invalid tool call: {invalid_preview}"
|
||||
}
|
||||
|
||||
# Return helpful error to model — model can self-correct next turn
|
||||
available = ", ".join(sorted(self.valid_tool_names))
|
||||
invalid_name = invalid_tool_calls[0]
|
||||
invalid_preview = invalid_name[:80] + "..." if len(invalid_name) > 80 else invalid_name
|
||||
print(f"{self.log_prefix}⚠️ Unknown tool '{invalid_preview}' — sending error to model for self-correction")
|
||||
assistant_msg = self._build_assistant_message(assistant_message, finish_reason)
|
||||
messages.append(assistant_msg)
|
||||
self._log_msg_to_db(assistant_msg)
|
||||
for tc in assistant_message.tool_calls:
|
||||
if tc.function.name not in self.valid_tool_names:
|
||||
content = f"Tool '{tc.function.name}' does not exist. Available tools: {available}"
|
||||
else:
|
||||
content = f"Skipped: another tool call in this turn used an invalid name. Please retry this tool call."
|
||||
messages.append({
|
||||
"role": "tool",
|
||||
"tool_call_id": tc.id,
|
||||
"content": content,
|
||||
})
|
||||
continue
|
||||
# Reset retry counter on successful tool call validation
|
||||
if hasattr(self, '_invalid_tool_retries'):
|
||||
self._invalid_tool_retries = 0
|
||||
@@ -4210,6 +4339,9 @@ class AIAgent:
|
||||
continue
|
||||
|
||||
codex_ack_continuations = 0
|
||||
|
||||
if truncated_response_prefix:
|
||||
final_response = truncated_response_prefix + final_response
|
||||
|
||||
# Strip <think> blocks from user-facing response (keep raw in messages for trajectory)
|
||||
final_response = self._strip_think_blocks(final_response).strip()
|
||||
|
||||
@@ -0,0 +1,215 @@
|
||||
---
name: pokemon-player
description: Play Pokemon games autonomously via headless emulation. Starts a game server, reads structured game state from RAM, makes strategic decisions, and sends button inputs — all from the terminal.
tags: [gaming, pokemon, emulator, pyboy, gameplay, gameboy]
---
|
||||
# Pokemon Player
|
||||
|
||||
Play Pokemon games via headless emulation using the `pokemon-agent` package.
|
||||
|
||||
## When to Use
|
||||
- User says "play pokemon", "start pokemon", "pokemon game"
|
||||
- User asks about Pokemon Red, Blue, Yellow, FireRed, etc.
|
||||
- User wants to watch an AI play Pokemon
|
||||
- User references a ROM file (.gb, .gbc, .gba)
|
||||
|
||||
## Startup Procedure
|
||||
|
||||
### 1. First-time setup (clone, venv, install)
|
||||
The repo is NousResearch/pokemon-agent on GitHub. Clone it, then
|
||||
set up a Python 3.10+ virtual environment. Use uv (preferred for speed)
|
||||
to create the venv and install the package in editable mode with the
|
||||
pyboy extra. If uv is not available, fall back to python3 -m venv + pip.
|
||||
|
||||
On this machine it is already set up at /home/teknium/pokemon-agent
|
||||
with a venv ready — just cd there and source .venv/bin/activate.
|
||||
|
||||
You also need a ROM file. Ask the user for theirs. On this machine
|
||||
one exists at roms/pokemon_red.gb inside that directory.
|
||||
NEVER download or provide ROM files — always ask the user.
|
||||
|
||||
### 2. Start the game server
|
||||
From inside the pokemon-agent directory with the venv activated, run
|
||||
pokemon-agent serve with --rom pointing to the ROM and --port 9876.
|
||||
Run it in the background with &.
|
||||
To resume from a saved game, add --load-state with the save name.
|
||||
Wait 4 seconds for startup, then verify with GET /health.
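
Instead of a fixed sleep, the health check can be polled. The sketch below is one way to do that with the standard library; it assumes `/health` answers with HTTP 200 once the server is ready, and the helper name and the injectable `fetch` parameter are illustrative, not part of `pokemon-agent`.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url, attempts=10, delay=0.5, fetch=None):
    """Poll `url` until it answers with HTTP 200 or the attempts run out.

    `fetch` is an injectable callable (url -> status code) so the loop can
    be exercised without a running server.
    """
    def default_fetch(target):
        with urllib.request.urlopen(target, timeout=2) as resp:
            return resp.status

    fetch = fetch or default_fetch
    for attempt in range(1, attempts + 1):
        try:
            if fetch(url) == 200:
                return attempt  # number of tries it took
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; fall through to the sleep
        time.sleep(delay)
    return None


# Hypothetical usage once the server has been started on port 9876:
# wait_for_health("http://localhost:9876/health")
```

A `None` return means the server never became healthy, which is a good moment to check the server's output before sending any actions.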
|
||||
|
||||
### 3. Set up live dashboard for user to watch
|
||||
Use an SSH reverse tunnel via localhost.run so the user can view
|
||||
the dashboard in their browser. Connect with ssh, forwarding local
|
||||
port 9876 to remote port 80 on nokey@localhost.run. Redirect output
|
||||
to a log file, wait 10 seconds, then grep the log for the .lhr.life
|
||||
URL. Give the user the URL with /dashboard/ appended.
|
||||
The tunnel URL changes each time — give the user the new one if restarted.
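
The "grep the log" step can also be done in Python. This is a minimal sketch that assumes the tunnel log contains an `https://` URL on a `.lhr.life` host; the exact log wording is not specified here, so the pattern only looks for the URL itself.

```python
import re

# Assumption: the tunnel log contains an https URL on a *.lhr.life host.
LHR_URL = re.compile(r"https://[A-Za-z0-9.-]+\.lhr\.life")


def dashboard_url(log_text):
    """Return the first .lhr.life URL in the log with /dashboard/ appended."""
    match = LHR_URL.search(log_text)
    if match is None:
        return None  # tunnel not ready yet, or the log format differs
    return match.group(0) + "/dashboard/"
```

If this returns `None` after the 10 second wait, re-read the log file a little later rather than guessing a URL.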
|
||||
|
||||
## Save and Load
|
||||
|
||||
### When to save
|
||||
- Every 15-20 turns of gameplay
|
||||
- ALWAYS before gym battles, rival encounters, or risky fights
|
||||
- Before entering a new town or dungeon
|
||||
- Before any action you are unsure about
|
||||
|
||||
### How to save
|
||||
POST /save with a descriptive name. Good examples:
|
||||
before_brock, route1_start, mt_moon_entrance, got_cut
|
||||
|
||||
### How to load
|
||||
POST /load with the save name.
|
||||
|
||||
### List available saves
|
||||
GET /saves returns all saved states.
|
||||
|
||||
### Loading on server startup
|
||||
Use --load-state flag when starting the server to auto-load a save.
|
||||
This is faster than loading via the API after startup.
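
As a rough sketch of the save call, the helper below builds (but does not send) a `POST /save` request. The JSON body shape `{"name": ...}` is an assumption, since only the endpoint and the idea of a descriptive name are given above; check the server's actual request format before relying on it.

```python
import json
import re
import urllib.request

BASE_URL = "http://localhost:9876"  # port from the startup step


def save_request(name, base_url=BASE_URL):
    """Build (but do not send) a POST /save request for a named save state."""
    # Keep names in the lowercase_with_underscores style of the examples above.
    if not re.fullmatch(r"[a-z0-9_]+", name):
        raise ValueError(f"save name should look like 'before_brock', got {name!r}")
    body = json.dumps({"name": name}).encode("utf-8")  # body shape is an assumption
    return urllib.request.Request(
        f"{base_url}/save",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; validating the name first keeps the save list easy to scan later.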
|
||||
|
||||
## The Gameplay Loop
|
||||
|
||||
### Step 1: OBSERVE — check state AND take a screenshot
|
||||
GET /state for position, HP, battle, dialog.
|
||||
GET /screenshot and save to /tmp/pokemon.png, then use vision_analyze.
|
||||
Always do BOTH — RAM state gives numbers, vision gives spatial awareness.
|
||||
|
||||
### Step 2: ORIENT
|
||||
- Dialog/text on screen → advance it
|
||||
- In battle → fight or run
|
||||
- Party hurt → head to Pokemon Center
|
||||
- Near objective → navigate carefully
|
||||
|
||||
### Step 3: DECIDE
|
||||
Priority: dialog > battle > heal > story objective > training > explore
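
The priority order can be written down as a small function. This is only an illustration: the state keys (`dialog_active`, `in_battle`, `party_hp_fraction`, `objective_known`, `wants_training`) and the 30% heal threshold are hypothetical and would need to be mapped onto whatever `GET /state` actually returns.

```python
# Priority order from Step 3; earlier entries win.
PRIORITY = ["dialog", "battle", "heal", "story", "training", "explore"]


def choose_mode(state, heal_below=0.3):
    """Pick the highest-priority mode that applies to a state snapshot.

    All keys read from `state` are hypothetical placeholders.
    """
    applicable = {
        "dialog": state.get("dialog_active", False),
        "battle": state.get("in_battle", False),
        "heal": state.get("party_hp_fraction", 1.0) < heal_below,
        "story": state.get("objective_known", False),
        "training": state.get("wants_training", False),
        "explore": True,  # always available as the fallback
    }
    return next(mode for mode in PRIORITY if applicable[mode])
```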
|
||||
|
||||
### Step 4: ACT — move 2-4 steps max, then re-check
|
||||
POST /action with a SHORT action list (2-4 actions, not 10-15).
|
||||
|
||||
### Step 5: VERIFY — screenshot after every move sequence
|
||||
Take a screenshot and use vision_analyze to confirm you moved where
|
||||
intended. This is the MOST IMPORTANT step. Without vision you WILL get lost.
|
||||
|
||||
### Step 6: RECORD progress to memory with PKM: prefix
|
||||
|
||||
### Step 7: SAVE periodically
|
||||
|
||||
## Action Reference
|
||||
- press_a — confirm, talk, select
- press_b — cancel, close menu
- press_start — open game menu
- walk_up/down/left/right — move one tile
- hold_b_N — hold B for N frames (use for speeding through text)
- wait_60 — wait about 1 second (60 frames)
- a_until_dialog_end — press A repeatedly until dialog clears
|
||||
|
||||
## Critical Tips from Experience
|
||||
|
||||
### USE VISION CONSTANTLY
|
||||
- Take a screenshot every 2-4 movement steps
|
||||
- The RAM state tells you position and HP but NOT what is around you
|
||||
- Ledges, fences, signs, building doors, NPCs — only visible via screenshot
|
||||
- Ask the vision model specific questions: "what is one tile north of me?"
|
||||
- When stuck, always screenshot before trying random directions
|
||||
|
||||
### Warp Transitions Need Extra Wait Time
|
||||
When walking through a door or stairs, the screen fades to black during
the map transition. You MUST wait for it to complete. Add 2-3 wait_60
actions after any door/stair warp. Without waiting, the position you read
is stale and you will think you are still on the old map.
|
||||
|
||||
### Building Exit Trap
|
||||
When you exit a building, you appear directly IN FRONT of the door.
|
||||
If you walk north, you go right back inside. ALWAYS sidestep first
|
||||
by walking left or right 2 tiles, then proceed in your intended direction.
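
The two rules above (wait after a warp, sidestep after leaving a building) can be captured as small action-list helpers. The action names come from the Action Reference; the helper names are illustrative, and how the list is wrapped in the `POST /action` body is left out because the body format is not specified here.

```python
def after_warp(actions, waits=2):
    """Append wait_60 actions so the map transition finishes before the next read."""
    if not 2 <= waits <= 3:
        raise ValueError("the guidance above suggests 2-3 waits")
    return list(actions) + ["wait_60"] * waits


def leave_building(sidestep="left", then="walk_up", steps=2):
    """Sidestep after exiting so that walking north does not re-enter the door."""
    if sidestep not in ("left", "right"):
        raise ValueError("sidestep must be 'left' or 'right'")
    return [f"walk_{sidestep}"] * steps + [then]
```

Both helpers keep each batch short, which fits the 2-4 action limit before the next screenshot.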
|
||||
|
||||
### Dialog Handling
|
||||
Gen 1 text scrolls slowly, letter by letter. To speed through dialog,
hold B for 120 frames so the text displays at max speed, then press A
to advance to the next line. Repeat as needed.
The a_until_dialog_end action checks the RAM dialog flag, but this flag
does not catch ALL text states. If dialog seems stuck, use the manual
hold_b + press_a pattern instead and verify via screenshot.
|
||||
|
||||
### Ledges Are One-Way
|
||||
Ledges (small cliff edges) can only be jumped DOWN (south), never climbed
|
||||
UP (north). If blocked by a ledge going north, you must go left or right
|
||||
to find the gap around it. Use vision to identify which direction the
|
||||
gap is. Ask the vision model explicitly.
|
||||
|
||||
### Navigation Strategy
|
||||
- Move 2-4 steps at a time, then screenshot to check position
|
||||
- When entering a new area, screenshot immediately to orient
|
||||
- Ask the vision model "which direction to [destination]?"
|
||||
- If stuck for 3+ attempts, screenshot and re-evaluate completely
|
||||
- Do not spam 10-15 movements — you will overshoot or get stuck
|
||||
|
||||
### Running from Wild Battles
|
||||
On the battle menu, RUN is bottom-right. To reach it from the default
|
||||
cursor position (FIGHT, top-left): press down then right to move cursor
|
||||
to RUN, then press A. Wrap with hold_b to speed through text/animations.
|
||||
|
||||
### Battling (FIGHT)
|
||||
On the battle menu FIGHT is top-left (default cursor position).
|
||||
Press A to enter move selection, A again to use the first move.
|
||||
Then hold B to speed through attack animations and text.
|
||||
|
||||
## Battle Strategy
|
||||
|
||||
### Decision Tree
|
||||
1. Want to catch? → Weaken then throw Poke Ball
|
||||
2. Wild you don't need? → RUN
|
||||
3. Type advantage? → Use super-effective move
|
||||
4. No advantage? → Use strongest STAB move
|
||||
5. Low HP? → Switch or use Potion
|
||||
|
||||
### Gen 1 Type Chart (key matchups)
|
||||
- Water beats Fire, Ground, Rock
- Fire beats Grass, Bug, Ice
- Grass beats Water, Ground, Rock
- Electric beats Water, Flying
- Ground beats Fire, Electric, Rock, Poison
- Psychic beats Fighting, Poison (dominant in Gen 1!)
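
For quick lookups during battle, the matchups listed above can be kept in a dictionary. This sketch covers only those listed entries, not the full Gen 1 chart, and the function name is illustrative.

```python
# Only the matchups listed above; this is not the full Gen 1 chart.
SUPER_EFFECTIVE = {
    "Water": {"Fire", "Ground", "Rock"},
    "Fire": {"Grass", "Bug", "Ice"},
    "Grass": {"Water", "Ground", "Rock"},
    "Electric": {"Water", "Flying"},
    "Ground": {"Fire", "Electric", "Rock", "Poison"},
    "Psychic": {"Fighting", "Poison"},
}


def attackers_for(defender_type):
    """Return the listed attacking types that are super effective against a type."""
    return sorted(atk for atk, beats in SUPER_EFFECTIVE.items() if defender_type in beats)
```

An empty result means the list above has no entry for that defender, not that nothing is effective against it.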
|
||||
|
||||
### Gen 1 Quirks
|
||||
- Special stat = both offense AND defense for special moves
- Psychic type is overpowered (a bug makes Ghost moves have no effect on Psychic types)
- Critical hit rate is based on the base Speed stat
- Wrap/Bind prevent the opponent from acting
- Focus Energy bug: REDUCES crit rate instead of raising it
|
||||
|
||||
## Memory Conventions
|
||||
| Prefix | Purpose | Example |
|--------|---------|---------|
| PKM:OBJECTIVE | Current goal | Get Parcel from Viridian Mart |
| PKM:MAP | Navigation knowledge | Viridian: mart is northeast |
| PKM:STRATEGY | Battle/team plans | Need Grass type before Misty |
| PKM:PROGRESS | Milestone tracker | Beat rival, heading to Viridian |
| PKM:STUCK | Stuck situations | Ledge at y=28 go right to bypass |
| PKM:TEAM | Team notes | Squirtle Lv6, Tackle + Tail Whip |
|
||||
|
||||
## Progression Milestones
|
||||
- Choose starter
|
||||
- Deliver Parcel from Viridian Mart, receive Pokedex
|
||||
- Boulder Badge — Brock (Rock) → use Water/Grass
|
||||
- Cascade Badge — Misty (Water) → use Grass/Electric
|
||||
- Thunder Badge — Lt. Surge (Electric) → use Ground
|
||||
- Rainbow Badge — Erika (Grass) → use Fire/Ice/Flying
|
||||
- Soul Badge — Koga (Poison) → use Ground/Psychic
|
||||
- Marsh Badge — Sabrina (Psychic) → hardest gym
|
||||
- Volcano Badge — Blaine (Fire) → use Water/Ground
|
||||
- Earth Badge — Giovanni (Ground) → use Water/Grass/Ice
|
||||
- Elite Four → Champion!
|
||||
|
||||
## Stopping Play
|
||||
1. Save the game with a descriptive name via POST /save
|
||||
2. Update memory with PKM:PROGRESS
|
||||
3. Tell user: "Game saved as [name]! Say 'play pokemon' to resume."
|
||||
4. Kill the server and tunnel background processes
|
||||
|
||||
## Pitfalls
|
||||
- NEVER download or provide ROM files
|
||||
- Do NOT send more than 4-5 actions without checking vision
|
||||
- Always sidestep after exiting buildings before going north
|
||||
- Always add wait_60 x2-3 after door/stair warps
|
||||
- Dialog detection via RAM is unreliable — verify with screenshots
|
||||
- Save BEFORE risky encounters
|
||||
- The tunnel URL changes each time you restart it
|
||||
@@ -1098,7 +1098,7 @@ Please see the ocifs docs.
|
||||
|
||||
The path should start with https://.
|
||||
|
||||
This must be publically accessible.
|
||||
This must be publicly accessible.
|
||||
|
||||
Now that you know how to load datasets, you can learn more on how to load your specific dataset format into your target output format dataset formats docs.
|
||||
|
||||
|
||||
@@ -0,0 +1,302 @@
|
||||
---
name: hermes-atropos-environments
description: Build, test, and debug Hermes Agent RL environments for Atropos training. Covers the HermesAgentBaseEnv interface, reward functions, agent loop integration, evaluation with tools, wandb logging, and the three CLI modes (serve/process/evaluate). Use when creating, reviewing, or fixing RL environments in the hermes-agent repo.
version: 1.1.0
author: Hermes Agent
license: MIT
metadata:
  hermes:
    tags: [atropos, rl, environments, training, reinforcement-learning, reward-functions]
    related_skills: [axolotl, grpo-rl-training, trl-fine-tuning, lm-evaluation-harness]
---
|
||||
|
||||
# Hermes Agent Atropos Environments
|
||||
|
||||
Guide for building RL environments in the hermes-agent repo that integrate with the Atropos training framework.
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
```
Atropos BaseEnv (atroposlib/envs/base.py)
    └── HermesAgentBaseEnv (environments/hermes_base_env.py)
            ├── Handles agent loop orchestration
            ├── Handles tool resolution per group
            ├── Handles ToolContext for reward verification
            └── YOUR ENVIRONMENT (environments/your_env.py)
                    Only implements: setup, get_next_item, format_prompt,
                                     compute_reward, evaluate, wandb_log
```
|
||||
|
||||
Hermes environments are special because they run a **multi-turn agent loop with tool calling** — not just single-turn completions. The base env handles the loop; you implement the task and scoring.
|
||||
|
||||
## File Locations
|
||||
|
||||
| File | Purpose |
|------|---------|
| `environments/hermes_base_env.py` | Base class with agent loop + tool resolution |
| `environments/agent_loop.py` | `HermesAgentLoop` + `AgentResult` dataclass |
| `environments/tool_context.py` | `ToolContext` for reward verification |
| `environments/tool_call_parsers.py` | Phase 2 tool call parsers (hermes, mistral, etc.) |
| `environments/your_env.py` | Your environment implementation |
|
||||
|
||||
## Inference Setup — Ask the User First
|
||||
|
||||
**IMPORTANT:** Before running any test, evaluation, or data generation command, always ask the user how they want to handle inference. Do NOT assume OpenRouter or any specific endpoint. Present these options:
|
||||
|
||||
1. **OpenRouter** — Ask which model they want to use (e.g., `anthropic/claude-sonnet-4.5`, `google/gemini-2.5-pro`, `meta-llama/llama-3.3-70b-instruct`). Requires `OPENROUTER_API_KEY` in the environment.
2. **Self-hosted vLLM endpoint** — Ask for their base URL (e.g., `http://localhost:8000/v1`) and model name. Set `--openai.server_type vllm`.
3. **Other OpenAI-compatible API** — Ask for the base URL, model name, and any required API key. Set `--openai.server_type openai` and `--openai.health_check false`.
4. **Local Atropos training server** — For `serve` mode with a live training loop. Default `http://localhost:8000/v1`.
|
||||
|
||||
Once the user tells you their setup, use those values in all CLI commands for that session. Example prompt:

> "Before I run this, how would you like to handle inference?
> 1. OpenRouter (I'll need your preferred model, e.g. claude-sonnet-4.5)
> 2. A self-hosted vLLM endpoint (give me the URL and model name)
> 3. Another OpenAI-compatible API (give me the URL, model, and any auth details)
> 4. Local Atropos training server (serve mode)"
|
||||
|
||||
### Key flags by provider:
|
||||
|
||||
| Provider | `--openai.server_type` | `--openai.health_check` | `--openai.api_key` |
|----------|----------------------|------------------------|-------------------|
| OpenRouter | `openai` | `false` | `$OPENROUTER_API_KEY` |
| vLLM (self-hosted) | `vllm` | (default) | (not needed) |
| Other OpenAI-compatible | `openai` | `false` | As needed |
| Local Atropos | (default) | (default) | (not needed) |
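
To avoid retyping these flags, the table can be encoded once and turned into an argument list. This is a sketch only: the dictionary keys and the `provider_args` helper are illustrative names, and it produces just the three `--openai.*` flags from the table, not a complete command line.

```python
# Flag values copied from the table above; None means "leave at the default".
PROVIDER_FLAGS = {
    "openrouter": {"server_type": "openai", "health_check": "false", "needs_key": True},
    "vllm": {"server_type": "vllm", "health_check": None, "needs_key": False},
    "openai_compatible": {"server_type": "openai", "health_check": "false", "needs_key": False},
    "atropos_local": {"server_type": None, "health_check": None, "needs_key": False},
}


def provider_args(provider, api_key=None):
    """Return the --openai.* CLI arguments for one of the four setups."""
    spec = PROVIDER_FLAGS[provider]
    if spec["needs_key"] and not api_key:
        raise ValueError(f"{provider} requires an API key")
    args = []
    if spec["server_type"]:
        args += ["--openai.server_type", spec["server_type"]]
    if spec["health_check"]:
        args += ["--openai.health_check", spec["health_check"]]
    if api_key:
        args += ["--openai.api_key", api_key]
    return args
```

The base URL and model name still have to come from the user, as described above.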
|
||||
|
||||
## Required Methods
|
||||
|
||||
### 1. `setup()` — Load dataset and initialize state
|
||||
|
||||
```python
async def setup(self) -> None:
    """Called once at startup. Load datasets, initialize state."""
    # Try HuggingFace first, fallback to built-in samples
    try:
        from datasets import load_dataset
        ds = load_dataset("your/dataset", split="test")
        self._items = [...]
    except Exception:
        self._items = BUILTIN_SAMPLES

    # Always split into train/eval
    random.shuffle(self._items)
    eval_size = max(20, int(len(self._items) * 0.1))
    self._eval_items = self._items[:eval_size]
    self._items = self._items[eval_size:]
```
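
One thing to watch in the split above: with `max(20, ...)`, a dataset of 20 or fewer items (for example a small built-in fallback set) puts everything into the eval split and leaves the training list empty, and `self._index % len(self._items)` in `get_next_item()` then raises `ZeroDivisionError`. A possible guard, written as a standalone helper with an illustrative name and parameters, might look like this:

```python
import random


def split_train_eval(items, eval_fraction=0.1, min_eval=20, seed=None):
    """Shuffle and split items, never letting the eval set swallow everything."""
    if len(items) < 2:
        raise ValueError("need at least two items to split")
    shuffled = list(items)  # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)
    eval_size = max(min_eval, int(len(shuffled) * eval_fraction))
    # Cap so at least one training item remains; small fallback sample sets
    # would otherwise leave the training list empty.
    eval_size = min(eval_size, len(shuffled) - 1)
    return shuffled[eval_size:], shuffled[:eval_size]
```

Inside `setup()` this would be used as `self._items, self._eval_items = split_train_eval(self._items)`.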
|
||||
|
||||
### 2. `get_next_item()` — Return next training item
|
||||
|
||||
```python
async def get_next_item(self) -> dict:
    """Return next item, cycling through dataset."""
    item = self._items[self._index % len(self._items)]
    self._index += 1
    return item
```
|
||||
|
||||
### 3. `format_prompt(item)` — Convert item to user message
|
||||
|
||||
```python
def format_prompt(self, item: dict) -> str:
    """Convert a dataset item into the user-facing prompt."""
    return f"Research this question: {item['question']}"
```
|
||||
|
||||
### 4. `compute_reward(item, result, ctx)` — Score the rollout
|
||||
|
||||
**CRITICAL**: `result` is an `AgentResult`, NOT a dict. It has these attributes:
- `result.messages` — List of message dicts (OpenAI format)
- `result.turns_used` — Number of LLM calls made
- `result.finished_naturally` — True if model stopped voluntarily
- `result.tool_errors` — List of ToolError objects

**AgentResult does NOT have**: `final_response`, `tool_calls`, `tools_used`.
You must extract these from `result.messages`:
|
||||
|
||||
```python
async def compute_reward(self, item, result: AgentResult, ctx: ToolContext) -> float:
    # Extract final response (last assistant message with content)
    final_response = ""
    tools_used = []
    for msg in reversed(result.messages):
        if msg.get("role") == "assistant" and msg.get("content") and not final_response:
            final_response = msg["content"]
        if msg.get("role") == "assistant" and msg.get("tool_calls"):
            for tc in msg["tool_calls"]:
                fn = tc.get("function", {}) if isinstance(tc, dict) else {}
                name = fn.get("name", "")
                if name:
                    tools_used.append(name)

    # Score using LLM judge, heuristic, or ToolContext verification
    correctness = await self._llm_judge(item, final_response)
    return correctness
```
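
The extraction step can be pulled out into a plain function so it is easy to unit test. The sketch below is a variant, not the code above: it walks the messages forward so tool names come out in call order, and it also accepts attribute-style tool call objects, which is an assumption about what may appear in `result.messages`. The helper name is illustrative.

```python
def summarize_messages(messages):
    """Return (final_response, tools_used) from OpenAI-format message dicts."""
    final_response = ""
    tools_used = []
    for msg in messages:  # forward pass keeps tool names in call order
        if msg.get("role") != "assistant":
            continue
        if msg.get("content"):
            final_response = msg["content"]  # last one seen wins
        for tc in msg.get("tool_calls") or []:
            # Accept dict-shaped calls and attribute-style objects alike.
            fn = tc.get("function", {}) if isinstance(tc, dict) else getattr(tc, "function", None)
            name = fn.get("name", "") if isinstance(fn, dict) else getattr(fn, "name", "")
            if name:
                tools_used.append(name)
    return final_response, tools_used
```

`compute_reward` could then start with `final_response, tools_used = summarize_messages(result.messages)`.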
|
||||
|
||||
`ctx` (ToolContext) gives you terminal/file access to the agent's sandbox for verification:
|
||||
```python
# Run tests in the agent's sandbox
result = ctx.terminal("pytest /workspace/test.py")
return 1.0 if result["exit_code"] == 0 else 0.0
```
|
||||
|
||||
### 5. `evaluate()` — Periodic evaluation with full agent loop
|
||||
|
||||
**MUST use the full agent loop with tools**, not single-turn chat_completion.
|
||||
The whole point of hermes-agent environments is agentic evaluation:
|
||||
|
||||
```python
async def evaluate(self, *args, **kwargs) -> None:
    import time, uuid
    from environments.agent_loop import HermesAgentLoop
    from environments.tool_context import ToolContext

    start_time = time.time()
    tools, valid_names = self._resolve_tools_for_group()
    samples = []

    for item in self._eval_items[:self.config.eval_size]:
        task_id = str(uuid.uuid4())
        messages = []
        if self.config.system_prompt:
            messages.append({"role": "system", "content": self.config.system_prompt})
        messages.append({"role": "user", "content": self.format_prompt(item)})

        agent = HermesAgentLoop(
            server=self.server,
            tool_schemas=tools,
            valid_tool_names=valid_names,
            max_turns=self.config.max_agent_turns,
            task_id=task_id,
            temperature=0.0,  # Deterministic for eval
            max_tokens=self.config.max_token_length,
            extra_body=self.config.extra_body,
        )
        result = await agent.run(messages)

        ctx = ToolContext(task_id)
        try:
            reward = await self.compute_reward(item, result, ctx)
        finally:
            ctx.cleanup()

        samples.append({"prompt": ..., "response": ..., "reward": reward})

    eval_metrics = {"eval/mean_reward": ...}
    await self.evaluate_log(metrics=eval_metrics, samples=samples,
                            start_time=start_time, end_time=time.time())
```
|
||||
|
||||
### 6. `wandb_log()` — Custom metrics logging
|
||||
|
||||
Always call `super().wandb_log()` at the end:
|
||||
|
||||
```python
async def wandb_log(self, wandb_metrics=None):
    if wandb_metrics is None:
        wandb_metrics = {}
    if self._reward_buffer:
        n = len(self._reward_buffer)
        wandb_metrics["train/mean_reward"] = sum(self._reward_buffer) / n
        self._reward_buffer.clear()
    await super().wandb_log(wandb_metrics)  # MUST call super
```
|
||||
|
||||
**Pitfall**: `compute_reward` appends to metric buffers, so calling it from `evaluate()` mixes eval scores into the training metrics. Record each buffer's length before eval and truncate back to that length afterwards.
|
||||
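One way to implement this rollback is to snapshot each buffer's length before evaluation and truncate afterwards. The sketch below is a minimal illustration, assuming the metric buffers are plain Python lists; `rollback_buffers` and `reward_buffer` are hypothetical names, not part of the hermes-agent or Atropos API.

```python
from contextlib import contextmanager


@contextmanager
def rollback_buffers(*buffers):
    """Drop any entries appended to the given lists inside the with-block."""
    lengths = [len(b) for b in buffers]  # snapshot sizes before eval
    try:
        yield
    finally:
        for buf, n in zip(buffers, lengths):
            del buf[n:]  # truncate back to the pre-eval size


# Hypothetical usage: reward_buffer stands in for self._reward_buffer
reward_buffer = [0.5, 0.7]
with rollback_buffers(reward_buffer):
    reward_buffer.append(1.0)  # what compute_reward would do during eval
print(reward_buffer)
```

Inside `evaluate()`, the `with` block would wrap the loop that calls `compute_reward`, so the truncation also happens if a rollout raises.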
|
||||
## Config Class
|
||||
|
||||
Always create a custom config subclass with Pydantic Field descriptors. Key inherited fields you can tune: `enabled_toolsets`, `max_agent_turns`, `agent_temperature`, `system_prompt`, `terminal_backend`, `group_size`, `steps_per_eval`, `total_steps`.
|
||||
|
||||
## config_init() — Default Configuration
|
||||
|
||||
A classmethod that returns `(YourEnvConfig, [APIServerConfig(...)])`. Set `server_type` to `"openai"` for OpenRouter and other external APIs, and load the API key from an environment variable.
|
||||
|
||||
## Three CLI Modes
|
||||
|
||||
```bash
# SERVE: Full training loop (connects to Atropos API server)
python environments/my_env.py serve --openai.base_url http://localhost:8000/v1

# PROCESS: Offline data generation (saves JSONL)
python environments/my_env.py process --env.total_steps 10 --env.group_size 1 \
    --env.use_wandb false --env.data_path_to_save_groups output.jsonl \
    --openai.base_url "<USER_BASE_URL>" \
    --openai.model_name "<USER_MODEL>" \
    --openai.server_type <USER_SERVER_TYPE> --openai.health_check false

# EVALUATE: Standalone eval (runs setup + evaluate only)
python environments/my_env.py evaluate --env.eval_size 20 \
    --env.data_dir_to_save_evals /tmp/eval_results \
    --openai.base_url "<USER_BASE_URL>" \
    --openai.model_name "<USER_MODEL>" \
    --openai.server_type <USER_SERVER_TYPE> --openai.health_check false
```
|
||||
|
||||
Config priority: CLI args > YAML file > config_init() defaults.
|
||||
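The precedence rule can be pictured as a layered merge in which later layers overwrite earlier ones. This is only an illustration of the ordering, not how Atropos actually loads its configuration; `resolve_config` and the example values are hypothetical.

```python
def resolve_config(defaults, yaml_values, cli_args):
    """Merge three config layers; later layers win, None means 'not set'."""
    merged = dict(defaults)
    for layer in (yaml_values, cli_args):
        merged.update({k: v for k, v in layer.items() if v is not None})
    return merged


# Hypothetical values for illustration only
defaults = {"group_size": 4, "eval_size": 20, "use_wandb": True}
yaml_values = {"group_size": 8}
cli_args = {"eval_size": 5, "use_wandb": None}  # use_wandb not passed on the CLI

config = resolve_config(defaults, yaml_values, cli_args)
print(config)
```

Here `group_size` comes from the YAML layer, `eval_size` from the CLI, and `use_wandb` falls through to the default.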
|
||||
## Common Pitfalls
|
||||
|
||||
1. **AgentResult has .messages, not .final_response** — Extract the final response by iterating reversed(result.messages) looking for the last assistant message with content.
|
||||
|
||||
2. **evaluate() must use HermesAgentLoop, not chat_completion** — Single-turn chat_completion has no tools. The whole point of hermes-agent benchmarks is agentic evaluation with tool use.
|
||||
|
||||
3. **Don't call _llm_judge twice** — If compute_reward already calls it, extract the score from the buffer instead of calling judge separately in evaluate().
|
||||
|
||||
4. **Eval pollutes training buffers** — compute_reward appends to metric buffers. During eval, roll back buffer entries to keep training metrics clean.
|
||||
|
||||
5. **Always set health_check=false for OpenRouter** — OpenRouter has no /health endpoint.
|
||||
|
||||
6. **Set data_dir_to_save_evals in evaluate mode** — Without it, results aren't saved.
|
||||
|
||||
7. **default_toolsets class variable vs enabled_toolsets config** — The class variable is a hint; the config field is what actually controls tool resolution.
|
||||
|
||||
8. **Tool call parsing in messages** — Tool calls are dicts with `{"function": {"name": ..., "arguments": ...}}`. Always check `isinstance(tc, dict)`.
|
||||
|
||||
9. **ToolContext.cleanup()** — Always call in a finally block to release sandbox resources.
|
||||
|
||||
10. **server_type must be "openai" for external APIs** — Without it, Atropos assumes a local VLLM server.
|
||||
|
||||
11. **Always ask the user for their inference setup** — Never hardcode or assume a specific provider/model. See the "Inference Setup" section above.
|
||||
|
||||
## Reward Function Patterns
|
||||
|
||||
### LLM Judge (for open-ended tasks)
|
||||
Use `self.server.chat_completion()` with a scoring prompt and parse the JSON response for a float score. Always include a heuristic fallback (for example, keyword overlap) for when the judge call fails or returns unparseable output.
|
||||
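The parse-then-fall-back step might look like the sketch below. It assumes the scoring prompt asks the judge to reply with a JSON object of the form `{"score": <float>}`; that response shape, and the helper names `parse_judge_score` and `keyword_overlap`, are assumptions made for illustration.

```python
import json
import re


def keyword_overlap(reference: str, response: str) -> float:
    """Fraction of reference words that also appear in the response."""
    ref_words = set(re.findall(r"\w+", reference.lower()))
    resp_words = set(re.findall(r"\w+", response.lower()))
    if not ref_words:
        return 0.0
    return len(ref_words & resp_words) / len(ref_words)


def parse_judge_score(judge_text, reference: str, response: str) -> float:
    """Parse {"score": <float>} from the judge output, else use the heuristic."""
    try:
        score = float(json.loads(judge_text)["score"])
    except (ValueError, KeyError, TypeError):
        # Judge output was missing or malformed: fall back to keyword overlap
        score = keyword_overlap(reference, response)
    return max(0.0, min(1.0, score))  # clamp to [0, 1]


print(parse_judge_score('{"score": 0.8}', "paris france", "The capital is Paris"))
print(parse_judge_score("not json", "paris france", "The capital is Paris"))
```

The first call uses the judge's score; the second falls back to the overlap heuristic because the judge text is not valid JSON.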
|
||||
### Binary Verification (for code/terminal tasks)
|
||||
Use `ctx.terminal("pytest test.py -q")` to run tests in the agent's sandbox. Return 1.0 for pass, 0.0 for fail.
|
||||
|
||||
### Multi-Signal (combine multiple indicators)
|
||||
Weight correctness (0.6) + tool usage (0.2) + efficiency (0.2) + optional bonuses. Clamp to [0, 1].
|
||||
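A minimal sketch of the weighted combination, using the weights listed above. How tool usage and efficiency are measured is an assumption here (a binary "used any tool" signal and a linear penalty on turns); `multi_signal_reward` is a hypothetical helper.

```python
def multi_signal_reward(correctness, used_tools, turns_used, max_turns, bonus=0.0):
    """Weighted sum of correctness, tool usage, and efficiency, clamped to [0, 1]."""
    tool_score = 1.0 if used_tools else 0.0
    if max_turns > 0:
        # Fewer turns is better: 1.0 at zero turns, 0.0 at the turn limit
        efficiency = 1.0 - min(turns_used, max_turns) / max_turns
    else:
        efficiency = 0.0
    raw = 0.6 * correctness + 0.2 * tool_score + 0.2 * efficiency + bonus
    return max(0.0, min(1.0, raw))


print(multi_signal_reward(1.0, used_tools=True, turns_used=5, max_turns=10))
```

The clamp matters once bonuses are added, since the weighted terms alone already sum to at most 1.0.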
|
||||
## Testing Your Environment
|
||||
|
||||
1. **Import test**: `python -c "from environments.my_env import MyEnv; print('OK')"`
2. **Ask the user for inference setup** (see "Inference Setup" section above)
3. **Process mode** (1 item): Verify JSONL output has valid tokens, masks, scores
4. **Evaluate mode**: Verify full agent loop runs with tools, metrics logged correctly
5. **Check reward range**: Scores should be in [0, 1], not all identical
|
||||
|
||||
## Minimum Implementation Checklist
|
||||
|
||||
```python
class MyEnv(HermesAgentBaseEnv):
    name = "my-env"
    env_config_cls = MyEnvConfig

    @classmethod
    def config_init(cls): ...  # Default server + env config
    async def setup(self): ...  # Load dataset + train/eval split
    async def get_next_item(self): ...  # Cycle through training items
    def format_prompt(self, item): ...  # Item → user message string
    async def compute_reward(self, item, result, ctx): ...  # Score rollout
    async def evaluate(self, *args, **kwargs): ...  # Full agent loop eval
    async def wandb_log(self, metrics=None): ...  # Custom metrics + super()

if __name__ == "__main__":
    MyEnv.cli()
```
|
||||
@@ -0,0 +1,59 @@
|
||||
# AgentResult Fields Reference
|
||||
|
||||
`AgentResult` is defined in `environments/agent_loop.py` as a dataclass.
|
||||
|
||||
## Fields
|
||||
|
||||
| Field | Type | Description |
|-------|------|-------------|
| `messages` | `List[Dict[str, Any]]` | Full conversation history in OpenAI message format |
| `managed_state` | `Optional[Dict]` | ManagedServer.get_state() if Phase 2, else None |
| `turns_used` | `int` | Number of LLM calls made during the loop |
| `finished_naturally` | `bool` | True if model stopped calling tools on its own |
| `reasoning_per_turn` | `List[Optional[str]]` | Extracted reasoning content per turn |
| `tool_errors` | `List[ToolError]` | Tool errors encountered during the loop |
|
||||
|
||||
## ToolError Fields
|
||||
|
||||
| Field | Type | Description |
|-------|------|-------------|
| `turn` | `int` | Turn in which the error occurred |
| `tool_name` | `str` | Name of the tool that failed |
| `arguments` | `str` | Arguments passed to the tool |
| `error` | `str` | Error message |
| `tool_result` | `str` | The result returned to the model |
|
||||
|
||||
## Extracting Data from Messages
|
||||
|
||||
Messages follow OpenAI format. Common patterns:
|
||||
|
||||
```python
# Get final assistant response
final_response = ""  # stays empty if no assistant message has content
for msg in reversed(result.messages):
    if msg.get("role") == "assistant" and msg.get("content"):
        final_response = msg["content"]
        break

# Get all tool names used
tools = []
for msg in result.messages:
    if msg.get("role") == "assistant" and msg.get("tool_calls"):
        for tc in msg["tool_calls"]:
            fn = tc.get("function", {}) if isinstance(tc, dict) else {}
            tools.append(fn.get("name", ""))

# Get tool results
for msg in result.messages:
    if msg.get("role") == "tool":
        tool_output = msg.get("content", "")
        call_id = msg.get("tool_call_id", "")
```
|
||||
|
||||
## Fields that DO NOT EXIST
|
||||
|
||||
These are common mistakes — AgentResult does NOT have:
|
||||
- `final_response` — extract from messages
|
||||
- `tool_calls` — extract from messages
|
||||
- `tools_used` — extract from messages
|
||||
- `output` — extract from messages
|
||||
- `response` — extract from messages
|
||||
@@ -0,0 +1,65 @@
|
||||
# Atropos BaseEnv Reference
|
||||
|
||||
Source: `atroposlib/envs/base.py` (~2124 lines)
|
||||
|
||||
## Abstract Methods (MUST implement)
|
||||
|
||||
| Method | Signature | Description |
|--------|-----------|-------------|
| `get_next_item()` | `async def get_next_item(self) -> Item` | Return next item for trajectory. Return None to pause. |
| `evaluate()` | `async def evaluate(self, *args, **kwargs)` | Called every steps_per_eval steps. |
| `setup()` | `async def setup(self)` | Called once at start. Load datasets, init models. |
| `collect_trajectory()` | `async def collect_trajectory(self, item) -> Tuple[Optional[ScoredDataItem], List[Item]]` | Single rollout. Or override collect_trajectories instead. |
|
||||
|
||||
## Overridable Methods
|
||||
|
||||
| Method | Default Behavior | Override When |
|--------|-----------------|---------------|
| `collect_trajectories()` | Runs collect_trajectory group_size times in parallel | Batch generation, MCTS, coupled rollouts |
| `wandb_log()` | Logs completion lengths, rollout table, perf stats | Add custom metrics (always call super) |
| `config_init()` | Returns (env_config_cls(), ServerBaseline()) | Custom defaults + server configs |
| `postprocess_histories()` | Passthrough | Final processing before sending to trainer |
| `save_checkpoint()` | Saves JSON to checkpoint_dir | Custom serialization |
| `cleanup()` | No-op | Release resources after each rollout |
|
||||
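The default `collect_trajectories()` behavior described in the table (run `collect_trajectory` `group_size` times in parallel) can be approximated with `asyncio.gather`. This is a standalone sketch, not the Atropos implementation; the stub rollout and its return values are placeholders.

```python
import asyncio


async def collect_trajectory(item):
    """Stand-in for a single rollout; returns (scored_item, backlog)."""
    await asyncio.sleep(0)  # placeholder for the real agent loop
    return {"item": item, "score": 1.0}, []


async def collect_trajectories(item, group_size):
    """Run group_size rollouts of the same item concurrently."""
    results = await asyncio.gather(
        *(collect_trajectory(item) for _ in range(group_size))
    )
    scored = [s for s, _ in results if s is not None]  # drop failed rollouts
    backlog = [b for _, items in results for b in items]
    return scored, backlog


scored, backlog = asyncio.run(collect_trajectories("example", group_size=4))
print(len(scored), backlog)
```

Overriding the plural method is useful when the rollouts in a group are not independent, as in the batch generation or MCTS cases the table mentions.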
|
||||
## ScoredDataGroup Structure
|
||||
|
||||
```python
ScoredDataGroup = TypedDict with:
    tokens: List[List[int]]            # Token IDs per rollout
    masks: List[List[int]]             # -100=prompt, token_id=completion
    scores: List[float]                # Score per rollout
    advantages: Optional[...]          # Per-token advantages
    ref_logprobs: Optional[...]        # Reference model logprobs
    messages: Optional[...]            # OpenAI-format messages
    inference_logprobs: Optional[...]  # Inference logprobs
```
|
||||
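The `masks` convention above (-100 for prompt positions, the token ID for completion positions) can be illustrated for the simplest case of one prompt followed by one completion. The helper name and token IDs below are made up for the example; multi-turn agent rollouts interleave more segments than this sketch handles.

```python
def build_masks(prompt_tokens, completion_tokens):
    """Return (tokens, masks) for one rollout using the -100 prompt convention."""
    tokens = list(prompt_tokens) + list(completion_tokens)
    # Prompt positions are masked out with -100; completion positions keep their IDs
    masks = [-100] * len(prompt_tokens) + list(completion_tokens)
    return tokens, masks


# Hypothetical token IDs for illustration only
tokens, masks = build_masks([101, 2054, 2003], [1996, 3437, 102])
print(tokens)
print(masks)
```

`tokens` and `masks` always have the same length, which is a quick sanity check to run on process-mode output.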
|
||||
## BaseEnvConfig Key Fields
|
||||
|
||||
| Field | Default | Description |
|-------|---------|-------------|
| `group_size` | 4 | Responses grouped for scoring |
| `steps_per_eval` | 100 | Steps between evaluations |
| `max_token_length` | 2048 | Max token length for generations |
| `total_steps` | 1000 | Total training steps |
| `use_wandb` | True | Enable wandb logging |
| `tokenizer_name` | DeepHermes-3 | Tokenizer for token encoding |
| `ensure_scores_are_not_same` | True | Skip groups with identical scores |
| `worker_timeout` | 600 | Task timeout seconds |
|
||||
|
||||
## Data Flow
|
||||
|
||||
```
env_manager() → add_train_workers() → handle_env()
    → collect_trajectories() → postprocess_histories()
    → handle_send_to_api() → training server
```
|
||||
|
||||
## Atropos Environment Statistics (82 environments analyzed)
|
||||
|
||||
- 95% implement setup, collect_trajectories, evaluate, get_next_item
- 76% override wandb_log
- 54% have custom config class
- Most use collect_trajectories (plural), not collect_trajectory (singular)
- Common reward patterns: LLM-judge (~40), regex-extract (~35), code-exec (~12)
|
||||
@@ -0,0 +1,199 @@
|
||||
# Usage Patterns — Testing Environments and Evaluating Models
|
||||
|
||||
## Pattern 1: Test Your Environment Works (process mode)
|
||||
|
||||
Use `process` mode to verify your environment runs end-to-end before
|
||||
committing. This generates trajectories without needing an Atropos
|
||||
training server.
|
||||
|
||||
**Before running:** Ask the user for their inference setup (see SKILL.md "Inference Setup" section). Replace `<BASE_URL>`, `<MODEL>`, and `<SERVER_TYPE>` below with their chosen values.
|
||||
|
||||
### Step 1: Run 1 trajectory
|
||||
|
||||
```bash
cd ~/.hermes/hermes-agent
source .venv/bin/activate

python environments/your_env.py process \
    --env.total_steps 1 \
    --env.group_size 1 \
    --env.use_wandb false \
    --env.data_path_to_save_groups /tmp/test_output.jsonl \
    --openai.base_url "<BASE_URL>" \
    --openai.model_name "<MODEL>" \
    --openai.server_type <SERVER_TYPE> \
    --openai.health_check false
```
|
||||
|
||||
### Step 2: Verify the output
|
||||
|
||||
```python
import json
for line in open("/tmp/test_output.jsonl"):
    data = json.loads(line)
    print(f"Scores: {data.get('scores', [])}")
    print(f"Token sequences: {len(data.get('tokens', []))}")
    # Check messages include tool calls
    for msg_list in data.get("messages", []):
        roles = [m.get("role") for m in msg_list]
        print(f"Roles: {roles}")
        for m in reversed(msg_list):
            if m.get("role") == "assistant" and m.get("content"):
                print(f"Response: {m['content'][:200]}...")
                break
```
|
||||
|
||||
### What to check:
|
||||
- **Scores are not all 0.0** — if so, compute_reward is broken
|
||||
- **Scores are in [0, 1]** — not negative, not >1
|
||||
- **Messages include "tool" role entries** — agent used tools
|
||||
- **Token sequences are non-empty**
|
||||
- **An HTML visualization is generated** next to the .jsonl
|
||||
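The first four checks in this list can be automated. The sketch below is one possible validator for a single parsed JSONL record, assuming the `scores`, `tokens`, and `messages` keys shown in the verification script; `check_group` is a hypothetical helper, and the sample records are invented.

```python
def check_group(data):
    """Return a list of problems found in one parsed JSONL record."""
    problems = []
    scores = data.get("scores") or []
    tokens = data.get("tokens") or []
    if not scores or all(s == 0.0 for s in scores):
        problems.append("scores are missing or all 0.0")
    if any(s < 0.0 or s > 1.0 for s in scores):
        problems.append("score outside [0, 1]")
    if not tokens or any(len(seq) == 0 for seq in tokens):
        problems.append("empty token sequence")
    for msg_list in data.get("messages") or []:
        if not any(m.get("role") == "tool" for m in msg_list):
            problems.append("rollout has no tool messages")
    return problems


# Invented records for illustration only
good = {
    "scores": [0.5],
    "tokens": [[1, 2, 3]],
    "messages": [[{"role": "user"}, {"role": "assistant"}, {"role": "tool"}]],
}
bad = {"scores": [0.0, 1.5], "tokens": [[]], "messages": [[{"role": "user"}]]}
print(check_group(good))
print(check_group(bad))
```

An empty list means the record passed every check; the HTML visualization still has to be confirmed by looking at the output directory.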
|
||||
### Common failures:
|
||||
- `'AgentResult' object has no attribute 'X'` — accessing a field that doesn't exist. See agentresult-fields.md.
|
||||
- Score always 0.0 — reward function erroring silently
|
||||
- Score always 1.0 — verification too lenient or not running
|
||||
|
||||
|
||||
## Pattern 2: Evaluate a Model (evaluate mode)
|
||||
|
||||
Use `evaluate` mode to benchmark a model on your environment's eval
|
||||
split. This runs the full agent loop with tools for each eval item.
|
||||
|
||||
### Step 1: Run evaluation
|
||||
|
||||
```bash
python environments/your_env.py evaluate \
    --env.eval_size 20 \
    --env.use_wandb false \
    --env.data_dir_to_save_evals /tmp/eval_results \
    --openai.base_url "<BASE_URL>" \
    --openai.model_name "<MODEL>" \
    --openai.server_type <SERVER_TYPE> \
    --openai.health_check false
```
|
||||
|
||||
### Step 2: Read results
|
||||
|
||||
Stdout shows a lighteval-compatible table:
|
||||
|
||||
```
Evaluation Results: your-env_eval
|Metric | Value|
|mean correctness| 0.850 |
|mean reward | 0.920 |
|mean tool calls | 4.300 |
|n items | 20 |
Evaluation completed in 367 seconds
```
|
||||
|
||||
JSON results saved to the eval directory:
|
||||
|
||||
```python
import json
data = json.load(open("/tmp/eval_results/metrics.json"))
for metric, value in data["results"]["all"].items():
    print(f"{metric}: {value}")
```
|
||||
|
||||
### Step 3: Compare models
|
||||
|
||||
Run evaluate with different models and compare the metrics.json files.
|
||||
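A small helper can make the comparison concrete. The sketch assumes each run wrote its own `metrics.json` with the `results` / `all` layout read in Step 2; the function names and the example numbers are hypothetical.

```python
import json


def load_metrics(path):
    """Read the results/all mapping from one metrics.json file."""
    with open(path) as f:
        return json.load(f)["results"]["all"]


def compare_metrics(baseline, candidate):
    """Return {metric: candidate - baseline} for numeric metrics present in both."""
    deltas = {}
    for name in sorted(baseline.keys() & candidate.keys()):
        a, b = baseline[name], candidate[name]
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            deltas[name] = b - a
    return deltas


# Invented numbers; in practice pass load_metrics(<path>) for each run
baseline = {"mean reward": 0.5, "n items": 20}
candidate = {"mean reward": 0.75, "n items": 20}
print(compare_metrics(baseline, candidate))
```

Give each model its own `--env.data_dir_to_save_evals` directory so the runs do not overwrite each other.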
|
||||
### What to check:
|
||||
- **"data_dir_to_save_evals is not set"** — you forgot the flag, results won't be saved
|
||||
- **Tool usage rate = 0** — evaluate() is using chat_completion instead of HermesAgentLoop
|
||||
- **All scores identical** — judge failing, falling back to heuristic
|
||||
- **Very slow** — each item runs a full agent loop (~30-90s). Use `--env.eval_size 5` for quick checks.
|
||||
|
||||
|
||||
## Pattern 3: Generate Training Data (process mode, larger scale)
|
||||
|
||||
Generate trajectory data for offline training or analysis:
|
||||
|
||||
```bash
python environments/your_env.py process \
    --env.total_steps 50 \
    --env.group_size 4 \
    --env.use_wandb false \
    --env.data_path_to_save_groups data/trajectories.jsonl \
    --openai.base_url "<BASE_URL>" \
    --openai.model_name "<MODEL>" \
    --openai.server_type <SERVER_TYPE> \
    --openai.health_check false
```
|
||||
|
||||
### Analyze the distribution:
|
||||
|
||||
```python
import json
from collections import Counter

scores = []
with open("data/trajectories.jsonl") as f:
    for line in f:
        data = json.loads(line)
        scores.extend(data.get("scores", []))

if not scores:
    raise SystemExit("No scores found in data/trajectories.jsonl")

print(f"Total: {len(scores)}, Mean: {sum(scores)/len(scores):.3f}")
# Assign every score to a 0.2-wide bucket (0.0 ... 1.0) so none are dropped;
# boundary values such as 0.1 are placed by Python's round()
counts = Counter(min(5, max(0, round(s * 5))) for s in scores)
for i in range(6):
    count = counts[i]
    print(f"  {i / 5:.1f}: {'█' * count} ({count})")
```
|
||||
|
||||
### What to check:
|
||||
- **Score distribution has variance** — RL needs score variance. All-same scores are useless.
|
||||
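A quick numeric check for this is the population standard deviation of the scores. The threshold below is an arbitrary assumption chosen for illustration, and `has_useful_variance` is a hypothetical helper rather than anything Atropos provides.

```python
import statistics


def has_useful_variance(scores, min_std=0.05):
    """True if the scores are spread enough to carry a learning signal."""
    if len(scores) < 2:
        return False
    return statistics.pstdev(scores) >= min_std


print(has_useful_variance([1.0, 1.0, 1.0, 1.0]))
print(has_useful_variance([0.0, 1.0, 0.5, 1.0]))
```

Running the check per group as well as over the whole file is worthwhile, since `ensure_scores_are_not_same` skips groups whose scores are identical.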
|
||||
|
||||
## Pattern 4: Full RL Training (serve mode)
|
||||
|
||||
For actual RL training with Atropos:
|
||||
|
||||
```bash
# Terminal 1: Start Atropos API server
run-api

# Terminal 2: Start your environment
python environments/your_env.py serve \
    --config environments/your_env/default.yaml
```
|
||||
|
||||
For Phase 2 with VLLM:
|
||||
|
||||
```bash
# Terminal 1: VLLM server
python -m vllm.entrypoints.openai.api_server --model your-model --port 8000

# Terminal 2: Atropos API
run-api

# Terminal 3: Environment
python environments/your_env.py serve \
    --openai.base_url http://localhost:8000/v1 \
    --openai.model_name your-model \
    --openai.server_type vllm
```
|
||||
|
||||
|
||||
## Pattern 5: Quick Smoke Test
|
||||
|
||||
Verify imports and config before spending money on API calls:
|
||||
|
||||
```python
from environments.your_env import YourEnv
print(f"Name: {YourEnv.name}")
cfg, servers = YourEnv.config_init()
print(f"Toolsets: {cfg.enabled_toolsets}")
print(f"Server: {servers[0].model_name}")
print("All imports OK")
```
|
||||
|
||||
|
||||
## Timing Expectations
|
||||
|
||||
| Mode | Items | Time per item | Total |
|------|-------|--------------|-------|
| process (1 item) | 1 | 30-90s | ~1 min |
| evaluate (5 items) | 5 | 30-90s | ~5 min |
| evaluate (20 items) | 20 | 30-90s | ~15-30 min |
| process (50 items) | 50 | 30-90s | ~30-75 min |
|
||||
|
||||
Times are for cloud APIs with Claude Sonnet-class models. Local models may be faster or slower depending on hardware.
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
name: duckduckgo-search
|
||||
description: Free web search via DuckDuckGo when Firecrawl is unavailable. No API key needed. Use ddgs CLI or Python library to find URLs, then web_extract for content.
|
||||
version: 1.1.0
|
||||
description: Free web search via DuckDuckGo — text, news, images, videos. No API key needed. Use the Python DDGS library or CLI to search, then web_extract for full content.
|
||||
version: 1.2.0
|
||||
author: gamedevCloudy
|
||||
license: MIT
|
||||
metadata:
|
||||
@@ -10,17 +10,11 @@ metadata:
|
||||
related_skills: [arxiv]
|
||||
---
|
||||
|
||||
# DuckDuckGo Search (Firecrawl Fallback)
|
||||
# DuckDuckGo Search
|
||||
|
||||
Free web search using DuckDuckGo. **No API key required.**
|
||||
|
||||
## When to Use This
|
||||
|
||||
Use this skill ONLY when the `web_search` tool is not available (i.e., `FIRECRAWL_API_KEY` is not set). If `web_search` works, prefer it — it returns richer results with built-in content extraction.
|
||||
|
||||
Signs you need this fallback:
|
||||
- `web_search` tool is not listed in your available tools
|
||||
- `web_search` returns an error about missing FIRECRAWL_API_KEY
|
||||
Preferred when `web_search` tool is unavailable or unsuitable (no `FIRECRAWL_API_KEY` set). Can also be used as a standalone search tool.
|
||||
|
||||
## Setup
|
||||
|
||||
@@ -29,14 +23,109 @@ Signs you need this fallback:
|
||||
pip install ddgs
|
||||
```
|
||||
|
||||
## Web Search (Primary Use Case)
|
||||
## Python API (Primary)
|
||||
|
||||
### Via Terminal (ddgs CLI)
|
||||
Use the `DDGS` class in `execute_code` for structured results with typed fields.
|
||||
|
||||
**Important:** `max_results` must always be passed as a **keyword argument** — positional usage raises an error on all methods.
|
||||
|
||||
### Text Search
|
||||
|
||||
Best for: general research, companies, documentation.
|
||||
|
||||
```python
from ddgs import DDGS

with DDGS() as ddgs:
    for r in ddgs.text("python async programming", max_results=5):
        print(r["title"])
        print(r["href"])
        print(r.get("body", "")[:200])
        print()
```
|
||||
|
||||
Returns: `title`, `href`, `body`
|
||||
|
||||
### News Search
|
||||
|
||||
Best for: current events, breaking news, latest updates.
|
||||
|
||||
```python
from ddgs import DDGS

with DDGS() as ddgs:
    for r in ddgs.news("AI regulation 2026", max_results=5):
        print(r["date"], "-", r["title"])
        print(r.get("source", ""), "|", r["url"])
        print(r.get("body", "")[:200])
        print()
```
|
||||
|
||||
Returns: `date`, `title`, `body`, `url`, `image`, `source`
|
||||
|
||||
### Image Search
|
||||
|
||||
Best for: visual references, product images, diagrams.
|
||||
|
||||
```python
from ddgs import DDGS

with DDGS() as ddgs:
    for r in ddgs.images("semiconductor chip", max_results=5):
        print(r["title"])
        print(r["image"])  # direct image URL
        print(r.get("thumbnail", ""))
        print(r.get("source", ""))
        print()
```
|
||||
|
||||
Returns: `title`, `image`, `thumbnail`, `url`, `height`, `width`, `source`
|
||||
|
||||
### Video Search
|
||||
|
||||
Best for: tutorials, demos, explainers.
|
||||
|
||||
```python
from ddgs import DDGS

with DDGS() as ddgs:
    for r in ddgs.videos("FastAPI tutorial", max_results=5):
        print(r["title"])
        print(r.get("content", ""))  # video URL
        print(r.get("duration", ""))  # e.g. "26:03"
        print(r.get("provider", ""))  # YouTube, etc.
        print(r.get("published", ""))
        print()
```
|
||||
|
||||
Returns: `title`, `content`, `description`, `duration`, `provider`, `published`, `statistics`, `uploader`
|
||||
|
||||
### Quick Reference
|
||||
|
||||
| Method | Use When | Key Fields |
|--------|----------|------------|
| `text()` | General research, companies | title, href, body |
| `news()` | Current events, updates | date, title, source, body, url |
| `images()` | Visuals, diagrams | title, image, thumbnail, url |
| `videos()` | Tutorials, demos | title, content, duration, provider |
|
||||
|
||||
## CLI (Alternative)
|
||||
|
||||
Use the `ddgs` command via terminal when you don't need structured field access.
|
||||
|
||||
```bash
|
||||
# Basic search — returns titles, URLs, and snippets
|
||||
# Text search
|
||||
ddgs text -k "python async programming" -m 5
|
||||
|
||||
# News search
|
||||
ddgs news -k "artificial intelligence" -m 5
|
||||
|
||||
# Image search
|
||||
ddgs images -k "landscape photography" -m 10
|
||||
|
||||
# Video search
|
||||
ddgs videos -k "python tutorial" -m 5
|
||||
|
||||
# With region filter
|
||||
ddgs text -k "best restaurants" -m 5 -r us-en
|
||||
|
||||
@@ -47,16 +136,6 @@ ddgs text -k "latest AI news" -m 5 -t w
|
||||
ddgs text -k "fastapi tutorial" -m 5 -o json
|
||||
```
|
||||
|
||||
### Via Python (in execute_code)
|
||||
|
||||
```python
|
||||
from hermes_tools import terminal
|
||||
|
||||
# Search and get results
|
||||
result = terminal("ddgs text -k 'python web framework comparison' -m 5")
|
||||
print(result["output"])
|
||||
```
|
||||
|
||||
### CLI Flags
|
||||
|
||||
| Flag | Description | Example |
|
||||
@@ -68,44 +147,39 @@ print(result["output"])
|
||||
| `-s` | Safe search | `-s off` |
|
||||
| `-o` | Output format | `-o json` |
|
||||
|
||||
## Other Search Types
|
||||
## Workflow: Search then Extract
|
||||
|
||||
```bash
|
||||
# Image search
|
||||
ddgs images -k "landscape photography" -m 10
|
||||
|
||||
# News search
|
||||
ddgs news -k "artificial intelligence" -m 5
|
||||
|
||||
# Video search
|
||||
ddgs videos -k "python tutorial" -m 5
|
||||
```
|
||||
|
||||
## Workflow: Search → Extract
|
||||
|
||||
DuckDuckGo finds URLs. To get full page content, follow up with `web_extract`:
|
||||
DuckDuckGo returns titles, URLs, and snippets — not full page content. To get full content, follow up with `web_extract`:
|
||||
|
||||
1. **Search** with ddgs to find relevant URLs
|
||||
2. **Extract** content using the `web_extract` tool (if available) or curl
|
||||
|
||||
```bash
|
||||
# Step 1: Find URLs
|
||||
ddgs text -k "fastapi tutorial" -m 3
|
||||
```python
|
||||
from ddgs import DDGS
|
||||
|
||||
# Step 2: Extract full content from a result URL
|
||||
# (use web_extract tool if available, otherwise curl)
|
||||
curl -s "https://example.com/article" | head -200
|
||||
with DDGS() as ddgs:
|
||||
results = list(ddgs.text("fastapi deployment guide", max_results=3))
|
||||
for r in results:
|
||||
print(r["title"], "->", r["href"])
|
||||
|
||||
# Then use web_extract tool on the best URL
|
||||
```
|
||||
|
||||
## Limitations
|
||||
|
||||
- **Rate limiting**: DuckDuckGo may throttle after many rapid requests. Add `sleep 1` between searches if needed.
|
||||
- **No content extraction**: ddgs only returns titles, URLs, and snippets — not full page content. Use `web_extract` or curl for that.
|
||||
- **Rate limiting**: DuckDuckGo may throttle after many rapid requests. Add a short delay between searches if needed.
|
||||
- **No content extraction**: ddgs returns snippets, not full page content. Use `web_extract` or curl for that.
|
||||
- **Results quality**: Generally good but less configurable than Firecrawl's search.
|
||||
- **Availability**: DuckDuckGo may block requests from some cloud IPs. If searches return empty, try different keywords or add a short delay.
|
||||
- **Availability**: DuckDuckGo may block requests from some cloud IPs. If searches return empty, try different keywords or wait a few seconds.
|
||||
- **Field variability**: Return fields may vary between results or ddgs versions. Use `.get()` for optional fields to avoid KeyError.
|
||||
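The rate-limiting and empty-result advice above can be wrapped in a small retry helper. This sketch only retries on empty results and takes the search method as a parameter, so it makes no assumptions about ddgs exception types; `search_with_retry` and its default delays are hypothetical.

```python
import time


def search_with_retry(search_fn, query, max_results=5, attempts=3, delay=2.0):
    """Call search_fn until it returns results, sleeping between attempts."""
    for attempt in range(attempts):
        results = list(search_fn(query, max_results=max_results))
        if results:
            return results
        if attempt < attempts - 1:
            time.sleep(delay * (attempt + 1))  # wait longer after each miss
    return []


# Intended usage (requires network access and `pip install ddgs`):
# from ddgs import DDGS
# with DDGS() as ddgs:
#     results = search_with_retry(ddgs.text, "fastapi deployment guide", max_results=3)
```

Passing `max_results` by keyword inside the helper matches the keyword-only requirement noted for the `DDGS` methods.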
|
||||
## Pitfalls
|
||||
|
||||
- **Don't confuse `-k` and `-m`**: `-k` is for keywords (the query), `-m` is for max results count.
|
||||
- **`max_results` is keyword-only**: `ddgs.text("query", 5)` raises an error. Use `ddgs.text("query", max_results=5)`.
|
||||
- **Don't confuse `-k` and `-m`** (CLI): `-k` is for keywords, `-m` is for max results count.
|
||||
- **Package name**: The package is `ddgs` (was previously `duckduckgo-search`). Install with `pip install ddgs`.
|
||||
- **Empty results**: If ddgs returns nothing, it may be rate-limited. Wait a few seconds and retry.
|
||||
|
||||
## Validated With
|
||||
|
||||
Smoke-tested with `ddgs==9.11.2` on Python 3.13. All four methods (text, news, images, videos) confirmed working with keyword `max_results`.
|
||||
|
||||
@@ -0,0 +1,198 @@
|
||||
"""Tests for configurable background process notification modes.
|
||||
|
||||
The gateway process watcher pushes status updates to users' chats when
|
||||
background terminal commands run. ``display.background_process_notifications``
|
||||
controls verbosity: off | result | error | all (default).
|
||||
|
||||
Contributed by @PeterFile (PR #593), reimplemented on current main.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
from types import SimpleNamespace
|
||||
from unittest.mock import AsyncMock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
from gateway.config import GatewayConfig, Platform
|
||||
from gateway.run import GatewayRunner
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Helpers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class _FakeRegistry:
    """Return pre-canned sessions, then None once exhausted."""

    def __init__(self, sessions):
        self._sessions = list(sessions)

    def get(self, session_id):
        if self._sessions:
            return self._sessions.pop(0)
        return None


def _build_runner(monkeypatch, tmp_path, mode: str) -> GatewayRunner:
    """Create a GatewayRunner with a fake config for the given mode."""
    (tmp_path / "config.yaml").write_text(
        f"display:\n background_process_notifications: {mode}\n",
        encoding="utf-8",
    )

    import gateway.run as gateway_run

    monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)

    runner = GatewayRunner(GatewayConfig())
    adapter = SimpleNamespace(send=AsyncMock())
    runner.adapters[Platform.TELEGRAM] = adapter
    return runner


def _watcher_dict(session_id="proc_test"):
    return {
        "session_id": session_id,
        "check_interval": 0,
        "platform": "telegram",
        "chat_id": "123",
    }
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# _load_background_notifications_mode unit tests
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestLoadBackgroundNotificationsMode:
|
||||
|
||||
def test_defaults_to_all(self, monkeypatch, tmp_path):
|
||||
import gateway.run as gw
|
||||
monkeypatch.setattr(gw, "_hermes_home", tmp_path)
|
||||
monkeypatch.delenv("HERMES_BACKGROUND_NOTIFICATIONS", raising=False)
|
||||
assert GatewayRunner._load_background_notifications_mode() == "all"
|
||||
|
||||
def test_reads_config_yaml(self, monkeypatch, tmp_path):
|
||||
(tmp_path / "config.yaml").write_text(
|
||||
"display:\n background_process_notifications: error\n"
|
||||
)
|
||||
import gateway.run as gw
|
||||
monkeypatch.setattr(gw, "_hermes_home", tmp_path)
|
||||
monkeypatch.delenv("HERMES_BACKGROUND_NOTIFICATIONS", raising=False)
|
||||
assert GatewayRunner._load_background_notifications_mode() == "error"
|
||||
|
||||
def test_env_var_overrides_config(self, monkeypatch, tmp_path):
|
||||
(tmp_path / "config.yaml").write_text(
|
||||
"display:\n background_process_notifications: error\n"
|
||||
)
|
||||
import gateway.run as gw
|
||||
monkeypatch.setattr(gw, "_hermes_home", tmp_path)
|
||||
monkeypatch.setenv("HERMES_BACKGROUND_NOTIFICATIONS", "off")
|
||||
assert GatewayRunner._load_background_notifications_mode() == "off"
|
||||
|
||||
def test_false_value_maps_to_off(self, monkeypatch, tmp_path):
|
||||
(tmp_path / "config.yaml").write_text(
|
||||
"display:\n background_process_notifications: false\n"
|
||||
)
|
||||
import gateway.run as gw
|
||||
monkeypatch.setattr(gw, "_hermes_home", tmp_path)
|
||||
monkeypatch.delenv("HERMES_BACKGROUND_NOTIFICATIONS", raising=False)
|
||||
assert GatewayRunner._load_background_notifications_mode() == "off"
|
||||
|
||||
def test_invalid_value_defaults_to_all(self, monkeypatch, tmp_path):
|
||||
(tmp_path / "config.yaml").write_text(
|
||||
"display:\n background_process_notifications: banana\n"
|
||||
)
|
||||
import gateway.run as gw
|
||||
monkeypatch.setattr(gw, "_hermes_home", tmp_path)
|
||||
monkeypatch.delenv("HERMES_BACKGROUND_NOTIFICATIONS", raising=False)
|
||||
assert GatewayRunner._load_background_notifications_mode() == "all"
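The five tests above pin down a precedence order: the environment variable beats `config.yaml`, a YAML `false` maps to `off`, and anything unrecognized falls back to `all`. A minimal sketch of that resolution logic follows. The function name `resolve_notification_mode` and the `VALID_MODES` set are hypothetical and are not taken from `gateway.run`; the real `_load_background_notifications_mode` may be structured differently.

```python
from typing import Any, Optional

# Assumption: these are the four modes exercised by the tests in this diff.
VALID_MODES = {"all", "result", "error", "off"}


def resolve_notification_mode(env_value: Optional[str], config_value: Any) -> str:
    """Hypothetical helper: pick the raw value, then normalize it."""
    # The environment variable, when set, takes precedence over config.yaml.
    raw = env_value if env_value is not None else config_value
    # Nothing configured anywhere: default to "all".
    if raw is None:
        return "all"
    # YAML parses a bare `false` as the boolean False; treat it as "off".
    if raw is False:
        return "off"
    mode = str(raw).strip().lower()
    # Unknown values (for example "banana") fall back to the default.
    return mode if mode in VALID_MODES else "all"
```

Keeping the resolution a pure function of two inputs makes each test case a one-line call, without touching the filesystem or the process environment.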
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# _run_process_watcher integration tests
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@pytest.mark.asyncio
|
||||
@pytest.mark.parametrize(
|
||||
("mode", "sessions", "expected_calls", "expected_fragment"),
|
||||
[
|
||||
# all mode: running output → sends update
|
||||
(
|
||||
"all",
|
||||
[
|
||||
SimpleNamespace(output_buffer="building...\n", exited=False, exit_code=None),
|
||||
None, # process disappears → watcher exits
|
||||
],
|
||||
1,
|
||||
"is still running",
|
||||
),
|
||||
# result mode: running output → no update
|
||||
(
|
||||
"result",
|
||||
[
|
||||
SimpleNamespace(output_buffer="building...\n", exited=False, exit_code=None),
|
||||
None,
|
||||
],
|
||||
0,
|
||||
None,
|
||||
),
|
||||
# off mode: exited process → no notification
|
||||
(
|
||||
"off",
|
||||
[SimpleNamespace(output_buffer="done\n", exited=True, exit_code=0)],
|
||||
0,
|
||||
None,
|
||||
),
|
||||
# result mode: exited → notifies
|
||||
(
|
||||
"result",
|
||||
[SimpleNamespace(output_buffer="done\n", exited=True, exit_code=0)],
|
||||
1,
|
||||
"finished with exit code 0",
|
||||
),
|
||||
# error mode: exit 0 → no notification
|
||||
(
|
||||
"error",
|
||||
[SimpleNamespace(output_buffer="done\n", exited=True, exit_code=0)],
|
||||
0,
|
||||
None,
|
||||
),
|
||||
# error mode: exit 1 → notifies
|
||||
(
|
||||
"error",
|
||||
[SimpleNamespace(output_buffer="traceback\n", exited=True, exit_code=1)],
|
||||
1,
|
||||
"finished with exit code 1",
|
||||
),
|
||||
# all mode: exited → notifies
|
||||
(
|
||||
"all",
|
||||
[SimpleNamespace(output_buffer="ok\n", exited=True, exit_code=0)],
|
||||
1,
|
||||
"finished with exit code 0",
|
||||
),
|
||||
],
|
||||
)
|
||||
async def test_run_process_watcher_respects_notification_mode(
|
||||
monkeypatch, tmp_path, mode, sessions, expected_calls, expected_fragment
|
||||
):
|
||||
import tools.process_registry as pr_module
|
||||
|
||||
monkeypatch.setattr(pr_module, "process_registry", _FakeRegistry(sessions))
|
||||
|
||||
# Patch asyncio.sleep to avoid real delays
|
||||
async def _instant_sleep(*_a, **_kw):
|
||||
pass
|
||||
monkeypatch.setattr(asyncio, "sleep", _instant_sleep)
|
||||
|
||||
runner = _build_runner(monkeypatch, tmp_path, mode)
|
||||
adapter = runner.adapters[Platform.TELEGRAM]
|
||||
|
||||
await runner._run_process_watcher(_watcher_dict())
|
||||
|
||||
assert adapter.send.await_count == expected_calls, (
|
||||
f"mode={mode}: expected {expected_calls} sends, got {adapter.send.await_count}"
|
||||
)
|
||||
if expected_fragment is not None:
|
||||
sent_message = adapter.send.await_args.args[1]
|
||||
assert expected_fragment in sent_message
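The parametrized cases above amount to a small decision table over the mode, whether the process has exited, and its exit code. The sketch below restates that table as a single predicate. `should_notify` is a hypothetical name, and the behaviour for a still-running process in `error` mode is an assumption because no case above covers it.

```python
from typing import Optional


def should_notify(mode: str, exited: bool, exit_code: Optional[int]) -> bool:
    """Hypothetical predicate mirroring the parametrized cases."""
    if mode == "off":
        return False
    if not exited:
        # Only "all" reports progress while the process is still running.
        # Assumption: "error" stays silent here, like "result".
        return mode == "all"
    if mode == "error":
        # Completed processes are reported only on a non-zero exit code.
        return exit_code != 0
    # "all" and "result" both report every completed process.
    return True
```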
|
||||
@@ -160,3 +160,27 @@ class TestMirrorToSession:
|
||||
result = mirror_to_session("telegram", "123", "msg")
|
||||
|
||||
assert result is False
|
||||
|
||||
|
||||
class TestAppendToSqlite:
    def test_connection_is_closed_after_use(self, tmp_path):
        """Verify _append_to_sqlite closes the SessionDB connection."""
        from gateway.mirror import _append_to_sqlite
        mock_db = MagicMock()

        with patch("hermes_state.SessionDB", return_value=mock_db):
            _append_to_sqlite("sess_1", {"role": "assistant", "content": "hello"})

        mock_db.append_message.assert_called_once()
        mock_db.close.assert_called_once()

    def test_connection_closed_even_on_error(self, tmp_path):
        """Verify connection is closed even when append_message raises."""
        from gateway.mirror import _append_to_sqlite
        mock_db = MagicMock()
        mock_db.append_message.side_effect = Exception("db error")

        with patch("hermes_state.SessionDB", return_value=mock_db):
            _append_to_sqlite("sess_1", {"role": "assistant", "content": "hello"})

        mock_db.close.assert_called_once()
|
||||
|
||||
@@ -34,7 +34,7 @@ def _ensure_telegram_mock():
|
||||
|
||||
_ensure_telegram_mock()
|
||||
|
||||
-from gateway.platforms.telegram import TelegramAdapter, _escape_mdv2  # noqa: E402
+from gateway.platforms.telegram import TelegramAdapter, _escape_mdv2, _strip_mdv2  # noqa: E402
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -360,3 +360,35 @@ class TestFormatMessageComplex:
|
||||
assert "Header" in result
|
||||
assert "block" in result
|
||||
assert "url.com" in result
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# _strip_mdv2 — plaintext fallback
|
||||
# =========================================================================
|
||||
|
||||
|
||||
class TestStripMdv2:
    def test_removes_escape_backslashes(self):
        assert _strip_mdv2(r"hello\.world\!") == "hello.world!"

    def test_removes_bold_markers(self):
        assert _strip_mdv2("*bold text*") == "bold text"

    def test_removes_italic_markers(self):
        assert _strip_mdv2("_italic text_") == "italic text"

    def test_removes_both_bold_and_italic(self):
        result = _strip_mdv2("*bold* and _italic_")
        assert result == "bold and italic"

    def test_preserves_snake_case(self):
        assert _strip_mdv2("my_variable_name") == "my_variable_name"

    def test_preserves_multi_underscore_identifier(self):
        assert _strip_mdv2("some_func_call here") == "some_func_call here"

    def test_plain_text_unchanged(self):
        assert _strip_mdv2("plain text") == "plain text"

    def test_empty_string(self):
        assert _strip_mdv2("") == ""
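The cases above describe a plaintext fallback that removes MarkdownV2 escapes and emphasis markers while leaving identifiers such as `my_variable_name` untouched. One way to satisfy them is sketched below. This is not the implementation in `gateway.platforms.telegram`; the name `strip_mdv2_sketch` and all three regular expressions are assumptions chosen to match the tested behaviour.

```python
import re

# Emphasis markers that are not themselves backslash-escaped.
_BOLD = re.compile(r"(?<!\\)\*(.+?)(?<!\\)\*")
# Underscores count as italic markers only at word boundaries, so
# snake_case identifiers are left alone.
_ITALIC = re.compile(r"(?<![\w\\])_(.+?)(?<!\\)_(?!\w)")
# A backslash followed by one of the MarkdownV2 special characters.
_ESCAPED = re.compile(r"\\([_*\[\]()~`>#+\-=|{}.!\\])")


def strip_mdv2_sketch(text: str) -> str:
    """Hypothetical plaintext fallback for MarkdownV2-formatted text."""
    # Strip markers first so escaped literals such as \* are not
    # mistaken for emphasis once their backslash is gone.
    text = _BOLD.sub(r"\1", text)
    text = _ITALIC.sub(r"\1", text)
    return _ESCAPED.sub(r"\1", text)
```

Removing markers before unescaping is a deliberate ordering choice in this sketch: an escaped asterisk should survive as a literal character in the plaintext output.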
|
||||
|
||||
@@ -0,0 +1,113 @@
|
||||
"""Tests for _coalesce_session_name_args — multi-word session name merging."""
|
||||
|
||||
import pytest
|
||||
from hermes_cli.main import _coalesce_session_name_args
|
||||
|
||||
|
||||
class TestCoalesceSessionNameArgs:
|
||||
"""Ensure unquoted multi-word session names are merged into one token."""
|
||||
|
||||
# ── -c / --continue ──────────────────────────────────────────────────
|
||||
|
||||
def test_continue_multiword_unquoted(self):
|
||||
"""hermes -c Pokemon Agent Dev → -c 'Pokemon Agent Dev'"""
|
||||
assert _coalesce_session_name_args(
|
||||
["-c", "Pokemon", "Agent", "Dev"]
|
||||
) == ["-c", "Pokemon Agent Dev"]
|
||||
|
||||
def test_continue_long_form_multiword(self):
|
||||
"""hermes --continue Pokemon Agent Dev"""
|
||||
assert _coalesce_session_name_args(
|
||||
["--continue", "Pokemon", "Agent", "Dev"]
|
||||
) == ["--continue", "Pokemon Agent Dev"]
|
||||
|
||||
def test_continue_single_word(self):
|
||||
"""hermes -c MyProject (no merging needed)"""
|
||||
assert _coalesce_session_name_args(["-c", "MyProject"]) == [
|
||||
"-c",
|
||||
"MyProject",
|
||||
]
|
||||
|
||||
def test_continue_already_quoted(self):
|
||||
"""hermes -c 'Pokemon Agent Dev' (shell already merged)"""
|
||||
assert _coalesce_session_name_args(
|
||||
["-c", "Pokemon Agent Dev"]
|
||||
) == ["-c", "Pokemon Agent Dev"]
|
||||
|
||||
def test_continue_bare_flag(self):
|
||||
"""hermes -c (no name — means 'continue latest')"""
|
||||
assert _coalesce_session_name_args(["-c"]) == ["-c"]
|
||||
|
||||
def test_continue_followed_by_flag(self):
|
||||
"""hermes -c -w (no name consumed, -w stays separate)"""
|
||||
assert _coalesce_session_name_args(["-c", "-w"]) == ["-c", "-w"]
|
||||
|
||||
def test_continue_multiword_then_flag(self):
|
||||
"""hermes -c my project -w"""
|
||||
assert _coalesce_session_name_args(
|
||||
["-c", "my", "project", "-w"]
|
||||
) == ["-c", "my project", "-w"]
|
||||
|
||||
def test_continue_multiword_then_subcommand(self):
|
||||
"""hermes -c my project chat -q hello"""
|
||||
assert _coalesce_session_name_args(
|
||||
["-c", "my", "project", "chat", "-q", "hello"]
|
||||
) == ["-c", "my project", "chat", "-q", "hello"]
|
||||
|
||||
# ── -r / --resume ────────────────────────────────────────────────────
|
||||
|
||||
def test_resume_multiword(self):
|
||||
"""hermes -r My Session Name"""
|
||||
assert _coalesce_session_name_args(
|
||||
["-r", "My", "Session", "Name"]
|
||||
) == ["-r", "My Session Name"]
|
||||
|
||||
def test_resume_long_form_multiword(self):
|
||||
"""hermes --resume My Session Name"""
|
||||
assert _coalesce_session_name_args(
|
||||
["--resume", "My", "Session", "Name"]
|
||||
) == ["--resume", "My Session Name"]
|
||||
|
||||
def test_resume_multiword_then_flag(self):
|
||||
"""hermes -r My Session -w"""
|
||||
assert _coalesce_session_name_args(
|
||||
["-r", "My", "Session", "-w"]
|
||||
) == ["-r", "My Session", "-w"]
|
||||
|
||||
# ── combined flags ───────────────────────────────────────────────────
|
||||
|
||||
def test_worktree_and_continue_multiword(self):
|
||||
"""hermes -w -c Pokemon Agent Dev (the original failing case)"""
|
||||
assert _coalesce_session_name_args(
|
||||
["-w", "-c", "Pokemon", "Agent", "Dev"]
|
||||
) == ["-w", "-c", "Pokemon Agent Dev"]
|
||||
|
||||
def test_continue_multiword_and_worktree(self):
|
||||
"""hermes -c Pokemon Agent Dev -w (order reversed)"""
|
||||
assert _coalesce_session_name_args(
|
||||
["-c", "Pokemon", "Agent", "Dev", "-w"]
|
||||
) == ["-c", "Pokemon Agent Dev", "-w"]
|
||||
|
||||
# ── passthrough (no session flags) ───────────────────────────────────
|
||||
|
||||
def test_no_session_flags_passthrough(self):
|
||||
"""hermes -w chat -q hello (nothing to merge)"""
|
||||
result = _coalesce_session_name_args(["-w", "chat", "-q", "hello"])
|
||||
assert result == ["-w", "chat", "-q", "hello"]
|
||||
|
||||
def test_empty_argv(self):
|
||||
assert _coalesce_session_name_args([]) == []
|
||||
|
||||
# ── subcommand boundary ──────────────────────────────────────────────
|
||||
|
||||
def test_stops_at_sessions_subcommand(self):
|
||||
"""hermes -c my project sessions list → stops before 'sessions'"""
|
||||
assert _coalesce_session_name_args(
|
||||
["-c", "my", "project", "sessions", "list"]
|
||||
) == ["-c", "my project", "sessions", "list"]
|
||||
|
||||
def test_stops_at_setup_subcommand(self):
|
||||
"""hermes -c my setup → 'setup' is a subcommand, not part of name"""
|
||||
assert _coalesce_session_name_args(
|
||||
["-c", "my", "setup"]
|
||||
) == ["-c", "my", "setup"]
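Taken together, the tests above specify a simple scan: after a session flag, consume tokens until the next flag or subcommand and join them with spaces. The sketch below is one way to meet that specification. It is not the code in `hermes_cli.main`; `coalesce_session_name_args` and `SUBCOMMANDS` are hypothetical, and the subcommand set lists only the names that appear in these tests.

```python
from typing import List

SESSION_FLAGS = {"-c", "--continue", "-r", "--resume"}
# Assumption: only subcommands seen in the tests above; the real CLI has more.
SUBCOMMANDS = {"chat", "sessions", "setup"}


def coalesce_session_name_args(argv: List[str]) -> List[str]:
    """Hypothetical sketch: merge unquoted multi-word session names."""
    result: List[str] = []
    i = 0
    while i < len(argv):
        token = argv[i]
        result.append(token)
        i += 1
        if token not in SESSION_FLAGS:
            continue
        # Collect name words until a flag or a known subcommand appears.
        parts: List[str] = []
        while (
            i < len(argv)
            and not argv[i].startswith("-")
            and argv[i] not in SUBCOMMANDS
        ):
            parts.append(argv[i])
            i += 1
        # A bare flag (no name) is left as-is.
        if parts:
            result.append(" ".join(parts))
    return result
```

A consequence of this design is that a session name containing a subcommand word, such as "my setup", still has to be quoted on the command line.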
|
||||
@@ -12,7 +12,7 @@ EXPECTED_COMMANDS = {
     "/personality", "/clear", "/history", "/new", "/reset", "/retry",
     "/undo", "/save", "/config", "/cron", "/skills", "/platforms",
     "/verbose", "/compress", "/title", "/usage", "/insights", "/paste",
-    "/reload-mcp", "/quit",
+    "/reload-mcp", "/rollback", "/skin", "/quit",
 }
|
||||
|
||||
|
||||
|
||||
@@ -2,7 +2,11 @@
|
||||
|
||||
import os
|
||||
from pathlib import Path
|
||||
-from unittest.mock import patch
+from unittest.mock import patch, MagicMock
|
||||
|
||||
import yaml
|
||||
|
||||
import yaml
|
||||
|
||||
from hermes_cli.config import (
|
||||
DEFAULT_CONFIG,
|
||||
@@ -41,22 +45,44 @@ class TestLoadConfigDefaults:
|
||||
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
|
||||
config = load_config()
|
||||
assert config["model"] == DEFAULT_CONFIG["model"]
|
||||
-assert config["max_turns"] == DEFAULT_CONFIG["max_turns"]
+assert config["agent"]["max_turns"] == DEFAULT_CONFIG["agent"]["max_turns"]
+assert "max_turns" not in config
|
||||
assert "terminal" in config
|
||||
assert config["terminal"]["backend"] == "local"
|
||||
|
||||
def test_legacy_root_level_max_turns_migrates_to_agent_config(self, tmp_path):
|
||||
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
|
||||
config_path = tmp_path / "config.yaml"
|
||||
config_path.write_text("max_turns: 42\n")
|
||||
|
||||
config = load_config()
|
||||
assert config["agent"]["max_turns"] == 42
|
||||
assert "max_turns" not in config
|
||||
|
||||
|
||||
class TestSaveAndLoadRoundtrip:
|
||||
def test_roundtrip(self, tmp_path):
|
||||
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
|
||||
config = load_config()
|
||||
config["model"] = "test/custom-model"
|
||||
-config["max_turns"] = 42
+config["agent"]["max_turns"] = 42
|
||||
save_config(config)
|
||||
|
||||
reloaded = load_config()
|
||||
assert reloaded["model"] == "test/custom-model"
|
||||
-assert reloaded["max_turns"] == 42
+assert reloaded["agent"]["max_turns"] == 42
|
||||
|
||||
saved = yaml.safe_load((tmp_path / "config.yaml").read_text())
|
||||
assert saved["agent"]["max_turns"] == 42
|
||||
assert "max_turns" not in saved
|
||||
|
||||
def test_save_config_normalizes_legacy_root_level_max_turns(self, tmp_path):
|
||||
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
|
||||
save_config({"model": "test/custom-model", "max_turns": 37})
|
||||
|
||||
saved = yaml.safe_load((tmp_path / "config.yaml").read_text())
|
||||
assert saved["agent"]["max_turns"] == 37
|
||||
assert "max_turns" not in saved
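The migration tests above expect a root-level `max_turns` to end up under `agent.max_turns`, with the root key removed on both load and save. A minimal sketch of such a normalization step is shown here. `normalize_max_turns` is a hypothetical name, and the rule that an existing `agent.max_turns` wins over the legacy key is an assumption; the tests do not cover that conflict.

```python
import copy
from typing import Any, Dict


def normalize_max_turns(config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical sketch: move legacy root-level max_turns under agent."""
    # Work on a copy so the caller's dictionary is not mutated.
    normalized = copy.deepcopy(config)
    if "max_turns" in normalized:
        legacy = normalized.pop("max_turns")
        agent = normalized.setdefault("agent", {})
        # Assumption: an explicit agent.max_turns wins over the legacy key.
        agent.setdefault("max_turns", legacy)
    return normalized
```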
|
||||
|
||||
def test_nested_values_preserved(self, tmp_path):
|
||||
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
|
||||
@@ -66,3 +92,62 @@ class TestSaveAndLoadRoundtrip:
|
||||
|
||||
reloaded = load_config()
|
||||
assert reloaded["terminal"]["timeout"] == 999
|
||||
|
||||
|
||||
class TestSaveConfigAtomicity:
|
||||
"""Verify save_config uses atomic writes (tempfile + os.replace)."""
|
||||
|
||||
def test_no_partial_write_on_crash(self, tmp_path):
|
||||
"""If save_config crashes mid-write, the previous file stays intact."""
|
||||
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
|
||||
# Write an initial config
|
||||
config = load_config()
|
||||
config["model"] = "original-model"
|
||||
save_config(config)
|
||||
|
||||
config_path = tmp_path / "config.yaml"
|
||||
assert config_path.exists()
|
||||
|
||||
# Simulate a crash during yaml.dump by making atomic_yaml_write's
|
||||
# yaml.dump raise after the temp file is created but before replace.
|
||||
with patch("utils.yaml.dump", side_effect=OSError("disk full")):
|
||||
try:
|
||||
config["model"] = "should-not-persist"
|
||||
save_config(config)
|
||||
except OSError:
|
||||
pass
|
||||
|
||||
# Original file must still be intact
|
||||
reloaded = load_config()
|
||||
assert reloaded["model"] == "original-model"
|
||||
|
||||
def test_no_leftover_temp_files(self, tmp_path):
|
||||
"""Failed writes must clean up their temp files."""
|
||||
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
|
||||
config = load_config()
|
||||
save_config(config)
|
||||
|
||||
with patch("utils.yaml.dump", side_effect=OSError("disk full")):
|
||||
try:
|
||||
save_config(config)
|
||||
except OSError:
|
||||
pass
|
||||
|
||||
# No .tmp files should remain
|
||||
tmp_files = list(tmp_path.glob(".*config*.tmp"))
|
||||
assert tmp_files == []
|
||||
|
||||
def test_atomic_write_creates_valid_yaml(self, tmp_path):
|
||||
"""The written file must be valid YAML matching the input."""
|
||||
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
|
||||
config = load_config()
|
||||
config["model"] = "test/atomic-model"
|
||||
config["agent"]["max_turns"] = 77
|
||||
save_config(config)
|
||||
|
||||
# Read raw YAML to verify it's valid and correct
|
||||
config_path = tmp_path / "config.yaml"
|
||||
with open(config_path) as f:
|
||||
raw = yaml.safe_load(f)
|
||||
assert raw["model"] == "test/atomic-model"
|
||||
assert raw["agent"]["max_turns"] == 77
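The atomicity tests above rely on the "tempfile + os.replace" pattern named in the class docstring: write to a temporary file in the same directory, then swap it into place, and delete the temporary file if anything fails. The sketch below illustrates that pattern with the standard library only. `atomic_write` and its `write_body` callback are hypothetical and stand in for the project's own helper, which serializes YAML.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, TextIO


def atomic_write(path: Path, write_body: Callable[[TextIO], None]) -> None:
    """Hypothetical sketch of an atomic file write."""
    path = Path(path)
    # Create the temp file next to the target so os.replace stays on
    # one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=f".{path.name}.", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            write_body(handle)
            handle.flush()
            os.fsync(handle.fileno())
        # Swap the finished file into place in a single step.
        os.replace(tmp_name, path)
    except BaseException:
        # A failed write must not leave the temp file behind.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Because the target is only touched by `os.replace`, a failure inside `write_body` leaves the previous file byte-for-byte intact, which is the property `test_no_partial_write_on_crash` checks.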
|
||||
|
||||
@@ -0,0 +1,232 @@
|
||||
"""Tests for hermes_cli.skin_engine — the data-driven skin/theme system."""
|
||||
|
||||
import json
|
||||
import os
|
||||
import pytest
|
||||
from pathlib import Path
|
||||
from unittest.mock import patch
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def reset_skin_state():
|
||||
"""Reset skin engine state between tests."""
|
||||
from hermes_cli import skin_engine
|
||||
skin_engine._active_skin = None
|
||||
skin_engine._active_skin_name = "default"
|
||||
yield
|
||||
skin_engine._active_skin = None
|
||||
skin_engine._active_skin_name = "default"
|
||||
|
||||
|
||||
class TestSkinConfig:
|
||||
def test_default_skin_has_required_fields(self):
|
||||
from hermes_cli.skin_engine import load_skin
|
||||
skin = load_skin("default")
|
||||
assert skin.name == "default"
|
||||
assert skin.tool_prefix == "┊"
|
||||
assert "banner_title" in skin.colors
|
||||
assert "banner_border" in skin.colors
|
||||
assert "agent_name" in skin.branding
|
||||
|
||||
def test_get_color_with_fallback(self):
|
||||
from hermes_cli.skin_engine import load_skin
|
||||
skin = load_skin("default")
|
||||
assert skin.get_color("banner_title") == "#FFD700"
|
||||
assert skin.get_color("nonexistent", "#000") == "#000"
|
||||
|
||||
def test_get_branding_with_fallback(self):
|
||||
from hermes_cli.skin_engine import load_skin
|
||||
skin = load_skin("default")
|
||||
assert skin.get_branding("agent_name") == "Hermes Agent"
|
||||
assert skin.get_branding("nonexistent", "fallback") == "fallback"
|
||||
|
||||
def test_get_spinner_list_empty_for_default(self):
|
||||
from hermes_cli.skin_engine import load_skin
|
||||
skin = load_skin("default")
|
||||
# Default skin has no custom spinner config
|
||||
assert skin.get_spinner_list("waiting_faces") == []
|
||||
assert skin.get_spinner_list("thinking_verbs") == []
|
||||
|
||||
def test_get_spinner_wings_empty_for_default(self):
|
||||
from hermes_cli.skin_engine import load_skin
|
||||
skin = load_skin("default")
|
||||
assert skin.get_spinner_wings() == []
|
||||
|
||||
|
||||
class TestBuiltinSkins:
|
||||
def test_ares_skin_loads(self):
|
||||
from hermes_cli.skin_engine import load_skin
|
||||
skin = load_skin("ares")
|
||||
assert skin.name == "ares"
|
||||
assert skin.tool_prefix == "╎"
|
||||
assert skin.get_color("banner_border") == "#9F1C1C"
|
||||
assert skin.get_branding("agent_name") == "Ares Agent"
|
||||
|
||||
def test_ares_has_spinner_customization(self):
|
||||
from hermes_cli.skin_engine import load_skin
|
||||
skin = load_skin("ares")
|
||||
assert len(skin.get_spinner_list("waiting_faces")) > 0
|
||||
assert len(skin.get_spinner_list("thinking_faces")) > 0
|
||||
assert len(skin.get_spinner_list("thinking_verbs")) > 0
|
||||
wings = skin.get_spinner_wings()
|
||||
assert len(wings) > 0
|
||||
assert isinstance(wings[0], tuple)
|
||||
assert len(wings[0]) == 2
|
||||
|
||||
def test_mono_skin_loads(self):
|
||||
from hermes_cli.skin_engine import load_skin
|
||||
skin = load_skin("mono")
|
||||
assert skin.name == "mono"
|
||||
assert skin.get_color("banner_title") == "#e6edf3"
|
||||
|
||||
def test_slate_skin_loads(self):
|
||||
from hermes_cli.skin_engine import load_skin
|
||||
skin = load_skin("slate")
|
||||
assert skin.name == "slate"
|
||||
assert skin.get_color("banner_title") == "#7eb8f6"
|
||||
|
||||
def test_unknown_skin_falls_back_to_default(self):
|
||||
from hermes_cli.skin_engine import load_skin
|
||||
skin = load_skin("nonexistent_skin_xyz")
|
||||
assert skin.name == "default"
|
||||
|
||||
def test_all_builtin_skins_have_complete_colors(self):
|
||||
from hermes_cli.skin_engine import _BUILTIN_SKINS, _build_skin_config
|
||||
required_keys = ["banner_border", "banner_title", "banner_accent",
|
||||
"banner_dim", "banner_text", "ui_accent"]
|
||||
for name, data in _BUILTIN_SKINS.items():
|
||||
skin = _build_skin_config(data)
|
||||
for key in required_keys:
|
||||
assert key in skin.colors, f"Skin '{name}' missing color '{key}'"
|
||||
|
||||
|
||||
class TestSkinManagement:
|
||||
def test_set_active_skin(self):
|
||||
from hermes_cli.skin_engine import set_active_skin, get_active_skin, get_active_skin_name
|
||||
skin = set_active_skin("ares")
|
||||
assert skin.name == "ares"
|
||||
assert get_active_skin_name() == "ares"
|
||||
assert get_active_skin().name == "ares"
|
||||
|
||||
def test_get_active_skin_defaults(self):
|
||||
from hermes_cli.skin_engine import get_active_skin
|
||||
skin = get_active_skin()
|
||||
assert skin.name == "default"
|
||||
|
||||
def test_list_skins_includes_builtins(self):
|
||||
from hermes_cli.skin_engine import list_skins
|
||||
skins = list_skins()
|
||||
names = [s["name"] for s in skins]
|
||||
assert "default" in names
|
||||
assert "ares" in names
|
||||
assert "mono" in names
|
||||
assert "slate" in names
|
||||
for s in skins:
|
||||
assert "source" in s
|
||||
assert s["source"] == "builtin"
|
||||
|
||||
def test_init_skin_from_config(self):
|
||||
from hermes_cli.skin_engine import init_skin_from_config, get_active_skin_name
|
||||
init_skin_from_config({"display": {"skin": "ares"}})
|
||||
assert get_active_skin_name() == "ares"
|
||||
|
||||
def test_init_skin_from_empty_config(self):
|
||||
from hermes_cli.skin_engine import init_skin_from_config, get_active_skin_name
|
||||
init_skin_from_config({})
|
||||
assert get_active_skin_name() == "default"
|
||||
|
||||
|
||||
class TestUserSkins:
|
||||
def test_load_user_skin_from_yaml(self, tmp_path, monkeypatch):
|
||||
from hermes_cli.skin_engine import load_skin, _skins_dir
|
||||
# Create a user skin YAML
|
||||
skins_dir = tmp_path / "skins"
|
||||
skins_dir.mkdir()
|
||||
skin_file = skins_dir / "custom.yaml"
|
||||
skin_data = {
|
||||
"name": "custom",
|
||||
"description": "A custom test skin",
|
||||
"colors": {"banner_title": "#FF0000"},
|
||||
"branding": {"agent_name": "Custom Agent"},
|
||||
"tool_prefix": "▸",
|
||||
}
|
||||
import yaml
|
||||
skin_file.write_text(yaml.dump(skin_data))
|
||||
|
||||
# Patch skins dir
|
||||
monkeypatch.setattr("hermes_cli.skin_engine._skins_dir", lambda: skins_dir)
|
||||
|
||||
skin = load_skin("custom")
|
||||
assert skin.name == "custom"
|
||||
assert skin.get_color("banner_title") == "#FF0000"
|
||||
assert skin.get_branding("agent_name") == "Custom Agent"
|
||||
assert skin.tool_prefix == "▸"
|
||||
# Should inherit defaults for unspecified colors
|
||||
assert skin.get_color("banner_border") == "#CD7F32" # from default
|
||||
|
||||
def test_list_skins_includes_user_skins(self, tmp_path, monkeypatch):
|
||||
from hermes_cli.skin_engine import list_skins
|
||||
skins_dir = tmp_path / "skins"
|
||||
skins_dir.mkdir()
|
||||
import yaml
|
||||
(skins_dir / "pirate.yaml").write_text(yaml.dump({
|
||||
"name": "pirate",
|
||||
"description": "Arr matey",
|
||||
}))
|
||||
monkeypatch.setattr("hermes_cli.skin_engine._skins_dir", lambda: skins_dir)
|
||||
|
||||
skins = list_skins()
|
||||
names = [s["name"] for s in skins]
|
||||
assert "pirate" in names
|
||||
pirate = [s for s in skins if s["name"] == "pirate"][0]
|
||||
assert pirate["source"] == "user"
|
||||
|
||||
|
||||
class TestDisplayIntegration:
|
||||
def test_get_skin_tool_prefix_default(self):
|
||||
from agent.display import get_skin_tool_prefix
|
||||
assert get_skin_tool_prefix() == "┊"
|
||||
|
||||
def test_get_skin_tool_prefix_custom(self):
|
||||
from hermes_cli.skin_engine import set_active_skin
|
||||
from agent.display import get_skin_tool_prefix
|
||||
set_active_skin("ares")
|
||||
assert get_skin_tool_prefix() == "╎"
|
||||
|
||||
def test_get_skin_faces_default(self):
|
||||
from agent.display import get_skin_faces, KawaiiSpinner
|
||||
faces = get_skin_faces("waiting_faces", KawaiiSpinner.KAWAII_WAITING)
|
||||
# Default skin has no custom faces, so should return the default list
|
||||
assert faces == KawaiiSpinner.KAWAII_WAITING
|
||||
|
||||
def test_get_skin_faces_ares(self):
|
||||
from hermes_cli.skin_engine import set_active_skin
|
||||
from agent.display import get_skin_faces, KawaiiSpinner
|
||||
set_active_skin("ares")
|
||||
faces = get_skin_faces("waiting_faces", KawaiiSpinner.KAWAII_WAITING)
|
||||
assert "(⚔)" in faces
|
||||
|
||||
def test_get_skin_verbs_default(self):
|
||||
from agent.display import get_skin_verbs, KawaiiSpinner
|
||||
verbs = get_skin_verbs()
|
||||
assert verbs == KawaiiSpinner.THINKING_VERBS
|
||||
|
||||
def test_get_skin_verbs_ares(self):
|
||||
from hermes_cli.skin_engine import set_active_skin
|
||||
from agent.display import get_skin_verbs
|
||||
set_active_skin("ares")
|
||||
verbs = get_skin_verbs()
|
||||
assert "forging" in verbs
|
||||
|
||||
def test_tool_message_uses_skin_prefix(self):
|
||||
from hermes_cli.skin_engine import set_active_skin
|
||||
from agent.display import get_cute_tool_message
|
||||
set_active_skin("ares")
|
||||
msg = get_cute_tool_message("terminal", {"command": "ls"}, 0.5)
|
||||
assert msg.startswith("╎")
|
||||
assert "┊" not in msg
|
||||
|
||||
def test_tool_message_default_prefix(self):
|
||||
from agent.display import get_cute_tool_message
|
||||
msg = get_cute_tool_message("terminal", {"command": "ls"}, 0.5)
|
||||
assert msg.startswith("┊")
|
||||
@@ -0,0 +1,675 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import importlib.util
|
||||
import json
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
|
||||
SCRIPT_PATH = (
|
||||
Path(__file__).resolve().parents[2]
|
||||
/ "optional-skills"
|
||||
/ "migration"
|
||||
/ "openclaw-migration"
|
||||
/ "scripts"
|
||||
/ "openclaw_to_hermes.py"
|
||||
)
|
||||
|
||||
|
||||
def load_module():
|
||||
spec = importlib.util.spec_from_file_location("openclaw_to_hermes", SCRIPT_PATH)
|
||||
module = importlib.util.module_from_spec(spec)
|
||||
assert spec.loader is not None
|
||||
sys.modules[spec.name] = module
|
||||
spec.loader.exec_module(module)
|
||||
return module
|
||||
|
||||
|
||||
def load_skills_guard():
|
||||
spec = importlib.util.spec_from_file_location(
|
||||
"skills_guard_local",
|
||||
Path(__file__).resolve().parents[2] / "tools" / "skills_guard.py",
|
||||
)
|
||||
module = importlib.util.module_from_spec(spec)
|
||||
assert spec.loader is not None
|
||||
sys.modules[spec.name] = module
|
||||
spec.loader.exec_module(module)
|
||||
return module
|
||||
|
||||
|
||||
def test_extract_markdown_entries_promotes_heading_context():
|
||||
mod = load_module()
|
||||
text = """# MEMORY.md - Long-Term Memory
|
||||
|
||||
## Tyler Williams
|
||||
|
||||
- Founder of VANTA Research
|
||||
- Timezone: America/Los_Angeles
|
||||
|
||||
### Active Projects
|
||||
|
||||
- Hermes Agent
|
||||
"""
|
||||
entries = mod.extract_markdown_entries(text)
|
||||
assert "Tyler Williams: Founder of VANTA Research" in entries
|
||||
assert "Tyler Williams: Timezone: America/Los_Angeles" in entries
|
||||
assert "Tyler Williams > Active Projects: Hermes Agent" in entries
|
||||
|
||||
|
||||
def test_merge_entries_respects_limit_and_reports_overflow():
|
||||
mod = load_module()
|
||||
existing = ["alpha"]
|
||||
incoming = ["beta", "gamma is too long"]
|
||||
merged, stats, overflowed = mod.merge_entries(existing, incoming, limit=12)
|
||||
assert merged == ["alpha", "beta"]
|
||||
assert stats["added"] == 1
|
||||
assert stats["overflowed"] == 1
|
||||
assert overflowed == ["gamma is too long"]
|
||||
|
||||
|
||||
def test_resolve_selected_options_supports_include_and_exclude():
|
||||
mod = load_module()
|
||||
selected = mod.resolve_selected_options(["memory,skills", "user-profile"], ["skills"])
|
||||
assert selected == {"memory", "user-profile"}
|
||||
|
||||
|
||||
def test_resolve_selected_options_supports_presets():
|
||||
mod = load_module()
|
||||
user_data = mod.resolve_selected_options(preset="user-data")
|
||||
full = mod.resolve_selected_options(preset="full")
|
||||
assert "secret-settings" not in user_data
|
||||
assert "secret-settings" in full
|
||||
assert user_data < full
|
||||
|
||||
|
||||
def test_resolve_selected_options_rejects_unknown_values():
|
||||
mod = load_module()
|
||||
try:
|
||||
mod.resolve_selected_options(["memory,unknown-option"], None)
|
||||
except ValueError as exc:
|
||||
assert "unknown-option" in str(exc)
|
||||
else:
|
||||
raise AssertionError("Expected ValueError for unknown migration option")
|
||||
|
||||
|
||||
def test_resolve_selected_options_rejects_unknown_preset():
|
||||
mod = load_module()
|
||||
try:
|
||||
mod.resolve_selected_options(preset="everything")
|
||||
except ValueError as exc:
|
||||
assert "everything" in str(exc)
|
||||
else:
|
||||
raise AssertionError("Expected ValueError for unknown migration preset")
|
||||
|
||||
|
||||
def test_migrator_copies_skill_and_merges_allowlist(tmp_path: Path):
|
||||
mod = load_module()
|
||||
source = tmp_path / ".openclaw"
|
||||
target = tmp_path / ".hermes"
|
||||
target.mkdir()
|
||||
|
||||
(source / "workspace" / "skills" / "demo-skill").mkdir(parents=True)
|
||||
(source / "workspace" / "skills" / "demo-skill" / "SKILL.md").write_text(
|
||||
"---\nname: demo-skill\ndescription: demo\n---\n\nbody\n",
|
||||
encoding="utf-8",
|
||||
)
|
||||
(source / "exec-approvals.json").write_text(
|
||||
json.dumps(
|
||||
{
|
||||
"agents": {
|
||||
"*": {
|
||||
"allowlist": [
|
||||
{"pattern": "/usr/bin/*"},
|
||||
{"pattern": "/home/test/**"},
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
),
|
||||
encoding="utf-8",
|
||||
)
|
||||
(target / "config.yaml").write_text("command_allowlist:\n - /usr/bin/*\n", encoding="utf-8")
|
||||
|
||||
migrator = mod.Migrator(
|
||||
source_root=source,
|
||||
target_root=target,
|
||||
execute=True,
|
||||
workspace_target=None,
|
||||
overwrite=False,
|
||||
migrate_secrets=False,
|
||||
output_dir=target / "migration-report",
|
||||
)
|
||||
report = migrator.migrate()
|
||||
|
||||
imported_skill = target / "skills" / mod.SKILL_CATEGORY_DIRNAME / "demo-skill" / "SKILL.md"
|
||||
assert imported_skill.exists()
|
||||
assert "/home/test/**" in (target / "config.yaml").read_text(encoding="utf-8")
|
||||
assert report["summary"]["migrated"] >= 2
|
||||
|
||||
|
||||
def test_migrator_optionally_imports_supported_secrets_and_messaging_settings(tmp_path: Path):
|
||||
mod = load_module()
|
||||
source = tmp_path / ".openclaw"
|
||||
target = tmp_path / ".hermes"
|
||||
|
||||
(source / "credentials").mkdir(parents=True)
|
||||
(source / "openclaw.json").write_text(
|
||||
json.dumps(
|
||||
{
|
||||
"agents": {"defaults": {"workspace": "/tmp/openclaw-workspace"}},
|
||||
"channels": {"telegram": {"botToken": "123:abc"}},
|
||||
}
|
||||
),
|
||||
encoding="utf-8",
|
||||
)
|
||||
(source / "credentials" / "telegram-default-allowFrom.json").write_text(
|
||||
json.dumps({"allowFrom": ["111", "222"]}),
|
||||
encoding="utf-8",
|
||||
)
|
||||
target.mkdir()
|
||||
|
||||
migrator = mod.Migrator(
|
||||
source_root=source,
|
||||
target_root=target,
|
||||
execute=True,
|
||||
workspace_target=None,
|
||||
overwrite=False,
|
||||
migrate_secrets=True,
|
||||
output_dir=target / "migration-report",
|
||||
)
|
||||
migrator.migrate()
|
||||
|
||||
env_text = (target / ".env").read_text(encoding="utf-8")
|
||||
assert "MESSAGING_CWD=/tmp/openclaw-workspace" in env_text
|
||||
assert "TELEGRAM_ALLOWED_USERS=111,222" in env_text
|
||||
assert "TELEGRAM_BOT_TOKEN=123:abc" in env_text
|
||||
|
||||
|
||||
def test_migrator_can_execute_only_selected_categories(tmp_path: Path):
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()

    (source / "workspace" / "skills" / "demo-skill").mkdir(parents=True)
    (source / "workspace" / "skills" / "demo-skill" / "SKILL.md").write_text(
        "---\nname: demo-skill\ndescription: demo\n---\n\nbody\n",
        encoding="utf-8",
    )
    (source / "workspace" / "MEMORY.md").write_text(
        "# Memory\n\n- keep me\n",
        encoding="utf-8",
    )
    (target / "config.yaml").write_text("command_allowlist: []\n", encoding="utf-8")

    migrator = mod.Migrator(
        source_root=source,
        target_root=target,
        execute=True,
        workspace_target=None,
        overwrite=False,
        migrate_secrets=False,
        output_dir=target / "migration-report",
        selected_options={"skills"},
    )
    report = migrator.migrate()

    imported_skill = target / "skills" / mod.SKILL_CATEGORY_DIRNAME / "demo-skill" / "SKILL.md"
    assert imported_skill.exists()
    assert not (target / "memories" / "MEMORY.md").exists()
    assert report["selection"]["selected"] == ["skills"]
    skipped_items = [item for item in report["items"] if item["status"] == "skipped"]
    assert any(item["kind"] == "memory" and item["reason"] == "Not selected for this run" for item in skipped_items)


def test_migrator_records_preset_in_report(tmp_path: Path):
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()
    (target / "config.yaml").write_text("command_allowlist: []\n", encoding="utf-8")

    migrator = mod.Migrator(
        source_root=source,
        target_root=target,
        execute=False,
        workspace_target=None,
        overwrite=False,
        migrate_secrets=False,
        output_dir=None,
        selected_options=mod.MIGRATION_PRESETS["user-data"],
        preset_name="user-data",
    )
    report = migrator.build_report()

    assert report["preset"] == "user-data"
    assert report["selection"]["preset"] == "user-data"
    assert report["skill_conflict_mode"] == "skip"
    assert report["selection"]["skill_conflict_mode"] == "skip"


def test_migrator_exports_full_overflow_entries(tmp_path: Path):
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()
    (target / "config.yaml").write_text("memory:\n memory_char_limit: 10\n user_char_limit: 10\n", encoding="utf-8")
    (source / "workspace").mkdir(parents=True)
    (source / "workspace" / "MEMORY.md").write_text(
        "# Memory\n\n- alpha\n- beta\n- gamma\n",
        encoding="utf-8",
    )

    migrator = mod.Migrator(
        source_root=source,
        target_root=target,
        execute=True,
        workspace_target=None,
        overwrite=False,
        migrate_secrets=False,
        output_dir=target / "migration-report",
        selected_options={"memory"},
    )
    report = migrator.migrate()

    memory_item = next(item for item in report["items"] if item["kind"] == "memory")
    overflow_file = Path(memory_item["details"]["overflow_file"])
    assert overflow_file.exists()
    text = overflow_file.read_text(encoding="utf-8")
    assert "alpha" in text or "beta" in text or "gamma" in text


def test_migrator_can_rename_conflicting_imported_skill(tmp_path: Path):
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()

    source_skill = source / "workspace" / "skills" / "demo-skill"
    source_skill.mkdir(parents=True)
    (source_skill / "SKILL.md").write_text(
        "---\nname: demo-skill\ndescription: demo\n---\n\nbody\n",
        encoding="utf-8",
    )

    existing_skill = target / "skills" / mod.SKILL_CATEGORY_DIRNAME / "demo-skill"
    existing_skill.mkdir(parents=True)
    (existing_skill / "SKILL.md").write_text(
        "---\nname: demo-skill\ndescription: existing\n---\n\nexisting\n",
        encoding="utf-8",
    )

    migrator = mod.Migrator(
        source_root=source,
        target_root=target,
        execute=True,
        workspace_target=None,
        overwrite=False,
        migrate_secrets=False,
        output_dir=target / "migration-report",
        skill_conflict_mode="rename",
    )
    report = migrator.migrate()

    renamed_skill = target / "skills" / mod.SKILL_CATEGORY_DIRNAME / "demo-skill-imported" / "SKILL.md"
    assert renamed_skill.exists()
    assert existing_skill.joinpath("SKILL.md").read_text(encoding="utf-8").endswith("existing\n")
    imported_items = [item for item in report["items"] if item["kind"] == "skill" and item["status"] == "migrated"]
    assert any(item["details"].get("renamed_from", "").endswith("/demo-skill") for item in imported_items)


def test_migrator_can_overwrite_conflicting_imported_skill_with_backup(tmp_path: Path):
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()

    source_skill = source / "workspace" / "skills" / "demo-skill"
    source_skill.mkdir(parents=True)
    (source_skill / "SKILL.md").write_text(
        "---\nname: demo-skill\ndescription: imported\n---\n\nfresh\n",
        encoding="utf-8",
    )

    existing_skill = target / "skills" / mod.SKILL_CATEGORY_DIRNAME / "demo-skill"
    existing_skill.mkdir(parents=True)
    (existing_skill / "SKILL.md").write_text(
        "---\nname: demo-skill\ndescription: existing\n---\n\nexisting\n",
        encoding="utf-8",
    )

    migrator = mod.Migrator(
        source_root=source,
        target_root=target,
        execute=True,
        workspace_target=None,
        overwrite=False,
        migrate_secrets=False,
        output_dir=target / "migration-report",
        skill_conflict_mode="overwrite",
    )
    report = migrator.migrate()

    assert existing_skill.joinpath("SKILL.md").read_text(encoding="utf-8").endswith("fresh\n")
    backup_items = [item for item in report["items"] if item["kind"] == "skill" and item["status"] == "migrated"]
    assert any(item["details"].get("backup") for item in backup_items)


def test_discord_settings_migrated(tmp_path: Path):
    """Discord bot token and allowlist migrate to .env."""
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()
    source.mkdir()

    (source / "openclaw.json").write_text(
        json.dumps({
            "channels": {
                "discord": {
                    "token": "discord-bot-token-123",
                    "allowFrom": ["111222333", "444555666"],
                }
            }
        }),
        encoding="utf-8",
    )

    migrator = mod.Migrator(
        source_root=source, target_root=target, execute=True,
        workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
        selected_options={"discord-settings"},
    )
    report = migrator.migrate()
    env_text = (target / ".env").read_text(encoding="utf-8")
    assert "DISCORD_BOT_TOKEN=discord-bot-token-123" in env_text
    assert "DISCORD_ALLOWED_USERS=111222333,444555666" in env_text


def test_slack_settings_migrated(tmp_path: Path):
    """Slack bot/app tokens and allowlist migrate to .env."""
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()
    source.mkdir()

    (source / "openclaw.json").write_text(
        json.dumps({
            "channels": {
                "slack": {
                    "botToken": "xoxb-slack-bot",
                    "appToken": "xapp-slack-app",
                    "allowFrom": ["U111", "U222"],
                }
            }
        }),
        encoding="utf-8",
    )

    migrator = mod.Migrator(
        source_root=source, target_root=target, execute=True,
        workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
        selected_options={"slack-settings"},
    )
    report = migrator.migrate()
    env_text = (target / ".env").read_text(encoding="utf-8")
    assert "SLACK_BOT_TOKEN=xoxb-slack-bot" in env_text
    assert "SLACK_APP_TOKEN=xapp-slack-app" in env_text
    assert "SLACK_ALLOWED_USERS=U111,U222" in env_text


def test_signal_settings_migrated(tmp_path: Path):
    """Signal account, HTTP URL, and allowlist migrate to .env."""
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()
    source.mkdir()

    (source / "openclaw.json").write_text(
        json.dumps({
            "channels": {
                "signal": {
                    "account": "+15551234567",
                    "httpUrl": "http://localhost:8080",
                    "allowFrom": ["+15559876543"],
                }
            }
        }),
        encoding="utf-8",
    )

    migrator = mod.Migrator(
        source_root=source, target_root=target, execute=True,
        workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
        selected_options={"signal-settings"},
    )
    report = migrator.migrate()
    env_text = (target / ".env").read_text(encoding="utf-8")
    assert "SIGNAL_ACCOUNT=+15551234567" in env_text
    assert "SIGNAL_HTTP_URL=http://localhost:8080" in env_text
    assert "SIGNAL_ALLOWED_USERS=+15559876543" in env_text


def test_model_config_migrated(tmp_path: Path):
    """Default model setting migrates to config.yaml."""
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()
    source.mkdir()

    (source / "openclaw.json").write_text(
        json.dumps({
            "agents": {"defaults": {"model": "anthropic/claude-sonnet-4"}}
        }),
        encoding="utf-8",
    )
    # config.yaml must exist for YAML merge to work
    (target / "config.yaml").write_text("model: openrouter/auto\n", encoding="utf-8")

    migrator = mod.Migrator(
        source_root=source, target_root=target, execute=True,
        workspace_target=None, overwrite=True, migrate_secrets=False, output_dir=None,
        selected_options={"model-config"},
    )
    report = migrator.migrate()
    config_text = (target / "config.yaml").read_text(encoding="utf-8")
    assert "anthropic/claude-sonnet-4" in config_text


def test_model_config_object_format(tmp_path: Path):
    """Model config handles {primary: ...} object format."""
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()
    source.mkdir()

    (source / "openclaw.json").write_text(
        json.dumps({
            "agents": {"defaults": {"model": {"primary": "openai/gpt-4o"}}}
        }),
        encoding="utf-8",
    )
    (target / "config.yaml").write_text("model: old-model\n", encoding="utf-8")

    migrator = mod.Migrator(
        source_root=source, target_root=target, execute=True,
        workspace_target=None, overwrite=True, migrate_secrets=False, output_dir=None,
        selected_options={"model-config"},
    )
    report = migrator.migrate()
    config_text = (target / "config.yaml").read_text(encoding="utf-8")
    assert "openai/gpt-4o" in config_text


def test_tts_config_migrated(tmp_path: Path):
    """TTS provider and voice settings migrate to config.yaml."""
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()
    source.mkdir()

    (source / "openclaw.json").write_text(
        json.dumps({
            "messages": {
                "tts": {
                    "provider": "elevenlabs",
                    "elevenlabs": {
                        "voiceId": "custom-voice-id",
                        "modelId": "eleven_turbo_v2",
                    },
                }
            }
        }),
        encoding="utf-8",
    )
    (target / "config.yaml").write_text("tts:\n provider: edge\n", encoding="utf-8")

    migrator = mod.Migrator(
        source_root=source, target_root=target, execute=True,
        workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
        selected_options={"tts-config"},
    )
    report = migrator.migrate()
    config_text = (target / "config.yaml").read_text(encoding="utf-8")
    assert "elevenlabs" in config_text
    assert "custom-voice-id" in config_text


def test_shared_skills_migrated(tmp_path: Path):
    """Shared skills from ~/.openclaw/skills/ are migrated."""
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()

    # Create a shared skill (not in workspace/skills/)
    (source / "skills" / "my-shared-skill").mkdir(parents=True)
    (source / "skills" / "my-shared-skill" / "SKILL.md").write_text(
        "---\nname: my-shared-skill\ndescription: shared\n---\n\nbody\n",
        encoding="utf-8",
    )

    migrator = mod.Migrator(
        source_root=source, target_root=target, execute=True,
        workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
        selected_options={"shared-skills"},
    )
    report = migrator.migrate()
    imported = target / "skills" / mod.SKILL_CATEGORY_DIRNAME / "my-shared-skill" / "SKILL.md"
    assert imported.exists()


def test_daily_memory_merged(tmp_path: Path):
    """Daily memory notes from workspace/memory/*.md are merged into MEMORY.md."""
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()

    mem_dir = source / "workspace" / "memory"
    mem_dir.mkdir(parents=True)
    (mem_dir / "2026-03-01.md").write_text(
        "# March 1 Notes\n\n- User prefers dark mode\n- Timezone: PST\n",
        encoding="utf-8",
    )
    (mem_dir / "2026-03-02.md").write_text(
        "# March 2 Notes\n\n- Working on migration project\n",
        encoding="utf-8",
    )

    migrator = mod.Migrator(
        source_root=source, target_root=target, execute=True,
        workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
        selected_options={"daily-memory"},
    )
    report = migrator.migrate()
    mem_path = target / "memories" / "MEMORY.md"
    assert mem_path.exists()
    content = mem_path.read_text(encoding="utf-8")
    assert "dark mode" in content
    assert "migration project" in content


def test_provider_keys_require_migrate_secrets_flag(tmp_path: Path):
    """Provider keys migration is double-gated: needs option + --migrate-secrets."""
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    target.mkdir()
    source.mkdir()

    (source / "openclaw.json").write_text(
        json.dumps({
            "models": {
                "providers": {
                    "openrouter": {
                        "apiKey": "sk-or-test-key",
                        "baseUrl": "https://openrouter.ai/api/v1",
                    }
                }
            }
        }),
        encoding="utf-8",
    )

    # Without --migrate-secrets: should skip
    migrator = mod.Migrator(
        source_root=source, target_root=target, execute=True,
        workspace_target=None, overwrite=False, migrate_secrets=False, output_dir=None,
        selected_options={"provider-keys"},
    )
    report = migrator.migrate()
    env_path = target / ".env"
    if env_path.exists():
        assert "sk-or-test-key" not in env_path.read_text(encoding="utf-8")

    # With --migrate-secrets: should import
    migrator2 = mod.Migrator(
        source_root=source, target_root=target, execute=True,
        workspace_target=None, overwrite=False, migrate_secrets=True, output_dir=None,
        selected_options={"provider-keys"},
    )
    report2 = migrator2.migrate()
    env_text = (target / ".env").read_text(encoding="utf-8")
    assert "OPENROUTER_API_KEY=sk-or-test-key" in env_text


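The double gate exercised by `test_provider_keys_require_migrate_secrets_flag` can be pictured as a single predicate. This is a minimal sketch under stated assumptions, not the migrator's actual code: the helper name `should_import_provider_keys` is hypothetical, and only the option name `provider-keys` and the `migrate_secrets` flag come from the test.

```python
def should_import_provider_keys(selected_options, migrate_secrets):
    """Hypothetical helper: both gates must be open before any key is copied."""
    # Gate 1: the category was explicitly selected for this run.
    # Gate 2: the caller opted in to moving secrets at all.
    return "provider-keys" in selected_options and bool(migrate_secrets)


# Mirrors the two runs in the test: same selection, different secrets flag.
without_flag = should_import_provider_keys({"provider-keys"}, migrate_secrets=False)
with_flag = should_import_provider_keys({"provider-keys"}, migrate_secrets=True)
not_selected = should_import_provider_keys({"skills"}, migrate_secrets=True)
```

Keeping the check in one place makes it harder for a new secret-bearing category to bypass the opt-in flag by accident.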
def test_workspace_agents_records_skip_when_missing(tmp_path: Path):
    """Bug fix: workspace-agents records 'skipped' when source is missing."""
    mod = load_module()
    source = tmp_path / ".openclaw"
    target = tmp_path / ".hermes"
    source.mkdir()
    target.mkdir()

    migrator = mod.Migrator(
        source_root=source, target_root=target, execute=True,
        workspace_target=tmp_path / "workspace", overwrite=False, migrate_secrets=False, output_dir=None,
        selected_options={"workspace-agents"},
    )
    report = migrator.migrate()
    wa_items = [i for i in report["items"] if i["kind"] == "workspace-agents"]
    assert len(wa_items) == 1
    assert wa_items[0]["status"] == "skipped"


def test_skill_installs_cleanly_under_skills_guard():
    skills_guard = load_skills_guard()
    result = skills_guard.scan_skill(
        SCRIPT_PATH.parents[1],
        source="official/migration/openclaw-migration",
    )

    # The migration script legitimately references AGENTS.md (migrating
    # workspace instructions), which triggers a false-positive
    # agent_config_mod finding. Accept "caution" or "safe"; a "dangerous"
    # verdict would indicate a *real* threat and must fail the test.
    assert result.verdict in ("safe", "caution"), f"Unexpected verdict: {result.verdict}"
    # All findings should be the known false-positive for AGENTS.md
    for f in result.findings:
        assert f.pattern_id == "agent_config_mod", f"Unexpected finding: {f}"
@@ -234,6 +234,55 @@ class TestHTTP413Compression:
        mock_compress.assert_called_once()
        assert result["completed"] is True

    def test_context_length_retry_rebuilds_request_after_compression(self, agent):
        """Retry must send the compressed transcript, not the stale oversized payload."""
        err_400 = Exception(
            "Error code: 400 - {'error': {'message': "
            "\"This endpoint's maximum context length is 128000 tokens. "
            "Please reduce the length of the messages.\"}}"
        )
        err_400.status_code = 400
        ok_resp = _mock_response(content="Recovered after real compression", finish_reason="stop")

        request_payloads = []

        def _side_effect(**kwargs):
            request_payloads.append(kwargs)
            if len(request_payloads) == 1:
                raise err_400
            return ok_resp

        agent.client.chat.completions.create.side_effect = _side_effect

        prefill = [
            {"role": "user", "content": "previous question"},
            {"role": "assistant", "content": "previous answer"},
        ]

        with (
            patch.object(agent, "_compress_context") as mock_compress,
            patch.object(agent, "_persist_session"),
            patch.object(agent, "_save_trajectory"),
            patch.object(agent, "_cleanup_task_resources"),
        ):
            mock_compress.return_value = (
                [{"role": "user", "content": "compressed summary"}],
                "compressed prompt",
            )
            result = agent.run_conversation("hello", conversation_history=prefill)

        assert result["completed"] is True
        assert len(request_payloads) == 2
        assert len(request_payloads[1]["messages"]) < len(request_payloads[0]["messages"])
        assert request_payloads[1]["messages"][0] == {
            "role": "system",
            "content": "compressed prompt",
        }
        assert request_payloads[1]["messages"][1] == {
            "role": "user",
            "content": "compressed summary",
        }

    def test_413_cannot_compress_further(self, agent):
        """When compression can't reduce messages, return partial result."""
        err_413 = _make_413_error()

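The behaviour that `test_context_length_retry_rebuilds_request_after_compression` guards can be sketched as a small retry wrapper. This is an illustrative sketch only, not the agent's real loop: `ContextTooLongError`, `send_with_compression_retry`, and the fake transport below are hypothetical names, and the `(messages, system_prompt)` return shape of `compress` is assumed from the mocked `_compress_context` return value in the test.

```python
class ContextTooLongError(Exception):
    """Hypothetical stand-in for a provider's context-length rejection."""


def send_with_compression_retry(send, system_prompt, messages, compress):
    """Send once; on a context-length error, compress and rebuild the request."""

    def build(prompt, msgs):
        return [{"role": "system", "content": prompt}, *msgs]

    try:
        return send(messages=build(system_prompt, messages))
    except ContextTooLongError:
        # Rebuild the payload from the compressed state instead of
        # resending the stale, oversized list built before the failure.
        messages, system_prompt = compress(messages, system_prompt)
        return send(messages=build(system_prompt, messages))


# Usage with a fake transport that rejects the first request.
payloads = []


def fake_send(messages):
    payloads.append(messages)
    if len(payloads) == 1:
        raise ContextTooLongError("maximum context length exceeded")
    return "ok"


def fake_compress(messages, system_prompt):
    return [{"role": "user", "content": "compressed summary"}], "compressed prompt"


history = [
    {"role": "user", "content": "previous question"},
    {"role": "assistant", "content": "previous answer"},
    {"role": "user", "content": "hello"},
]
result = send_with_compression_retry(fake_send, "original prompt", history, fake_compress)
```

The point of the sketch is the second `build` call: the bug class the test targets is a retry that compresses state but then reuses the payload constructed before compression.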
@@ -3,15 +3,15 @@ that only manifest at runtime (not in mocked unit tests)."""

import os
import sys
from unittest.mock import patch
from unittest.mock import MagicMock, patch

sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))


def _make_cli(env_overrides=None, **kwargs):
def _make_cli(env_overrides=None, config_overrides=None, **kwargs):
    """Create a HermesCLI instance with minimal mocking."""
    import cli as _cli_mod
    from cli import HermesCLI
    import importlib

    _clean_config = {
        "model": {
            "default": "anthropic/claude-opus-4.6",
@@ -22,13 +22,34 @@ def _make_cli(env_overrides=None, **kwargs):
        "agent": {},
        "terminal": {"env_type": "local"},
    }
    if config_overrides:
        _clean_config.update(config_overrides)
    clean_env = {"LLM_MODEL": "", "HERMES_MAX_ITERATIONS": ""}
    if env_overrides:
        clean_env.update(env_overrides)
    with patch("cli.get_tool_definitions", return_value=[]), \
         patch.dict("os.environ", clean_env, clear=False), \
         patch.dict(_cli_mod.__dict__, {"CLI_CONFIG": _clean_config}):
        return HermesCLI(**kwargs)
    prompt_toolkit_stubs = {
        "prompt_toolkit": MagicMock(),
        "prompt_toolkit.history": MagicMock(),
        "prompt_toolkit.styles": MagicMock(),
        "prompt_toolkit.patch_stdout": MagicMock(),
        "prompt_toolkit.application": MagicMock(),
        "prompt_toolkit.layout": MagicMock(),
        "prompt_toolkit.layout.processors": MagicMock(),
        "prompt_toolkit.filters": MagicMock(),
        "prompt_toolkit.layout.dimension": MagicMock(),
        "prompt_toolkit.layout.menus": MagicMock(),
        "prompt_toolkit.widgets": MagicMock(),
        "prompt_toolkit.key_binding": MagicMock(),
        "prompt_toolkit.completion": MagicMock(),
        "prompt_toolkit.formatted_text": MagicMock(),
    }
    with patch.dict(sys.modules, prompt_toolkit_stubs), \
         patch.dict("os.environ", clean_env, clear=False):
        import cli as _cli_mod
        _cli_mod = importlib.reload(_cli_mod)
        with patch.object(_cli_mod, "get_tool_definitions", return_value=[]), \
             patch.dict(_cli_mod.__dict__, {"CLI_CONFIG": _clean_config}):
            return _cli_mod.HermesCLI(**kwargs)


class TestMaxTurnsResolution:
@@ -53,6 +74,10 @@ class TestMaxTurnsResolution:
        cli_obj = _make_cli(env_overrides={"HERMES_MAX_ITERATIONS": "42"})
        assert cli_obj.max_turns == 42

    def test_legacy_root_max_turns_is_used_when_agent_key_exists_without_value(self):
        cli_obj = _make_cli(config_overrides={"agent": {}, "max_turns": 77})
        assert cli_obj.max_turns == 77

    def test_max_turns_never_none_for_agent(self):
        """The value passed to AIAgent must never be None (causes TypeError in run_conversation)."""
        cli = _make_cli()

@@ -0,0 +1,85 @@
"""Tests for agent/display.py: build_tool_preview()."""

import pytest
from agent.display import build_tool_preview


class TestBuildToolPreview:
    """Tests for build_tool_preview defensive handling and normal operation."""

    def test_none_args_returns_none(self):
        """PR #453: None args should not crash, should return None."""
        assert build_tool_preview("terminal", None) is None

    def test_empty_dict_returns_none(self):
        """Empty dict has no keys to preview."""
        assert build_tool_preview("terminal", {}) is None

    def test_known_tool_with_primary_arg(self):
        """Known tool with its primary arg should return a preview string."""
        result = build_tool_preview("terminal", {"command": "ls -la"})
        assert result is not None
        assert "ls -la" in result

    def test_web_search_preview(self):
        result = build_tool_preview("web_search", {"query": "hello world"})
        assert result is not None
        assert "hello world" in result

    def test_read_file_preview(self):
        result = build_tool_preview("read_file", {"path": "/tmp/test.py", "offset": 1})
        assert result is not None
        assert "/tmp/test.py" in result

    def test_unknown_tool_with_fallback_key(self):
        """Unknown tool but with a recognized fallback key should still preview."""
        result = build_tool_preview("custom_tool", {"query": "test query"})
        assert result is not None
        assert "test query" in result

    def test_unknown_tool_no_matching_key(self):
        """Unknown tool with no recognized keys should return None."""
        result = build_tool_preview("custom_tool", {"foo": "bar"})
        assert result is None

    def test_long_value_truncated(self):
        """Preview should truncate long values."""
        long_cmd = "a" * 100
        result = build_tool_preview("terminal", {"command": long_cmd}, max_len=40)
        assert result is not None
        assert len(result) <= 43  # max_len + "..."

    def test_process_tool_with_none_args(self):
        """Process tool special case should also handle None args."""
        assert build_tool_preview("process", None) is None

    def test_process_tool_normal(self):
        result = build_tool_preview("process", {"action": "poll", "session_id": "abc123"})
        assert result is not None
        assert "poll" in result

    def test_todo_tool_read(self):
        result = build_tool_preview("todo", {"merge": False})
        assert result is not None
        assert "reading" in result

    def test_todo_tool_with_todos(self):
        result = build_tool_preview("todo", {"todos": [{"id": "1", "content": "test", "status": "pending"}]})
        assert result is not None
        assert "1 task" in result

    def test_memory_tool_add(self):
        result = build_tool_preview("memory", {"action": "add", "target": "user", "content": "test note"})
        assert result is not None
        assert "user" in result

    def test_session_search_preview(self):
        result = build_tool_preview("session_search", {"query": "find something"})
        assert result is not None
        assert "find something" in result

    def test_false_like_args_zero(self):
        """Non-dict falsy values should return None, not crash."""
        assert build_tool_preview("terminal", 0) is None
        assert build_tool_preview("terminal", "") is None
        assert build_tool_preview("terminal", []) is None
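The defensive cases in `TestBuildToolPreview` (non-dict or empty args, primary key per tool, fallback keys, truncation) suggest a shape like the one below. This is a hedged sketch, not the implementation in `agent/display.py`: `preview_sketch`, `PRIMARY_ARGS`, and `FALLBACK_KEYS` are hypothetical names, the key tables contain only what the tests mention, and the special cases for `process`, `todo`, and `memory` are left out.

```python
# Hypothetical lookup tables, limited to the tool/key pairs seen in the tests.
PRIMARY_ARGS = {"terminal": "command", "web_search": "query", "read_file": "path"}
FALLBACK_KEYS = ("query", "command", "path")


def preview_sketch(tool_name, args, max_len=40):
    """Return a short one-line preview of a tool call, or None if nothing fits."""
    # Guard first: None, 0, "", [] and {} all mean "nothing to preview".
    if not isinstance(args, dict) or not args:
        return None
    key = PRIMARY_ARGS.get(tool_name)
    if key is None or key not in args:
        # Unknown tool (or missing primary arg): try generic keys in order.
        key = next((k for k in FALLBACK_KEYS if k in args), None)
    if key is None:
        return None
    value = str(args[key])
    if len(value) > max_len:
        value = value[:max_len] + "..."
    return value


short = preview_sketch("terminal", {"command": "ls -la"})
truncated = preview_sketch("terminal", {"command": "a" * 100}, max_len=40)
fallback = preview_sketch("custom_tool", {"query": "test query"})
```

Checking `isinstance(args, dict)` before anything else is what lets a single guard cover every falsy non-dict input the last test enumerates.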
@@ -94,13 +94,50 @@ class TestMessageStorage:
        session = db.get_session("s1")
        assert session["message_count"] == 2

    def test_tool_message_increments_tool_count(self, db):
    def test_tool_response_does_not_increment_tool_count(self, db):
        """Tool responses (role=tool) should not increment tool_call_count.

        Only assistant messages with tool_calls should count.
        """
        db.create_session(session_id="s1", source="cli")
        db.append_message("s1", role="tool", content="result", tool_name="web_search")

        session = db.get_session("s1")
        assert session["tool_call_count"] == 0

    def test_assistant_tool_calls_increment_by_count(self, db):
        """An assistant message with N tool_calls should increment by N."""
        db.create_session(session_id="s1", source="cli")
        tool_calls = [
            {"id": "call_1", "function": {"name": "web_search", "arguments": "{}"}},
        ]
        db.append_message("s1", role="assistant", content="", tool_calls=tool_calls)

        session = db.get_session("s1")
        assert session["tool_call_count"] == 1

    def test_tool_call_count_matches_actual_calls(self, db):
        """tool_call_count should equal the number of tool calls made, not messages."""
        db.create_session(session_id="s1", source="cli")

        # Assistant makes 2 parallel tool calls in one message
        tool_calls = [
            {"id": "call_1", "function": {"name": "ha_call_service", "arguments": "{}"}},
            {"id": "call_2", "function": {"name": "ha_call_service", "arguments": "{}"}},
        ]
        db.append_message("s1", role="assistant", content="", tool_calls=tool_calls)

        # Two tool responses come back
        db.append_message("s1", role="tool", content="ok", tool_name="ha_call_service")
        db.append_message("s1", role="tool", content="ok", tool_name="ha_call_service")

        session = db.get_session("s1")
        # Should be 2 (the actual number of tool calls), not 3
        assert session["tool_call_count"] == 2, (
            f"Expected 2 tool calls but got {session['tool_call_count']}. "
            "tool responses are double-counted and multi-call messages are under-counted"
        )

    def test_tool_calls_serialization(self, db):
        db.create_session(session_id="s1", source="cli")
        tool_calls = [{"id": "call_1", "function": {"name": "web_search", "arguments": "{}"}}]
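The counting rule these three tests pin down is small enough to state as a function. The sketch below is an assumption-labelled illustration, not the `SessionDB.append_message` implementation: `tool_call_increment` is a hypothetical helper, and the message dicts only imitate the arguments used in the tests.

```python
def tool_call_increment(role, tool_calls=None):
    """Hypothetical helper: how much one appended message adds to tool_call_count."""
    # Only assistant messages that carry tool_calls count, once per call.
    if role == "assistant" and tool_calls:
        return len(tool_calls)
    # Tool responses (role="tool") and plain messages add nothing.
    return 0


# Replays the sequence from test_tool_call_count_matches_actual_calls.
transcript = [
    {"role": "assistant", "tool_calls": [{"id": "call_1"}, {"id": "call_2"}]},
    {"role": "tool", "tool_calls": None},
    {"role": "tool", "tool_calls": None},
]
total = sum(tool_call_increment(m["role"], m["tool_calls"]) for m in transcript)
```

Counting on the assistant side avoids both failure modes named in the assertion message: tool responses being counted again, and a multi-call message counting as one.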
@@ -179,6 +216,54 @@ class TestFTS5Search:
        assert isinstance(results[0]["context"], list)
        assert len(results[0]["context"]) > 0

    def test_search_special_chars_do_not_crash(self, db):
        """FTS5 special characters in queries must not raise OperationalError."""
        db.create_session(session_id="s1", source="cli")
        db.append_message("s1", role="user", content="How do I use C++ templates?")

        # Each of these previously caused sqlite3.OperationalError
        dangerous_queries = [
            'C++',  # + is the FTS5 phrase-concatenation operator
            '"unterminated',  # unbalanced double-quote
            '(problem',  # unbalanced parenthesis
            'hello AND',  # dangling boolean operator
            '***',  # repeated wildcard
            '{test}',  # curly braces (column reference)
            'OR hello',  # leading boolean operator
            'a AND OR b',  # adjacent operators
        ]
        for query in dangerous_queries:
            # Must not raise; should return a list (possibly empty)
            results = db.search_messages(query)
            assert isinstance(results, list), f"Query {query!r} did not return a list"

    def test_search_sanitized_query_still_finds_content(self, db):
        """Sanitization must not break normal keyword search."""
        db.create_session(session_id="s1", source="cli")
        db.append_message("s1", role="user", content="Learning C++ templates today")

        # "C++" sanitized to "C" should still match "C++"
        results = db.search_messages("C++")
        # The word "C" appears in the content, so FTS5 should find it
        assert isinstance(results, list)

    def test_sanitize_fts5_query_strips_dangerous_chars(self):
        """Unit test for _sanitize_fts5_query static method."""
        from hermes_state import SessionDB
        s = SessionDB._sanitize_fts5_query
        assert s('hello world') == 'hello world'
        assert '+' not in s('C++')
        assert '"' not in s('"unterminated')
        assert '(' not in s('(problem')
        assert '{' not in s('{test}')
        # Dangling operators removed
        assert s('hello AND') == 'hello'
        assert s('OR world') == 'world'
        # Leading bare * removed
        assert s('***') == ''
        # Valid prefix kept
        assert s('deploy*') == 'deploy*'


# =========================================================================
# Session search and listing

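The assertions in `test_sanitize_fts5_query_strips_dangerous_chars` constrain the sanitizer enough to sketch one. What follows is a minimal sketch, not the project's `SessionDB._sanitize_fts5_query`: the function name, the character set, and the operator handling are assumptions chosen only to satisfy those assertions, and a real implementation may treat more FTS5 syntax.

```python
import re

# Assumed operator set: the bareword boolean operators in FTS5 query syntax.
_OPERATORS = {"AND", "OR", "NOT"}


def sanitize_fts5_query_sketch(query):
    """Strip characters and dangling operators that make FTS5 reject a query."""
    # Replace the special characters exercised by the tests with spaces.
    cleaned = re.sub(r'[+"(){}]', " ", query)
    # Collapse runs of '*' and drop any '*' that does not follow a word
    # character, so 'deploy*' survives but a bare '***' disappears.
    cleaned = re.sub(r"\*+", "*", cleaned)
    cleaned = re.sub(r"(?<!\w)\*", " ", cleaned)

    kept = []
    for token in cleaned.split():
        # Skip an operator that leads the query or directly follows another.
        if token in _OPERATORS and (not kept or kept[-1] in _OPERATORS):
            continue
        kept.append(token)
    # Drop operators left dangling at the end.
    while kept and kept[-1] in _OPERATORS:
        kept.pop()
    return " ".join(kept)


examples = {q: sanitize_fts5_query_sketch(q) for q in ["C++", "hello AND", "a AND OR b", "***"]}
```

An empty result (as for `'***'`) is a signal the caller would need to handle, for example by returning no matches instead of running an empty `MATCH` expression.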
@@ -601,7 +601,10 @@ class TestExecuteToolCalls:
        messages = []
        with patch("run_agent.handle_function_call", return_value="search result") as mock_hfc:
            agent._execute_tool_calls(mock_msg, messages, "task-1")
        mock_hfc.assert_called_once_with("web_search", {"q": "test"}, "task-1")
        # enabled_tools passes the agent's own valid_tool_names
        args, kwargs = mock_hfc.call_args
        assert args[:3] == ("web_search", {"q": "test"}, "task-1")
        assert set(kwargs.get("enabled_tools", [])) == agent.valid_tool_names
        assert len(messages) == 1
        assert messages[0]["role"] == "tool"
        assert "search result" in messages[0]["content"]
@@ -627,7 +630,9 @@ class TestExecuteToolCalls:
        with patch("run_agent.handle_function_call", return_value="ok") as mock_hfc:
            agent._execute_tool_calls(mock_msg, messages, "task-1")
        # Invalid JSON args should fall back to empty dict
        mock_hfc.assert_called_once_with("web_search", {}, "task-1")
        args, kwargs = mock_hfc.call_args
        assert args[:3] == ("web_search", {}, "task-1")
        assert set(kwargs.get("enabled_tools", [])) == agent.valid_tool_names
        assert len(messages) == 1
        assert messages[0]["role"] == "tool"
        assert messages[0]["tool_call_id"] == "c1"
@@ -829,6 +834,36 @@ class TestRunConversation:
        assert result["final_response"] == "All done"
        assert result["completed"] is True

    @pytest.mark.parametrize(
        ("first_content", "second_content", "expected_final"),
        [
            ("Part 1 ", "Part 2", "Part 1 Part 2"),
            ("<think>internal reasoning</think>", "Recovered final answer", "Recovered final answer"),
        ],
    )
    def test_length_finish_reason_requests_continuation(
        self, agent, first_content, second_content, expected_final
    ):
        self._setup_agent(agent)
        first = _mock_response(content=first_content, finish_reason="length")
        second = _mock_response(content=second_content, finish_reason="stop")
        agent.client.chat.completions.create.side_effect = [first, second]

        with (
            patch.object(agent, "_persist_session"),
            patch.object(agent, "_save_trajectory"),
            patch.object(agent, "_cleanup_task_resources"),
        ):
            result = agent.run_conversation("hello")

        assert result["completed"] is True
        assert result["api_calls"] == 2
        assert result["final_response"] == expected_final

        second_call_messages = agent.client.chat.completions.create.call_args_list[1].kwargs["messages"]
        assert second_call_messages[-1]["role"] == "user"
        assert "truncated by the output length limit" in second_call_messages[-1]["content"]


class TestRetryExhaustion:
    """Regression: retry_count > max_retries was dead code (off-by-one).

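`test_length_finish_reason_requests_continuation` expects that a response cut off with `finish_reason="length"` triggers a follow-up request, and that the visible parts are joined into one final answer. The loop below is a rough sketch of that behaviour under stated assumptions: `complete_with_continuation`, `strip_think`, `call_model`, `max_continuations`, and the full wording of `CONTINUE_PROMPT` are hypothetical. Only the phrase "truncated by the output length limit" and the `<think>` handling are taken from the test.

```python
import re

# Assumption: only the phrase "truncated by the output length limit" comes
# from the test; the rest of the wording is invented for this sketch.
CONTINUE_PROMPT = (
    "Your previous response was truncated by the output length limit. "
    "Continue exactly where you left off."
)


def strip_think(text):
    """Drop <think>...</think> blocks so only user-visible text remains."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)


def complete_with_continuation(call_model, messages, max_continuations=3):
    """Call the model until it stops for a reason other than 'length'.

    call_model(history) is a hypothetical callable that returns a
    (content, finish_reason) pair.
    """
    history = list(messages)
    parts = []
    calls = 0
    while True:
        content, finish_reason = call_model(history)
        calls += 1
        parts.append(strip_think(content))
        if finish_reason != "length" or calls > max_continuations:
            break
        # Feed the partial answer back and ask the model to continue.
        history.append({"role": "assistant", "content": content})
        history.append({"role": "user", "content": CONTINUE_PROMPT})
    return "".join(parts), calls
```

With a stub model that first returns `("Part 1 ", "length")` and then `("Part 2", "stop")`, this sketch yields `"Part 1 Part 2"` after two calls, matching the first parametrized case.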
@@ -158,6 +158,29 @@ def test_custom_endpoint_auto_provider_prefers_openai_key(monkeypatch):
    assert resolved["api_key"] == "sk-vllm-key"


def test_resolve_runtime_provider_nous_api(monkeypatch):
    """Nous Portal API key provider resolves via the api_key path."""
    monkeypatch.setattr(rp, "resolve_provider", lambda *a, **k: "nous-api")
    monkeypatch.setattr(
        rp,
        "resolve_api_key_provider_credentials",
        lambda pid: {
            "provider": "nous-api",
            "api_key": "nous-test-key",
            "base_url": "https://inference-api.nousresearch.com/v1",
            "source": "NOUS_API_KEY",
        },
    )

    resolved = rp.resolve_runtime_provider(requested="nous-api")

    assert resolved["provider"] == "nous-api"
    assert resolved["api_mode"] == "chat_completions"
    assert resolved["base_url"] == "https://inference-api.nousresearch.com/v1"
    assert resolved["api_key"] == "nous-test-key"
    assert resolved["requested_provider"] == "nous-api"


def test_resolve_requested_provider_precedence(monkeypatch):
    monkeypatch.setenv("HERMES_INFERENCE_PROVIDER", "nous")
    monkeypatch.setattr(rp, "_get_model_config", lambda: {"provider": "openai-codex"})

@@ -136,7 +136,7 @@ class TestToolsetConsistency:

     def test_hermes_platforms_share_core_tools(self):
         """All hermes-* platform toolsets should have the same tools."""
-        platforms = ["hermes-cli", "hermes-telegram", "hermes-discord", "hermes-whatsapp", "hermes-slack"]
+        platforms = ["hermes-cli", "hermes-telegram", "hermes-discord", "hermes-whatsapp", "hermes-slack", "hermes-signal", "hermes-homeassistant"]
         tool_sets = [set(TOOLSETS[p]["tools"]) for p in platforms]
         # All platform toolsets should be identical
         for ts in tool_sets[1:]:
@@ -0,0 +1,385 @@
"""Tests for tools/checkpoint_manager.py — CheckpointManager."""

import os
import json
import shutil
import pytest
from pathlib import Path
from unittest.mock import patch

from tools.checkpoint_manager import (
    CheckpointManager,
    _shadow_repo_path,
    _init_shadow_repo,
    _run_git,
    _git_env,
    _dir_file_count,
    format_checkpoint_list,
    DEFAULT_EXCLUDES,
    CHECKPOINT_BASE,
)


# =========================================================================
# Fixtures
# =========================================================================

@pytest.fixture()
def work_dir(tmp_path):
    """Temporary working directory."""
    d = tmp_path / "project"
    d.mkdir()
    (d / "main.py").write_text("print('hello')\\n")
    (d / "README.md").write_text("# Project\\n")
    return d


@pytest.fixture()
def checkpoint_base(tmp_path):
    """Isolated checkpoint base — never writes to ~/.hermes/."""
    return tmp_path / "checkpoints"


@pytest.fixture()
def mgr(work_dir, checkpoint_base, monkeypatch):
    """CheckpointManager with redirected checkpoint base."""
    monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
    return CheckpointManager(enabled=True, max_snapshots=50)


@pytest.fixture()
def disabled_mgr(checkpoint_base, monkeypatch):
    """Disabled CheckpointManager."""
    monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
    return CheckpointManager(enabled=False)


# =========================================================================
# Shadow repo path
# =========================================================================

class TestShadowRepoPath:
    def test_deterministic(self, work_dir, checkpoint_base, monkeypatch):
        monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
        p1 = _shadow_repo_path(str(work_dir))
        p2 = _shadow_repo_path(str(work_dir))
        assert p1 == p2

    def test_different_dirs_different_paths(self, tmp_path, checkpoint_base, monkeypatch):
        monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
        p1 = _shadow_repo_path(str(tmp_path / "a"))
        p2 = _shadow_repo_path(str(tmp_path / "b"))
        assert p1 != p2

    def test_under_checkpoint_base(self, work_dir, checkpoint_base, monkeypatch):
        monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
        p = _shadow_repo_path(str(work_dir))
        assert str(p).startswith(str(checkpoint_base))


# =========================================================================
# Shadow repo init
# =========================================================================

class TestShadowRepoInit:
    def test_creates_git_repo(self, work_dir, checkpoint_base, monkeypatch):
        monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
        shadow = _shadow_repo_path(str(work_dir))
        err = _init_shadow_repo(shadow, str(work_dir))
        assert err is None
        assert (shadow / "HEAD").exists()

    def test_no_git_in_project_dir(self, work_dir, checkpoint_base, monkeypatch):
        monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
        shadow = _shadow_repo_path(str(work_dir))
        _init_shadow_repo(shadow, str(work_dir))
        assert not (work_dir / ".git").exists()

    def test_has_exclude_file(self, work_dir, checkpoint_base, monkeypatch):
        monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
        shadow = _shadow_repo_path(str(work_dir))
        _init_shadow_repo(shadow, str(work_dir))
        exclude = shadow / "info" / "exclude"
        assert exclude.exists()
        content = exclude.read_text()
        assert "node_modules/" in content
        assert ".env" in content

    def test_has_workdir_file(self, work_dir, checkpoint_base, monkeypatch):
        monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
        shadow = _shadow_repo_path(str(work_dir))
        _init_shadow_repo(shadow, str(work_dir))
        workdir_file = shadow / "HERMES_WORKDIR"
        assert workdir_file.exists()
        assert str(work_dir.resolve()) in workdir_file.read_text()

    def test_idempotent(self, work_dir, checkpoint_base, monkeypatch):
        monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
        shadow = _shadow_repo_path(str(work_dir))
        err1 = _init_shadow_repo(shadow, str(work_dir))
        err2 = _init_shadow_repo(shadow, str(work_dir))
        assert err1 is None
        assert err2 is None


# =========================================================================
# CheckpointManager — disabled
# =========================================================================

class TestDisabledManager:
    def test_ensure_checkpoint_returns_false(self, disabled_mgr, work_dir):
        assert disabled_mgr.ensure_checkpoint(str(work_dir)) is False

    def test_new_turn_works(self, disabled_mgr):
        disabled_mgr.new_turn()  # should not raise


# =========================================================================
# CheckpointManager — taking checkpoints
# =========================================================================

class TestTakeCheckpoint:
    def test_first_checkpoint(self, mgr, work_dir):
        result = mgr.ensure_checkpoint(str(work_dir), "initial")
        assert result is True

    def test_dedup_same_turn(self, mgr, work_dir):
        r1 = mgr.ensure_checkpoint(str(work_dir), "first")
        r2 = mgr.ensure_checkpoint(str(work_dir), "second")
        assert r1 is True
        assert r2 is False  # dedup'd

    def test_new_turn_resets_dedup(self, mgr, work_dir):
        r1 = mgr.ensure_checkpoint(str(work_dir), "turn 1")
        assert r1 is True

        mgr.new_turn()

        # Modify a file so there's something to commit
        (work_dir / "main.py").write_text("print('modified')\\n")
        r2 = mgr.ensure_checkpoint(str(work_dir), "turn 2")
        assert r2 is True

    def test_no_changes_skips_commit(self, mgr, work_dir):
        # First checkpoint
        mgr.ensure_checkpoint(str(work_dir), "initial")
        mgr.new_turn()

        # No file changes — should return False (nothing to commit)
        r = mgr.ensure_checkpoint(str(work_dir), "no changes")
        assert r is False

    def test_skip_root_dir(self, mgr):
        r = mgr.ensure_checkpoint("/", "root")
        assert r is False

    def test_skip_home_dir(self, mgr):
        r = mgr.ensure_checkpoint(str(Path.home()), "home")
        assert r is False


# =========================================================================
# CheckpointManager — listing checkpoints
# =========================================================================

class TestListCheckpoints:
    def test_empty_when_no_checkpoints(self, mgr, work_dir):
        result = mgr.list_checkpoints(str(work_dir))
        assert result == []

    def test_list_after_take(self, mgr, work_dir):
        mgr.ensure_checkpoint(str(work_dir), "test checkpoint")
        result = mgr.list_checkpoints(str(work_dir))
        assert len(result) == 1
        assert result[0]["reason"] == "test checkpoint"
        assert "hash" in result[0]
        assert "short_hash" in result[0]
        assert "timestamp" in result[0]

    def test_multiple_checkpoints_ordered(self, mgr, work_dir):
        mgr.ensure_checkpoint(str(work_dir), "first")
        mgr.new_turn()

        (work_dir / "main.py").write_text("v2\\n")
        mgr.ensure_checkpoint(str(work_dir), "second")
        mgr.new_turn()

        (work_dir / "main.py").write_text("v3\\n")
        mgr.ensure_checkpoint(str(work_dir), "third")

        result = mgr.list_checkpoints(str(work_dir))
        assert len(result) == 3
        # Most recent first
        assert result[0]["reason"] == "third"
        assert result[2]["reason"] == "first"


# =========================================================================
# CheckpointManager — restoring
# =========================================================================

class TestRestore:
    def test_restore_to_previous(self, mgr, work_dir):
        # Write original content
        (work_dir / "main.py").write_text("original\\n")
        mgr.ensure_checkpoint(str(work_dir), "original state")
        mgr.new_turn()

        # Modify the file
        (work_dir / "main.py").write_text("modified\\n")

        # Get the checkpoint hash
        checkpoints = mgr.list_checkpoints(str(work_dir))
        assert len(checkpoints) == 1

        # Restore
        result = mgr.restore(str(work_dir), checkpoints[0]["hash"])
        assert result["success"] is True

        # File should be back to original
        assert (work_dir / "main.py").read_text() == "original\\n"

    def test_restore_invalid_hash(self, mgr, work_dir):
        mgr.ensure_checkpoint(str(work_dir), "initial")
        result = mgr.restore(str(work_dir), "deadbeef1234")
        assert result["success"] is False

    def test_restore_no_checkpoints(self, mgr, work_dir):
        result = mgr.restore(str(work_dir), "abc123")
        assert result["success"] is False

    def test_restore_creates_pre_rollback_snapshot(self, mgr, work_dir):
        (work_dir / "main.py").write_text("v1\\n")
        mgr.ensure_checkpoint(str(work_dir), "v1")
        mgr.new_turn()

        (work_dir / "main.py").write_text("v2\\n")

        checkpoints = mgr.list_checkpoints(str(work_dir))
        mgr.restore(str(work_dir), checkpoints[0]["hash"])

        # Should now have 2 checkpoints: original + pre-rollback
        all_cps = mgr.list_checkpoints(str(work_dir))
        assert len(all_cps) >= 2
        assert "pre-rollback" in all_cps[0]["reason"]


# =========================================================================
# CheckpointManager — working dir resolution
# =========================================================================

class TestWorkingDirResolution:
    def test_resolves_git_project_root(self, tmp_path):
        mgr = CheckpointManager(enabled=True)
        project = tmp_path / "myproject"
        project.mkdir()
        (project / ".git").mkdir()
        subdir = project / "src"
        subdir.mkdir()
        filepath = subdir / "main.py"
        filepath.write_text("x\\n")

        result = mgr.get_working_dir_for_path(str(filepath))
        assert result == str(project)

    def test_resolves_pyproject_root(self, tmp_path):
        mgr = CheckpointManager(enabled=True)
        project = tmp_path / "pyproj"
        project.mkdir()
        (project / "pyproject.toml").write_text("[project]\\n")
        subdir = project / "src"
        subdir.mkdir()

        result = mgr.get_working_dir_for_path(str(subdir / "file.py"))
        assert result == str(project)

    def test_falls_back_to_parent(self, tmp_path):
        mgr = CheckpointManager(enabled=True)
        filepath = tmp_path / "random" / "file.py"
        filepath.parent.mkdir(parents=True)
        filepath.write_text("x\\n")

        result = mgr.get_working_dir_for_path(str(filepath))
        assert result == str(filepath.parent)


# =========================================================================
# Git env isolation
# =========================================================================

class TestGitEnvIsolation:
    def test_sets_git_dir(self, tmp_path):
        shadow = tmp_path / "shadow"
        env = _git_env(shadow, str(tmp_path / "work"))
        assert env["GIT_DIR"] == str(shadow)

    def test_sets_work_tree(self, tmp_path):
        shadow = tmp_path / "shadow"
        work = tmp_path / "work"
        env = _git_env(shadow, str(work))
        assert env["GIT_WORK_TREE"] == str(work.resolve())

    def test_clears_index_file(self, tmp_path, monkeypatch):
        monkeypatch.setenv("GIT_INDEX_FILE", "/some/index")
        shadow = tmp_path / "shadow"
        env = _git_env(shadow, str(tmp_path))
        assert "GIT_INDEX_FILE" not in env


# =========================================================================
# format_checkpoint_list
# =========================================================================

class TestFormatCheckpointList:
    def test_empty_list(self):
        result = format_checkpoint_list([], "/some/dir")
        assert "No checkpoints" in result

    def test_formats_entries(self):
        cps = [
            {"hash": "abc123", "short_hash": "abc1", "timestamp": "2026-03-09T21:15:00-07:00", "reason": "before write_file"},
            {"hash": "def456", "short_hash": "def4", "timestamp": "2026-03-09T21:10:00-07:00", "reason": "before patch"},
        ]
        result = format_checkpoint_list(cps, "/home/user/project")
        assert "abc1" in result
        assert "def4" in result
        assert "before write_file" in result
        assert "/rollback" in result


# =========================================================================
# File count guard
# =========================================================================

class TestDirFileCount:
    def test_counts_files(self, work_dir):
        count = _dir_file_count(str(work_dir))
        assert count >= 2  # main.py + README.md

    def test_nonexistent_dir(self, tmp_path):
        count = _dir_file_count(str(tmp_path / "nonexistent"))
        assert count == 0


# =========================================================================
# Error resilience
# =========================================================================

class TestErrorResilience:
    def test_no_git_installed(self, work_dir, checkpoint_base, monkeypatch):
        monkeypatch.setattr("tools.checkpoint_manager.CHECKPOINT_BASE", checkpoint_base)
        mgr = CheckpointManager(enabled=True)
        # Mock git not found
        monkeypatch.setattr("shutil.which", lambda x: None)
        mgr._git_available = None  # reset lazy probe
        result = mgr.ensure_checkpoint(str(work_dir), "test")
        assert result is False

    def test_checkpoint_failure_does_not_raise(self, mgr, work_dir, monkeypatch):
        """Checkpoint failures should never raise — they're silently logged."""
        def broken_run_git(*args, **kwargs):
            raise OSError("git exploded")
        monkeypatch.setattr("tools.checkpoint_manager._run_git", broken_run_git)
        # Should not raise
        result = mgr.ensure_checkpoint(str(work_dir), "test")
        assert result is False
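`TestShadowRepoPath` and `TestGitEnvIsolation` together describe a "shadow repository": git metadata lives under a checkpoint base directory while the work tree stays in the project. The sketch below shows one way the two helpers could behave. The names `shadow_repo_path` and `git_env`, the choice of SHA-256, and the 16-character digest length are assumptions; only the `GIT_DIR`, `GIT_WORK_TREE`, and `GIT_INDEX_FILE` handling is taken from the tests.

```python
import hashlib
import os
from pathlib import Path


def shadow_repo_path(base, working_dir):
    """Map a working directory to a stable location under `base`.

    Assumption: hashing the resolved path is one simple way to get a
    deterministic, collision-resistant directory name.
    """
    resolved = str(Path(working_dir).resolve())
    digest = hashlib.sha256(resolved.encode("utf-8")).hexdigest()[:16]
    return Path(base) / digest


def git_env(shadow_repo, working_dir):
    """Build an environment that points git at the shadow repository."""
    env = os.environ.copy()
    env["GIT_DIR"] = str(shadow_repo)
    env["GIT_WORK_TREE"] = str(Path(working_dir).resolve())
    # An inherited index file would make git stage into the wrong index.
    env.pop("GIT_INDEX_FILE", None)
    return env
```

Because the git directory is supplied through the environment, no `.git` directory has to be created inside the project itself, which is the property `test_no_git_in_project_dir` checks.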
@@ -558,6 +558,51 @@ class TestConvertToPng:
        assert result is True
        assert dest.exists() and dest.stat().st_size > 0

    def test_imagemagick_failure_preserves_original(self, tmp_path):
        """When ImageMagick convert fails, the original file must not be lost."""
        dest = tmp_path / "img.png"
        original_data = FAKE_BMP
        dest.write_bytes(original_data)

        def fake_run_fail(cmd, **kw):
            # Simulate convert failing without producing output
            return MagicMock(returncode=1)

        with patch.dict(sys.modules, {"PIL": None, "PIL.Image": None}):
            with patch("hermes_cli.clipboard.subprocess.run", side_effect=fake_run_fail):
                _convert_to_png(dest)

        # Original file must still exist with original content
        assert dest.exists(), "Original file was lost after failed conversion"
        assert dest.read_bytes() == original_data

    def test_imagemagick_not_installed_preserves_original(self, tmp_path):
        """When ImageMagick is not installed, the original file must not be lost."""
        dest = tmp_path / "img.png"
        original_data = FAKE_BMP
        dest.write_bytes(original_data)

        with patch.dict(sys.modules, {"PIL": None, "PIL.Image": None}):
            with patch("hermes_cli.clipboard.subprocess.run", side_effect=FileNotFoundError):
                _convert_to_png(dest)

        assert dest.exists(), "Original file was lost when ImageMagick not installed"
        assert dest.read_bytes() == original_data

    def test_imagemagick_timeout_preserves_original(self, tmp_path):
        """When ImageMagick times out, the original file must not be lost."""
        import subprocess
        dest = tmp_path / "img.png"
        original_data = FAKE_BMP
        dest.write_bytes(original_data)

        with patch.dict(sys.modules, {"PIL": None, "PIL.Image": None}):
            with patch("hermes_cli.clipboard.subprocess.run", side_effect=subprocess.TimeoutExpired("convert", 5)):
                _convert_to_png(dest)

        assert dest.exists(), "Original file was lost after timeout"
        assert dest.read_bytes() == original_data


# ── has_clipboard_image dispatch ─────────────────────────────────────────

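The three `preserves_original` tests above all guard the same invariant: a failed, missing, or timed-out `convert` call must leave the source file untouched. One common way to get that property is to convert into a temporary sibling file and swap it in only on success. The sketch below illustrates that pattern. The function name `convert_in_place`, the `.converting` suffix, and the injectable `run` parameter are hypothetical, and this is not the `_convert_to_png` code from `hermes_cli.clipboard`.

```python
import os
import subprocess
from pathlib import Path


def convert_in_place(path, run=subprocess.run):
    """Convert `path` to PNG without risking the original file.

    `run` is injectable so the behaviour can be exercised without
    ImageMagick installed (an assumption made for this sketch).
    """
    path = Path(path)
    tmp = path.with_name(path.name + ".converting")
    try:
        # Write to a temporary sibling; the "png:" prefix forces PNG output.
        proc = run(
            ["convert", str(path), "png:" + str(tmp)],
            capture_output=True,
            timeout=5,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        tmp.unlink(missing_ok=True)
        return False
    if proc.returncode != 0 or not tmp.exists() or tmp.stat().st_size == 0:
        tmp.unlink(missing_ok=True)
        return False
    # Atomic swap: the original is only replaced once the output is complete.
    os.replace(tmp, path)
    return True
```

The original bytes are overwritten only by the final `os.replace`, so every early return leaves them as they were.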
@@ -12,17 +12,21 @@ Run with: python -m pytest tests/test_code_execution.py -v
"""

import json
import os
import sys
import time
import threading
import unittest
-from unittest.mock import patch
+from unittest.mock import patch, MagicMock

from tools.code_execution_tool import (
    SANDBOX_ALLOWED_TOOLS,
    execute_code,
    generate_hermes_tools_module,
    check_sandbox_requirements,
    build_execute_code_schema,
    EXECUTE_CODE_SCHEMA,
    _TOOL_DOC_LINES,
)

@@ -393,5 +397,351 @@ class TestStubSchemaDrift(unittest.TestCase):
        self.assertIn("mode", src)


# ---------------------------------------------------------------------------
# build_execute_code_schema
# ---------------------------------------------------------------------------

class TestBuildExecuteCodeSchema(unittest.TestCase):
    """Tests for build_execute_code_schema — the dynamic schema generator."""

    def test_default_includes_all_tools(self):
        schema = build_execute_code_schema()
        desc = schema["description"]
        for name, _ in _TOOL_DOC_LINES:
            self.assertIn(name, desc, f"Default schema should mention '{name}'")

    def test_schema_structure(self):
        schema = build_execute_code_schema()
        self.assertEqual(schema["name"], "execute_code")
        self.assertIn("parameters", schema)
        self.assertIn("code", schema["parameters"]["properties"])
        self.assertEqual(schema["parameters"]["required"], ["code"])

    def test_subset_only_lists_enabled_tools(self):
        enabled = {"terminal", "read_file"}
        schema = build_execute_code_schema(enabled)
        desc = schema["description"]
        self.assertIn("terminal(", desc)
        self.assertIn("read_file(", desc)
        self.assertNotIn("web_search(", desc)
        self.assertNotIn("web_extract(", desc)
        self.assertNotIn("write_file(", desc)

    def test_single_tool(self):
        schema = build_execute_code_schema({"terminal"})
        desc = schema["description"]
        self.assertIn("terminal(", desc)
        self.assertNotIn("web_search(", desc)

    def test_import_examples_prefer_web_search_and_terminal(self):
        enabled = {"web_search", "terminal", "read_file"}
        schema = build_execute_code_schema(enabled)
        code_desc = schema["parameters"]["properties"]["code"]["description"]
        self.assertIn("web_search", code_desc)
        self.assertIn("terminal", code_desc)

    def test_import_examples_fallback_when_no_preferred(self):
        """When neither web_search nor terminal are enabled, falls back to
        sorted first two tools."""
        enabled = {"read_file", "write_file", "patch"}
        schema = build_execute_code_schema(enabled)
        code_desc = schema["parameters"]["properties"]["code"]["description"]
        # Should use sorted first 2: patch, read_file
        self.assertIn("patch", code_desc)
        self.assertIn("read_file", code_desc)

    def test_empty_set_produces_valid_description(self):
        """build_execute_code_schema(set()) must not produce 'import , ...'
        in the code property description."""
        schema = build_execute_code_schema(set())
        code_desc = schema["parameters"]["properties"]["code"]["description"]
        self.assertNotIn("import , ...", code_desc,
                         "Empty enabled set produces broken import syntax in description")

    def test_real_scenario_all_sandbox_tools_disabled(self):
        """Reproduce the exact code path from model_tools.py:231-234.

        Scenario: user runs `hermes tools code_execution` (only code_execution
        toolset enabled). tools_to_include = {"execute_code"}.

        model_tools.py does:
            sandbox_enabled = SANDBOX_ALLOWED_TOOLS & tools_to_include
            dynamic_schema = build_execute_code_schema(sandbox_enabled)

        SANDBOX_ALLOWED_TOOLS = {web_search, web_extract, read_file, write_file,
                                 search_files, patch, terminal}
        tools_to_include = {"execute_code"}
        intersection = empty set
        """
        # Simulate model_tools.py:233
        tools_to_include = {"execute_code"}
        sandbox_enabled = SANDBOX_ALLOWED_TOOLS & tools_to_include

        self.assertEqual(sandbox_enabled, set(),
                         "Intersection should be empty when only execute_code is enabled")

        schema = build_execute_code_schema(sandbox_enabled)
        code_desc = schema["parameters"]["properties"]["code"]["description"]
        self.assertNotIn("import , ...", code_desc,
                         "Bug: broken import syntax sent to the model")

    def test_real_scenario_only_vision_enabled(self):
        """Another real path: user runs `hermes tools code_execution,vision`.

        tools_to_include = {"execute_code", "vision_analyze"}
        SANDBOX_ALLOWED_TOOLS has neither, so intersection is empty.
        """
        tools_to_include = {"execute_code", "vision_analyze"}
        sandbox_enabled = SANDBOX_ALLOWED_TOOLS & tools_to_include

        self.assertEqual(sandbox_enabled, set())

        schema = build_execute_code_schema(sandbox_enabled)
        code_desc = schema["parameters"]["properties"]["code"]["description"]
        self.assertNotIn("import , ...", code_desc)

    def test_description_mentions_limits(self):
        schema = build_execute_code_schema()
        desc = schema["description"]
        self.assertIn("5-minute timeout", desc)
        self.assertIn("50KB", desc)
        self.assertIn("50 tool calls", desc)

    def test_description_mentions_helpers(self):
        schema = build_execute_code_schema()
        desc = schema["description"]
        self.assertIn("json_parse", desc)
        self.assertIn("shell_quote", desc)
        self.assertIn("retry", desc)

    def test_none_defaults_to_all_tools(self):
        schema_none = build_execute_code_schema(None)
        schema_all = build_execute_code_schema(SANDBOX_ALLOWED_TOOLS)
        self.assertEqual(schema_none["description"], schema_all["description"])


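Several tests in `TestBuildExecuteCodeSchema` concern the import example embedded in the `code` parameter description: prefer `web_search` and `terminal`, fall back to the first two tools in sorted order, and never emit the broken fragment `import , ...` for an empty set. A small sketch of that selection logic follows. The helper name `pick_import_example` and the exact output format are assumptions inferred from the test strings, not the code in `tools/code_execution_tool.py`.

```python
def pick_import_example(enabled):
    """Return an illustrative import line for the enabled sandbox tools.

    Hypothetical helper: the real schema builder may format this differently.
    """
    # Prefer the two tools the tests single out, in a fixed order.
    preferred = [name for name in ("web_search", "terminal") if name in enabled]
    # Otherwise fall back to the first two tool names in sorted order.
    examples = preferred or sorted(enabled)[:2]
    if not examples:
        # No sandbox tools enabled: emit nothing rather than "import , ...".
        return ""
    return "from hermes_tools import " + ", ".join(examples) + ", ..."
```

Returning an empty string for the empty set is one possible design; the tests only require that the malformed `import , ...` text never reaches the model.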
# ---------------------------------------------------------------------------
# Environment variable filtering (security critical)
# ---------------------------------------------------------------------------

@unittest.skipIf(sys.platform == "win32", "UDS not available on Windows")
class TestEnvVarFiltering(unittest.TestCase):
    """Verify that execute_code filters environment variables correctly.

    The child process should NOT receive API keys, tokens, or secrets.
    It should receive safe vars like PATH, HOME, LANG, etc.
    """

    def _get_child_env(self, extra_env=None):
        """Run a script that dumps its environment and return the env dict."""
        code = (
            "import os, json\n"
            "print(json.dumps(dict(os.environ)))\n"
        )
        env_backup = os.environ.copy()
        try:
            if extra_env:
                os.environ.update(extra_env)
            with patch("model_tools.handle_function_call", return_value='{}'), \
                 patch("tools.code_execution_tool._load_config",
                       return_value={"timeout": 10, "max_tool_calls": 50}):
                raw = execute_code(code, task_id="test-env",
                                   enabled_tools=list(SANDBOX_ALLOWED_TOOLS))
        finally:
            os.environ.clear()
            os.environ.update(env_backup)

        result = json.loads(raw)
        self.assertEqual(result["status"], "success", result.get("error", ""))
        return json.loads(result["output"].strip())

    def test_api_keys_excluded(self):
        child_env = self._get_child_env({
            "OPENAI_API_KEY": "sk-secret123",
            "ANTHROPIC_API_KEY": "sk-ant-secret",
            "FIRECRAWL_API_KEY": "fc-secret",
        })
        self.assertNotIn("OPENAI_API_KEY", child_env)
        self.assertNotIn("ANTHROPIC_API_KEY", child_env)
        self.assertNotIn("FIRECRAWL_API_KEY", child_env)

    def test_tokens_excluded(self):
        child_env = self._get_child_env({
            "GITHUB_TOKEN": "ghp_secret",
            "MODAL_TOKEN_ID": "tok-123",
            "MODAL_TOKEN_SECRET": "tok-sec",
        })
        self.assertNotIn("GITHUB_TOKEN", child_env)
        self.assertNotIn("MODAL_TOKEN_ID", child_env)
        self.assertNotIn("MODAL_TOKEN_SECRET", child_env)

    def test_password_vars_excluded(self):
        child_env = self._get_child_env({
            "DB_PASSWORD": "hunter2",
            "MY_PASSWD": "secret",
            "AUTH_CREDENTIAL": "cred",
        })
        self.assertNotIn("DB_PASSWORD", child_env)
        self.assertNotIn("MY_PASSWD", child_env)
        self.assertNotIn("AUTH_CREDENTIAL", child_env)

    def test_path_included(self):
        child_env = self._get_child_env()
        self.assertIn("PATH", child_env)

    def test_home_included(self):
        child_env = self._get_child_env()
        self.assertIn("HOME", child_env)

    def test_hermes_rpc_socket_injected(self):
        child_env = self._get_child_env()
        self.assertIn("HERMES_RPC_SOCKET", child_env)

    def test_pythondontwritebytecode_set(self):
        child_env = self._get_child_env()
        self.assertEqual(child_env.get("PYTHONDONTWRITEBYTECODE"), "1")

    def test_timezone_injected_when_set(self):
        env_backup = os.environ.copy()
        try:
            os.environ["HERMES_TIMEZONE"] = "America/New_York"
            child_env = self._get_child_env()
            self.assertEqual(child_env.get("TZ"), "America/New_York")
        finally:
            os.environ.clear()
            os.environ.update(env_backup)

    def test_timezone_not_set_when_empty(self):
        env_backup = os.environ.copy()
        try:
            os.environ.pop("HERMES_TIMEZONE", None)
            child_env = self._get_child_env()
            if "TZ" in child_env:
                self.assertNotEqual(child_env["TZ"], "")
        finally:
            os.environ.clear()
            os.environ.update(env_backup)


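`TestEnvVarFiltering` specifies the child environment from the outside: secrets are dropped, a few safe variables pass through, and some variables are injected. The sketch below shows one filtering scheme that would meet those expectations. The function name `build_child_env`, both tuples of name fragments, and the rule that a secret marker wins over a safe prefix are assumptions; only `HERMES_RPC_SOCKET`, `PYTHONDONTWRITEBYTECODE`, and `TZ` come from the tests.

```python
# Assumption: allow-list of name prefixes considered safe to pass through.
SAFE_PREFIXES = ("PATH", "HOME", "USER", "LANG", "LC_", "TERM", "TMPDIR", "SHELL")
# Assumption: substrings that mark a variable as secret-bearing.
SECRET_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "PASSWD", "CREDENTIAL")


def build_child_env(parent_env, rpc_socket, timezone=None):
    """Return a filtered copy of `parent_env` for a sandboxed child process."""
    child = {}
    for name, value in parent_env.items():
        upper = name.upper()
        if any(marker in upper for marker in SECRET_MARKERS):
            continue  # never forward anything that looks like a secret
        if upper.startswith(SAFE_PREFIXES):
            child[name] = value
    # Variables the tests expect to be injected into the child.
    child["HERMES_RPC_SOCKET"] = rpc_socket
    child["PYTHONDONTWRITEBYTECODE"] = "1"
    if timezone:
        child["TZ"] = timezone
    return child
```

Combining an allow-list with a deny-list is deliberately conservative: a variable must look safe and must not look secret before it is forwarded.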
# ---------------------------------------------------------------------------
# execute_code edge cases
# ---------------------------------------------------------------------------

class TestExecuteCodeEdgeCases(unittest.TestCase):

    def test_windows_returns_error(self):
        """On Windows (or when SANDBOX_AVAILABLE is False), returns error JSON."""
        with patch("tools.code_execution_tool.SANDBOX_AVAILABLE", False):
            result = json.loads(execute_code("print('hi')", task_id="test"))
        self.assertIn("error", result)
        self.assertIn("Windows", result["error"])

    def test_whitespace_only_code(self):
        result = json.loads(execute_code(" \n\t ", task_id="test"))
        self.assertIn("error", result)
        self.assertIn("No code", result["error"])

    @unittest.skipIf(sys.platform == "win32", "UDS not available on Windows")
    def test_none_enabled_tools_uses_all(self):
        """When enabled_tools is None, all sandbox tools should be available."""
        code = (
            "from hermes_tools import terminal, web_search, read_file\n"
            "print('all imports ok')\n"
        )
        with patch("model_tools.handle_function_call",
                   return_value=json.dumps({"ok": True})):
            result = json.loads(execute_code(code, task_id="test-none",
                                             enabled_tools=None))
        self.assertEqual(result["status"], "success")
        self.assertIn("all imports ok", result["output"])

    @unittest.skipIf(sys.platform == "win32", "UDS not available on Windows")
    def test_empty_enabled_tools_uses_all(self):
        """When enabled_tools is [] (empty), all sandbox tools should be available."""
        code = (
            "from hermes_tools import terminal, web_search\n"
            "print('imports ok')\n"
        )
        with patch("model_tools.handle_function_call",
                   return_value=json.dumps({"ok": True})):
            result = json.loads(execute_code(code, task_id="test-empty",
                                             enabled_tools=[]))
        self.assertEqual(result["status"], "success")
        self.assertIn("imports ok", result["output"])

    @unittest.skipIf(sys.platform == "win32", "UDS not available on Windows")
|
||||
def test_nonoverlapping_tools_fallback(self):
|
||||
"""When enabled_tools has no overlap with SANDBOX_ALLOWED_TOOLS,
|
||||
should fall back to all allowed tools."""
|
||||
code = (
|
||||
"from hermes_tools import terminal\n"
|
||||
"print('fallback ok')\n"
|
||||
)
|
||||
with patch("model_tools.handle_function_call",
|
||||
return_value=json.dumps({"ok": True})):
|
||||
result = json.loads(execute_code(
|
||||
code, task_id="test-nonoverlap",
|
||||
enabled_tools=["vision_analyze", "browser_snapshot"],
|
||||
))
|
||||
self.assertEqual(result["status"], "success")
|
||||
self.assertIn("fallback ok", result["output"])
|
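The three `enabled_tools` tests above pin down one rule: `None`, an empty list, and a list with no overlap all fall back to the full allow-list. A rough sketch of that rule, assuming it is a plain set intersection. `resolve_sandbox_tools` and the contents of `ALLOWED_TOOLS` are illustrative names, not the project's actual code.

```python
from typing import Iterable, Optional, Set

# Illustrative allow-list; the real SANDBOX_ALLOWED_TOOLS may differ.
ALLOWED_TOOLS: Set[str] = {"terminal", "web_search", "read_file"}


def resolve_sandbox_tools(
    enabled_tools: Optional[Iterable[str]],
    allowed: Set[str] = ALLOWED_TOOLS,
) -> Set[str]:
    """Intersect the session's tools with the allow-list, falling back to all."""
    if not enabled_tools:  # None or an empty list
        return set(allowed)
    overlap = set(enabled_tools) & allowed
    # No overlap at all: fall back to the full allow-list.
    return overlap or set(allowed)
```

A partially overlapping list, such as `["terminal", "vision_analyze"]`, would yield only `{"terminal"}` under this sketch.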
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# _load_config
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestLoadConfig(unittest.TestCase):
|
||||
def test_returns_empty_dict_when_cli_config_unavailable(self):
|
||||
from tools.code_execution_tool import _load_config
|
||||
with patch.dict("sys.modules", {"cli": None}):
|
||||
result = _load_config()
|
||||
self.assertIsInstance(result, dict)
|
||||
|
||||
def test_returns_code_execution_section(self):
|
||||
from tools.code_execution_tool import _load_config
|
||||
mock_cli = MagicMock()
|
||||
mock_cli.CLI_CONFIG = {"code_execution": {"timeout": 120, "max_tool_calls": 10}}
|
||||
with patch.dict("sys.modules", {"cli": mock_cli}):
|
||||
result = _load_config()
|
||||
self.assertIsInstance(result, dict)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Interrupt event
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@unittest.skipIf(sys.platform == "win32", "UDS not available on Windows")
|
||||
class TestInterruptHandling(unittest.TestCase):
|
||||
def test_interrupt_event_stops_execution(self):
|
||||
"""When _interrupt_event is set, execute_code should stop the script."""
|
||||
code = "import time; time.sleep(60); print('should not reach')"
|
||||
|
||||
def set_interrupt_after_delay():
|
||||
import time as _t
|
||||
_t.sleep(1)
|
||||
from tools.terminal_tool import _interrupt_event
|
||||
_interrupt_event.set()
|
||||
|
||||
t = threading.Thread(target=set_interrupt_after_delay, daemon=True)
|
||||
t.start()
|
||||
|
||||
try:
|
||||
with patch("model_tools.handle_function_call",
|
||||
return_value=json.dumps({"ok": True})), \
|
||||
patch("tools.code_execution_tool._load_config",
|
||||
return_value={"timeout": 30, "max_tool_calls": 50}):
|
||||
result = json.loads(execute_code(
|
||||
code, task_id="test-interrupt",
|
||||
enabled_tools=list(SANDBOX_ALLOWED_TOOLS),
|
||||
))
|
||||
self.assertEqual(result["status"], "interrupted")
|
||||
self.assertIn("interrupted", result["output"])
|
||||
finally:
|
||||
from tools.terminal_tool import _interrupt_event
|
||||
_interrupt_event.clear()
|
||||
t.join(timeout=3)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
unittest.main()
|
||||
|
||||
@@ -295,6 +295,6 @@ def check_dangerous_command(command: str, env_type: str,
|
||||
elif choice == "always":
|
||||
approve_session(session_key, pattern_key)
|
||||
approve_permanent(pattern_key)
|
||||
save_permanent_allowlist(load_permanent_allowlist() | {pattern_key})
|
||||
save_permanent_allowlist(_permanent_approved)
|
||||
|
||||
return {"approved": True, "message": None}
|
||||
|
||||
@@ -1615,10 +1615,10 @@ def _cleanup_old_screenshots(screenshots_dir, max_age_hours=24):
|
||||
try:
|
||||
if f.stat().st_mtime < cutoff:
|
||||
f.unlink()
|
||||
except Exception:
|
||||
pass
|
||||
except Exception:
|
||||
pass # Non-critical — don't fail the screenshot operation
|
||||
except Exception as e:
|
||||
logger.debug("Failed to clean old screenshot %s: %s", f, e)
|
||||
except Exception as e:
|
||||
logger.debug("Screenshot cleanup error (non-critical): %s", e)
|
||||
|
||||
|
||||
def _cleanup_old_recordings(max_age_hours=72):
|
||||
@@ -1634,10 +1634,10 @@ def _cleanup_old_recordings(max_age_hours=72):
|
||||
try:
|
||||
if f.stat().st_mtime < cutoff:
|
||||
f.unlink()
|
||||
except Exception:
|
||||
pass
|
||||
except Exception:
|
||||
pass
|
||||
except Exception as e:
|
||||
logger.debug("Failed to clean old recording %s: %s", f, e)
|
||||
except Exception as e:
|
||||
logger.debug("Recording cleanup error (non-critical): %s", e)
|
||||
|
||||
|
||||
# ============================================================================
|
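The two cleanup hunks above contain both a silent `pass` and a `logger.debug` variant of the same handlers. A self-contained sketch of the logged form, under the assumption that cleanup is best-effort and should never raise. `cleanup_old_files` is a hypothetical helper, and it narrows the caught exception to `OSError`, whereas the hunks catch `Exception`.

```python
import logging
import time
from pathlib import Path

logger = logging.getLogger(__name__)


def cleanup_old_files(directory: Path, max_age_hours: float = 24) -> int:
    """Delete files older than max_age_hours; return how many were removed."""
    cutoff = time.time() - max_age_hours * 3600
    removed = 0
    try:
        for f in directory.iterdir():
            try:
                if f.is_file() and f.stat().st_mtime < cutoff:
                    f.unlink()
                    removed += 1
            except OSError as e:
                # One bad file should not abort the sweep.
                logger.debug("Failed to clean old file %s: %s", f, e)
    except OSError as e:
        # Directory missing or unreadable: cleanup is best-effort.
        logger.debug("Cleanup error (non-critical): %s", e)
    return removed
```

Logging at debug level keeps the behaviour non-fatal while leaving a trace when a file cannot be removed.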
||||
@@ -1749,7 +1749,7 @@ def cleanup_browser(task_id: Optional[str] = None) -> None:
|
||||
os.kill(daemon_pid, signal.SIGTERM)
|
||||
logger.debug("Killed daemon pid %s for %s", daemon_pid, session_name)
|
||||
except (ProcessLookupError, ValueError, PermissionError, OSError):
|
||||
pass
|
||||
logger.debug("Could not kill daemon pid for %s (already dead or inaccessible)", session_name)
|
||||
shutil.rmtree(socket_dir, ignore_errors=True)
|
||||
|
||||
logger.debug("Removed task %s from active sessions", task_id)
|
||||
|
||||
@@ -0,0 +1,441 @@
|
||||
"""
|
||||
Checkpoint Manager — Transparent filesystem snapshots via shadow git repos.
|
||||
|
||||
Creates automatic snapshots of working directories before file-mutating
|
||||
operations (write_file, patch), triggered once per conversation turn.
|
||||
Provides rollback to any previous checkpoint.
|
||||
|
||||
This is NOT a tool — the LLM never sees it. It's transparent infrastructure
|
||||
controlled by the ``checkpoints`` config flag or ``--checkpoints`` CLI flag.
|
||||
|
||||
Architecture:
|
||||
~/.hermes/checkpoints/{sha256(abs_dir)[:16]}/ — shadow git repo
|
||||
HEAD, refs/, objects/ — standard git internals
|
||||
HERMES_WORKDIR — original dir path
|
||||
info/exclude — default excludes
|
||||
|
||||
The shadow repo uses GIT_DIR + GIT_WORK_TREE so no git state leaks
|
||||
into the user's project directory.
|
||||
"""
|
||||
|
||||
import hashlib
|
||||
import logging
|
||||
import os
|
||||
import shutil
|
||||
import subprocess
|
||||
import time
|
||||
from pathlib import Path
|
||||
from typing import Dict, List, Optional, Set
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Constants
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
CHECKPOINT_BASE = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes")) / "checkpoints"
|
||||
|
||||
DEFAULT_EXCLUDES = [
|
||||
"node_modules/",
|
||||
"dist/",
|
||||
"build/",
|
||||
".env",
|
||||
".env.*",
|
||||
".env.local",
|
||||
".env.*.local",
|
||||
"__pycache__/",
|
||||
"*.pyc",
|
||||
"*.pyo",
|
||||
".DS_Store",
|
||||
"*.log",
|
||||
".cache/",
|
||||
".next/",
|
||||
".nuxt/",
|
||||
"coverage/",
|
||||
".pytest_cache/",
|
||||
".venv/",
|
||||
"venv/",
|
||||
".git/",
|
||||
]
|
||||
|
||||
# Git subprocess timeout (seconds).
|
||||
_GIT_TIMEOUT: int = max(10, min(60, int(os.getenv("HERMES_CHECKPOINT_TIMEOUT", "30"))))
|
||||
|
||||
# Max files to snapshot — skip huge directories to avoid slowdowns.
|
||||
_MAX_FILES = 50_000
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Shadow repo helpers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _shadow_repo_path(working_dir: str) -> Path:
|
||||
"""Deterministic shadow repo path: sha256(abs_path)[:16]."""
|
||||
abs_path = str(Path(working_dir).resolve())
|
||||
dir_hash = hashlib.sha256(abs_path.encode()).hexdigest()[:16]
|
||||
return CHECKPOINT_BASE / dir_hash
|
||||
|
||||
|
||||
def _git_env(shadow_repo: Path, working_dir: str) -> dict:
|
||||
"""Build env dict that redirects git to the shadow repo."""
|
||||
env = os.environ.copy()
|
||||
env["GIT_DIR"] = str(shadow_repo)
|
||||
env["GIT_WORK_TREE"] = str(Path(working_dir).resolve())
|
||||
env.pop("GIT_INDEX_FILE", None)
|
||||
env.pop("GIT_NAMESPACE", None)
|
||||
env.pop("GIT_ALTERNATE_OBJECT_DIRECTORIES", None)
|
||||
return env
|
||||
|
||||
|
||||
def _run_git(
|
||||
args: List[str],
|
||||
shadow_repo: Path,
|
||||
working_dir: str,
|
||||
timeout: int = _GIT_TIMEOUT,
|
||||
) -> tuple:
|
||||
"""Run a git command against the shadow repo. Returns (ok, stdout, stderr)."""
|
||||
env = _git_env(shadow_repo, working_dir)
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["git"] + args,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=timeout,
|
||||
env=env,
|
||||
cwd=str(Path(working_dir).resolve()),
|
||||
)
|
||||
return result.returncode == 0, result.stdout.strip(), result.stderr.strip()
|
||||
except subprocess.TimeoutExpired:
|
||||
return False, "", f"git timed out after {timeout}s: git {' '.join(args)}"
|
||||
except FileNotFoundError:
|
||||
return False, "", "git not found"
|
||||
except Exception as exc:
|
||||
return False, "", str(exc)
|
||||
|
||||
|
||||
def _init_shadow_repo(shadow_repo: Path, working_dir: str) -> Optional[str]:
|
||||
"""Initialise shadow repo if needed. Returns error string or None."""
|
||||
if (shadow_repo / "HEAD").exists():
|
||||
return None
|
||||
|
||||
shadow_repo.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
ok, _, err = _run_git(["init"], shadow_repo, working_dir)
|
||||
if not ok:
|
||||
return f"Shadow repo init failed: {err}"
|
||||
|
||||
_run_git(["config", "user.email", "hermes@local"], shadow_repo, working_dir)
|
||||
_run_git(["config", "user.name", "Hermes Checkpoint"], shadow_repo, working_dir)
|
||||
|
||||
info_dir = shadow_repo / "info"
|
||||
info_dir.mkdir(exist_ok=True)
|
||||
(info_dir / "exclude").write_text(
|
||||
"\n".join(DEFAULT_EXCLUDES) + "\n", encoding="utf-8"
|
||||
)
|
||||
|
||||
(shadow_repo / "HERMES_WORKDIR").write_text(
|
||||
str(Path(working_dir).resolve()) + "\n", encoding="utf-8"
|
||||
)
|
||||
|
||||
logger.debug("Initialised checkpoint repo at %s for %s", shadow_repo, working_dir)
|
||||
return None
|
||||
|
||||
|
||||
def _dir_file_count(path: str) -> int:
|
||||
"""Quick file count estimate (stops early if over _MAX_FILES)."""
|
||||
count = 0
|
||||
try:
|
||||
for _ in Path(path).rglob("*"):
|
||||
count += 1
|
||||
if count > _MAX_FILES:
|
||||
return count
|
||||
except (PermissionError, OSError):
|
||||
pass
|
||||
return count
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# CheckpointManager
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class CheckpointManager:
|
||||
"""Manages automatic filesystem checkpoints.
|
||||
|
||||
Designed to be owned by AIAgent. Call ``new_turn()`` at the start of
|
||||
each conversation turn and ``ensure_checkpoint(dir, reason)`` before
|
||||
any file-mutating tool call. The manager deduplicates so at most one
|
||||
snapshot is taken per directory per turn.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
enabled : bool
|
||||
Master switch (from config / CLI flag).
|
||||
max_snapshots : int
|
||||
Show at most this many checkpoints per directory when listing; older commits stay in the shadow repo.
|
||||
"""
|
||||
|
||||
def __init__(self, enabled: bool = False, max_snapshots: int = 50):
|
||||
self.enabled = enabled
|
||||
self.max_snapshots = max_snapshots
|
||||
self._checkpointed_dirs: Set[str] = set()
|
||||
self._git_available: Optional[bool] = None # lazy probe
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Turn lifecycle
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def new_turn(self) -> None:
|
||||
"""Reset per-turn dedup. Call at the start of each agent iteration."""
|
||||
self._checkpointed_dirs.clear()
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Public API
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def ensure_checkpoint(self, working_dir: str, reason: str = "auto") -> bool:
|
||||
"""Take a checkpoint if enabled and not already done this turn.
|
||||
|
||||
Returns True if a checkpoint was taken, False otherwise.
|
||||
Never raises; snapshot errors are caught and logged at debug level.
|
||||
"""
|
||||
if not self.enabled:
|
||||
return False
|
||||
|
||||
# Lazy git probe
|
||||
if self._git_available is None:
|
||||
self._git_available = shutil.which("git") is not None
|
||||
if not self._git_available:
|
||||
logger.debug("Checkpoints disabled: git not found")
|
||||
if not self._git_available:
|
||||
return False
|
||||
|
||||
abs_dir = str(Path(working_dir).resolve())
|
||||
|
||||
# Skip root, home, and other overly broad directories
|
||||
if abs_dir in ("/", str(Path.home())):
|
||||
logger.debug("Checkpoint skipped: directory too broad (%s)", abs_dir)
|
||||
return False
|
||||
|
||||
# Already checkpointed this turn?
|
||||
if abs_dir in self._checkpointed_dirs:
|
||||
return False
|
||||
|
||||
self._checkpointed_dirs.add(abs_dir)
|
||||
|
||||
try:
|
||||
return self._take(abs_dir, reason)
|
||||
except Exception as e:
|
||||
logger.debug("Checkpoint failed (non-fatal): %s", e)
|
||||
return False
|
||||
|
||||
def list_checkpoints(self, working_dir: str) -> List[Dict]:
|
||||
"""List available checkpoints for a directory.
|
||||
|
||||
Returns a list of dicts with keys: hash, short_hash, timestamp, reason.
|
||||
Most recent first.
|
||||
"""
|
||||
abs_dir = str(Path(working_dir).resolve())
|
||||
shadow = _shadow_repo_path(abs_dir)
|
||||
|
||||
if not (shadow / "HEAD").exists():
|
||||
return []
|
||||
|
||||
# List commits reachable from HEAD, newest first, capped at max_snapshots.
|
||||
ok, stdout, _ = _run_git(
|
||||
["log", "--format=%H|%h|%aI|%s", "-n", str(self.max_snapshots)],
|
||||
shadow, abs_dir,
|
||||
)
|
||||
|
||||
if not ok or not stdout:
|
||||
return []
|
||||
|
||||
results = []
|
||||
for line in stdout.splitlines():
|
||||
parts = line.split("|", 3)
|
||||
if len(parts) == 4:
|
||||
results.append({
|
||||
"hash": parts[0],
|
||||
"short_hash": parts[1],
|
||||
"timestamp": parts[2],
|
||||
"reason": parts[3],
|
||||
})
|
||||
return results
|
||||
|
||||
def restore(self, working_dir: str, commit_hash: str) -> Dict:
|
||||
"""Restore files to a checkpoint state.
|
||||
|
||||
Uses ``git checkout <hash> -- .`` which restores tracked files
|
||||
without moving HEAD — safe and reversible.
|
||||
|
||||
Returns dict with success/error info.
|
||||
"""
|
||||
abs_dir = str(Path(working_dir).resolve())
|
||||
shadow = _shadow_repo_path(abs_dir)
|
||||
|
||||
if not (shadow / "HEAD").exists():
|
||||
return {"success": False, "error": "No checkpoints exist for this directory"}
|
||||
|
||||
# Verify the commit exists
|
||||
ok, _, err = _run_git(
|
||||
["cat-file", "-t", commit_hash], shadow, abs_dir,
|
||||
)
|
||||
if not ok:
|
||||
return {"success": False, "error": f"Checkpoint '{commit_hash}' not found"}
|
||||
|
||||
# Take a checkpoint of current state before restoring (so you can undo the undo)
|
||||
self._take(abs_dir, f"pre-rollback snapshot (restoring to {commit_hash[:8]})")
|
||||
|
||||
# Restore
|
||||
ok, stdout, err = _run_git(
|
||||
["checkout", commit_hash, "--", "."],
|
||||
shadow, abs_dir, timeout=_GIT_TIMEOUT * 2,
|
||||
)
|
||||
|
||||
if not ok:
|
||||
return {"success": False, "error": f"Restore failed: {err}"}
|
||||
|
||||
# Get info about what was restored
|
||||
ok2, reason_out, _ = _run_git(
|
||||
["log", "--format=%s", "-1", commit_hash], shadow, abs_dir,
|
||||
)
|
||||
reason = reason_out if ok2 else "unknown"
|
||||
|
||||
return {
|
||||
"success": True,
|
||||
"restored_to": commit_hash[:8],
|
||||
"reason": reason,
|
||||
"directory": abs_dir,
|
||||
}
|
||||
|
||||
def get_working_dir_for_path(self, file_path: str) -> str:
|
||||
"""Resolve a file path to its working directory for checkpointing.
|
||||
|
||||
Walks up from the file's parent to find a reasonable project root
|
||||
(directory containing .git, pyproject.toml, package.json, etc.).
|
||||
Falls back to the file's parent directory.
|
||||
"""
|
||||
path = Path(file_path).resolve()
|
||||
if path.is_dir():
|
||||
candidate = path
|
||||
else:
|
||||
candidate = path.parent
|
||||
|
||||
# Walk up looking for project root markers
|
||||
markers = {".git", "pyproject.toml", "package.json", "Cargo.toml",
|
||||
"go.mod", "Makefile", "pom.xml", ".hg", "Gemfile"}
|
||||
check = candidate
|
||||
while check != check.parent:
|
||||
if any((check / m).exists() for m in markers):
|
||||
return str(check)
|
||||
check = check.parent
|
||||
|
||||
# No project root found — use the file's parent
|
||||
return str(candidate)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Internal
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _take(self, working_dir: str, reason: str) -> bool:
|
||||
"""Take a snapshot. Returns True on success."""
|
||||
shadow = _shadow_repo_path(working_dir)
|
||||
|
||||
# Init if needed
|
||||
err = _init_shadow_repo(shadow, working_dir)
|
||||
if err:
|
||||
logger.debug("Checkpoint init failed: %s", err)
|
||||
return False
|
||||
|
||||
# Quick size guard — don't try to snapshot enormous directories
|
||||
if _dir_file_count(working_dir) > _MAX_FILES:
|
||||
logger.debug("Checkpoint skipped: >%d files in %s", _MAX_FILES, working_dir)
|
||||
return False
|
||||
|
||||
# Stage everything
|
||||
ok, _, err = _run_git(
|
||||
["add", "-A"], shadow, working_dir, timeout=_GIT_TIMEOUT * 2,
|
||||
)
|
||||
if not ok:
|
||||
logger.debug("Checkpoint git-add failed: %s", err)
|
||||
return False
|
||||
|
||||
# Check if there's anything to commit
|
||||
ok_diff, diff_out, _ = _run_git(
|
||||
["diff", "--cached", "--quiet"], shadow, working_dir,
|
||||
)
|
||||
if ok_diff:
|
||||
# No changes to commit
|
||||
logger.debug("Checkpoint skipped: no changes in %s", working_dir)
|
||||
return False
|
||||
|
||||
# Commit
|
||||
ok, _, err = _run_git(
|
||||
["commit", "-m", reason, "--allow-empty-message"],
|
||||
shadow, working_dir, timeout=_GIT_TIMEOUT * 2,
|
||||
)
|
||||
if not ok:
|
||||
logger.debug("Checkpoint commit failed: %s", err)
|
||||
return False
|
||||
|
||||
logger.debug("Checkpoint taken in %s: %s", working_dir, reason)
|
||||
|
||||
# Prune old snapshots
|
||||
self._prune(shadow, working_dir)
|
||||
|
||||
return True
|
||||
|
||||
def _prune(self, shadow_repo: Path, working_dir: str) -> None:
|
||||
"""Log when history exceeds max_snapshots; commits are not deleted."""
|
||||
ok, stdout, _ = _run_git(
|
||||
["rev-list", "--count", "HEAD"], shadow_repo, working_dir,
|
||||
)
|
||||
if not ok:
|
||||
return
|
||||
|
||||
try:
|
||||
count = int(stdout)
|
||||
except ValueError:
|
||||
return
|
||||
|
||||
if count <= self.max_snapshots:
|
||||
return
|
||||
|
||||
# For simplicity, we don't actually prune — git's pack mechanism
|
||||
# handles this efficiently, and the objects are small. The log
|
||||
# listing is already limited by max_snapshots.
|
||||
# Full pruning would require rebase --onto or filter-branch which
|
||||
# is fragile for a background feature. We just limit the log view.
|
||||
logger.debug("Checkpoint repo has %d commits (limit %d)", count, self.max_snapshots)
|
||||
|
||||
|
||||
def format_checkpoint_list(checkpoints: List[Dict], directory: str) -> str:
|
||||
"""Format checkpoint list for display to user."""
|
||||
if not checkpoints:
|
||||
return f"No checkpoints found for {directory}"
|
||||
|
||||
lines = [f"📸 Checkpoints for {directory}:\n"]
|
||||
for i, cp in enumerate(checkpoints, 1):
|
||||
# Parse ISO timestamp to something readable
|
||||
ts = cp["timestamp"]
|
||||
if "T" in ts:
|
||||
ts = ts.split("T")[1].split("+")[0].split("-")[0][:5] # HH:MM
|
||||
date = cp["timestamp"].split("T")[0]
|
||||
ts = f"{date} {ts}"
|
||||
lines.append(f" {i}. {cp['short_hash']} {ts} {cp['reason']}")
|
||||
|
||||
lines.append("\nUse /rollback <number> to restore, e.g. /rollback 1")
|
||||
return "\n".join(lines)
|
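The module above names each shadow repo after a truncated SHA-256 of the resolved directory path. A small standalone sketch of that mapping, to show that different spellings of the same directory land on the same repo. `BASE` here is an assumed default for illustration; the module derives its own base from `HERMES_HOME`.

```python
import hashlib
from pathlib import Path

# Assumed base directory for illustration only.
BASE = Path.home() / ".hermes" / "checkpoints"


def shadow_repo_path(working_dir: str, base: Path = BASE) -> Path:
    """Map a working directory to a stable 16-hex-character repo name."""
    abs_path = str(Path(working_dir).resolve())
    digest = hashlib.sha256(abs_path.encode()).hexdigest()[:16]
    return base / digest
```

Because the path is resolved first, `shadow_repo_path("project")` and `shadow_repo_path("./project")` return the same location, while a moved or renamed directory gets a fresh, empty history.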
||||
@@ -311,6 +311,7 @@ def _rpc_server_loop(
|
||||
sys.stderr.close()
|
||||
sys.stdout, sys.stderr = _real_stdout, _real_stderr
|
||||
except Exception as exc:
|
||||
logger.error("Tool call failed in sandbox: %s", exc, exc_info=True)
|
||||
result = json.dumps({"error": str(exc)})
|
||||
|
||||
tool_call_counter[0] += 1
|
||||
@@ -327,15 +328,15 @@ def _rpc_server_loop(
|
||||
conn.sendall((result + "\n").encode())
|
||||
|
||||
except socket.timeout:
|
||||
pass
|
||||
except OSError:
|
||||
pass
|
||||
logger.debug("RPC listener socket timeout")
|
||||
except OSError as e:
|
||||
logger.debug("RPC listener socket error: %s", e, exc_info=True)
|
||||
finally:
|
||||
if conn:
|
||||
try:
|
||||
conn.close()
|
||||
except OSError:
|
||||
pass
|
||||
except OSError as e:
|
||||
logger.debug("RPC conn close error: %s", e)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -397,9 +398,9 @@ def execute_code(
|
||||
|
||||
try:
|
||||
# Write the auto-generated hermes_tools module
|
||||
tools_src = generate_hermes_tools_module(
|
||||
list(sandbox_tools) if enabled_tools else list(SANDBOX_ALLOWED_TOOLS)
|
||||
)
|
||||
# sandbox_tools is already the correct set (intersection with session
|
||||
# tools, or SANDBOX_ALLOWED_TOOLS as fallback — see lines above).
|
||||
tools_src = generate_hermes_tools_module(list(sandbox_tools))
|
||||
with open(os.path.join(tmpdir, "hermes_tools.py"), "w") as f:
|
||||
f.write(tools_src)
|
||||
|
||||
@@ -472,8 +473,8 @@ def execute_code(
|
||||
keep = max_bytes - total
|
||||
chunks.append(data[:keep])
|
||||
total += len(data)
|
||||
except (ValueError, OSError):
|
||||
pass
|
||||
except (ValueError, OSError) as e:
|
||||
logger.debug("Error reading process output: %s", e, exc_info=True)
|
||||
|
||||
stdout_reader = threading.Thread(
|
||||
target=_drain, args=(proc.stdout, stdout_chunks, MAX_STDOUT_BYTES), daemon=True
|
||||
@@ -511,7 +512,7 @@ def execute_code(
|
||||
duration = round(time.monotonic() - exec_start, 2)
|
||||
|
||||
# Wait for RPC thread to finish
|
||||
server_sock.close()
|
||||
server_sock.close() # break accept() so thread exits promptly
|
||||
rpc_thread.join(timeout=3)
|
||||
|
||||
# Build response
|
||||
@@ -547,15 +548,19 @@ def execute_code(
|
||||
|
||||
finally:
|
||||
# Cleanup temp dir and socket
|
||||
try:
|
||||
server_sock.close()
|
||||
except Exception as e:
|
||||
logger.debug("Server socket close error: %s", e)
|
||||
try:
|
||||
import shutil
|
||||
shutil.rmtree(tmpdir, ignore_errors=True)
|
||||
except Exception as e:
|
||||
logger.debug("Could not clean temp dir: %s", e)
|
||||
logger.debug("Could not clean temp dir: %s", e, exc_info=True)
|
||||
try:
|
||||
os.unlink(sock_path)
|
||||
except OSError:
|
||||
pass
|
||||
except OSError as e:
|
||||
logger.debug("Could not remove socket file: %s", e, exc_info=True)
|
||||
|
||||
|
||||
def _kill_process_group(proc, escalate: bool = False):
|
||||
@@ -565,11 +570,12 @@ def _kill_process_group(proc, escalate: bool = False):
|
||||
proc.terminate()
|
||||
else:
|
||||
os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
|
||||
except (ProcessLookupError, PermissionError):
|
||||
except (ProcessLookupError, PermissionError) as e:
|
||||
logger.debug("Could not kill process group: %s", e, exc_info=True)
|
||||
try:
|
||||
proc.kill()
|
||||
except Exception as e:
|
||||
logger.debug("Could not kill process: %s", e)
|
||||
except Exception as e2:
|
||||
logger.debug("Could not kill process: %s", e2, exc_info=True)
|
||||
|
||||
if escalate:
|
||||
# Give the process 5s to exit after SIGTERM, then SIGKILL
|
||||
@@ -581,11 +587,12 @@ def _kill_process_group(proc, escalate: bool = False):
|
||||
proc.kill()
|
||||
else:
|
||||
os.killpg(os.getpgid(proc.pid), signal.SIGKILL)
|
||||
except (ProcessLookupError, PermissionError):
|
||||
except (ProcessLookupError, PermissionError) as e:
|
||||
logger.debug("Could not kill process group with SIGKILL: %s", e, exc_info=True)
|
||||
try:
|
||||
proc.kill()
|
||||
except Exception as e:
|
||||
logger.debug("Could not kill process: %s", e)
|
||||
except Exception as e2:
|
||||
logger.debug("Could not kill process: %s", e2, exc_info=True)
|
||||
|
||||
|
||||
def _load_config() -> dict:
|
||||
@@ -647,7 +654,10 @@ def build_execute_code_schema(enabled_sandbox_tools: set = None) -> dict:
|
||||
import_examples = [n for n in ("web_search", "terminal") if n in enabled_sandbox_tools]
|
||||
if not import_examples:
|
||||
import_examples = sorted(enabled_sandbox_tools)[:2]
|
||||
import_str = ", ".join(import_examples) + ", ..."
|
||||
if import_examples:
|
||||
import_str = ", ".join(import_examples) + ", ..."
|
||||
else:
|
||||
import_str = "..."
|
||||
|
||||
description = (
|
||||
"Run a Python script that can call Hermes tools programmatically. "
|
||||
|
||||
@@ -20,6 +20,7 @@ import contextlib
|
||||
import io
|
||||
import json
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
import os
|
||||
import sys
|
||||
import time
|
||||
@@ -107,8 +108,8 @@ def _build_child_progress_callback(task_index: int, parent_agent, task_count: in
|
||||
short = (preview[:55] + "...") if preview and len(preview) > 55 else (preview or "")
|
||||
try:
|
||||
spinner.print_above(f" {prefix}├─ 💭 \"{short}\"")
|
||||
except Exception:
|
||||
pass
|
||||
except Exception as e:
|
||||
logger.debug("Spinner print_above failed: %s", e)
|
||||
# Don't relay thinking to gateway (too noisy for chat)
|
||||
return
|
||||
|
||||
@@ -129,8 +130,8 @@ def _build_child_progress_callback(task_index: int, parent_agent, task_count: in
|
||||
line += f" \"{short}\""
|
||||
try:
|
||||
spinner.print_above(line)
|
||||
except Exception:
|
||||
pass
|
||||
except Exception as e:
|
||||
logger.debug("Spinner print_above failed: %s", e)
|
||||
|
||||
if parent_cb:
|
||||
_batch.append(tool_name)
|
||||
@@ -138,8 +139,8 @@ def _build_child_progress_callback(task_index: int, parent_agent, task_count: in
|
||||
summary = ", ".join(_batch)
|
||||
try:
|
||||
parent_cb("subagent_progress", f"🔀 {prefix}{summary}")
|
||||
except Exception:
|
||||
pass
|
||||
except Exception as e:
|
||||
logger.debug("Parent callback failed: %s", e)
|
||||
_batch.clear()
|
||||
|
||||
def _flush():
|
||||
@@ -148,8 +149,8 @@ def _build_child_progress_callback(task_index: int, parent_agent, task_count: in
|
||||
summary = ", ".join(_batch)
|
||||
try:
|
||||
parent_cb("subagent_progress", f"🔀 {prefix}{summary}")
|
||||
except Exception:
|
||||
pass
|
||||
except Exception as e:
|
||||
logger.debug("Parent callback flush failed: %s", e)
|
||||
_batch.clear()
|
||||
|
||||
_callback._flush = _flush
|
||||
@@ -241,8 +242,8 @@ def _run_single_child(
|
||||
if child_progress_cb and hasattr(child_progress_cb, '_flush'):
|
||||
try:
|
||||
child_progress_cb._flush()
|
||||
except Exception:
|
||||
pass
|
||||
except Exception as e:
|
||||
logger.debug("Progress callback flush failed: %s", e)
|
||||
|
||||
duration = round(time.monotonic() - child_start, 2)
|
||||
|
||||
@@ -287,8 +288,8 @@ def _run_single_child(
|
||||
if hasattr(parent_agent, '_active_children'):
|
||||
try:
|
||||
parent_agent._active_children.remove(child)
|
||||
except (ValueError, UnboundLocalError):
|
||||
pass
|
||||
except (ValueError, UnboundLocalError) as e:
|
||||
logger.debug("Could not remove child from active_children: %s", e)
|
||||
|
||||
|
||||
def delegate_task(
|
||||
@@ -425,8 +426,8 @@ def delegate_task(
|
||||
if spinner_ref and remaining > 0:
|
||||
try:
|
||||
spinner_ref.update_text(f"🔀 {remaining} task{'s' if remaining != 1 else ''} remaining")
|
||||
except Exception:
|
||||
pass
|
||||
except Exception as e:
|
||||
logger.debug("Spinner update_text failed: %s", e)
|
||||
|
||||
# Restore stdout/stderr in case redirect_stdout race left them as devnull
|
||||
sys.stdout = _saved_stdout
|
||||
|
||||
@@ -59,8 +59,16 @@ class BaseEnvironment(ABC):
|
||||
# Shared helpers (eliminate duplication across backends)
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _prepare_command(self, command: str) -> str:
|
||||
"""Transform sudo commands if SUDO_PASSWORD is available."""
|
||||
def _prepare_command(self, command: str) -> tuple[str, str | None]:
|
||||
"""Transform sudo commands if SUDO_PASSWORD is available.
|
||||
|
||||
Returns:
|
||||
(transformed_command, sudo_stdin) — see _transform_sudo_command
|
||||
for the full contract. Callers that drive a subprocess directly
|
||||
should prepend sudo_stdin (when not None) to any stdin_data they
|
||||
pass to Popen. Callers that embed stdin via heredoc (modal,
|
||||
daytona) handle sudo_stdin in their own execute() method.
|
||||
"""
|
||||
from tools.terminal_tool import _transform_sudo_command
|
||||
return _transform_sudo_command(command)
|
||||
|
||||
|
||||
@@ -6,6 +6,7 @@ and resumed on next creation, preserving the filesystem across sessions.
 """

 import logging
+import time
 import math
 import shlex
 import threading
@@ -142,10 +143,9 @@ class DaytonaEnvironment(BaseEnvironment):
         t = threading.Thread(target=_run, daemon=True)
         t.start()
         # Wait for timeout + generous buffer for network/SDK overhead
-        deadline = timeout + 10
+        deadline = time.monotonic() + timeout + 10
         while t.is_alive():
             t.join(timeout=0.2)
-            deadline -= 0.2
             if is_interrupted():
                 with self._lock:
                     try:
@@ -156,7 +156,7 @@ class DaytonaEnvironment(BaseEnvironment):
                         "output": "[Command interrupted - Daytona sandbox stopped]",
                         "returncode": 130,
                     }
-            if deadline <= 0:
+            if time.monotonic() > deadline:
                 # Shell timeout didn't fire and SDK is hung; force stop
                 with self._lock:
                     try:
|
||||
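The change above swaps a countdown (`deadline -= 0.2` per loop) for an absolute deadline on the monotonic clock. A countdown only accounts for the nominal join interval, so any extra time spent inside the loop body is never subtracted; comparing against `time.monotonic()` avoids that drift. A minimal standalone sketch of the same polling pattern follows; `wait_with_deadline` and its parameters are illustrative names, not part of the diff.

```python
import threading
import time
from typing import Callable


def wait_with_deadline(target: Callable[[], None], timeout: float,
                       buffer: float = 10.0, poll: float = 0.2) -> bool:
    """Run `target` in a daemon thread; return True if it finished in time.

    The deadline is an absolute point on the monotonic clock, so time spent
    in the loop body (interrupt checks, locking) still counts against it.
    """
    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    deadline = time.monotonic() + timeout + buffer
    while worker.is_alive():
        worker.join(timeout=poll)
        if time.monotonic() > deadline:
            return False  # caller decides how to force-stop the work
    return True
```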
@@ -181,7 +181,20 @@ class DaytonaEnvironment(BaseEnvironment):
|
||||
marker = f"HERMES_EOF_{uuid.uuid4().hex[:8]}"
|
||||
command = f"{command} << '{marker}'\n{stdin_data}\n{marker}"
|
||||
|
||||
exec_command = self._prepare_command(command)
|
||||
exec_command, sudo_stdin = self._prepare_command(command)
|
||||
|
||||
# Daytona sandboxes execute commands via the Daytona SDK and cannot
|
||||
# pipe subprocess stdin directly the way a local Popen can. When a
|
||||
# sudo password is present, use a shell-level pipe from printf so that
|
||||
# the password feeds sudo -S without appearing as an echo argument
|
||||
# embedded in the shell string. The password is still visible in the
|
||||
# remote sandbox's command line, but it is not exposed on the user's
|
||||
# local machine — which is the primary threat being mitigated.
|
||||
if sudo_stdin is not None:
|
||||
import shlex
|
||||
exec_command = (
|
||||
f"printf '%s\\n' {shlex.quote(sudo_stdin.rstrip())} | {exec_command}"
|
||||
)
|
||||
effective_cwd = cwd or self.cwd or None
|
||||
effective_timeout = timeout or self.timeout
|
||||
|
||||
|
||||
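The `printf ... | command` construction used above for SDK-driven sandboxes can be sketched in isolation. This is an illustration under assumptions: `pipe_password_to_command` is a hypothetical name, and it strips only the trailing newline, whereas the diff calls `.rstrip()`, which would also drop trailing spaces from a password.

```python
import shlex
from typing import Optional


def pipe_password_to_command(exec_command: str, sudo_stdin: Optional[str]) -> str:
    """Feed the sudo password through a shell pipe when stdin cannot be piped.

    `sudo_stdin` is expected to be the password plus a trailing newline.
    """
    if sudo_stdin is None:
        return exec_command
    password = sudo_stdin.rstrip("\n")
    # shlex.quote keeps quotes and spaces in the password from being
    # interpreted by the shell; printf re-adds the newline sudo -S expects.
    return f"printf '%s\\n' {shlex.quote(password)} | {exec_command}"
```

As the comment in the diff notes, the password still appears in the command string sent to the remote sandbox; quoting prevents injection, not disclosure.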
@@ -193,10 +193,18 @@ class DockerEnvironment(BaseEnvironment):
|
||||
def execute(self, command: str, cwd: str = "", *,
|
||||
timeout: int | None = None,
|
||||
stdin_data: str | None = None) -> dict:
|
||||
exec_command = self._prepare_command(command)
|
||||
exec_command, sudo_stdin = self._prepare_command(command)
|
||||
work_dir = cwd or self.cwd
|
||||
effective_timeout = timeout or self.timeout
|
||||
|
||||
# Merge sudo password (if any) with caller-supplied stdin_data.
|
||||
if sudo_stdin is not None and stdin_data is not None:
|
||||
effective_stdin = sudo_stdin + stdin_data
|
||||
elif sudo_stdin is not None:
|
||||
effective_stdin = sudo_stdin
|
||||
else:
|
||||
effective_stdin = stdin_data
|
||||
|
||||
# docker exec -w doesn't expand ~, so prepend a cd into the command
|
||||
if work_dir == "~" or work_dir.startswith("~/"):
|
||||
exec_command = f"cd {work_dir} && {exec_command}"
|
||||
@@ -204,7 +212,7 @@ class DockerEnvironment(BaseEnvironment):
|
||||
|
||||
assert self._inner.container_id, "Container not started"
|
||||
cmd = [self._inner.config.executable, "exec"]
|
||||
if stdin_data is not None:
|
||||
if effective_stdin is not None:
|
||||
cmd.append("-i")
|
||||
cmd.extend(["-w", work_dir])
|
||||
for key in self._inner.config.forward_env:
|
||||
@@ -219,12 +227,12 @@ class DockerEnvironment(BaseEnvironment):
|
||||
proc = subprocess.Popen(
|
||||
cmd,
|
||||
stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
|
||||
stdin=subprocess.PIPE if stdin_data else subprocess.DEVNULL,
|
||||
stdin=subprocess.PIPE if effective_stdin else subprocess.DEVNULL,
|
||||
text=True,
|
||||
)
|
||||
if stdin_data:
|
||||
if effective_stdin:
|
||||
try:
|
||||
proc.stdin.write(stdin_data)
|
||||
proc.stdin.write(effective_stdin)
|
||||
proc.stdin.close()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
@@ -161,7 +161,18 @@ class LocalEnvironment(BaseEnvironment):
|
||||
|
||||
work_dir = cwd or self.cwd or os.getcwd()
|
||||
effective_timeout = timeout or self.timeout
|
||||
exec_command = self._prepare_command(command)
|
||||
exec_command, sudo_stdin = self._prepare_command(command)
|
||||
|
||||
# Merge the sudo password (if any) with caller-supplied stdin_data.
|
||||
# sudo -S reads exactly one line (the password) then passes the rest
|
||||
# of stdin to the child, so prepending is safe even when stdin_data
|
||||
# is also present.
|
||||
if sudo_stdin is not None and stdin_data is not None:
|
||||
effective_stdin = sudo_stdin + stdin_data
|
||||
elif sudo_stdin is not None:
|
||||
effective_stdin = sudo_stdin
|
||||
else:
|
||||
effective_stdin = stdin_data
|
||||
|
||||
try:
|
||||
# The fence wrapper uses bash syntax (semicolons, $?, printf).
|
||||
@@ -195,14 +206,14 @@ class LocalEnvironment(BaseEnvironment):
|
||||
errors="replace",
|
||||
stdout=subprocess.PIPE,
|
||||
stderr=subprocess.STDOUT,
|
||||
stdin=subprocess.PIPE if stdin_data is not None else subprocess.DEVNULL,
|
||||
stdin=subprocess.PIPE if effective_stdin is not None else subprocess.DEVNULL,
|
||||
preexec_fn=None if _IS_WINDOWS else os.setsid,
|
||||
)
|
||||
|
||||
if stdin_data is not None:
|
||||
if effective_stdin is not None:
|
||||
def _write_stdin():
|
||||
try:
|
||||
proc.stdin.write(stdin_data)
|
||||
proc.stdin.write(effective_stdin)
|
||||
proc.stdin.close()
|
||||
except (BrokenPipeError, OSError):
|
||||
pass
|
||||
|
||||
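The comment in the hunk above relies on `sudo -S` consuming one line of stdin and leaving the rest for the child command. The sketch below imitates that behaviour with a small Python child process standing in for `sudo` (an assumption made so the example runs without sudo); `run_with_stdin` and `CHILD` are hypothetical names.

```python
import subprocess
import sys
from typing import Optional

# Stand-in for `sudo -S <cmd>`: reads one line (the "password"), then hands
# the remainder of stdin to the "child" by echoing it back.
CHILD = (
    "import sys\n"
    "first = sys.stdin.readline()\n"
    "rest = sys.stdin.read()\n"
    "print(len(first.rstrip('\\n')), rest, sep='|', end='')\n"
)


def run_with_stdin(effective_stdin: Optional[str]) -> str:
    """Open a stdin pipe only when there is something to write, as in the diff."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        stdin=subprocess.PIPE if effective_stdin is not None else subprocess.DEVNULL,
        text=True,
    )
    out, _ = proc.communicate(effective_stdin)
    return out
```

Calling `run_with_stdin("secret\npayload")` shows the first line being consumed separately from the payload that follows it.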
@@ -106,7 +106,20 @@ class ModalEnvironment(BaseEnvironment):
|
||||
marker = f"HERMES_EOF_{uuid.uuid4().hex[:8]}"
|
||||
command = f"{command} << '{marker}'\n{stdin_data}\n{marker}"
|
||||
|
||||
exec_command = self._prepare_command(command)
|
||||
exec_command, sudo_stdin = self._prepare_command(command)
|
||||
|
||||
# Modal sandboxes execute commands via the Modal SDK and cannot pipe
|
||||
# subprocess stdin directly the way a local Popen can. When a sudo
|
||||
# password is present, use a shell-level pipe from printf so that the
|
||||
# password feeds sudo -S without appearing as an echo argument embedded
|
||||
# in the shell string. The password is still visible in the remote
|
||||
# sandbox's command line, but it is not exposed on the user's local
|
||||
# machine — which is the primary threat being mitigated.
|
||||
if sudo_stdin is not None:
|
||||
import shlex
|
||||
exec_command = (
|
||||
f"printf '%s\\n' {shlex.quote(sudo_stdin.rstrip())} | {exec_command}"
|
||||
)
|
||||
|
||||
# Run in a background thread so we can poll for interrupts
|
||||
result_holder = {"value": None, "error": None}
|
||||
@@ -137,6 +150,10 @@ class ModalEnvironment(BaseEnvironment):
|
||||
|
||||
def cleanup(self):
|
||||
"""Snapshot the filesystem (if persistent) then stop the sandbox."""
|
||||
# Check if _inner was ever set (init may have failed)
|
||||
if not hasattr(self, '_inner') or self._inner is None:
|
||||
return
|
||||
|
||||
if self._persistent:
|
||||
try:
|
||||
sandbox = getattr(self._inner, 'deployment', None)
|
||||
|
||||
@@ -228,7 +228,15 @@ class SingularityEnvironment(BaseEnvironment):
|
||||
|
||||
effective_timeout = timeout or self.timeout
|
||||
work_dir = cwd or self.cwd
|
||||
exec_command = self._prepare_command(command)
|
||||
exec_command, sudo_stdin = self._prepare_command(command)
|
||||
|
||||
# Merge sudo password (if any) with caller-supplied stdin_data.
|
||||
if sudo_stdin is not None and stdin_data is not None:
|
||||
effective_stdin = sudo_stdin + stdin_data
|
||||
elif sudo_stdin is not None:
|
||||
effective_stdin = sudo_stdin
|
||||
else:
|
||||
effective_stdin = stdin_data
|
||||
|
||||
# apptainer exec --pwd doesn't expand ~, so prepend a cd into the command
|
||||
if work_dir == "~" or work_dir.startswith("~/"):
|
||||
@@ -245,12 +253,12 @@ class SingularityEnvironment(BaseEnvironment):
|
||||
proc = subprocess.Popen(
|
||||
cmd,
|
||||
stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
|
||||
stdin=subprocess.PIPE if stdin_data else subprocess.DEVNULL,
|
||||
stdin=subprocess.PIPE if effective_stdin else subprocess.DEVNULL,
|
||||
text=True,
|
||||
)
|
||||
if stdin_data:
|
||||
if effective_stdin:
|
||||
try:
|
||||
proc.stdin.write(stdin_data)
|
||||
proc.stdin.write(effective_stdin)
|
||||
proc.stdin.close()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
@@ -69,15 +69,23 @@ class SSHEnvironment(BaseEnvironment):
|
||||
timeout: int | None = None,
|
||||
stdin_data: str | None = None) -> dict:
|
||||
work_dir = cwd or self.cwd
|
||||
exec_command = self._prepare_command(command)
|
||||
exec_command, sudo_stdin = self._prepare_command(command)
|
||||
wrapped = f'cd {work_dir} && {exec_command}'
|
||||
effective_timeout = timeout or self.timeout
|
||||
|
||||
# Merge sudo password (if any) with caller-supplied stdin_data.
|
||||
if sudo_stdin is not None and stdin_data is not None:
|
||||
effective_stdin = sudo_stdin + stdin_data
|
||||
elif sudo_stdin is not None:
|
||||
effective_stdin = sudo_stdin
|
||||
else:
|
||||
effective_stdin = stdin_data
|
||||
|
||||
cmd = self._build_ssh_command()
|
||||
cmd.extend(["bash", "-c", wrapped])
|
||||
|
||||
try:
|
||||
kwargs = self._build_run_kwargs(timeout, stdin_data)
|
||||
kwargs = self._build_run_kwargs(timeout, effective_stdin)
|
||||
# Remove timeout from kwargs -- we handle it in the poll loop
|
||||
kwargs.pop("timeout", None)
|
||||
|
||||
@@ -87,13 +95,13 @@ class SSHEnvironment(BaseEnvironment):
|
||||
cmd,
|
||||
stdout=subprocess.PIPE,
|
||||
stderr=subprocess.STDOUT,
|
||||
stdin=subprocess.PIPE if stdin_data else subprocess.DEVNULL,
|
||||
stdin=subprocess.PIPE if effective_stdin else subprocess.DEVNULL,
|
||||
text=True,
|
||||
)
|
||||
|
||||
if stdin_data:
|
||||
if effective_stdin:
|
||||
try:
|
||||
proc.stdin.write(stdin_data)
|
||||
proc.stdin.write(effective_stdin)
|
||||
proc.stdin.close()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
@@ -962,37 +962,35 @@ class ShellFileOperations(FileOperations):
         # rg match lines: "file:lineno:content" (colon separator)
         # rg context lines: "file-lineno-content" (dash separator)
         # rg group seps: "--"
+        # Note: on Windows, paths contain drive letters (e.g. C:\path),
+        # so naive split(":") breaks. Use regex to handle both platforms.
+        _match_re = re.compile(r'^([A-Za-z]:)?(.*?):(\d+):(.*)$')
+        _ctx_re = re.compile(r'^([A-Za-z]:)?(.*?)-(\d+)-(.*)$')
         matches = []
         for line in result.stdout.strip().split('\n'):
             if not line or line == "--":
                 continue

             # Try match line first (colon-separated: file:line:content)
-            parts = line.split(':', 2)
-            if len(parts) >= 3:
-                try:
-                    matches.append(SearchMatch(
-                        path=parts[0],
-                        line_number=int(parts[1]),
-                        content=parts[2][:500]
-                    ))
-                    continue
-                except ValueError:
-                    pass
+            m = _match_re.match(line)
+            if m:
+                matches.append(SearchMatch(
+                    path=(m.group(1) or '') + m.group(2),
+                    line_number=int(m.group(3)),
+                    content=m.group(4)[:500]
+                ))
+                continue

             # Try context line (dash-separated: file-line-content)
             # Only attempt if context was requested to avoid false positives
             if context > 0:
-                parts = line.split('-', 2)
-                if len(parts) >= 3:
-                    try:
-                        matches.append(SearchMatch(
-                            path=parts[0],
-                            line_number=int(parts[1]),
-                            content=parts[2][:500]
-                        ))
-                    except ValueError:
-                        pass
+                m = _ctx_re.match(line)
+                if m:
+                    matches.append(SearchMatch(
+                        path=(m.group(1) or '') + m.group(2),
+                        line_number=int(m.group(3)),
+                        content=m.group(4)[:500]
+                    ))

         total = len(matches)
         page = matches[offset:offset + limit]
|
||||
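The two regular expressions introduced above can be exercised on their own. The sketch below copies the patterns from the diff; the `SearchMatch` tuple and the two `parse_*` functions are simplified stand-ins introduced only for this example.

```python
import re
from typing import NamedTuple, Optional


class SearchMatch(NamedTuple):
    """Simplified stand-in for the project's SearchMatch type."""
    path: str
    line_number: int
    content: str


# Patterns as they appear in the diff: an optional drive letter, a lazily
# matched path, the line number, then the rest of the line.
_MATCH_RE = re.compile(r'^([A-Za-z]:)?(.*?):(\d+):(.*)$')
_CTX_RE = re.compile(r'^([A-Za-z]:)?(.*?)-(\d+)-(.*)$')


def _build(m: Optional[re.Match]) -> Optional[SearchMatch]:
    if not m:
        return None
    return SearchMatch((m.group(1) or '') + m.group(2), int(m.group(3)), m.group(4)[:500])


def parse_match_line(line: str) -> Optional[SearchMatch]:
    """Parse a 'file:lineno:content' line, tolerating 'C:\\...' style paths."""
    return _build(_MATCH_RE.match(line))


def parse_context_line(line: str) -> Optional[SearchMatch]:
    """Parse a 'file-lineno-content' context line."""
    return _build(_CTX_RE.match(line))
```

Because the path group is lazy, a path that itself contains `-<digits>-` (for example `v-2-x/app.py`) is split at the first such occurrence on context lines, so the dash form remains ambiguous for those paths.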
@@ -1059,34 +1057,33 @@ class ShellFileOperations(FileOperations):
|
||||
# grep match lines: "file:lineno:content" (colon)
|
||||
# grep context lines: "file-lineno-content" (dash)
|
||||
# grep group seps: "--"
|
||||
# Note: on Windows, paths contain drive letters (e.g. C:\path),
|
||||
# so naive split(":") breaks. Use regex to handle both platforms.
|
||||
_match_re = re.compile(r'^([A-Za-z]:)?(.*?):(\d+):(.*)$')
|
||||
_ctx_re = re.compile(r'^([A-Za-z]:)?(.*?)-(\d+)-(.*)$')
|
||||
matches = []
|
||||
for line in result.stdout.strip().split('\n'):
|
||||
if not line or line == "--":
|
||||
continue
|
||||
|
||||
parts = line.split(':', 2)
|
||||
if len(parts) >= 3:
|
||||
try:
|
||||
matches.append(SearchMatch(
|
||||
path=parts[0],
|
||||
line_number=int(parts[1]),
|
||||
content=parts[2][:500]
|
||||
))
|
||||
continue
|
||||
except ValueError:
|
||||
pass
|
||||
m = _match_re.match(line)
|
||||
if m:
|
||||
matches.append(SearchMatch(
|
||||
path=(m.group(1) or '') + m.group(2),
|
||||
line_number=int(m.group(3)),
|
||||
content=m.group(4)[:500]
|
||||
))
|
||||
continue
|
||||
|
||||
if context > 0:
|
||||
parts = line.split('-', 2)
|
||||
if len(parts) >= 3:
|
||||
try:
|
||||
matches.append(SearchMatch(
|
||||
path=parts[0],
|
||||
line_number=int(parts[1]),
|
||||
content=parts[2][:500]
|
||||
))
|
||||
except ValueError:
|
||||
pass
|
||||
m = _ctx_re.match(line)
|
||||
if m:
|
||||
matches.append(SearchMatch(
|
||||
path=(m.group(1) or '') + m.group(2),
|
||||
line_number=int(m.group(3)),
|
||||
content=m.group(4)[:500]
|
||||
))
|
||||
|
||||
|
||||
total = len(matches)
|
||||
page = matches[offset:offset + limit]
|
||||
|
||||
@@ -91,6 +91,7 @@ def _get_file_ops(task_id: str = "default") -> ShellFileOperations:
|
||||
"container_memory": config.get("container_memory", 5120),
|
||||
"container_disk": config.get("container_disk", 51200),
|
||||
"container_persistent": config.get("container_persistent", True),
|
||||
"docker_volumes": config.get("docker_volumes", []),
|
||||
}
|
||||
terminal_env = _create_environment(
|
||||
env_type=env_type,
|
||||
|
||||
@@ -148,11 +148,14 @@ class ProcessRegistry:
         if use_pty:
             # Try PTY mode for interactive CLI tools
             try:
-                import ptyprocess
+                if _IS_WINDOWS:
+                    from winpty import PtyProcess as _PtyProcessCls
+                else:
+                    from ptyprocess import PtyProcess as _PtyProcessCls
                 user_shell = _find_shell()
                 pty_env = os.environ | (env_vars or {})
                 pty_env["PYTHONUNBUFFERED"] = "1"
-                pty_proc = ptyprocess.PtyProcess.spawn(
+                pty_proc = _PtyProcessCls.spawn(
                     [user_shell, "-lic", command],
                     cwd=session.cwd,
                     env=pty_env,
|
||||
|
||||
@@ -37,6 +37,7 @@ import logging
|
||||
import os
|
||||
import re
|
||||
import shutil
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any, Optional
|
||||
|
||||
@@ -190,6 +191,38 @@ def _validate_file_path(file_path: str) -> Optional[str]:
|
||||
return None
|
||||
|
||||
|
||||
def _atomic_write_text(file_path: Path, content: str, encoding: str = "utf-8") -> None:
|
||||
"""
|
||||
Atomically write text content to a file.
|
||||
|
||||
Uses a temporary file in the same directory and os.replace() to ensure
|
||||
the target file is never left in a partially-written state if the process
|
||||
crashes or is interrupted.
|
||||
|
||||
Args:
|
||||
file_path: Target file path
|
||||
content: Content to write
|
||||
encoding: Text encoding (default: utf-8)
|
||||
"""
|
||||
file_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
fd, temp_path = tempfile.mkstemp(
|
||||
dir=str(file_path.parent),
|
||||
prefix=f".{file_path.name}.tmp.",
|
||||
suffix="",
|
||||
)
|
||||
try:
|
||||
with os.fdopen(fd, "w", encoding=encoding) as f:
|
||||
f.write(content)
|
||||
os.replace(temp_path, file_path)
|
||||
except Exception:
|
||||
# Clean up temp file on error
|
||||
try:
|
||||
os.unlink(temp_path)
|
||||
except OSError:
|
||||
pass
|
||||
raise
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Core actions
|
||||
# =============================================================================
|
||||
@@ -218,9 +251,9 @@ def _create_skill(name: str, content: str, category: str = None) -> Dict[str, An
|
||||
skill_dir = _resolve_skill_dir(name, category)
|
||||
skill_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Write SKILL.md
|
||||
# Write SKILL.md atomically
|
||||
skill_md = skill_dir / "SKILL.md"
|
||||
skill_md.write_text(content, encoding="utf-8")
|
||||
_atomic_write_text(skill_md, content)
|
||||
|
||||
# Security scan — roll back on block
|
||||
scan_error = _security_scan_skill(skill_dir)
|
||||
@@ -256,13 +289,13 @@ def _edit_skill(name: str, content: str) -> Dict[str, Any]:
|
||||
skill_md = existing["path"] / "SKILL.md"
|
||||
# Back up original content for rollback
|
||||
original_content = skill_md.read_text(encoding="utf-8") if skill_md.exists() else None
|
||||
skill_md.write_text(content, encoding="utf-8")
|
||||
_atomic_write_text(skill_md, content)
|
||||
|
||||
# Security scan — roll back on block
|
||||
scan_error = _security_scan_skill(existing["path"])
|
||||
if scan_error:
|
||||
if original_content is not None:
|
||||
skill_md.write_text(original_content, encoding="utf-8")
|
||||
_atomic_write_text(skill_md, original_content)
|
||||
return {"success": False, "error": scan_error}
|
||||
|
||||
return {
|
||||
@@ -342,12 +375,12 @@ def _patch_skill(
|
||||
}
|
||||
|
||||
original_content = content # for rollback
|
||||
target.write_text(new_content, encoding="utf-8")
|
||||
_atomic_write_text(target, new_content)
|
||||
|
||||
# Security scan — roll back on block
|
||||
scan_error = _security_scan_skill(skill_dir)
|
||||
if scan_error:
|
||||
target.write_text(original_content, encoding="utf-8")
|
||||
_atomic_write_text(target, original_content)
|
||||
return {"success": False, "error": scan_error}
|
||||
|
||||
replacements = count if replace_all else 1
|
||||
@@ -394,13 +427,13 @@ def _write_file(name: str, file_path: str, file_content: str) -> Dict[str, Any]:
|
||||
target.parent.mkdir(parents=True, exist_ok=True)
|
||||
# Back up for rollback
|
||||
original_content = target.read_text(encoding="utf-8") if target.exists() else None
|
||||
target.write_text(file_content, encoding="utf-8")
|
||||
_atomic_write_text(target, file_content)
|
||||
|
||||
# Security scan — roll back on block
|
||||
scan_error = _security_scan_skill(existing["path"])
|
||||
if scan_error:
|
||||
if original_content is not None:
|
||||
target.write_text(original_content, encoding="utf-8")
|
||||
_atomic_write_text(target, original_content)
|
||||
else:
|
||||
target.unlink(missing_ok=True)
|
||||
return {"success": False, "error": scan_error}
|
||||
|
||||
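The hunks above all follow the same shape: remember the original content, write the new content atomically, run a scan, and restore (or delete) on failure. A compact sketch of that write, scan, roll back sequence is shown below. `write_with_rollback` and the `scan` callback are hypothetical; the real code calls `_security_scan_skill` on the skill directory.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Optional


def atomic_write_text(path: Path, content: str, encoding: str = "utf-8") -> None:
    """Write via a temp file in the same directory, then os.replace() into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=str(path.parent), prefix=f".{path.name}.tmp.")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as f:
            f.write(content)
        os.replace(tmp, path)
    except Exception:
        try:
            os.unlink(tmp)  # do not leave the temp file behind
        except OSError:
            pass
        raise


def write_with_rollback(target: Path, new_content: str,
                        scan: Callable[[Path], Optional[str]]) -> dict:
    """Write, scan, and restore the previous state if the scan reports an error."""
    original = target.read_text(encoding="utf-8") if target.exists() else None
    atomic_write_text(target, new_content)
    error = scan(target)  # hypothetical scanner: returns an error string or None
    if error:
        if original is not None:
            atomic_write_text(target, original)
        else:
            target.unlink(missing_ok=True)
        return {"success": False, "error": error}
    return {"success": True}
```

Using the atomic writer for the rollback as well means an interrupted restore cannot leave a half-written file either.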
@@ -63,6 +63,7 @@ Usage:
|
||||
"""
|
||||
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
@@ -71,6 +72,8 @@ from typing import Dict, Any, List, Optional, Tuple
|
||||
|
||||
import yaml
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
# All skills live in ~/.hermes/skills/ (seeded from bundled skills/ on install).
|
||||
# This is the single source of truth -- agent edits, hub installs, and bundled
|
||||
@@ -269,7 +272,11 @@ def _find_all_skills() -> List[Dict[str, Any]]:
|
||||
"category": category,
|
||||
})
|
||||
|
||||
except Exception:
|
||||
except (UnicodeDecodeError, PermissionError) as e:
|
||||
logger.warning("Failed to read skill file %s: %s", skill_md, e)
|
||||
continue
|
||||
except Exception as e:
|
||||
logger.warning("Error parsing skill %s: %s", skill_md, e, exc_info=True)
|
||||
continue
|
||||
|
||||
return skills
|
||||
@@ -308,7 +315,11 @@ def _load_category_description(category_dir: Path) -> Optional[str]:
|
||||
description = description[:MAX_DESCRIPTION_LENGTH - 3] + "..."
|
||||
|
||||
return description if description else None
|
||||
except Exception:
|
||||
except (UnicodeDecodeError, PermissionError) as e:
|
||||
logger.debug("Failed to read category description %s: %s", desc_file, e)
|
||||
return None
|
||||
except Exception as e:
|
||||
logger.warning("Error parsing category description %s: %s", desc_file, e, exc_info=True)
|
||||
return None
|
||||
|
||||
|
||||
|
||||
@@ -29,6 +29,7 @@ Usage:
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import platform
|
||||
import signal
|
||||
import sys
|
||||
import time
|
||||
@@ -83,8 +84,8 @@ def _check_disk_usage_warning():
|
||||
if f.is_file():
|
||||
try:
|
||||
total_bytes += f.stat().st_size
|
||||
except OSError:
|
||||
pass
|
||||
except OSError as e:
|
||||
logger.debug("Could not stat file %s: %s", f, e)
|
||||
|
||||
total_gb = total_bytes / (1024 ** 3)
|
||||
|
||||
@@ -192,23 +193,35 @@ def _prompt_for_sudo_password(timeout_seconds: int = 45) -> str:
|
||||
result = {"password": None, "done": False}
|
||||
|
||||
def read_password_thread():
|
||||
"""Read password from /dev/tty with echo disabled."""
|
||||
"""Read password with echo disabled. Uses msvcrt on Windows, /dev/tty on Unix."""
|
||||
tty_fd = None
|
||||
old_attrs = None
|
||||
try:
|
||||
import termios
|
||||
tty_fd = os.open("/dev/tty", os.O_RDONLY)
|
||||
old_attrs = termios.tcgetattr(tty_fd)
|
||||
new_attrs = termios.tcgetattr(tty_fd)
|
||||
new_attrs[3] = new_attrs[3] & ~termios.ECHO
|
||||
termios.tcsetattr(tty_fd, termios.TCSAFLUSH, new_attrs)
|
||||
chars = []
|
||||
while True:
|
||||
b = os.read(tty_fd, 1)
|
||||
if not b or b in (b"\n", b"\r"):
|
||||
break
|
||||
chars.append(b)
|
||||
result["password"] = b"".join(chars).decode("utf-8", errors="replace")
|
||||
if platform.system() == "Windows":
|
||||
import msvcrt
|
||||
chars = []
|
||||
while True:
|
||||
c = msvcrt.getwch()
|
||||
if c in ("\r", "\n"):
|
||||
break
|
||||
if c == "\x03":
|
||||
raise KeyboardInterrupt
|
||||
chars.append(c)
|
||||
result["password"] = "".join(chars)
|
||||
else:
|
||||
import termios
|
||||
tty_fd = os.open("/dev/tty", os.O_RDONLY)
|
||||
old_attrs = termios.tcgetattr(tty_fd)
|
||||
new_attrs = termios.tcgetattr(tty_fd)
|
||||
new_attrs[3] = new_attrs[3] & ~termios.ECHO
|
||||
termios.tcsetattr(tty_fd, termios.TCSAFLUSH, new_attrs)
|
||||
chars = []
|
||||
while True:
|
||||
b = os.read(tty_fd, 1)
|
||||
if not b or b in (b"\n", b"\r"):
|
||||
break
|
||||
chars.append(b)
|
||||
result["password"] = b"".join(chars).decode("utf-8", errors="replace")
|
||||
except (EOFError, KeyboardInterrupt, OSError):
|
||||
result["password"] = ""
|
||||
except Exception:
|
||||
@@ -218,13 +231,13 @@ def _prompt_for_sudo_password(timeout_seconds: int = 45) -> str:
|
||||
try:
|
||||
import termios as _termios
|
||||
_termios.tcsetattr(tty_fd, _termios.TCSAFLUSH, old_attrs)
|
||||
except Exception:
|
||||
pass
|
||||
except Exception as e:
|
||||
logger.debug("Failed to restore terminal attributes: %s", e)
|
||||
if tty_fd is not None:
|
||||
try:
|
||||
os.close(tty_fd)
|
||||
except Exception:
|
||||
pass
|
||||
except Exception as e:
|
||||
logger.debug("Failed to close tty fd: %s", e)
|
||||
result["done"] = True
|
||||
|
||||
try:
|
||||
@@ -278,32 +291,50 @@ def _prompt_for_sudo_password(timeout_seconds: int = 45) -> str:
|
||||
del os.environ["HERMES_SPINNER_PAUSE"]
|
||||
|
||||
|
||||
def _transform_sudo_command(command: str) -> str:
|
||||
def _transform_sudo_command(command: str) -> tuple[str, str | None]:
|
||||
"""
|
||||
Transform sudo commands to use -S flag if SUDO_PASSWORD is available.
|
||||
|
||||
|
||||
This is a shared helper used by all execution environments to provide
|
||||
consistent sudo handling across local, SSH, and container environments.
|
||||
|
||||
If SUDO_PASSWORD is set (via env, config, or interactive prompt):
|
||||
'sudo apt install curl' -> password piped via sudo -S
|
||||
|
||||
|
||||
Returns:
|
||||
(transformed_command, sudo_stdin) where:
|
||||
- transformed_command has every bare ``sudo`` replaced with
|
||||
``sudo -S -p ''`` so sudo reads its password from stdin.
|
||||
- sudo_stdin is the password string with a trailing newline that the
|
||||
caller must prepend to the process's stdin stream. sudo -S reads
|
||||
exactly one line (the password) and passes the rest of stdin to the
|
||||
child command, so prepending is safe even when the caller also has
|
||||
its own stdin_data to pipe.
|
||||
- If no password is available, sudo_stdin is None and the command is
|
||||
returned unchanged so it fails gracefully with
|
||||
"sudo: a password is required".
|
||||
|
||||
Callers that drive a subprocess directly (local, ssh, docker, singularity)
|
||||
should prepend sudo_stdin to their stdin_data and pass the merged text to
|
||||
Popen's stdin pipe.
|
||||
|
||||
Callers that cannot pipe subprocess stdin (modal, daytona) must embed the
|
||||
password in the command string themselves; see their execute() methods for
|
||||
how they handle the non-None sudo_stdin case.
|
||||
|
||||
If SUDO_PASSWORD is not set and in interactive mode (HERMES_INTERACTIVE=1):
|
||||
Prompts user for password with 45s timeout, caches for session.
|
||||
|
||||
|
||||
If SUDO_PASSWORD is not set and NOT interactive:
|
||||
Command runs as-is (fails gracefully with "sudo: a password is required").
|
||||
"""
|
||||
global _cached_sudo_password
|
||||
import re
|
||||
|
||||
|
||||
# Check if command even contains sudo
|
||||
if not re.search(r'\bsudo\b', command):
|
||||
return command # No sudo in command, return as-is
|
||||
|
||||
return command, None # No sudo in command, nothing to do
|
||||
|
||||
# Try to get password from: env var -> session cache -> interactive prompt
|
||||
sudo_password = os.getenv("SUDO_PASSWORD", "") or _cached_sudo_password
|
||||
|
||||
|
||||
if not sudo_password:
|
||||
# No password configured - check if we're in interactive mode
|
||||
if os.getenv("HERMES_INTERACTIVE"):
|
||||
@@ -311,21 +342,21 @@ def _transform_sudo_command(command: str) -> str:
|
||||
sudo_password = _prompt_for_sudo_password(timeout_seconds=45)
|
||||
if sudo_password:
|
||||
_cached_sudo_password = sudo_password # Cache for session
|
||||
|
||||
|
||||
if not sudo_password:
|
||||
return command # No password, let it fail gracefully
|
||||
|
||||
return command, None # No password, let it fail gracefully
|
||||
|
||||
def replace_sudo(match):
|
||||
# Replace 'sudo' with password-piped version
|
||||
# The -S flag makes sudo read password from stdin
|
||||
# The -p '' suppresses the password prompt
|
||||
# Use shlex.quote() to prevent shell injection via password content
|
||||
import shlex
|
||||
return f"echo {shlex.quote(sudo_password)} | sudo -S -p ''"
|
||||
|
||||
# Replace bare 'sudo' with "sudo -S -p ''".
# The password is returned as sudo_stdin and written to the process's
# stdin pipe by callers that can pipe stdin, so this function never
# places it in a command-line argument or shell string.
|
||||
return "sudo -S -p ''"
|
||||
|
||||
# Match 'sudo' at word boundaries (not 'visudo' or 'sudoers')
|
||||
# This handles: sudo, sudo -flag, etc.
|
||||
return re.sub(r'\bsudo\b', replace_sudo, command)
|
||||
transformed = re.sub(r'\bsudo\b', replace_sudo, command)
|
||||
# Trailing newline is required: sudo -S reads one line for the password.
|
||||
return transformed, sudo_password + "\n"
|
||||
|
||||
|
||||
# Environment classes now live in tools/environments/
|
||||
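The new return contract of `_transform_sudo_command` can be illustrated with a reduced version that takes the password as an argument instead of reading the environment, the session cache, or an interactive prompt. This is a sketch under that assumption; `transform_sudo` is a hypothetical name.

```python
import re
from typing import Optional, Tuple


def transform_sudo(command: str, password: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (transformed_command, sudo_stdin) following the contract above.

    Simplification: the password is passed in explicitly rather than looked
    up from SUDO_PASSWORD, a cache, or a prompt.
    """
    if not password or not re.search(r'\bsudo\b', command):
        return command, None  # nothing to rewrite, nothing to pipe
    # Word boundaries leave 'visudo' and 'sudoers' untouched.
    transformed = re.sub(r'\bsudo\b', "sudo -S -p ''", command)
    # Trailing newline: sudo -S reads one line for the password.
    return transformed, password + "\n"
```

One property worth knowing: the word-boundary match does not understand shell quoting, so `echo "use sudo"` is rewritten as well.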
@@ -424,7 +455,8 @@ def _get_env_config() -> Dict[str, Any]:
|
||||
# SSH is excluded since /home/ paths are valid on remote machines.
|
||||
cwd = os.getenv("TERMINAL_CWD", default_cwd)
|
||||
if env_type in ("modal", "docker", "singularity", "daytona") and cwd:
|
||||
host_prefixes = ("/Users/", "C:\\", "C:/")
|
||||
# Host paths that won't exist inside containers
|
||||
host_prefixes = ("/Users/", "/home/", "C:\\", "C:/")
|
||||
if any(cwd.startswith(p) for p in host_prefixes) and cwd != default_cwd:
|
||||
logger.info("Ignoring TERMINAL_CWD=%r for %s backend "
|
||||
"(host path won't exist in sandbox). Using %r instead.",
|
||||
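The host-path check above, now including `/home/`, amounts to a prefix test applied only to sandboxed backends. A small sketch follows; `resolve_cwd` is a hypothetical name and the default directories used in the usage below are assumptions, not values from the diff.

```python
CONTAINER_BACKENDS = ("modal", "docker", "singularity", "daytona")
# Host paths that won't exist inside containers (tuple taken from the diff).
HOST_PREFIXES = ("/Users/", "/home/", "C:\\", "C:/")


def resolve_cwd(env_type: str, cwd: str, default_cwd: str) -> str:
    """Fall back to default_cwd when a host-only path is used with a sandbox."""
    if env_type in CONTAINER_BACKENDS and cwd:
        # str.startswith accepts a tuple of prefixes.
        if cwd.startswith(HOST_PREFIXES) and cwd != default_cwd:
            return default_cwd
    return cwd
```

SSH is deliberately absent from the backend tuple because `/home/` paths are valid on remote machines, as the comment in the diff explains.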
@@ -658,8 +690,8 @@ def get_active_environments_info() -> Dict[str, Any]:
|
||||
try:
|
||||
size = sum(f.stat().st_size for f in Path(path).rglob('*') if f.is_file())
|
||||
total_size += size
|
||||
except OSError:
|
||||
pass
|
||||
except OSError as e:
|
||||
logger.debug("Could not stat path %s: %s", path, e)
|
||||
|
||||
info["total_disk_usage_mb"] = round(total_size / (1024 * 1024), 2)
|
||||
return info
|
||||
@@ -686,8 +718,8 @@ def cleanup_all_environments():
|
||||
try:
|
||||
shutil.rmtree(path, ignore_errors=True)
|
||||
logger.info("Removed orphaned: %s", path)
|
||||
except OSError:
|
||||
pass
|
||||
except OSError as e:
|
||||
logger.debug("Failed to remove orphaned path %s: %s", path, e)
|
||||
|
||||
if cleaned > 0:
|
||||
logger.info("Cleaned %d environments", cleaned)
|
||||
|
||||
@@ -83,7 +83,11 @@ def _load_tts_config() -> Dict[str, Any]:
|
||||
from hermes_cli.config import load_config
|
||||
config = load_config()
|
||||
return config.get("tts", {})
|
||||
except Exception:
|
||||
except ImportError:
|
||||
logger.debug("hermes_cli.config not available, using default TTS config")
|
||||
return {}
|
||||
except Exception as e:
|
||||
logger.warning("Failed to load TTS config: %s", e, exc_info=True)
|
||||
return {}
|
||||
|
||||
|
||||
@@ -115,15 +119,23 @@ def _convert_to_opus(mp3_path: str) -> Optional[str]:
|
||||
|
||||
ogg_path = mp3_path.rsplit(".", 1)[0] + ".ogg"
|
||||
try:
|
||||
subprocess.run(
|
||||
result = subprocess.run(
|
||||
["ffmpeg", "-i", mp3_path, "-acodec", "libopus",
|
||||
"-ac", "1", "-b:a", "64k", "-vbr", "off", ogg_path, "-y"],
|
||||
capture_output=True, timeout=30,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
logger.warning("ffmpeg conversion failed with return code %d: %s",
|
||||
result.returncode, result.stderr.decode('utf-8', errors='ignore')[:200])
|
||||
return None
|
||||
if os.path.exists(ogg_path) and os.path.getsize(ogg_path) > 0:
|
||||
return ogg_path
|
||||
except subprocess.TimeoutExpired:
|
||||
logger.warning("ffmpeg OGG conversion timed out after 30s")
|
||||
except FileNotFoundError:
|
||||
logger.warning("ffmpeg not found in PATH")
|
||||
except Exception as e:
|
||||
logger.warning("ffmpeg OGG conversion failed: %s", e)
|
||||
logger.warning("ffmpeg OGG conversion failed: %s", e, exc_info=True)
|
||||
return None
|
||||
|
||||
|
||||
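The revised `_convert_to_opus` checks the return code before trusting the output file and separates "timed out" from "binary missing". The same structure works for any external converter; the sketch below is generic so that it runs without ffmpeg. `run_and_check_output` is a hypothetical name.

```python
import logging
import os
import subprocess
from typing import List, Optional

logger = logging.getLogger(__name__)


def run_and_check_output(cmd: List[str], output_path: str,
                         timeout: float = 30) -> Optional[str]:
    """Run an external tool and return output_path only if it clearly succeeded."""
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout)
        if result.returncode != 0:
            stderr_head = result.stderr.decode("utf-8", errors="ignore")[:200]
            logger.warning("%s failed with return code %d: %s",
                           cmd[0], result.returncode, stderr_head)
            return None
        # A zero exit code is not enough: require a non-empty output file.
        if os.path.exists(output_path) and os.path.getsize(output_path) > 0:
            return output_path
    except subprocess.TimeoutExpired:
        logger.warning("%s timed out after %ss", cmd[0], timeout)
    except FileNotFoundError:
        logger.warning("%s not found in PATH", cmd[0])
    return None
```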
@@ -369,10 +381,21 @@ def text_to_speech_tool(
|
||||
"voice_compatible": voice_compatible,
|
||||
}, ensure_ascii=False)
|
||||
|
||||
except Exception as e:
|
||||
error_msg = f"TTS generation failed ({provider}): {e}"
|
||||
except ValueError as e:
|
||||
# Configuration errors (missing API keys, etc.)
|
||||
error_msg = f"TTS configuration error ({provider}): {e}"
|
||||
logger.error("%s", error_msg)
|
||||
return json.dumps({"success": False, "error": error_msg}, ensure_ascii=False)
|
||||
except FileNotFoundError as e:
|
||||
# Missing dependencies or files
|
||||
error_msg = f"TTS dependency missing ({provider}): {e}"
|
||||
logger.error("%s", error_msg, exc_info=True)
|
||||
return json.dumps({"success": False, "error": error_msg}, ensure_ascii=False)
|
||||
except Exception as e:
|
||||
# Unexpected errors
|
||||
error_msg = f"TTS generation failed ({provider}): {e}"
|
||||
logger.error("%s", error_msg, exc_info=True)
|
||||
return json.dumps({"success": False, "error": error_msg}, ensure_ascii=False)
|
||||
|
||||
|
||||
# ===========================================================================
|
||||
|
@@ -6,6 +6,8 @@ import tempfile
from pathlib import Path
from typing import Any, Union

import yaml


def atomic_json_write(path: Union[str, Path], data: Any, *, indent: int = 2) -> None:
    """Write JSON data to a file atomically.
@@ -39,3 +41,49 @@ def atomic_json_write(path: Union[str, Path], data: Any, *, indent: int = 2) ->
        except OSError:
            pass
        raise


def atomic_yaml_write(
    path: Union[str, Path],
    data: Any,
    *,
    default_flow_style: bool = False,
    sort_keys: bool = False,
    extra_content: str | None = None,
) -> None:
    """Write YAML data to a file atomically.

    Uses temp file + fsync + os.replace to ensure the target file is never
    left in a partially-written state. If the process crashes mid-write,
    the previous version of the file remains intact.

    Args:
        path: Target file path (will be created or overwritten).
        data: YAML-serializable data to write.
        default_flow_style: YAML flow style (default False).
        sort_keys: Whether to sort dict keys (default False).
        extra_content: Optional string to append after the YAML dump
            (e.g. commented-out sections for user reference).
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)

    fd, tmp_path = tempfile.mkstemp(
        dir=str(path.parent),
        prefix=f".{path.stem}_",
        suffix=".tmp",
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            yaml.dump(data, f, default_flow_style=default_flow_style, sort_keys=sort_keys)
            if extra_content:
                f.write(extra_content)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
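
The docstring's claim that a failed write leaves the previous file intact can be checked with a small stdlib-only sketch. `atomic_text_write` and its `writer` callback are hypothetical stand-ins that follow the same temp file + fsync + `os.replace` steps without needing PyYAML; they are not part of this change.

```python
import os
import tempfile
from pathlib import Path


def atomic_text_write(path, writer):
    """Hypothetical stand-in: let writer(f) fill a temp file, then swap it in."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=str(path.parent), prefix=f".{path.stem}_", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            writer(f)                  # may raise part-way through
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)     # the target changes only at this point
    except BaseException:
        try:
            os.unlink(tmp_path)        # do not leave the temp file behind
        except OSError:
            pass
        raise


def failing_writer(f):
    f.write("model: b")                # partial output reaches only the temp file
    raise RuntimeError("simulated crash mid-write")


with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "config.yaml"
    atomic_text_write(target, lambda f: f.write("model: a\n"))
    try:
        atomic_text_write(target, failing_writer)
    except RuntimeError:
        pass
    surviving = target.read_text(encoding="utf-8")
    leftovers = [p.name for p in Path(d).iterdir() if p.name != "config.yaml"]
```

After the simulated failure, `surviving` still holds the first version and `leftovers` is empty, because the temp file is removed in the `except` branch.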
@@ -24,6 +24,7 @@ These are commands you run from your shell.
| `hermes chat --toolsets "web,terminal"` / `-t` | Use specific toolsets |
| `hermes chat --verbose` | Enable verbose/debug output |
| `hermes --worktree` / `-w` | Start in an isolated git worktree (for parallel agents) |
| `hermes --checkpoints` | Enable filesystem checkpoints before destructive file operations |

### Provider & Model Management

@@ -202,6 +203,8 @@ These work in messaging platforms (Telegram, Discord, Slack, WhatsApp) but not t
| `/sethome` | Set this chat as the home channel |
| `/status` | Show session info |
| `/reload-mcp` | Reload MCP servers from config |
| `/rollback` | List filesystem checkpoints for the current directory |
| `/rollback <N>` | Restore files to checkpoint #N |
| `/update` | Update Hermes Agent to the latest version |

---

@@ -663,6 +663,16 @@ browser:
  record_sessions: false  # Auto-record browser sessions as WebM videos to ~/.hermes/browser_recordings/
```

## Checkpoints

Automatic filesystem snapshots before destructive file operations. See the [Checkpoints feature page](/docs/user-guide/features/checkpoints) for details.

```yaml
checkpoints:
  enabled: false      # Enable automatic checkpoints (also: hermes --checkpoints)
  max_snapshots: 50   # Max checkpoints to keep per directory
```

## Delegation

Configure subagent behavior for the delegate tool:

@@ -0,0 +1,97 @@
# Filesystem Checkpoints

Hermes can automatically snapshot your working directory before making file changes, giving you a safety net to roll back if something goes wrong.

## How It Works

When enabled, Hermes takes **one snapshot per conversation turn**, immediately before the first file-modifying operation (`write_file` or `patch`) in that turn. This creates a point-in-time backup that you can restore at any time.

Under the hood, checkpoints use a **shadow git repository** stored at `~/.hermes/checkpoints/`. This is completely separate from your project's git: no `.git` directory is created in your project, and your own git history is never touched.
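
As a rough illustration of the shadow-repository idea, the sketch below builds an environment that points git at a repository outside the project by setting git's `GIT_DIR` and `GIT_WORK_TREE` variables. The function name `shadow_git_env` and the hashed per-project subdirectory are assumptions made for this example; the actual layout under `~/.hermes/checkpoints/` is not described here.

```python
import hashlib
import os
from pathlib import Path


def shadow_git_env(project_dir, checkpoints_root="~/.hermes/checkpoints"):
    """Return an environment that makes git use a shadow repository.

    The subdirectory name (a hash of the project path) is an assumption
    of this sketch, not necessarily how Hermes names its repositories.
    """
    project = Path(project_dir).resolve()
    key = hashlib.sha256(str(project).encode("utf-8")).hexdigest()[:16]
    shadow = Path(checkpoints_root).expanduser() / key
    env = dict(os.environ)
    env["GIT_DIR"] = str(shadow)          # where git keeps objects and refs
    env["GIT_WORK_TREE"] = str(project)   # the files being snapshotted
    return env


env = shadow_git_env("/home/user/project", checkpoints_root="/tmp/hermes-checkpoints")
```

Git commands run with this environment (for example through `subprocess.run(..., env=env, cwd=project_dir)`) read files from the project but store history in the shadow location, so the project's own `.git/` is not involved.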

## Enabling Checkpoints

### Per-session (CLI flag)

```bash
hermes --checkpoints
```

### Permanently (config.yaml)

```yaml
# ~/.hermes/config.yaml
checkpoints:
  enabled: true
  max_snapshots: 50  # max checkpoints per directory (default: 50)
```
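
The two ways of enabling checkpoints could be combined roughly as follows. This is a sketch under assumptions: `CheckpointSettings` and `resolve_checkpoint_settings` are hypothetical names, and treating the CLI flag as an override that can only turn checkpoints on is a guess about precedence. The defaults (`enabled: false`, `max_snapshots: 50`) come from the configuration above.

```python
from dataclasses import dataclass


@dataclass
class CheckpointSettings:
    enabled: bool = False      # documented default
    max_snapshots: int = 50    # documented default


def resolve_checkpoint_settings(config, cli_flag=False):
    """Merge the `checkpoints:` section of a parsed config with the CLI flag."""
    section = (config or {}).get("checkpoints") or {}
    return CheckpointSettings(
        # Assumption: --checkpoints enables the feature even if the file says false.
        enabled=bool(cli_flag or section.get("enabled", False)),
        max_snapshots=int(section.get("max_snapshots", 50)),
    )


from_flag = resolve_checkpoint_settings({}, cli_flag=True)
from_file = resolve_checkpoint_settings({"checkpoints": {"enabled": True, "max_snapshots": 10}})
```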

## Rolling Back

Use the `/rollback` slash command:

```
/rollback          # List all available checkpoints
/rollback 1        # Restore to checkpoint #1 (most recent)
/rollback 3        # Restore to checkpoint #3 (further back)
/rollback abc1234  # Restore by git commit hash
```

Example output:

```
📸 Checkpoints for /home/user/project:

  1. abc1234  2026-03-10 14:22  before write_file
  2. def5678  2026-03-10 14:15  before patch
  3. ghi9012  2026-03-10 14:08  before write_file

Use /rollback <number> to restore, e.g. /rollback 1
```

When you restore, Hermes automatically takes a **pre-rollback snapshot** first, so you can always undo your undo.
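
The argument forms accepted by `/rollback` suggest a lookup along these lines. `resolve_rollback_target` is a hypothetical helper, and accepting an abbreviated hash is an assumption of this sketch; the placeholder hashes are the ones from the example output above.

```python
from typing import List, Optional


def resolve_rollback_target(arg: str, checkpoints: List[str]) -> Optional[str]:
    """Map a /rollback argument to a commit hash.

    `checkpoints` is ordered newest first, so number 1 is the most recent.
    Returns None when the argument matches nothing.
    """
    arg = arg.strip()
    if not arg:
        return None
    if arg.isdigit():
        index = int(arg)
        if 1 <= index <= len(checkpoints):
            return checkpoints[index - 1]
    # Otherwise treat the argument as a (possibly abbreviated) commit hash.
    matches = [h for h in checkpoints if h.startswith(arg)]
    return matches[0] if len(matches) == 1 else None


listed = ["abc1234", "def5678", "ghi9012"]
most_recent = resolve_rollback_target("1", listed)
by_hash = resolve_rollback_target("def5678", listed)
```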

## What Gets Checkpointed

Checkpoints capture the entire working directory (the project root), excluding common large/sensitive patterns:

- `node_modules/`, `dist/`, `build/`
- `.env`, `.env.*`
- `__pycache__/`, `*.pyc`
- `.venv/`, `venv/`
- `.git/`
- `.DS_Store`, `*.log`
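
To make the matching rules concrete, here is a minimal sketch that applies the patterns listed above to a relative path. `is_excluded` is a hypothetical helper; Hermes may instead rely on git's own exclude handling, so this only illustrates which paths the list is meant to cover.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Directory names and file patterns taken from the list above.
EXCLUDED_DIRS = {"node_modules", "dist", "build", "__pycache__", ".venv", "venv", ".git"}
EXCLUDED_FILE_PATTERNS = (".env", ".env.*", "*.pyc", ".DS_Store", "*.log")


def is_excluded(relative_path: str) -> bool:
    """Return True if a project-relative file path matches an excluded pattern."""
    parts = PurePosixPath(relative_path).parts
    if not parts:
        return False
    # Any excluded directory on the way to the file excludes the file.
    if any(part in EXCLUDED_DIRS for part in parts[:-1]):
        return True
    # Otherwise compare the file name against the file patterns.
    return any(fnmatchcase(parts[-1], pattern) for pattern in EXCLUDED_FILE_PATTERNS)


examples = {p: is_excluded(p) for p in ("src/app.py", ".env.local", "node_modules/react/index.js")}
```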

## Performance

Checkpoints are designed to be lightweight:

- **Once per turn**: only the first file operation in a turn triggers a snapshot, not every write
- **Skips large directories**: directories with more than 50,000 files are skipped automatically
- **Skips when nothing changed**: if no files were modified since the last checkpoint, no commit is created
- **Non-blocking**: if a checkpoint fails for any reason, the file operation proceeds normally
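
Three of the rules above (once per turn, the 50,000-file limit, and non-blocking failures) can be sketched as a small gate in front of the snapshot call. `CheckpointGate` and its method names are hypothetical, and the caller is assumed to supply the file count; the "nothing changed" check is left out because it depends on git state.

```python
class CheckpointGate:
    """Hypothetical helper illustrating the once-per-turn and non-blocking rules."""

    def __init__(self, take_snapshot, max_files=50_000):
        self._take_snapshot = take_snapshot
        self._max_files = max_files
        self._done_this_turn = set()

    def new_turn(self):
        """Forget which directories were already handled in the previous turn."""
        self._done_this_turn.clear()

    def before_file_operation(self, project_root, file_count):
        """Return True if a snapshot was taken, False if it was skipped."""
        if project_root in self._done_this_turn:
            return False                  # only the first operation per turn counts
        self._done_this_turn.add(project_root)
        if file_count > self._max_files:
            return False                  # very large directories are skipped
        try:
            self._take_snapshot(project_root)
            return True
        except Exception:
            return False                  # a failed snapshot never blocks the write


taken = []
gate = CheckpointGate(taken.append)
results = [
    gate.before_file_operation("/proj", 120),   # first write in the turn
    gate.before_file_operation("/proj", 120),   # second write, same turn
]
gate.new_turn()
results.append(gate.before_file_operation("/proj", 120))
```

Here `results` ends up as `[True, False, True]`: one snapshot per turn, regardless of how many writes happen within it.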

## How It Determines the Project Root

When you write to a file like `src/components/Button.tsx`, Hermes walks up the directory tree looking for project markers (`.git`, `pyproject.toml`, `package.json`, `Cargo.toml`, etc.) to find the project root. This ensures the entire project is checkpointed, not just the file's parent directory.
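
The upward walk can be sketched in a few lines. `find_project_root` is a hypothetical name, the marker tuple contains only the four markers named above (the real list is longer, per the "etc."), and falling back to the file's own directory when no marker is found is an assumption.

```python
import tempfile
from pathlib import Path

PROJECT_MARKERS = (".git", "pyproject.toml", "package.json", "Cargo.toml")


def find_project_root(file_path, markers=PROJECT_MARKERS):
    """Walk up from the file's directory until a project marker is found."""
    start = Path(file_path).resolve().parent
    for directory in (start, *start.parents):
        if any((directory / marker).exists() for marker in markers):
            return directory
    return start  # assumption: no marker found, use the file's own directory


with tempfile.TemporaryDirectory() as d:
    project = Path(d).resolve() / "project"
    (project / "src" / "components").mkdir(parents=True)
    (project / "package.json").write_text("{}", encoding="utf-8")
    found = find_project_root(project / "src" / "components" / "Button.tsx")
    is_project_root = found == project
```

In this layout the walk starts at `src/components`, passes `src`, and stops at the directory containing `package.json`.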

## Platforms

Checkpoints work on both:
- **CLI**: uses your current working directory
- **Gateway** (Telegram, Discord, etc.): uses `MESSAGING_CWD`

The `/rollback` command is available on all platforms.

## FAQ

**Does this conflict with my project's git?**
No. Checkpoints use a completely separate shadow git repository, selected via git's `GIT_DIR` environment variable. Your project's `.git/` is never touched.

**How much disk space do checkpoints use?**
Git compresses stored content and only stores new data for files that changed, so for most projects checkpoint data is negligible. Old checkpoints are pruned when `max_snapshots` is exceeded.
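
The pruning rule can be pictured as a simple cut on a newest-first list. This sketch only shows which checkpoints would be kept; how old commits are actually dropped from the shadow repository is not described here, and `prune_checkpoints` is a hypothetical name that assumes the oldest checkpoints go first.

```python
def prune_checkpoints(checkpoints, max_snapshots=50):
    """Split a newest-first list into (kept, pruned)."""
    if max_snapshots < 0:
        raise ValueError("max_snapshots must be non-negative")
    return checkpoints[:max_snapshots], checkpoints[max_snapshots:]


kept, pruned = prune_checkpoints(["c5", "c4", "c3", "c2", "c1"], max_snapshots=3)
```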

**Can I checkpoint without git installed?**
No. Git must be available on your PATH. If it is not installed, checkpoints are silently disabled.

**Can I roll back across sessions?**
Yes. Checkpoints persist in `~/.hermes/checkpoints/` and survive across sessions, so you can roll back to a checkpoint from yesterday.
@@ -5,7 +5,7 @@ import type * as Preset from '@docusaurus/preset-classic';
 const config: Config = {
   title: 'Hermes Agent',
   tagline: 'The self-improving AI agent',
-  favicon: 'img/favicon.svg',
+  favicon: 'img/favicon.ico',

   url: 'https://hermes-agent.nousresearch.com',
   baseUrl: '/docs/',
@@ -53,7 +53,7 @@ const config: Config = {
       title: 'Hermes Agent',
       logo: {
         alt: 'Hermes Agent',
-        src: 'img/favicon.svg',
+        src: 'img/logo.png',
       },
       items: [
         {

[3 image files; sizes after the change: 28 KiB, 870 B, 2.5 KiB]