Compare commits

..

24 Commits

Author SHA1 Message Date
teknium1 38b4fd3737 fix(gateway): make group session isolation configurable
default group and channel sessions to per-user isolation, allow opting back into shared room sessions via config.yaml, and document Discord gateway routing and session behavior.
2026-03-16 00:22:23 -07:00
teknium1 06a7d19f98 fix(gateway): isolate group sessions per user
Include participant identifiers in non-DM session keys when available so group and channel conversations no longer share one transcript across every active user in the chat.
2026-03-15 23:08:56 -07:00
teknium1 1f72ce71b7 fix: restore local STT fallback for gateway voice notes
Restore local STT command fallback for voice transcription, detect whisper and ffmpeg in common local install paths, and avoid bogus no-provider messaging when only a backend-specific key is missing.
2026-03-15 21:51:40 -07:00
Teknium 5beb681c70 fix(cli): prefer curses over simple_term_menu in setup.py (#1487) 2026-03-15 21:16:21 -07:00
Teknium c9a9db318e feat(tools): persistent shell mode for local and SSH backends (#1483)
feat(tools): persistent shell mode for local and SSH backends
2026-03-15 21:14:01 -07:00
teknium1 01e62c067b merge: resolve conflicts with origin/main (SSH preflight check) 2026-03-15 21:13:40 -07:00
Teknium ceb970c559 fix(terminal): add SSH preflight check (#1486) 2026-03-15 21:09:07 -07:00
teknium1 6894358fe1 docs: add persistent shell section to configuration and env-vars reference
Documents terminal.persistent_shell config option, per-backend env var
overrides, precedence table, and what state persists across commands.
2026-03-15 21:01:50 -07:00
Teknium 3f0f4a04a9 fix(agent): skip reasoning extra_body for unsupported OpenRouter models (#1485)
* fix(agent): skip reasoning extra_body for models that don't support it

Sending reasoning config to models like MiniMax or Nvidia via OpenRouter
causes a 400 BadRequestError. Previously, reasoning extra_body was sent
to all OpenRouter and Nous models unconditionally.

Fix: only send reasoning extra_body when the model slug starts with a
known reasoning-capable prefix (deepseek/, anthropic/, openai/, x-ai/,
google/gemini-2, qwen/qwen3) or when using Nous Portal directly.

Applies to both the main API call path (_build_api_kwargs) and the
conversation summary path.

Fixes #1083

* test(agent): cover reasoning extra_body gating

---------

Co-authored-by: ygd58 <buraysandro9@gmail.com>
2026-03-15 20:42:07 -07:00
Teknium c564e1c3dc feat(tools): centralize tool emoji metadata in registry + skin integration (#1484)
feat(tools): centralize tool emoji metadata in registry + skin integration
2026-03-15 20:35:24 -07:00
teknium1 210d5ade1e feat(tools): centralize tool emoji metadata in registry + skin integration
- Add 'emoji' field to ToolEntry and 'get_emoji()' to ToolRegistry
- Add emoji= to all 50+ registry.register() calls across tool files
- Add get_tool_emoji() helper in agent/display.py with 3-tier resolution:
  skin override → registry default → hardcoded fallback
- Replace hardcoded emoji maps in run_agent.py, delegate_tool.py, and
  gateway/run.py with centralized get_tool_emoji() calls
- Add 'tool_emojis' field to SkinConfig so skins can override per-tool
  emojis (e.g. ares skin could use swords instead of wrenches)
- Add 11 tests (5 registry emoji, 6 display/skin integration)
- Update AGENTS.md skin docs table

Based on the approach from PR #1061 by ForgingAlex (emoji centralization
in registry). This salvage fixes several issues from the original:
- Does NOT split the cronjob tool (which would crash on missing schemas)
- Does NOT change image_generate toolset/requires_env/is_async
- Does NOT delete existing tests
- Completes the centralization (gateway/run.py was missed)
- Hooks into the skin system for full customizability
2026-03-15 20:21:21 -07:00
teknium1 33ebedc76d feat: enable persistent shell by default for SSH, add config option
SSH persistent shell now defaults to true — non-local backends benefit
most from state persistence across execute() calls. Local backend
remains opt-in via TERMINAL_LOCAL_PERSISTENT env var.

New config.yaml option: terminal.persistent_shell (default: true)
Controls the default for non-local backends. Users can disable with:
  hermes config set terminal.persistent_shell false

Precedence: per-backend env var > TERMINAL_PERSISTENT_SHELL > default.

Wired through cli.py, gateway/run.py, and hermes_cli/config.py so the
config.yaml value reaches terminal_tool via env var bridge.
2026-03-15 20:17:13 -07:00
teknium1 5b80654198 feat(tools): add persistent shell mode to local and SSH backends
Cherry-picked from PR #1067 by alt-glitch.
Adds PersistentShellMixin with file-based IPC protocol for long-lived
bash shells. LocalEnvironment and SSHEnvironment gain persistent=True
option. Controlled via TERMINAL_LOCAL_PERSISTENT / TERMINAL_SSH_PERSISTENT
env vars. Fixes latent stderr pipe buffer deadlock.

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
2026-03-15 20:13:02 -07:00
Teknium 25e53f3c1a fix(custom-endpoint): verify /models and suggest working /v1 base URL (#1480) 2026-03-15 20:09:50 -07:00
Teknium 103f7b1ebc fix: verbose mode shows full untruncated output
* fix(cli): silence tirith prefetch install warnings at startup

* fix: verbose mode now shows full untruncated tool args, results, content, and think blocks

When tool progress is set to 'verbose' (via /verbose or config), the display
was still truncating tool arguments to 100 chars, tool results to 100-200 chars,
assistant content to 100 chars, and think blocks to 5 lines. This defeated the
purpose of verbose mode.

Changes:
- Tool args: show full JSON args (not truncated to log_prefix_chars)
- Tool results: show full result content in both display and debug logs
- Assistant content: show full content during tool-call loops
- Think blocks: show full reasoning text (not truncated to 5 lines/100 chars)
- Auto-enable reasoning display when verbose mode is active
- Fix initial agent creation to respect verbose config (was always quiet_mode=True)
- Updated verbose label to mention think blocks
2026-03-15 20:03:37 -07:00
Teknium a56937735e fix(telegram): escape chunk indicators in MarkdownV2 (#1478) 2026-03-15 19:27:15 -07:00
Teknium 7148534401 fix(gateway): make /status report live state and tokens (#1476) 2026-03-15 19:18:58 -07:00
Teknium 4e91b0240b fix(honcho): correct seed_ai_identity to use session.add_messages() (#1475)
The seed_ai_identity method was calling assistant_peer.add_message() which
doesn't exist on the Honcho SDK's Peer class. Fixed to use the correct
pattern: session.add_messages([peer.message(content)]), matching the
existing message sync code at line 294.

Discovered and fixed by Yuqi (Hermes Agent), Angello's AI companion.

Co-authored-by: Angello Picasso <angello.picasso@devsu.com>
2026-03-15 19:07:57 -07:00
Teknium 5e92a4ce5a fix: auto-reload MCP tools when mcp_servers config changes without restart (#1474)
Fixes #1036

After adding an MCP server to config.yaml, users had to restart Hermes
before the new tools became visible — even though /reload-mcp existed.

Add _check_config_mcp_changes() called from process_loop every 5s:
- stat() config.yaml for mtime changes (fast path, no YAML parse)
- On mtime change, parse and compare mcp_servers section
- If mcp_servers changed, auto-trigger _reload_mcp() and notify user
- Skip check while agent is running to avoid interrupting tool calls
- Throttled to CONFIG_WATCH_INTERVAL=5s to avoid busy-polling

/reload-mcp still works for manual force-reload.

Tests: 6 new tests in TestMCPConfigWatch, all passed

Co-authored-by: teyrebaz33 <hakanerten02@hotmail.com>
2026-03-15 19:03:34 -07:00
Teknium 471c663fdf fix(cli): silence tirith prefetch install warnings at startup (#1452) 2026-03-15 18:07:03 -07:00
Teknium 64d333204b Merge pull request #1242 from NousResearch/fix/file-tool-log-noise
fix: reduce file tool log noise
2026-03-15 11:11:18 -07:00
Teknium c44af43840 Merge pull request #1401 from NousResearch/hermes/hermes-eca4a640
test: protect atomic temp cleanup on interrupts
2026-03-15 11:10:41 -07:00
teknium1 b117bbc125 test: cover atomic temp cleanup on interrupts
- add regression coverage for BaseException cleanup in atomic_json_write
- add dedicated atomic_yaml_write tests, including interrupt cleanup
- document why BaseException is intentional in both helpers
2026-03-14 22:31:51 -07:00
teknium1 b59da08730 fix: reduce file tool log noise
- treat git diff --cached --quiet rc=1 as an expected checkpoint state
  instead of logging it as an error
- downgrade expected write PermissionError/EROFS/EACCES failures out of
  error logging while keeping unexpected exceptions at error level
- add regression tests for both logging behaviors
2026-03-13 22:14:00 -07:00
69 changed files with 2089 additions and 320 deletions
+1
View File
@@ -235,6 +235,7 @@ hermes_cli/skin_engine.py # SkinConfig dataclass, built-in skins, YAML loader
| Spinner verbs | `spinner.thinking_verbs` | `display.py` |
| Spinner wings (optional) | `spinner.wings` | `display.py` |
| Tool output prefix | `tool_prefix` | `display.py` |
| Per-tool emojis | `tool_emojis` | `display.py``get_tool_emoji()` |
| Agent name | `branding.agent_name` | `banner.py`, `cli.py` |
| Welcome message | `branding.welcome` | `cli.py` |
| Response box label | `branding.response_label` | `cli.py` |
+26
View File
@@ -59,6 +59,32 @@ def get_skin_tool_prefix() -> str:
return ""
def get_tool_emoji(tool_name: str, default: str = "") -> str:
"""Get the display emoji for a tool.
Resolution order:
1. Active skin's ``tool_emojis`` overrides (if a skin is loaded)
2. Tool registry's per-tool ``emoji`` field
3. *default* fallback
"""
# 1. Skin override
skin = _get_skin()
if skin and skin.tool_emojis:
override = skin.tool_emojis.get(tool_name)
if override:
return override
# 2. Registry default
try:
from tools.registry import registry
emoji = registry.get_emoji(tool_name, default="")
if emoji:
return emoji
except Exception:
pass
# 3. Hardcoded fallback
return default
# =========================================================================
# Tool preview (one-line summary of a tool call's primary argument)
# =========================================================================
+6
View File
@@ -333,6 +333,12 @@ session_reset:
idle_minutes: 1440 # Inactivity timeout in minutes (default: 1440 = 24 hours)
at_hour: 4 # Daily reset hour, 0-23 local time (default: 4 AM)
# When true, group/channel chats use one session per participant when the platform
# provides a user ID. This is the secure default and prevents users in the same
# room from sharing context, interrupts, and token costs. Set false only if you
# explicitly want one shared "room brain" per group/channel.
group_sessions_per_user: true
# =============================================================================
# Skills Configuration
# =============================================================================
+80 -10
View File
@@ -328,6 +328,8 @@ def load_cli_config() -> Dict[str, Any]:
"container_persistent": "TERMINAL_CONTAINER_PERSISTENT",
"docker_volumes": "TERMINAL_DOCKER_VOLUMES",
"sandbox_dir": "TERMINAL_SANDBOX_DIR",
# Persistent shell (non-local backends)
"persistent_shell": "TERMINAL_PERSISTENT_SHELL",
# Sudo support (works with all backends)
"sudo_password": "SUDO_PASSWORD",
}
@@ -1414,7 +1416,7 @@ class HermesCLI:
max_iterations=self.max_turns,
enabled_toolsets=self.enabled_toolsets,
verbose_logging=self.verbose,
quiet_mode=True,
quiet_mode=not self.verbose,
ephemeral_system_prompt=self.system_prompt if self.system_prompt else None,
prefill_messages=self.prefill_messages or None,
reasoning_config=self.reasoning_config,
@@ -1428,7 +1430,7 @@ class HermesCLI:
platform="cli",
session_db=self._session_db,
clarify_callback=self._clarify_callback,
reasoning_callback=self._on_reasoning if self.show_reasoning else None,
reasoning_callback=self._on_reasoning if (self.show_reasoning or self.verbose) else None,
honcho_session_key=None, # resolved by run_agent via config sessions map / title
fallback_model=self._fallback_model,
thinking_callback=self._on_thinking,
@@ -3285,12 +3287,17 @@ class HermesCLI:
if self.agent:
self.agent.verbose_logging = self.verbose
self.agent.quiet_mode = not self.verbose
# Auto-enable reasoning display in verbose mode
if self.verbose:
self.agent.reasoning_callback = self._on_reasoning
elif not self.show_reasoning:
self.agent.reasoning_callback = None
labels = {
"off": "[dim]Tool progress: OFF[/] — silent mode, just the final response.",
"new": "[yellow]Tool progress: NEW[/] — show each new tool (skip repeats).",
"all": "[green]Tool progress: ALL[/] — show every tool call.",
"verbose": "[bold green]Tool progress: VERBOSE[/] — full args, results, and debug logs.",
"verbose": "[bold green]Tool progress: VERBOSE[/] — full args, results, think blocks, and debug logs.",
}
self.console.print(labels.get(self.tool_progress_mode, ""))
@@ -3357,13 +3364,17 @@ class HermesCLI:
def _on_reasoning(self, reasoning_text: str):
"""Callback for intermediate reasoning display during tool-call loops."""
lines = reasoning_text.strip().splitlines()
if len(lines) > 5:
preview = "\n".join(lines[:5])
preview += f"\n ... ({len(lines) - 5} more lines)"
if self.verbose:
# Verbose mode: show full reasoning text
_cprint(f" {_DIM}[thinking] {reasoning_text.strip()}{_RST}")
else:
preview = reasoning_text.strip()
_cprint(f" {_DIM}[thinking] {preview}{_RST}")
lines = reasoning_text.strip().splitlines()
if len(lines) > 5:
preview = "\n".join(lines[:5])
preview += f"\n ... ({len(lines) - 5} more lines)"
else:
preview = reasoning_text.strip()
_cprint(f" {_DIM}[thinking] {preview}{_RST}")
def _manual_compress(self):
"""Manually trigger context compression on the current conversation."""
@@ -3484,6 +3495,56 @@ class HermesCLI:
except Exception as e:
print(f" Error generating insights: {e}")
def _check_config_mcp_changes(self) -> None:
"""Detect mcp_servers changes in config.yaml and auto-reload MCP connections.
Called from process_loop every CONFIG_WATCH_INTERVAL seconds.
Compares config.yaml mtime + mcp_servers section against the last
known state. When a change is detected, triggers _reload_mcp() and
informs the user so they know the tool list has been refreshed.
"""
import time
import yaml as _yaml
CONFIG_WATCH_INTERVAL = 5.0 # seconds between config.yaml stat() calls
now = time.monotonic()
if now - self._last_config_check < CONFIG_WATCH_INTERVAL:
return
self._last_config_check = now
from hermes_cli.config import get_config_path as _get_config_path
cfg_path = _get_config_path()
if not cfg_path.exists():
return
try:
mtime = cfg_path.stat().st_mtime
except OSError:
return
if mtime == self._config_mtime:
return # File unchanged — fast path
# File changed — check whether mcp_servers section changed
self._config_mtime = mtime
try:
with open(cfg_path, encoding="utf-8") as f:
new_cfg = _yaml.safe_load(f) or {}
except Exception:
return
new_mcp = new_cfg.get("mcp_servers") or {}
if new_mcp == self._config_mcp_servers:
return # mcp_servers unchanged (some other section was edited)
self._config_mcp_servers = new_mcp
# Notify user and reload
print()
print("🔄 MCP server config changed — reloading connections...")
with self._busy_command(self._slow_command_status("/reload-mcp")):
self._reload_mcp()
def _reload_mcp(self):
"""Reload MCP servers: disconnect all, re-read config.yaml, reconnect.
@@ -4749,6 +4810,12 @@ class HermesCLI:
self._interrupt_queue = queue.Queue() # For messages typed while agent is running
self._should_exit = False
self._last_ctrl_c_time = 0 # Track double Ctrl+C for force exit
# Config file watcher — detect mcp_servers changes and auto-reload
from hermes_cli.config import get_config_path as _get_config_path
_cfg_path = _get_config_path()
self._config_mtime: float = _cfg_path.stat().st_mtime if _cfg_path.exists() else 0.0
self._config_mcp_servers: dict = self.config.get("mcp_servers") or {}
self._last_config_check: float = 0.0 # monotonic time of last check
# Clarify tool state: interactive question/answer with the user.
# When the agent calls the clarify tool, _clarify_state is set and
@@ -4797,7 +4864,7 @@ class HermesCLI:
# Ensure tirith security scanner is available (downloads if needed)
try:
from tools.tirith_security import ensure_installed
ensure_installed()
ensure_installed(log_failures=False)
except Exception:
pass # Non-fatal — fail-open at scan time if unavailable
@@ -5682,6 +5749,9 @@ class HermesCLI:
try:
user_input = self._pending_input.get(timeout=0.1)
except queue.Empty:
# Periodic config watcher — auto-reload MCP on mcp_servers change
if not self._agent_running:
self._check_config_mcp_changes()
continue
if not user_input:
+16 -1
View File
@@ -174,7 +174,10 @@ class GatewayConfig:
# STT settings
stt_enabled: bool = True # Whether to auto-transcribe inbound voice messages
# Session isolation in shared chats
group_sessions_per_user: bool = True # Isolate group/channel sessions per participant when user IDs are available
def get_connected_platforms(self) -> List[Platform]:
"""Return list of platforms that are enabled and configured."""
connected = []
@@ -239,6 +242,7 @@ class GatewayConfig:
"sessions_dir": str(self.sessions_dir),
"always_log_local": self.always_log_local,
"stt_enabled": self.stt_enabled,
"group_sessions_per_user": self.group_sessions_per_user,
}
@classmethod
@@ -279,6 +283,8 @@ class GatewayConfig:
if stt_enabled is None:
stt_enabled = data.get("stt", {}).get("enabled") if isinstance(data.get("stt"), dict) else None
group_sessions_per_user = data.get("group_sessions_per_user")
return cls(
platforms=platforms,
default_reset_policy=default_policy,
@@ -289,6 +295,7 @@ class GatewayConfig:
sessions_dir=sessions_dir,
always_log_local=data.get("always_log_local", True),
stt_enabled=_coerce_bool(stt_enabled, True),
group_sessions_per_user=_coerce_bool(group_sessions_per_user, True),
)
@@ -344,6 +351,14 @@ def load_gateway_config() -> GatewayConfig:
if isinstance(stt_cfg, dict) and "enabled" in stt_cfg:
config.stt_enabled = _coerce_bool(stt_cfg.get("enabled"), True)
# Bridge group session isolation from config.yaml into gateway runtime.
# Secure default is per-user isolation in shared chats.
if "group_sessions_per_user" in yaml_cfg:
config.group_sessions_per_user = _coerce_bool(
yaml_cfg.get("group_sessions_per_user"),
True,
)
# Bridge discord settings from config.yaml to env vars
# (env vars take precedence — only set if not already defined)
discord_cfg = yaml_cfg.get("discord", {})
+4 -1
View File
@@ -752,7 +752,10 @@ class BasePlatformAdapter(ABC):
if not self._message_handler:
return
session_key = build_session_key(event.source)
session_key = build_session_key(
event.source,
group_sessions_per_user=self.config.extra.get("group_sessions_per_user", True),
)
# Check if there's already an active handler for this session
if session_key in self._active_sessions:
+12 -1
View File
@@ -322,6 +322,14 @@ class TelegramAdapter(BasePlatformAdapter):
# Format and split message if needed
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
if len(chunks) > 1:
# truncate_message appends a raw " (1/2)" suffix. Escape the
# MarkdownV2-special parentheses so Telegram doesn't reject the
# chunk and fall back to plain text.
chunks = [
re.sub(r" \((\d+)/(\d+)\)$", r" \\(\1/\2\\)", chunk)
for chunk in chunks
]
message_ids = []
thread_id = metadata.get("thread_id") if metadata else None
@@ -821,7 +829,10 @@ class TelegramAdapter(BasePlatformAdapter):
def _photo_batch_key(self, event: MessageEvent, msg: Message) -> str:
"""Return a batching key for Telegram photos/albums."""
from gateway.session import build_session_key
session_key = build_session_key(event.source)
session_key = build_session_key(
event.source,
group_sessions_per_user=self.config.extra.get("group_sessions_per_user", True),
)
media_group_id = getattr(msg, "media_group_id", None)
if media_group_id:
return f"{session_key}:album:{media_group_id}"
+48 -47
View File
@@ -77,6 +77,7 @@ if _config_path.exists():
"container_persistent": "TERMINAL_CONTAINER_PERSISTENT",
"docker_volumes": "TERMINAL_DOCKER_VOLUMES",
"sandbox_dir": "TERMINAL_SANDBOX_DIR",
"persistent_shell": "TERMINAL_PERSISTENT_SHELL",
}
for _cfg_key, _env_var in _terminal_env_map.items():
if _cfg_key in _terminal_cfg:
@@ -305,7 +306,7 @@ class GatewayRunner:
# Ensure tirith security scanner is available (downloads if needed)
try:
from tools.tirith_security import ensure_installed
ensure_installed()
ensure_installed(log_failures=False)
except Exception:
pass # Non-fatal — fail-open at scan time if unavailable
@@ -513,6 +514,21 @@ class GatewayRunner:
def exit_reason(self) -> Optional[str]:
return self._exit_reason
def _session_key_for_source(self, source: SessionSource) -> str:
"""Resolve the current session key for a source, honoring gateway config when available."""
if hasattr(self, "session_store") and self.session_store is not None:
try:
session_key = self.session_store._generate_session_key(source)
if isinstance(session_key, str) and session_key:
return session_key
except Exception:
pass
config = getattr(self, "config", None)
return build_session_key(
source,
group_sessions_per_user=getattr(config, "group_sessions_per_user", True),
)
async def _handle_adapter_fatal_error(self, adapter: BasePlatformAdapter) -> None:
"""React to a non-retryable adapter failure after startup."""
logger.error(
@@ -941,6 +957,12 @@ class GatewayRunner:
config: Any
) -> Optional[BasePlatformAdapter]:
"""Create the appropriate adapter for a platform."""
if hasattr(config, "extra") and isinstance(config.extra, dict):
config.extra.setdefault(
"group_sessions_per_user",
self.config.group_sessions_per_user,
)
if platform == Platform.TELEGRAM:
from gateway.platforms.telegram import TelegramAdapter, check_telegram_requirements
if not check_telegram_requirements():
@@ -1112,8 +1134,11 @@ class GatewayRunner:
# Special case: Telegram/photo bursts often arrive as multiple near-
# simultaneous updates. Do NOT interrupt for photo-only follow-ups here;
# let the adapter-level batching/queueing logic absorb them.
_quick_key = build_session_key(source)
_quick_key = self._session_key_for_source(source)
if _quick_key in self._running_agents:
if event.get_command() == "status":
return await self._handle_status_command(event)
if event.message_type == MessageType.PHOTO:
logger.debug("PRIORITY photo follow-up for session %s — queueing without interrupt", _quick_key[:20])
adapter = self.adapters.get(source.platform)
@@ -1298,7 +1323,7 @@ class GatewayRunner:
logger.debug("Skill command check failed (non-fatal): %s", e)
# Check for pending exec approval responses
session_key_preview = build_session_key(source)
session_key_preview = self._session_key_for_source(source)
if session_key_preview in self._pending_approvals:
user_text = event.text.strip().lower()
if user_text in ("yes", "y", "approve", "ok", "go", "do it"):
@@ -1822,6 +1847,8 @@ class GatewayRunner:
# Update session with actual prompt token count and model from the agent
self.session_store.update_session(
session_entry.session_key,
input_tokens=agent_result.get("input_tokens", 0),
output_tokens=agent_result.get("output_tokens", 0),
last_prompt_tokens=agent_result.get("last_prompt_tokens", 0),
model=agent_result.get("model"),
)
@@ -1848,7 +1875,7 @@ class GatewayRunner:
source = event.source
# Get existing session key
session_key = self.session_store._generate_session_key(source)
session_key = self._session_key_for_source(source)
# Flush memories in the background (fire-and-forget) so the user
# gets the "Session reset!" response immediately.
@@ -3078,7 +3105,7 @@ class GatewayRunner:
return "Session database not available."
source = event.source
session_key = build_session_key(source)
session_key = self._session_key_for_source(source)
name = event.get_command_args().strip()
if not name:
@@ -3150,7 +3177,7 @@ class GatewayRunner:
async def _handle_usage_command(self, event: MessageEvent) -> str:
"""Handle /usage command -- show token usage for the session's last agent run."""
source = event.source
session_key = build_session_key(source)
session_key = self._session_key_for_source(source)
agent = self._running_agents.get(session_key)
if agent and hasattr(agent, "session_total_tokens") and agent.session_api_calls > 0:
@@ -3629,7 +3656,10 @@ class GatewayRunner:
)
else:
error = result.get("error", "unknown error")
if "No STT provider" in error or "not set" in error:
if (
"No STT provider" in error
or error.startswith("Neither VOICE_TOOLS_OPENAI_KEY nor OPENAI_API_KEY is set")
):
enriched_parts.append(
"[The user sent a voice message but I can't listen "
"to it right now~ No STT provider is configured "
@@ -3851,45 +3881,8 @@ class GatewayRunner:
last_tool[0] = tool_name
# Build progress message with primary argument preview
tool_emojis = {
"terminal": "💻",
"process": "⚙️",
"web_search": "🔍",
"web_extract": "📄",
"read_file": "📖",
"write_file": "✍️",
"patch": "🔧",
"search": "🔎",
"search_files": "🔎",
"list_directory": "📂",
"image_generate": "🎨",
"text_to_speech": "🔊",
"browser_navigate": "🌐",
"browser_click": "👆",
"browser_type": "⌨️",
"browser_snapshot": "📸",
"browser_scroll": "📜",
"browser_back": "◀️",
"browser_press": "⌨️",
"browser_close": "🚪",
"browser_get_images": "🖼️",
"browser_vision": "👁️",
"moa_query": "🧠",
"mixture_of_agents": "🧠",
"vision_analyze": "👁️",
"skill_view": "📚",
"skills_list": "📋",
"todo": "📋",
"memory": "🧠",
"session_search": "🔍",
"send_message": "📨",
"cronjob": "",
"execute_code": "🐍",
"delegate_task": "🔀",
"clarify": "",
"skill_manage": "📝",
}
emoji = tool_emojis.get(tool_name, "⚙️")
from agent.display import get_tool_emoji
emoji = get_tool_emoji(tool_name, default="⚙️")
# Verbose mode: show detailed arguments
if progress_mode == "verbose" and args:
@@ -4171,11 +4164,15 @@ class GatewayRunner:
# Return final response, or a message if something went wrong
final_response = result.get("final_response")
# Extract last actual prompt token count from the agent's compressor
# Extract actual token counts from the agent instance used for this run
_last_prompt_toks = 0
_input_toks = 0
_output_toks = 0
_agent = agent_holder[0]
if _agent and hasattr(_agent, "context_compressor"):
_last_prompt_toks = getattr(_agent.context_compressor, "last_prompt_tokens", 0)
_input_toks = getattr(_agent, "session_prompt_tokens", 0)
_output_toks = getattr(_agent, "session_completion_tokens", 0)
_resolved_model = getattr(_agent, "model", None) if _agent else None
if not final_response:
@@ -4187,6 +4184,8 @@ class GatewayRunner:
"tools": tools_holder[0] or [],
"history_offset": len(agent_history),
"last_prompt_tokens": _last_prompt_toks,
"input_tokens": _input_toks,
"output_tokens": _output_toks,
"model": _resolved_model,
}
@@ -4250,6 +4249,8 @@ class GatewayRunner:
"tools": tools_holder[0] or [],
"history_offset": len(agent_history),
"last_prompt_tokens": _last_prompt_toks,
"input_tokens": _input_toks,
"output_tokens": _output_toks,
"model": _resolved_model,
"session_id": effective_session_id,
}
+19 -7
View File
@@ -315,7 +315,7 @@ class SessionEntry:
)
def build_session_key(source: SessionSource) -> str:
def build_session_key(source: SessionSource, group_sessions_per_user: bool = True) -> str:
"""Build a deterministic session key from a message source.
This is the single source of truth for session key construction.
@@ -328,7 +328,11 @@ def build_session_key(source: SessionSource) -> str:
Group/channel rules:
- chat_id identifies the parent group/channel.
- user_id/user_id_alt isolates participants within that parent chat when available when
``group_sessions_per_user`` is enabled.
- thread_id differentiates threads within that parent chat.
- Without participant identifiers, or when isolation is disabled, messages fall back to one
shared session per chat.
- Without identifiers, messages fall back to one session per platform/chat_type.
"""
platform = source.platform.value
@@ -340,13 +344,18 @@ def build_session_key(source: SessionSource) -> str:
if source.thread_id:
return f"agent:main:{platform}:dm:{source.thread_id}"
return f"agent:main:{platform}:dm"
participant_id = source.user_id_alt or source.user_id
key_parts = ["agent:main", platform, source.chat_type]
if source.chat_id:
if source.thread_id:
return f"agent:main:{platform}:{source.chat_type}:{source.chat_id}:{source.thread_id}"
return f"agent:main:{platform}:{source.chat_type}:{source.chat_id}"
key_parts.append(source.chat_id)
if source.thread_id:
return f"agent:main:{platform}:{source.chat_type}:{source.thread_id}"
return f"agent:main:{platform}:{source.chat_type}"
key_parts.append(source.thread_id)
if group_sessions_per_user and participant_id:
key_parts.append(str(participant_id))
return ":".join(key_parts)
class SessionStore:
@@ -425,7 +434,10 @@ class SessionStore:
def _generate_session_key(self, source: SessionSource) -> str:
"""Generate a session key from a source."""
return build_session_key(source)
return build_session_key(
source,
group_sessions_per_user=getattr(self.config, "group_sessions_per_user", True),
)
def _is_session_expired(self, entry: SessionEntry) -> bool:
"""Check if a session has expired based on its reset policy.
+6
View File
@@ -118,6 +118,11 @@ DEFAULT_CONFIG = {
# Each entry is "host_path:container_path" (standard Docker -v syntax).
# Example: ["/home/user/projects:/workspace/projects", "/data:/data"]
"docker_volumes": [],
# Persistent shell — keep a long-lived bash shell across execute() calls
# so cwd/env vars/shell variables survive between commands.
# Enabled by default for non-local backends (SSH); local is always opt-in
# via TERMINAL_LOCAL_PERSISTENT env var.
"persistent_shell": True,
},
"browser": {
@@ -1391,6 +1396,7 @@ def set_config_value(key: str, value: str):
"terminal.cwd": "TERMINAL_CWD",
"terminal.timeout": "TERMINAL_TIMEOUT",
"terminal.sandbox_dir": "TERMINAL_SANDBOX_DIR",
"terminal.persistent_shell": "TERMINAL_PERSISTENT_SHELL",
}
if key in _config_to_env_sync:
save_env_value(_config_to_env_sync[key], str(value))
+25 -1
View File
@@ -1112,8 +1112,32 @@ def _model_flow_custom(config):
effective_key = api_key or current_key
from hermes_cli.models import probe_api_models
probe = probe_api_models(effective_key, effective_url)
if probe.get("used_fallback") and probe.get("resolved_base_url"):
print(
f"Warning: endpoint verification worked at {probe['resolved_base_url']}/models, "
f"not the exact URL you entered. Saving the working base URL instead."
)
effective_url = probe["resolved_base_url"]
if base_url:
base_url = effective_url
elif probe.get("models") is not None:
print(
f"Verified endpoint via {probe.get('probed_url')} "
f"({len(probe.get('models') or [])} model(s) visible)"
)
else:
print(
f"Warning: could not verify this endpoint via {probe.get('probed_url')}. "
f"Hermes will still save it."
)
if probe.get("suggested_base_url"):
print(f" If this server expects /v1, try base URL: {probe['suggested_base_url']}")
if base_url:
save_env_value("OPENAI_BASE_URL", base_url)
save_env_value("OPENAI_BASE_URL", effective_url)
if api_key:
save_env_value("OPENAI_API_KEY", api_key)
+99 -18
View File
@@ -308,6 +308,62 @@ def _fetch_anthropic_models(timeout: float = 5.0) -> Optional[list[str]]:
return None
def probe_api_models(
api_key: Optional[str],
base_url: Optional[str],
timeout: float = 5.0,
) -> dict[str, Any]:
"""Probe an OpenAI-compatible ``/models`` endpoint with light URL heuristics."""
normalized = (base_url or "").strip().rstrip("/")
if not normalized:
return {
"models": None,
"probed_url": None,
"resolved_base_url": "",
"suggested_base_url": None,
"used_fallback": False,
}
if normalized.endswith("/v1"):
alternate_base = normalized[:-3].rstrip("/")
else:
alternate_base = normalized + "/v1"
candidates: list[tuple[str, bool]] = [(normalized, False)]
if alternate_base and alternate_base != normalized:
candidates.append((alternate_base, True))
tried: list[str] = []
headers: dict[str, str] = {}
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
for candidate_base, is_fallback in candidates:
url = candidate_base.rstrip("/") + "/models"
tried.append(url)
req = urllib.request.Request(url, headers=headers)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
data = json.loads(resp.read().decode())
return {
"models": [m.get("id", "") for m in data.get("data", [])],
"probed_url": url,
"resolved_base_url": candidate_base.rstrip("/"),
"suggested_base_url": alternate_base if alternate_base != candidate_base else normalized,
"used_fallback": is_fallback,
}
except Exception:
continue
return {
"models": None,
"probed_url": tried[-1] if tried else normalized.rstrip("/") + "/models",
"resolved_base_url": normalized,
"suggested_base_url": alternate_base if alternate_base != normalized else None,
"used_fallback": False,
}
def fetch_api_models(
api_key: Optional[str],
base_url: Optional[str],
@@ -318,22 +374,7 @@ def fetch_api_models(
Returns a list of model ID strings, or ``None`` if the endpoint could not
be reached (network error, timeout, auth failure, etc.).
"""
if not base_url:
return None
url = base_url.rstrip("/") + "/models"
headers: dict[str, str] = {}
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
req = urllib.request.Request(url, headers=headers)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
data = json.loads(resp.read().decode())
# Standard OpenAI format: {"data": [{"id": "model-name", ...}, ...]}
return [m.get("id", "") for m in data.get("data", [])]
except Exception:
return None
return probe_api_models(api_key, base_url, timeout=timeout).get("models")
def validate_requested_model(
@@ -376,13 +417,53 @@ def validate_requested_model(
"message": "Model names cannot contain spaces.",
}
# Custom endpoints can serve any model — skip validation
if normalized == "custom":
probe = probe_api_models(api_key, base_url)
api_models = probe.get("models")
if api_models is not None:
if requested in set(api_models):
return {
"accepted": True,
"persist": True,
"recognized": True,
"message": None,
}
suggestions = get_close_matches(requested, api_models, n=3, cutoff=0.5)
suggestion_text = ""
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
message = (
f"Note: `{requested}` was not found in this custom endpoint's model listing "
f"({probe.get('probed_url')}). It may still work if the server supports hidden or aliased models."
f"{suggestion_text}"
)
if probe.get("used_fallback"):
message += (
f"\n Endpoint verification succeeded after trying `{probe.get('resolved_base_url')}`. "
f"Consider saving that as your base URL."
)
return {
"accepted": True,
"persist": True,
"recognized": False,
"message": message,
}
message = (
f"Note: could not reach this custom endpoint's model listing at `{probe.get('probed_url')}`. "
f"Hermes will still save `{requested}`, but the endpoint should expose `/models` for verification."
)
if probe.get("suggested_base_url"):
message += f"\n If this server expects `/v1`, try base URL: `{probe.get('suggested_base_url')}`"
return {
"accepted": True,
"persist": True,
"recognized": False,
"message": None,
"message": message,
}
# Probe the live API to check if the model actually exists
+103 -116
View File
@@ -227,54 +227,86 @@ def prompt(question: str, default: str = None, password: bool = False) -> str:
sys.exit(1)
def _curses_prompt_choice(question: str, choices: list, default: int = 0) -> int:
"""Single-select menu using curses to avoid simple_term_menu rendering bugs."""
try:
import curses
result_holder = [default]
def _curses_menu(stdscr):
curses.curs_set(0)
if curses.has_colors():
curses.start_color()
curses.use_default_colors()
curses.init_pair(1, curses.COLOR_GREEN, -1)
curses.init_pair(2, curses.COLOR_YELLOW, -1)
cursor = default
while True:
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
try:
stdscr.addnstr(
0,
0,
question,
max_x - 1,
curses.A_BOLD | (curses.color_pair(2) if curses.has_colors() else 0),
)
except curses.error:
pass
for i, choice in enumerate(choices):
y = i + 2
if y >= max_y - 1:
break
arrow = "" if i == cursor else " "
line = f" {arrow} {choice}"
attr = curses.A_NORMAL
if i == cursor:
attr = curses.A_BOLD
if curses.has_colors():
attr |= curses.color_pair(1)
try:
stdscr.addnstr(y, 0, line, max_x - 1, attr)
except curses.error:
pass
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
cursor = (cursor - 1) % len(choices)
elif key in (curses.KEY_DOWN, ord("j")):
cursor = (cursor + 1) % len(choices)
elif key in (curses.KEY_ENTER, 10, 13):
result_holder[0] = cursor
return
elif key in (27, ord("q")):
return
curses.wrapper(_curses_menu)
return result_holder[0]
except Exception:
return -1
def prompt_choice(question: str, choices: list, default: int = 0) -> int:
"""Prompt for a choice from a list with arrow key navigation.
Escape keeps the current default (skips the question).
Ctrl+C exits the wizard.
"""
print(color(question, Colors.YELLOW))
# Try to use interactive menu if available
try:
from simple_term_menu import TerminalMenu
import re
# Strip emoji characters — simple_term_menu miscalculates visual
# width of emojis, causing duplicated/garbled lines on redraw.
_emoji_re = re.compile(
"[\U0001f300-\U0001f9ff\U00002600-\U000027bf\U0000fe00-\U0000fe0f"
"\U0001fa00-\U0001fa6f\U0001fa70-\U0001faff\u200d]+",
flags=re.UNICODE,
)
menu_choices = [f" {_emoji_re.sub('', choice).strip()}" for choice in choices]
print_info(" ↑/↓ Navigate Enter Select Esc Skip Ctrl+C Exit")
terminal_menu = TerminalMenu(
menu_choices,
cursor_index=default,
menu_cursor="",
menu_cursor_style=("fg_green", "bold"),
menu_highlight_style=("fg_green",),
cycle_cursor=True,
clear_screen=False,
)
idx = terminal_menu.show()
if idx is None: # User pressed Escape — keep current value
print_info(f" Skipped (keeping current)")
idx = _curses_prompt_choice(question, choices, default)
if idx >= 0:
if idx == default:
print_info(" Skipped (keeping current)")
print()
return default
print() # Add newline after selection
print()
return idx
except (ImportError, NotImplementedError):
pass
except Exception as e:
print(f" (Interactive menu unavailable: {e})")
# Fallback to number-based selection (simple_term_menu doesn't support Windows)
print(color(question, Colors.YELLOW))
for i, choice in enumerate(choices):
marker = "" if i == default else ""
if i == default:
@@ -344,84 +376,15 @@ def prompt_checklist(title: str, items: list, pre_selected: list = None) -> list
if pre_selected is None:
pre_selected = []
print(color(title, Colors.YELLOW))
print_info(" SPACE Toggle ENTER Confirm ESC Skip Ctrl+C Exit")
print()
from hermes_cli.curses_ui import curses_checklist
try:
from simple_term_menu import TerminalMenu
import re
# Strip emoji characters from menu labels — simple_term_menu miscalculates
# visual width of emojis on macOS, causing duplicated/garbled lines.
_emoji_re = re.compile(
"[\U0001f300-\U0001f9ff\U00002600-\U000027bf\U0000fe00-\U0000fe0f"
"\U0001fa00-\U0001fa6f\U0001fa70-\U0001faff\u200d]+",
flags=re.UNICODE,
)
menu_items = [f" {_emoji_re.sub('', item).strip()}" for item in items]
# Map pre-selected indices to the actual menu entry strings
preselected = [menu_items[i] for i in pre_selected if i < len(menu_items)]
terminal_menu = TerminalMenu(
menu_items,
multi_select=True,
show_multi_select_hint=False,
multi_select_cursor="[✓] ",
multi_select_select_on_accept=False,
multi_select_empty_ok=True,
preselected_entries=preselected if preselected else None,
menu_cursor="",
menu_cursor_style=("fg_green", "bold"),
menu_highlight_style=("fg_green",),
cycle_cursor=True,
clear_screen=False,
)
terminal_menu.show()
if terminal_menu.chosen_menu_entries is None:
print_info(" Skipped (keeping current)")
return list(pre_selected)
selected = list(terminal_menu.chosen_menu_indices or [])
return selected
except (ImportError, NotImplementedError):
# Fallback: numbered toggle interface (simple_term_menu doesn't support Windows)
selected = set(pre_selected)
while True:
for i, item in enumerate(items):
marker = color("[✓]", Colors.GREEN) if i in selected else "[ ]"
print(f" {marker} {i + 1}. {item}")
print()
try:
value = input(
color(" Toggle # (or Enter to confirm): ", Colors.DIM)
).strip()
if not value:
break
idx = int(value) - 1
if 0 <= idx < len(items):
if idx in selected:
selected.discard(idx)
else:
selected.add(idx)
else:
print_error(f"Enter a number between 1 and {len(items)}")
except ValueError:
print_error("Enter a number")
except (KeyboardInterrupt, EOFError):
print()
return []
# Clear and redraw (simple approach)
print()
return sorted(selected)
chosen = curses_checklist(
title,
items,
set(pre_selected),
cancel_returns=set(pre_selected),
)
return sorted(chosen)
def _prompt_api_key(var: dict):
@@ -933,11 +896,35 @@ def setup_model_provider(config: dict):
base_url = prompt(
" API base URL (e.g., https://api.example.com/v1)", current_url
)
).strip()
api_key = prompt(" API key", password=True)
model_name = prompt(" Model name (e.g., gpt-4, claude-3-opus)", current_model)
if base_url:
from hermes_cli.models import probe_api_models
probe = probe_api_models(api_key, base_url)
if probe.get("used_fallback") and probe.get("resolved_base_url"):
print_warning(
f"Endpoint verification worked at {probe['resolved_base_url']}/models, "
f"not the exact URL you entered. Saving the working base URL instead."
)
base_url = probe["resolved_base_url"]
elif probe.get("models") is not None:
print_success(
f"Verified endpoint via {probe.get('probed_url')} "
f"({len(probe.get('models') or [])} model(s) visible)"
)
else:
print_warning(
f"Could not verify this endpoint via {probe.get('probed_url')}. "
f"Hermes will still save it."
)
if probe.get("suggested_base_url"):
print_info(
f" If this server expects /v1, try base URL: {probe['suggested_base_url']}"
)
save_env_value("OPENAI_BASE_URL", base_url)
if api_key:
save_env_value("OPENAI_API_KEY", api_key)
+8
View File
@@ -60,6 +60,12 @@ All fields are optional. Missing values inherit from the ``default`` skin.
# Tool prefix: character for tool output lines (default: ┊)
tool_prefix: ""
# Tool emojis: override the default emoji for any tool (used in spinners & progress)
tool_emojis:
terminal: "" # Override terminal tool emoji
web_search: "🔮" # Override web_search tool emoji
# Any tool not listed here uses its registry default
USAGE
=====
@@ -111,6 +117,7 @@ class SkinConfig:
spinner: Dict[str, Any] = field(default_factory=dict)
branding: Dict[str, str] = field(default_factory=dict)
tool_prefix: str = ""
tool_emojis: Dict[str, str] = field(default_factory=dict) # per-tool emoji overrides
banner_logo: str = "" # Rich-markup ASCII art logo (replaces HERMES_AGENT_LOGO)
banner_hero: str = "" # Rich-markup hero art (replaces HERMES_CADUCEUS)
@@ -541,6 +548,7 @@ def _build_skin_config(data: Dict[str, Any]) -> SkinConfig:
spinner=spinner,
branding=branding,
tool_prefix=data.get("tool_prefix", default.get("tool_prefix", "")),
tool_emojis=data.get("tool_emojis", {}),
banner_logo=data.get("banner_logo", ""),
banner_hero=data.get("banner_hero", ""),
)
+6 -1
View File
@@ -927,6 +927,11 @@ class HonchoSessionManager:
return False
assistant_peer = self._get_or_create_peer(session.assistant_peer_id)
honcho_session = self._sessions_cache.get(session.honcho_session_id)
if not honcho_session:
logger.warning("No Honcho session cached for '%s', skipping AI seed", session_key)
return False
try:
wrapped = (
f"<ai_identity_seed>\n"
@@ -935,7 +940,7 @@ class HonchoSessionManager:
f"{content.strip()}\n"
f"</ai_identity_seed>"
)
assistant_peer.add_message("assistant", wrapped)
honcho_session.add_messages([assistant_peer.message(wrapped)])
logger.info("Seeded AI identity from '%s' into %s", source, session_key)
return True
except Exception as e:
+64 -36
View File
@@ -90,6 +90,7 @@ from agent.display import (
KawaiiSpinner, build_tool_preview as _build_tool_preview,
get_cute_tool_message as _get_cute_tool_message_impl,
_detect_tool_failure,
get_tool_emoji as _get_tool_emoji,
)
from agent.trajectory import (
convert_scratchpad_to_think, has_incomplete_scratchpad,
@@ -3301,8 +3302,7 @@ class AIAgent:
extra_body["provider"] = provider_preferences
_is_nous = "nousresearch" in self.base_url.lower()
_is_mistral = "api.mistral.ai" in self.base_url.lower()
if (_is_openrouter or _is_nous) and not _is_mistral:
if self._supports_reasoning_extra_body():
if self.reasoning_config is not None:
rc = dict(self.reasoning_config)
# Nous Portal requires reasoning enabled — don't send
@@ -3326,6 +3326,32 @@ class AIAgent:
return api_kwargs
def _supports_reasoning_extra_body(self) -> bool:
"""Return True when reasoning extra_body is safe to send for this route/model.
OpenRouter forwards unknown extra_body fields to upstream providers.
Some providers/routes reject `reasoning` with 400s, so gate it to
known reasoning-capable model families and direct Nous Portal.
"""
base_url = (self.base_url or "").lower()
if "nousresearch" in base_url:
return True
if "openrouter" not in base_url:
return False
if "api.mistral.ai" in base_url:
return False
model = (self.model or "").lower()
reasoning_model_prefixes = (
"deepseek/",
"anthropic/",
"openai/",
"x-ai/",
"google/gemini-2",
"qwen/qwen3",
)
return any(model.startswith(prefix) for prefix in reasoning_model_prefixes)
def _build_assistant_message(self, assistant_message, finish_reason: str) -> dict:
"""Build a normalized assistant message dict from an API response message.
@@ -3345,8 +3371,7 @@ class AIAgent:
reasoning_text = combined or None
if reasoning_text and self.verbose_logging:
preview = reasoning_text[:100] + "..." if len(reasoning_text) > 100 else reasoning_text
logging.debug(f"Captured reasoning ({len(reasoning_text)} chars): {preview}")
logging.debug(f"Captured reasoning ({len(reasoning_text)} chars): {reasoning_text}")
if reasoning_text and self.reasoning_callback:
try:
@@ -3823,8 +3848,12 @@ class AIAgent:
print(f" ⚡ Concurrent: {num_tools} tool calls — {tool_names_str}")
for i, (tc, name, args) in enumerate(parsed_calls, 1):
args_str = json.dumps(args, ensure_ascii=False)
args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
print(f" 📞 Tool {i}: {name}({list(args.keys())}) - {args_preview}")
if self.verbose_logging:
print(f" 📞 Tool {i}: {name}({list(args.keys())})")
print(f" Args: {args_str}")
else:
args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
print(f" 📞 Tool {i}: {name}({list(args.keys())}) - {args_preview}")
for _, name, args in parsed_calls:
if self.tool_progress_callback:
@@ -3889,17 +3918,20 @@ class AIAgent:
logger.warning("Tool %s returned error (%.2fs): %s", function_name, tool_duration, result_preview)
if self.verbose_logging:
result_preview = function_result[:200] if len(function_result) > 200 else function_result
logging.debug(f"Tool {function_name} completed in {tool_duration:.2f}s")
logging.debug(f"Tool result preview: {result_preview}...")
logging.debug(f"Tool result ({len(function_result)} chars): {function_result}")
# Print cute message per tool
if self.quiet_mode:
cute_msg = _get_cute_tool_message_impl(name, args, tool_duration, result=function_result)
print(f" {cute_msg}")
elif not self.quiet_mode:
response_preview = function_result[:self.log_prefix_chars] + "..." if len(function_result) > self.log_prefix_chars else function_result
print(f" ✅ Tool {i+1} completed in {tool_duration:.2f}s - {response_preview}")
if self.verbose_logging:
print(f" ✅ Tool {i+1} completed in {tool_duration:.2f}s")
print(f" Result: {function_result}")
else:
response_preview = function_result[:self.log_prefix_chars] + "..." if len(function_result) > self.log_prefix_chars else function_result
print(f" ✅ Tool {i+1} completed in {tool_duration:.2f}s - {response_preview}")
# Truncate oversized results
MAX_TOOL_RESULT_CHARS = 100_000
@@ -3975,8 +4007,12 @@ class AIAgent:
if not self.quiet_mode:
args_str = json.dumps(function_args, ensure_ascii=False)
args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
print(f" 📞 Tool {i}: {function_name}({list(function_args.keys())}) - {args_preview}")
if self.verbose_logging:
print(f" 📞 Tool {i}: {function_name}({list(function_args.keys())})")
print(f" Args: {args_str}")
else:
args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
print(f" 📞 Tool {i}: {function_name}({list(function_args.keys())}) - {args_preview}")
if self.tool_progress_callback:
try:
@@ -4085,23 +4121,7 @@ class AIAgent:
self._vprint(f" {cute_msg}")
elif self.quiet_mode and self._stream_callback is None:
face = random.choice(KawaiiSpinner.KAWAII_WAITING)
tool_emoji_map = {
'web_search': '🔍', 'web_extract': '📄', 'web_crawl': '🕸️',
'terminal': '💻', 'process': '⚙️',
'read_file': '📖', 'write_file': '✍️', 'patch': '🔧', 'search_files': '🔎',
'browser_navigate': '🌐', 'browser_snapshot': '📸',
'browser_click': '👆', 'browser_type': '⌨️',
'browser_scroll': '📜', 'browser_back': '◀️',
'browser_press': '⌨️', 'browser_close': '🚪',
'browser_get_images': '🖼️', 'browser_vision': '👁️',
'image_generate': '🎨', 'text_to_speech': '🔊',
'vision_analyze': '👁️', 'mixture_of_agents': '🧠',
'skills_list': '📚', 'skill_view': '📚',
'cronjob': '',
'send_message': '📨', 'todo': '📋', 'memory': '🧠', 'session_search': '🔍',
'clarify': '', 'execute_code': '🐍', 'delegate_task': '🔀',
}
emoji = tool_emoji_map.get(function_name, '')
emoji = _get_tool_emoji(function_name)
preview = _build_tool_preview(function_name, function_args) or function_name
if len(preview) > 30:
preview = preview[:27] + "..."
@@ -4132,7 +4152,9 @@ class AIAgent:
logger.error("handle_function_call raised for %s: %s", function_name, tool_error, exc_info=True)
tool_duration = time.time() - tool_start_time
result_preview = function_result[:200] if len(function_result) > 200 else function_result
result_preview = function_result if self.verbose_logging else (
function_result[:200] if len(function_result) > 200 else function_result
)
# Log tool errors to the persistent error log so [error] tags
# in the UI always have a corresponding detailed entry on disk.
@@ -4142,7 +4164,7 @@ class AIAgent:
if self.verbose_logging:
logging.debug(f"Tool {function_name} completed in {tool_duration:.2f}s")
logging.debug(f"Tool result preview: {result_preview}...")
logging.debug(f"Tool result ({len(function_result)} chars): {function_result}")
# Guard against tools returning absurdly large content that would
# blow up the context window. 100K chars ≈ 25K tokens — generous
@@ -4165,8 +4187,12 @@ class AIAgent:
messages.append(tool_msg)
if not self.quiet_mode:
response_preview = function_result[:self.log_prefix_chars] + "..." if len(function_result) > self.log_prefix_chars else function_result
print(f" ✅ Tool {i} completed in {tool_duration:.2f}s - {response_preview}")
if self.verbose_logging:
print(f" ✅ Tool {i} completed in {tool_duration:.2f}s")
print(f" Result: {function_result}")
else:
response_preview = function_result[:self.log_prefix_chars] + "..." if len(function_result) > self.log_prefix_chars else function_result
print(f" ✅ Tool {i} completed in {tool_duration:.2f}s - {response_preview}")
if self._interrupt_requested and i < len(assistant_message.tool_calls):
remaining = len(assistant_message.tool_calls) - i
@@ -4264,9 +4290,8 @@ class AIAgent:
api_messages.insert(sys_offset + idx, pfm.copy())
summary_extra_body = {}
_is_openrouter = "openrouter" in self.base_url.lower()
_is_nous = "nousresearch" in self.base_url.lower()
if _is_openrouter or _is_nous:
if self._supports_reasoning_extra_body():
if self.reasoning_config is not None:
summary_extra_body["reasoning"] = self.reasoning_config
else:
@@ -5418,7 +5443,10 @@ class AIAgent:
# Handle assistant response
if assistant_message.content and not self.quiet_mode:
self._vprint(f"{self.log_prefix}🤖 Assistant: {assistant_message.content[:100]}{'...' if len(assistant_message.content) > 100 else ''}")
if self.verbose_logging:
self._vprint(f"{self.log_prefix}🤖 Assistant: {assistant_message.content}")
else:
self._vprint(f"{self.log_prefix}🤖 Assistant: {assistant_message.content[:100]}{'...' if len(assistant_message.content) > 100 else ''}")
# Notify progress callback of model's thinking (used by subagent
# delegation to relay the child's reasoning to the parent display).
+123
View File
@@ -0,0 +1,123 @@
"""Tests for get_tool_emoji in agent/display.py — skin + registry integration."""
from unittest.mock import patch as mock_patch, MagicMock
from agent.display import get_tool_emoji
class TestGetToolEmoji:
"""Verify the skin → registry → fallback resolution chain."""
def test_returns_registry_emoji_when_no_skin(self):
"""Registry-registered emoji is used when no skin is active."""
mock_registry = MagicMock()
mock_registry.get_emoji.return_value = "🎨"
with mock_patch("agent.display._get_skin", return_value=None), \
mock_patch("agent.display.registry", mock_registry, create=True):
# Need to patch the import inside get_tool_emoji
pass
# Direct test: patch the lazy import path
with mock_patch("agent.display._get_skin", return_value=None):
# get_tool_emoji will try to import registry — mock that
mock_reg = MagicMock()
mock_reg.get_emoji.return_value = "📖"
with mock_patch.dict("sys.modules", {}):
import sys
# Patch tools.registry module
mock_module = MagicMock()
mock_module.registry = mock_reg
with mock_patch.dict(sys.modules, {"tools.registry": mock_module}):
result = get_tool_emoji("read_file")
assert result == "📖"
def test_skin_override_takes_precedence(self):
"""Skin tool_emojis override registry defaults."""
skin = MagicMock()
skin.tool_emojis = {"terminal": ""}
with mock_patch("agent.display._get_skin", return_value=skin):
result = get_tool_emoji("terminal")
assert result == ""
def test_skin_empty_dict_falls_through(self):
"""Empty skin tool_emojis falls through to registry."""
skin = MagicMock()
skin.tool_emojis = {}
mock_reg = MagicMock()
mock_reg.get_emoji.return_value = "💻"
import sys
mock_module = MagicMock()
mock_module.registry = mock_reg
with mock_patch("agent.display._get_skin", return_value=skin), \
mock_patch.dict(sys.modules, {"tools.registry": mock_module}):
result = get_tool_emoji("terminal")
assert result == "💻"
def test_fallback_default(self):
"""When neither skin nor registry has an emoji, use the default."""
skin = MagicMock()
skin.tool_emojis = {}
mock_reg = MagicMock()
mock_reg.get_emoji.return_value = ""
import sys
mock_module = MagicMock()
mock_module.registry = mock_reg
with mock_patch("agent.display._get_skin", return_value=skin), \
mock_patch.dict(sys.modules, {"tools.registry": mock_module}):
result = get_tool_emoji("unknown_tool")
assert result == ""
def test_custom_default(self):
"""Custom default is returned when nothing matches."""
with mock_patch("agent.display._get_skin", return_value=None):
mock_reg = MagicMock()
mock_reg.get_emoji.return_value = ""
import sys
mock_module = MagicMock()
mock_module.registry = mock_reg
with mock_patch.dict(sys.modules, {"tools.registry": mock_module}):
result = get_tool_emoji("x", default="⚙️")
assert result == "⚙️"
def test_skin_override_only_for_matching_tool(self):
"""Skin override for one tool doesn't affect others."""
skin = MagicMock()
skin.tool_emojis = {"terminal": ""}
mock_reg = MagicMock()
mock_reg.get_emoji.return_value = "🔍"
import sys
mock_module = MagicMock()
mock_module.registry = mock_reg
with mock_patch("agent.display._get_skin", return_value=skin), \
mock_patch.dict(sys.modules, {"tools.registry": mock_module}):
assert get_tool_emoji("terminal") == "" # skin override
assert get_tool_emoji("web_search") == "🔍" # registry fallback
class TestSkinConfigToolEmojis:
"""Verify SkinConfig handles tool_emojis field correctly."""
def test_skin_config_has_tool_emojis_field(self):
from hermes_cli.skin_engine import SkinConfig
skin = SkinConfig(name="test")
assert skin.tool_emojis == {}
def test_skin_config_accepts_tool_emojis(self):
from hermes_cli.skin_engine import SkinConfig
emojis = {"terminal": "", "web_search": "🔮"}
skin = SkinConfig(name="test", tool_emojis=emojis)
assert skin.tool_emojis == emojis
def test_build_skin_config_includes_tool_emojis(self):
from hermes_cli.skin_engine import _build_skin_config
data = {
"name": "custom",
"tool_emojis": {"terminal": "🗡️", "patch": "⚒️"},
}
skin = _build_skin_config(data)
assert skin.tool_emojis == {"terminal": "🗡️", "patch": "⚒️"}
def test_build_skin_config_empty_tool_emojis_default(self):
from hermes_cli.skin_engine import _build_skin_config
data = {"name": "minimal"}
skin = _build_skin_config(data)
assert skin.tool_emojis == {}
+14
View File
@@ -96,6 +96,7 @@ class TestGatewayConfigRoundtrip:
},
reset_triggers=["/new"],
quick_commands={"limits": {"type": "exec", "command": "echo ok"}},
group_sessions_per_user=False,
)
d = config.to_dict()
restored = GatewayConfig.from_dict(d)
@@ -104,6 +105,7 @@ class TestGatewayConfigRoundtrip:
assert restored.platforms[Platform.TELEGRAM].token == "tok_123"
assert restored.reset_triggers == ["/new"]
assert restored.quick_commands == {"limits": {"type": "exec", "command": "echo ok"}}
assert restored.group_sessions_per_user is False
class TestLoadGatewayConfig:
@@ -125,6 +127,18 @@ class TestLoadGatewayConfig:
assert config.quick_commands == {"limits": {"type": "exec", "command": "echo ok"}}
def test_bridges_group_sessions_per_user_from_config_yaml(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
config_path = hermes_home / "config.yaml"
config_path.write_text("group_sessions_per_user: false\n", encoding="utf-8")
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
config = load_gateway_config()
assert config.group_sessions_per_user is False
def test_invalid_quick_commands_in_config_yaml_are_ignored(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
+94
View File
@@ -369,6 +369,54 @@ class TestWhatsAppDMSessionKeyConsistency:
)
assert store._generate_session_key(source) == build_session_key(source)
def test_store_creates_distinct_group_sessions_per_user(self, store):
first = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="alice",
user_name="Alice",
)
second = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="bob",
user_name="Bob",
)
first_entry = store.get_or_create_session(first)
second_entry = store.get_or_create_session(second)
assert first_entry.session_key == "agent:main:discord:group:guild-123:alice"
assert second_entry.session_key == "agent:main:discord:group:guild-123:bob"
assert first_entry.session_id != second_entry.session_id
def test_store_shares_group_sessions_when_disabled_in_config(self, store):
store.config.group_sessions_per_user = False
first = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="alice",
user_name="Alice",
)
second = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="bob",
user_name="Bob",
)
first_entry = store.get_or_create_session(first)
second_entry = store.get_or_create_session(second)
assert first_entry.session_key == "agent:main:discord:group:guild-123"
assert second_entry.session_key == "agent:main:discord:group:guild-123"
assert first_entry.session_id == second_entry.session_id
def test_telegram_dm_includes_chat_id(self):
"""Non-WhatsApp DMs should also include chat_id to separate users."""
source = SessionSource(
@@ -398,6 +446,41 @@ class TestWhatsAppDMSessionKeyConsistency:
key = build_session_key(source)
assert key == "agent:main:discord:group:guild-123"
def test_group_sessions_are_isolated_per_user_when_user_id_present(self):
first = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="alice",
)
second = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="bob",
)
assert build_session_key(first) == "agent:main:discord:group:guild-123:alice"
assert build_session_key(second) == "agent:main:discord:group:guild-123:bob"
assert build_session_key(first) != build_session_key(second)
def test_group_sessions_can_be_shared_when_isolation_disabled(self):
first = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="alice",
)
second = SessionSource(
platform=Platform.DISCORD,
chat_id="guild-123",
chat_type="group",
user_id="bob",
)
assert build_session_key(first, group_sessions_per_user=False) == "agent:main:discord:group:guild-123"
assert build_session_key(second, group_sessions_per_user=False) == "agent:main:discord:group:guild-123"
def test_group_thread_includes_thread_id(self):
"""Forum-style threads need a distinct session key within one group."""
source = SessionSource(
@@ -409,6 +492,17 @@ class TestWhatsAppDMSessionKeyConsistency:
key = build_session_key(source)
assert key == "agent:main:telegram:group:-1002285219667:17585"
def test_group_thread_sessions_are_isolated_per_user(self):
source = SessionSource(
platform=Platform.TELEGRAM,
chat_id="-1002285219667",
chat_type="group",
thread_id="17585",
user_id="42",
)
key = build_session_key(source)
assert key == "agent:main:telegram:group:-1002285219667:17585:42"
class TestSessionStoreEntriesAttribute:
"""Regression: /reset must access _entries, not _sessions."""
+133
View File
@@ -0,0 +1,133 @@
"""Tests for gateway /status behavior and token persistence."""
from datetime import datetime
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock
import pytest
from gateway.config import GatewayConfig, Platform, PlatformConfig
from gateway.platforms.base import MessageEvent
from gateway.session import SessionEntry, SessionSource, build_session_key
def _make_source() -> SessionSource:
return SessionSource(
platform=Platform.TELEGRAM,
user_id="u1",
chat_id="c1",
user_name="tester",
chat_type="dm",
)
def _make_event(text: str) -> MessageEvent:
return MessageEvent(
text=text,
source=_make_source(),
message_id="m1",
)
def _make_runner(session_entry: SessionEntry):
from gateway.run import GatewayRunner
runner = object.__new__(GatewayRunner)
runner.config = GatewayConfig(
platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="***")}
)
adapter = MagicMock()
adapter.send = AsyncMock()
runner.adapters = {Platform.TELEGRAM: adapter}
runner._voice_mode = {}
runner.hooks = SimpleNamespace(emit=AsyncMock(), loaded_hooks=False)
runner.session_store = MagicMock()
runner.session_store.get_or_create_session.return_value = session_entry
runner.session_store.load_transcript.return_value = []
runner.session_store.has_any_sessions.return_value = True
runner.session_store.append_to_transcript = MagicMock()
runner.session_store.rewrite_transcript = MagicMock()
runner.session_store.update_session = MagicMock()
runner._running_agents = {}
runner._pending_messages = {}
runner._pending_approvals = {}
runner._session_db = None
runner._reasoning_config = None
runner._provider_routing = {}
runner._fallback_model = None
runner._show_reasoning = False
runner._is_user_authorized = lambda _source: True
runner._set_session_env = lambda _context: None
runner._should_send_voice_reply = lambda *_args, **_kwargs: False
runner._send_voice_reply = AsyncMock()
runner._capture_gateway_honcho_if_configured = lambda *args, **kwargs: None
runner._emit_gateway_run_progress = AsyncMock()
return runner
@pytest.mark.asyncio
async def test_status_command_reports_running_agent_without_interrupt(monkeypatch):
session_entry = SessionEntry(
session_key=build_session_key(_make_source()),
session_id="sess-1",
created_at=datetime.now(),
updated_at=datetime.now(),
platform=Platform.TELEGRAM,
chat_type="dm",
total_tokens=321,
)
runner = _make_runner(session_entry)
running_agent = MagicMock()
runner._running_agents[build_session_key(_make_source())] = running_agent
result = await runner._handle_message(_make_event("/status"))
assert "**Tokens:** 321" in result
assert "**Agent Running:** Yes ⚡" in result
running_agent.interrupt.assert_not_called()
assert runner._pending_messages == {}
@pytest.mark.asyncio
async def test_handle_message_persists_agent_token_counts(monkeypatch):
import gateway.run as gateway_run
session_entry = SessionEntry(
session_key=build_session_key(_make_source()),
session_id="sess-1",
created_at=datetime.now(),
updated_at=datetime.now(),
platform=Platform.TELEGRAM,
chat_type="dm",
)
runner = _make_runner(session_entry)
runner.session_store.load_transcript.return_value = [{"role": "user", "content": "earlier"}]
runner._run_agent = AsyncMock(
return_value={
"final_response": "ok",
"messages": [],
"tools": [],
"history_offset": 0,
"last_prompt_tokens": 80,
"input_tokens": 120,
"output_tokens": 45,
"model": "openai/test-model",
}
)
monkeypatch.setattr(gateway_run, "_resolve_runtime_agent_kwargs", lambda: {"api_key": "***"})
monkeypatch.setattr(
"agent.model_metadata.get_model_context_length",
lambda *_args, **_kwargs: 100000,
)
result = await runner._handle_message(_make_event("hello"))
assert result == "ok"
runner.session_store.update_session.assert_called_once_with(
session_entry.session_key,
input_tokens=120,
output_tokens=45,
last_prompt_tokens=80,
model="openai/test-model",
)
+24
View File
@@ -51,3 +51,27 @@ async def test_enrich_message_with_transcription_skips_when_stt_disabled():
assert "transcription is disabled" in result.lower()
assert "caption" in result
@pytest.mark.asyncio
async def test_enrich_message_with_transcription_avoids_bogus_no_provider_message_for_backend_key_errors():
from gateway.run import GatewayRunner
runner = GatewayRunner.__new__(GatewayRunner)
runner.config = GatewayConfig(stt_enabled=True)
with patch(
"tools.transcription_tools.transcribe_audio",
return_value={"success": False, "error": "VOICE_TOOLS_OPENAI_KEY not set"},
), patch(
"tools.transcription_tools.get_stt_model_from_config",
return_value=None,
):
result = await runner._enrich_message_with_transcription(
"caption",
["/tmp/voice.ogg"],
)
assert "No STT provider is configured" not in result
assert "trouble transcribing" in result
assert "caption" in result
+25 -1
View File
@@ -7,7 +7,7 @@ or corrupt user-visible content.
import re
import sys
from unittest.mock import MagicMock
from unittest.mock import AsyncMock, MagicMock
import pytest
@@ -392,3 +392,27 @@ class TestStripMdv2:
def test_empty_string(self):
assert _strip_mdv2("") == ""
@pytest.mark.asyncio
async def test_send_escapes_chunk_indicator_for_markdownv2(adapter):
adapter.MAX_MESSAGE_LENGTH = 80
adapter._bot = MagicMock()
sent_texts = []
async def _fake_send_message(**kwargs):
sent_texts.append(kwargs["text"])
msg = MagicMock()
msg.message_id = len(sent_texts)
return msg
adapter._bot.send_message = AsyncMock(side_effect=_fake_send_message)
content = ("**bold** chunk content " * 12).strip()
result = await adapter.send("123", content)
assert result.success is True
assert len(sent_texts) > 1
assert re.search(r" \\\([0-9]+/[0-9]+\\\)$", sent_texts[0])
assert re.search(r" \\\([0-9]+/[0-9]+\\\)$", sent_texts[-1])
+61 -1
View File
@@ -7,6 +7,7 @@ from hermes_cli.models import (
fetch_api_models,
normalize_provider,
parse_model_input,
probe_api_models,
provider_label,
provider_model_ids,
validate_requested_model,
@@ -26,7 +27,15 @@ FAKE_API_MODELS = [
def _validate(model, provider="openrouter", api_models=FAKE_API_MODELS, **kw):
"""Shortcut: call validate_requested_model with mocked API."""
with patch("hermes_cli.models.fetch_api_models", return_value=api_models):
probe_payload = {
"models": api_models,
"probed_url": "http://localhost:11434/v1/models",
"resolved_base_url": kw.get("base_url", "") or "http://localhost:11434/v1",
"suggested_base_url": None,
"used_fallback": False,
}
with patch("hermes_cli.models.fetch_api_models", return_value=api_models), \
patch("hermes_cli.models.probe_api_models", return_value=probe_payload):
return validate_requested_model(model, provider, **kw)
@@ -147,6 +156,33 @@ class TestFetchApiModels:
with patch("hermes_cli.models.urllib.request.urlopen", side_effect=Exception("timeout")):
assert fetch_api_models("key", "https://example.com/v1") is None
def test_probe_api_models_tries_v1_fallback(self):
class _Resp:
def __enter__(self):
return self
def __exit__(self, exc_type, exc, tb):
return False
def read(self):
return b'{"data": [{"id": "local-model"}]}'
calls = []
def _fake_urlopen(req, timeout=5.0):
calls.append(req.full_url)
if req.full_url.endswith("/v1/models"):
return _Resp()
raise Exception("404")
with patch("hermes_cli.models.urllib.request.urlopen", side_effect=_fake_urlopen):
probe = probe_api_models("key", "http://localhost:8000")
assert calls == ["http://localhost:8000/models", "http://localhost:8000/v1/models"]
assert probe["models"] == ["local-model"]
assert probe["resolved_base_url"] == "http://localhost:8000/v1"
assert probe["used_fallback"] is True
# -- validate — format checks -----------------------------------------------
@@ -191,6 +227,7 @@ class TestValidateApiFound:
)
assert result["accepted"] is True
assert result["persist"] is True
assert result["recognized"] is True
# -- validate — API not found ------------------------------------------------
@@ -232,3 +269,26 @@ class TestValidateApiFallback:
result = _validate("some-model", provider="totally-unknown", api_models=None)
assert result["accepted"] is True
assert result["persist"] is True
def test_custom_endpoint_warns_with_probed_url_and_v1_hint(self):
with patch(
"hermes_cli.models.probe_api_models",
return_value={
"models": None,
"probed_url": "http://localhost:8000/v1/models",
"resolved_base_url": "http://localhost:8000",
"suggested_base_url": "http://localhost:8000/v1",
"used_fallback": False,
},
):
result = validate_requested_model(
"qwen3",
"custom",
api_key="local-key",
base_url="http://localhost:8000",
)
assert result["accepted"] is True
assert result["persist"] is True
assert "http://localhost:8000/v1/models" in result["message"]
assert "http://localhost:8000/v1" in result["message"]
@@ -75,6 +75,58 @@ def test_setup_keep_current_custom_from_config_does_not_fall_through(tmp_path, m
assert calls["count"] == 1
def test_setup_custom_endpoint_saves_working_v1_base_url(tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
_clear_provider_env(monkeypatch)
config = load_config()
def fake_prompt_choice(question, choices, default=0):
if question == "Select your inference provider:":
return 3 # Custom endpoint
if question == "Configure vision:":
return len(choices) - 1 # Skip
raise AssertionError(f"Unexpected prompt_choice call: {question}")
def fake_prompt(message, current=None, **kwargs):
if "API base URL" in message:
return "http://localhost:8000"
if "API key" in message:
return "local-key"
if "Model name" in message:
return "llm"
return ""
monkeypatch.setattr("hermes_cli.setup.prompt_choice", fake_prompt_choice)
monkeypatch.setattr("hermes_cli.setup.prompt", fake_prompt)
monkeypatch.setattr("hermes_cli.setup.prompt_yes_no", lambda *args, **kwargs: False)
monkeypatch.setattr("hermes_cli.auth.get_active_provider", lambda: None)
monkeypatch.setattr("hermes_cli.auth.detect_external_credentials", lambda: [])
monkeypatch.setattr("agent.auxiliary_client.get_available_vision_backends", lambda: [])
monkeypatch.setattr(
"hermes_cli.models.probe_api_models",
lambda api_key, base_url: {
"models": ["llm"],
"probed_url": "http://localhost:8000/v1/models",
"resolved_base_url": "http://localhost:8000/v1",
"suggested_base_url": "http://localhost:8000/v1",
"used_fallback": True,
},
)
setup_model_provider(config)
save_config(config)
env = _read_env(tmp_path)
reloaded = load_config()
assert env.get("OPENAI_BASE_URL") == "http://localhost:8000/v1"
assert env.get("OPENAI_API_KEY") == "local-key"
assert reloaded["model"]["provider"] == "custom"
assert reloaded["model"]["base_url"] == "http://localhost:8000/v1"
assert reloaded["model"]["default"] == "llm"
def test_setup_keep_current_config_provider_uses_provider_specific_model_menu(tmp_path, monkeypatch):
"""Keep-current should respect config-backed providers, not fall back to OpenRouter."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
@@ -0,0 +1,29 @@
from hermes_cli import setup as setup_mod
def test_prompt_choice_uses_curses_helper(monkeypatch):
monkeypatch.setattr(setup_mod, "_curses_prompt_choice", lambda question, choices, default=0: 1)
idx = setup_mod.prompt_choice("Pick one", ["a", "b", "c"], default=0)
assert idx == 1
def test_prompt_choice_falls_back_to_numbered_input(monkeypatch):
monkeypatch.setattr(setup_mod, "_curses_prompt_choice", lambda question, choices, default=0: -1)
monkeypatch.setattr("builtins.input", lambda _prompt="": "2")
idx = setup_mod.prompt_choice("Pick one", ["a", "b", "c"], default=0)
assert idx == 1
def test_prompt_checklist_uses_shared_curses_checklist(monkeypatch):
monkeypatch.setattr(
"hermes_cli.curses_ui.curses_checklist",
lambda title, items, selected, cancel_returns=None: {0, 2},
)
selected = setup_mod.prompt_checklist("Pick tools", ["one", "two", "three"], pre_selected=[1])
assert selected == [0, 2]
+16
View File
@@ -68,6 +68,22 @@ class TestAtomicJsonWrite:
tmp_files = [f for f in tmp_path.iterdir() if ".tmp" in f.name]
assert len(tmp_files) == 0
def test_cleans_up_temp_file_on_baseexception(self, tmp_path):
class SimulatedAbort(BaseException):
pass
target = tmp_path / "data.json"
original = {"preserved": True}
target.write_text(json.dumps(original), encoding="utf-8")
with patch("utils.json.dump", side_effect=SimulatedAbort):
with pytest.raises(SimulatedAbort):
atomic_json_write(target, {"new": True})
tmp_files = [f for f in tmp_path.iterdir() if ".tmp" in f.name]
assert len(tmp_files) == 0
assert json.loads(target.read_text(encoding="utf-8")) == original
def test_accepts_string_path(self, tmp_path):
target = str(tmp_path / "string_path.json")
atomic_json_write(target, {"string": True})
+44
View File
@@ -0,0 +1,44 @@
"""Tests for utils.atomic_yaml_write — crash-safe YAML file writes."""
from pathlib import Path
from unittest.mock import patch
import pytest
import yaml
from utils import atomic_yaml_write
class TestAtomicYamlWrite:
def test_writes_valid_yaml(self, tmp_path):
target = tmp_path / "data.yaml"
data = {"key": "value", "nested": {"a": 1}}
atomic_yaml_write(target, data)
assert yaml.safe_load(target.read_text(encoding="utf-8")) == data
def test_cleans_up_temp_file_on_baseexception(self, tmp_path):
class SimulatedAbort(BaseException):
pass
target = tmp_path / "data.yaml"
original = {"preserved": True}
target.write_text(yaml.safe_dump(original), encoding="utf-8")
with patch("utils.yaml.dump", side_effect=SimulatedAbort):
with pytest.raises(SimulatedAbort):
atomic_yaml_write(target, {"new": True})
tmp_files = [f for f in tmp_path.iterdir() if ".tmp" in f.name]
assert len(tmp_files) == 0
assert yaml.safe_load(target.read_text(encoding="utf-8")) == original
def test_appends_extra_content(self, tmp_path):
target = tmp_path / "data.yaml"
atomic_yaml_write(target, {"key": "value"}, extra_content="\n# comment\n")
text = target.read_text(encoding="utf-8")
assert "key: value" in text
assert "# comment" in text
+103
View File
@@ -0,0 +1,103 @@
"""Tests for automatic MCP reload when config.yaml mcp_servers section changes."""
import time
from pathlib import Path
from unittest.mock import MagicMock, patch
def _make_cli(tmp_path, mcp_servers=None):
"""Create a minimal HermesCLI instance with mocked config."""
import cli as cli_mod
obj = object.__new__(cli_mod.HermesCLI)
obj.config = {"mcp_servers": mcp_servers or {}}
obj._agent_running = False
obj._last_config_check = 0.0
obj._config_mcp_servers = mcp_servers or {}
cfg_file = tmp_path / "config.yaml"
cfg_file.write_text("mcp_servers: {}\n")
obj._config_mtime = cfg_file.stat().st_mtime
obj._reload_mcp = MagicMock()
obj._busy_command = MagicMock()
obj._busy_command.return_value.__enter__ = MagicMock(return_value=None)
obj._busy_command.return_value.__exit__ = MagicMock(return_value=False)
obj._slow_command_status = MagicMock(return_value="reloading...")
return obj, cfg_file
class TestMCPConfigWatch:
def test_no_change_does_not_reload(self, tmp_path):
"""If mtime and mcp_servers unchanged, _reload_mcp is NOT called."""
obj, cfg_file = _make_cli(tmp_path)
with patch("hermes_cli.config.get_config_path", return_value=cfg_file):
obj._check_config_mcp_changes()
obj._reload_mcp.assert_not_called()
def test_mtime_change_with_same_mcp_servers_does_not_reload(self, tmp_path):
"""If file mtime changes but mcp_servers is identical, no reload."""
import yaml
obj, cfg_file = _make_cli(tmp_path, mcp_servers={"fs": {"command": "npx"}})
# Write same mcp_servers but touch the file
cfg_file.write_text(yaml.dump({"mcp_servers": {"fs": {"command": "npx"}}}))
# Force mtime to appear changed
obj._config_mtime = 0.0
with patch("hermes_cli.config.get_config_path", return_value=cfg_file):
obj._check_config_mcp_changes()
obj._reload_mcp.assert_not_called()
def test_new_mcp_server_triggers_reload(self, tmp_path):
"""Adding a new MCP server to config triggers auto-reload."""
import yaml
obj, cfg_file = _make_cli(tmp_path, mcp_servers={})
# Simulate user adding a new MCP server to config.yaml
cfg_file.write_text(yaml.dump({"mcp_servers": {"github": {"url": "https://mcp.github.com"}}}))
obj._config_mtime = 0.0 # force stale mtime
with patch("hermes_cli.config.get_config_path", return_value=cfg_file):
obj._check_config_mcp_changes()
obj._reload_mcp.assert_called_once()
def test_removed_mcp_server_triggers_reload(self, tmp_path):
"""Removing an MCP server from config triggers auto-reload."""
import yaml
obj, cfg_file = _make_cli(tmp_path, mcp_servers={"github": {"url": "https://mcp.github.com"}})
# Simulate user removing the server
cfg_file.write_text(yaml.dump({"mcp_servers": {}}))
obj._config_mtime = 0.0
with patch("hermes_cli.config.get_config_path", return_value=cfg_file):
obj._check_config_mcp_changes()
obj._reload_mcp.assert_called_once()
def test_interval_throttle_skips_check(self, tmp_path):
"""If called within CONFIG_WATCH_INTERVAL, stat() is skipped."""
obj, cfg_file = _make_cli(tmp_path)
obj._last_config_check = time.monotonic() # just checked
with patch("hermes_cli.config.get_config_path", return_value=cfg_file), \
patch.object(Path, "stat") as mock_stat:
obj._check_config_mcp_changes()
mock_stat.assert_not_called()
obj._reload_mcp.assert_not_called()
def test_missing_config_file_does_not_crash(self, tmp_path):
"""If config.yaml doesn't exist, _check_config_mcp_changes is a no-op."""
obj, cfg_file = _make_cli(tmp_path)
missing = tmp_path / "nonexistent.yaml"
with patch("hermes_cli.config.get_config_path", return_value=missing):
obj._check_config_mcp_changes() # should not raise
obj._reload_mcp.assert_not_called()
+39 -1
View File
@@ -336,4 +336,42 @@ def test_cmd_model_falls_back_to_auto_on_invalid_provider(monkeypatch, capsys):
assert "Warning:" in output
assert "falling back to auto provider detection" in output.lower()
assert "No change." in output
assert "No change." in output
def test_model_flow_custom_saves_verified_v1_base_url(monkeypatch, capsys):
monkeypatch.setattr(
"hermes_cli.config.get_env_value",
lambda key: "" if key in {"OPENAI_BASE_URL", "OPENAI_API_KEY"} else "",
)
saved_env = {}
monkeypatch.setattr("hermes_cli.config.save_env_value", lambda key, value: saved_env.__setitem__(key, value))
monkeypatch.setattr("hermes_cli.auth._save_model_choice", lambda model: saved_env.__setitem__("MODEL", model))
monkeypatch.setattr("hermes_cli.auth.deactivate_provider", lambda: None)
monkeypatch.setattr("hermes_cli.main._save_custom_provider", lambda *args, **kwargs: None)
monkeypatch.setattr(
"hermes_cli.models.probe_api_models",
lambda api_key, base_url: {
"models": ["llm"],
"probed_url": "http://localhost:8000/v1/models",
"resolved_base_url": "http://localhost:8000/v1",
"suggested_base_url": "http://localhost:8000/v1",
"used_fallback": True,
},
)
monkeypatch.setattr(
"hermes_cli.config.load_config",
lambda: {"model": {"default": "", "provider": "custom", "base_url": ""}},
)
monkeypatch.setattr("hermes_cli.config.save_config", lambda cfg: None)
answers = iter(["http://localhost:8000", "local-key", "llm"])
monkeypatch.setattr("builtins.input", lambda _prompt="": next(answers))
hermes_main._model_flow_custom({})
output = capsys.readouterr().out
assert "Saving the working base URL instead" in output
assert saved_env["OPENAI_BASE_URL"] == "http://localhost:8000/v1"
assert saved_env["OPENAI_API_KEY"] == "local-key"
assert saved_env["MODEL"] == "llm"
+32
View File
@@ -612,6 +612,25 @@ class TestBuildApiKwargs:
kwargs = agent._build_api_kwargs(messages)
assert kwargs["extra_body"]["reasoning"] == {"enabled": False}
def test_reasoning_not_sent_for_unsupported_openrouter_model(self, agent):
agent.model = "minimax/minimax-m2.5"
messages = [{"role": "user", "content": "hi"}]
kwargs = agent._build_api_kwargs(messages)
assert "reasoning" not in kwargs.get("extra_body", {})
def test_reasoning_sent_for_supported_openrouter_model(self, agent):
agent.model = "qwen/qwen3.5-plus-02-15"
messages = [{"role": "user", "content": "hi"}]
kwargs = agent._build_api_kwargs(messages)
assert kwargs["extra_body"]["reasoning"]["effort"] == "medium"
def test_reasoning_sent_for_nous_route(self, agent):
agent.base_url = "https://inference-api.nousresearch.com/v1"
agent.model = "minimax/minimax-m2.5"
messages = [{"role": "user", "content": "hi"}]
kwargs = agent._build_api_kwargs(messages)
assert kwargs["extra_body"]["reasoning"]["effort"] == "medium"
def test_max_tokens_injected(self, agent):
agent.max_tokens = 4096
messages = [{"role": "user", "content": "hi"}]
@@ -942,6 +961,19 @@ class TestHandleMaxIterations:
assert "error" in result.lower()
assert "API down" in result
def test_summary_skips_reasoning_for_unsupported_openrouter_model(self, agent):
agent.model = "minimax/minimax-m2.5"
resp = _mock_response(content="Summary")
agent.client.chat.completions.create.return_value = resp
agent._cached_system_prompt = "You are helpful."
messages = [{"role": "user", "content": "do stuff"}]
result = agent._handle_max_iterations(messages, 60)
assert result == "Summary"
kwargs = agent.client.chat.completions.create.call_args.kwargs
assert "reasoning" not in kwargs.get("extra_body", {})
class TestRunConversation:
"""Tests for the main run_conversation method.
+28
View File
@@ -1,8 +1,10 @@
"""Tests for tools/checkpoint_manager.py — CheckpointManager."""
import logging
import os
import json
import shutil
import subprocess
import pytest
from pathlib import Path
from unittest.mock import patch
@@ -143,6 +145,12 @@ class TestTakeCheckpoint:
result = mgr.ensure_checkpoint(str(work_dir), "initial")
assert result is True
def test_successful_checkpoint_does_not_log_expected_diff_exit(self, mgr, work_dir, caplog):
with caplog.at_level(logging.ERROR, logger="tools.checkpoint_manager"):
result = mgr.ensure_checkpoint(str(work_dir), "initial")
assert result is True
assert not any("diff --cached --quiet" in r.getMessage() for r in caplog.records)
def test_dedup_same_turn(self, mgr, work_dir):
r1 = mgr.ensure_checkpoint(str(work_dir), "first")
r2 = mgr.ensure_checkpoint(str(work_dir), "second")
@@ -375,6 +383,26 @@ class TestErrorResilience:
result = mgr.ensure_checkpoint(str(work_dir), "test")
assert result is False
def test_run_git_allows_expected_nonzero_without_error_log(self, tmp_path, caplog):
completed = subprocess.CompletedProcess(
args=["git", "diff", "--cached", "--quiet"],
returncode=1,
stdout="",
stderr="",
)
with patch("tools.checkpoint_manager.subprocess.run", return_value=completed):
with caplog.at_level(logging.ERROR, logger="tools.checkpoint_manager"):
ok, stdout, stderr = _run_git(
["diff", "--cached", "--quiet"],
tmp_path / "shadow",
str(tmp_path / "work"),
allowed_returncodes={1},
)
assert ok is False
assert stdout == ""
assert stderr == ""
assert not caplog.records
def test_checkpoint_failure_does_not_raise(self, mgr, work_dir, monkeypatch):
"""Checkpoint failures should never raise — they're silently logged."""
def broken_run_git(*args, **kwargs):
+16 -2
View File
@@ -5,6 +5,7 @@ handling without requiring a running terminal environment.
"""
import json
import logging
from unittest.mock import MagicMock, patch
from tools.file_tools import (
@@ -87,13 +88,26 @@ class TestWriteFileHandler:
mock_ops.write_file.assert_called_once_with("/tmp/out.txt", "hello world!\n")
@patch("tools.file_tools._get_file_ops")
def test_exception_returns_error_json(self, mock_get):
def test_permission_error_returns_error_json_without_error_log(self, mock_get, caplog):
mock_get.side_effect = PermissionError("read-only filesystem")
from tools.file_tools import write_file_tool
result = json.loads(write_file_tool("/tmp/out.txt", "data"))
with caplog.at_level(logging.DEBUG, logger="tools.file_tools"):
result = json.loads(write_file_tool("/tmp/out.txt", "data"))
assert "error" in result
assert "read-only" in result["error"]
assert any("write_file expected denial" in r.getMessage() for r in caplog.records)
assert not any(r.levelno >= logging.ERROR for r in caplog.records)
@patch("tools.file_tools._get_file_ops")
def test_unexpected_exception_still_logs_error(self, mock_get, caplog):
mock_get.side_effect = RuntimeError("boom")
from tools.file_tools import write_file_tool
with caplog.at_level(logging.ERROR, logger="tools.file_tools"):
result = json.loads(write_file_tool("/tmp/out.txt", "data"))
assert result["error"] == "boom"
assert any("write_file error" in r.getMessage() for r in caplog.records)
class TestPatchHandler:
+42
View File
@@ -232,6 +232,48 @@ class TestCheckFnExceptionHandling:
assert any(u["name"] == "crashes" for u in unavailable)
class TestEmojiMetadata:
"""Verify per-tool emoji registration and lookup."""
def test_emoji_stored_on_entry(self):
reg = ToolRegistry()
reg.register(
name="t", toolset="s", schema=_make_schema(),
handler=_dummy_handler, emoji="🔥",
)
assert reg._tools["t"].emoji == "🔥"
def test_get_emoji_returns_registered(self):
reg = ToolRegistry()
reg.register(
name="t", toolset="s", schema=_make_schema(),
handler=_dummy_handler, emoji="🎯",
)
assert reg.get_emoji("t") == "🎯"
def test_get_emoji_returns_default_when_unset(self):
reg = ToolRegistry()
reg.register(
name="t", toolset="s", schema=_make_schema(),
handler=_dummy_handler,
)
assert reg.get_emoji("t") == ""
assert reg.get_emoji("t", default="🔧") == "🔧"
def test_get_emoji_returns_default_for_unknown_tool(self):
reg = ToolRegistry()
assert reg.get_emoji("nonexistent") == ""
assert reg.get_emoji("nonexistent", default="") == ""
def test_emoji_empty_string_treated_as_unset(self):
reg = ToolRegistry()
reg.register(
name="t", toolset="s", schema=_make_schema(),
handler=_dummy_handler, emoji="",
)
assert reg.get_emoji("t") == ""
class TestSecretCaptureResultContract:
def test_secret_request_result_does_not_include_secret_value(self):
result = {
+53 -2
View File
@@ -8,6 +8,7 @@ from unittest.mock import MagicMock
import pytest
from tools.environments.ssh import SSHEnvironment
from tools.environments import ssh as ssh_env
_SSH_HOST = os.getenv("TERMINAL_SSH_HOST", "")
_SSH_USER = os.getenv("TERMINAL_SSH_USER", "")
@@ -67,16 +68,66 @@ class TestBuildSSHCommand:
class TestTerminalToolConfig:
def test_ssh_persistent_default_false(self, monkeypatch):
def test_ssh_persistent_default_true(self, monkeypatch):
"""SSH persistent defaults to True (via TERMINAL_PERSISTENT_SHELL)."""
monkeypatch.delenv("TERMINAL_SSH_PERSISTENT", raising=False)
monkeypatch.delenv("TERMINAL_PERSISTENT_SHELL", raising=False)
from tools.terminal_tool import _get_env_config
assert _get_env_config()["ssh_persistent"] is True
def test_ssh_persistent_explicit_false(self, monkeypatch):
"""Per-backend env var overrides the global default."""
monkeypatch.setenv("TERMINAL_SSH_PERSISTENT", "false")
from tools.terminal_tool import _get_env_config
assert _get_env_config()["ssh_persistent"] is False
def test_ssh_persistent_true(self, monkeypatch):
def test_ssh_persistent_explicit_true(self, monkeypatch):
monkeypatch.setenv("TERMINAL_SSH_PERSISTENT", "true")
from tools.terminal_tool import _get_env_config
assert _get_env_config()["ssh_persistent"] is True
def test_ssh_persistent_respects_config(self, monkeypatch):
"""TERMINAL_PERSISTENT_SHELL=false disables SSH persistent by default."""
monkeypatch.delenv("TERMINAL_SSH_PERSISTENT", raising=False)
monkeypatch.setenv("TERMINAL_PERSISTENT_SHELL", "false")
from tools.terminal_tool import _get_env_config
assert _get_env_config()["ssh_persistent"] is False
class TestSSHPreflight:
def test_ensure_ssh_available_raises_clear_error_when_missing(self, monkeypatch):
monkeypatch.setattr(ssh_env.shutil, "which", lambda _name: None)
with pytest.raises(RuntimeError, match="SSH is not installed or not in PATH"):
ssh_env._ensure_ssh_available()
def test_ssh_environment_checks_availability_before_connect(self, monkeypatch):
monkeypatch.setattr(ssh_env.shutil, "which", lambda _name: None)
monkeypatch.setattr(
ssh_env.SSHEnvironment,
"_establish_connection",
lambda self: pytest.fail("_establish_connection should not run when ssh is missing"),
)
with pytest.raises(RuntimeError, match="openssh-client"):
ssh_env.SSHEnvironment(host="example.com", user="alice")
def test_ssh_environment_connects_when_ssh_exists(self, monkeypatch):
called = {"count": 0}
monkeypatch.setattr(ssh_env.shutil, "which", lambda _name: "/usr/bin/ssh")
def _fake_establish(self):
called["count"] += 1
monkeypatch.setattr(ssh_env.SSHEnvironment, "_establish_connection", _fake_establish)
env = ssh_env.SSHEnvironment(host="example.com", user="alice")
assert called["count"] == 1
assert env.host == "example.com"
assert env.user == "alice"
def _setup_ssh_env(monkeypatch, persistent: bool):
monkeypatch.setenv("TERMINAL_ENV", "ssh")
+33
View File
@@ -315,6 +315,23 @@ class TestEnsureInstalled:
mock_thread.start.assert_called_once()
_tirith_mod._resolved_path = None
@patch("tools.tirith_security._load_security_config")
def test_startup_prefetch_can_suppress_install_failure_logs(self, mock_cfg):
mock_cfg.return_value = {"tirith_enabled": True, "tirith_path": "tirith",
"tirith_timeout": 5, "tirith_fail_open": True}
_tirith_mod._resolved_path = None
with patch("tools.tirith_security.shutil.which", return_value=None), \
patch("tools.tirith_security._hermes_bin_dir", return_value="/nonexistent"), \
patch("tools.tirith_security._is_install_failed_on_disk", return_value=False), \
patch("tools.tirith_security.threading.Thread") as MockThread:
mock_thread = MagicMock()
MockThread.return_value = mock_thread
result = ensure_installed(log_failures=False)
assert result is None
assert MockThread.call_args.kwargs["kwargs"] == {"log_failures": False}
mock_thread.start.assert_called_once()
_tirith_mod._resolved_path = None
# ---------------------------------------------------------------------------
# Failed download caches the miss (Finding #1)
@@ -516,6 +533,22 @@ class TestCosignVerification:
assert path is None
assert reason == "cosign_missing"
@patch("tools.tirith_security.logger.debug")
@patch("tools.tirith_security.logger.warning")
@patch("tools.tirith_security.shutil.which", return_value=None)
@patch("tools.tirith_security._download_file")
@patch("tools.tirith_security._detect_target", return_value="aarch64-apple-darwin")
def test_install_quiet_mode_downgrades_cosign_missing_log(self, mock_target, mock_dl,
mock_which, mock_warning,
mock_debug):
"""Startup prefetch should not surface cosign-missing as a warning."""
from tools.tirith_security import _install_tirith
path, reason = _install_tirith(log_failures=False)
assert path is None
assert reason == "cosign_missing"
mock_warning.assert_not_called()
mock_debug.assert_called()
@patch("tools.tirith_security._verify_cosign", return_value=None)
@patch("tools.tirith_security.shutil.which", return_value="/usr/local/bin/cosign")
@patch("tools.tirith_security._download_file")
+97
View File
@@ -7,6 +7,7 @@ end-to-end dispatch. All external dependencies are mocked.
import os
import struct
import subprocess
import wave
from unittest.mock import MagicMock, patch
@@ -45,7 +46,10 @@ def sample_ogg(tmp_path):
def clean_env(monkeypatch):
"""Ensure no real API keys leak into tests."""
monkeypatch.delenv("VOICE_TOOLS_OPENAI_KEY", raising=False)
monkeypatch.delenv("OPENAI_API_KEY", raising=False)
monkeypatch.delenv("GROQ_API_KEY", raising=False)
monkeypatch.delenv("HERMES_LOCAL_STT_COMMAND", raising=False)
monkeypatch.delenv("HERMES_LOCAL_STT_LANGUAGE", raising=False)
# ============================================================================
@@ -132,6 +136,19 @@ class TestGetProviderFallbackPriority:
from tools.transcription_tools import _get_provider
assert _get_provider({}) == "local"
def test_openai_fallback_to_local_command(self, monkeypatch):
monkeypatch.delenv("VOICE_TOOLS_OPENAI_KEY", raising=False)
monkeypatch.delenv("OPENAI_API_KEY", raising=False)
monkeypatch.delenv("GROQ_API_KEY", raising=False)
monkeypatch.setenv(
"HERMES_LOCAL_STT_COMMAND",
"whisper {input_path} --output_dir {output_dir} --language {language}",
)
with patch("tools.transcription_tools._HAS_FASTER_WHISPER", False), \
patch("tools.transcription_tools._HAS_OPENAI", True):
from tools.transcription_tools import _get_provider
assert _get_provider({"provider": "openai"}) == "local_command"
# ============================================================================
# _transcribe_groq
@@ -279,6 +296,63 @@ class TestTranscribeOpenAIExtended:
assert "Permission denied" in result["error"]
class TestTranscribeLocalCommand:
def test_auto_detects_local_whisper_binary(self, monkeypatch):
monkeypatch.delenv("HERMES_LOCAL_STT_COMMAND", raising=False)
monkeypatch.setattr("tools.transcription_tools._find_whisper_binary", lambda: "/opt/homebrew/bin/whisper")
from tools.transcription_tools import _get_local_command_template
template = _get_local_command_template()
assert template is not None
assert template.startswith("/opt/homebrew/bin/whisper ")
assert "{model}" in template
assert "{output_dir}" in template
def test_command_fallback_with_template(self, monkeypatch, sample_ogg, tmp_path):
out_dir = tmp_path / "local-out"
out_dir.mkdir()
monkeypatch.setenv(
"HERMES_LOCAL_STT_COMMAND",
"whisper {input_path} --model {model} --output_dir {output_dir} --language {language}",
)
monkeypatch.setenv("HERMES_LOCAL_STT_LANGUAGE", "en")
def fake_tempdir(prefix=None):
class _TempDir:
def __enter__(self_inner):
return str(out_dir)
def __exit__(self_inner, exc_type, exc, tb):
return False
return _TempDir()
def fake_run(cmd, *args, **kwargs):
if isinstance(cmd, list):
output_path = cmd[-1]
with open(output_path, "wb") as handle:
handle.write(b"RIFF....WAVEfmt ")
return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
(out_dir / "test.txt").write_text("hello from local command\n", encoding="utf-8")
return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
monkeypatch.setattr("tools.transcription_tools.tempfile.TemporaryDirectory", fake_tempdir)
monkeypatch.setattr("tools.transcription_tools._find_ffmpeg_binary", lambda: "/opt/homebrew/bin/ffmpeg")
monkeypatch.setattr("tools.transcription_tools.subprocess.run", fake_run)
from tools.transcription_tools import _transcribe_local_command
result = _transcribe_local_command(sample_ogg, "base")
assert result["success"] is True
assert result["transcript"] == "hello from local command"
assert result["provider"] == "local_command"
# ============================================================================
# _transcribe_local — additional tests
# ============================================================================
@@ -612,6 +686,29 @@ class TestTranscribeAudioDispatch:
assert "faster-whisper" in result["error"]
assert "GROQ_API_KEY" in result["error"]
def test_openai_provider_falls_back_to_local_command(self, monkeypatch, sample_ogg):
monkeypatch.delenv("VOICE_TOOLS_OPENAI_KEY", raising=False)
monkeypatch.delenv("OPENAI_API_KEY", raising=False)
monkeypatch.setenv(
"HERMES_LOCAL_STT_COMMAND",
"whisper {input_path} --model {model} --output_dir {output_dir} --language {language}",
)
with patch("tools.transcription_tools._load_stt_config", return_value={"provider": "openai"}), \
patch("tools.transcription_tools._HAS_FASTER_WHISPER", False), \
patch("tools.transcription_tools._HAS_OPENAI", True), \
patch("tools.transcription_tools._transcribe_local_command", return_value={
"success": True,
"transcript": "hello from fallback",
"provider": "local_command",
}) as mock_local_command:
from tools.transcription_tools import transcribe_audio
result = transcribe_audio(sample_ogg)
assert result["success"] is True
assert result["transcript"] == "hello from fallback"
mock_local_command.assert_called_once_with(sample_ogg, "base")
def test_invalid_file_short_circuits(self):
from tools.transcription_tools import transcribe_audio
result = transcribe_audio("/nonexistent/audio.wav")
+11
View File
@@ -1833,6 +1833,7 @@ registry.register(
schema=_BROWSER_SCHEMA_MAP["browser_navigate"],
handler=lambda args, **kw: browser_navigate(url=args.get("url", ""), task_id=kw.get("task_id")),
check_fn=check_browser_requirements,
emoji="🌐",
)
registry.register(
name="browser_snapshot",
@@ -1841,6 +1842,7 @@ registry.register(
handler=lambda args, **kw: browser_snapshot(
full=args.get("full", False), task_id=kw.get("task_id"), user_task=kw.get("user_task")),
check_fn=check_browser_requirements,
emoji="📸",
)
registry.register(
name="browser_click",
@@ -1848,6 +1850,7 @@ registry.register(
schema=_BROWSER_SCHEMA_MAP["browser_click"],
handler=lambda args, **kw: browser_click(**args, task_id=kw.get("task_id")),
check_fn=check_browser_requirements,
emoji="👆",
)
registry.register(
name="browser_type",
@@ -1855,6 +1858,7 @@ registry.register(
schema=_BROWSER_SCHEMA_MAP["browser_type"],
handler=lambda args, **kw: browser_type(**args, task_id=kw.get("task_id")),
check_fn=check_browser_requirements,
emoji="⌨️",
)
registry.register(
name="browser_scroll",
@@ -1862,6 +1866,7 @@ registry.register(
schema=_BROWSER_SCHEMA_MAP["browser_scroll"],
handler=lambda args, **kw: browser_scroll(**args, task_id=kw.get("task_id")),
check_fn=check_browser_requirements,
emoji="📜",
)
registry.register(
name="browser_back",
@@ -1869,6 +1874,7 @@ registry.register(
schema=_BROWSER_SCHEMA_MAP["browser_back"],
handler=lambda args, **kw: browser_back(task_id=kw.get("task_id")),
check_fn=check_browser_requirements,
emoji="◀️",
)
registry.register(
name="browser_press",
@@ -1876,6 +1882,7 @@ registry.register(
schema=_BROWSER_SCHEMA_MAP["browser_press"],
handler=lambda args, **kw: browser_press(key=args.get("key", ""), task_id=kw.get("task_id")),
check_fn=check_browser_requirements,
emoji="⌨️",
)
registry.register(
name="browser_close",
@@ -1883,6 +1890,7 @@ registry.register(
schema=_BROWSER_SCHEMA_MAP["browser_close"],
handler=lambda args, **kw: browser_close(task_id=kw.get("task_id")),
check_fn=check_browser_requirements,
emoji="🚪",
)
registry.register(
name="browser_get_images",
@@ -1890,6 +1898,7 @@ registry.register(
schema=_BROWSER_SCHEMA_MAP["browser_get_images"],
handler=lambda args, **kw: browser_get_images(task_id=kw.get("task_id")),
check_fn=check_browser_requirements,
emoji="🖼️",
)
registry.register(
name="browser_vision",
@@ -1897,6 +1906,7 @@ registry.register(
schema=_BROWSER_SCHEMA_MAP["browser_vision"],
handler=lambda args, **kw: browser_vision(question=args.get("question", ""), annotate=args.get("annotate", False), task_id=kw.get("task_id")),
check_fn=check_browser_requirements,
emoji="👁️",
)
registry.register(
name="browser_console",
@@ -1904,4 +1914,5 @@ registry.register(
schema=_BROWSER_SCHEMA_MAP["browser_console"],
handler=lambda args, **kw: browser_console(clear=args.get("clear", False), task_id=kw.get("task_id")),
check_fn=check_browser_requirements,
emoji="🖥️",
)
+13 -3
View File
@@ -92,10 +92,17 @@ def _run_git(
shadow_repo: Path,
working_dir: str,
timeout: int = _GIT_TIMEOUT,
allowed_returncodes: Optional[Set[int]] = None,
) -> tuple:
"""Run a git command against the shadow repo. Returns (ok, stdout, stderr)."""
"""Run a git command against the shadow repo. Returns (ok, stdout, stderr).
``allowed_returncodes`` suppresses error logging for known/expected non-zero
exits while preserving the normal ``ok = (returncode == 0)`` contract.
Example: ``git diff --cached --quiet`` returns 1 when changes exist.
"""
env = _git_env(shadow_repo, working_dir)
cmd = ["git"] + list(args)
allowed_returncodes = allowed_returncodes or set()
try:
result = subprocess.run(
cmd,
@@ -108,7 +115,7 @@ def _run_git(
ok = result.returncode == 0
stdout = result.stdout.strip()
stderr = result.stderr.strip()
if not ok:
if not ok and result.returncode not in allowed_returncodes:
logger.error(
"Git command failed: %s (rc=%d) stderr=%s",
" ".join(cmd), result.returncode, stderr,
@@ -381,7 +388,10 @@ class CheckpointManager:
# Check if there's anything to commit
ok_diff, diff_out, _ = _run_git(
["diff", "--cached", "--quiet"], shadow, working_dir,
["diff", "--cached", "--quiet"],
shadow,
working_dir,
allowed_returncodes={1},
)
if ok_diff:
# No changes to commit
+1
View File
@@ -137,4 +137,5 @@ registry.register(
choices=args.get("choices"),
callback=kw.get("callback")),
check_fn=check_clarify_requirements,
emoji="",
)
+1
View File
@@ -776,4 +776,5 @@ registry.register(
task_id=kw.get("task_id"),
enabled_tools=kw.get("enabled_tools")),
check_fn=check_sandbox_requirements,
emoji="🐍",
)
+1
View File
@@ -458,4 +458,5 @@ registry.register(
task_id=kw.get("task_id"),
),
check_fn=check_cronjob_requirements,
emoji="",
)
+3 -9
View File
@@ -116,15 +116,8 @@ def _build_child_progress_callback(task_index: int, parent_agent, task_count: in
# Regular tool call event
if spinner:
short = (preview[:35] + "...") if preview and len(preview) > 35 else (preview or "")
tool_emojis = {
"terminal": "💻", "web_search": "🔍", "web_extract": "📄",
"read_file": "📖", "write_file": "✍️", "patch": "🔧",
"search_files": "🔎", "list_directory": "📂",
"browser_navigate": "🌐", "browser_click": "👆",
"text_to_speech": "🔊", "image_generate": "🎨",
"vision_analyze": "👁️", "process": "⚙️",
}
emoji = tool_emojis.get(tool_name, "")
from agent.display import get_tool_emoji
emoji = get_tool_emoji(tool_name)
line = f" {prefix}├─ {emoji} {tool_name}"
if short:
line += f" \"{short}\""
@@ -758,4 +751,5 @@ registry.register(
max_iterations=args.get("max_iterations"),
parent_agent=kw.get("parent_agent")),
check_fn=check_delegate_requirements,
emoji="🔀",
)
+10
View File
@@ -1,6 +1,7 @@
"""SSH remote execution environment with ControlMaster connection persistence."""
import logging
import shutil
import subprocess
import tempfile
import threading
@@ -14,6 +15,14 @@ from tools.interrupt import is_interrupted
logger = logging.getLogger(__name__)
def _ensure_ssh_available() -> None:
"""Fail fast with a clear error when the SSH client is unavailable."""
if not shutil.which("ssh"):
raise RuntimeError(
"SSH is not installed or not in PATH. Install OpenSSH client: apt install openssh-client"
)
class SSHEnvironment(PersistentShellMixin, BaseEnvironment):
"""Run commands on a remote machine over SSH.
@@ -44,6 +53,7 @@ class SSHEnvironment(PersistentShellMixin, BaseEnvironment):
self.control_dir = Path(tempfile.gettempdir()) / "hermes-ssh"
self.control_dir.mkdir(parents=True, exist_ok=True)
self.control_socket = self.control_dir / f"{user}@{host}:{port}.sock"
_ensure_ssh_available()
self._establish_connection()
if self.persistent:
+21 -5
View File
@@ -1,6 +1,7 @@
#!/usr/bin/env python3
"""File Tools Module - LLM agent file manipulation tools."""
import errno
import json
import logging
import os
@@ -11,6 +12,18 @@ from agent.redact import redact_sensitive_text
logger = logging.getLogger(__name__)
_EXPECTED_WRITE_ERRNOS = {errno.EACCES, errno.EPERM, errno.EROFS}
def _is_expected_write_exception(exc: Exception) -> bool:
"""Return True for expected write denials that should not hit error logs."""
if isinstance(exc, PermissionError):
return True
if isinstance(exc, OSError) and exc.errno in _EXPECTED_WRITE_ERRNOS:
return True
return False
_file_ops_lock = threading.Lock()
_file_ops_cache: dict = {}
@@ -257,7 +270,10 @@ def write_file_tool(path: str, content: str, task_id: str = "default") -> str:
result = file_ops.write_file(path, content)
return json.dumps(result.to_dict(), ensure_ascii=False)
except Exception as e:
logger.error("write_file error: %s: %s", type(e).__name__, e)
if _is_expected_write_exception(e):
logger.debug("write_file expected denial: %s: %s", type(e).__name__, e)
else:
logger.error("write_file error: %s: %s", type(e).__name__, e, exc_info=True)
return json.dumps({"error": str(e)}, ensure_ascii=False)
@@ -467,7 +483,7 @@ def _handle_search_files(args, **kw):
output_mode=args.get("output_mode", "content"), context=args.get("context", 0), task_id=tid)
registry.register(name="read_file", toolset="file", schema=READ_FILE_SCHEMA, handler=_handle_read_file, check_fn=_check_file_reqs)
registry.register(name="write_file", toolset="file", schema=WRITE_FILE_SCHEMA, handler=_handle_write_file, check_fn=_check_file_reqs)
registry.register(name="patch", toolset="file", schema=PATCH_SCHEMA, handler=_handle_patch, check_fn=_check_file_reqs)
registry.register(name="search_files", toolset="file", schema=SEARCH_FILES_SCHEMA, handler=_handle_search_files, check_fn=_check_file_reqs)
registry.register(name="read_file", toolset="file", schema=READ_FILE_SCHEMA, handler=_handle_read_file, check_fn=_check_file_reqs, emoji="📖")
registry.register(name="write_file", toolset="file", schema=WRITE_FILE_SCHEMA, handler=_handle_write_file, check_fn=_check_file_reqs, emoji="✍️")
registry.register(name="patch", toolset="file", schema=PATCH_SCHEMA, handler=_handle_patch, check_fn=_check_file_reqs, emoji="🔧")
registry.register(name="search_files", toolset="file", schema=SEARCH_FILES_SCHEMA, handler=_handle_search_files, check_fn=_check_file_reqs, emoji="🔎")
+4
View File
@@ -459,6 +459,7 @@ registry.register(
schema=HA_LIST_ENTITIES_SCHEMA,
handler=_handle_list_entities,
check_fn=_check_ha_available,
emoji="🏠",
)
registry.register(
@@ -467,6 +468,7 @@ registry.register(
schema=HA_GET_STATE_SCHEMA,
handler=_handle_get_state,
check_fn=_check_ha_available,
emoji="🏠",
)
registry.register(
@@ -475,6 +477,7 @@ registry.register(
schema=HA_LIST_SERVICES_SCHEMA,
handler=_handle_list_services,
check_fn=_check_ha_available,
emoji="🏠",
)
registry.register(
@@ -483,4 +486,5 @@ registry.register(
schema=HA_CALL_SERVICE_SCHEMA,
handler=_handle_call_service,
check_fn=_check_ha_available,
emoji="🏠",
)
+4
View File
@@ -222,6 +222,7 @@ registry.register(
schema=_PROFILE_SCHEMA,
handler=_handle_honcho_profile,
check_fn=_check_honcho_available,
emoji="🔮",
)
registry.register(
@@ -230,6 +231,7 @@ registry.register(
schema=_SEARCH_SCHEMA,
handler=_handle_honcho_search,
check_fn=_check_honcho_available,
emoji="🔮",
)
registry.register(
@@ -238,6 +240,7 @@ registry.register(
schema=_QUERY_SCHEMA,
handler=_handle_honcho_context,
check_fn=_check_honcho_available,
emoji="🔮",
)
registry.register(
@@ -246,4 +249,5 @@ registry.register(
schema=_CONCLUDE_SCHEMA,
handler=_handle_honcho_conclude,
check_fn=_check_honcho_available,
emoji="🔮",
)
+1
View File
@@ -558,4 +558,5 @@ registry.register(
check_fn=check_image_generation_requirements,
requires_env=["FAL_KEY"],
is_async=False, # Switched to sync fal_client API to fix "Event loop is closed" in gateway
emoji="🎨",
)
+1
View File
@@ -496,6 +496,7 @@ registry.register(
old_text=args.get("old_text"),
store=kw.get("store")),
check_fn=check_memory_requirements,
emoji="🧠",
)
+1
View File
@@ -544,4 +544,5 @@ registry.register(
check_fn=check_moa_requirements,
requires_env=["OPENROUTER_API_KEY"],
is_async=True,
emoji="🧠",
)
+1
View File
@@ -858,4 +858,5 @@ registry.register(
toolset="terminal",
schema=PROCESS_SCHEMA,
handler=_handle_process,
emoji="⚙️",
)
+10 -2
View File
@@ -26,11 +26,11 @@ class ToolEntry:
__slots__ = (
"name", "toolset", "schema", "handler", "check_fn",
"requires_env", "is_async", "description",
"requires_env", "is_async", "description", "emoji",
)
def __init__(self, name, toolset, schema, handler, check_fn,
requires_env, is_async, description):
requires_env, is_async, description, emoji):
self.name = name
self.toolset = toolset
self.schema = schema
@@ -39,6 +39,7 @@ class ToolEntry:
self.requires_env = requires_env
self.is_async = is_async
self.description = description
self.emoji = emoji
class ToolRegistry:
@@ -62,6 +63,7 @@ class ToolRegistry:
requires_env: list = None,
is_async: bool = False,
description: str = "",
emoji: str = "",
):
"""Register a tool. Called at module-import time by each tool file."""
self._tools[name] = ToolEntry(
@@ -73,6 +75,7 @@ class ToolRegistry:
requires_env=requires_env or [],
is_async=is_async,
description=description or schema.get("description", ""),
emoji=emoji,
)
if check_fn and toolset not in self._toolset_checks:
self._toolset_checks[toolset] = check_fn
@@ -141,6 +144,11 @@ class ToolRegistry:
entry = self._tools.get(name)
return entry.toolset if entry else None
def get_emoji(self, name: str, default: str = "") -> str:
"""Return the emoji for a tool, or *default* if unset."""
entry = self._tools.get(name)
return (entry.emoji if entry and entry.emoji else default)
def get_tool_to_toolset_map(self) -> Dict[str, str]:
"""Return ``{tool_name: toolset_name}`` for every registered tool."""
return {name: e.toolset for name, e in self._tools.items()}
+10 -10
View File
@@ -1374,24 +1374,24 @@ RL_TEST_INFERENCE_SCHEMA = {"name": "rl_test_inference", "description": "Quick i
_rl_env = ["TINKER_API_KEY", "WANDB_API_KEY"]
registry.register(name="rl_list_environments", toolset="rl", schema=RL_LIST_ENVIRONMENTS_SCHEMA,
registry.register(name="rl_list_environments", emoji="🧪", toolset="rl", schema=RL_LIST_ENVIRONMENTS_SCHEMA,
handler=lambda args, **kw: rl_list_environments(), check_fn=check_rl_api_keys, requires_env=_rl_env, is_async=True)
registry.register(name="rl_select_environment", toolset="rl", schema=RL_SELECT_ENVIRONMENT_SCHEMA,
registry.register(name="rl_select_environment", emoji="🧪", toolset="rl", schema=RL_SELECT_ENVIRONMENT_SCHEMA,
handler=lambda args, **kw: rl_select_environment(name=args.get("name", "")), check_fn=check_rl_api_keys, requires_env=_rl_env, is_async=True)
registry.register(name="rl_get_current_config", toolset="rl", schema=RL_GET_CURRENT_CONFIG_SCHEMA,
registry.register(name="rl_get_current_config", emoji="🧪", toolset="rl", schema=RL_GET_CURRENT_CONFIG_SCHEMA,
handler=lambda args, **kw: rl_get_current_config(), check_fn=check_rl_api_keys, requires_env=_rl_env, is_async=True)
registry.register(name="rl_edit_config", toolset="rl", schema=RL_EDIT_CONFIG_SCHEMA,
registry.register(name="rl_edit_config", emoji="🧪", toolset="rl", schema=RL_EDIT_CONFIG_SCHEMA,
handler=lambda args, **kw: rl_edit_config(field=args.get("field", ""), value=args.get("value")), check_fn=check_rl_api_keys, requires_env=_rl_env, is_async=True)
registry.register(name="rl_start_training", toolset="rl", schema=RL_START_TRAINING_SCHEMA,
registry.register(name="rl_start_training", emoji="🧪", toolset="rl", schema=RL_START_TRAINING_SCHEMA,
handler=lambda args, **kw: rl_start_training(), check_fn=check_rl_api_keys, requires_env=_rl_env, is_async=True)
registry.register(name="rl_check_status", toolset="rl", schema=RL_CHECK_STATUS_SCHEMA,
registry.register(name="rl_check_status", emoji="🧪", toolset="rl", schema=RL_CHECK_STATUS_SCHEMA,
handler=lambda args, **kw: rl_check_status(run_id=args.get("run_id", "")), check_fn=check_rl_api_keys, requires_env=_rl_env, is_async=True)
registry.register(name="rl_stop_training", toolset="rl", schema=RL_STOP_TRAINING_SCHEMA,
registry.register(name="rl_stop_training", emoji="🧪", toolset="rl", schema=RL_STOP_TRAINING_SCHEMA,
handler=lambda args, **kw: rl_stop_training(run_id=args.get("run_id", "")), check_fn=check_rl_api_keys, requires_env=_rl_env, is_async=True)
registry.register(name="rl_get_results", toolset="rl", schema=RL_GET_RESULTS_SCHEMA,
registry.register(name="rl_get_results", emoji="🧪", toolset="rl", schema=RL_GET_RESULTS_SCHEMA,
handler=lambda args, **kw: rl_get_results(run_id=args.get("run_id", "")), check_fn=check_rl_api_keys, requires_env=_rl_env, is_async=True)
registry.register(name="rl_list_runs", toolset="rl", schema=RL_LIST_RUNS_SCHEMA,
registry.register(name="rl_list_runs", emoji="🧪", toolset="rl", schema=RL_LIST_RUNS_SCHEMA,
handler=lambda args, **kw: rl_list_runs(), check_fn=check_rl_api_keys, requires_env=_rl_env, is_async=True)
registry.register(name="rl_test_inference", toolset="rl", schema=RL_TEST_INFERENCE_SCHEMA,
registry.register(name="rl_test_inference", emoji="🧪", toolset="rl", schema=RL_TEST_INFERENCE_SCHEMA,
handler=lambda args, **kw: rl_test_inference(num_steps=args.get("num_steps", 3), group_size=args.get("group_size", 16), models=args.get("models")),
check_fn=check_rl_api_keys, requires_env=_rl_env, is_async=True)
+1
View File
@@ -512,4 +512,5 @@ registry.register(
schema=SEND_MESSAGE_SCHEMA,
handler=send_message_tool,
check_fn=_check_send_message,
emoji="📨",
)
+1
View File
@@ -385,4 +385,5 @@ registry.register(
db=kw.get("db"),
current_session_id=kw.get("current_session_id")),
check_fn=check_session_search_requirements,
emoji="🔍",
)
+1
View File
@@ -653,4 +653,5 @@ registry.register(
old_string=args.get("old_string"),
new_string=args.get("new_string"),
replace_all=args.get("replace_all", False)),
emoji="📝",
)
+2
View File
@@ -1261,6 +1261,7 @@ registry.register(
category=args.get("category"), task_id=kw.get("task_id")
),
check_fn=check_skills_requirements,
emoji="📚",
)
registry.register(
name="skill_view",
@@ -1270,4 +1271,5 @@ registry.register(
args.get("name", ""), file_path=args.get("file_path"), task_id=kw.get("task_id")
),
check_fn=check_skills_requirements,
emoji="📚",
)
+8 -1
View File
@@ -505,7 +505,13 @@ def _get_env_config() -> Dict[str, Any]:
"ssh_user": os.getenv("TERMINAL_SSH_USER", ""),
"ssh_port": _parse_env_var("TERMINAL_SSH_PORT", "22"),
"ssh_key": os.getenv("TERMINAL_SSH_KEY", ""),
"ssh_persistent": os.getenv("TERMINAL_SSH_PERSISTENT", "false").lower() in ("true", "1", "yes"),
# Persistent shell: SSH defaults to the config-level persistent_shell
# setting (true by default for non-local backends); local is always opt-in.
# Per-backend env vars override if explicitly set.
"ssh_persistent": os.getenv(
"TERMINAL_SSH_PERSISTENT",
os.getenv("TERMINAL_PERSISTENT_SHELL", "true"),
).lower() in ("true", "1", "yes"),
"local_persistent": os.getenv("TERMINAL_LOCAL_PERSISTENT", "false").lower() in ("true", "1", "yes"),
# Container resource config (applies to docker, singularity, modal, daytona -- ignored for local/ssh)
"container_cpu": _parse_env_var("TERMINAL_CONTAINER_CPU", "1", float, "number"),
@@ -1333,4 +1339,5 @@ registry.register(
schema=TERMINAL_SCHEMA,
handler=_handle_terminal,
check_fn=check_terminal_requirements,
emoji="💻",
)
+18 -13
View File
@@ -279,7 +279,7 @@ def _verify_checksum(archive_path: str, checksums_path: str, archive_name: str)
return True
def _install_tirith() -> tuple[str | None, str]:
def _install_tirith(*, log_failures: bool = True) -> tuple[str | None, str]:
"""Download and install tirith to $HERMES_HOME/bin/tirith.
Verifies provenance via cosign and SHA-256 checksum.
@@ -287,6 +287,8 @@ def _install_tirith() -> tuple[str | None, str]:
failure_reason is a short tag used by the disk marker to decide if the
failure is retryable (e.g. "cosign_missing" clears when cosign appears).
"""
log = logger.warning if log_failures else logger.debug
target = _detect_target()
if not target:
logger.info("tirith auto-install: unsupported platform %s/%s",
@@ -309,7 +311,7 @@ def _install_tirith() -> tuple[str | None, str]:
_download_file(f"{base_url}/{archive_name}", archive_path)
_download_file(f"{base_url}/checksums.txt", checksums_path)
except Exception as exc:
logger.warning("tirith download failed: %s", exc)
log("tirith download failed: %s", exc)
return None, "download_failed"
# Cosign provenance verification is mandatory for auto-install.
@@ -320,25 +322,25 @@ def _install_tirith() -> tuple[str | None, str]:
_download_file(f"{base_url}/checksums.txt.sig", sig_path)
_download_file(f"{base_url}/checksums.txt.pem", cert_path)
except Exception as exc:
logger.warning("tirith install skipped: cosign artifacts unavailable (%s). "
"Install tirith manually or install cosign for auto-install.", exc)
log("tirith install skipped: cosign artifacts unavailable (%s). "
"Install tirith manually or install cosign for auto-install.", exc)
return None, "cosign_artifacts_unavailable"
# Check cosign availability before attempting verification so we can
# distinguish "not installed" (retryable) from "installed but broken."
if not shutil.which("cosign"):
logger.warning("tirith install skipped: cosign not found on PATH. "
"Install cosign for auto-install, or install tirith manually.")
log("tirith install skipped: cosign not found on PATH. "
"Install cosign for auto-install, or install tirith manually.")
return None, "cosign_missing"
cosign_result = _verify_cosign(checksums_path, sig_path, cert_path)
if cosign_result is not True:
# False = verification rejected, None = execution failure (timeout/OSError)
if cosign_result is None:
logger.warning("tirith install aborted: cosign execution failed")
log("tirith install aborted: cosign execution failed")
return None, "cosign_exec_failed"
else:
logger.warning("tirith install aborted: cosign provenance verification failed")
log("tirith install aborted: cosign provenance verification failed")
return None, "cosign_verification_failed"
if not _verify_checksum(archive_path, checksums_path, archive_name):
@@ -354,7 +356,7 @@ def _install_tirith() -> tuple[str | None, str]:
tar.extract(member, tmpdir)
break
else:
logger.warning("tirith binary not found in archive")
log("tirith binary not found in archive")
return None, "binary_not_in_archive"
src = os.path.join(tmpdir, "tirith")
@@ -473,7 +475,7 @@ def _resolve_tirith_path(configured_path: str) -> str:
return expanded
def _background_install():
def _background_install(*, log_failures: bool = True):
"""Background thread target: download and install tirith."""
global _resolved_path, _install_failure_reason
with _install_lock:
@@ -494,7 +496,7 @@ def _background_install():
_install_failure_reason = ""
return
installed, reason = _install_tirith()
installed, reason = _install_tirith(log_failures=log_failures)
if installed:
_resolved_path = installed
_install_failure_reason = ""
@@ -505,7 +507,7 @@ def _background_install():
_mark_install_failed(reason)
def ensure_installed():
def ensure_installed(*, log_failures: bool = True):
"""Ensure tirith is available, downloading in background if needed.
Quick PATH/local checks are synchronous; network download runs in a
@@ -578,7 +580,10 @@ def ensure_installed():
# Need to download — launch background thread so startup doesn't block
if _install_thread is None or not _install_thread.is_alive():
_install_thread = threading.Thread(
target=_background_install, daemon=True)
target=_background_install,
kwargs={"log_failures": log_failures},
daemon=True,
)
_install_thread.start()
return None # Not available yet; commands will fail-open until ready
+1
View File
@@ -264,4 +264,5 @@ registry.register(
handler=lambda args, **kw: todo_tool(
todos=args.get("todos"), merge=args.get("merge", False), store=kw.get("store")),
check_fn=check_todo_requirements,
emoji="📋",
)
+183 -9
View File
@@ -25,6 +25,10 @@ Usage::
import logging
import os
import shlex
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Optional, Dict, Any
@@ -44,13 +48,18 @@ _HAS_OPENAI = _ilu.find_spec("openai") is not None
DEFAULT_PROVIDER = "local"
DEFAULT_LOCAL_MODEL = "base"
DEFAULT_LOCAL_STT_LANGUAGE = "en"
DEFAULT_STT_MODEL = os.getenv("STT_OPENAI_MODEL", "whisper-1")
DEFAULT_GROQ_STT_MODEL = os.getenv("STT_GROQ_MODEL", "whisper-large-v3-turbo")
LOCAL_STT_COMMAND_ENV = "HERMES_LOCAL_STT_COMMAND"
LOCAL_STT_LANGUAGE_ENV = "HERMES_LOCAL_STT_LANGUAGE"
COMMON_LOCAL_BIN_DIRS = ("/opt/homebrew/bin", "/usr/local/bin")
GROQ_BASE_URL = os.getenv("GROQ_BASE_URL", "https://api.groq.com/openai/v1")
OPENAI_BASE_URL = os.getenv("STT_OPENAI_BASE_URL", "https://api.openai.com/v1")
SUPPORTED_FORMATS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm", ".ogg"}
LOCAL_NATIVE_AUDIO_FORMATS = {".wav", ".aiff", ".aif"}
MAX_FILE_SIZE = 25 * 1024 * 1024 # 25 MB
# Known model sets for auto-correction
@@ -105,6 +114,53 @@ def is_stt_enabled(stt_config: Optional[dict] = None) -> bool:
return bool(enabled)
def _resolve_openai_api_key() -> str:
"""Prefer the voice-tools key, but fall back to the normal OpenAI key."""
return os.getenv("VOICE_TOOLS_OPENAI_KEY", "") or os.getenv("OPENAI_API_KEY", "")
def _find_binary(binary_name: str) -> Optional[str]:
"""Find a local binary, checking common Homebrew/local prefixes as well as PATH."""
for directory in COMMON_LOCAL_BIN_DIRS:
candidate = Path(directory) / binary_name
if candidate.exists() and os.access(candidate, os.X_OK):
return str(candidate)
return shutil.which(binary_name)
def _find_ffmpeg_binary() -> Optional[str]:
return _find_binary("ffmpeg")
def _find_whisper_binary() -> Optional[str]:
return _find_binary("whisper")
def _get_local_command_template() -> Optional[str]:
configured = os.getenv(LOCAL_STT_COMMAND_ENV, "").strip()
if configured:
return configured
whisper_binary = _find_whisper_binary()
if whisper_binary:
quoted_binary = shlex.quote(whisper_binary)
return (
f"{quoted_binary} {{input_path}} --model {{model}} --output_format txt "
"--output_dir {output_dir} --language {language}"
)
return None
def _has_local_command() -> bool:
return _get_local_command_template() is not None
def _normalize_local_command_model(model_name: Optional[str]) -> str:
if not model_name or model_name in OPENAI_MODELS or model_name in GROQ_MODELS:
return DEFAULT_LOCAL_MODEL
return model_name
def _get_provider(stt_config: dict) -> str:
"""Determine which STT provider to use.
@@ -121,15 +177,32 @@ def _get_provider(stt_config: dict) -> str:
if provider == "local":
if _HAS_FASTER_WHISPER:
return "local"
if _has_local_command():
logger.info("faster-whisper not installed, falling back to local STT command")
return "local_command"
# Local requested but not available — fall back to groq, then openai
if _HAS_OPENAI and os.getenv("GROQ_API_KEY"):
logger.info("faster-whisper not installed, falling back to Groq Whisper API")
return "groq"
if _HAS_OPENAI and os.getenv("VOICE_TOOLS_OPENAI_KEY"):
if _HAS_OPENAI and _resolve_openai_api_key():
logger.info("faster-whisper not installed, falling back to OpenAI Whisper API")
return "openai"
return "none"
if provider == "local_command":
if _has_local_command():
return "local_command"
if _HAS_FASTER_WHISPER:
logger.info("Local STT command unavailable, falling back to local faster-whisper")
return "local"
if _HAS_OPENAI and os.getenv("GROQ_API_KEY"):
logger.info("Local STT command unavailable, falling back to Groq Whisper API")
return "groq"
if _HAS_OPENAI and _resolve_openai_api_key():
logger.info("Local STT command unavailable, falling back to OpenAI Whisper API")
return "openai"
return "none"
if provider == "groq":
if _HAS_OPENAI and os.getenv("GROQ_API_KEY"):
return "groq"
@@ -137,20 +210,26 @@ def _get_provider(stt_config: dict) -> str:
if _HAS_FASTER_WHISPER:
logger.info("GROQ_API_KEY not set, falling back to local faster-whisper")
return "local"
if _HAS_OPENAI and os.getenv("VOICE_TOOLS_OPENAI_KEY"):
if _has_local_command():
logger.info("GROQ_API_KEY not set, falling back to local STT command")
return "local_command"
if _HAS_OPENAI and _resolve_openai_api_key():
logger.info("GROQ_API_KEY not set, falling back to OpenAI Whisper API")
return "openai"
return "none"
if provider == "openai":
if _HAS_OPENAI and os.getenv("VOICE_TOOLS_OPENAI_KEY"):
if _HAS_OPENAI and _resolve_openai_api_key():
return "openai"
# OpenAI requested but no key — fall back
if _HAS_FASTER_WHISPER:
logger.info("VOICE_TOOLS_OPENAI_KEY not set, falling back to local faster-whisper")
logger.info("OpenAI STT key not set, falling back to local faster-whisper")
return "local"
if _has_local_command():
logger.info("OpenAI STT key not set, falling back to local STT command")
return "local_command"
if _HAS_OPENAI and os.getenv("GROQ_API_KEY"):
logger.info("VOICE_TOOLS_OPENAI_KEY not set, falling back to Groq Whisper API")
logger.info("OpenAI STT key not set, falling back to Groq Whisper API")
return "groq"
return "none"
@@ -222,6 +301,89 @@ def _transcribe_local(file_path: str, model_name: str) -> Dict[str, Any]:
logger.error("Local transcription failed: %s", e, exc_info=True)
return {"success": False, "transcript": "", "error": f"Local transcription failed: {e}"}
def _prepare_local_audio(file_path: str, work_dir: str) -> tuple[Optional[str], Optional[str]]:
"""Normalize audio for local CLI STT when needed."""
audio_path = Path(file_path)
if audio_path.suffix.lower() in LOCAL_NATIVE_AUDIO_FORMATS:
return file_path, None
ffmpeg = _find_ffmpeg_binary()
if not ffmpeg:
return None, "Local STT fallback requires ffmpeg for non-WAV inputs, but ffmpeg was not found"
converted_path = os.path.join(work_dir, f"{audio_path.stem}.wav")
command = [ffmpeg, "-y", "-i", file_path, converted_path]
try:
subprocess.run(command, check=True, capture_output=True, text=True)
return converted_path, None
except subprocess.CalledProcessError as e:
details = e.stderr.strip() or e.stdout.strip() or str(e)
logger.error("ffmpeg conversion failed for %s: %s", file_path, details)
return None, f"Failed to convert audio for local STT: {details}"
def _transcribe_local_command(file_path: str, model_name: str) -> Dict[str, Any]:
"""Run the configured local STT command template and read back a .txt transcript."""
command_template = _get_local_command_template()
if not command_template:
return {
"success": False,
"transcript": "",
"error": (
f"{LOCAL_STT_COMMAND_ENV} not configured and no local whisper binary was found"
),
}
language = os.getenv(LOCAL_STT_LANGUAGE_ENV, DEFAULT_LOCAL_STT_LANGUAGE)
normalized_model = _normalize_local_command_model(model_name)
try:
with tempfile.TemporaryDirectory(prefix="hermes-local-stt-") as output_dir:
prepared_input, prep_error = _prepare_local_audio(file_path, output_dir)
if prep_error:
return {"success": False, "transcript": "", "error": prep_error}
command = command_template.format(
input_path=shlex.quote(prepared_input),
output_dir=shlex.quote(output_dir),
language=shlex.quote(language),
model=shlex.quote(normalized_model),
)
subprocess.run(command, shell=True, check=True, capture_output=True, text=True)
txt_files = sorted(Path(output_dir).glob("*.txt"))
if not txt_files:
return {
"success": False,
"transcript": "",
"error": "Local STT command completed but did not produce a .txt transcript",
}
transcript_text = txt_files[0].read_text(encoding="utf-8").strip()
logger.info(
"Transcribed %s via local STT command (%s, %d chars)",
Path(file_path).name,
normalized_model,
len(transcript_text),
)
return {"success": True, "transcript": transcript_text, "provider": "local_command"}
except KeyError as e:
return {
"success": False,
"transcript": "",
"error": f"Invalid {LOCAL_STT_COMMAND_ENV} template, missing placeholder: {e}",
}
except subprocess.CalledProcessError as e:
details = e.stderr.strip() or e.stdout.strip() or str(e)
logger.error("Local STT command failed for %s: %s", file_path, details)
return {"success": False, "transcript": "", "error": f"Local STT failed: {details}"}
except Exception as e:
logger.error("Unexpected error during local command transcription: %s", e, exc_info=True)
return {"success": False, "transcript": "", "error": f"Local transcription failed: {e}"}
# ---------------------------------------------------------------------------
# Provider: groq (Whisper API — free tier)
# ---------------------------------------------------------------------------
@@ -277,9 +439,13 @@ def _transcribe_groq(file_path: str, model_name: str) -> Dict[str, Any]:
def _transcribe_openai(file_path: str, model_name: str) -> Dict[str, Any]:
"""Transcribe using OpenAI Whisper API (paid)."""
api_key = os.getenv("VOICE_TOOLS_OPENAI_KEY")
api_key = _resolve_openai_api_key()
if not api_key:
return {"success": False, "transcript": "", "error": "VOICE_TOOLS_OPENAI_KEY not set"}
return {
"success": False,
"transcript": "",
"error": "Neither VOICE_TOOLS_OPENAI_KEY nor OPENAI_API_KEY is set",
}
if not _HAS_OPENAI:
return {"success": False, "transcript": "", "error": "openai package not installed"}
@@ -363,6 +529,13 @@ def transcribe_audio(file_path: str, model: Optional[str] = None) -> Dict[str, A
model_name = model or local_cfg.get("model", DEFAULT_LOCAL_MODEL)
return _transcribe_local(file_path, model_name)
if provider == "local_command":
local_cfg = stt_config.get("local", {})
model_name = _normalize_local_command_model(
model or local_cfg.get("model", DEFAULT_LOCAL_MODEL)
)
return _transcribe_local_command(file_path, model_name)
if provider == "groq":
model_name = model or DEFAULT_GROQ_STT_MODEL
return _transcribe_groq(file_path, model_name)
@@ -378,7 +551,8 @@ def transcribe_audio(file_path: str, model: Optional[str] = None) -> Dict[str, A
"transcript": "",
"error": (
"No STT provider available. Install faster-whisper for free local "
"transcription, set GROQ_API_KEY for free Groq Whisper, "
"or set VOICE_TOOLS_OPENAI_KEY for the OpenAI Whisper API."
f"transcription, configure {LOCAL_STT_COMMAND_ENV} or install a local whisper CLI, "
"set GROQ_API_KEY for free Groq Whisper, or set VOICE_TOOLS_OPENAI_KEY "
"or OPENAI_API_KEY for the OpenAI Whisper API."
),
}
+1
View File
@@ -743,4 +743,5 @@ registry.register(
text=args.get("text", ""),
output_path=args.get("output_path")),
check_fn=check_tts_requirements,
emoji="🔊",
)
+1
View File
@@ -493,4 +493,5 @@ registry.register(
handler=_handle_vision_analyze,
check_fn=check_vision_requirements,
is_async=True,
emoji="👁️",
)
+2
View File
@@ -1258,6 +1258,7 @@ registry.register(
handler=lambda args, **kw: web_search_tool(args.get("query", ""), limit=5),
check_fn=check_firecrawl_api_key,
requires_env=["FIRECRAWL_API_KEY"],
emoji="🔍",
)
registry.register(
name="web_extract",
@@ -1268,4 +1269,5 @@ registry.register(
check_fn=check_firecrawl_api_key,
requires_env=["FIRECRAWL_API_KEY"],
is_async=True,
emoji="📄",
)
+4
View File
@@ -50,6 +50,8 @@ def atomic_json_write(
os.fsync(f.fileno())
os.replace(tmp_path, path)
except BaseException:
# Intentionally catch BaseException so temp-file cleanup still runs for
# KeyboardInterrupt/SystemExit before re-raising the original signal.
try:
os.unlink(tmp_path)
except OSError:
@@ -96,6 +98,8 @@ def atomic_yaml_write(
os.fsync(f.fileno())
os.replace(tmp_path, path)
except BaseException:
# Match atomic_json_write: cleanup must also happen for process-level
# interruptions before we re-raise them.
try:
os.unlink(tmp_path)
except OSError:
@@ -31,7 +31,9 @@ All variables go in `~/.hermes/.env`. You can also set them with `hermes config
| `CLAUDE_CODE_OAUTH_TOKEN` | Explicit Claude Code token override if you export one manually |
| `HERMES_MODEL` | Preferred model name (checked before `LLM_MODEL`, used by gateway) |
| `LLM_MODEL` | Default model name (fallback when not set in config.yaml) |
| `VOICE_TOOLS_OPENAI_KEY` | OpenAI key for OpenAI speech-to-text and text-to-speech providers |
| `VOICE_TOOLS_OPENAI_KEY` | Preferred OpenAI key for OpenAI speech-to-text and text-to-speech providers |
| `HERMES_LOCAL_STT_COMMAND` | Optional local speech-to-text command template. Supports `{input_path}`, `{output_dir}`, `{language}`, and `{model}` placeholders |
| `HERMES_LOCAL_STT_LANGUAGE` | Default language passed to `HERMES_LOCAL_STT_COMMAND` or auto-detected local `whisper` CLI fallback (default: `en`) |
| `HERMES_HOME` | Override Hermes config directory (default: `~/.hermes`) |
## Provider Auth (OAuth)
@@ -93,6 +95,7 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
| `TERMINAL_SSH_USER` | SSH username |
| `TERMINAL_SSH_PORT` | SSH port (default: 22) |
| `TERMINAL_SSH_KEY` | Path to private key |
| `TERMINAL_SSH_PERSISTENT` | Override persistent shell for SSH (default: follows `TERMINAL_PERSISTENT_SHELL`) |
## Container Resources (Docker, Singularity, Modal, Daytona)
@@ -104,6 +107,14 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
| `TERMINAL_CONTAINER_PERSISTENT` | Persist container filesystem across sessions (default: `true`) |
| `TERMINAL_SANDBOX_DIR` | Host directory for workspaces and overlays (default: `~/.hermes/sandboxes/`) |
## Persistent Shell
| Variable | Description |
|----------|-------------|
| `TERMINAL_PERSISTENT_SHELL` | Enable persistent shell for non-local backends (default: `true`). Also settable via `terminal.persistent_shell` in config.yaml |
| `TERMINAL_LOCAL_PERSISTENT` | Enable persistent shell for local backend (default: `false`) |
| `TERMINAL_SSH_PERSISTENT` | Override persistent shell for SSH backend (default: follows `TERMINAL_PERSISTENT_SHELL`) |
## Messaging
| Variable | Description |
+58
View File
@@ -462,6 +462,9 @@ terminal:
container_memory: 5120 # MB (default 5GB)
container_disk: 51200 # MB (default 50GB)
container_persistent: true # Persist filesystem across sessions
# Persistent shell — keep a long-lived bash process across commands
persistent_shell: true # Enabled by default for SSH backend
```
### Common Terminal Backend Issues
@@ -517,6 +520,46 @@ This is useful for:
Can also be set via environment variable: `TERMINAL_DOCKER_VOLUMES='["/host:/container"]'` (JSON array).
### Persistent Shell
By default, each terminal command runs in its own subprocess — working directory, environment variables, and shell variables reset between commands. When **persistent shell** is enabled, a single long-lived bash process is kept alive across `execute()` calls so that state survives between commands.
This is most useful for the **SSH backend**, where it also eliminates per-command connection overhead. Persistent shell is **enabled by default for SSH** and disabled for the local backend.
```yaml
terminal:
persistent_shell: true # default — enables persistent shell for SSH
```
To disable:
```bash
hermes config set terminal.persistent_shell false
```
**What persists across commands:**
- Working directory (`cd /tmp` sticks for the next command)
- Exported environment variables (`export FOO=bar`)
- Shell variables (`MY_VAR=hello`)
**Precedence:**
| Level | Variable | Default |
|-------|----------|---------|
| Config | `terminal.persistent_shell` | `true` |
| SSH override | `TERMINAL_SSH_PERSISTENT` | follows config |
| Local override | `TERMINAL_LOCAL_PERSISTENT` | `false` |
Per-backend environment variables take highest precedence. If you want persistent shell on the local backend too:
```bash
export TERMINAL_LOCAL_PERSISTENT=true
```
:::note
Commands that require `stdin_data` or sudo automatically fall back to one-shot mode, since the persistent shell's stdin is already occupied by the IPC protocol.
:::
See [Code Execution](features/code-execution.md) and the [Terminal section of the README](features/tools.md) for details on each backend.
## Memory Configuration
@@ -805,6 +848,21 @@ voice:
Use `/voice on` in the CLI to enable microphone mode, `record_key` to start/stop recording, and `/voice tts` to toggle spoken replies. See [Voice Mode](/docs/user-guide/features/voice-mode) for end-to-end setup and platform-specific behavior.
## Group Chat Session Isolation
Control whether shared chats keep one conversation per room or one conversation per participant:
```yaml
group_sessions_per_user: true # true = per-user isolation in groups/channels, false = one shared session per chat
```
- `true` is the default and recommended setting. In Discord channels, Telegram groups, Slack channels, and similar shared contexts, each sender gets their own session when the platform provides a user ID.
- `false` reverts to the old shared-room behavior. That can be useful if you explicitly want Hermes to treat a channel like one collaborative conversation, but it also means users share context, token costs, and interrupt state.
- Direct messages are unaffected. Hermes still keys DMs by chat/DM ID as usual.
- Threads stay isolated from their parent channel either way; with `true`, each participant also gets their own session inside the thread.
For the behavior details and examples, see [Sessions](/docs/user-guide/sessions) and the [Discord guide](/docs/user-guide/messaging/discord).
## Quick Commands
Define custom commands that run shell commands without invoking the LLM — zero token usage, instant execution. Especially useful from messaging platforms (Telegram, Discord, etc.) for quick server checks or utility scripts.
+13 -7
View File
@@ -74,10 +74,11 @@ Voice messages sent on Telegram, Discord, WhatsApp, Slack, or Signal are automat
| Provider | Quality | Cost | API Key |
|----------|---------|------|---------|
| **Local Whisper** (default) | Good | Free | None needed |
| **OpenAI Whisper API** | GoodBest | Paid | `VOICE_TOOLS_OPENAI_KEY` |
| **Groq Whisper API** | GoodBest | Free tier | `GROQ_API_KEY` |
| **OpenAI Whisper API** | GoodBest | Paid | `VOICE_TOOLS_OPENAI_KEY` or `OPENAI_API_KEY` |
:::info Zero Config
Local transcription works out of the box — no API key needed. The `faster-whisper` model (~150 MB for `base`) is auto-downloaded on first voice message.
Local transcription works out of the box when `faster-whisper` is installed. If that's unavailable, Hermes can also use a local `whisper` CLI from common install locations (like `/opt/homebrew/bin`) or a custom command via `HERMES_LOCAL_STT_COMMAND`.
:::
### Configuration
@@ -85,7 +86,7 @@ Local transcription works out of the box — no API key needed. The `faster-whis
```yaml
# In ~/.hermes/config.yaml
stt:
provider: "local" # "local" (free, faster-whisper) | "openai" (API)
provider: "local" # "local" | "groq" | "openai"
local:
model: "base" # tiny, base, small, medium, large-v3
openai:
@@ -104,11 +105,16 @@ stt:
| `medium` | ~1.5 GB | Slower | Great |
| `large-v3` | ~3 GB | Slowest | Best |
**OpenAI API** — Requires `VOICE_TOOLS_OPENAI_KEY`. Supports `whisper-1`, `gpt-4o-mini-transcribe`, and `gpt-4o-transcribe`.
**Groq API** — Requires `GROQ_API_KEY`. Good cloud fallback when you want a free hosted STT option.
**OpenAI API** — Accepts `VOICE_TOOLS_OPENAI_KEY` first and falls back to `OPENAI_API_KEY`. Supports `whisper-1`, `gpt-4o-mini-transcribe`, and `gpt-4o-transcribe`.
**Custom local CLI fallback** — Set `HERMES_LOCAL_STT_COMMAND` if you want Hermes to call a local transcription command directly. The command template supports `{input_path}`, `{output_dir}`, `{language}`, and `{model}` placeholders.
### Fallback Behavior
If your configured provider isn't available, Hermes automatically falls back:
- **Local not installed**Falls back to OpenAI API (if key is set)
- **OpenAI key not set** → Falls back to local Whisper (if installed)
- **Neither available** → Voice messages pass through with a note to the user
- **Local faster-whisper unavailable**Tries a local `whisper` CLI or `HERMES_LOCAL_STT_COMMAND` before cloud providers
- **Groq key not set** → Falls back to local transcription, then OpenAI
- **OpenAI key not set** → Falls back to local transcription, then Groq
- **Nothing available** → Voice messages pass through with an accurate note to the user
+84 -4
View File
@@ -14,15 +14,71 @@ Before setup, here's the part most people want to know: how Hermes behaves once
| Context | Behavior |
|---------|----------|
| **DMs** | Hermes responds to every message. No `@mention` needed. |
| **DMs** | Hermes responds to every message. No `@mention` needed. Each DM has its own session. |
| **Server channels** | By default, Hermes only responds when you `@mention` it. If you post in a channel without mentioning it, Hermes ignores the message. |
| **Free-response channels** | You can make specific channels mention-free with `DISCORD_FREE_RESPONSE_CHANNELS`, or disable mentions globally with `DISCORD_REQUIRE_MENTION=false`. |
| **Threads** | Hermes replies in the same thread. Mention rules still apply unless that thread or its parent channel is configured as free-response. |
| **Threads** | Hermes replies in the same thread. Mention rules still apply unless that thread or its parent channel is configured as free-response. Threads stay isolated from the parent channel for session history. |
| **Shared channels with multiple users** | By default, Hermes isolates session history per user inside the channel for safety and clarity. Two people talking in the same channel do not share one transcript unless you explicitly disable that. |
:::tip
If you want a normal shared bot channel where people can talk to Hermes without tagging it every time, add that channel to `DISCORD_FREE_RESPONSE_CHANNELS`.
If you want a normal bot-help channel where people can talk to Hermes without tagging it every time, add that channel to `DISCORD_FREE_RESPONSE_CHANNELS`.
:::
### Discord Gateway Model
Hermes on Discord is not a webhook that replies statelessly. It runs through the full messaging gateway, which means each incoming message goes through:
1. authorization (`DISCORD_ALLOWED_USERS`)
2. mention / free-response checks
3. session lookup
4. session transcript loading
5. normal Hermes agent execution, including tools, memory, and slash commands
6. response delivery back to Discord
That matters because behavior in a busy server depends on both Discord routing and Hermes session policy.
### Session Model in Discord
By default:
- each DM gets its own session
- each server thread gets its own session namespace
- each user in a shared channel gets their own session inside that channel
So if Alice and Bob both talk to Hermes in `#research`, Hermes treats those as separate conversations by default even though they are using the same visible Discord channel.
This is controlled by `config.yaml`:
```yaml
group_sessions_per_user: true
```
Set it to `false` only if you explicitly want one shared conversation for the entire room:
```yaml
group_sessions_per_user: false
```
Shared sessions can be useful for a collaborative room, but they also mean:
- users share context growth and token costs
- one person's long tool-heavy task can bloat everyone else's context
- one person's in-flight run can interrupt another person's follow-up in the same room
### Interrupts and Concurrency
Hermes tracks running agents by session key.
With the default `group_sessions_per_user: true`:
- Alice interrupting her own in-flight request only affects Alice's session in that channel
- Bob can keep talking in the same channel without inheriting Alice's history or interrupting Alice's run
With `group_sessions_per_user: false`:
- the whole room shares one running-agent slot for that channel/thread
- follow-up messages from different people can interrupt or queue behind each other
This guide walks you through the full setup process — from creating your bot on Discord's Developer Portal to sending your first message.
## Step 1: Create a Discord Application
@@ -175,13 +231,25 @@ Add the following to your `~/.hermes/.env` file:
```bash
# Required
DISCORD_BOT_TOKEN=your-bot-token-from-developer-portal
DISCORD_BOT_TOKEN=your-bot-token
DISCORD_ALLOWED_USERS=284102345871466496
# Multiple allowed users (comma-separated)
# DISCORD_ALLOWED_USERS=284102345871466496,198765432109876543
```
Optional behavior settings in `~/.hermes/config.yaml`:
```yaml
discord:
require_mention: true
group_sessions_per_user: true
```
- `discord.require_mention: true` keeps Hermes quiet in normal server traffic unless mentioned
- `group_sessions_per_user: true` keeps each participant's context isolated inside shared channels and threads
### Start the Gateway
Once configured, start the Discord gateway:
@@ -265,6 +333,18 @@ For the full setup and operational guide, see:
**Fix**: Add your User ID to `DISCORD_ALLOWED_USERS` in `~/.hermes/.env` and restart the gateway.
### People in the same channel are sharing context unexpectedly
**Cause**: `group_sessions_per_user` is disabled, or the platform cannot provide a user ID for the messages in that context.
**Fix**: Set this in `~/.hermes/config.yaml` and restart the gateway:
```yaml
group_sessions_per_user: true
```
If you intentionally want a shared room conversation, leave it off — just expect shared transcript history and shared interrupt behavior.
## Security
:::warning
+25 -10
View File
@@ -299,17 +299,32 @@ The agent is prompted to use session search automatically:
On messaging platforms, sessions are keyed by a deterministic session key built from the message source:
| Chat Type | Key Format | Example |
|-----------|-----------|---------|
| Telegram DM | `agent:main:telegram:dm` | One session per bot |
| Discord DM | `agent:main:discord:dm` | One session per bot |
| WhatsApp DM | `agent:main:whatsapp:dm:<chat_id>` | Per-user (multi-user) |
| Group chat | `agent:main:<platform>:group:<chat_id>` | Per-group |
| Channel | `agent:main:<platform>:channel:<chat_id>` | Per-channel |
| Chat Type | Default Key Format | Behavior |
|-----------|--------------------|----------|
| Telegram DM | `agent:main:telegram:dm:<chat_id>` | One session per DM chat |
| Discord DM | `agent:main:discord:dm:<chat_id>` | One session per DM chat |
| WhatsApp DM | `agent:main:whatsapp:dm:<chat_id>` | One session per DM chat |
| Group chat | `agent:main:<platform>:group:<chat_id>:<user_id>` | Per-user inside the group when the platform exposes a user ID |
| Group thread/topic | `agent:main:<platform>:group:<chat_id>:<thread_id>:<user_id>` | Per-user inside that thread/topic |
| Channel | `agent:main:<platform>:channel:<chat_id>:<user_id>` | Per-user inside the channel when the platform exposes a user ID |
:::info
WhatsApp DMs include the chat ID in the session key because multiple users can DM the bot. Other platforms use a single DM session since the bot is configured per-user via allowlists.
:::
When Hermes cannot get a participant identifier for a shared chat, it falls back to one shared session for that room.
### Shared vs Isolated Group Sessions
By default, Hermes uses `group_sessions_per_user: true` in `config.yaml`. That means:
- Alice and Bob can both talk to Hermes in the same Discord channel without sharing transcript history
- one user's long tool-heavy task does not pollute another user's context window
- interrupt handling also stays per-user because the running-agent key matches the isolated session key
If you want one shared "room brain" instead, set:
```yaml
group_sessions_per_user: false
```
That reverts groups/channels to a single shared session per room, which preserves shared conversational context but also shares token costs, interrupt state, and context growth.
### Session Reset Policies