Compare commits

..

1 Commits

Author SHA1 Message Date
Shannon Sands bad9fe2452 add generic gateway startup readiness checks 2026-04-15 10:03:23 +10:00
162 changed files with 1720 additions and 13669 deletions
-4
View File
@@ -145,10 +145,6 @@
# Only override here if you need to force a backend without touching config.yaml:
# TERMINAL_ENV=local
# Override the container runtime binary (e.g. to use Podman instead of Docker).
# Useful on systems where Docker's storage driver is broken or unavailable.
# HERMES_DOCKER_BINARY=/usr/local/bin/podman
# Container images (for singularity/docker/modal backends)
# TERMINAL_DOCKER_IMAGE=nikolaik/python-nodejs:python3.11-nodejs20
# TERMINAL_SINGULARITY_IMAGE=docker://nikolaik/python-nodejs:python3.11-nodejs20
-1
View File
@@ -105,4 +105,3 @@ tesseracttars-creator <tesseracttars@gmail.com> <tesseracttars@gmail.com>
xinbenlv <zzn+pa@zzn.im> <zzn+pa@zzn.im>
SaulJWu <saul.jj.wu@gmail.com> <saul.jj.wu@gmail.com>
angelos <angelos@oikos.lan.home.malaiwah.com> <angelos@oikos.lan.home.malaiwah.com>
MestreY0d4-Uninter <241404605+MestreY0d4-Uninter@users.noreply.github.com> <MestreY0d4-Uninter@users.noreply.github.com>
+4 -4
View File
@@ -13,7 +13,7 @@ source venv/bin/activate # ALWAYS activate before running Python
```
hermes-agent/
├── run_agent.py # AIAgent class — core conversation loop
├── model_tools.py # Tool orchestration, discover_builtin_tools(), handle_function_call()
├── model_tools.py # Tool orchestration, _discover_tools(), handle_function_call()
├── toolsets.py # Toolset definitions, _HERMES_CORE_TOOLS list
├── cli.py # HermesCLI class — interactive CLI orchestrator
├── hermes_state.py # SessionDB — SQLite session store (FTS5 search)
@@ -181,7 +181,7 @@ if canonical == "mycommand":
## Adding New Tools
Requires changes in **2 files**:
Requires changes in **3 files**:
**1. Create `tools/your_tool.py`:**
```python
@@ -204,9 +204,9 @@ registry.register(
)
```
**2. Add to `toolsets.py`** — either `_HERMES_CORE_TOOLS` (all platforms) or a new toolset.
**2. Add import** in `model_tools.py` `_discover_tools()` list.
Auto-discovery: any `tools/*.py` file with a top-level `registry.register()` call is imported automatically — no manual import list to maintain.
**3. Add to `toolsets.py`** — either `_HERMES_CORE_TOOLS` (all platforms) or a new toolset.
The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.
-84
View File
@@ -1,84 +0,0 @@
# Hermes Agent Security Policy
This document outlines the security protocols, trust model, and deployment hardening guidelines for the **Hermes Agent** project.
## 1. Vulnerability Reporting
Hermes Agent does **not** operate a bug bounty program. Security issues should be reported via [GitHub Security Advisories (GHSA)](https://github.com/NousResearch/hermes-agent/security/advisories/new) or by emailing **security@nousresearch.com**. Do not open public issues for security vulnerabilities.
### Required Submission Details
- **Title & Severity:** Concise description and CVSS score/rating.
- **Affected Component:** Exact file path and line range (e.g., `tools/approval.py:120-145`).
- **Environment:** Output of `hermes version`, commit SHA, OS, and Python version.
- **Reproduction:** Step-by-step Proof-of-Concept (PoC) against `main` or the latest release.
- **Impact:** Explanation of what trust boundary was crossed.
---
## 2. Trust Model
The core assumption is that Hermes is a **personal agent** with one trusted operator.
### Operator & Session Trust
- **Single Tenant:** The system protects the operator from LLM actions, not from malicious co-tenants. Multi-user isolation must happen at the OS/host level.
- **Gateway Security:** Authorized callers (Telegram, Discord, Slack, etc.) receive equal trust. Session keys are used for routing, not as authorization boundaries.
- **Execution:** Defaults to `terminal.backend: local` (direct host execution). Container isolation (Docker, Modal, Daytona) is opt-in for sandboxing.
### Dangerous Command Approval
The approval system (`tools/approval.py`) is a core security boundary. Terminal commands, file operations, and other potentially destructive actions are gated behind explicit user confirmation before execution. The approval mode is configurable via `approvals.mode` in `config.yaml`:
- `"on"` (default) — prompts the user to approve dangerous commands.
- `"auto"` — auto-approves after a configurable delay.
- `"off"` — disables the gate entirely (break-glass; see Section 3).
### Output Redaction
`agent/redact.py` strips secret-like patterns (API keys, tokens, credentials) from all display output before it reaches the terminal or gateway platform. This prevents accidental credential leakage in chat logs, tool previews, and response text. Redaction operates on the display layer only — underlying values remain intact for internal agent operations.
### Skills vs. MCP Servers
- **Installed Skills:** High trust. Equivalent to local host code; skills can read environment variables and run arbitrary commands.
- **MCP Servers:** Lower trust. MCP subprocesses receive a filtered environment (`_build_safe_env()` in `tools/mcp_tool.py`) — only safe baseline variables (`PATH`, `HOME`, `XDG_*`) plus variables explicitly declared in the server's `env` config block are passed through. Host credentials are stripped by default. Additionally, packages invoked via `npx`/`uvx` are checked against the OSV malware database before spawning.
### Code Execution Sandbox
The `execute_code` tool (`tools/code_execution_tool.py`) runs LLM-generated Python scripts in a child process with API keys and tokens stripped from the environment to prevent credential exfiltration. Only environment variables explicitly declared by loaded skills (via `env_passthrough`) or by the user in `config.yaml` (`terminal.env_passthrough`) are passed through. The child accesses Hermes tools via RPC, not direct API calls.
### Subagents
- **No recursive delegation:** The `delegate_task` tool is disabled for child agents.
- **Depth limit:** `MAX_DEPTH = 2` — parent (depth 0) can spawn a child (depth 1); grandchildren are rejected.
- **Memory isolation:** Subagents run with `skip_memory=True` and do not have access to the parent's persistent memory provider. The parent receives only the task prompt and final response as an observation.
---
## 3. Out of Scope (Non-Vulnerabilities)
The following scenarios are **not** considered security breaches:
- **Prompt Injection:** Unless it results in a concrete bypass of the approval system, toolset restrictions, or container sandbox.
- **Public Exposure:** Deploying the gateway to the public internet without external authentication or network protection.
- **Trusted State Access:** Reports that require pre-existing write access to `~/.hermes/`, `.env`, or `config.yaml` (these are operator-owned files).
- **Default Behavior:** Host-level command execution when `terminal.backend` is set to `local` — this is the documented default, not a vulnerability.
- **Configuration Trade-offs:** Intentional break-glass settings such as `approvals.mode: "off"` or `terminal.backend: local` in production.
- **Tool-level read/access restrictions:** The agent has unrestricted shell access via the `terminal` tool by design. Reports that a specific tool (e.g., `read_file`) can access a resource are not vulnerabilities if the same access is available through `terminal`. Tool-level deny lists only constitute a meaningful security boundary when paired with equivalent restrictions on the terminal side (as with write operations, where `WRITE_DENIED_PATHS` is paired with the dangerous command approval system).
---
## 4. Deployment Hardening & Best Practices
### Filesystem & Network
- **Production sandboxing:** Use container backends (`docker`, `modal`, `daytona`) instead of `local` for untrusted workloads.
- **File permissions:** Run as non-root (the Docker image uses UID 10000); protect credentials with `chmod 600 ~/.hermes/.env` on local installs.
- **Network exposure:** Do not expose the gateway or API server to the public internet without VPN, Tailscale, or firewall protection. SSRF protection is enabled by default across all gateway platform adapters (Telegram, Discord, Slack, Matrix, Mattermost, etc.) with redirect validation. Note: the local terminal backend does not apply SSRF filtering, as it operates within the trusted operator's environment.
### Skills & Supply Chain
- **Skill installation:** Review Skills Guard reports (`tools/skills_guard.py`) before installing third-party skills. The audit log at `~/.hermes/skills/.hub/audit.log` tracks every install and removal.
- **MCP safety:** OSV malware checking runs automatically for `npx`/`uvx` packages before MCP server processes are spawned.
- **CI/CD:** GitHub Actions are pinned to full commit SHAs. The `supply-chain-audit.yml` workflow blocks PRs containing `.pth` files or suspicious `base64`+`exec` patterns.
### Credential Storage
- API keys and tokens belong exclusively in `~/.hermes/.env` — never in `config.yaml` or checked into version control.
- The credential pool system (`agent/credential_pool.py`) handles key rotation and fallback. Credentials are resolved from environment variables, not stored in plaintext databases.
---
## 5. Disclosure Process
- **Coordinated Disclosure:** 90-day window or until a fix is released, whichever comes first.
- **Communication:** All updates occur via the GHSA thread or email correspondence with security@nousresearch.com.
- **Credits:** Reporters are credited in release notes unless anonymity is requested.
-27
View File
@@ -298,33 +298,6 @@ def build_anthropic_client(api_key: str, base_url: str = None):
return _anthropic_sdk.Anthropic(**kwargs)
def build_anthropic_bedrock_client(region: str):
"""Create an AnthropicBedrock client for Bedrock Claude models.
Uses the Anthropic SDK's native Bedrock adapter, which provides full
Claude feature parity: prompt caching, thinking budgets, adaptive
thinking, fast mode — features not available via the Converse API.
Auth uses the boto3 default credential chain (IAM roles, SSO, env vars).
"""
if _anthropic_sdk is None:
raise ImportError(
"The 'anthropic' package is required for the Bedrock provider. "
"Install it with: pip install 'anthropic>=0.39.0'"
)
if not hasattr(_anthropic_sdk, "AnthropicBedrock"):
raise ImportError(
"anthropic.AnthropicBedrock not available. "
"Upgrade with: pip install 'anthropic>=0.39.0'"
)
from httpx import Timeout
return _anthropic_sdk.AnthropicBedrock(
aws_region=region,
timeout=Timeout(timeout=900.0, connect=10.0),
)
def read_claude_code_credentials() -> Optional[Dict[str, Any]]:
"""Read refreshable Claude Code OAuth credentials from ~/.claude/.credentials.json.
+18 -101
View File
@@ -775,21 +775,6 @@ def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
def _try_nous(vision: bool = False) -> Tuple[Optional[OpenAI], Optional[str]]:
# Check cross-session rate limit guard before attempting Nous —
# if another session already recorded a 429, skip Nous entirely
# to avoid piling more requests onto the tapped RPH bucket.
try:
from agent.nous_rate_guard import nous_rate_limit_remaining
_remaining = nous_rate_limit_remaining()
if _remaining is not None and _remaining > 0:
logger.debug(
"Auxiliary: skipping Nous Portal (rate-limited, resets in %.0fs)",
_remaining,
)
return None, None
except Exception:
pass
nous = _read_nous_auth()
if not nous:
return None, None
@@ -914,51 +899,6 @@ def _current_custom_base_url() -> str:
return custom_base or ""
def _validate_proxy_env_urls() -> None:
"""Fail fast with a clear error when proxy env vars have malformed URLs.
Common cause: shell config (e.g. .zshrc) with a typo like
``export HTTP_PROXY=http://127.0.0.1:6153export NEXT_VAR=...``
which concatenates 'export' into the port number. Without this
check the OpenAI/httpx client raises a cryptic ``Invalid port``
error that doesn't name the offending env var.
"""
from urllib.parse import urlparse
for key in ("HTTPS_PROXY", "HTTP_PROXY", "ALL_PROXY",
"https_proxy", "http_proxy", "all_proxy"):
value = str(os.environ.get(key) or "").strip()
if not value:
continue
try:
parsed = urlparse(value)
if parsed.scheme:
_ = parsed.port # raises ValueError for e.g. '6153export'
except ValueError as exc:
raise RuntimeError(
f"Malformed proxy environment variable {key}={value!r}. "
"Fix or unset your proxy settings and try again."
) from exc
def _validate_base_url(base_url: str) -> None:
"""Reject obviously broken custom endpoint URLs before they reach httpx."""
from urllib.parse import urlparse
candidate = str(base_url or "").strip()
if not candidate or candidate.startswith("acp://"):
return
try:
parsed = urlparse(candidate)
if parsed.scheme in {"http", "https"}:
_ = parsed.port # raises ValueError for malformed ports
except ValueError as exc:
raise RuntimeError(
f"Malformed custom endpoint URL: {candidate!r}. "
"Run `hermes setup` or `hermes model` and enter a valid http(s) base URL."
) from exc
def _try_custom_endpoint() -> Tuple[Optional[OpenAI], Optional[str]]:
runtime = _resolve_custom_runtime()
if len(runtime) == 2:
@@ -1359,7 +1299,6 @@ def resolve_provider_client(
Returns:
(client, resolved_model) or (None, None) if auth is unavailable.
"""
_validate_proxy_env_urls()
# Normalise aliases
provider = _normalize_aux_provider(provider)
@@ -1896,15 +1835,9 @@ def auxiliary_max_tokens_param(value: int) -> dict:
# Every auxiliary LLM consumer should use these instead of manually
# constructing clients and calling .chat.completions.create().
# Client cache: (provider, async_mode, base_url, api_key, api_mode, runtime_key) -> (client, default_model, loop)
# NOTE: loop identity is NOT part of the key. On async cache hits we check
# whether the cached loop is the *current* loop; if not, the stale entry is
# replaced in-place. This bounds cache growth to one entry per unique
# provider config rather than one per (config × event-loop), which previously
# caused unbounded fd accumulation in long-running gateway processes (#10200).
# Client cache: (provider, async_mode, base_url, api_key) -> (client, default_model)
_client_cache: Dict[tuple, tuple] = {}
_client_cache_lock = threading.Lock()
_CLIENT_CACHE_MAX_SIZE = 64 # safety belt — evict oldest when exceeded
def neuter_async_httpx_del() -> None:
@@ -2037,49 +1970,39 @@ def _get_cached_client(
Async clients (AsyncOpenAI) use httpx.AsyncClient internally, which
binds to the event loop that was current when the client was created.
Using such a client on a *different* loop causes deadlocks or
RuntimeError. To prevent cross-loop issues, the cache validates on
every async hit that the cached loop is the *current, open* loop.
If the loop changed (e.g. a new gateway worker-thread loop), the stale
entry is replaced in-place rather than creating an additional entry.
This keeps cache size bounded to one entry per unique provider config,
preventing the fd-exhaustion that previously occurred in long-running
gateways where recycled worker threads created unbounded entries (#10200).
RuntimeError. To prevent cross-loop issues (especially in gateway
mode where _run_async() may spawn fresh loops in worker threads), the
cache key for async clients includes the current event loop's identity
so each loop gets its own client instance.
"""
# Resolve the current event loop for async clients so we can validate
# cached entries. Loop identity is NOT in the cache key — instead we
# check at hit time whether the cached loop is still current and open.
# This prevents unbounded cache growth from recycled worker-thread loops
# while still guaranteeing we never reuse a client on the wrong loop
# (which causes deadlocks, see #2681).
# Include loop identity for async clients to prevent cross-loop reuse.
# httpx.AsyncClient (inside AsyncOpenAI) is bound to the loop where it
# was created — reusing it on a different loop causes deadlocks (#2681).
loop_id = 0
current_loop = None
if async_mode:
try:
import asyncio as _aio
current_loop = _aio.get_event_loop()
loop_id = id(current_loop)
except RuntimeError:
pass
runtime = _normalize_main_runtime(main_runtime)
runtime_key = tuple(runtime.get(field, "") for field in _MAIN_RUNTIME_FIELDS) if provider == "auto" else ()
cache_key = (provider, async_mode, base_url or "", api_key or "", api_mode or "", runtime_key)
cache_key = (provider, async_mode, base_url or "", api_key or "", api_mode or "", loop_id, runtime_key)
with _client_cache_lock:
if cache_key in _client_cache:
cached_client, cached_default, cached_loop = _client_cache[cache_key]
if async_mode:
# Validate: the cached client must be bound to the CURRENT,
# OPEN loop. If the loop changed or was closed, the httpx
# transport inside is dead — force-close and replace.
loop_ok = (
cached_loop is not None
and cached_loop is current_loop
and not cached_loop.is_closed()
)
if loop_ok:
# A cached async client whose loop has been closed will raise
# "Event loop is closed" when httpx tries to clean up its
# transport. Discard the stale client and create a fresh one.
if cached_loop is not None and cached_loop.is_closed():
_force_close_async_httpx(cached_client)
del _client_cache[cache_key]
else:
effective = _compat_model(cached_client, model, cached_default)
return cached_client, effective
# Stale — evict and fall through to create a new client.
_force_close_async_httpx(cached_client)
del _client_cache[cache_key]
else:
effective = _compat_model(cached_client, model, cached_default)
return cached_client, effective
@@ -2099,12 +2022,6 @@ def _get_cached_client(
bound_loop = current_loop
with _client_cache_lock:
if cache_key not in _client_cache:
# Safety belt: if the cache has grown beyond the max, evict
# the oldest entries (FIFO — dict preserves insertion order).
while len(_client_cache) >= _CLIENT_CACHE_MAX_SIZE:
evict_key, evict_entry = next(iter(_client_cache.items()))
_force_close_async_httpx(evict_entry[0])
del _client_cache[evict_key]
_client_cache[cache_key] = (client, default_model, bound_loop)
else:
client, default_model, _ = _client_cache[cache_key]
File diff suppressed because it is too large Load Diff
+36 -307
View File
@@ -17,10 +17,7 @@ Improvements over v2:
- Richer tool call/result detail in summarizer input
"""
import hashlib
import json
import logging
import re
import time
from typing import Any, Dict, List, Optional
@@ -60,128 +57,6 @@ _CHARS_PER_TOKEN = 4
_SUMMARY_FAILURE_COOLDOWN_SECONDS = 600
def _summarize_tool_result(tool_name: str, tool_args: str, tool_content: str) -> str:
"""Create an informative 1-line summary of a tool call + result.
Used during the pre-compression pruning pass to replace large tool
outputs with a short but useful description of what the tool did,
rather than a generic placeholder that carries zero information.
Returns strings like::
[terminal] ran `npm test` -> exit 0, 47 lines output
[read_file] read config.py from line 1 (1,200 chars)
[search_files] content search for 'compress' in agent/ -> 12 matches
"""
try:
args = json.loads(tool_args) if tool_args else {}
except (json.JSONDecodeError, TypeError):
args = {}
content = tool_content or ""
content_len = len(content)
line_count = content.count("\n") + 1 if content.strip() else 0
if tool_name == "terminal":
cmd = args.get("command", "")
if len(cmd) > 80:
cmd = cmd[:77] + "..."
exit_match = re.search(r'"exit_code"\s*:\s*(-?\d+)', content)
exit_code = exit_match.group(1) if exit_match else "?"
return f"[terminal] ran `{cmd}` -> exit {exit_code}, {line_count} lines output"
if tool_name == "read_file":
path = args.get("path", "?")
offset = args.get("offset", 1)
return f"[read_file] read {path} from line {offset} ({content_len:,} chars)"
if tool_name == "write_file":
path = args.get("path", "?")
written_lines = args.get("content", "").count("\n") + 1 if args.get("content") else "?"
return f"[write_file] wrote to {path} ({written_lines} lines)"
if tool_name == "search_files":
pattern = args.get("pattern", "?")
path = args.get("path", ".")
target = args.get("target", "content")
match_count = re.search(r'"total_count"\s*:\s*(\d+)', content)
count = match_count.group(1) if match_count else "?"
return f"[search_files] {target} search for '{pattern}' in {path} -> {count} matches"
if tool_name == "patch":
path = args.get("path", "?")
mode = args.get("mode", "replace")
return f"[patch] {mode} in {path} ({content_len:,} chars result)"
if tool_name in ("browser_navigate", "browser_click", "browser_snapshot",
"browser_type", "browser_scroll", "browser_vision"):
url = args.get("url", "")
ref = args.get("ref", "")
detail = f" {url}" if url else (f" ref={ref}" if ref else "")
return f"[{tool_name}]{detail} ({content_len:,} chars)"
if tool_name == "web_search":
query = args.get("query", "?")
return f"[web_search] query='{query}' ({content_len:,} chars result)"
if tool_name == "web_extract":
urls = args.get("urls", [])
url_desc = urls[0] if isinstance(urls, list) and urls else "?"
if isinstance(urls, list) and len(urls) > 1:
url_desc += f" (+{len(urls) - 1} more)"
return f"[web_extract] {url_desc} ({content_len:,} chars)"
if tool_name == "delegate_task":
goal = args.get("goal", "")
if len(goal) > 60:
goal = goal[:57] + "..."
return f"[delegate_task] '{goal}' ({content_len:,} chars result)"
if tool_name == "execute_code":
code_preview = (args.get("code") or "")[:60].replace("\n", " ")
if len(args.get("code", "")) > 60:
code_preview += "..."
return f"[execute_code] `{code_preview}` ({line_count} lines output)"
if tool_name in ("skill_view", "skills_list", "skill_manage"):
name = args.get("name", "?")
return f"[{tool_name}] name={name} ({content_len:,} chars)"
if tool_name == "vision_analyze":
question = args.get("question", "")[:50]
return f"[vision_analyze] '{question}' ({content_len:,} chars)"
if tool_name == "memory":
action = args.get("action", "?")
target = args.get("target", "?")
return f"[memory] {action} on {target}"
if tool_name == "todo":
return "[todo] updated task list"
if tool_name == "clarify":
return "[clarify] asked user a question"
if tool_name == "text_to_speech":
return f"[text_to_speech] generated audio ({content_len:,} chars)"
if tool_name == "cronjob":
action = args.get("action", "?")
return f"[cronjob] {action}"
if tool_name == "process":
action = args.get("action", "?")
sid = args.get("session_id", "?")
return f"[process] {action} session={sid}"
# Generic fallback
first_arg = ""
for k, v in list(args.items())[:2]:
sv = str(v)[:40]
first_arg += f" {k}={sv}"
return f"[{tool_name}]{first_arg} ({content_len:,} chars result)"
class ContextCompressor(ContextEngine):
"""Default context engine — compresses conversation context via lossy summarization.
@@ -203,8 +78,6 @@ class ContextCompressor(ContextEngine):
self._context_probed = False
self._context_probe_persistable = False
self._previous_summary = None
self._last_compression_savings_pct = 100.0
self._ineffective_compression_count = 0
def update_model(
self,
@@ -294,9 +167,6 @@ class ContextCompressor(ContextEngine):
# Stores the previous compaction summary for iterative updates
self._previous_summary: Optional[str] = None
# Anti-thrashing: track whether last compression was effective
self._last_compression_savings_pct: float = 100.0
self._ineffective_compression_count: int = 0
self._summary_failure_cooldown_until: float = 0.0
def update_from_response(self, usage: Dict[str, Any]):
@@ -305,26 +175,9 @@ class ContextCompressor(ContextEngine):
self.last_completion_tokens = usage.get("completion_tokens", 0)
def should_compress(self, prompt_tokens: int = None) -> bool:
"""Check if context exceeds the compression threshold.
Includes anti-thrashing protection: if the last two compressions
each saved less than 10%, skip compression to avoid infinite loops
where each pass removes only 1-2 messages.
"""
"""Check if context exceeds the compression threshold."""
tokens = prompt_tokens if prompt_tokens is not None else self.last_prompt_tokens
if tokens < self.threshold_tokens:
return False
# Anti-thrashing: back off if recent compressions were ineffective
if self._ineffective_compression_count >= 2:
if not self.quiet_mode:
logger.warning(
"Compression skipped — last %d compressions saved <10%% each. "
"Consider /new to start a fresh session, or /compress <topic> "
"for focused compression.",
self._ineffective_compression_count,
)
return False
return True
return tokens >= self.threshold_tokens
# ------------------------------------------------------------------
# Tool output pruning (cheap pre-pass, no LLM call)
@@ -334,16 +187,7 @@ class ContextCompressor(ContextEngine):
self, messages: List[Dict[str, Any]], protect_tail_count: int,
protect_tail_tokens: int | None = None,
) -> tuple[List[Dict[str, Any]], int]:
"""Replace old tool result contents with informative 1-line summaries.
Instead of a generic placeholder, generates a summary like::
[terminal] ran `npm test` -> exit 0, 47 lines output
[read_file] read config.py from line 1 (3,400 chars)
Also deduplicates identical tool results (e.g. reading the same file
5x keeps only the newest full copy) and truncates large tool_call
arguments in assistant messages outside the protected tail.
"""Replace old tool result contents with a short placeholder.
Walks backward from the end, protecting the most recent messages that
fall within ``protect_tail_tokens`` (when provided) OR the last
@@ -359,22 +203,6 @@ class ContextCompressor(ContextEngine):
result = [m.copy() for m in messages]
pruned = 0
# Build index: tool_call_id -> (tool_name, arguments_json)
call_id_to_tool: Dict[str, tuple] = {}
for msg in result:
if msg.get("role") == "assistant":
for tc in msg.get("tool_calls") or []:
if isinstance(tc, dict):
cid = tc.get("id", "")
fn = tc.get("function", {})
call_id_to_tool[cid] = (fn.get("name", "unknown"), fn.get("arguments", ""))
else:
cid = getattr(tc, "id", "") or ""
fn = getattr(tc, "function", None)
name = getattr(fn, "name", "unknown") if fn else "unknown"
args_str = getattr(fn, "arguments", "") if fn else ""
call_id_to_tool[cid] = (name, args_str)
# Determine the prune boundary
if protect_tail_tokens is not None and protect_tail_tokens > 0:
# Token-budget approach: walk backward accumulating tokens
@@ -383,8 +211,7 @@ class ContextCompressor(ContextEngine):
min_protect = min(protect_tail_count, len(result) - 1)
for i in range(len(result) - 1, -1, -1):
msg = result[i]
raw_content = msg.get("content") or ""
content_len = sum(len(p.get("text", "")) for p in raw_content) if isinstance(raw_content, list) else len(raw_content)
content_len = len(msg.get("content") or "")
msg_tokens = content_len // _CHARS_PER_TOKEN + 10
for tc in msg.get("tool_calls") or []:
if isinstance(tc, dict):
@@ -399,69 +226,18 @@ class ContextCompressor(ContextEngine):
else:
prune_boundary = len(result) - protect_tail_count
# Pass 1: Deduplicate identical tool results.
# When the same file is read multiple times, keep only the most recent
# full copy and replace older duplicates with a back-reference.
content_hashes: dict = {} # hash -> (index, tool_call_id)
for i in range(len(result) - 1, -1, -1):
msg = result[i]
if msg.get("role") != "tool":
continue
content = msg.get("content") or ""
# Skip multimodal content (list of content blocks)
if isinstance(content, list):
continue
if len(content) < 200:
continue
h = hashlib.md5(content.encode("utf-8", errors="replace")).hexdigest()[:12]
if h in content_hashes:
# This is an older duplicate — replace with back-reference
result[i] = {**msg, "content": "[Duplicate tool output — same content as a more recent call]"}
pruned += 1
else:
content_hashes[h] = (i, msg.get("tool_call_id", "?"))
# Pass 2: Replace old tool results with informative summaries
for i in range(prune_boundary):
msg = result[i]
if msg.get("role") != "tool":
continue
content = msg.get("content", "")
# Skip multimodal content (list of content blocks)
if isinstance(content, list):
continue
if not content or content == _PRUNED_TOOL_PLACEHOLDER:
continue
# Skip already-deduplicated or previously-summarized results
if content.startswith("[Duplicate tool output"):
continue
# Only prune if the content is substantial (>200 chars)
if len(content) > 200:
call_id = msg.get("tool_call_id", "")
tool_name, tool_args = call_id_to_tool.get(call_id, ("unknown", ""))
summary = _summarize_tool_result(tool_name, tool_args, content)
result[i] = {**msg, "content": summary}
result[i] = {**msg, "content": _PRUNED_TOOL_PLACEHOLDER}
pruned += 1
# Pass 3: Truncate large tool_call arguments in assistant messages
# outside the protected tail. write_file with 50KB content, for
# example, survives pruning entirely without this.
for i in range(prune_boundary):
msg = result[i]
if msg.get("role") != "assistant" or not msg.get("tool_calls"):
continue
new_tcs = []
modified = False
for tc in msg["tool_calls"]:
if isinstance(tc, dict):
args = tc.get("function", {}).get("arguments", "")
if len(args) > 500:
tc = {**tc, "function": {**tc["function"], "arguments": args[:200] + "...[truncated]"}}
modified = True
new_tcs.append(tc)
if modified:
result[i] = {**msg, "tool_calls": new_tcs}
return result, pruned
# ------------------------------------------------------------------
@@ -581,37 +357,29 @@ class ContextCompressor(ContextEngine):
)
# Shared structured template (used by both paths).
# Key changes vs v1:
# - "Pending User Asks" section (from Claude Code) explicitly tracks
# unanswered questions so the model knows what's resolved vs open
# - "Remaining Work" replaces "Next Steps" to avoid reading as active
# instructions
# - "Resolved Questions" makes it clear which questions were already
# answered (prevents model from re-answering them)
_template_sections = f"""## Goal
[What the user is trying to accomplish]
## Constraints & Preferences
[User preferences, coding style, constraints, important decisions]
## Completed Actions
[Numbered list of concrete actions taken — include tool used, target, and outcome.
Format each as: N. ACTION target — outcome [tool: name]
Example:
1. READ config.py:45 — found `==` should be `!=` [tool: read_file]
2. PATCH config.py:45 — changed `==` to `!=` [tool: patch]
3. TEST `pytest tests/` — 3/50 failed: test_parse, test_validate, test_edge [tool: terminal]
Be specific with file paths, commands, line numbers, and results.]
## Active State
[Current working state — include:
- Working directory and branch (if applicable)
- Modified/created files with brief note on each
- Test status (X/Y passing)
- Any running processes or servers
- Environment details that matter]
## In Progress
[Work currently underway — what was being done when compaction fired]
## Blocked
[Any blockers, errors, or issues not yet resolved. Include exact error messages.]
## Progress
### Done
[Completed work — include specific file paths, commands run, results obtained]
### In Progress
[Work currently underway]
### Blocked
[Any blockers or issues encountered]
## Key Decisions
[Important technical decisions and WHY they were made]
[Important technical decisions and why they were made]
## Resolved Questions
[Questions the user asked that were ALREADY answered — include the answer so the next assistant does not re-answer them]
@@ -628,7 +396,10 @@ Be specific with file paths, commands, line numbers, and results.]
## Critical Context
[Any specific values, error messages, configuration details, or data that would be lost without explicit preservation]
Target ~{summary_budget} tokens. Be CONCRETE — include file paths, command outputs, error messages, line numbers, and specific values. Avoid vague descriptions like "made some changes" — say exactly what changed.
## Tools & Patterns
[Which tools were used, how they were used effectively, and any tool-specific discoveries]
Target ~{summary_budget} tokens. Be specific — include file paths, command outputs, error messages, and concrete values rather than vague descriptions.
Write only the summary body. Do not include any preamble or prefix."""
@@ -644,7 +415,7 @@ PREVIOUS SUMMARY:
NEW TURNS TO INCORPORATE:
{content_to_summarize}
Update the summary using this exact structure. PRESERVE all existing information that is still relevant. ADD new completed actions to the numbered list (continue numbering). Move items from "In Progress" to "Completed Actions" when done. Move answered questions to "Resolved Questions". Update "Active State" to reflect current state. Remove information only if it is clearly obsolete.
Update the summary using this exact structure. PRESERVE all existing information that is still relevant. ADD new progress. Move items from "In Progress" to "Done" when completed. Move answered questions to "Resolved Questions". Remove information only if it is clearly obsolete.
{_template_sections}"""
else:
@@ -679,7 +450,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
"api_mode": self.api_mode,
},
"messages": [{"role": "user", "content": prompt}],
"max_tokens": int(summary_budget * 1.3),
"max_tokens": summary_budget * 2,
# timeout resolved from auxiliary.compression.timeout config by call_llm
}
if self.summary_model:
@@ -693,10 +464,8 @@ The user has requested that this compaction PRIORITISE preserving all informatio
# Store for iterative updates on next compaction
self._previous_summary = summary
self._summary_failure_cooldown_until = 0.0
self._summary_model_fallen_back = False
return self._with_summary_prefix(summary)
except RuntimeError:
# No provider configured — long cooldown, unlikely to self-resolve
self._summary_failure_cooldown_until = time.monotonic() + _SUMMARY_FAILURE_COOLDOWN_SECONDS
logging.warning("Context compression: no provider available for "
"summary. Middle turns will be dropped without summary "
@@ -704,42 +473,12 @@ The user has requested that this compaction PRIORITISE preserving all informatio
_SUMMARY_FAILURE_COOLDOWN_SECONDS)
return None
except Exception as e:
# If the summary model is different from the main model and the
# error looks permanent (model not found, 503, 404), fall back to
# using the main model instead of entering cooldown that leaves
# context growing unbounded. (#8620 sub-issue 4)
_status = getattr(e, "status_code", None) or getattr(getattr(e, "response", None), "status_code", None)
_err_str = str(e).lower()
_is_model_not_found = (
_status in (404, 503)
or "model_not_found" in _err_str
or "does not exist" in _err_str
or "no available channel" in _err_str
)
if (
_is_model_not_found
and self.summary_model
and self.summary_model != self.model
and not getattr(self, "_summary_model_fallen_back", False)
):
self._summary_model_fallen_back = True
logging.warning(
"Summary model '%s' not available (%s). "
"Falling back to main model '%s' for compression.",
self.summary_model, e, self.model,
)
self.summary_model = "" # empty = use main model
self._summary_failure_cooldown_until = 0.0 # no cooldown
return self._generate_summary(messages, summary_budget) # retry immediately
# Transient errors (timeout, rate limit, network) — shorter cooldown
_transient_cooldown = 60
self._summary_failure_cooldown_until = time.monotonic() + _transient_cooldown
self._summary_failure_cooldown_until = time.monotonic() + _SUMMARY_FAILURE_COOLDOWN_SECONDS
logging.warning(
"Failed to generate context summary: %s. "
"Further summary attempts paused for %d seconds.",
e,
_transient_cooldown,
_SUMMARY_FAILURE_COOLDOWN_SECONDS,
)
return None
@@ -1005,11 +744,11 @@ The user has requested that this compaction PRIORITISE preserving all informatio
compressed = []
for i in range(compress_start):
msg = messages[i].copy()
if i == 0 and msg.get("role") == "system":
existing = msg.get("content") or ""
_compression_note = "[Note: Some earlier conversation turns have been compacted into a handoff summary to preserve context space. The current session state may still reflect earlier work, so build on that summary and state rather than re-doing work.]"
if _compression_note not in existing:
msg["content"] = existing + "\n\n" + _compression_note
if i == 0 and msg.get("role") == "system" and self.compression_count == 0:
msg["content"] = (
(msg.get("content") or "")
+ "\n\n[Note: Some earlier conversation turns have been compacted into a handoff summary to preserve context space. The current session state may still reflect earlier work, so build on that summary and state rather than re-doing work.]"
)
compressed.append(msg)
# If LLM summary failed, insert a static fallback so the model
@@ -1067,24 +806,14 @@ The user has requested that this compaction PRIORITISE preserving all informatio
compressed = self._sanitize_tool_pairs(compressed)
new_estimate = estimate_messages_tokens_rough(compressed)
saved_estimate = display_tokens - new_estimate
# Anti-thrashing: track compression effectiveness
savings_pct = (saved_estimate / display_tokens * 100) if display_tokens > 0 else 0
self._last_compression_savings_pct = savings_pct
if savings_pct < 10:
self._ineffective_compression_count += 1
else:
self._ineffective_compression_count = 0
if not self.quiet_mode:
new_estimate = estimate_messages_tokens_rough(compressed)
saved_estimate = display_tokens - new_estimate
logger.info(
"Compressed: %d -> %d messages (~%d tokens saved, %.0f%%)",
"Compressed: %d -> %d messages (~%d tokens saved)",
n_messages,
len(compressed),
saved_estimate,
savings_pct,
)
logger.info("Compression #%d complete", self.compression_count)
-2
View File
@@ -1162,7 +1162,6 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
if token:
source_name = "gh_cli" if "gh" in source.lower() else f"env:{source}"
active_sources.add(source_name)
pconfig = PROVIDER_REGISTRY.get(provider)
changed |= _upsert_entry(
entries,
provider,
@@ -1171,7 +1170,6 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
"source": source_name,
"auth_type": AUTH_TYPE_API_KEY,
"access_token": token,
"base_url": pconfig.inference_base_url if pconfig else "",
"label": source,
},
)
-9
View File
@@ -112,10 +112,6 @@ _RATE_LIMIT_PATTERNS = [
"please retry after",
"resource_exhausted",
"rate increased too quickly", # Alibaba/DashScope throttling
# AWS Bedrock throttling
"throttlingexception",
"too many concurrent requests",
"servicequotaexceededexception",
]
# Usage-limit patterns that need disambiguation (could be billing OR rate_limit)
@@ -175,11 +171,6 @@ _CONTEXT_OVERFLOW_PATTERNS = [
# Chinese error messages (some providers return these)
"超过最大长度",
"上下文长度",
# AWS Bedrock Converse API error patterns
"input is too long",
"max input token",
"input token",
"exceeds the maximum number of input tokens",
]
# Model not found patterns
-11
View File
@@ -36,7 +36,6 @@ _PROVIDER_PREFIXES: frozenset[str] = frozenset({
"opencode", "zen", "go", "vercel", "kilo", "dashscope", "aliyun", "qwen",
"mimo", "xiaomi-mimo",
"arcee-ai", "arceeai",
"xai", "x-ai", "x.ai", "grok",
"qwen-portal",
})
@@ -1012,16 +1011,6 @@ def get_model_context_length(
if ctx:
return ctx
# 4b. AWS Bedrock — use static context length table.
# Bedrock's ListFoundationModels doesn't expose context window sizes,
# so we maintain a curated table in bedrock_adapter.py.
if provider == "bedrock" or (base_url and "bedrock-runtime" in base_url):
try:
from agent.bedrock_adapter import get_bedrock_context_length
return get_bedrock_context_length(model)
except ImportError:
pass # boto3 not installed — fall through to generic resolution
# 5. Provider-aware lookups (before generic OpenRouter cache)
# These are provider-specific and take priority over the generic OR cache,
# since the same model can have different context limits per provider
-182
View File
@@ -1,182 +0,0 @@
"""Cross-session rate limit guard for Nous Portal.
Writes rate limit state to a shared file so all sessions (CLI, gateway,
cron, auxiliary) can check whether Nous Portal is currently rate-limited
before making requests. Prevents retry amplification when RPH is tapped.
Each 429 from Nous triggers up to 9 API calls per conversation turn
(3 SDK retries x 3 Hermes retries), and every one of those calls counts
against RPH. By recording the rate limit state on first 429 and checking
it before subsequent attempts, we eliminate the amplification effect.
"""
from __future__ import annotations
import json
import logging
import os
import tempfile
import time
from typing import Any, Mapping, Optional
logger = logging.getLogger(__name__)
_STATE_SUBDIR = "rate_limits"
_STATE_FILENAME = "nous.json"
def _state_path() -> str:
"""Return the path to the Nous rate limit state file."""
try:
from hermes_constants import get_hermes_home
base = get_hermes_home()
except ImportError:
base = os.path.join(os.path.expanduser("~"), ".hermes")
return os.path.join(base, _STATE_SUBDIR, _STATE_FILENAME)
def _parse_reset_seconds(headers: Optional[Mapping[str, str]]) -> Optional[float]:
"""Extract the best available reset-time estimate from response headers.
Priority:
1. x-ratelimit-reset-requests-1h (hourly RPH window — most useful)
2. x-ratelimit-reset-requests (per-minute RPM window)
3. retry-after (generic HTTP header)
Returns seconds-from-now, or None if no usable header found.
"""
if not headers:
return None
lowered = {k.lower(): v for k, v in headers.items()}
for key in (
"x-ratelimit-reset-requests-1h",
"x-ratelimit-reset-requests",
"retry-after",
):
raw = lowered.get(key)
if raw is not None:
try:
val = float(raw)
if val > 0:
return val
except (TypeError, ValueError):
pass
return None
def record_nous_rate_limit(
*,
headers: Optional[Mapping[str, str]] = None,
error_context: Optional[dict[str, Any]] = None,
default_cooldown: float = 300.0,
) -> None:
"""Record that Nous Portal is rate-limited.
Parses the reset time from response headers or error context.
Falls back to ``default_cooldown`` (5 minutes) if no reset info
is available. Writes to a shared file that all sessions can read.
Args:
headers: HTTP response headers from the 429 error.
error_context: Structured error context from _extract_api_error_context().
default_cooldown: Fallback cooldown in seconds when no header data.
"""
now = time.time()
reset_at = None
# Try headers first (most accurate)
header_seconds = _parse_reset_seconds(headers)
if header_seconds is not None:
reset_at = now + header_seconds
# Try error_context reset_at (from body parsing)
if reset_at is None and isinstance(error_context, dict):
ctx_reset = error_context.get("reset_at")
if isinstance(ctx_reset, (int, float)) and ctx_reset > now:
reset_at = float(ctx_reset)
# Default cooldown
if reset_at is None:
reset_at = now + default_cooldown
path = _state_path()
try:
state_dir = os.path.dirname(path)
os.makedirs(state_dir, exist_ok=True)
state = {
"reset_at": reset_at,
"recorded_at": now,
"reset_seconds": reset_at - now,
}
# Atomic write: write to temp file + rename
fd, tmp_path = tempfile.mkstemp(dir=state_dir, suffix=".tmp")
try:
with os.fdopen(fd, "w") as f:
json.dump(state, f)
os.replace(tmp_path, path)
except Exception:
# Clean up temp file on failure
try:
os.unlink(tmp_path)
except OSError:
pass
raise
logger.info(
"Nous rate limit recorded: resets in %.0fs (at %.0f)",
reset_at - now, reset_at,
)
except Exception as exc:
logger.debug("Failed to write Nous rate limit state: %s", exc)
def nous_rate_limit_remaining() -> Optional[float]:
"""Check if Nous Portal is currently rate-limited.
Returns:
Seconds remaining until reset, or None if not rate-limited.
"""
path = _state_path()
try:
with open(path) as f:
state = json.load(f)
reset_at = state.get("reset_at", 0)
remaining = reset_at - time.time()
if remaining > 0:
return remaining
# Expired — clean up
try:
os.unlink(path)
except OSError:
pass
return None
except (FileNotFoundError, json.JSONDecodeError, KeyError, TypeError):
return None
def clear_nous_rate_limit() -> None:
"""Clear the rate limit state (e.g., after a successful Nous request)."""
try:
os.unlink(_state_path())
except FileNotFoundError:
pass
except OSError as exc:
logger.debug("Failed to clear Nous rate limit state: %s", exc)
def format_remaining(seconds: float) -> str:
"""Format seconds remaining into human-readable duration."""
s = max(0, int(seconds))
if s < 60:
return f"{s}s"
if s < 3600:
m, sec = divmod(s, 60)
return f"{m}m {sec}s" if sec else f"{m}m"
h, remainder = divmod(s, 3600)
m = remainder // 60
return f"{h}h {m}m" if m else f"{h}h"
-17
View File
@@ -93,17 +93,6 @@ _DB_CONNSTR_RE = re.compile(
re.IGNORECASE,
)
# JWT tokens: header.payload[.signature] — always start with "eyJ" (base64 for "{")
# Matches 1-part (header only), 2-part (header.payload), and full 3-part JWTs.
_JWT_RE = re.compile(
r"eyJ[A-Za-z0-9_-]{10,}" # Header (always starts with eyJ)
r"(?:\.[A-Za-z0-9_=-]{4,}){0,2}" # Optional payload and/or signature
)
# Discord user/role mentions: <@123456789012345678> or <@!123456789012345678>
# Snowflake IDs are 17-20 digit integers that resolve to specific Discord accounts.
_DISCORD_MENTION_RE = re.compile(r"<@!?(\d{17,20})>")
# E.164 phone numbers: +<country><number>, 7-15 digits
# Negative lookahead prevents matching hex strings or identifiers
_SIGNAL_PHONE_RE = re.compile(r"(\+[1-9]\d{6,14})(?![A-Za-z0-9])")
@@ -170,12 +159,6 @@ def redact_sensitive_text(text: str) -> str:
# Database connection string passwords
text = _DB_CONNSTR_RE.sub(lambda m: f"{m.group(1)}***{m.group(3)}", text)
# JWT tokens (eyJ... — base64-encoded JSON headers)
text = _JWT_RE.sub(lambda m: _mask_token(m.group(0)), text)
# Discord user/role mentions (<@snowflake_id>)
text = _DISCORD_MENTION_RE.sub(lambda m: f"<@{'!' if '!' in m.group(0) else ''}***>", text)
# E.164 phone numbers (Signal, WhatsApp)
def _redact_phone(m):
phone = m.group(1)
+1 -3
View File
@@ -12,8 +12,6 @@ from datetime import datetime
from pathlib import Path
from typing import Any, Dict, Optional
from hermes_constants import display_hermes_home
logger = logging.getLogger(__name__)
_skill_commands: Dict[str, Dict[str, Any]] = {}
@@ -110,7 +108,7 @@ def _inject_skill_config(loaded_skill: dict[str, Any], parts: list[str]) -> None
if not resolved:
return
lines = ["", f"[Skill config (from {display_hermes_home()}/config.yaml):"]
lines = ["", "[Skill config (from ~/.hermes/config.yaml):"]
for key, value in resolved.items():
display_val = str(value) if value else "(not set)"
lines.append(f" {key} = {display_val}")
-74
View File
@@ -284,80 +284,6 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
source_url="https://ai.google.dev/pricing",
pricing_version="google-pricing-2026-03-16",
),
# AWS Bedrock — pricing per the Bedrock pricing page.
# Bedrock charges the same per-token rates as the model provider but
# through AWS billing. These are the on-demand prices (no commitment).
# Source: https://aws.amazon.com/bedrock/pricing/
(
"bedrock",
"anthropic.claude-opus-4-6",
): PricingEntry(
input_cost_per_million=Decimal("15.00"),
output_cost_per_million=Decimal("75.00"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
(
"bedrock",
"anthropic.claude-sonnet-4-6",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
(
"bedrock",
"anthropic.claude-sonnet-4-5",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
(
"bedrock",
"anthropic.claude-haiku-4-5",
): PricingEntry(
input_cost_per_million=Decimal("0.80"),
output_cost_per_million=Decimal("4.00"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
(
"bedrock",
"amazon.nova-pro",
): PricingEntry(
input_cost_per_million=Decimal("0.80"),
output_cost_per_million=Decimal("3.20"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
(
"bedrock",
"amazon.nova-lite",
): PricingEntry(
input_cost_per_million=Decimal("0.06"),
output_cost_per_million=Decimal("0.24"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
(
"bedrock",
"amazon.nova-micro",
): PricingEntry(
input_cost_per_million=Decimal("0.035"),
output_cost_per_million=Decimal("0.14"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
}
+19 -25
View File
@@ -989,7 +989,6 @@ def _prune_orphaned_branches(repo_root: str) -> None:
_ACCENT_ANSI_DEFAULT = "\033[1;38;2;255;215;0m" # True-color #FFD700 bold — fallback
_BOLD = "\033[1m"
_RST = "\033[0m"
_STREAM_PAD = " " # 4-space indent for streamed response text (matches Panel padding)
def _hex_to_ansi(hex_color: str, *, bold: bool = False) -> str:
@@ -1713,9 +1712,9 @@ class HermesCLI:
# Parse and validate toolsets
self.enabled_toolsets = toolsets
if toolsets and "all" not in toolsets and "*" not in toolsets:
# Validate each toolset — MCP server names are resolved via
# live registry aliases (registered during discover_mcp_tools),
# but discovery hasn't run yet at this point, so exclude them.
# Validate each toolset — MCP server names are added by
# _get_platform_tools() but aren't registered in TOOLSETS yet
# (that happens later in _sync_mcp_toolsets), so exclude them.
mcp_names = set((CLI_CONFIG.get("mcp_servers") or {}).keys())
invalid = [t for t in toolsets if not validate_toolset(t) and t not in mcp_names]
if invalid:
@@ -2581,7 +2580,7 @@ class HermesCLI:
_tc = getattr(self, "_stream_text_ansi", "")
while "\n" in self._stream_buf:
line, self._stream_buf = self._stream_buf.split("\n", 1)
_cprint(f"{_STREAM_PAD}{_tc}{line}{_RST}" if _tc else f"{_STREAM_PAD}{line}")
_cprint(f"{_tc}{line}{_RST}" if _tc else line)
def _flush_stream(self) -> None:
"""Emit any remaining partial line from the stream buffer and close the box."""
@@ -2598,7 +2597,7 @@ class HermesCLI:
if self._stream_buf:
_tc = getattr(self, "_stream_text_ansi", "")
_cprint(f"{_STREAM_PAD}{_tc}{self._stream_buf}{_RST}" if _tc else f"{_STREAM_PAD}{self._stream_buf}")
_cprint(f"{_tc}{self._stream_buf}{_RST}" if _tc else self._stream_buf)
self._stream_buf = ""
# Close the response box
@@ -4100,8 +4099,6 @@ class HermesCLI:
self.agent.flush_memories(self.conversation_history)
except (Exception, KeyboardInterrupt):
pass
# Trigger memory extraction on the old session before session_id rotates.
self.agent.commit_memory_session(self.conversation_history)
self._notify_session_boundary("on_session_finalize")
elif self.agent:
# First session or empty history — still finalize the old session
@@ -4590,19 +4587,16 @@ class HermesCLI:
self._close_model_picker()
return
provider_data = providers[selected]
# Use the curated model list from list_authenticated_providers()
# (same lists as `hermes model` and gateway pickers).
# Only fall back to the live provider catalog when the curated
# list is empty (e.g. user-defined endpoints with no curated list).
model_list = provider_data.get("models", [])
model_list = []
try:
from hermes_cli.models import provider_model_ids
live = provider_model_ids(provider_data["slug"])
if live:
model_list = live
except Exception:
pass
if not model_list:
try:
from hermes_cli.models import provider_model_ids
live = provider_model_ids(provider_data["slug"])
if live:
model_list = live
except Exception:
pass
model_list = provider_data.get("models", [])
state["stage"] = "model"
state["provider_data"] = provider_data
state["model_list"] = model_list
@@ -5767,7 +5761,7 @@ class HermesCLI:
border_style=_resp_color,
style=_resp_text,
box=rich_box.HORIZONTALS,
padding=(1, 4),
padding=(1, 2),
))
else:
_cprint(" (No response generated)")
@@ -5891,7 +5885,7 @@ class HermesCLI:
title_align="left",
border_style=_resp_color,
box=rich_box.HORIZONTALS,
padding=(1, 4),
padding=(1, 2),
))
else:
_cprint(" 💬 /btw: (no response)")
@@ -5958,7 +5952,7 @@ class HermesCLI:
parts = cmd.strip().split(None, 1)
sub = parts[1].lower().strip() if len(parts) > 1 else "status"
_DEFAULT_CDP = "http://127.0.0.1:9222"
_DEFAULT_CDP = "http://localhost:9222"
current = os.environ.get("BROWSER_CDP_URL", "").strip()
if sub.startswith("connect"):
@@ -7654,7 +7648,7 @@ class HermesCLI:
label = " ⚕ Hermes "
fill = w - 2 - len(label)
_cprint(f"\n{_ACCENT}╭─{label}{'' * max(fill - 1, 0)}{_RST}")
_cprint(f"{_STREAM_PAD}{sentence.rstrip()}")
_cprint(sentence.rstrip())
tts_thread = threading.Thread(
target=stream_tts_to_speaker,
@@ -7885,7 +7879,7 @@ class HermesCLI:
border_style=_resp_color,
style=_resp_text,
box=rich_box.HORIZONTALS,
padding=(1, 4),
padding=(1, 2),
))
-6
View File
@@ -501,12 +501,6 @@ def update_job(job_id: str, updates: Dict[str, Any]) -> Optional[Dict[str, Any]]
if schedule_changed:
updated_schedule = updated["schedule"]
# The API may pass schedule as a raw string (e.g. "every 10m")
# instead of a pre-parsed dict. Normalize it the same way
# create_job() does so downstream code can call .get() safely.
if isinstance(updated_schedule, str):
updated_schedule = parse_schedule(updated_schedule)
updated["schedule"] = updated_schedule
updated["schedule_display"] = updates.get(
"schedule_display",
updated_schedule.get("display", updated.get("schedule_display")),
+2 -9
View File
@@ -10,7 +10,6 @@ runs at a time if multiple processes overlap.
import asyncio
import concurrent.futures
import contextvars
import json
import logging
import os
@@ -289,13 +288,11 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
if wrap_response:
task_name = job.get("name", job["id"])
job_id = job.get("id", "")
delivery_content = (
f"Cronjob Response: {task_name}\n"
f"(job_id: {job_id})\n"
f"-------------\n\n"
f"{content}\n\n"
f"To stop or manage this job, send me a new message (e.g. \"stop reminder {task_name}\")."
f"Note: The agent cannot see this message, and therefore cannot respond to it."
)
else:
delivery_content = content
@@ -771,11 +768,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
_cron_inactivity_limit = _cron_timeout if _cron_timeout > 0 else None
_POLL_INTERVAL = 5.0
_cron_pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
# Preserve scheduler-scoped ContextVar state (for example skill-declared
# env passthrough registrations) when the cron run hops into the worker
# thread used for inactivity timeout monitoring.
_cron_context = contextvars.copy_context()
_cron_future = _cron_pool.submit(_cron_context.run, agent.run_conversation, prompt)
_cron_future = _cron_pool.submit(agent.run_conversation, prompt)
_inactivity_timeout = False
try:
if _cron_inactivity_limit is None:
Executable → Regular
+6 -13
View File
@@ -1,14 +1,13 @@
#!/bin/bash
# Docker/Podman entrypoint: bootstrap config files into the mounted volume, then run hermes.
# Docker entrypoint: bootstrap config files into the mounted volume, then run hermes.
set -e
HERMES_HOME="${HERMES_HOME:-/opt/data}"
HERMES_HOME="/opt/data"
INSTALL_DIR="/opt/hermes"
# --- Privilege dropping via gosu ---
# When started as root (the default for Docker, or fakeroot in rootless Podman),
# optionally remap the hermes user/group to match host-side ownership, fix volume
# permissions, then re-exec as hermes.
# When started as root (the default), optionally remap the hermes user/group
# to match host-side ownership, fix volume permissions, then re-exec as hermes.
if [ "$(id -u)" = "0" ]; then
if [ -n "$HERMES_UID" ] && [ "$HERMES_UID" != "$(id -u hermes)" ]; then
echo "Changing hermes UID to $HERMES_UID"
@@ -17,19 +16,13 @@ if [ "$(id -u)" = "0" ]; then
if [ -n "$HERMES_GID" ] && [ "$HERMES_GID" != "$(id -g hermes)" ]; then
echo "Changing hermes GID to $HERMES_GID"
# -o allows non-unique GID (e.g. macOS GID 20 "staff" may already exist
# as "dialout" in the Debian-based container image)
groupmod -o -g "$HERMES_GID" hermes 2>/dev/null || true
groupmod -g "$HERMES_GID" hermes
fi
actual_hermes_uid=$(id -u hermes)
if [ "$(stat -c %u "$HERMES_HOME" 2>/dev/null)" != "$actual_hermes_uid" ]; then
echo "$HERMES_HOME is not owned by $actual_hermes_uid, fixing"
# In rootless Podman the container's "root" is mapped to an unprivileged
# host UID — chown will fail. That's fine: the volume is already owned
# by the mapped user on the host side.
chown -R hermes:hermes "$HERMES_HOME" 2>/dev/null || \
echo "Warning: chown failed (rootless container?) — continuing anyway"
chown -R hermes:hermes "$HERMES_HOME"
fi
echo "Dropping root privileges"
-6
View File
@@ -554,12 +554,6 @@ def load_gateway_config() -> GatewayConfig:
bridged["mention_patterns"] = platform_cfg["mention_patterns"]
if plat == Platform.DISCORD and "channel_skill_bindings" in platform_cfg:
bridged["channel_skill_bindings"] = platform_cfg["channel_skill_bindings"]
if "channel_prompts" in platform_cfg:
channel_prompts = platform_cfg["channel_prompts"]
if isinstance(channel_prompts, dict):
bridged["channel_prompts"] = {str(k): v for k, v in channel_prompts.items()}
else:
bridged["channel_prompts"] = channel_prompts
if not bridged:
continue
plat_data = platforms_data.setdefault(plat.value, {})
+25 -1
View File
@@ -3,11 +3,12 @@ Event Hook System
A lightweight event-driven system that fires handlers at key lifecycle points.
Hooks are discovered from ~/.hermes/hooks/ directories, each containing:
- HOOK.yaml (metadata: name, description, events list)
- HOOK.yaml (metadata: name, description, events list, optional startup_readiness)
- handler.py (Python handler with async def handle(event_type, context))
Events:
- gateway:startup -- Gateway process starts
- gateway:shutdown -- Gateway process is shutting down
- session:start -- New session created (first message of a new session)
- session:end -- Session ends (user ran /new or /reset)
- session:reset -- Session reset completed (new session entry created)
@@ -31,6 +32,26 @@ from hermes_cli.config import get_hermes_home
HOOKS_DIR = get_hermes_home() / "hooks"
def _normalize_startup_readiness(hook_name: str, manifest: dict[str, Any]) -> Optional[dict[str, Any]]:
"""Validate and normalize optional startup readiness metadata."""
readiness = manifest.get("startup_readiness")
if readiness is None:
return None
if not isinstance(readiness, dict):
print(f"[hooks] Ignoring startup_readiness for {hook_name}: expected mapping", flush=True)
return None
check_id = str(readiness.get("id", "")).strip()
if not check_id:
print(f"[hooks] Ignoring startup_readiness for {hook_name}: missing id", flush=True)
return None
return {
"id": check_id,
"required": bool(readiness.get("required", True)),
}
class HookRegistry:
"""
Discovers, loads, and fires event hooks.
@@ -62,6 +83,7 @@ class HookRegistry:
"description": "Run ~/.hermes/BOOT.md on gateway startup",
"events": ["gateway:startup"],
"path": "(builtin)",
"startup_readiness": None,
})
except Exception as e:
print(f"[hooks] Could not load built-in boot-md hook: {e}", flush=True)
@@ -102,6 +124,7 @@ class HookRegistry:
if not events:
print(f"[hooks] Skipping {hook_name}: no events declared", flush=True)
continue
startup_readiness = _normalize_startup_readiness(hook_name, manifest)
# Dynamically load the handler module
spec = importlib.util.spec_from_file_location(
@@ -128,6 +151,7 @@ class HookRegistry:
"description": manifest.get("description", ""),
"events": events,
"path": str(hook_dir),
"startup_readiness": startup_readiness,
})
print(f"[hooks] Loaded hook '{hook_name}' for events: {events}", flush=True)
+3 -512
View File
@@ -515,8 +515,6 @@ class APIServerAdapter(BasePlatformAdapter):
session_id: Optional[str] = None,
stream_delta_callback=None,
tool_progress_callback=None,
tool_start_callback=None,
tool_complete_callback=None,
) -> Any:
"""
Create an AIAgent instance using the gateway's runtime config.
@@ -555,8 +553,6 @@ class APIServerAdapter(BasePlatformAdapter):
platform="api_server",
stream_delta_callback=stream_delta_callback,
tool_progress_callback=tool_progress_callback,
tool_start_callback=tool_start_callback,
tool_complete_callback=tool_complete_callback,
session_db=self._ensure_session_db(),
fallback_model=fallback_model,
)
@@ -969,427 +965,6 @@ class APIServerAdapter(BasePlatformAdapter):
return response
async def _write_sse_responses(
self,
request: "web.Request",
response_id: str,
model: str,
created_at: int,
stream_q,
agent_task,
agent_ref,
conversation_history: List[Dict[str, str]],
user_message: str,
instructions: Optional[str],
conversation: Optional[str],
store: bool,
session_id: str,
) -> "web.StreamResponse":
"""Write an SSE stream for POST /v1/responses (OpenAI Responses API).
Emits spec-compliant event types as the agent runs:
- ``response.created`` initial envelope (status=in_progress)
- ``response.output_text.delta`` / ``response.output_text.done``
streamed assistant text
- ``response.output_item.added`` / ``response.output_item.done``
with ``item.type == "function_call"`` when the agent invokes a
tool (both events fire; the ``done`` event carries the finalized
``arguments`` string)
- ``response.output_item.added`` with
``item.type == "function_call_output"`` tool result with
``{call_id, output, status}``
- ``response.completed`` terminal event carrying the full
response object with all output items + usage (same payload
shape as the non-streaming path for parity)
- ``response.failed`` terminal event on agent error
If the client disconnects mid-stream, ``agent.interrupt()`` is
called so the agent stops issuing upstream LLM calls, then the
asyncio task is cancelled. When ``store=True`` the full response
is persisted to the ResponseStore in a ``finally`` block so GET
/v1/responses/{id} and ``previous_response_id`` chaining work the
same as the batch path.
"""
import queue as _q
sse_headers = {
"Content-Type": "text/event-stream",
"Cache-Control": "no-cache",
"X-Accel-Buffering": "no",
}
origin = request.headers.get("Origin", "")
cors = self._cors_headers_for_origin(origin) if origin else None
if cors:
sse_headers.update(cors)
if session_id:
sse_headers["X-Hermes-Session-Id"] = session_id
response = web.StreamResponse(status=200, headers=sse_headers)
await response.prepare(request)
# State accumulated during the stream
final_text_parts: List[str] = []
# Track open function_call items by name so we can emit a matching
# ``done`` event when the tool completes. Order preserved.
pending_tool_calls: List[Dict[str, Any]] = []
# Output items we've emitted so far (used to build the terminal
# response.completed payload). Kept in the order they appeared.
emitted_items: List[Dict[str, Any]] = []
# Monotonic counter for output_index (spec requires it).
output_index = 0
# Monotonic counter for call_id generation if the agent doesn't
# provide one (it doesn't, from tool_progress_callback).
call_counter = 0
# Canonical Responses SSE events include a monotonically increasing
# sequence_number. Add it server-side for every emitted event so
# clients that validate the OpenAI event schema can parse our stream.
sequence_number = 0
# Track the assistant message item id + content index for text
# delta events — the spec ties deltas to a specific item.
message_item_id = f"msg_{uuid.uuid4().hex[:24]}"
message_output_index: Optional[int] = None
message_opened = False
async def _write_event(event_type: str, data: Dict[str, Any]) -> None:
nonlocal sequence_number
if "sequence_number" not in data:
data["sequence_number"] = sequence_number
sequence_number += 1
payload = f"event: {event_type}\ndata: {json.dumps(data)}\n\n"
await response.write(payload.encode())
def _envelope(status: str) -> Dict[str, Any]:
env: Dict[str, Any] = {
"id": response_id,
"object": "response",
"status": status,
"created_at": created_at,
"model": model,
}
return env
final_response_text = ""
agent_error: Optional[str] = None
usage: Dict[str, int] = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
try:
# response.created — initial envelope, status=in_progress
created_env = _envelope("in_progress")
created_env["output"] = []
await _write_event("response.created", {
"type": "response.created",
"response": created_env,
})
last_activity = time.monotonic()
async def _open_message_item() -> None:
"""Emit response.output_item.added for the assistant message
the first time any text delta arrives."""
nonlocal message_opened, message_output_index, output_index
if message_opened:
return
message_opened = True
message_output_index = output_index
output_index += 1
item = {
"id": message_item_id,
"type": "message",
"status": "in_progress",
"role": "assistant",
"content": [],
}
await _write_event("response.output_item.added", {
"type": "response.output_item.added",
"output_index": message_output_index,
"item": item,
})
async def _emit_text_delta(delta_text: str) -> None:
await _open_message_item()
final_text_parts.append(delta_text)
await _write_event("response.output_text.delta", {
"type": "response.output_text.delta",
"item_id": message_item_id,
"output_index": message_output_index,
"content_index": 0,
"delta": delta_text,
"logprobs": [],
})
async def _emit_tool_started(payload: Dict[str, Any]) -> str:
"""Emit response.output_item.added for a function_call.
Returns the call_id so the matching completion event can
reference it. Prefer the real ``tool_call_id`` from the
agent when available; fall back to a generated call id for
safety in tests or older code paths.
"""
nonlocal output_index, call_counter
call_counter += 1
call_id = payload.get("tool_call_id") or f"call_{response_id[5:]}_{call_counter}"
args = payload.get("arguments", {})
if isinstance(args, dict):
arguments_str = json.dumps(args)
else:
arguments_str = str(args)
item = {
"id": f"fc_{uuid.uuid4().hex[:24]}",
"type": "function_call",
"status": "in_progress",
"name": payload.get("name", ""),
"call_id": call_id,
"arguments": arguments_str,
}
idx = output_index
output_index += 1
pending_tool_calls.append({
"call_id": call_id,
"name": payload.get("name", ""),
"arguments": arguments_str,
"item_id": item["id"],
"output_index": idx,
})
emitted_items.append({
"type": "function_call",
"name": payload.get("name", ""),
"arguments": arguments_str,
"call_id": call_id,
})
await _write_event("response.output_item.added", {
"type": "response.output_item.added",
"output_index": idx,
"item": item,
})
return call_id
async def _emit_tool_completed(payload: Dict[str, Any]) -> None:
"""Emit response.output_item.done (function_call) followed
by response.output_item.added (function_call_output)."""
nonlocal output_index
call_id = payload.get("tool_call_id")
result = payload.get("result", "")
pending = None
if call_id:
for i, p in enumerate(pending_tool_calls):
if p["call_id"] == call_id:
pending = pending_tool_calls.pop(i)
break
if pending is None:
# Completion without a matching start — skip to avoid
# emitting orphaned done events.
return
# function_call done
done_item = {
"id": pending["item_id"],
"type": "function_call",
"status": "completed",
"name": pending["name"],
"call_id": pending["call_id"],
"arguments": pending["arguments"],
}
await _write_event("response.output_item.done", {
"type": "response.output_item.done",
"output_index": pending["output_index"],
"item": done_item,
})
# function_call_output added (result)
result_str = result if isinstance(result, str) else json.dumps(result)
output_parts = [{"type": "input_text", "text": result_str}]
output_item = {
"id": f"fco_{uuid.uuid4().hex[:24]}",
"type": "function_call_output",
"call_id": pending["call_id"],
"output": output_parts,
"status": "completed",
}
idx = output_index
output_index += 1
emitted_items.append({
"type": "function_call_output",
"call_id": pending["call_id"],
"output": output_parts,
})
await _write_event("response.output_item.added", {
"type": "response.output_item.added",
"output_index": idx,
"item": output_item,
})
await _write_event("response.output_item.done", {
"type": "response.output_item.done",
"output_index": idx,
"item": output_item,
})
# Main drain loop — thread-safe queue fed by agent callbacks.
async def _dispatch(it) -> None:
"""Route a queue item to the correct SSE emitter.
Plain strings are text deltas. Tagged tuples with
``__tool_started__`` / ``__tool_completed__`` prefixes
are tool lifecycle events.
"""
if isinstance(it, tuple) and len(it) == 2 and isinstance(it[0], str):
tag, payload = it
if tag == "__tool_started__":
await _emit_tool_started(payload)
elif tag == "__tool_completed__":
await _emit_tool_completed(payload)
# Unknown tags are silently ignored (forward-compat).
elif isinstance(it, str):
await _emit_text_delta(it)
# Other types (non-string, non-tuple) are silently dropped.
loop = asyncio.get_event_loop()
while True:
try:
item = await loop.run_in_executor(None, lambda: stream_q.get(timeout=0.5))
except _q.Empty:
if agent_task.done():
# Drain remaining
while True:
try:
item = stream_q.get_nowait()
if item is None:
break
await _dispatch(item)
last_activity = time.monotonic()
except _q.Empty:
break
break
if time.monotonic() - last_activity >= CHAT_COMPLETIONS_SSE_KEEPALIVE_SECONDS:
await response.write(b": keepalive\n\n")
last_activity = time.monotonic()
continue
if item is None: # EOS sentinel
break
await _dispatch(item)
last_activity = time.monotonic()
# Pick up agent result + usage from the completed task
try:
result, agent_usage = await agent_task
usage = agent_usage or usage
# If the agent produced a final_response but no text
# deltas were streamed (e.g. some providers only emit
# the full response at the end), emit a single fallback
# delta so Responses clients still receive a live text part.
agent_final = result.get("final_response", "") if isinstance(result, dict) else ""
if agent_final and not final_text_parts:
await _emit_text_delta(agent_final)
if agent_final and not final_response_text:
final_response_text = agent_final
if isinstance(result, dict) and result.get("error") and not final_response_text:
agent_error = result["error"]
except Exception as e: # noqa: BLE001
logger.error("Error running agent for streaming responses: %s", e, exc_info=True)
agent_error = str(e)
# Close the message item if it was opened
final_response_text = "".join(final_text_parts) or final_response_text
if message_opened:
await _write_event("response.output_text.done", {
"type": "response.output_text.done",
"item_id": message_item_id,
"output_index": message_output_index,
"content_index": 0,
"text": final_response_text,
"logprobs": [],
})
msg_done_item = {
"id": message_item_id,
"type": "message",
"status": "completed",
"role": "assistant",
"content": [
{"type": "output_text", "text": final_response_text}
],
}
await _write_event("response.output_item.done", {
"type": "response.output_item.done",
"output_index": message_output_index,
"item": msg_done_item,
})
# Always append a final message item in the completed
# response envelope so clients that only parse the terminal
# payload still see the assistant text. This mirrors the
# shape produced by _extract_output_items in the batch path.
final_items: List[Dict[str, Any]] = list(emitted_items)
final_items.append({
"type": "message",
"role": "assistant",
"content": [
{"type": "output_text", "text": final_response_text or (agent_error or "")}
],
})
if agent_error:
failed_env = _envelope("failed")
failed_env["output"] = final_items
failed_env["error"] = {"message": agent_error, "type": "server_error"}
failed_env["usage"] = {
"input_tokens": usage.get("input_tokens", 0),
"output_tokens": usage.get("output_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
}
await _write_event("response.failed", {
"type": "response.failed",
"response": failed_env,
})
else:
completed_env = _envelope("completed")
completed_env["output"] = final_items
completed_env["usage"] = {
"input_tokens": usage.get("input_tokens", 0),
"output_tokens": usage.get("output_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
}
await _write_event("response.completed", {
"type": "response.completed",
"response": completed_env,
})
# Persist for future chaining / GET retrieval, mirroring
# the batch path behavior.
if store:
full_history = list(conversation_history)
full_history.append({"role": "user", "content": user_message})
if isinstance(result, dict) and result.get("messages"):
full_history.extend(result["messages"])
else:
full_history.append({"role": "assistant", "content": final_response_text})
self._response_store.put(response_id, {
"response": completed_env,
"conversation_history": full_history,
"instructions": instructions,
"session_id": session_id,
})
if conversation:
self._response_store.set_conversation(conversation, response_id)
except (ConnectionResetError, ConnectionAbortedError, BrokenPipeError, OSError):
# Client disconnected — interrupt the agent so it stops
# making upstream LLM calls, then cancel the task.
agent = agent_ref[0] if agent_ref else None
if agent is not None:
try:
agent.interrupt("SSE client disconnected")
except Exception:
pass
if not agent_task.done():
agent_task.cancel()
try:
await agent_task
except (asyncio.CancelledError, Exception):
pass
logger.info("SSE client disconnected; interrupted agent task %s", response_id)
return response
async def _handle_responses(self, request: "web.Request") -> "web.Response":
"""POST /v1/responses — OpenAI Responses API format."""
auth_err = self._check_auth(request)
@@ -1460,13 +1035,11 @@ class APIServerAdapter(BasePlatformAdapter):
if previous_response_id:
logger.debug("Both conversation_history and previous_response_id provided; using conversation_history")
stored_session_id = None
if not conversation_history and previous_response_id:
stored = self._response_store.get(previous_response_id)
if stored is None:
return web.json_response(_openai_error(f"Previous response not found: {previous_response_id}"), status=404)
conversation_history = list(stored.get("conversation_history", []))
stored_session_id = stored.get("session_id")
# If no instructions provided, carry forward from previous
if instructions is None:
instructions = stored.get("instructions")
@@ -1484,83 +1057,8 @@ class APIServerAdapter(BasePlatformAdapter):
if body.get("truncation") == "auto" and len(conversation_history) > 100:
conversation_history = conversation_history[-100:]
# Reuse session from previous_response_id chain so the dashboard
# groups the entire conversation under one session entry.
session_id = stored_session_id or str(uuid.uuid4())
stream = bool(body.get("stream", False))
if stream:
# Streaming branch — emit OpenAI Responses SSE events as the
# agent runs so frontends can render text deltas and tool
# calls in real time. See _write_sse_responses for details.
import queue as _q
_stream_q: _q.Queue = _q.Queue()
def _on_delta(delta):
# None from the agent is a CLI box-close signal, not EOS.
# Forwarding would kill the SSE stream prematurely; the
# SSE writer detects completion via agent_task.done().
if delta is not None:
_stream_q.put(delta)
def _on_tool_progress(event_type, name, preview, args, **kwargs):
"""Queue non-start tool progress events if needed in future.
The structured Responses stream uses ``tool_start_callback``
and ``tool_complete_callback`` for exact call-id correlation,
so progress events are currently ignored here.
"""
return
def _on_tool_start(tool_call_id, function_name, function_args):
"""Queue a started tool for live function_call streaming."""
_stream_q.put(("__tool_started__", {
"tool_call_id": tool_call_id,
"name": function_name,
"arguments": function_args or {},
}))
def _on_tool_complete(tool_call_id, function_name, function_args, function_result):
"""Queue a completed tool result for live function_call_output streaming."""
_stream_q.put(("__tool_completed__", {
"tool_call_id": tool_call_id,
"name": function_name,
"arguments": function_args or {},
"result": function_result,
}))
agent_ref = [None]
agent_task = asyncio.ensure_future(self._run_agent(
user_message=user_message,
conversation_history=conversation_history,
ephemeral_system_prompt=instructions,
session_id=session_id,
stream_delta_callback=_on_delta,
tool_progress_callback=_on_tool_progress,
tool_start_callback=_on_tool_start,
tool_complete_callback=_on_tool_complete,
agent_ref=agent_ref,
))
response_id = f"resp_{uuid.uuid4().hex[:28]}"
model_name = body.get("model", self._model_name)
created_at = int(time.time())
return await self._write_sse_responses(
request=request,
response_id=response_id,
model=model_name,
created_at=created_at,
stream_q=_stream_q,
agent_task=agent_task,
agent_ref=agent_ref,
conversation_history=conversation_history,
user_message=user_message,
instructions=instructions,
conversation=conversation,
store=store,
session_id=session_id,
)
# Run the agent (with Idempotency-Key support)
session_id = str(uuid.uuid4())
async def _compute_response():
return await self._run_agent(
@@ -1635,7 +1133,6 @@ class APIServerAdapter(BasePlatformAdapter):
"response": response_data,
"conversation_history": full_history,
"instructions": instructions,
"session_id": session_id,
})
# Update conversation mapping so the next request with the same
# conversation name automatically chains to this response
@@ -1989,8 +1486,6 @@ class APIServerAdapter(BasePlatformAdapter):
session_id: Optional[str] = None,
stream_delta_callback=None,
tool_progress_callback=None,
tool_start_callback=None,
tool_complete_callback=None,
agent_ref: Optional[list] = None,
) -> tuple:
"""
@@ -2012,8 +1507,6 @@ class APIServerAdapter(BasePlatformAdapter):
session_id=session_id,
stream_delta_callback=stream_delta_callback,
tool_progress_callback=tool_progress_callback,
tool_start_callback=tool_start_callback,
tool_complete_callback=tool_complete_callback,
)
if agent_ref is not None:
agent_ref[0] = agent
@@ -2150,12 +1643,10 @@ class APIServerAdapter(BasePlatformAdapter):
if previous_response_id:
logger.debug("Both conversation_history and previous_response_id provided; using conversation_history")
stored_session_id = None
if not conversation_history and previous_response_id:
stored = self._response_store.get(previous_response_id)
if stored:
conversation_history = list(stored.get("conversation_history", []))
stored_session_id = stored.get("session_id")
if instructions is None:
instructions = stored.get("instructions")
@@ -2174,7 +1665,7 @@ class APIServerAdapter(BasePlatformAdapter):
)
conversation_history.append({"role": msg["role"], "content": str(content)})
session_id = body.get("session_id") or stored_session_id or run_id
session_id = body.get("session_id") or run_id
ephemeral_system_prompt = instructions
async def _run_and_close():
-49
View File
@@ -682,10 +682,6 @@ class MessageEvent:
# Auto-loaded skill(s) for topic/channel bindings (e.g., Telegram DM Topics,
# Discord channel_skill_bindings). A single name or ordered list.
auto_skill: Optional[str | list[str]] = None
# Per-channel ephemeral system prompt (e.g. Discord channel_prompts).
# Applied at API call time and never persisted to transcript history.
channel_prompt: Optional[str] = None
# Internal flag — set for synthetic events (e.g. background process
# completion notifications) that must bypass user authorization checks.
@@ -780,36 +776,6 @@ _RETRYABLE_ERROR_PATTERNS = (
MessageHandler = Callable[[MessageEvent], Awaitable[Optional[str]]]
def resolve_channel_prompt(
config_extra: dict,
channel_id: str,
parent_id: str | None = None,
) -> str | None:
"""Resolve a per-channel ephemeral prompt from platform config.
Looks up ``channel_prompts`` in the adapter's ``config.extra`` dict.
Prefers an exact match on *channel_id*; falls back to *parent_id*
(useful for forum threads / child channels inheriting a parent prompt).
Returns the prompt string, or None if no match is found. Blank/whitespace-
only prompts are treated as absent.
"""
prompts = config_extra.get("channel_prompts") or {}
if not isinstance(prompts, dict):
return None
for key in (channel_id, parent_id):
if not key:
continue
prompt = prompts.get(key)
if prompt is None:
continue
prompt = str(prompt).strip()
if prompt:
return prompt
return None
class BasePlatformAdapter(ABC):
"""
Base class for platform adapters.
@@ -1658,21 +1624,6 @@ class BasePlatformAdapter(ABC):
# streaming already delivered the text (already_sent=True) or
# when the message was queued behind an active agent. Log at
# DEBUG to avoid noisy warnings for expected behavior.
#
# Suppress stale response when the session was interrupted by a
# new message that hasn't been consumed yet. The pending message
# is processed by the pending-message handler below (#8221/#2483).
if (
response
and interrupt_event.is_set()
and session_key in self._pending_messages
):
logger.info(
"[%s] Suppressing stale response for interrupted session %s",
self.name,
session_key,
)
response = None
if not response:
logger.debug("[%s] Handler returned empty/None response for %s", self.name, event.source.chat_id)
if response:
+1 -150
View File
@@ -1379,68 +1379,6 @@ class DiscordAdapter(BasePlatformAdapter):
)
return await super().send_image(chat_id, image_url, caption, reply_to)
async def send_animation(
self,
chat_id: str,
animation_url: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an animated GIF natively as a Discord file attachment."""
if not self._client:
return SendResult(success=False, error="Not connected")
if not is_safe_url(animation_url):
logger.warning("[%s] Blocked unsafe animation URL during Discord send_animation", self.name)
return await super().send_animation(chat_id, animation_url, caption, reply_to, metadata=metadata)
try:
import aiohttp
channel = self._client.get_channel(int(chat_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
if not channel:
return SendResult(success=False, error=f"Channel {chat_id} not found")
# Download the GIF and send as a Discord file attachment
# (Discord renders .gif attachments as auto-playing animations inline)
from gateway.platforms.base import resolve_proxy_url, proxy_kwargs_for_aiohttp
_proxy = resolve_proxy_url(platform_env_var="DISCORD_PROXY")
_sess_kw, _req_kw = proxy_kwargs_for_aiohttp(_proxy)
async with aiohttp.ClientSession(**_sess_kw) as session:
async with session.get(animation_url, timeout=aiohttp.ClientTimeout(total=30), **_req_kw) as resp:
if resp.status != 200:
raise Exception(f"Failed to download animation: HTTP {resp.status}")
animation_data = await resp.read()
import io
file = discord.File(io.BytesIO(animation_data), filename="animation.gif")
msg = await channel.send(
content=caption if caption else None,
file=file,
)
return SendResult(success=True, message_id=str(msg.id))
except ImportError:
logger.warning(
"[%s] aiohttp not installed, falling back to URL. Run: pip install aiohttp",
self.name,
exc_info=True,
)
return await super().send_animation(chat_id, animation_url, caption, reply_to, metadata=metadata)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[%s] Failed to send animation attachment, falling back to URL: %s",
self.name,
e,
exc_info=True,
)
return await super().send_animation(chat_id, animation_url, caption, reply_to, metadata=metadata)
async def send_video(
self,
chat_id: str,
@@ -1758,10 +1696,6 @@ class DiscordAdapter(BasePlatformAdapter):
async def slash_update(interaction: discord.Interaction):
await self._run_simple_slash(interaction, "/update", "Update initiated~")
@tree.command(name="restart", description="Gracefully restart the Hermes gateway")
async def slash_restart(interaction: discord.Interaction):
await self._run_simple_slash(interaction, "/restart", "Restart requested~")
@tree.command(name="approve", description="Approve a pending dangerous command")
@discord.app_commands.describe(scope="Optional: 'all', 'session', 'always', 'all session', 'all always'")
async def slash_approve(interaction: discord.Interaction, scope: str = ""):
@@ -1802,76 +1736,6 @@ class DiscordAdapter(BasePlatformAdapter):
async def slash_btw(interaction: discord.Interaction, question: str):
await self._run_simple_slash(interaction, f"/btw {question}")
# ── Auto-register any gateway-available commands not yet on the tree ──
# This ensures new commands added to COMMAND_REGISTRY in
# hermes_cli/commands.py automatically appear as Discord slash
# commands without needing a manual entry here.
try:
from hermes_cli.commands import COMMAND_REGISTRY, _is_gateway_available, _resolve_config_gates
already_registered = set()
try:
already_registered = {cmd.name for cmd in tree.get_commands()}
except Exception:
pass
config_overrides = _resolve_config_gates()
for cmd_def in COMMAND_REGISTRY:
if not _is_gateway_available(cmd_def, config_overrides):
continue
# Discord command names: lowercase, hyphens OK, max 32 chars.
discord_name = cmd_def.name.lower()[:32]
if discord_name in already_registered:
continue
# Skip aliases that overlap with already-registered names
# (aliases for explicitly registered commands are handled above).
desc = (cmd_def.description or f"Run /{cmd_def.name}")[:100]
has_args = bool(cmd_def.args_hint)
if has_args:
# Command takes optional arguments — create handler with
# an optional ``args`` string parameter.
def _make_args_handler(_name: str, _hint: str):
@discord.app_commands.describe(args=f"Arguments: {_hint}"[:100])
async def _handler(interaction: discord.Interaction, args: str = ""):
await self._run_simple_slash(
interaction, f"/{_name} {args}".strip()
)
_handler.__name__ = f"auto_slash_{_name.replace('-', '_')}"
return _handler
handler = _make_args_handler(cmd_def.name, cmd_def.args_hint)
else:
# Parameterless command.
def _make_simple_handler(_name: str):
async def _handler(interaction: discord.Interaction):
await self._run_simple_slash(interaction, f"/{_name}")
_handler.__name__ = f"auto_slash_{_name.replace('-', '_')}"
return _handler
handler = _make_simple_handler(cmd_def.name)
auto_cmd = discord.app_commands.Command(
name=discord_name,
description=desc,
callback=handler,
)
try:
tree.add_command(auto_cmd)
already_registered.add(discord_name)
except Exception:
# Silently skip commands that fail registration (e.g.
# name conflict with a subcommand group).
pass
logger.debug(
"Discord auto-registered %d commands from COMMAND_REGISTRY",
len(already_registered),
)
except Exception as e:
logger.warning("Discord auto-register from COMMAND_REGISTRY failed: %s", e)
# Register skills under a single /skill command group with category
# subcommand groups. This uses 1 top-level slot instead of N,
# supporting up to 25 categories × 25 skills = 625 skills.
@@ -1992,14 +1856,11 @@ class DiscordAdapter(BasePlatformAdapter):
)
msg_type = MessageType.COMMAND if text.startswith("/") else MessageType.TEXT
channel_id = str(interaction.channel_id)
parent_id = str(getattr(getattr(interaction, "channel", None), "parent_id", "") or "")
return MessageEvent(
text=text,
message_type=msg_type,
source=source,
raw_message=interaction,
channel_prompt=self._resolve_channel_prompt(channel_id, parent_id or None),
)
# ------------------------------------------------------------------
@@ -2070,17 +1931,14 @@ class DiscordAdapter(BasePlatformAdapter):
chat_topic=chat_topic,
)
_parent_channel = self._thread_parent_channel(getattr(interaction, "channel", None))
_parent_id = str(getattr(_parent_channel, "id", "") or "")
_parent_id = str(getattr(getattr(interaction, "channel", None), "parent_id", "") or "")
_skills = self._resolve_channel_skills(thread_id, _parent_id or None)
_channel_prompt = self._resolve_channel_prompt(thread_id, _parent_id or None)
event = MessageEvent(
text=text,
message_type=MessageType.TEXT,
source=source,
raw_message=interaction,
auto_skill=_skills,
channel_prompt=_channel_prompt,
)
await self.handle_message(event)
@@ -2109,11 +1967,6 @@ class DiscordAdapter(BasePlatformAdapter):
return list(dict.fromkeys(skills)) # dedup, preserve order
return None
def _resolve_channel_prompt(self, channel_id: str, parent_id: str | None = None) -> str | None:
"""Resolve a Discord per-channel prompt, preferring the exact channel over its parent."""
from gateway.platforms.base import resolve_channel_prompt
return resolve_channel_prompt(self.config.extra, channel_id, parent_id)
def _thread_parent_channel(self, channel: Any) -> Any:
"""Return the parent text channel when invoked from a thread."""
return getattr(channel, "parent", None) or channel
@@ -2665,7 +2518,6 @@ class DiscordAdapter(BasePlatformAdapter):
_parent_id = str(getattr(_chan, "parent_id", "") or "")
_chan_id = str(getattr(_chan, "id", ""))
_skills = self._resolve_channel_skills(_chan_id, _parent_id or None)
_channel_prompt = self._resolve_channel_prompt(_chan_id, _parent_id or None)
reply_to_id = None
reply_to_text = None
@@ -2686,7 +2538,6 @@ class DiscordAdapter(BasePlatformAdapter):
reply_to_text=reply_to_text,
timestamp=message.created_at,
auto_skill=_skills,
channel_prompt=_channel_prompt,
)
# Track thread participation so the bot won't require @mention for
+1 -4
View File
@@ -49,10 +49,7 @@ class MessageDeduplicator:
return False
now = time.time()
if msg_id in self._seen:
if now - self._seen[msg_id] < self._ttl:
return True
# Entry has expired — remove it and treat as new
del self._seen[msg_id]
return True
self._seen[msg_id] = now
if len(self._seen) > self._max_size:
cutoff = now - self._ttl
-8
View File
@@ -729,14 +729,6 @@ class MatrixAdapter(BasePlatformAdapter):
except Exception:
pass
async def stop_typing(self, chat_id: str) -> None:
"""Stop the Matrix typing indicator."""
if self._client:
try:
await self._client.set_typing(RoomID(chat_id), timeout=0)
except Exception:
pass
async def edit_message(
self, chat_id: str, message_id: str, content: str
) -> SendResult:
-7
View File
@@ -718,12 +718,6 @@ class MattermostAdapter(BasePlatformAdapter):
thread_id=thread_id,
)
# Per-channel ephemeral prompt
from gateway.platforms.base import resolve_channel_prompt
_channel_prompt = resolve_channel_prompt(
self.config.extra, channel_id, None,
)
msg_event = MessageEvent(
text=message_text,
message_type=msg_type,
@@ -732,7 +726,6 @@ class MattermostAdapter(BasePlatformAdapter):
message_id=post_id,
media_urls=media_urls if media_urls else None,
media_types=media_types if media_types else None,
channel_prompt=_channel_prompt,
)
await self.handle_message(msg_event)
-7
View File
@@ -1167,12 +1167,6 @@ class SlackAdapter(BasePlatformAdapter):
thread_id=thread_ts,
)
# Per-channel ephemeral prompt
from gateway.platforms.base import resolve_channel_prompt
_channel_prompt = resolve_channel_prompt(
self.config.extra, channel_id, None,
)
msg_event = MessageEvent(
text=text,
message_type=msg_type,
@@ -1182,7 +1176,6 @@ class SlackAdapter(BasePlatformAdapter):
media_urls=media_urls,
media_types=media_types,
reply_to_message_id=thread_ts if thread_ts != ts else None,
channel_prompt=_channel_prompt,
)
# Only react when bot is directly addressed (DM or @mention).
+6 -26
View File
@@ -163,15 +163,6 @@ class TelegramAdapter(BasePlatformAdapter):
# Approval button state: message_id → session_key
self._approval_state: Dict[int, str] = {}
@staticmethod
def _is_callback_user_authorized(user_id: str) -> bool:
"""Return whether a Telegram inline-button caller may perform gated actions."""
allowed_csv = os.getenv("TELEGRAM_ALLOWED_USERS", "").strip()
if not allowed_csv:
return True
allowed_ids = {uid.strip() for uid in allowed_csv.split(",") if uid.strip()}
return "*" in allowed_ids or user_id in allowed_ids
def _fallback_ips(self) -> list[str]:
"""Return validated fallback IPs from config (populated by _apply_env_overrides)."""
configured = self.config.extra.get("fallback_ips", []) if getattr(self.config, "extra", None) else []
@@ -1449,9 +1440,12 @@ class TelegramAdapter(BasePlatformAdapter):
# Only authorized users may click approval buttons.
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(caller_id):
await query.answer(text="⛔ You are not authorized to approve commands.")
return
allowed_csv = os.getenv("TELEGRAM_ALLOWED_USERS", "").strip()
if allowed_csv:
allowed_ids = {uid.strip() for uid in allowed_csv.split(",") if uid.strip()}
if "*" not in allowed_ids and caller_id not in allowed_ids:
await query.answer(text="⛔ You are not authorized to approve commands.")
return
session_key = self._approval_state.pop(approval_id, None)
if not session_key:
@@ -1496,10 +1490,6 @@ class TelegramAdapter(BasePlatformAdapter):
if not data.startswith("update_prompt:"):
return
answer = data.split(":", 1)[1] # "y" or "n"
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(caller_id):
await query.answer(text="⛔ You are not authorized to answer update prompts.")
return
await query.answer(text=f"Sent '{answer}' to the update process.")
# Edit the message to show the choice and remove buttons
label = "Yes" if answer == "y" else "No"
@@ -2775,15 +2765,6 @@ class TelegramAdapter(BasePlatformAdapter):
reply_to_id = str(message.reply_to_message.message_id)
reply_to_text = message.reply_to_message.text or message.reply_to_message.caption or None
# Per-channel/topic ephemeral prompt
from gateway.platforms.base import resolve_channel_prompt
_chat_id_str = str(chat.id)
_channel_prompt = resolve_channel_prompt(
self.config.extra,
thread_id_str or _chat_id_str,
_chat_id_str if thread_id_str else None,
)
return MessageEvent(
text=message.text or "",
message_type=msg_type,
@@ -2793,7 +2774,6 @@ class TelegramAdapter(BasePlatformAdapter):
reply_to_message_id=reply_to_id,
reply_to_text=reply_to_text,
auto_skill=topic_skill,
channel_prompt=_channel_prompt,
timestamp=message.date,
)
+112 -446
View File
@@ -482,32 +482,6 @@ def _resolve_hermes_bin() -> Optional[list[str]]:
return None
def _parse_session_key(session_key: str) -> "dict | None":
"""Parse a session key into its component parts.
Session keys follow the format
``agent:main:{platform}:{chat_type}:{chat_id}[:{extra}...]``.
Returns a dict with ``platform``, ``chat_type``, ``chat_id``, and
optionally ``thread_id`` keys, or None if the key doesn't match.
The 6th element is only returned as ``thread_id`` for chat types where
it is unambiguous (``dm`` and ``thread``). For group/channel sessions
the suffix may be a user_id (per-user isolation) rather than a
thread_id, so we leave ``thread_id`` out to avoid mis-routing.
"""
parts = session_key.split(":")
if len(parts) >= 5 and parts[0] == "agent" and parts[1] == "main":
result = {
"platform": parts[2],
"chat_type": parts[3],
"chat_id": parts[4],
}
if len(parts) > 5 and parts[3] in ("dm", "thread"):
result["thread_id"] = parts[5]
return result
return None
def _format_gateway_process_notification(evt: dict) -> "str | None":
"""Format a watch pattern event from completion_queue into a [SYSTEM:] message."""
evt_type = evt.get("type", "completion")
@@ -599,7 +573,6 @@ class GatewayRunner:
self._running_agents: Dict[str, Any] = {}
self._running_agents_ts: Dict[str, float] = {} # start timestamp per session
self._pending_messages: Dict[str, str] = {} # Queued messages during interrupt
self._busy_ack_ts: Dict[str, float] = {} # last busy-ack timestamp per session (debounce)
# Cache AIAgent instances per session to preserve prompt caching.
# Without this, a new AIAgent is created per message, rebuilding the
@@ -1356,100 +1329,26 @@ class GatewayRunner:
merge_pending_message_event(adapter._pending_messages, session_key, event)
async def _handle_active_session_busy_message(self, event: MessageEvent, session_key: str) -> bool:
# --- Draining case (gateway restarting/stopping) ---
if self._draining:
adapter = self.adapters.get(event.source.platform)
if not adapter:
return True
thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
if self._queue_during_drain_enabled():
self._queue_or_replace_pending_event(session_key, event)
message = f"⏳ Gateway {self._status_action_gerund()} — queued for the next turn after it comes back."
else:
message = f"⏳ Gateway is {self._status_action_gerund()} and is not accepting another turn right now."
await adapter._send_with_retry(
chat_id=event.source.chat_id,
content=message,
reply_to=event.message_id,
metadata=thread_meta,
)
return True
# --- Normal busy case (agent actively running a task) ---
# The user sent a message while the agent is working. Interrupt the
# agent immediately so it stops the current tool-calling loop and
# processes the new message. The pending message is stored in the
# adapter so the base adapter picks it up once the interrupted run
# returns. A brief ack tells the user what's happening (debounced
# to avoid spam when they fire multiple messages quickly).
if not self._draining:
return False
adapter = self.adapters.get(event.source.platform)
if not adapter:
return False # let default path handle it
# Store the message so it's processed as the next turn after the
# interrupt causes the current run to exit.
from gateway.platforms.base import merge_pending_message_event
merge_pending_message_event(adapter._pending_messages, session_key, event)
# Interrupt the running agent — this aborts in-flight tool calls and
# causes the agent loop to exit at the next check point.
running_agent = self._running_agents.get(session_key)
if running_agent and running_agent is not _AGENT_PENDING_SENTINEL:
try:
running_agent.interrupt(event.text)
except Exception:
pass # don't let interrupt failure block the ack
# Debounce: only send an acknowledgment once every 30 seconds per session
# to avoid spamming the user when they send multiple messages quickly
_BUSY_ACK_COOLDOWN = 30
now = time.time()
last_ack = self._busy_ack_ts.get(session_key, 0)
if now - last_ack < _BUSY_ACK_COOLDOWN:
return True # interrupt sent, ack already delivered recently
self._busy_ack_ts[session_key] = now
# Build a status-rich acknowledgment
status_parts = []
if running_agent and running_agent is not _AGENT_PENDING_SENTINEL:
try:
summary = running_agent.get_activity_summary()
iteration = summary.get("api_call_count", 0)
max_iter = summary.get("max_iterations", 0)
current_tool = summary.get("current_tool")
start_ts = self._running_agents_ts.get(session_key, 0)
if start_ts:
elapsed_min = int((now - start_ts) / 60)
if elapsed_min > 0:
status_parts.append(f"{elapsed_min} min elapsed")
if max_iter:
status_parts.append(f"iteration {iteration}/{max_iter}")
if current_tool:
status_parts.append(f"running: {current_tool}")
except Exception:
pass
status_detail = f" ({', '.join(status_parts)})" if status_parts else ""
message = (
f"⚡ Interrupting current task{status_detail}. "
f"I'll respond to your message shortly."
)
return True
thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
try:
await adapter._send_with_retry(
chat_id=event.source.chat_id,
content=message,
reply_to=event.message_id,
metadata=thread_meta,
)
except Exception as e:
logger.debug("Failed to send busy-ack: %s", e)
if self._queue_during_drain_enabled():
self._queue_or_replace_pending_event(session_key, event)
message = f"⏳ Gateway {self._status_action_gerund()} — queued for the next turn after it comes back."
else:
message = f"⏳ Gateway is {self._status_action_gerund()} and is not accepting another turn right now."
await adapter._send_with_retry(
chat_id=event.source.chat_id,
content=message,
reply_to=event.message_id,
metadata=thread_meta,
)
return True
async def _drain_active_agents(self, timeout: float) -> tuple[Dict[str, Any], bool]:
@@ -1506,7 +1405,7 @@ class GatewayRunner:
action = "restarting" if self._restart_requested else "shutting down"
hint = (
"Your current task will be interrupted. "
"Send any message after restart to resume where it left off."
"Use /retry after restart to continue."
if self._restart_requested
else "Your current task will be interrupted."
)
@@ -1515,11 +1414,12 @@ class GatewayRunner:
notified: set = set()
for session_key in active:
# Parse platform + chat_id from the session key.
_parsed = _parse_session_key(session_key)
if not _parsed:
# Format: agent:main:{platform}:{chat_type}:{chat_id}[:{extra}...]
parts = session_key.split(":")
if len(parts) < 5:
continue
platform_str = _parsed["platform"]
chat_id = _parsed["chat_id"]
platform_str = parts[2]
chat_id = parts[4]
# Deduplicate: one notification per chat, even if multiple
# sessions (different users/threads) share the same chat.
@@ -1535,7 +1435,7 @@ class GatewayRunner:
# Include thread_id if present so the message lands in the
# correct forum topic / thread.
thread_id = _parsed.get("thread_id")
thread_id = parts[5] if len(parts) > 5 else None
metadata = {"thread_id": thread_id} if thread_id else None
await adapter.send(chat_id, msg, metadata=metadata)
@@ -1575,106 +1475,6 @@ class GatewayRunner:
except Exception:
pass
_STUCK_LOOP_THRESHOLD = 3 # restarts while active before auto-suspend
_STUCK_LOOP_FILE = ".restart_failure_counts"
def _increment_restart_failure_counts(self, active_session_keys: set) -> None:
"""Increment restart-failure counters for sessions active at shutdown.
Persists to a JSON file so counters survive across restarts.
Sessions NOT in active_session_keys are removed (they completed
successfully, so the loop is broken).
"""
import json
path = _hermes_home / self._STUCK_LOOP_FILE
try:
counts = json.loads(path.read_text()) if path.exists() else {}
except Exception:
counts = {}
# Increment active sessions, remove inactive ones (loop broken)
new_counts = {}
for key in active_session_keys:
new_counts[key] = counts.get(key, 0) + 1
# Keep any entries that are still above 0 even if not active now
# (they might become active again next restart)
try:
path.write_text(json.dumps(new_counts))
except Exception:
pass
def _suspend_stuck_loop_sessions(self) -> int:
"""Suspend sessions that have been active across too many restarts.
Returns the number of sessions suspended. Called on gateway startup
AFTER suspend_recently_active() to catch the stuck-loop pattern:
session loads agent gets stuck gateway restarts repeat.
"""
import json
path = _hermes_home / self._STUCK_LOOP_FILE
if not path.exists():
return 0
try:
counts = json.loads(path.read_text())
except Exception:
return 0
suspended = 0
stuck_keys = [k for k, v in counts.items() if v >= self._STUCK_LOOP_THRESHOLD]
for session_key in stuck_keys:
try:
entry = self.session_store._entries.get(session_key)
if entry and not entry.suspended:
entry.suspended = True
suspended += 1
logger.warning(
"Auto-suspended stuck session %s (active across %d "
"consecutive restarts — likely a stuck loop)",
session_key[:30], counts[session_key],
)
except Exception:
pass
if suspended:
try:
self.session_store._save()
except Exception:
pass
# Clear the file — counters start fresh after suspension
try:
path.unlink(missing_ok=True)
except Exception:
pass
return suspended
def _clear_restart_failure_count(self, session_key: str) -> None:
"""Clear the restart-failure counter for a session that completed OK.
Called after a successful agent turn to signal the loop is broken.
"""
import json
path = _hermes_home / self._STUCK_LOOP_FILE
if not path.exists():
return
try:
counts = json.loads(path.read_text())
if session_key in counts:
del counts[session_key]
if counts:
path.write_text(json.dumps(counts))
else:
path.unlink(missing_ok=True)
except Exception:
pass
async def _launch_detached_restart_command(self) -> None:
import shutil
import subprocess
@@ -1740,7 +1540,7 @@ class GatewayRunner:
pass
try:
from gateway.status import write_runtime_status
write_runtime_status(gateway_state="starting", exit_reason=None)
write_runtime_status(gateway_state="starting", exit_reason=None, startup_checks={})
except Exception:
pass
@@ -1782,8 +1582,23 @@ class GatewayRunner:
"or configure platform allowlists (e.g., TELEGRAM_ALLOWED_USERS=your_id)."
)
# Discover plugins before hooks so plugin-owned hook bundles can
# participate in this same startup cycle.
try:
from hermes_cli.plugins import discover_plugins
discover_plugins()
except Exception as e:
logger.warning("Plugin discovery during gateway startup failed: %s", e)
# Discover and load event hooks
self.hooks.discover_and_load()
try:
from gateway.status import reset_startup_checks
reset_startup_checks(self.hooks.loaded_hooks)
except Exception as e:
logger.warning("Startup readiness initialization failed: %s", e)
# Recover background processes from checkpoint (crash recovery)
try:
@@ -1818,17 +1633,6 @@ class GatewayRunner:
except Exception as e:
logger.warning("Session suspension on startup failed: %s", e)
# Stuck-loop detection (#7536): if a session has been active across
# 3+ consecutive restarts, it's probably stuck in a loop (the same
# history keeps causing the agent to hang). Auto-suspend it so the
# user gets a clean slate on the next message.
try:
stuck = self._suspend_stuck_loop_sessions()
if stuck:
logger.warning("Auto-suspended %d stuck-loop session(s)", stuck)
except Exception as e:
logger.debug("Stuck-loop detection failed: %s", e)
connected_count = 0
enabled_platform_count = 0
startup_nonretryable_errors: list[str] = []
@@ -2315,6 +2119,11 @@ class GatewayRunner:
logger.error("Failed to launch detached gateway restart: %s", e)
self._finalize_shutdown_agents(active_agents)
await self.hooks.emit("gateway:shutdown", {
"restart": self._restart_requested,
"service_restart": self._restart_via_service,
"detached_restart": self._restart_detached,
})
for platform, adapter in list(self.adapters.items()):
try:
@@ -2337,8 +2146,6 @@ class GatewayRunner:
self._running_agents.clear()
self._pending_messages.clear()
self._pending_approvals.clear()
if hasattr(self, '_busy_ack_ts'):
self._busy_ack_ts.clear()
self._shutdown_event.set()
# Global cleanup: kill any remaining tool subprocesses not tied
@@ -2382,14 +2189,6 @@ class GatewayRunner:
"active sessions."
)
# Track sessions that were active at shutdown for stuck-loop
# detection (#7536). On each restart, the counter increments
# for sessions that were running. If a session hits the
# threshold (3 consecutive restarts while active), the next
# startup auto-suspends it — breaking the loop.
if active_agents:
self._increment_restart_failure_counts(set(active_agents.keys()))
if self._restart_requested and self._restart_via_service:
self._exit_code = GATEWAY_SERVICE_RESTART_EXIT_CODE
self._exit_reason = self._exit_reason or "Gateway restart requested"
@@ -2823,7 +2622,6 @@ class GatewayRunner:
)
del self._running_agents[_quick_key]
self._running_agents_ts.pop(_quick_key, None)
self._busy_ack_ts.pop(_quick_key, None)
if _quick_key in self._running_agents:
if event.get_command() == "status":
@@ -2891,7 +2689,6 @@ class GatewayRunner:
message_type=_MT.TEXT,
source=event.source,
message_id=event.message_id,
channel_prompt=event.channel_prompt,
)
adapter._pending_messages[_quick_key] = queued_event
return "Queued for the next turn."
@@ -3869,7 +3666,6 @@ class GatewayRunner:
session_id=session_entry.session_id,
session_key=session_key,
event_message_id=event.message_id,
channel_prompt=event.channel_prompt,
)
# Stop persistent typing indicator now that the agent is done
@@ -3881,18 +3677,6 @@ class GatewayRunner:
pass
response = agent_result.get("final_response") or ""
# Convert the agent's internal "(empty)" sentinel into a
# user-friendly message. "(empty)" means the model failed to
# produce visible content after exhausting all retries (nudge,
# prefill, empty-retry, fallback). Sending the raw sentinel
# looks like a bug; a short explanation is more helpful.
if response == "(empty)":
response = (
"⚠️ The model returned no response after processing tool "
"results. This can happen with some models — try again or "
"rephrase your question."
)
agent_messages = agent_result.get("messages", [])
_response_time = time.time() - _msg_start_time
_api_calls = agent_result.get("api_calls", 0)
@@ -3903,12 +3687,6 @@ class GatewayRunner:
_response_time, _api_calls, _resp_len,
)
# Successful turn — clear any stuck-loop counter for this session.
# This ensures the counter only accumulates across CONSECUTIVE
# restarts where the session was active (never completed).
if session_key:
self._clear_restart_failure_count(session_key)
# Surface error details when the agent failed silently (final_response=None)
if not response and agent_result.get("failed"):
error_detail = agent_result.get("error", "unknown error")
@@ -3997,7 +3775,7 @@ class GatewayRunner:
synth_text = _format_gateway_process_notification(evt)
if synth_text:
try:
await self._inject_watch_notification(synth_text, evt)
await self._inject_watch_notification(synth_text, event)
except Exception as e2:
logger.error("Watch notification injection error: %s", e2)
except Exception as e:
@@ -4015,11 +3793,14 @@ class GatewayRunner:
# intermediate reasoning) so sessions can be resumed with full context
# and transcripts are useful for debugging and training data.
#
# IMPORTANT: When the agent failed (e.g. context-overflow 400,
# compression exhausted), do NOT persist the user's message.
# IMPORTANT: When the agent failed before producing any response
# (e.g. context-overflow 400), do NOT persist the user's message.
# Persisting it would make the session even larger, causing the
# same failure on the next attempt — an infinite loop. (#1630, #9893)
agent_failed_early = bool(agent_result.get("failed"))
# same failure on the next attempt — an infinite loop. (#1630)
agent_failed_early = (
agent_result.get("failed")
and not agent_result.get("final_response")
)
if agent_failed_early:
logger.info(
"Skipping transcript persistence for failed request in "
@@ -4027,24 +3808,6 @@ class GatewayRunner:
session_entry.session_id,
)
# When compression is exhausted, the session is permanently too
# large to process. Auto-reset it so the next message starts
# fresh instead of replaying the same oversized context in an
# infinite fail loop. (#9893)
if agent_result.get("compression_exhausted") and session_entry and session_key:
logger.info(
"Auto-resetting session %s after compression exhaustion.",
session_entry.session_id,
)
self.session_store.reset_session(session_key)
self._evict_cached_agent(session_key)
self._session_model_overrides.pop(session_key, None)
response = (response or "") + (
"\n\n🔄 Session auto-reset — the conversation exceeded the "
"maximum context size and could not be compressed further. "
"Your next message will start a fresh session."
)
ts = datetime.now().isoformat()
# If this is a fresh session (no history), write the full tool
@@ -4152,8 +3915,6 @@ class GatewayRunner:
_hist_len = len(history) if 'history' in locals() else 0
if status_code == 401:
status_hint = " Check your API key or run `claude /login` to refresh OAuth credentials."
elif status_code == 402:
status_hint = " Your API balance or quota is exhausted. Check your provider dashboard."
elif status_code == 429:
# Check if this is a plan usage limit (resets on a schedule) vs a transient rate limit
_err_body = getattr(e, "response", None)
@@ -5091,7 +4852,6 @@ class GatewayRunner:
message_type=MessageType.TEXT,
source=source,
raw_message=event.raw_message,
channel_prompt=event.channel_prompt,
)
# Let the normal message handler process it
@@ -6823,17 +6583,11 @@ class GatewayRunner:
})
async def _handle_debug_command(self, event: MessageEvent) -> str:
"""Handle /debug — upload debug report (summary only) and return paste URLs.
Gateway uploads ONLY the summary report (system info + log tails),
NOT full log files, to protect conversation privacy. Users who need
full log uploads should use ``hermes debug share`` from the CLI.
"""
"""Handle /debug — upload debug report + logs and return paste URLs."""
import asyncio
from hermes_cli.debug import (
_capture_dump, collect_debug_report,
upload_to_pastebin, _schedule_auto_delete,
_GATEWAY_PRIVACY_NOTICE,
_capture_dump, collect_debug_report, _read_full_log,
upload_to_pastebin,
)
loop = asyncio.get_running_loop()
@@ -6842,25 +6596,43 @@ class GatewayRunner:
def _collect_and_upload():
dump_text = _capture_dump()
report = collect_debug_report(log_lines=200, dump_text=dump_text)
agent_log = _read_full_log("agent")
gateway_log = _read_full_log("gateway")
if agent_log:
agent_log = dump_text + "\n\n--- full agent.log ---\n" + agent_log
if gateway_log:
gateway_log = dump_text + "\n\n--- full gateway.log ---\n" + gateway_log
urls = {}
failures = []
try:
urls["Report"] = upload_to_pastebin(report)
except Exception as exc:
return f"✗ Failed to upload debug report: {exc}"
# Schedule auto-deletion after 1 hour
_schedule_auto_delete(list(urls.values()))
if agent_log:
try:
urls["agent.log"] = upload_to_pastebin(agent_log)
except Exception:
failures.append("agent.log")
lines = [_GATEWAY_PRIVACY_NOTICE, "", "**Debug report uploaded:**", ""]
if gateway_log:
try:
urls["gateway.log"] = upload_to_pastebin(gateway_log)
except Exception:
failures.append("gateway.log")
lines = ["**Debug report uploaded:**", ""]
label_width = max(len(k) for k in urls)
for label, url in urls.items():
lines.append(f"`{label:<{label_width}}` {url}")
lines.append("")
lines.append("⏱ Pastes will auto-delete in 1 hour.")
lines.append("For full log uploads, use `hermes debug share` from the CLI.")
lines.append("Share these links with the Hermes team for support.")
if failures:
lines.append(f"\n_(failed to upload: {', '.join(failures)})_")
lines.append("\nShare these links with the Hermes team for support.")
return "\n".join(lines)
return await loop.run_in_executor(None, _collect_and_upload)
@@ -7480,75 +7252,14 @@ class GatewayRunner:
return prefix
return user_text
def _build_process_event_source(self, evt: dict):
"""Resolve the canonical source for a synthetic background-process event.
Prefer the persisted session-store origin for the event's session key.
Falling back to the currently active foreground event is what causes
cross-topic bleed, so don't do that.
"""
from gateway.session import SessionSource
session_key = str(evt.get("session_key") or "").strip()
derived_platform = ""
derived_chat_type = ""
derived_chat_id = ""
if session_key:
try:
self.session_store._ensure_loaded()
entry = self.session_store._entries.get(session_key)
if entry and getattr(entry, "origin", None):
return entry.origin
except Exception as exc:
logger.debug(
"Synthetic process-event session-store lookup failed for %s: %s",
session_key,
exc,
)
_parsed = _parse_session_key(session_key)
if _parsed:
derived_platform = _parsed["platform"]
derived_chat_type = _parsed["chat_type"]
derived_chat_id = _parsed["chat_id"]
platform_name = str(evt.get("platform") or derived_platform or "").strip().lower()
chat_type = str(evt.get("chat_type") or derived_chat_type or "").strip().lower()
chat_id = str(evt.get("chat_id") or derived_chat_id or "").strip()
if not platform_name or not chat_type or not chat_id:
return None
try:
platform = Platform(platform_name)
except Exception:
logger.warning(
"Synthetic process event has invalid platform metadata: %r",
platform_name,
)
return None
return SessionSource(
platform=platform,
chat_id=chat_id,
chat_type=chat_type,
thread_id=str(evt.get("thread_id") or "").strip() or None,
user_id=str(evt.get("user_id") or "").strip() or None,
user_name=str(evt.get("user_name") or "").strip() or None,
)
async def _inject_watch_notification(self, synth_text: str, evt: dict) -> None:
async def _inject_watch_notification(self, synth_text: str, original_event) -> None:
"""Inject a watch-pattern notification as a synthetic message event.
Routing must come from the queued watch event itself, not from whatever
foreground message happened to be active when the queue was drained.
Uses the source from the original user event to route the notification
back to the correct chat/adapter.
"""
source = self._build_process_event_source(evt)
source = getattr(original_event, "source", None)
if not source:
logger.warning(
"Dropping watch notification with no routing metadata for process %s",
evt.get("session_id", "unknown"),
)
return
platform_name = source.platform.value if hasattr(source.platform, "value") else str(source.platform)
adapter = None
@@ -7566,12 +7277,7 @@ class GatewayRunner:
source=source,
internal=True,
)
logger.info(
"Watch pattern notification — injecting for %s chat=%s thread=%s",
platform_name,
source.chat_id,
source.thread_id,
)
logger.info("Watch pattern notification — injecting for %s", platform_name)
await adapter.handle_message(synth_event)
except Exception as e:
logger.error("Watch notification injection error: %s", e)
@@ -7641,42 +7347,33 @@ class GatewayRunner:
f"Command: {session.command}\n"
f"Output:\n{_out}]"
)
source = self._build_process_event_source({
"session_id": session_id,
"session_key": session_key,
"platform": platform_name,
"chat_id": chat_id,
"thread_id": thread_id,
"user_id": user_id,
"user_name": user_name,
})
if not source:
logger.warning(
"Dropping completion notification with no routing metadata for process %s",
session_id,
)
break
adapter = None
for p, a in self.adapters.items():
if p == source.platform:
if p.value == platform_name:
adapter = a
break
if adapter and source.chat_id:
if adapter and chat_id:
try:
from gateway.platforms.base import MessageEvent, MessageType
from gateway.session import SessionSource
from gateway.config import Platform
_platform_enum = Platform(platform_name)
_source = SessionSource(
platform=_platform_enum,
chat_id=chat_id,
thread_id=thread_id or None,
user_id=user_id or None,
user_name=user_name or None,
)
synth_event = MessageEvent(
text=synth_text,
message_type=MessageType.TEXT,
source=source,
source=_source,
internal=True,
)
logger.info(
"Process %s finished — injecting agent notification for session %s chat=%s thread=%s",
session_id,
session_key,
source.chat_id,
source.thread_id,
"Process %s finished — injecting agent notification for session %s",
session_id, session_key,
)
await adapter.handle_message(synth_event)
except Exception as e:
@@ -8072,7 +7769,6 @@ class GatewayRunner:
session_key: str = None,
_interrupt_depth: int = 0,
event_message_id: Optional[str] = None,
channel_prompt: Optional[str] = None,
) -> Dict[str, Any]:
"""
Run the agent with the given message and context.
@@ -8427,12 +8123,8 @@ class GatewayRunner:
# Platform.LOCAL ("local") maps to "cli"; others pass through as-is.
platform_key = "cli" if source.platform == Platform.LOCAL else source.platform.value
# Combine platform context, per-channel context, and the user-configured
# ephemeral system prompt.
# Combine platform context with user-configured ephemeral system prompt
combined_ephemeral = context_prompt or ""
event_channel_prompt = (channel_prompt or "").strip()
if event_channel_prompt:
combined_ephemeral = (combined_ephemeral + "\n\n" + event_channel_prompt).strip()
if self._ephemeral_system_prompt:
combined_ephemeral = (combined_ephemeral + "\n\n" + self._ephemeral_system_prompt).strip()
@@ -8566,12 +8258,6 @@ class GatewayRunner:
cached = _cache.get(session_key)
if cached and cached[1] == _sig:
agent = cached[0]
# Reset activity timestamp so the inactivity timeout
# handler doesn't see stale idle time from the previous
# turn and immediately kill this agent. (#9051)
agent._last_activity_ts = time.time()
agent._last_activity_desc = "starting new turn (cached)"
agent._api_call_count = 0
logger.debug("Reusing cached agent for session %s", session_key)
if agent is None:
@@ -8784,21 +8470,6 @@ class GatewayRunner:
if _msn:
message = _msn + "\n\n" + message
# Auto-continue: if the loaded history ends with a tool result,
# the previous agent turn was interrupted mid-work (gateway
# restart, crash, SIGTERM). Prepend a system note so the model
# finishes processing the pending tool results before addressing
# the user's new message. (#4493)
if agent_history and agent_history[-1].get("role") == "tool":
message = (
"[System note: Your previous turn was interrupted before you could "
"process the last tool result(s). The conversation history contains "
"tool outputs you haven't responded to yet. Please finish processing "
"those results and summarize what was accomplished, then address the "
"user's new message below.]\n\n"
+ message
)
_approval_session_key = session_key or ""
_approval_session_token = set_current_session_key(_approval_session_key)
register_gateway_notify(_approval_session_key, _approval_notify_sync)
@@ -8833,8 +8504,6 @@ class GatewayRunner:
"final_response": error_msg,
"messages": result.get("messages", []),
"api_calls": result.get("api_calls", 0),
"failed": result.get("failed", False),
"compression_exhausted": result.get("compression_exhausted", False),
"tools": tools_holder[0] or [],
"history_offset": len(agent_history),
"last_prompt_tokens": _last_prompt_toks,
@@ -9339,11 +9008,15 @@ class GatewayRunner:
pass
except Exception as e:
logger.debug("Stream consumer wait before queued message failed: %s", e)
_response_previewed = bool(result.get("response_previewed"))
_already_streamed = bool(
_sc
and (
getattr(_sc, "final_response_sent", False)
or getattr(_sc, "already_sent", False)
or (
_response_previewed
and getattr(_sc, "already_sent", False)
)
)
)
first_response = result.get("final_response", "")
@@ -9384,7 +9057,6 @@ class GatewayRunner:
session_key=session_key,
_interrupt_depth=_interrupt_depth + 1,
event_message_id=next_message_id,
channel_prompt=pending_event.channel_prompt,
)
finally:
# Stop progress sender, interrupt monitor, and notification task
@@ -9426,21 +9098,15 @@ class GatewayRunner:
# BUT: never suppress delivery when the agent failed — the error
# message is new content the user hasn't seen, and it must reach
# them even if streaming had sent earlier partial output.
#
# Also never suppress when the final response is "(empty)" — this
# means the model failed to produce content after tool calls (common
# with mimo-v2-pro, GLM-5, etc.). The stream consumer may have
# sent intermediate text ("Let me search for that…") alongside the
# tool call, setting already_sent=True, but that text is NOT the
# final answer. Suppressing delivery here leaves the user staring
# at silence. (#10xxx — "agent stops after web search")
_sc = stream_consumer_holder[0]
if _sc and isinstance(response, dict) and not response.get("failed"):
_final = response.get("final_response") or ""
_is_empty_sentinel = not _final or _final == "(empty)"
if not _is_empty_sentinel and (
_response_previewed = bool(response.get("response_previewed"))
if (
getattr(_sc, "final_response_sent", False)
or getattr(_sc, "already_sent", False)
or (
_response_previewed
and getattr(_sc, "already_sent", False)
)
):
response["already_sent"] = True
@@ -9749,9 +9415,9 @@ def main():
config = None
if args.config:
import yaml
import json
with open(args.config, encoding="utf-8") as f:
data = yaml.safe_load(f)
data = json.load(f)
config = GatewayConfig.from_dict(data)
# Run the gateway - exit with code 1 if no platforms connected,
+17 -34
View File
@@ -37,24 +37,18 @@ needs to replace the import + call site:
"""
from contextvars import ContextVar
from typing import Any
# Sentinel to distinguish "never set in this context" from "explicitly set to empty".
# When a contextvar holds _UNSET, we fall back to os.environ (CLI/cron compat).
# When it holds "" (after clear_session_vars resets it), we return "" — no fallback.
_UNSET: Any = object()
# ---------------------------------------------------------------------------
# Per-task session variables
# ---------------------------------------------------------------------------
_SESSION_PLATFORM: ContextVar = ContextVar("HERMES_SESSION_PLATFORM", default=_UNSET)
_SESSION_CHAT_ID: ContextVar = ContextVar("HERMES_SESSION_CHAT_ID", default=_UNSET)
_SESSION_CHAT_NAME: ContextVar = ContextVar("HERMES_SESSION_CHAT_NAME", default=_UNSET)
_SESSION_THREAD_ID: ContextVar = ContextVar("HERMES_SESSION_THREAD_ID", default=_UNSET)
_SESSION_USER_ID: ContextVar = ContextVar("HERMES_SESSION_USER_ID", default=_UNSET)
_SESSION_USER_NAME: ContextVar = ContextVar("HERMES_SESSION_USER_NAME", default=_UNSET)
_SESSION_KEY: ContextVar = ContextVar("HERMES_SESSION_KEY", default=_UNSET)
_SESSION_PLATFORM: ContextVar[str] = ContextVar("HERMES_SESSION_PLATFORM", default="")
_SESSION_CHAT_ID: ContextVar[str] = ContextVar("HERMES_SESSION_CHAT_ID", default="")
_SESSION_CHAT_NAME: ContextVar[str] = ContextVar("HERMES_SESSION_CHAT_NAME", default="")
_SESSION_THREAD_ID: ContextVar[str] = ContextVar("HERMES_SESSION_THREAD_ID", default="")
_SESSION_USER_ID: ContextVar[str] = ContextVar("HERMES_SESSION_USER_ID", default="")
_SESSION_USER_NAME: ContextVar[str] = ContextVar("HERMES_SESSION_USER_NAME", default="")
_SESSION_KEY: ContextVar[str] = ContextVar("HERMES_SESSION_KEY", default="")
_VAR_MAP = {
"HERMES_SESSION_PLATFORM": _SESSION_PLATFORM,
@@ -97,17 +91,10 @@ def set_session_vars(
def clear_session_vars(tokens: list) -> None:
"""Mark session context variables as explicitly cleared.
Sets all variables to ``""`` so that ``get_session_env`` returns an empty
string instead of falling back to (potentially stale) ``os.environ``
values. The *tokens* argument is accepted for API compatibility with
callers that saved the return value of ``set_session_vars``, but the
actual clearing uses ``var.set("")`` rather than ``var.reset(token)``
to ensure the "explicitly cleared" state is distinguishable from
"never set" (which holds the ``_UNSET`` sentinel).
"""
for var in (
"""Restore session context variables to their pre-handler values."""
if not tokens:
return
vars_in_order = [
_SESSION_PLATFORM,
_SESSION_CHAT_ID,
_SESSION_CHAT_NAME,
@@ -115,8 +102,9 @@ def clear_session_vars(tokens: list) -> None:
_SESSION_USER_ID,
_SESSION_USER_NAME,
_SESSION_KEY,
):
var.set("")
]
for var, token in zip(vars_in_order, tokens):
var.reset(token)
def get_session_env(name: str, default: str = "") -> str:
@@ -125,13 +113,8 @@ def get_session_env(name: str, default: str = "") -> str:
Drop-in replacement for ``os.getenv("HERMES_SESSION_*", default)``.
Resolution order:
1. Context variable (set by the gateway for concurrency-safe access).
If the variable was explicitly set (even to ``""``) via
``set_session_vars`` or ``clear_session_vars``, that value is
returned **no fallback to os.environ**.
2. ``os.environ`` (only when the context variable was never set in
this context i.e. CLI, cron scheduler, and test processes that
don't use ``set_session_vars`` at all).
1. Context variable (set by the gateway for concurrency-safe access)
2. ``os.environ`` (used by CLI, cron scheduler, and tests)
3. *default*
"""
import os
@@ -139,7 +122,7 @@ def get_session_env(name: str, default: str = "") -> str:
var = _VAR_MAP.get(name)
if var is not None:
value = var.get()
if value is not _UNSET:
if value:
return value
# Fall back to os.environ for CLI, cron, and test compatibility
return os.getenv(name, default)
+135 -1
View File
@@ -27,6 +27,7 @@ _RUNTIME_STATUS_FILE = "gateway_state.json"
_LOCKS_DIRNAME = "gateway-locks"
_IS_WINDOWS = sys.platform == "win32"
_UNSET = object()
_VALID_STARTUP_CHECK_STATES = {"pending", "ready", "failed"}
def _get_pid_path() -> Path:
@@ -162,11 +163,39 @@ def _build_runtime_status_record() -> dict[str, Any]:
"restart_requested": False,
"active_agents": 0,
"platforms": {},
"startup_checks": {},
"updated_at": _utc_now_iso(),
})
return payload
def _normalize_startup_check_entries(
startup_checks: Optional[dict[str, Any]],
) -> dict[str, dict[str, Any]]:
"""Normalize persisted startup readiness entries."""
if not isinstance(startup_checks, dict):
return {}
now = _utc_now_iso()
normalized: dict[str, dict[str, Any]] = {}
for raw_id, raw_payload in startup_checks.items():
check_id = str(raw_id).strip()
if not check_id:
continue
payload = raw_payload if isinstance(raw_payload, dict) else {}
state = str(payload.get("state", "pending")).strip().lower()
if state not in _VALID_STARTUP_CHECK_STATES:
state = "pending"
normalized[check_id] = {
"state": state,
"required": bool(payload.get("required", True)),
"source": payload.get("source"),
"detail": payload.get("detail"),
"updated_at": payload.get("updated_at") or now,
}
return normalized
def _read_json_file(path: Path) -> Optional[dict[str, Any]]:
if not path.exists():
return None
@@ -223,6 +252,7 @@ def write_runtime_status(
exit_reason: Any = _UNSET,
restart_requested: Any = _UNSET,
active_agents: Any = _UNSET,
startup_checks: Any = _UNSET,
platform: Any = _UNSET,
platform_state: Any = _UNSET,
error_code: Any = _UNSET,
@@ -245,6 +275,8 @@ def write_runtime_status(
payload["restart_requested"] = bool(restart_requested)
if active_agents is not _UNSET:
payload["active_agents"] = max(0, int(active_agents))
if startup_checks is not _UNSET:
payload["startup_checks"] = _normalize_startup_check_entries(startup_checks)
if platform is not _UNSET:
platform_payload = payload["platforms"].get(platform, {})
@@ -262,7 +294,109 @@ def write_runtime_status(
def read_runtime_status() -> Optional[dict[str, Any]]:
"""Read the persisted gateway runtime health/status information."""
return _read_json_file(_get_runtime_status_path())
payload = _read_json_file(_get_runtime_status_path())
if payload is None:
return None
payload.setdefault("platforms", {})
payload["startup_checks"] = _normalize_startup_check_entries(payload.get("startup_checks"))
return payload
def reset_startup_checks(checks: Optional[list[dict[str, Any]]] = None) -> dict[str, dict[str, Any]]:
"""Replace persisted startup readiness checks for the current run."""
normalized: dict[str, dict[str, Any]] = {}
now = _utc_now_iso()
for hook in checks or []:
if not isinstance(hook, dict):
continue
readiness = hook.get("startup_readiness")
if not isinstance(readiness, dict):
continue
check_id = str(readiness.get("id", "")).strip()
if not check_id:
continue
normalized[check_id] = {
"state": "pending",
"required": bool(readiness.get("required", True)),
"source": hook.get("name"),
"detail": None,
"updated_at": now,
}
write_runtime_status(startup_checks=normalized)
return normalized
def update_startup_check(
check_id: str,
state: str,
*,
detail: Any = _UNSET,
required: Any = _UNSET,
source: Any = _UNSET,
) -> dict[str, Any]:
"""Update a single startup readiness check in the runtime status file."""
normalized_id = str(check_id).strip()
if not normalized_id:
raise ValueError("startup readiness check id is required")
normalized_state = str(state).strip().lower()
if normalized_state not in _VALID_STARTUP_CHECK_STATES:
raise ValueError(f"invalid startup readiness state: {state}")
path = _get_runtime_status_path()
payload = _read_json_file(path) or _build_runtime_status_record()
checks = _normalize_startup_check_entries(payload.get("startup_checks"))
existing = checks.get(normalized_id, {})
now = _utc_now_iso()
checks[normalized_id] = {
"state": normalized_state,
"required": bool(existing.get("required", True) if required is _UNSET else required),
"source": existing.get("source") if source is _UNSET else source,
"detail": existing.get("detail") if detail is _UNSET else detail,
"updated_at": now,
}
payload["startup_checks"] = checks
payload.setdefault("platforms", {})
payload.setdefault("kind", _GATEWAY_KIND)
payload["pid"] = os.getpid()
payload["start_time"] = _get_process_start_time(os.getpid())
payload["updated_at"] = now
_write_json_file(path, payload)
return checks[normalized_id]
def mark_startup_check_pending(
check_id: str,
*,
detail: Any = _UNSET,
required: Any = _UNSET,
source: Any = _UNSET,
) -> dict[str, Any]:
return update_startup_check(check_id, "pending", detail=detail, required=required, source=source)
def mark_startup_check_ready(
check_id: str,
*,
detail: Any = _UNSET,
required: Any = _UNSET,
source: Any = _UNSET,
) -> dict[str, Any]:
return update_startup_check(check_id, "ready", detail=detail, required=required, source=source)
def mark_startup_check_failed(
check_id: str,
*,
detail: Any = _UNSET,
required: Any = _UNSET,
source: Any = _UNSET,
) -> dict[str, Any]:
return update_startup_check(check_id, "failed", detail=detail, required=required, source=source)
def remove_pid_file() -> None:
+4 -7
View File
@@ -609,15 +609,12 @@ class GatewayStreamConsumer:
content=text,
metadata=self.metadata,
)
# Note: do NOT set _already_sent = True here.
# Commentary messages are interim status updates (e.g. "Using browser
# tool..."), not the final response. Setting already_sent would cause
# the final response to be incorrectly suppressed when there are
# multiple tool calls. See: https://github.com/NousResearch/hermes-agent/issues/10454
return result.success
if result.success:
self._already_sent = True
return True
except Exception as e:
logger.error("Commentary send error: %s", e)
return False
return False
async def _send_or_edit(self, text: str) -> bool:
"""Send or edit the streaming message.
+2 -27
View File
@@ -274,14 +274,6 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
api_key_env_vars=("XIAOMI_API_KEY",),
base_url_env_var="XIAOMI_BASE_URL",
),
"bedrock": ProviderConfig(
id="bedrock",
name="AWS Bedrock",
auth_type="aws_sdk",
inference_base_url="https://bedrock-runtime.us-east-1.amazonaws.com",
api_key_env_vars=(),
base_url_env_var="BEDROCK_BASE_URL",
),
}
@@ -932,7 +924,6 @@ def resolve_provider(
"qwen-portal": "qwen-oauth", "qwen-cli": "qwen-oauth", "qwen-oauth": "qwen-oauth",
"hf": "huggingface", "hugging-face": "huggingface", "huggingface-hub": "huggingface",
"mimo": "xiaomi", "xiaomi-mimo": "xiaomi",
"aws": "bedrock", "aws-bedrock": "bedrock", "amazon-bedrock": "bedrock", "amazon": "bedrock",
"go": "opencode-go", "opencode-go-sub": "opencode-go",
"kilo": "kilocode", "kilo-code": "kilocode", "kilo-gateway": "kilocode",
# Local server aliases — route through the generic custom provider
@@ -989,15 +980,6 @@ def resolve_provider(
if has_usable_secret(os.getenv(env_var, "")):
return pid
# AWS Bedrock — detect via boto3 credential chain (IAM roles, SSO, env vars).
# This runs after API-key providers so explicit keys always win.
try:
from agent.bedrock_adapter import has_aws_credentials
if has_aws_credentials():
return "bedrock"
except ImportError:
pass # boto3 not installed — skip Bedrock auto-detection
raise AuthError(
"No inference provider configured. Run 'hermes model' to choose a "
"provider and model, or set an API key (OPENROUTER_API_KEY, "
@@ -2402,7 +2384,7 @@ def get_api_key_provider_status(provider_id: str) -> Dict[str, Any]:
if pconfig.base_url_env_var:
env_url = os.getenv(pconfig.base_url_env_var, "").strip()
if provider_id in ("kimi-coding", "kimi-coding-cn"):
if provider_id == "kimi-coding":
base_url = _resolve_kimi_base_url(api_key, pconfig.inference_base_url, env_url)
elif env_url:
base_url = env_url
@@ -2464,13 +2446,6 @@ def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
pconfig = PROVIDER_REGISTRY.get(target)
if pconfig and pconfig.auth_type == "api_key":
return get_api_key_provider_status(target)
# AWS SDK providers (Bedrock) — check via boto3 credential chain
if pconfig and pconfig.auth_type == "aws_sdk":
try:
from agent.bedrock_adapter import has_aws_credentials
return {"logged_in": has_aws_credentials(), "provider": target}
except ImportError:
return {"logged_in": False, "provider": target, "error": "boto3 not installed"}
return {"logged_in": False}
@@ -2495,7 +2470,7 @@ def resolve_api_key_provider_credentials(provider_id: str) -> Dict[str, Any]:
if pconfig.base_url_env_var:
env_url = os.getenv(pconfig.base_url_env_var, "").strip()
if provider_id in ("kimi-coding", "kimi-coding-cn"):
if provider_id == "kimi-coding":
base_url = _resolve_kimi_base_url(api_key, pconfig.inference_base_url, env_url)
elif provider_id == "zai":
base_url = _resolve_zai_base_url(api_key, pconfig.inference_base_url, env_url)
-21
View File
@@ -368,27 +368,6 @@ def _interactive_auth() -> None:
print("=" * 50)
auth_list_command(SimpleNamespace(provider=None))
# Show AWS Bedrock credential status (not in the pool — uses boto3 chain)
try:
from agent.bedrock_adapter import has_aws_credentials, resolve_aws_auth_env_var, resolve_bedrock_region
if has_aws_credentials():
auth_source = resolve_aws_auth_env_var() or "unknown"
region = resolve_bedrock_region()
print(f"bedrock (AWS SDK credential chain):")
print(f" Auth: {auth_source}")
print(f" Region: {region}")
try:
import boto3
sts = boto3.client("sts", region_name=region)
identity = sts.get_caller_identity()
arn = identity.get("Arn", "unknown")
print(f" Identity: {arn}")
except Exception:
print(f" Identity: (could not resolve — boto3 STS call failed)")
print()
except ImportError:
pass # boto3 or bedrock_adapter not available
print()
# Main menu
+3 -2
View File
@@ -164,7 +164,7 @@ COMMAND_REGISTRY: list[CommandDef] = [
# Exit
CommandDef("quit", "Exit the CLI", "Exit",
cli_only=True, aliases=("exit",)),
cli_only=True, aliases=("exit", "q")),
]
@@ -844,7 +844,8 @@ class SlashCommandCompleter(Completer):
return None
return word
def _context_completions(self, word: str, limit: int = 30):
@staticmethod
def _context_completions(word: str, limit: int = 30):
"""Yield Claude Code-style @ context completions.
Bare ``@`` or ``@partial`` shows static references and matching
+1 -97
View File
@@ -419,27 +419,6 @@ DEFAULT_CONFIG = {
"protect_last_n": 20, # minimum recent messages to keep uncompressed
},
# AWS Bedrock provider configuration.
# Only used when model.provider is "bedrock".
"bedrock": {
"region": "", # AWS region for Bedrock API calls (empty = AWS_REGION env var → us-east-1)
"discovery": {
"enabled": True, # Auto-discover models via ListFoundationModels
"provider_filter": [], # Only show models from these providers (e.g. ["anthropic", "amazon"])
"refresh_interval": 3600, # Cache discovery results for this many seconds
},
"guardrail": {
# Amazon Bedrock Guardrails — content filtering and safety policies.
# Create a guardrail in the Bedrock console, then set the ID and version here.
# See: https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html
"guardrail_identifier": "", # e.g. "abc123def456"
"guardrail_version": "", # e.g. "1" or "DRAFT"
"stream_processing_mode": "async", # "sync" or "async"
"trace": "disabled", # "enabled", "disabled", or "enabled_full"
},
},
"smart_model_routing": {
"enabled": False,
"max_simple_chars": 160,
@@ -659,7 +638,6 @@ DEFAULT_CONFIG = {
"allowed_channels": "", # If set, bot ONLY responds in these channel IDs (whitelist)
"auto_thread": True, # Auto-create threads on @mention in channels (like Slack)
"reactions": True, # Add 👀/✅/❌ reactions to messages during processing
"channel_prompts": {}, # Per-channel ephemeral system prompts (forum parents apply to child threads)
},
# WhatsApp platform settings (gateway mode)
@@ -670,21 +648,6 @@ DEFAULT_CONFIG = {
# Supports \n for newlines, e.g. "🤖 *My Bot*\n──────\n"
},
# Telegram platform settings (gateway mode)
"telegram": {
"channel_prompts": {}, # Per-chat/topic ephemeral system prompts (topics inherit from parent group)
},
# Slack platform settings (gateway mode)
"slack": {
"channel_prompts": {}, # Per-channel ephemeral system prompts
},
# Mattermost platform settings (gateway mode)
"mattermost": {
"channel_prompts": {}, # Per-channel ephemeral system prompts
},
# Approval mode for dangerous commands:
# manual — always prompt the user (default)
# smart — use auxiliary LLM to auto-approve low-risk commands, prompt for high-risk
@@ -740,7 +703,7 @@ DEFAULT_CONFIG = {
},
# Config schema version - bump this when adding new required fields
"_config_version": 18,
"_config_version": 17,
}
# =============================================================================
@@ -1011,22 +974,6 @@ OPTIONAL_ENV_VARS = {
"category": "provider",
"advanced": True,
},
"AWS_REGION": {
"description": "AWS region for Bedrock API calls (e.g. us-east-1, eu-central-1)",
"prompt": "AWS Region",
"url": "https://docs.aws.amazon.com/bedrock/latest/userguide/bedrock-regions.html",
"password": False,
"category": "provider",
"advanced": True,
},
"AWS_PROFILE": {
"description": "AWS named profile for Bedrock authentication (from ~/.aws/credentials)",
"prompt": "AWS Profile",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
# ── Tool API keys ──
"EXA_API_KEY": {
@@ -2819,47 +2766,6 @@ def sanitize_env_file() -> int:
return fixes
def _check_non_ascii_credential(key: str, value: str) -> str:
"""Warn and strip non-ASCII characters from credential values.
API keys and tokens must be pure ASCII they are sent as HTTP header
values which httpx/httpcore encode as ASCII. Non-ASCII characters
(commonly introduced by copy-pasting from rich-text editors or PDFs
that substitute lookalike Unicode glyphs for ASCII letters) cause
``UnicodeEncodeError: 'ascii' codec can't encode character`` at
request time.
Returns the sanitized (ASCII-only) value. Prints a warning if any
non-ASCII characters were found and removed.
"""
try:
value.encode("ascii")
return value # all ASCII — nothing to do
except UnicodeEncodeError:
pass
# Build a readable list of the offending characters
bad_chars: list[str] = []
for i, ch in enumerate(value):
if ord(ch) > 127:
bad_chars.append(f" position {i}: {ch!r} (U+{ord(ch):04X})")
sanitized = value.encode("ascii", errors="ignore").decode("ascii")
import sys
print(
f"\n Warning: {key} contains non-ASCII characters that will break API requests.\n"
f" This usually happens when copy-pasting from a PDF, rich-text editor,\n"
f" or web page that substitutes lookalike Unicode glyphs for ASCII letters.\n"
f"\n"
+ "\n".join(f" {line}" for line in bad_chars[:5])
+ ("\n ... and more" if len(bad_chars) > 5 else "")
+ f"\n\n The non-ASCII characters have been stripped automatically.\n"
f" If authentication fails, re-copy the key from the provider's dashboard.\n",
file=sys.stderr,
)
return sanitized
def save_env_value(key: str, value: str):
"""Save or update a value in ~/.hermes/.env."""
if is_managed():
@@ -2868,8 +2774,6 @@ def save_env_value(key: str, value: str):
if not _ENV_VAR_NAME_RE.match(key):
raise ValueError(f"Invalid environment variable name: {key!r}")
value = value.replace("\n", "").replace("\r", "")
# API keys / tokens must be ASCII — strip non-ASCII with a warning.
value = _check_non_ascii_credential(key, value)
ensure_hermes_home()
env_path = get_env_path()
+2 -143
View File
@@ -27,110 +27,6 @@ _DPASTE_COM_URL = "https://dpaste.com/api/"
# paste.rs caps at ~1 MB; we stay under that with headroom.
_MAX_LOG_BYTES = 512_000
# Auto-delete pastes after this many seconds (1 hour).
_AUTO_DELETE_SECONDS = 3600
# ---------------------------------------------------------------------------
# Privacy / delete helpers
# ---------------------------------------------------------------------------
_PRIVACY_NOTICE = """\
This will upload the following to a public paste service:
System info (OS, Python version, Hermes version, provider, which API keys
are configured NOT the actual keys)
Recent log lines (agent.log, errors.log, gateway.log may contain
conversation fragments and file paths)
Full agent.log and gateway.log (up to 512 KB each likely contains
conversation content, tool outputs, and file paths)
Pastes auto-delete after 1 hour.
"""
_GATEWAY_PRIVACY_NOTICE = (
"⚠️ **Privacy notice:** This uploads system info + recent log tails "
"(may contain conversation fragments) to a public paste service. "
"Full logs are NOT included from the gateway — use `hermes debug share` "
"from the CLI for full log uploads.\n"
"Pastes auto-delete after 1 hour."
)
def _extract_paste_id(url: str) -> Optional[str]:
"""Extract the paste ID from a paste.rs or dpaste.com URL.
Returns the ID string, or None if the URL doesn't match a known service.
"""
url = url.strip().rstrip("/")
for prefix in ("https://paste.rs/", "http://paste.rs/"):
if url.startswith(prefix):
return url[len(prefix):]
return None
def delete_paste(url: str) -> bool:
"""Delete a paste from paste.rs. Returns True on success.
Only paste.rs supports unauthenticated DELETE. dpaste.com pastes
expire automatically but cannot be deleted via API.
"""
paste_id = _extract_paste_id(url)
if not paste_id:
raise ValueError(
f"Cannot delete: only paste.rs URLs are supported. Got: {url}"
)
target = f"{_PASTE_RS_URL}{paste_id}"
req = urllib.request.Request(
target, method="DELETE",
headers={"User-Agent": "hermes-agent/debug-share"},
)
with urllib.request.urlopen(req, timeout=30) as resp:
return 200 <= resp.status < 300
def _schedule_auto_delete(urls: list[str], delay_seconds: int = _AUTO_DELETE_SECONDS):
"""Spawn a detached process to delete paste.rs pastes after *delay_seconds*.
The child process is fully detached (``start_new_session=True``) so it
survives the parent exiting (important for CLI mode). Only paste.rs
URLs are attempted dpaste.com pastes auto-expire on their own.
"""
import subprocess
paste_rs_urls = [u for u in urls if _extract_paste_id(u)]
if not paste_rs_urls:
return
# Build a tiny inline Python script. No imports beyond stdlib.
url_list = ", ".join(f'"{u}"' for u in paste_rs_urls)
script = (
"import time, urllib.request; "
f"time.sleep({delay_seconds}); "
f"[urllib.request.urlopen(urllib.request.Request(u, method='DELETE', "
f"headers={{'User-Agent': 'hermes-agent/auto-delete'}}), timeout=15) "
f"for u in [{url_list}]]"
)
try:
subprocess.Popen(
[sys.executable, "-c", script],
start_new_session=True,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
)
except Exception:
pass # Best-effort; manual delete still available.
def _delete_hint(url: str) -> str:
"""Return a one-liner delete command for the given paste URL."""
paste_id = _extract_paste_id(url)
if paste_id:
return f"hermes debug delete {url}"
# dpaste.com — no API delete, expires on its own.
return "(auto-expires per dpaste.com policy)"
def _upload_paste_rs(content: str) -> str:
"""Upload to paste.rs. Returns the paste URL.
@@ -354,9 +250,6 @@ def run_debug_share(args):
expiry = getattr(args, "expire", 7)
local_only = getattr(args, "local", False)
if not local_only:
print(_PRIVACY_NOTICE)
print("Collecting debug report...")
# Capture dump once — prepended to every paste for context.
@@ -422,56 +315,22 @@ def run_debug_share(args):
if failures:
print(f"\n (failed to upload: {', '.join(failures)})")
# Schedule auto-deletion after 1 hour
_schedule_auto_delete(list(urls.values()))
print(f"\n⏱ Pastes will auto-delete in 1 hour.")
# Manual delete fallback
print(f"To delete now: hermes debug delete <url>")
print(f"\nShare these links with the Hermes team for support.")
def run_debug_delete(args):
"""Delete one or more paste URLs uploaded by /debug."""
urls = getattr(args, "urls", [])
if not urls:
print("Usage: hermes debug delete <url> [<url> ...]")
print(" Deletes paste.rs pastes uploaded by 'hermes debug share'.")
return
for url in urls:
try:
ok = delete_paste(url)
if ok:
print(f" ✓ Deleted: {url}")
else:
print(f" ✗ Failed to delete: {url} (unexpected response)")
except ValueError as exc:
print(f"{exc}")
except Exception as exc:
print(f" ✗ Could not delete {url}: {exc}")
def run_debug(args):
"""Route debug subcommands."""
subcmd = getattr(args, "debug_command", None)
if subcmd == "share":
run_debug_share(args)
elif subcmd == "delete":
run_debug_delete(args)
else:
# Default: show help
print("Usage: hermes debug <command>")
print("Usage: hermes debug share [--lines N] [--expire N] [--local]")
print()
print("Commands:")
print(" share Upload debug report to a paste service and print URL")
print(" delete Delete a previously uploaded paste")
print()
print("Options (share):")
print("Options:")
print(" --lines N Number of log lines to include (default: 200)")
print(" --expire N Paste expiry in days (default: 7)")
print(" --local Print report locally instead of uploading")
print()
print("Options (delete):")
print(" <url> ... One or more paste URLs to delete")
+2 -109
View File
@@ -8,7 +8,6 @@ import os
import sys
import subprocess
import shutil
from pathlib import Path
from hermes_cli.config import get_project_root, get_hermes_home, get_env_path
from hermes_constants import display_hermes_home
@@ -514,87 +513,7 @@ def run_doctor(args):
pass
_check_gateway_service_linger(issues)
# =========================================================================
# Check: Command installation (hermes bin symlink)
# =========================================================================
if sys.platform != "win32":
print()
print(color("◆ Command Installation", Colors.CYAN, Colors.BOLD))
# Determine the venv entry point location
_venv_bin = None
for _venv_name in ("venv", ".venv"):
_candidate = PROJECT_ROOT / _venv_name / "bin" / "hermes"
if _candidate.exists():
_venv_bin = _candidate
break
# Determine the expected command link directory (mirrors install.sh logic)
_prefix = os.environ.get("PREFIX", "")
_is_termux_env = bool(os.environ.get("TERMUX_VERSION")) or "com.termux/files/usr" in _prefix
if _is_termux_env and _prefix:
_cmd_link_dir = Path(_prefix) / "bin"
_cmd_link_display = "$PREFIX/bin"
else:
_cmd_link_dir = Path.home() / ".local" / "bin"
_cmd_link_display = "~/.local/bin"
_cmd_link = _cmd_link_dir / "hermes"
if _venv_bin is None:
check_warn(
"Venv entry point not found",
"(hermes not in venv/bin/ or .venv/bin/ — reinstall with pip install -e '.[all]')"
)
manual_issues.append(
f"Reinstall entry point: cd {PROJECT_ROOT} && source venv/bin/activate && pip install -e '.[all]'"
)
else:
check_ok(f"Venv entry point exists ({_venv_bin.relative_to(PROJECT_ROOT)})")
# Check the symlink at the command link location
if _cmd_link.is_symlink():
_target = _cmd_link.resolve()
_expected = _venv_bin.resolve()
if _target == _expected:
check_ok(f"{_cmd_link_display}/hermes → correct target")
else:
check_warn(
f"{_cmd_link_display}/hermes points to wrong target",
f"(→ {_target}, expected → {_expected})"
)
if should_fix:
_cmd_link.unlink()
_cmd_link.symlink_to(_venv_bin)
check_ok(f"Fixed symlink: {_cmd_link_display}/hermes → {_venv_bin}")
fixed_count += 1
else:
issues.append(f"Broken symlink at {_cmd_link_display}/hermes — run 'hermes doctor --fix'")
elif _cmd_link.exists():
# It's a regular file, not a symlink — possibly a wrapper script
check_ok(f"{_cmd_link_display}/hermes exists (non-symlink)")
else:
check_fail(
f"{_cmd_link_display}/hermes not found",
"(hermes command may not work outside the venv)"
)
if should_fix:
_cmd_link_dir.mkdir(parents=True, exist_ok=True)
_cmd_link.symlink_to(_venv_bin)
check_ok(f"Created symlink: {_cmd_link_display}/hermes → {_venv_bin}")
fixed_count += 1
# Check if the link dir is on PATH
_path_dirs = os.environ.get("PATH", "").split(os.pathsep)
if str(_cmd_link_dir) not in _path_dirs:
check_warn(
f"{_cmd_link_display} is not on your PATH",
"(add it to your shell config: export PATH=\"$HOME/.local/bin:$PATH\")"
)
manual_issues.append(f"Add {_cmd_link_display} to your PATH")
else:
issues.append(f"Missing {_cmd_link_display}/hermes symlink — run 'hermes doctor --fix'")
# =========================================================================
# Check: External tools
# =========================================================================
@@ -814,8 +733,7 @@ def run_doctor(args):
("Vercel AI Gateway", ("AI_GATEWAY_API_KEY",), "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
("Kilo Code", ("KILOCODE_API_KEY",), "https://api.kilo.ai/api/gateway/models", "KILOCODE_BASE_URL", True),
("OpenCode Zen", ("OPENCODE_ZEN_API_KEY",), "https://opencode.ai/zen/v1/models", "OPENCODE_ZEN_BASE_URL", True),
# OpenCode Go has no shared /models endpoint; skip the health check.
("OpenCode Go", ("OPENCODE_GO_API_KEY",), None, "OPENCODE_GO_BASE_URL", False),
("OpenCode Go", ("OPENCODE_GO_API_KEY",), "https://opencode.ai/zen/go/v1/models", "OPENCODE_GO_BASE_URL", True),
]
for _pname, _env_vars, _default_url, _base_env, _supports_health_check in _apikey_providers:
_key = ""
@@ -860,31 +778,6 @@ def run_doctor(args):
except Exception as _e:
print(f"\r {color('', Colors.YELLOW)} {_label} {color(f'({_e})', Colors.DIM)} ")
# -- AWS Bedrock --
# Bedrock uses the AWS SDK credential chain, not API keys.
try:
from agent.bedrock_adapter import has_aws_credentials, resolve_aws_auth_env_var, resolve_bedrock_region
if has_aws_credentials():
_auth_var = resolve_aws_auth_env_var()
_region = resolve_bedrock_region()
_label = "AWS Bedrock".ljust(20)
print(f" Checking AWS Bedrock...", end="", flush=True)
try:
import boto3
_br_client = boto3.client("bedrock", region_name=_region)
_br_resp = _br_client.list_foundation_models()
_model_count = len(_br_resp.get("modelSummaries", []))
print(f"\r {color('', Colors.GREEN)} {_label} {color(f'({_auth_var}, {_region}, {_model_count} models)', Colors.DIM)} ")
except ImportError:
print(f"\r {color('', Colors.YELLOW)} {_label} {color('(boto3 not installed — pip install hermes-agent[bedrock])', Colors.DIM)} ")
issues.append("Install boto3 for Bedrock: pip install hermes-agent[bedrock]")
except Exception as _e:
_err_name = type(_e).__name__
print(f"\r {color('', Colors.YELLOW)} {_label} {color(f'({_err_name}: {_e})', Colors.DIM)} ")
issues.append(f"AWS Bedrock: {_err_name} — check IAM permissions for bedrock:ListFoundationModels")
except ImportError:
pass # bedrock_adapter not available — skip silently
# =========================================================================
# Check: Submodules
# =========================================================================
-29
View File
@@ -8,40 +8,11 @@ from pathlib import Path
from dotenv import load_dotenv
# Env var name suffixes that indicate credential values. These are the
# only env vars whose values we sanitize on load — we must not silently
# alter arbitrary user env vars, but credentials are known to require
# pure ASCII (they become HTTP header values).
_CREDENTIAL_SUFFIXES = ("_API_KEY", "_TOKEN", "_SECRET", "_KEY")
def _sanitize_loaded_credentials() -> None:
"""Strip non-ASCII characters from credential env vars in os.environ.
Called after dotenv loads so the rest of the codebase never sees
non-ASCII API keys. Only touches env vars whose names end with
known credential suffixes (``_API_KEY``, ``_TOKEN``, etc.).
"""
for key, value in list(os.environ.items()):
if not any(key.endswith(suffix) for suffix in _CREDENTIAL_SUFFIXES):
continue
try:
value.encode("ascii")
except UnicodeEncodeError:
os.environ[key] = value.encode("ascii", errors="ignore").decode("ascii")
def _load_dotenv_with_fallback(path: Path, *, override: bool) -> None:
try:
load_dotenv(dotenv_path=path, override=override, encoding="utf-8")
except UnicodeDecodeError:
load_dotenv(dotenv_path=path, override=override, encoding="latin-1")
# Strip non-ASCII characters from credential env vars that were just
# loaded. API keys must be pure ASCII since they're sent as HTTP
# header values (httpx encodes headers as ASCII). Non-ASCII chars
# typically come from copy-pasting keys from PDFs or rich-text editors
# that substitute Unicode lookalike glyphs (e.g. ʋ U+028B for v).
_sanitize_loaded_credentials()
def _sanitize_env_file_if_needed(path: Path) -> None:
+127 -111
View File
@@ -10,6 +10,7 @@ import shutil
import signal
import subprocess
import sys
import time
from pathlib import Path
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
@@ -37,6 +38,10 @@ from hermes_cli.setup import (
from hermes_cli.colors import Colors, color
_SERVICE_READINESS_TIMEOUT = 30.0
_SERVICE_READINESS_POLL_INTERVAL = 0.2
# =============================================================================
# Process Management (for manual gateway runs)
# =============================================================================
@@ -222,7 +227,7 @@ def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = Fals
current_cmd = ""
else:
result = subprocess.run(
["ps", "-A", "eww", "-o", "pid=,command="],
["ps", "eww", "-ax", "-o", "pid=,command="],
capture_output=True,
text=True,
timeout=10,
@@ -715,9 +720,7 @@ def _detect_venv_dir() -> Path | None:
"""Detect the active virtualenv directory.
Checks ``sys.prefix`` first (works regardless of the directory name),
then ``VIRTUAL_ENV`` env var (covers uv-managed environments where
sys.prefix == sys.base_prefix), then falls back to probing common
directory names under PROJECT_ROOT.
then falls back to probing common directory names under PROJECT_ROOT.
Returns ``None`` when no virtualenv can be found.
"""
# If we're running inside a virtualenv, sys.prefix points to it.
@@ -726,15 +729,6 @@ def _detect_venv_dir() -> Path | None:
if venv.is_dir():
return venv
# uv and some other tools set VIRTUAL_ENV without changing sys.prefix.
# This catches `uv run` where sys.prefix == sys.base_prefix but the
# environment IS a venv. (#8620)
_virtual_env = os.environ.get("VIRTUAL_ENV")
if _virtual_env:
venv = Path(_virtual_env)
if venv.is_dir():
return venv
# Fallback: check common virtualenv directory names under the project root.
for candidate in (".venv", "venv"):
venv = PROJECT_ROOT / candidate
@@ -1111,12 +1105,123 @@ def systemd_uninstall(system: bool = False):
print(f"{_service_scope_label(system).capitalize()} service uninstalled")
def _describe_startup_check(check_id: str, check: dict) -> str:
source = check.get("source")
detail = check.get("detail")
label = f"{check_id} ({source})" if source and source != check_id else check_id
return f"{label}: {detail}" if detail else label
def _classify_startup_checks(state: dict | None) -> tuple[list[str], list[str], list[str]]:
checks = (state or {}).get("startup_checks") or {}
pending_required: list[str] = []
failed_required: list[str] = []
optional_warnings: list[str] = []
if not isinstance(checks, dict):
return pending_required, failed_required, optional_warnings
for check_id, raw_check in checks.items():
check = raw_check if isinstance(raw_check, dict) else {}
label = _describe_startup_check(str(check_id), check)
check_state = str(check.get("state", "pending")).strip().lower()
required = bool(check.get("required", True))
if check_state == "ready":
continue
if required:
if check_state == "failed":
failed_required.append(label)
else:
pending_required.append(label)
else:
prefix = "failed" if check_state == "failed" else "pending"
optional_warnings.append(f"{prefix}: {label}")
return pending_required, failed_required, optional_warnings
def _wait_for_service_readiness(
*,
action: str,
previous_pid: int | None = None,
timeout: float = _SERVICE_READINESS_TIMEOUT,
poll_interval: float = _SERVICE_READINESS_POLL_INTERVAL,
) -> list[str]:
from gateway.status import get_running_pid, read_runtime_status
deadline = time.monotonic() + timeout
last_pending: list[str] = []
while time.monotonic() < deadline:
live_pid = get_running_pid()
if live_pid is None or (previous_pid is not None and live_pid == previous_pid):
time.sleep(poll_interval)
continue
runtime = read_runtime_status() or {}
try:
runtime_pid = int(runtime.get("pid"))
except (TypeError, ValueError):
runtime_pid = None
if runtime_pid != live_pid:
time.sleep(poll_interval)
continue
gateway_state = runtime.get("gateway_state")
pending_required, failed_required, optional_warnings = _classify_startup_checks(runtime)
last_pending = pending_required
if gateway_state == "startup_failed":
reason = runtime.get("exit_reason") or f"gateway {action} failed during startup"
raise RuntimeError(reason)
if failed_required:
raise RuntimeError(
"required startup checks failed: " + "; ".join(failed_required)
)
if gateway_state == "running" and not pending_required:
return optional_warnings
time.sleep(poll_interval)
if last_pending:
raise RuntimeError(
"timed out waiting for required startup checks: " + "; ".join(last_pending)
)
if previous_pid is not None:
raise RuntimeError(
f"timed out waiting for gateway {action}; previous process is still active or no new runtime became ready"
)
raise RuntimeError(f"timed out waiting for gateway {action} readiness")
def _await_service_ready_or_exit(
*,
action: str,
previous_pid: int | None = None,
timeout: float = _SERVICE_READINESS_TIMEOUT,
) -> None:
try:
optional_warnings = _wait_for_service_readiness(
action=action,
previous_pid=previous_pid,
timeout=timeout,
)
except RuntimeError as exc:
print_error(f" Gateway {action} did not become ready: {exc}")
raise SystemExit(1) from exc
for warning in optional_warnings:
print_warning(f" Optional startup check {warning}")
def systemd_start(system: bool = False):
system = _select_systemd_scope(system)
if system:
_require_root_for_system_service("start")
refresh_systemd_unit_if_needed(system=system)
_run_systemctl(["start", get_service_name()], system=system, check=True, timeout=30)
_await_service_ready_or_exit(action="start")
print(f"{_service_scope_label(system).capitalize()} service started")
@@ -1139,64 +1244,11 @@ def systemd_restart(system: bool = False):
pid = get_running_pid()
if pid is not None and _request_gateway_self_restart(pid):
# SIGUSR1 sent — the gateway will drain active agents, exit with
# code 75, and systemd will restart it after RestartSec (30s).
# Wait for the old process to die and the new one to become active
# so the CLI doesn't return while the service is still restarting.
import time
scope_label = _service_scope_label(system).capitalize()
svc = get_service_name()
scope_cmd = _systemctl_cmd(system)
# Phase 1: wait for old process to exit (drain + shutdown)
print(f"{scope_label} service draining active work...")
deadline = time.time() + 90
while time.time() < deadline:
try:
os.kill(pid, 0)
time.sleep(1)
except (ProcessLookupError, PermissionError):
break # old process is gone
else:
print(f"⚠ Old process (PID {pid}) still alive after 90s")
# Phase 2: wait for systemd to start the new process
print(f"⏳ Waiting for {svc} to restart...")
deadline = time.time() + 60
while time.time() < deadline:
try:
result = subprocess.run(
scope_cmd + ["is-active", svc],
capture_output=True, text=True, timeout=5,
)
if result.stdout.strip() == "active":
# Verify it's a NEW process, not the old one somehow
new_pid = get_running_pid()
if new_pid and new_pid != pid:
print(f"{scope_label} service restarted (PID {new_pid})")
return
except (subprocess.TimeoutExpired, FileNotFoundError):
pass
time.sleep(2)
# Timed out — check final state
try:
result = subprocess.run(
scope_cmd + ["is-active", svc],
capture_output=True, text=True, timeout=5,
)
if result.stdout.strip() == "active":
print(f"{scope_label} service restarted")
return
except Exception:
pass
print(
f"{scope_label} service did not become active within 60s.\n"
f" Check status: {'sudo ' if system else ''}hermes gateway status\n"
f" Check logs: journalctl {'--user ' if not system else ''}-u {svc} --since '2 min ago'"
)
_await_service_ready_or_exit(action="restart", previous_pid=pid)
print(f"{_service_scope_label(system).capitalize()} service restarted")
return
_run_systemctl(["reload-or-restart", get_service_name()], system=system, check=True, timeout=90)
_await_service_ready_or_exit(action="restart", previous_pid=pid)
print(f"{_service_scope_label(system).capitalize()} service restarted")
@@ -1455,6 +1507,7 @@ def launchd_start():
plist_path.write_text(generate_launchd_plist(), encoding="utf-8")
subprocess.run(["launchctl", "bootstrap", _launchd_domain(), str(plist_path)], check=True, timeout=30)
subprocess.run(["launchctl", "kickstart", f"{_launchd_domain()}/{label}"], check=True, timeout=30)
_await_service_ready_or_exit(action="start")
print("✓ Service started")
return
@@ -1467,6 +1520,7 @@ def launchd_start():
print("↻ launchd job was unloaded; reloading service definition")
subprocess.run(["launchctl", "bootstrap", _launchd_domain(), str(plist_path)], check=True, timeout=30)
subprocess.run(["launchctl", "kickstart", f"{_launchd_domain()}/{label}"], check=True, timeout=30)
_await_service_ready_or_exit(action="start")
print("✓ Service started")
def launchd_stop():
@@ -1537,7 +1591,8 @@ def launchd_restart():
try:
pid = get_running_pid()
if pid is not None and _request_gateway_self_restart(pid):
print("✓ Service restart requested")
_await_service_ready_or_exit(action="restart", previous_pid=pid)
print("✓ Service restarted")
return
if pid is not None:
try:
@@ -1549,6 +1604,7 @@ def launchd_restart():
if not exited:
print(f"⚠ Gateway drain timed out after {drain_timeout:.0f}s — forcing launchd restart")
subprocess.run(["launchctl", "kickstart", "-k", target], check=True, timeout=90)
_await_service_ready_or_exit(action="restart", previous_pid=pid)
print("✓ Service restarted")
except subprocess.CalledProcessError as e:
if e.returncode not in (3, 113):
@@ -1558,6 +1614,7 @@ def launchd_restart():
plist_path = get_launchd_plist_path()
subprocess.run(["launchctl", "bootstrap", _launchd_domain(), str(plist_path)], check=True, timeout=30)
subprocess.run(["launchctl", "kickstart", target], check=True, timeout=30)
_await_service_ready_or_exit(action="restart", previous_pid=pid)
print("✓ Service restarted")
def launchd_status(deep: bool = False):
@@ -2930,15 +2987,6 @@ def gateway_command(args):
elif subcmd == "start":
system = getattr(args, 'system', False)
start_all = getattr(args, 'all', False)
if start_all:
# Kill all stale gateway processes across all profiles before starting
killed = kill_gateway_processes(all_profiles=True)
if killed:
print(f"✓ Killed {killed} stale gateway process(es) across all profiles")
_wait_for_gateway_exit(timeout=10.0, force_after=5.0)
if is_termux():
print("Gateway service start is not supported on Termux because there is no system service manager.")
print("Run manually: hermes gateway")
@@ -3024,39 +3072,7 @@ def gateway_command(args):
# Try service first, fall back to killing and restarting
service_available = False
system = getattr(args, 'system', False)
restart_all = getattr(args, 'all', False)
service_configured = False
if restart_all:
# --all: stop every gateway process across all profiles, then start fresh
service_stopped = False
if supports_systemd_services() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
try:
systemd_stop(system=system)
service_stopped = True
except subprocess.CalledProcessError:
pass
elif is_macos() and get_launchd_plist_path().exists():
try:
launchd_stop()
service_stopped = True
except subprocess.CalledProcessError:
pass
killed = kill_gateway_processes(all_profiles=True)
total = killed + (1 if service_stopped else 0)
if total:
print(f"✓ Stopped {total} gateway process(es) across all profiles")
_wait_for_gateway_exit(timeout=10.0, force_after=5.0)
# Start the current profile's service fresh
print("Starting gateway...")
if supports_systemd_services() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
systemd_start(system=system)
elif is_macos() and get_launchd_plist_path().exists():
launchd_start()
else:
run_gateway(verbose=0)
return
if supports_systemd_services() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
service_configured = True
+1 -290
View File
@@ -1139,8 +1139,6 @@ def select_provider_and_model(args=None):
_model_flow_anthropic(config, current_model)
elif selected_provider == "kimi-coding":
_model_flow_kimi(config, current_model)
elif selected_provider == "bedrock":
_model_flow_bedrock(config, current_model)
elif selected_provider in ("gemini", "deepseek", "xai", "zai", "kimi-coding-cn", "minimax", "minimax-cn", "kilocode", "opencode-zen", "opencode-go", "ai-gateway", "alibaba", "huggingface", "xiaomi", "arcee"):
_model_flow_api_key_provider(config, selected_provider, current_model)
@@ -2427,252 +2425,6 @@ def _model_flow_kimi(config, current_model=""):
print("No change.")
def _model_flow_bedrock_api_key(config, region, current_model=""):
"""Bedrock API Key mode — uses the OpenAI-compatible bedrock-mantle endpoint.
For developers who don't have an AWS account but received a Bedrock API Key
from their AWS admin. Works like any OpenAI-compatible endpoint.
"""
from hermes_cli.auth import _prompt_model_selection, _save_model_choice, deactivate_provider
from hermes_cli.config import load_config, save_config, get_env_value, save_env_value
from hermes_cli.models import _PROVIDER_MODELS
mantle_base_url = f"https://bedrock-mantle.{region}.api.aws/v1"
# Prompt for API key
existing_key = get_env_value("AWS_BEARER_TOKEN_BEDROCK") or ""
if existing_key:
print(f" Bedrock API Key: {existing_key[:12]}... ✓")
else:
print(f" Endpoint: {mantle_base_url}")
print()
try:
import getpass
api_key = getpass.getpass(" Bedrock API Key: ").strip()
except (KeyboardInterrupt, EOFError):
print()
return
if not api_key:
print(" Cancelled.")
return
save_env_value("AWS_BEARER_TOKEN_BEDROCK", api_key)
existing_key = api_key
print(" ✓ API key saved.")
print()
# Model selection — use static list (mantle doesn't need boto3 for discovery)
model_list = _PROVIDER_MODELS.get("bedrock", [])
print(f" Showing {len(model_list)} curated models")
if model_list:
selected = _prompt_model_selection(model_list, current_model=current_model)
else:
try:
selected = input(" Model ID: ").strip()
except (KeyboardInterrupt, EOFError):
selected = None
if selected:
_save_model_choice(selected)
# Save as custom provider pointing to bedrock-mantle
cfg = load_config()
model = cfg.get("model")
if not isinstance(model, dict):
model = {"default": model} if model else {}
cfg["model"] = model
model["provider"] = "custom"
model["base_url"] = mantle_base_url
model.pop("api_mode", None) # chat_completions is the default
# Also save region in bedrock config for reference
bedrock_cfg = cfg.get("bedrock", {})
if not isinstance(bedrock_cfg, dict):
bedrock_cfg = {}
bedrock_cfg["region"] = region
cfg["bedrock"] = bedrock_cfg
# Save the API key env var name so hermes knows where to find it
save_env_value("OPENAI_API_KEY", existing_key)
save_env_value("OPENAI_BASE_URL", mantle_base_url)
save_config(cfg)
deactivate_provider()
print(f" Default model set to: {selected} (via Bedrock API Key, {region})")
print(f" Endpoint: {mantle_base_url}")
else:
print(" No change.")
def _model_flow_bedrock(config, current_model=""):
"""AWS Bedrock provider: verify credentials, pick region, discover models.
Uses the native Converse API via boto3 not the OpenAI-compatible endpoint.
Auth is handled by the AWS SDK default credential chain (env vars, profile,
instance role), so no API key prompt is needed.
"""
from hermes_cli.auth import _prompt_model_selection, _save_model_choice, deactivate_provider
from hermes_cli.config import load_config, save_config
from hermes_cli.models import _PROVIDER_MODELS
# 1. Check for AWS credentials
try:
from agent.bedrock_adapter import (
has_aws_credentials,
resolve_aws_auth_env_var,
resolve_bedrock_region,
discover_bedrock_models,
)
except ImportError:
print(" ✗ boto3 is not installed. Install it with:")
print(" pip install boto3")
print()
return
if not has_aws_credentials():
print(" ⚠ No AWS credentials detected via environment variables.")
print(" Bedrock will use boto3's default credential chain (IMDS, SSO, etc.)")
print()
auth_var = resolve_aws_auth_env_var()
if auth_var:
print(f" AWS credentials: {auth_var}")
else:
print(" AWS credentials: boto3 default chain (instance role / SSO)")
print()
# 2. Region selection
current_region = resolve_bedrock_region()
try:
region_input = input(f" AWS Region [{current_region}]: ").strip()
except (KeyboardInterrupt, EOFError):
print()
return
region = region_input or current_region
# 2b. Authentication mode
print(" Choose authentication method:")
print()
print(" 1. IAM credential chain (recommended)")
print(" Works with EC2 instance roles, SSO, env vars, aws configure")
print(" 2. Bedrock API Key")
print(" Enter your Bedrock API Key directly — also supports")
print(" team scenarios where an admin distributes keys")
print()
try:
auth_choice = input(" Choice [1]: ").strip()
except (KeyboardInterrupt, EOFError):
print()
return
if auth_choice == "2":
_model_flow_bedrock_api_key(config, region, current_model)
return
# 3. Model discovery — try live API first, fall back to static list
print(f" Discovering models in {region}...")
live_models = discover_bedrock_models(region)
if live_models:
_EXCLUDE_PREFIXES = (
"stability.", "cohere.embed", "twelvelabs.", "us.stability.",
"us.cohere.embed", "us.twelvelabs.", "global.cohere.embed",
"global.twelvelabs.",
)
_EXCLUDE_SUBSTRINGS = ("safeguard", "voxtral", "palmyra-vision")
filtered = []
for m in live_models:
mid = m["id"]
if any(mid.startswith(p) for p in _EXCLUDE_PREFIXES):
continue
if any(s in mid.lower() for s in _EXCLUDE_SUBSTRINGS):
continue
filtered.append(m)
# Deduplicate: prefer inference profiles (us.*, global.*) over bare
# foundation model IDs.
profile_base_ids = set()
for m in filtered:
mid = m["id"]
if mid.startswith(("us.", "global.")):
base = mid.split(".", 1)[1] if "." in mid[3:] else mid
profile_base_ids.add(base)
deduped = []
for m in filtered:
mid = m["id"]
if not mid.startswith(("us.", "global.")) and mid in profile_base_ids:
continue
deduped.append(m)
_RECOMMENDED = [
"us.anthropic.claude-sonnet-4-6",
"us.anthropic.claude-opus-4-6",
"us.anthropic.claude-haiku-4-5",
"us.amazon.nova-pro",
"us.amazon.nova-lite",
"us.amazon.nova-micro",
"deepseek.v3",
"us.meta.llama4-maverick",
"us.meta.llama4-scout",
]
def _sort_key(m):
mid = m["id"]
for i, rec in enumerate(_RECOMMENDED):
if mid.startswith(rec):
return (0, i, mid)
if mid.startswith("global."):
return (1, 0, mid)
return (2, 0, mid)
deduped.sort(key=_sort_key)
model_list = [m["id"] for m in deduped]
print(f" Found {len(model_list)} text model(s) (filtered from {len(live_models)} total)")
else:
model_list = _PROVIDER_MODELS.get("bedrock", [])
if model_list:
print(f" Using {len(model_list)} curated models (live discovery unavailable)")
else:
print(" No models found. Check IAM permissions for bedrock:ListFoundationModels.")
return
# 4. Model selection
if model_list:
selected = _prompt_model_selection(model_list, current_model=current_model)
else:
try:
selected = input(" Model ID: ").strip()
except (KeyboardInterrupt, EOFError):
selected = None
if selected:
_save_model_choice(selected)
cfg = load_config()
model = cfg.get("model")
if not isinstance(model, dict):
model = {"default": model} if model else {}
cfg["model"] = model
model["provider"] = "bedrock"
model["base_url"] = f"https://bedrock-runtime.{region}.amazonaws.com"
model.pop("api_mode", None) # bedrock_converse is auto-detected
bedrock_cfg = cfg.get("bedrock", {})
if not isinstance(bedrock_cfg, dict):
bedrock_cfg = {}
bedrock_cfg["region"] = region
cfg["bedrock"] = bedrock_cfg
save_config(cfg)
deactivate_provider()
print(f" Default model set to: {selected} (via AWS Bedrock, {region})")
else:
print(" No change.")
def _model_flow_api_key_provider(config, provider_id, current_model=""):
"""Generic flow for API-key providers (z.ai, MiniMax, OpenCode, etc.)."""
from hermes_cli.auth import (
@@ -4997,7 +4749,6 @@ For more help on a command:
# gateway start
gateway_start = gateway_subparsers.add_parser("start", help="Start the installed systemd/launchd background service")
gateway_start.add_argument("--system", action="store_true", help="Target the Linux system-level gateway service")
gateway_start.add_argument("--all", action="store_true", help="Kill ALL stale gateway processes across all profiles before starting")
# gateway stop
gateway_stop = gateway_subparsers.add_parser("stop", help="Stop gateway service")
@@ -5007,7 +4758,6 @@ For more help on a command:
# gateway restart
gateway_restart = gateway_subparsers.add_parser("restart", help="Restart gateway service")
gateway_restart.add_argument("--system", action="store_true", help="Target the Linux system-level gateway service")
gateway_restart.add_argument("--all", action="store_true", help="Kill ALL gateway processes across all profiles before restarting")
# gateway status
gateway_status = gateway_subparsers.add_parser("status", help="Show gateway status")
@@ -5321,7 +5071,6 @@ Examples:
hermes debug share --lines 500 Include more log lines
hermes debug share --expire 30 Keep paste for 30 days
hermes debug share --local Print report locally (no upload)
hermes debug delete <url> Delete a previously uploaded paste
""",
)
debug_sub = debug_parser.add_subparsers(dest="debug_command")
@@ -5341,14 +5090,6 @@ Examples:
"--local", action="store_true",
help="Print the report locally instead of uploading",
)
delete_parser = debug_sub.add_parser(
"delete",
help="Delete a paste uploaded by 'hermes debug share'",
)
delete_parser.add_argument(
"urls", nargs="*", default=[],
help="One or more paste URLs to delete (e.g. https://paste.rs/abc123)",
)
debug_parser.set_defaults(func=cmd_debug)
# =========================================================================
@@ -6303,37 +6044,7 @@ Examples:
sys.exit(1)
_processed_argv = _coalesce_session_name_args(sys.argv[1:])
# ── Defensive subparser routing (bpo-9338 workaround) ───────────
# On some Python versions (notably <3.11), argparse fails to route
# subcommand tokens when the parent parser has nargs='?' optional
# arguments (--continue). The symptom: "unrecognized arguments: model"
# even though 'model' is a registered subcommand.
#
# Fix: when argv contains a token matching a known subcommand, set
# subparsers.required=True to force deterministic routing. If that
# fails (e.g. 'hermes -c model' where 'model' is consumed as the
# session name for --continue), fall back to the default behaviour.
import io as _io
_known_cmds = set(subparsers.choices.keys()) if hasattr(subparsers, "choices") else set()
_has_cmd_token = any(t in _known_cmds for t in _processed_argv if not t.startswith("-"))
if _has_cmd_token:
subparsers.required = True
_saved_stderr = sys.stderr
try:
sys.stderr = _io.StringIO()
args = parser.parse_args(_processed_argv)
sys.stderr = _saved_stderr
except SystemExit:
sys.stderr = _saved_stderr
# Subcommand name was consumed as a flag value (e.g. -c model).
# Fall back to optional subparsers so argparse handles it normally.
subparsers.required = False
args = parser.parse_args(_processed_argv)
else:
subparsers.required = False
args = parser.parse_args(_processed_argv)
args = parser.parse_args(_processed_argv)
# Handle --version flag
if args.version:
+2 -4
View File
@@ -58,11 +58,9 @@ def _prompt(label: str, default: str | None = None, secret: bool = False) -> str
def _install_dependencies(provider_name: str) -> None:
"""Install pip dependencies declared in plugin.yaml."""
import subprocess
from plugins.memory import find_provider_dir
from pathlib import Path as _Path
plugin_dir = find_provider_dir(provider_name)
if not plugin_dir:
return
plugin_dir = _Path(__file__).parent.parent / "plugins" / "memory" / provider_name
yaml_path = plugin_dir / "plugin.yaml"
if not yaml_path.exists():
return
+1 -58
View File
@@ -303,22 +303,6 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"XiaomiMiMo/MiMo-V2-Flash",
"moonshotai/Kimi-K2-Thinking",
],
# AWS Bedrock — static fallback list used when dynamic discovery is
# unavailable (no boto3, no credentials, or API error). The agent
# prefers live discovery via ListFoundationModels + ListInferenceProfiles.
# Use inference profile IDs (us.*) since most models require them.
"bedrock": [
"us.anthropic.claude-sonnet-4-6",
"us.anthropic.claude-opus-4-6-v1",
"us.anthropic.claude-haiku-4-5-20251001-v1:0",
"us.anthropic.claude-sonnet-4-5-20250929-v1:0",
"us.amazon.nova-pro-v1:0",
"us.amazon.nova-lite-v1:0",
"us.amazon.nova-micro-v1:0",
"deepseek.v3.2",
"us.meta.llama4-maverick-17b-instruct-v1:0",
"us.meta.llama4-scout-17b-instruct-v1:0",
],
}
# ---------------------------------------------------------------------------
@@ -542,7 +526,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("deepseek", "DeepSeek", "DeepSeek (DeepSeek-V3, R1, coder — direct API)"),
ProviderEntry("xai", "xAI", "xAI (Grok models — direct API)"),
ProviderEntry("zai", "Z.AI / GLM", "Z.AI / GLM (Zhipu AI direct API)"),
ProviderEntry("kimi-coding", "Kimi / Kimi Coding Plan", "Kimi Coding Plan (api.kimi.com) & Moonshot API"),
ProviderEntry("kimi-coding", "Kimi / Moonshot", "Kimi / Moonshot (Moonshot AI direct API)"),
ProviderEntry("kimi-coding-cn", "Kimi / Moonshot (China)", "Kimi / Moonshot China (Moonshot CN direct API)"),
ProviderEntry("minimax", "MiniMax", "MiniMax (global direct API)"),
ProviderEntry("minimax-cn", "MiniMax (China)", "MiniMax China (domestic direct API)"),
@@ -552,7 +536,6 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("opencode-zen", "OpenCode Zen", "OpenCode Zen (35+ curated models, pay-as-you-go)"),
ProviderEntry("opencode-go", "OpenCode Go", "OpenCode Go (open models, $10/month subscription)"),
ProviderEntry("ai-gateway", "Vercel AI Gateway", "Vercel AI Gateway (200+ models, pay-per-use)"),
ProviderEntry("bedrock", "AWS Bedrock", "AWS Bedrock (Claude, Nova, Llama, DeepSeek — IAM or API key)"),
]
# Derived dicts — used throughout the codebase
@@ -604,10 +587,6 @@ _PROVIDER_ALIASES = {
"huggingface-hub": "huggingface",
"mimo": "xiaomi",
"xiaomi-mimo": "xiaomi",
"aws": "bedrock",
"aws-bedrock": "bedrock",
"amazon-bedrock": "bedrock",
"amazon": "bedrock",
"grok": "xai",
"x-ai": "xai",
"x.ai": "xai",
@@ -1978,42 +1957,6 @@ def validate_requested_model(
# api_models is None — couldn't reach API. Accept and persist,
# but warn so typos don't silently break things.
# Bedrock: use our own discovery instead of HTTP /models endpoint.
# Bedrock's bedrock-runtime URL doesn't support /models — it uses the
# AWS SDK control plane (ListFoundationModels + ListInferenceProfiles).
if normalized == "bedrock":
try:
from agent.bedrock_adapter import discover_bedrock_models, resolve_bedrock_region
region = resolve_bedrock_region()
discovered = discover_bedrock_models(region)
discovered_ids = {m["id"] for m in discovered}
if requested in discovered_ids:
return {
"accepted": True,
"persist": True,
"recognized": True,
"message": None,
}
# Not in discovered list — still accept (user may have custom
# inference profiles or cross-account access), but warn.
suggestions = get_close_matches(requested, list(discovered_ids), n=3, cutoff=0.4)
suggestion_text = ""
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
return {
"accepted": True,
"persist": True,
"recognized": False,
"message": (
f"Note: `{requested}` was not found in Bedrock model discovery for {region}. "
f"It may still work with custom inference profiles or cross-account access."
f"{suggestion_text}"
),
}
except Exception:
pass # Fall through to generic warning
provider_label = _PROVIDER_LABELS.get(normalized, normalized)
return {
"accepted": True,
-14
View File
@@ -236,12 +236,6 @@ ALIASES: Dict[str, str] = {
"mimo": "xiaomi",
"xiaomi-mimo": "xiaomi",
# bedrock
"aws": "bedrock",
"aws-bedrock": "bedrock",
"amazon-bedrock": "bedrock",
"amazon": "bedrock",
# arcee
"arcee-ai": "arcee",
"arceeai": "arcee",
@@ -268,7 +262,6 @@ _LABEL_OVERRIDES: Dict[str, str] = {
"copilot-acp": "GitHub Copilot ACP",
"xiaomi": "Xiaomi MiMo",
"local": "Local endpoint",
"bedrock": "AWS Bedrock",
}
@@ -278,7 +271,6 @@ TRANSPORT_TO_API_MODE: Dict[str, str] = {
"openai_chat": "chat_completions",
"anthropic_messages": "anthropic_messages",
"codex_responses": "codex_responses",
"bedrock_converse": "bedrock_converse",
}
@@ -396,10 +388,6 @@ def determine_api_mode(provider: str, base_url: str = "") -> str:
if pdef is not None:
return TRANSPORT_TO_API_MODE.get(pdef.transport, "chat_completions")
# Direct provider checks for providers not in HERMES_OVERLAYS
if provider == "bedrock":
return "bedrock_converse"
# URL-based heuristics for custom / unknown providers
if base_url:
url_lower = base_url.rstrip("/").lower()
@@ -407,8 +395,6 @@ def determine_api_mode(provider: str, base_url: str = "") -> str:
return "anthropic_messages"
if "api.openai.com" in url_lower:
return "codex_responses"
if "bedrock-runtime" in url_lower and "amazonaws.com" in url_lower:
return "bedrock_converse"
return "chat_completions"
+1 -72
View File
@@ -124,7 +124,7 @@ def _copilot_runtime_api_mode(model_cfg: Dict[str, Any], api_key: str) -> str:
return "chat_completions"
_VALID_API_MODES = {"chat_completions", "codex_responses", "anthropic_messages", "bedrock_converse"}
_VALID_API_MODES = {"chat_completions", "codex_responses", "anthropic_messages"}
def _parse_api_mode(raw: Any) -> Optional[str]:
@@ -836,77 +836,6 @@ def resolve_runtime_provider(
"requested_provider": requested_provider,
}
# AWS Bedrock (native Converse API via boto3)
if provider == "bedrock":
from agent.bedrock_adapter import (
has_aws_credentials,
resolve_aws_auth_env_var,
resolve_bedrock_region,
is_anthropic_bedrock_model,
)
# When the user explicitly selected bedrock (not auto-detected),
# trust boto3's credential chain — it handles IMDS, ECS task roles,
# Lambda execution roles, SSO, and other implicit sources that our
# env-var check can't detect.
is_explicit = requested_provider in ("bedrock", "aws", "aws-bedrock", "amazon-bedrock", "amazon")
if not is_explicit and not has_aws_credentials():
raise AuthError(
"No AWS credentials found for Bedrock. Configure one of:\n"
" - AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY\n"
" - AWS_PROFILE (for SSO / named profiles)\n"
" - IAM instance role (EC2, ECS, Lambda)\n"
"Or run 'aws configure' to set up credentials.",
code="no_aws_credentials",
)
# Read bedrock-specific config from config.yaml
from hermes_cli.config import load_config as _load_bedrock_config
_bedrock_cfg = _load_bedrock_config().get("bedrock", {})
# Region priority: config.yaml bedrock.region → env var → us-east-1
region = (_bedrock_cfg.get("region") or "").strip() or resolve_bedrock_region()
auth_source = resolve_aws_auth_env_var() or "aws-sdk-default-chain"
# Build guardrail config if configured
_gr = _bedrock_cfg.get("guardrail", {})
guardrail_config = None
if _gr.get("guardrail_identifier") and _gr.get("guardrail_version"):
guardrail_config = {
"guardrailIdentifier": _gr["guardrail_identifier"],
"guardrailVersion": _gr["guardrail_version"],
}
if _gr.get("stream_processing_mode"):
guardrail_config["streamProcessingMode"] = _gr["stream_processing_mode"]
if _gr.get("trace"):
guardrail_config["trace"] = _gr["trace"]
# Dual-path routing: Claude models use AnthropicBedrock SDK for full
# feature parity (prompt caching, thinking budgets, adaptive thinking).
# Non-Claude models use the Converse API for multi-model support.
_current_model = str(model_cfg.get("default") or "").strip()
if is_anthropic_bedrock_model(_current_model):
# Claude on Bedrock → AnthropicBedrock SDK → anthropic_messages path
runtime = {
"provider": "bedrock",
"api_mode": "anthropic_messages",
"base_url": f"https://bedrock-runtime.{region}.amazonaws.com",
"api_key": "aws-sdk",
"source": auth_source,
"region": region,
"bedrock_anthropic": True, # Signal to use AnthropicBedrock client
"requested_provider": requested_provider,
}
else:
# Non-Claude (Nova, DeepSeek, Llama, etc.) → Converse API
runtime = {
"provider": "bedrock",
"api_mode": "bedrock_converse",
"base_url": f"https://bedrock-runtime.{region}.amazonaws.com",
"api_key": "aws-sdk",
"source": auth_source,
"region": region,
"requested_provider": requested_provider,
}
if guardrail_config:
runtime["guardrail_config"] = guardrail_config
return runtime
# API-key providers (z.ai/GLM, Kimi, MiniMax, MiniMax-CN)
pconfig = PROVIDER_REGISTRY.get(provider)
if pconfig and pconfig.auth_type == "api_key":
+2 -48
View File
@@ -1592,29 +1592,6 @@ def setup_agent_settings(config: dict):
# =============================================================================
def _setup_telegram_auto() -> str | None:
"""Attempt automatic Telegram bot creation via Managed Bots (Bot API 9.6).
Returns the bot token on success, None on failure/skip.
"""
try:
from hermes_cli.telegram_managed_bot import auto_setup_telegram_bot
except ImportError:
return None
# Determine profile name for username generation
profile_name = None
try:
hermes_home = get_hermes_home()
profiles_dir = hermes_home.rstrip("/").rsplit("/", 1)[0] if "/profiles/" in hermes_home else ""
if profiles_dir:
profile_name = hermes_home.rstrip("/").rsplit("/", 1)[-1]
except Exception:
pass
return auto_setup_telegram_bot(profile_name=profile_name)
def _setup_telegram():
"""Configure Telegram bot credentials and allowlist."""
print_header("Telegram")
@@ -1633,31 +1610,8 @@ def _setup_telegram():
print_success("Telegram allowlist configured")
return
# Offer automatic setup via Telegram Managed Bots (Bot API 9.6)
print_info("How would you like to create your Telegram bot?")
print()
print_info(" [1] Automatic (recommended)")
print_info(" Scan a QR code → confirm in Telegram → done.")
print_info(" No token copy-paste needed.")
print()
print_info(" [2] Manual")
print_info(" Create a bot via @BotFather yourself and paste the token.")
print()
choice = prompt("Choice [1/2]", default="1")
token = None
if choice.strip() == "1":
token = _setup_telegram_auto()
if not token:
print()
print_info("Falling back to manual setup...")
print()
if not token:
print_info("Create a bot via @BotFather on Telegram")
token = prompt("Telegram bot token", password=True)
print_info("Create a bot via @BotFather on Telegram")
token = prompt("Telegram bot token", password=True)
if not token:
return
save_env_value("TELEGRAM_BOT_TOKEN", token)
-269
View File
@@ -1,269 +0,0 @@
"""Telegram Managed Bot — automatic bot creation via Bot API 9.6.
Uses Telegram's Managed Bots feature to create bots for users without
manual BotFather interaction. The flow:
1. CLI generates a pairing nonce and a t.me/newbot deep link.
2. User opens the link in Telegram and confirms bot creation.
3. A Nous-hosted manager bot receives the ``managed_bot`` update,
calls ``getManagedBotToken``, and stores the token keyed by nonce.
4. CLI polls the pairing API and retrieves the token.
5. Token is saved to ``.env`` zero manual copy-paste.
Requires:
- A Nous-hosted manager bot with Bot Management Mode enabled.
- A pairing API (Cloudflare Worker + KV or equivalent) at
``MANAGED_BOT_API_URL``.
"""
from __future__ import annotations
import secrets
import string
import sys
import time
import urllib.parse
from typing import Optional
import httpx
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
# Default pairing API base URL (Nous-hosted Cloudflare Worker).
# Override via config.yaml ``telegram.managed_bot_api_url`` for self-hosted.
DEFAULT_API_URL = "https://setup.hermes-agent.nousresearch.com"
# The Nous-hosted manager bot username (without @).
DEFAULT_MANAGER_BOT = "HermesSetupBot"
# How long to poll before giving up (seconds).
DEFAULT_POLL_TIMEOUT = 180
# Poll interval (seconds).
POLL_INTERVAL = 2
# ---------------------------------------------------------------------------
# QR code rendering
# ---------------------------------------------------------------------------
def render_qr_terminal(url: str) -> str:
"""Render a URL as a QR code string suitable for terminal output.
Uses the ``qrcode`` library if available, otherwise returns an empty
string (caller should fall back to printing the URL directly).
"""
try:
import io
import qrcode # type: ignore[import-untyped]
qr = qrcode.QRCode(
version=None,
error_correction=qrcode.constants.ERROR_CORRECT_L,
box_size=1,
border=1,
)
qr.add_data(url)
qr.make(fit=True)
buf = io.StringIO()
qr.print_ascii(out=buf, invert=True)
return buf.getvalue()
except ImportError:
return ""
def print_qr_code(url: str) -> None:
"""Print a QR code to stdout, with URL fallback if qrcode is missing."""
qr_text = render_qr_terminal(url)
if qr_text:
print(qr_text)
else:
print(f" (Install 'qrcode' for a scannable QR code: pip install qrcode)")
print(f" Link: {url}")
# ---------------------------------------------------------------------------
# Deep link generation
# ---------------------------------------------------------------------------
def _random_suffix(length: int = 4) -> str:
"""Generate a short random alphanumeric suffix for bot usernames."""
alphabet = string.ascii_lowercase + string.digits
return "".join(secrets.choice(alphabet) for _ in range(length))
def generate_bot_username(profile_name: Optional[str] = None) -> str:
"""Generate a suggested bot username like ``hermes_work_a7f3_bot``.
Telegram requires bot usernames to end with ``bot`` and be 5-32 chars.
"""
base = "hermes"
suffix = _random_suffix()
if profile_name and profile_name != "default":
# Sanitize profile name for Telegram username rules (a-z, 0-9, _)
clean = "".join(c if c.isalnum() else "_" for c in profile_name.lower())
clean = clean[:12] # Keep it short
return f"{base}_{clean}_{suffix}_bot"
return f"{base}_{suffix}_bot"
def generate_deep_link(
manager_bot: str = DEFAULT_MANAGER_BOT,
suggested_username: Optional[str] = None,
suggested_name: Optional[str] = None,
) -> str:
"""Build the ``t.me/newbot`` deep link for managed bot creation.
Format: ``https://t.me/newbot/{manager_bot}/{suggested_username}[?name={name}]``
"""
username = suggested_username or generate_bot_username()
base_url = f"https://t.me/newbot/{manager_bot}/{username}"
if suggested_name:
params = urllib.parse.urlencode({"name": suggested_name})
return f"{base_url}?{params}"
return base_url
# ---------------------------------------------------------------------------
# Pairing protocol (client side)
# ---------------------------------------------------------------------------
def generate_pairing_nonce() -> str:
"""Generate a cryptographically random pairing nonce (32 hex chars)."""
return secrets.token_hex(16)
def register_pairing(api_url: str, nonce: str, timeout: float = 10.0) -> bool:
"""Register a pairing nonce with the pairing API.
``POST /pair`` body: ``{"nonce": "..."}``
Returns True on success, False on failure.
"""
try:
resp = httpx.post(
f"{api_url}/pair",
json={"nonce": nonce},
timeout=timeout,
)
return resp.status_code in (200, 201)
except httpx.HTTPError:
return False
def poll_for_token(
api_url: str,
nonce: str,
timeout: float = DEFAULT_POLL_TIMEOUT,
interval: float = POLL_INTERVAL,
) -> Optional[str]:
"""Poll the pairing API until the bot token is available or timeout.
``GET /pair/{nonce}`` 200 with ``{"token": "..."}`` when ready,
404 while waiting.
Returns the bot token string on success, None on timeout/failure.
"""
deadline = time.monotonic() + timeout
while time.monotonic() < deadline:
try:
resp = httpx.get(
f"{api_url}/pair/{nonce}",
timeout=10.0,
)
if resp.status_code == 200:
data = resp.json()
token = data.get("token")
if token:
return token
# 404 = not yet ready, keep polling
except httpx.HTTPError:
pass # Network hiccup, retry
time.sleep(interval)
return None
# ---------------------------------------------------------------------------
# Orchestrator — called from setup wizard
# ---------------------------------------------------------------------------
def auto_setup_telegram_bot(
api_url: str = DEFAULT_API_URL,
manager_bot: str = DEFAULT_MANAGER_BOT,
profile_name: Optional[str] = None,
poll_timeout: float = DEFAULT_POLL_TIMEOUT,
) -> Optional[str]:
"""Run the full automatic Telegram bot creation flow.
1. Generate nonce + suggested username.
2. Register the nonce with the pairing API.
3. Print the QR code / deep link for the user.
4. Poll until the token arrives (or timeout).
Returns the bot token on success, None on failure/timeout.
"""
nonce = generate_pairing_nonce()
username = generate_bot_username(profile_name)
deep_link = generate_deep_link(
manager_bot=manager_bot,
suggested_username=username,
suggested_name="Hermes Agent",
)
# Embed the nonce in the pairing API so the manager bot can match it.
# The manager bot receives the suggested_username from Telegram's
# managed_bot update and uses it to look up the nonce.
if not register_pairing(api_url, nonce):
print(" ✗ Could not reach the Hermes setup service.")
print(" Try the manual setup instead, or check your network.")
return None
print()
print(" Scan this QR code with your phone, or open the link below:")
print()
print_qr_code(deep_link)
print()
print(" When Telegram opens, tap 'Create Bot' to confirm.")
print(" (You can edit the bot name and username before confirming)")
print()
# Animated waiting
spinner_chars = "⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏"
start = time.monotonic()
deadline = start + poll_timeout
idx = 0
while time.monotonic() < deadline:
char = spinner_chars[idx % len(spinner_chars)]
elapsed = int(time.monotonic() - start)
remaining = int(poll_timeout - elapsed)
sys.stdout.write(f"\r {char} Waiting for bot creation... ({remaining}s remaining) ")
sys.stdout.flush()
idx += 1
try:
resp = httpx.get(f"{api_url}/pair/{nonce}", timeout=10.0)
if resp.status_code == 200:
data = resp.json()
token = data.get("token")
if token:
sys.stdout.write("\r ✓ Bot created successfully! \n")
sys.stdout.flush()
return token
except httpx.HTTPError:
pass
time.sleep(POLL_INTERVAL)
sys.stdout.write("\r ✗ Timed out waiting for bot creation. \n")
sys.stdout.flush()
print(" The bot may still be created — check Telegram.")
print(" You can paste the token manually below, or re-run setup.")
return None
+16 -38
View File
@@ -63,7 +63,6 @@ CONFIGURABLE_TOOLSETS = [
("clarify", "❓ Clarifying Questions", "clarify"),
("delegation", "👥 Task Delegation", "delegate_task"),
("cronjob", "⏰ Cron Jobs", "create/list/update/pause/resume/run, with optional attached skills"),
("messaging", "📨 Cross-Platform Messaging", "send_message"),
("rl", "🧪 RL Training", "Tinker-Atropos training tools"),
("homeassistant", "🏠 Home Assistant", "smart home device control"),
]
@@ -122,7 +121,6 @@ TOOL_CATEGORIES = {
"providers": [
{
"name": "Nous Subscription",
"badge": "subscription",
"tag": "Managed OpenAI TTS billed to your subscription",
"env_vars": [],
"tts_provider": "openai",
@@ -132,15 +130,13 @@ TOOL_CATEGORIES = {
},
{
"name": "Microsoft Edge TTS",
"badge": "★ recommended · free",
"tag": "Good quality, no API key needed",
"tag": "Free - no API key needed",
"env_vars": [],
"tts_provider": "edge",
},
{
"name": "OpenAI TTS",
"badge": "paid",
"tag": "High quality voices",
"tag": "Premium - high quality voices",
"env_vars": [
{"key": "VOICE_TOOLS_OPENAI_KEY", "prompt": "OpenAI API key", "url": "https://platform.openai.com/api-keys"},
],
@@ -148,8 +144,7 @@ TOOL_CATEGORIES = {
},
{
"name": "ElevenLabs",
"badge": "paid",
"tag": "Most natural voices",
"tag": "Premium - most natural voices",
"env_vars": [
{"key": "ELEVENLABS_API_KEY", "prompt": "ElevenLabs API key", "url": "https://elevenlabs.io/app/settings/api-keys"},
],
@@ -157,8 +152,7 @@ TOOL_CATEGORIES = {
},
{
"name": "Mistral (Voxtral TTS)",
"badge": "paid",
"tag": "Multilingual, native Opus",
"tag": "Multilingual, native Opus, needs MISTRAL_API_KEY",
"env_vars": [
{"key": "MISTRAL_API_KEY", "prompt": "Mistral API key", "url": "https://console.mistral.ai/"},
],
@@ -174,7 +168,6 @@ TOOL_CATEGORIES = {
"providers": [
{
"name": "Nous Subscription",
"badge": "subscription",
"tag": "Managed Firecrawl billed to your subscription",
"web_backend": "firecrawl",
"env_vars": [],
@@ -184,8 +177,7 @@ TOOL_CATEGORIES = {
},
{
"name": "Firecrawl Cloud",
"badge": "★ recommended",
"tag": "Full-featured search, extract, and crawl",
"tag": "Hosted service - search, extract, and crawl",
"web_backend": "firecrawl",
"env_vars": [
{"key": "FIRECRAWL_API_KEY", "prompt": "Firecrawl API key", "url": "https://firecrawl.dev"},
@@ -193,8 +185,7 @@ TOOL_CATEGORIES = {
},
{
"name": "Exa",
"badge": "paid",
"tag": "Neural search with semantic understanding",
"tag": "AI-native search and contents",
"web_backend": "exa",
"env_vars": [
{"key": "EXA_API_KEY", "prompt": "Exa API key", "url": "https://exa.ai"},
@@ -202,8 +193,7 @@ TOOL_CATEGORIES = {
},
{
"name": "Parallel",
"badge": "paid",
"tag": "AI-powered search and extract",
"tag": "AI-native search and extract",
"web_backend": "parallel",
"env_vars": [
{"key": "PARALLEL_API_KEY", "prompt": "Parallel API key", "url": "https://parallel.ai"},
@@ -211,8 +201,7 @@ TOOL_CATEGORIES = {
},
{
"name": "Tavily",
"badge": "free tier",
"tag": "Search, extract, and crawl — 1000 free searches/mo",
"tag": "AI-native search, extract, and crawl",
"web_backend": "tavily",
"env_vars": [
{"key": "TAVILY_API_KEY", "prompt": "Tavily API key", "url": "https://app.tavily.com/home"},
@@ -220,8 +209,7 @@ TOOL_CATEGORIES = {
},
{
"name": "Firecrawl Self-Hosted",
"badge": "free · self-hosted",
"tag": "Run your own Firecrawl instance (Docker)",
"tag": "Free - run your own instance",
"web_backend": "firecrawl",
"env_vars": [
{"key": "FIRECRAWL_API_URL", "prompt": "Your Firecrawl instance URL (e.g., http://localhost:3002)"},
@@ -235,7 +223,6 @@ TOOL_CATEGORIES = {
"providers": [
{
"name": "Nous Subscription",
"badge": "subscription",
"tag": "Managed FAL image generation billed to your subscription",
"env_vars": [],
"requires_nous_auth": True,
@@ -244,7 +231,6 @@ TOOL_CATEGORIES = {
},
{
"name": "FAL.ai",
"badge": "paid",
"tag": "FLUX 2 Pro with auto-upscaling",
"env_vars": [
{"key": "FAL_KEY", "prompt": "FAL API key", "url": "https://fal.ai/dashboard/keys"},
@@ -258,7 +244,6 @@ TOOL_CATEGORIES = {
"providers": [
{
"name": "Nous Subscription (Browser Use cloud)",
"badge": "subscription",
"tag": "Managed Browser Use billed to your subscription",
"env_vars": [],
"browser_provider": "browser-use",
@@ -269,16 +254,14 @@ TOOL_CATEGORIES = {
},
{
"name": "Local Browser",
"badge": "★ recommended · free",
"tag": "Headless Chromium, no API key needed",
"tag": "Free headless Chromium (no API key needed)",
"env_vars": [],
"browser_provider": "local",
"post_setup": "agent_browser",
},
{
"name": "Browserbase",
"badge": "paid",
"tag": "Cloud browser with stealth and proxies",
"tag": "Cloud browser with stealth & proxies",
"env_vars": [
{"key": "BROWSERBASE_API_KEY", "prompt": "Browserbase API key", "url": "https://browserbase.com"},
{"key": "BROWSERBASE_PROJECT_ID", "prompt": "Browserbase project ID"},
@@ -288,7 +271,6 @@ TOOL_CATEGORIES = {
},
{
"name": "Browser Use",
"badge": "paid",
"tag": "Cloud browser with remote execution",
"env_vars": [
{"key": "BROWSER_USE_API_KEY", "prompt": "Browser Use API key", "url": "https://browser-use.com"},
@@ -298,7 +280,6 @@ TOOL_CATEGORIES = {
},
{
"name": "Firecrawl",
"badge": "paid",
"tag": "Cloud browser with remote execution",
"env_vars": [
{"key": "FIRECRAWL_API_KEY", "prompt": "Firecrawl API key", "url": "https://firecrawl.dev"},
@@ -308,8 +289,7 @@ TOOL_CATEGORIES = {
},
{
"name": "Camofox",
"badge": "free · local",
"tag": "Anti-detection browser (Firefox/Camoufox)",
"tag": "Local anti-detection browser (Firefox/Camoufox)",
"env_vars": [
{"key": "CAMOFOX_URL", "prompt": "Camofox server URL", "default": "http://localhost:9377",
"url": "https://github.com/jo-inc/camofox-browser"},
@@ -858,8 +838,7 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
# Plain text labels only (no ANSI codes in menu items)
provider_choices = []
for p in providers:
badge = f" [{p['badge']}]" if p.get("badge") else ""
tag = f"{p['tag']}" if p.get("tag") else ""
tag = f" ({p['tag']})" if p.get("tag") else ""
configured = ""
env_vars = p.get("env_vars", [])
if not env_vars or all(get_env_value(v["key"]) for v in env_vars):
@@ -869,7 +848,7 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
configured = ""
else:
configured = " [configured]"
provider_choices.append(f"{p['name']}{badge}{tag}{configured}")
provider_choices.append(f"{p['name']}{tag}{configured}")
# Add skip option
provider_choices.append("Skip — keep defaults / configure later")
@@ -1125,8 +1104,7 @@ def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):
provider_choices = []
for p in providers:
badge = f" [{p['badge']}]" if p.get("badge") else ""
tag = f"{p['tag']}" if p.get("tag") else ""
tag = f" ({p['tag']})" if p.get("tag") else ""
configured = ""
env_vars = p.get("env_vars", [])
if not env_vars or all(get_env_value(v["key"]) for v in env_vars):
@@ -1136,7 +1114,7 @@ def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):
configured = ""
else:
configured = " [configured]"
provider_choices.append(f"{p['name']}{badge}{tag}{configured}")
provider_choices.append(f"{p['name']}{tag}{configured}")
default_idx = _detect_active_provider_index(providers, config)
-1
View File
@@ -358,7 +358,6 @@ def _add_rotating_handler(
path.parent.mkdir(parents=True, exist_ok=True)
handler = _ManagedRotatingFileHandler(
str(path), maxBytes=max_bytes, backupCount=backup_count,
encoding="utf-8",
)
handler.setLevel(level)
handler.setFormatter(formatter)
+40 -2
View File
@@ -26,7 +26,7 @@ import logging
import threading
from typing import Dict, Any, List, Optional, Tuple
from tools.registry import discover_builtin_tools, registry
from tools.registry import registry
from toolsets import resolve_toolset, validate_toolset
logger = logging.getLogger(__name__)
@@ -129,7 +129,45 @@ def _run_async(coro):
# Tool Discovery (importing each module triggers its registry.register calls)
# =============================================================================
discover_builtin_tools()
def _discover_tools():
"""Import all tool modules to trigger their registry.register() calls.
Wrapped in a function so import errors in optional tools (e.g., fal_client
not installed) don't prevent the rest from loading.
"""
_modules = [
"tools.web_tools",
"tools.terminal_tool",
"tools.file_tools",
"tools.vision_tools",
"tools.mixture_of_agents_tool",
"tools.image_generation_tool",
"tools.skills_tool",
"tools.skill_manager_tool",
"tools.browser_tool",
"tools.cronjob_tools",
"tools.rl_training_tool",
"tools.tts_tool",
"tools.todo_tool",
"tools.memory_tool",
"tools.session_search_tool",
"tools.clarify_tool",
"tools.code_execution_tool",
"tools.delegate_tool",
"tools.process_registry",
"tools.send_message_tool",
# "tools.honcho_tools", # Removed — Honcho is now a memory provider plugin
"tools.homeassistant_tool",
]
import importlib
for mod_name in _modules:
try:
importlib.import_module(mod_name)
except Exception as e:
logger.warning("Could not import tool module %s: %s", mod_name, e)
_discover_tools()
# MCP tool discovery (external MCP servers from config)
try:
+26 -115
View File
@@ -1,22 +1,18 @@
"""Memory provider plugin discovery.
Scans two directories for memory provider plugins:
1. Bundled providers: ``plugins/memory/<name>/`` (shipped with hermes-agent)
2. User-installed providers: ``$HERMES_HOME/plugins/<name>/``
Scans ``plugins/memory/<name>/`` directories for memory provider plugins.
Each subdirectory must contain ``__init__.py`` with a class implementing
the MemoryProvider ABC. On name collisions, bundled providers take
precedence.
the MemoryProvider ABC.
Only ONE provider can be active at a time, selected via
``memory.provider`` in config.yaml.
Memory providers are separate from the general plugin system they live
in the repo and are always available without user installation. Only ONE
can be active at a time, selected via ``memory.provider`` in config.yaml.
Usage:
from plugins.memory import discover_memory_providers, load_memory_provider
available = discover_memory_providers() # [(name, desc, available), ...]
provider = load_memory_provider("mnemosyne") # MemoryProvider instance
provider = load_memory_provider("openviking") # MemoryProvider instance
"""
from __future__ import annotations
@@ -33,101 +29,24 @@ logger = logging.getLogger(__name__)
_MEMORY_PLUGINS_DIR = Path(__file__).parent
# ---------------------------------------------------------------------------
# Directory helpers
# ---------------------------------------------------------------------------
def _get_user_plugins_dir() -> Optional[Path]:
"""Return ``$HERMES_HOME/plugins/`` or None if unavailable."""
try:
from hermes_constants import get_hermes_home
d = get_hermes_home() / "plugins"
return d if d.is_dir() else None
except Exception:
return None
def _is_memory_provider_dir(path: Path) -> bool:
"""Heuristic: does *path* look like a memory provider plugin?
Checks for ``register_memory_provider`` or ``MemoryProvider`` in the
``__init__.py`` source. Cheap text scan no import needed.
"""
init_file = path / "__init__.py"
if not init_file.exists():
return False
try:
source = init_file.read_text(errors="replace")[:8192]
return "register_memory_provider" in source or "MemoryProvider" in source
except Exception:
return False
def _iter_provider_dirs() -> List[Tuple[str, Path]]:
"""Yield ``(name, path)`` for all discovered provider directories.
Scans bundled first, then user-installed. Bundled takes precedence
on name collisions (first-seen wins via ``seen`` set).
"""
seen: set = set()
dirs: List[Tuple[str, Path]] = []
# 1. Bundled providers (plugins/memory/<name>/)
if _MEMORY_PLUGINS_DIR.is_dir():
for child in sorted(_MEMORY_PLUGINS_DIR.iterdir()):
if not child.is_dir() or child.name.startswith(("_", ".")):
continue
if not (child / "__init__.py").exists():
continue
seen.add(child.name)
dirs.append((child.name, child))
# 2. User-installed providers ($HERMES_HOME/plugins/<name>/)
user_dir = _get_user_plugins_dir()
if user_dir:
for child in sorted(user_dir.iterdir()):
if not child.is_dir() or child.name.startswith(("_", ".")):
continue
if child.name in seen:
continue # bundled takes precedence
if not _is_memory_provider_dir(child):
continue # skip non-memory plugins
dirs.append((child.name, child))
return dirs
def find_provider_dir(name: str) -> Optional[Path]:
"""Resolve a provider name to its directory.
Checks bundled first, then user-installed.
"""
# Bundled
bundled = _MEMORY_PLUGINS_DIR / name
if bundled.is_dir() and (bundled / "__init__.py").exists():
return bundled
# User-installed
user_dir = _get_user_plugins_dir()
if user_dir:
user = user_dir / name
if user.is_dir() and _is_memory_provider_dir(user):
return user
return None
# ---------------------------------------------------------------------------
# Public API
# ---------------------------------------------------------------------------
def discover_memory_providers() -> List[Tuple[str, str, bool]]:
"""Scan bundled and user-installed directories for available providers.
"""Scan plugins/memory/ for available providers.
Returns list of (name, description, is_available) tuples.
Bundled providers take precedence on name collisions.
Does NOT import the providers just reads plugin.yaml for metadata
and does a lightweight availability check.
"""
results = []
if not _MEMORY_PLUGINS_DIR.is_dir():
return results
for child in sorted(_MEMORY_PLUGINS_DIR.iterdir()):
if not child.is_dir() or child.name.startswith(("_", ".")):
continue
init_file = child / "__init__.py"
if not init_file.exists():
continue
for name, child in _iter_provider_dirs():
# Read description from plugin.yaml if available
desc = ""
yaml_file = child / "plugin.yaml"
@@ -151,7 +70,7 @@ def discover_memory_providers() -> List[Tuple[str, str, bool]]:
except Exception:
available = False
results.append((name, desc, available))
results.append((child.name, desc, available))
return results
@@ -159,15 +78,11 @@ def discover_memory_providers() -> List[Tuple[str, str, bool]]:
def load_memory_provider(name: str) -> Optional["MemoryProvider"]:
"""Load and return a MemoryProvider instance by name.
Checks both bundled (``plugins/memory/<name>/``) and user-installed
(``$HERMES_HOME/plugins/<name>/``) directories. Bundled takes
precedence on name collisions.
Returns None if the provider is not found or fails to load.
"""
provider_dir = find_provider_dir(name)
if not provider_dir:
logger.debug("Memory provider '%s' not found in bundled or user plugins", name)
provider_dir = _MEMORY_PLUGINS_DIR / name
if not provider_dir.is_dir():
logger.debug("Memory provider '%s' not found in %s", name, _MEMORY_PLUGINS_DIR)
return None
try:
@@ -189,10 +104,7 @@ def _load_provider_from_dir(provider_dir: Path) -> Optional["MemoryProvider"]:
- A top-level class that extends MemoryProvider we instantiate it
"""
name = provider_dir.name
# Use a separate namespace for user-installed plugins so they don't
# collide with bundled providers in sys.modules.
_is_bundled = _MEMORY_PLUGINS_DIR in provider_dir.parents or provider_dir.parent == _MEMORY_PLUGINS_DIR
module_name = f"plugins.memory.{name}" if _is_bundled else f"_hermes_user_memory.{name}"
module_name = f"plugins.memory.{name}"
init_file = provider_dir / "__init__.py"
if not init_file.exists():
@@ -345,16 +257,15 @@ def discover_plugin_cli_commands() -> List[dict]:
return results
# Only look at the active provider's directory
plugin_dir = find_provider_dir(active_provider)
if not plugin_dir:
plugin_dir = _MEMORY_PLUGINS_DIR / active_provider
if not plugin_dir.is_dir():
return results
cli_file = plugin_dir / "cli.py"
if not cli_file.exists():
return results
_is_bundled = _MEMORY_PLUGINS_DIR in plugin_dir.parents or plugin_dir.parent == _MEMORY_PLUGINS_DIR
module_name = f"plugins.memory.{active_provider}.cli" if _is_bundled else f"_hermes_user_memory.{active_provider}.cli"
module_name = f"plugins.memory.{active_provider}.cli"
try:
# Import the CLI module (lightweight — no SDK needed)
if module_name in sys.modules:
+7 -1
View File
@@ -58,7 +58,8 @@ def resolve_config_path() -> Path:
Resolution order:
1. $HERMES_HOME/honcho.json (profile-local, if it exists)
2. ~/.honcho/config.json (global, cross-app interop)
2. ~/.hermes/honcho.json (default profile shared host blocks live here)
3. ~/.honcho/config.json (global, cross-app interop)
Returns the global path if none exist (for first-time setup writes).
"""
@@ -66,6 +67,11 @@ def resolve_config_path() -> Path:
if local_path.exists():
return local_path
# Default profile's config — host blocks accumulate here via setup/clone
default_path = Path.home() / ".hermes" / "honcho.json"
if default_path != local_path and default_path.exists():
return default_path
return GLOBAL_CONFIG_PATH
+9 -46
View File
@@ -10,9 +10,8 @@ lifecycle instead of read-only search endpoints.
Config via environment variables (profile-scoped via each profile's .env):
OPENVIKING_ENDPOINT Server URL (default: http://127.0.0.1:1933)
OPENVIKING_API_KEY API key (required for authenticated servers)
OPENVIKING_ACCOUNT Tenant account (default: default)
OPENVIKING_ACCOUNT Tenant account (default: root)
OPENVIKING_USER Tenant user (default: default)
OPENVIKING_AGENT Tenant agent (default: hermes)
Capabilities:
- Automatic memory extraction on session commit (6 categories)
@@ -81,12 +80,11 @@ class _VikingClient:
"""Thin HTTP client for the OpenViking REST API."""
def __init__(self, endpoint: str, api_key: str = "",
account: str = "", user: str = "", agent: str = ""):
account: str = "", user: str = ""):
self._endpoint = endpoint.rstrip("/")
self._api_key = api_key
self._account = account or os.environ.get("OPENVIKING_ACCOUNT", "default")
self._account = account or os.environ.get("OPENVIKING_ACCOUNT", "root")
self._user = user or os.environ.get("OPENVIKING_USER", "default")
self._agent = agent or os.environ.get("OPENVIKING_AGENT", "hermes")
self._httpx = _get_httpx()
if self._httpx is None:
raise ImportError("httpx is required for OpenViking: pip install httpx")
@@ -96,7 +94,6 @@ class _VikingClient:
"Content-Type": "application/json",
"X-OpenViking-Account": self._account,
"X-OpenViking-User": self._user,
"X-OpenViking-Agent": self._agent,
}
if self._api_key:
h["X-API-Key"] = self._api_key
@@ -285,44 +282,20 @@ class OpenVikingMemoryProvider(MemoryProvider):
},
{
"key": "api_key",
"description": "OpenViking API key (leave blank for local dev mode)",
"description": "OpenViking API key",
"secret": True,
"env_var": "OPENVIKING_API_KEY",
},
{
"key": "account",
"description": "OpenViking tenant account ID ([default], used when local mode, OPENVIKING_API_KEY is empty)",
"default": "default",
"env_var": "OPENVIKING_ACCOUNT",
},
{
"key": "user",
"description": "OpenViking user ID within the account ([default], used when local mode, OPENVIKING_API_KEY is empty)",
"default": "default",
"env_var": "OPENVIKING_USER",
},
{
"key": "agent",
"description": "OpenViking agent ID within the account ([hermes], useful in multi-agent mode)",
"default": "hermes",
"env_var": "OPENVIKING_AGENT",
},
]
def initialize(self, session_id: str, **kwargs) -> None:
self._endpoint = os.environ.get("OPENVIKING_ENDPOINT", _DEFAULT_ENDPOINT)
self._api_key = os.environ.get("OPENVIKING_API_KEY", "")
self._account = os.environ.get("OPENVIKING_ACCOUNT", "default")
self._user = os.environ.get("OPENVIKING_USER", "default")
self._agent = os.environ.get("OPENVIKING_AGENT", "hermes")
self._session_id = session_id
self._turn_count = 0
try:
self._client = _VikingClient(
self._endpoint, self._api_key,
account=self._account, user=self._user, agent=self._agent,
)
self._client = _VikingClient(self._endpoint, self._api_key)
if not self._client.health():
logger.warning("OpenViking server at %s is not reachable", self._endpoint)
self._client = None
@@ -352,8 +325,7 @@ class OpenVikingMemoryProvider(MemoryProvider):
"(abstract/overview/full), viking_browse to explore.\n"
"Use viking_remember to store facts, viking_add_resource to index URLs/docs."
)
except Exception as e:
logger.warning("OpenViking system_prompt_block failed: %s", e)
except Exception:
return (
"# OpenViking Knowledge Base\n"
f"Active. Endpoint: {self._endpoint}\n"
@@ -379,10 +351,7 @@ class OpenVikingMemoryProvider(MemoryProvider):
def _run():
try:
client = _VikingClient(
self._endpoint, self._api_key,
account=self._account, user=self._user, agent=self._agent,
)
client = _VikingClient(self._endpoint, self._api_key)
resp = client.post("/api/v1/search/find", {
"query": query,
"top_k": 5,
@@ -417,10 +386,7 @@ class OpenVikingMemoryProvider(MemoryProvider):
def _sync():
try:
client = _VikingClient(
self._endpoint, self._api_key,
account=self._account, user=self._user, agent=self._agent,
)
client = _VikingClient(self._endpoint, self._api_key)
sid = self._session_id
# Add user message
@@ -476,10 +442,7 @@ class OpenVikingMemoryProvider(MemoryProvider):
def _write():
try:
client = _VikingClient(
self._endpoint, self._api_key,
account=self._account, user=self._user, agent=self._agent,
)
client = _VikingClient(self._endpoint, self._api_key)
# Add as a user message with memory context so the commit
# picks it up as an explicit memory during extraction
client.post(f"/api/v1/sessions/{self._session_id}/messages", {
-4
View File
@@ -32,8 +32,6 @@ dependencies = [
"fal-client>=0.13.1,<1",
# Text-to-speech (Edge TTS is free, no API key needed)
"edge-tts>=7.2.7,<8",
# QR code rendering for Telegram auto-setup
"qrcode>=8.0,<9",
# Skills Hub (GitHub App JWT auth — optional, only needed for bot identity)
"PyJWT[crypto]>=2.12.0,<3", # CVE-2026-32597
]
@@ -65,7 +63,6 @@ homeassistant = ["aiohttp>=3.9.0,<4"]
sms = ["aiohttp>=3.9.0,<4"]
acp = ["agent-client-protocol>=0.9.0,<1.0"]
mistral = ["mistralai>=2.3.0,<3"]
bedrock = ["boto3>=1.35.0,<2"]
termux = [
# Tested Android / Termux path: keeps the core CLI feature-rich while
# avoiding extras that currently depend on non-Android wheels (notably
@@ -111,7 +108,6 @@ all = [
"hermes-agent[dingtalk]",
"hermes-agent[feishu]",
"hermes-agent[mistral]",
"hermes-agent[bedrock]",
"hermes-agent[web]",
]
+73 -664
View File
File diff suppressed because it is too large Load Diff
+1 -6
View File
@@ -28,7 +28,7 @@ BOLD='\033[1m'
# Configuration
REPO_URL_SSH="git@github.com:NousResearch/hermes-agent.git"
REPO_URL_HTTPS="https://github.com/NousResearch/hermes-agent.git"
HERMES_HOME="${HERMES_HOME:-$HOME/.hermes}"
HERMES_HOME="$HOME/.hermes"
INSTALL_DIR="${HERMES_INSTALL_DIR:-$HERMES_HOME/hermes-agent}"
PYTHON_VERSION="3.11"
NODE_VERSION="22"
@@ -66,10 +66,6 @@ while [[ $# -gt 0 ]]; do
INSTALL_DIR="$2"
shift 2
;;
--hermes-home)
HERMES_HOME="$2"
shift 2
;;
-h|--help)
echo "Hermes Agent Installer"
echo ""
@@ -80,7 +76,6 @@ while [[ $# -gt 0 ]]; do
echo " --skip-setup Skip interactive setup wizard"
echo " --branch NAME Git branch to install (default: main)"
echo " --dir PATH Installation directory (default: ~/.hermes/hermes-agent)"
echo " --hermes-home PATH Data directory (default: ~/.hermes, or \$HERMES_HOME)"
echo " -h, --help Show this help"
exit 0
;;
+1 -13
View File
@@ -62,9 +62,7 @@ AUTHOR_MAP = {
"258577966+voidborne-d@users.noreply.github.com": "voidborne-d",
"70424851+insecurejezza@users.noreply.github.com": "insecurejezza",
"259807879+Bartok9@users.noreply.github.com": "Bartok9",
"241404605+MestreY0d4-Uninter@users.noreply.github.com": "MestreY0d4-Uninter",
"268667990+Roy-oss1@users.noreply.github.com": "Roy-oss1",
"241404605+MestreY0d4-Uninter@users.noreply.github.com": "MestreY0d4-Uninter",
# contributors (manual mapping from git names)
"dmayhem93@gmail.com": "dmahan93",
"samherring99@gmail.com": "samherring99",
@@ -77,12 +75,8 @@ AUTHOR_MAP = {
"abdullahfarukozden@gmail.com": "Farukest",
"lovre.pesut@gmail.com": "rovle",
"hakanerten02@hotmail.com": "teyrebaz33",
"ruzzgarcn@gmail.com": "Ruzzgar",
"alireza78.crypto@gmail.com": "alireza78a",
"brooklyn.bb.nicholson@gmail.com": "brooklynnicholson",
"4317663+helix4u@users.noreply.github.com": "helix4u",
"331214+counterposition@users.noreply.github.com": "counterposition",
"blspear@gmail.com": "BrennerSpear",
"gpickett00@gmail.com": "gpickett00",
"mcosma@gmail.com": "wakamex",
"clawdia.nash@proton.me": "clawdia-nash",
@@ -101,9 +95,7 @@ AUTHOR_MAP = {
"vincentcharlebois@gmail.com": "vincentcharlebois",
"aryan@synvoid.com": "aryansingh",
"johnsonblake1@gmail.com": "blakejohnson",
"greer.guthrie@gmail.com": "g-guthrie",
"kennyx102@gmail.com": "bobashopcashier",
"shokatalishaikh95@gmail.com": "areu01or00",
"bryan@intertwinesys.com": "bryanyoung",
"christo.mitov@gmail.com": "christomitov",
"hermes@nousresearch.com": "NousResearch",
@@ -123,8 +115,6 @@ AUTHOR_MAP = {
"m@statecraft.systems": "mbierling",
"balyan.sid@gmail.com": "balyansid",
"oluwadareab12@gmail.com": "bennytimz",
"simon@simonmarcus.org": "simon-marcus",
"1243352777@qq.com": "zons-zhaozhy",
# ── bulk addition: 75 emails resolved via API, PR salvage bodies, noreply
# crossref, and GH contributor list matching (April 2026 audit) ──
"1115117931@qq.com": "aaronagent",
@@ -197,14 +187,12 @@ AUTHOR_MAP = {
"yangzhi.see@gmail.com": "SeeYangZhi",
"yongtenglei@gmail.com": "yongtenglei",
"young@YoungdeMacBook-Pro.local": "YoungYang963",
"ysfalweshcan@gmail.com": "Junass1",
"ysfalweshcan@gmail.com": "Awsh1",
"ysfwaxlycan@gmail.com": "WAXLYY",
"yusufalweshdemir@gmail.com": "Dusk1e",
"zhouboli@gmail.com": "zhouboli",
"zqiao@microsoft.com": "tomqiaozc",
"zzn+pa@zzn.im": "xinbenlv",
"zaynjarvis@gmail.com": "ZaynJarvis",
"zhiheng.liu@bytedance.com": "ZaynJarvis",
}
@@ -313,7 +313,7 @@ Type these during an interactive chat session.
```
~/.hermes/config.yaml Main configuration
~/.hermes/.env API keys and secrets
$HERMES_HOME/skills/ Installed skills
~/.hermes/skills/ Installed skills
~/.hermes/sessions/ Session transcripts
~/.hermes/logs/ Gateway and error logs
~/.hermes/auth.json OAuth tokens and credential pools
@@ -650,9 +650,9 @@ registry.register(
)
```
**2. Add to `toolsets.py`**`_HERMES_CORE_TOOLS` list.
**2. Add import** in `model_tools.py``_discover_tools()` list.
Auto-discovery: any `tools/*.py` file with a top-level `registry.register()` call is imported automatically — no manual list needed.
**3. Add to `toolsets.py`** → `_HERMES_CORE_TOOLS` list.
All handlers must return JSON strings. Use `get_hermes_home()` for paths, never hardcode `~/.hermes`.
+1 -1
View File
@@ -334,7 +334,7 @@ When the user asks you to "review PR #N", "look at this PR", or gives you a PR U
### Step 1: Set up environment
```bash
source "${HERMES_HOME:-$HOME/.hermes}/skills/github/github-auth/scripts/gh-env.sh"
source ~/.hermes/skills/github/github-auth/scripts/gh-env.sh
# Or run the inline setup block from the top of this skill
```
@@ -6,7 +6,7 @@ All requests need: `-H "Authorization: token $GITHUB_TOKEN"`
Use the `gh-env.sh` helper to set `$GITHUB_TOKEN`, `$GH_OWNER`, `$GH_REPO` automatically:
```bash
source "${HERMES_HOME:-$HOME/.hermes}/skills/github/github-auth/scripts/gh-env.sh"
source ~/.hermes/skills/github/github-auth/scripts/gh-env.sh
```
## Repositories
@@ -98,7 +98,7 @@ def find_nearby(lat: float, lon: float, types: list[str], radius: int = 1500, li
# Get coordinates (nodes have lat/lon directly, ways/relations use center)
plat = el.get("lat") or (el.get("center", {}) or {}).get("lat")
plon = el.get("lon") or (el.get("center", {}) or {}).get("lon")
if plat is None or plon is None:
if not plat or not plon:
continue
dist = haversine(lat, lon, plat, plon)
+105 -136
View File
@@ -1,19 +1,35 @@
---
name: google-workspace
description: Gmail, Calendar, Drive, Contacts, Sheets, and Docs integration for Hermes. Uses Hermes-managed OAuth2 setup, prefers the Google Workspace CLI (`gws`) when available for broader API coverage, and falls back to the Python client libraries otherwise.
version: 1.0.0
description: Gmail, Calendar, Drive, Contacts, Sheets, and Docs integration via gws CLI (googleworkspace/cli). Uses OAuth2 with automatic token refresh via bridge script. Requires gws binary.
version: 2.0.0
author: Nous Research
license: MIT
required_credential_files:
- path: google_token.json
description: Google OAuth2 token (created by setup script)
- path: google_client_secret.json
description: Google OAuth2 client credentials (downloaded from Google Cloud Console)
metadata:
hermes:
tags: [Google, Gmail, Calendar, Drive, Sheets, Docs, Contacts, Email, OAuth]
tags: [Google, Gmail, Calendar, Drive, Sheets, Docs, Contacts, Email, OAuth, gws]
homepage: https://github.com/NousResearch/hermes-agent
related_skills: [himalaya]
---
# Google Workspace
Gmail, Calendar, Drive, Contacts, Sheets, and Docs — through Hermes-managed OAuth and a thin CLI wrapper. When `gws` is installed, the skill uses it as the execution backend for broader Google Workspace coverage; otherwise it falls back to the bundled Python client implementation.
Gmail, Calendar, Drive, Contacts, Sheets, and Docs — powered by `gws` (Google's official Rust CLI). The skill provides a backward-compatible Python wrapper that handles OAuth token refresh and delegates to `gws`.
## Architecture
```
google_api.py → gws_bridge.py → gws CLI
(argparse compat) (token refresh) (Google APIs)
```
- `setup.py` handles OAuth2 (headless-compatible, works on CLI/Telegram/Discord)
- `gws_bridge.py` refreshes the Hermes token and injects it into `gws` via `GOOGLE_WORKSPACE_CLI_TOKEN`
- `google_api.py` provides the same CLI interface as v1 but delegates to `gws`
## References
@@ -22,7 +38,22 @@ Gmail, Calendar, Drive, Contacts, Sheets, and Docs — through Hermes-managed OA
## Scripts
- `scripts/setup.py` — OAuth2 setup (run once to authorize)
- `scripts/google_api.py` — compatibility wrapper CLI. It prefers `gws` for operations when available, while preserving Hermes' existing JSON output contract.
- `scripts/gws_bridge.py` — Token refresh bridge to gws CLI
- `scripts/google_api.py` — Backward-compatible API wrapper (delegates to gws)
## Prerequisites
Install `gws`:
```bash
cargo install google-workspace-cli
# or via npm (recommended, downloads prebuilt binary):
npm install -g @googleworkspace/cli
# or via Homebrew:
brew install googleworkspace-cli
```
Verify: `gws --version`
## First-Time Setup
@@ -32,7 +63,13 @@ on CLI, Telegram, Discord, or any platform.
Define a shorthand first:
```bash
GSETUP="python ${HERMES_HOME:-$HOME/.hermes}/skills/productivity/google-workspace/scripts/setup.py"
HERMES_HOME="${HERMES_HOME:-$HOME/.hermes}"
GWORKSPACE_SKILL_DIR="$HERMES_HOME/skills/productivity/google-workspace"
PYTHON_BIN="${HERMES_PYTHON:-python3}"
if [ -x "$HERMES_HOME/hermes-agent/venv/bin/python" ]; then
PYTHON_BIN="$HERMES_HOME/hermes-agent/venv/bin/python"
fi
GSETUP="$PYTHON_BIN $GWORKSPACE_SKILL_DIR/scripts/setup.py"
```
### Step 0: Check if already set up
@@ -45,166 +82,88 @@ If it prints `AUTHENTICATED`, skip to Usage — setup is already done.
### Step 1: Triage — ask the user what they need
Before starting OAuth setup, ask the user TWO questions:
**Question 1: "What Google services do you need? Just email, or also
Calendar/Drive/Sheets/Docs?"**
- **Email only** They don't need this skill at all. Use the `himalaya` skill
instead — it works with a Gmail App Password (Settings → Security → App
Passwords) and takes 2 minutes to set up. No Google Cloud project needed.
Load the himalaya skill and follow its setup instructions.
- **Email only** → Use the `himalaya` skill instead — simpler setup.
- **Calendar, Drive, Sheets, Docs (or email + these)** → Continue below.
- **Email + Calendar** → Continue with this skill, but use
`--services email,calendar` during auth so the consent screen only asks for
the scopes they actually need.
**Partial scopes**: Users can authorize only a subset of services. The setup
script accepts partial scopes and warns about missing ones.
- **Calendar/Drive/Sheets/Docs only** → Continue with this skill and use a
narrower `--services` set like `calendar,drive,sheets,docs`.
**Question 2: "Does your Google account use Advanced Protection?"**
- **Full Workspace access** → Continue with this skill and use the default
`all` service set.
**Question 2: "Does your Google account use Advanced Protection (hardware
security keys required to sign in)? If you're not sure, you probably don't
— it's something you would have explicitly enrolled in."**
- **No / Not sure** → Normal setup. Continue below.
- **Yes** → Their Workspace admin must add the OAuth client ID to the org's
allowed apps list before Step 4 will work. Let them know upfront.
- **No / Not sure** → Normal setup.
- **Yes** → Workspace admin must add the OAuth client ID to allowed apps first.
### Step 2: Create OAuth credentials (one-time, ~5 minutes)
Tell the user:
> You need a Google Cloud OAuth client. This is a one-time setup:
>
> 1. Create or select a project:
> https://console.cloud.google.com/projectselector2/home/dashboard
> 2. Enable the required APIs from the API Library:
> https://console.cloud.google.com/apis/library
> Enable: Gmail API, Google Calendar API, Google Drive API,
> Google Sheets API, Google Docs API, People API
> 3. Create the OAuth client here:
> https://console.cloud.google.com/apis/credentials
> Credentials → Create Credentials → OAuth 2.0 Client ID
> 4. Application type: "Desktop app" → Create
> 5. If the app is still in Testing, add the user's Google account as a test user here:
> https://console.cloud.google.com/auth/audience
> Audience → Test users → Add users
> 6. Download the JSON file and tell me the file path
>
> Important Hermes CLI note: if the file path starts with `/`, do NOT send only the bare path as its own message in the CLI, because it can be mistaken for a slash command. Send it in a sentence instead, like:
> `The JSON file path is: /home/user/Downloads/client_secret_....json`
Once they provide the path:
> 1. Go to https://console.cloud.google.com/apis/credentials
> 2. Create a project (or use an existing one)
> 3. Enable the APIs you need (Gmail, Calendar, Drive, Sheets, Docs, People)
> 4. Credentials → Create Credentials → OAuth 2.0 Client ID → Desktop app
> 5. Download JSON and tell me the file path
```bash
$GSETUP --client-secret /path/to/client_secret.json
```
If they paste the raw client ID / client secret values instead of a file path,
write a valid Desktop OAuth JSON file for them yourself, save it somewhere
explicit (for example `~/Downloads/hermes-google-client-secret.json`), then run
`--client-secret` against that file.
### Step 3: Get authorization URL
Use the service set chosen in Step 1. Examples:
```bash
$GSETUP --auth-url --services email,calendar --format json
$GSETUP --auth-url --services calendar,drive,sheets,docs --format json
$GSETUP --auth-url --services all --format json
$GSETUP --auth-url
```
This returns JSON with an `auth_url` field and also saves the exact URL to
`~/.hermes/google_oauth_last_url.txt`.
Agent rules for this step:
- Extract the `auth_url` field and send that exact URL to the user as a single line.
- Tell the user that the browser will likely fail on `http://localhost:1` after approval, and that this is expected.
- Tell them to copy the ENTIRE redirected URL from the browser address bar.
- If the user gets `Error 403: access_denied`, send them directly to `https://console.cloud.google.com/auth/audience` to add themselves as a test user.
Send the URL to the user. After authorizing, they paste back the redirect URL or code.
### Step 4: Exchange the code
The user will paste back either a URL like `http://localhost:1/?code=4/0A...&scope=...`
or just the code string. Either works. The `--auth-url` step stores a temporary
pending OAuth session locally so `--auth-code` can complete the PKCE exchange
later, even on headless systems:
```bash
$GSETUP --auth-code "THE_URL_OR_CODE_THE_USER_PASTED" --format json
$GSETUP --auth-code "THE_URL_OR_CODE_THE_USER_PASTED"
```
If `--auth-code` fails because the code expired, was already used, or came from
an older browser tab, it now returns a fresh `fresh_auth_url`. In that case,
immediately send the new URL to the user and have them retry with the newest
browser redirect only.
### Step 5: Verify
```bash
$GSETUP --check
```
Should print `AUTHENTICATED`. Setup is complete — token refreshes automatically from now on.
### Notes
- Token is stored at `~/.hermes/google_token.json` and auto-refreshes.
- Pending OAuth session state/verifier are stored temporarily at `~/.hermes/google_oauth_pending.json` until exchange completes.
- If `gws` is installed, `google_api.py` points it at the same `~/.hermes/google_token.json` credentials file. Users do not need to run a separate `gws auth login` flow.
- To revoke: `$GSETUP --revoke`
Should print `AUTHENTICATED`. Token refreshes automatically from now on.
## Usage
All commands go through the API script. Set `GAPI` as a shorthand:
All commands go through the API script:
```bash
GAPI="python ${HERMES_HOME:-$HOME/.hermes}/skills/productivity/google-workspace/scripts/google_api.py"
HERMES_HOME="${HERMES_HOME:-$HOME/.hermes}"
GWORKSPACE_SKILL_DIR="$HERMES_HOME/skills/productivity/google-workspace"
PYTHON_BIN="${HERMES_PYTHON:-python3}"
if [ -x "$HERMES_HOME/hermes-agent/venv/bin/python" ]; then
PYTHON_BIN="$HERMES_HOME/hermes-agent/venv/bin/python"
fi
GAPI="$PYTHON_BIN $GWORKSPACE_SKILL_DIR/scripts/google_api.py"
```
### Gmail
```bash
# Search (returns JSON array with id, from, subject, date, snippet)
$GAPI gmail search "is:unread" --max 10
$GAPI gmail search "from:boss@company.com newer_than:1d"
$GAPI gmail search "has:attachment filename:pdf newer_than:7d"
# Read full message (returns JSON with body text)
$GAPI gmail get MESSAGE_ID
# Send
$GAPI gmail send --to user@example.com --subject "Hello" --body "Message text"
$GAPI gmail send --to user@example.com --subject "Report" --body "<h1>Q4</h1><p>Details...</p>" --html
$GAPI gmail send --to user@example.com --subject "Hello" --from '"Research Agent" <user@example.com>' --body "Message text"
# Reply (automatically threads and sets In-Reply-To)
$GAPI gmail send --to user@example.com --subject "Report" --body "<h1>Q4</h1>" --html
$GAPI gmail reply MESSAGE_ID --body "Thanks, that works for me."
$GAPI gmail reply MESSAGE_ID --from '"Support Bot" <user@example.com>' --body "Thanks"
# Labels
$GAPI gmail labels
$GAPI gmail modify MESSAGE_ID --add-labels LABEL_ID
$GAPI gmail modify MESSAGE_ID --remove-labels UNREAD
```
### Calendar
```bash
# List events (defaults to next 7 days)
$GAPI calendar list
$GAPI calendar list --start 2026-03-01T00:00:00Z --end 2026-03-07T23:59:59Z
# Create event (ISO 8601 with timezone required)
$GAPI calendar create --summary "Team Standup" --start 2026-03-01T10:00:00-06:00 --end 2026-03-01T10:30:00-06:00
$GAPI calendar create --summary "Lunch" --start 2026-03-01T12:00:00Z --end 2026-03-01T13:00:00Z --location "Cafe"
$GAPI calendar create --summary "Review" --start 2026-03-01T14:00:00Z --end 2026-03-01T15:00:00Z --attendees "alice@co.com,bob@co.com"
# Delete event
$GAPI calendar create --summary "Standup" --start 2026-03-01T10:00:00+01:00 --end 2026-03-01T10:30:00+01:00
$GAPI calendar create --summary "Review" --start ... --end ... --attendees "alice@co.com,bob@co.com"
$GAPI calendar delete EVENT_ID
```
@@ -224,13 +183,8 @@ $GAPI contacts list --max 20
### Sheets
```bash
# Read
$GAPI sheets get SHEET_ID "Sheet1!A1:D10"
# Write
$GAPI sheets update SHEET_ID "Sheet1!A1:B2" --values '[["Name","Score"],["Alice","95"]]'
# Append rows
$GAPI sheets append SHEET_ID "Sheet1!A:C" --values '[["new","row","data"]]'
```
@@ -240,37 +194,52 @@ $GAPI sheets append SHEET_ID "Sheet1!A:C" --values '[["new","row","data"]]'
$GAPI docs get DOC_ID
```
### Direct gws access (advanced)
For operations not covered by the wrapper, use `gws_bridge.py` directly:
```bash
GBRIDGE="$PYTHON_BIN $GWORKSPACE_SKILL_DIR/scripts/gws_bridge.py"
$GBRIDGE calendar +agenda --today --format table
$GBRIDGE gmail +triage --labels --format json
$GBRIDGE drive +upload ./report.pdf
$GBRIDGE sheets +read --spreadsheet SHEET_ID --range "Sheet1!A1:D10"
```
## Output Format
All commands return JSON. Parse with `jq` or read directly. Key fields:
All commands return JSON via `gws --format json`. Key output shapes:
- **Gmail search**: `[{id, threadId, from, to, subject, date, snippet, labels}]`
- **Gmail get**: `{id, threadId, from, to, subject, date, labels, body}`
- **Gmail send/reply**: `{status: "sent", id, threadId}`
- **Calendar list**: `[{id, summary, start, end, location, description, htmlLink}]`
- **Calendar create**: `{status: "created", id, summary, htmlLink}`
- **Drive search**: `[{id, name, mimeType, modifiedTime, webViewLink}]`
- **Contacts list**: `[{name, emails: [...], phones: [...]}]`
- **Sheets get**: `[[cell, cell, ...], ...]`
- **Gmail search/triage**: Array of message summaries (sender, subject, date, snippet)
- **Gmail get/read**: Message object with headers and body text
- **Gmail send/reply**: Confirmation with message ID
- **Calendar list/agenda**: Array of event objects (summary, start, end, location)
- **Calendar create**: Confirmation with event ID and htmlLink
- **Drive search**: Array of file objects (id, name, mimeType, webViewLink)
- **Sheets get/read**: 2D array of cell values
- **Docs get**: Full document JSON (use `body.content` for text extraction)
- **Contacts list**: Array of person objects with names, emails, phones
Parse output with `jq` or read JSON directly.
## Rules
1. **Never send email or create/delete events without confirming with the user first.** Show the draft content and ask for approval.
2. **Check auth before first use** — run `setup.py --check`. If it fails, guide the user through setup.
3. **Use the Gmail search syntax reference** for complex queries — load it with `skill_view("google-workspace", file_path="references/gmail-search-syntax.md")`.
4. **Calendar times must include timezone** always use ISO 8601 with offset (e.g., `2026-03-01T10:00:00-06:00`) or UTC (`Z`).
5. **Respect rate limits** — avoid rapid-fire sequential API calls. Batch reads when possible.
1. **Never send email or create/delete events without confirming with the user first.**
2. **Check auth before first use** — run `setup.py --check`.
3. **Use the Gmail search syntax reference** for complex queries.
4. **Calendar times must include timezone** — ISO 8601 with offset or UTC.
5. **Respect rate limits** — avoid rapid-fire sequential API calls.
## Troubleshooting
| Problem | Fix |
|---------|-----|
| `NOT_AUTHENTICATED` | Run setup Steps 2-5 above |
| `REFRESH_FAILED` | Token revoked or expired — redo Steps 3-5 |
| `HttpError 403: Insufficient Permission` | Missing API scope — `$GSETUP --revoke` then redo Steps 3-5 |
| `HttpError 403: Access Not Configured` | API not enabled — user needs to enable it in Google Cloud Console |
| `ModuleNotFoundError` | Run `$GSETUP --install-deps` |
| Advanced Protection blocks auth | Workspace admin must allowlist the OAuth client ID |
| `NOT_AUTHENTICATED` | Run setup Steps 2-5 |
| `REFRESH_FAILED` | Token revoked — redo Steps 3-5 |
| `gws: command not found` | Install: `npm install -g @googleworkspace/cli` |
| `HttpError 403` | Missing scope — `$GSETUP --revoke` then redo Steps 3-5 |
| `HttpError 403: Access Not Configured` | Enable API in Google Cloud Console |
| Advanced Protection blocks auth | Admin must allowlist the OAuth client ID |
## Revoking Access
@@ -1,17 +1,17 @@
#!/usr/bin/env python3
"""Google Workspace API CLI for Hermes Agent.
Uses the Google Workspace CLI (`gws`) when available, but preserves the
existing Hermes-facing JSON contract and falls back to the Python client
libraries if `gws` is not installed.
Thin wrapper that delegates to gws (googleworkspace/cli) via gws_bridge.py.
Maintains the same CLI interface for backward compatibility with Hermes skills.
Usage:
python google_api.py gmail search "is:unread" [--max 10]
python google_api.py gmail get MESSAGE_ID
python google_api.py gmail send --to user@example.com --subject "Hi" --body "Hello"
python google_api.py gmail reply MESSAGE_ID --body "Thanks"
python google_api.py calendar list [--from DATE] [--to DATE] [--calendar primary]
python google_api.py calendar list [--start DATE] [--end DATE] [--calendar primary]
python google_api.py calendar create --summary "Meeting" --start DATETIME --end DATETIME
python google_api.py calendar delete EVENT_ID
python google_api.py drive search "budget report" [--max 10]
python google_api.py contacts list [--max 20]
python google_api.py sheets get SHEET_ID RANGE
@@ -21,396 +21,47 @@ Usage:
"""
import argparse
import base64
import json
import os
import shutil
import subprocess
import sys
from datetime import datetime, timedelta, timezone
from email.mime.text import MIMEText
from pathlib import Path
HERMES_HOME = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
TOKEN_PATH = HERMES_HOME / "google_token.json"
CLIENT_SECRET_PATH = HERMES_HOME / "google_client_secret.json"
SCOPES = [
"https://www.googleapis.com/auth/gmail.readonly",
"https://www.googleapis.com/auth/gmail.send",
"https://www.googleapis.com/auth/gmail.modify",
"https://www.googleapis.com/auth/calendar",
"https://www.googleapis.com/auth/drive.readonly",
"https://www.googleapis.com/auth/contacts.readonly",
"https://www.googleapis.com/auth/spreadsheets",
"https://www.googleapis.com/auth/documents.readonly",
]
BRIDGE = Path(__file__).parent / "gws_bridge.py"
PYTHON = sys.executable
def _ensure_authenticated():
if not TOKEN_PATH.exists():
print("Not authenticated. Run the setup script first:", file=sys.stderr)
print(f" python {Path(__file__).parent / 'setup.py'}", file=sys.stderr)
sys.exit(1)
def _stored_token_scopes() -> list[str]:
try:
data = json.loads(TOKEN_PATH.read_text())
except Exception:
return list(SCOPES)
scopes = data.get("scopes")
if isinstance(scopes, list) and scopes:
return scopes
return list(SCOPES)
def _gws_binary() -> str | None:
override = os.getenv("HERMES_GWS_BIN")
if override:
return override
return shutil.which("gws")
def _gws_env() -> dict[str, str]:
env = os.environ.copy()
env["GOOGLE_WORKSPACE_CLI_CREDENTIALS_FILE"] = str(TOKEN_PATH)
return env
def _run_gws(parts: list[str], *, params: dict | None = None, body: dict | None = None):
binary = _gws_binary()
if not binary:
raise RuntimeError("gws not installed")
_ensure_authenticated()
cmd = [binary, *parts]
if params is not None:
cmd.extend(["--params", json.dumps(params)])
if body is not None:
cmd.extend(["--json", json.dumps(body)])
def gws(*args: str) -> None:
"""Call gws via the bridge and exit with its return code."""
result = subprocess.run(
cmd,
capture_output=True,
text=True,
env=_gws_env(),
[PYTHON, str(BRIDGE)] + list(args),
env={**os.environ, "HERMES_HOME": os.environ.get("HERMES_HOME", str(Path.home() / ".hermes"))},
)
if result.returncode != 0:
err = result.stderr.strip() or result.stdout.strip() or "Unknown gws error"
print(err, file=sys.stderr)
sys.exit(result.returncode or 1)
stdout = result.stdout.strip()
if not stdout:
return {}
try:
return json.loads(stdout)
except json.JSONDecodeError:
print("ERROR: Unexpected non-JSON output from gws:", file=sys.stderr)
print(stdout, file=sys.stderr)
sys.exit(1)
sys.exit(result.returncode)
def _headers_dict(msg: dict) -> dict[str, str]:
return {h["name"]: h["value"] for h in msg.get("payload", {}).get("headers", [])}
def _extract_message_body(msg: dict) -> str:
body = ""
payload = msg.get("payload", {})
if payload.get("body", {}).get("data"):
body = base64.urlsafe_b64decode(payload["body"]["data"]).decode("utf-8", errors="replace")
elif payload.get("parts"):
for part in payload["parts"]:
if part.get("mimeType") == "text/plain" and part.get("body", {}).get("data"):
body = base64.urlsafe_b64decode(part["body"]["data"]).decode("utf-8", errors="replace")
break
if not body:
for part in payload["parts"]:
if part.get("mimeType") == "text/html" and part.get("body", {}).get("data"):
body = base64.urlsafe_b64decode(part["body"]["data"]).decode("utf-8", errors="replace")
break
return body
def _extract_doc_text(doc: dict) -> str:
text_parts = []
for element in doc.get("body", {}).get("content", []):
paragraph = element.get("paragraph", {})
for pe in paragraph.get("elements", []):
text_run = pe.get("textRun", {})
if text_run.get("content"):
text_parts.append(text_run["content"])
return "".join(text_parts)
def _datetime_with_timezone(value: str) -> str:
if not value:
return value
if "T" not in value:
return value
if value.endswith("Z"):
return value
tail = value[10:]
if "+" in tail or "-" in tail:
return value
return value + "Z"
def get_credentials():
"""Load and refresh credentials from token file."""
_ensure_authenticated()
from google.oauth2.credentials import Credentials
from google.auth.transport.requests import Request
creds = Credentials.from_authorized_user_file(str(TOKEN_PATH), _stored_token_scopes())
if creds.expired and creds.refresh_token:
creds.refresh(Request())
TOKEN_PATH.write_text(creds.to_json())
if not creds.valid:
print("Token is invalid. Re-run setup.", file=sys.stderr)
sys.exit(1)
return creds
def build_service(api, version):
from googleapiclient.discovery import build
return build(api, version, credentials=get_credentials())
# =========================================================================
# Gmail
# =========================================================================
# -- Gmail --
def gmail_search(args):
if _gws_binary():
results = _run_gws(
["gmail", "users", "messages", "list"],
params={"userId": "me", "q": args.query, "maxResults": args.max},
)
messages = results.get("messages", [])
output = []
for msg_meta in messages:
msg = _run_gws(
["gmail", "users", "messages", "get"],
params={
"userId": "me",
"id": msg_meta["id"],
"format": "metadata",
"metadataHeaders": ["From", "To", "Subject", "Date"],
},
)
headers = _headers_dict(msg)
output.append(
{
"id": msg["id"],
"threadId": msg["threadId"],
"from": headers.get("From", ""),
"to": headers.get("To", ""),
"subject": headers.get("Subject", ""),
"date": headers.get("Date", ""),
"snippet": msg.get("snippet", ""),
"labels": msg.get("labelIds", []),
}
)
print(json.dumps(output, indent=2, ensure_ascii=False))
return
service = build_service("gmail", "v1")
results = service.users().messages().list(
userId="me", q=args.query, maxResults=args.max
).execute()
messages = results.get("messages", [])
if not messages:
print("No messages found.")
return
output = []
for msg_meta in messages:
msg = service.users().messages().get(
userId="me", id=msg_meta["id"], format="metadata",
metadataHeaders=["From", "To", "Subject", "Date"],
).execute()
headers = _headers_dict(msg)
output.append({
"id": msg["id"],
"threadId": msg["threadId"],
"from": headers.get("From", ""),
"to": headers.get("To", ""),
"subject": headers.get("Subject", ""),
"date": headers.get("Date", ""),
"snippet": msg.get("snippet", ""),
"labels": msg.get("labelIds", []),
})
print(json.dumps(output, indent=2, ensure_ascii=False))
cmd = ["gmail", "+triage", "--query", args.query, "--max", str(args.max), "--format", "json"]
gws(*cmd)
def gmail_get(args):
if _gws_binary():
msg = _run_gws(
["gmail", "users", "messages", "get"],
params={"userId": "me", "id": args.message_id, "format": "full"},
)
headers = _headers_dict(msg)
result = {
"id": msg["id"],
"threadId": msg["threadId"],
"from": headers.get("From", ""),
"to": headers.get("To", ""),
"subject": headers.get("Subject", ""),
"date": headers.get("Date", ""),
"labels": msg.get("labelIds", []),
"body": _extract_message_body(msg),
}
print(json.dumps(result, indent=2, ensure_ascii=False))
return
service = build_service("gmail", "v1")
msg = service.users().messages().get(
userId="me", id=args.message_id, format="full"
).execute()
headers = _headers_dict(msg)
result = {
"id": msg["id"],
"threadId": msg["threadId"],
"from": headers.get("From", ""),
"to": headers.get("To", ""),
"subject": headers.get("Subject", ""),
"date": headers.get("Date", ""),
"labels": msg.get("labelIds", []),
"body": _extract_message_body(msg),
}
print(json.dumps(result, indent=2, ensure_ascii=False))
gws("gmail", "+read", "--id", args.message_id, "--headers", "--format", "json")
def gmail_send(args):
if _gws_binary():
message = MIMEText(args.body, "html" if args.html else "plain")
message["to"] = args.to
message["subject"] = args.subject
if args.cc:
message["cc"] = args.cc
if args.from_header:
message["from"] = args.from_header
raw = base64.urlsafe_b64encode(message.as_bytes()).decode()
body = {"raw": raw}
if args.thread_id:
body["threadId"] = args.thread_id
result = _run_gws(
["gmail", "users", "messages", "send"],
params={"userId": "me"},
body=body,
)
print(json.dumps({"status": "sent", "id": result["id"], "threadId": result.get("threadId", "")}, indent=2))
return
service = build_service("gmail", "v1")
message = MIMEText(args.body, "html" if args.html else "plain")
message["to"] = args.to
message["subject"] = args.subject
cmd = ["gmail", "+send", "--to", args.to, "--subject", args.subject, "--body", args.body, "--format", "json"]
if args.cc:
message["cc"] = args.cc
if args.from_header:
message["from"] = args.from_header
raw = base64.urlsafe_b64encode(message.as_bytes()).decode()
body = {"raw": raw}
if args.thread_id:
body["threadId"] = args.thread_id
result = service.users().messages().send(userId="me", body=body).execute()
print(json.dumps({"status": "sent", "id": result["id"], "threadId": result.get("threadId", "")}, indent=2))
cmd += ["--cc", args.cc]
if args.html:
cmd.append("--html")
gws(*cmd)
def gmail_reply(args):
if _gws_binary():
original = _run_gws(
["gmail", "users", "messages", "get"],
params={
"userId": "me",
"id": args.message_id,
"format": "metadata",
"metadataHeaders": ["From", "Subject", "Message-ID"],
},
)
headers = _headers_dict(original)
subject = headers.get("Subject", "")
if not subject.startswith("Re:"):
subject = f"Re: {subject}"
message = MIMEText(args.body)
message["to"] = headers.get("From", "")
message["subject"] = subject
if args.from_header:
message["from"] = args.from_header
if headers.get("Message-ID"):
message["In-Reply-To"] = headers["Message-ID"]
message["References"] = headers["Message-ID"]
raw = base64.urlsafe_b64encode(message.as_bytes()).decode()
result = _run_gws(
["gmail", "users", "messages", "send"],
params={"userId": "me"},
body={"raw": raw, "threadId": original["threadId"]},
)
print(json.dumps({"status": "sent", "id": result["id"], "threadId": result.get("threadId", "")}, indent=2))
return
service = build_service("gmail", "v1")
original = service.users().messages().get(
userId="me", id=args.message_id, format="metadata",
metadataHeaders=["From", "Subject", "Message-ID"],
).execute()
headers = _headers_dict(original)
subject = headers.get("Subject", "")
if not subject.startswith("Re:"):
subject = f"Re: {subject}"
message = MIMEText(args.body)
message["to"] = headers.get("From", "")
message["subject"] = subject
if args.from_header:
message["from"] = args.from_header
if headers.get("Message-ID"):
message["In-Reply-To"] = headers["Message-ID"]
message["References"] = headers["Message-ID"]
raw = base64.urlsafe_b64encode(message.as_bytes()).decode()
body = {"raw": raw, "threadId": original["threadId"]}
result = service.users().messages().send(userId="me", body=body).execute()
print(json.dumps({"status": "sent", "id": result["id"], "threadId": result.get("threadId", "")}, indent=2))
gws("gmail", "+reply", "--message-id", args.message_id, "--body", args.body, "--format", "json")
def gmail_labels(args):
if _gws_binary():
results = _run_gws(["gmail", "users", "labels", "list"], params={"userId": "me"})
labels = [{"id": l["id"], "name": l["name"], "type": l.get("type", "")} for l in results.get("labels", [])]
print(json.dumps(labels, indent=2))
return
service = build_service("gmail", "v1")
results = service.users().labels().list(userId="me").execute()
labels = [{"id": l["id"], "name": l["name"], "type": l.get("type", "")} for l in results.get("labels", [])]
print(json.dumps(labels, indent=2))
gws("gmail", "users", "labels", "list", "--params", json.dumps({"userId": "me"}), "--format", "json")
def gmail_modify(args):
body = {}
@@ -418,310 +69,145 @@ def gmail_modify(args):
body["addLabelIds"] = args.add_labels.split(",")
if args.remove_labels:
body["removeLabelIds"] = args.remove_labels.split(",")
if _gws_binary():
result = _run_gws(
["gmail", "users", "messages", "modify"],
params={"userId": "me", "id": args.message_id},
body=body,
)
print(json.dumps({"id": result["id"], "labels": result.get("labelIds", [])}, indent=2))
return
service = build_service("gmail", "v1")
result = service.users().messages().modify(userId="me", id=args.message_id, body=body).execute()
print(json.dumps({"id": result["id"], "labels": result.get("labelIds", [])}, indent=2))
gws(
"gmail", "users", "messages", "modify",
"--params", json.dumps({"userId": "me", "id": args.message_id}),
"--json", json.dumps(body),
"--format", "json",
)
# =========================================================================
# Calendar
# =========================================================================
# -- Calendar --
def calendar_list(args):
now = datetime.now(timezone.utc)
time_min = _datetime_with_timezone(args.start or now.isoformat())
time_max = _datetime_with_timezone(args.end or (now + timedelta(days=7)).isoformat())
if _gws_binary():
results = _run_gws(
["calendar", "events", "list"],
params={
if args.start or args.end:
# Specific date range — use raw Calendar API for precise timeMin/timeMax
from datetime import datetime, timedelta, timezone as tz
now = datetime.now(tz.utc)
time_min = args.start or now.isoformat()
time_max = args.end or (now + timedelta(days=7)).isoformat()
gws(
"calendar", "events", "list",
"--params", json.dumps({
"calendarId": args.calendar,
"timeMin": time_min,
"timeMax": time_max,
"maxResults": args.max,
"singleEvents": True,
"orderBy": "startTime",
},
}),
"--format", "json",
)
events = []
for e in results.get("items", []):
events.append({
"id": e["id"],
"summary": e.get("summary", "(no title)"),
"start": e.get("start", {}).get("dateTime", e.get("start", {}).get("date", "")),
"end": e.get("end", {}).get("dateTime", e.get("end", {}).get("date", "")),
"location": e.get("location", ""),
"description": e.get("description", ""),
"status": e.get("status", ""),
"htmlLink": e.get("htmlLink", ""),
})
print(json.dumps(events, indent=2, ensure_ascii=False))
return
service = build_service("calendar", "v3")
results = service.events().list(
calendarId=args.calendar, timeMin=time_min, timeMax=time_max,
maxResults=args.max, singleEvents=True, orderBy="startTime",
).execute()
events = []
for e in results.get("items", []):
events.append({
"id": e["id"],
"summary": e.get("summary", "(no title)"),
"start": e.get("start", {}).get("dateTime", e.get("start", {}).get("date", "")),
"end": e.get("end", {}).get("dateTime", e.get("end", {}).get("date", "")),
"location": e.get("location", ""),
"description": e.get("description", ""),
"status": e.get("status", ""),
"htmlLink": e.get("htmlLink", ""),
})
print(json.dumps(events, indent=2, ensure_ascii=False))
else:
# No date range — use +agenda helper (defaults to 7 days)
cmd = ["calendar", "+agenda", "--days", "7", "--format", "json"]
if args.calendar != "primary":
cmd += ["--calendar", args.calendar]
gws(*cmd)
def calendar_create(args):
event = {
"summary": args.summary,
"start": {"dateTime": args.start},
"end": {"dateTime": args.end},
}
cmd = [
"calendar", "+insert",
"--summary", args.summary,
"--start", args.start,
"--end", args.end,
"--format", "json",
]
if args.location:
event["location"] = args.location
cmd += ["--location", args.location]
if args.description:
event["description"] = args.description
cmd += ["--description", args.description]
if args.attendees:
event["attendees"] = [{"email": e.strip()} for e in args.attendees.split(",") if e.strip()]
if _gws_binary():
result = _run_gws(
["calendar", "events", "insert"],
params={"calendarId": args.calendar},
body=event,
)
print(json.dumps({
"status": "created",
"id": result["id"],
"summary": result.get("summary", ""),
"htmlLink": result.get("htmlLink", ""),
}, indent=2))
return
service = build_service("calendar", "v3")
result = service.events().insert(calendarId=args.calendar, body=event).execute()
print(json.dumps({
"status": "created",
"id": result["id"],
"summary": result.get("summary", ""),
"htmlLink": result.get("htmlLink", ""),
}, indent=2))
for email in args.attendees.split(","):
cmd += ["--attendee", email.strip()]
if args.calendar != "primary":
cmd += ["--calendar", args.calendar]
gws(*cmd)
def calendar_delete(args):
if _gws_binary():
_run_gws(["calendar", "events", "delete"], params={"calendarId": args.calendar, "eventId": args.event_id})
print(json.dumps({"status": "deleted", "eventId": args.event_id}))
return
service = build_service("calendar", "v3")
service.events().delete(calendarId=args.calendar, eventId=args.event_id).execute()
print(json.dumps({"status": "deleted", "eventId": args.event_id}))
gws(
"calendar", "events", "delete",
"--params", json.dumps({"calendarId": args.calendar, "eventId": args.event_id}),
"--format", "json",
)
# =========================================================================
# Drive
# =========================================================================
# -- Drive --
def drive_search(args):
query = args.query if args.raw_query else f"fullText contains '{args.query}'"
if _gws_binary():
results = _run_gws(
["drive", "files", "list"],
params={
"q": query,
"pageSize": args.max,
"fields": "files(id, name, mimeType, modifiedTime, webViewLink)",
},
)
print(json.dumps(results.get("files", []), indent=2, ensure_ascii=False))
return
service = build_service("drive", "v3")
results = service.files().list(
q=query, pageSize=args.max, fields="files(id, name, mimeType, modifiedTime, webViewLink)",
).execute()
files = results.get("files", [])
print(json.dumps(files, indent=2, ensure_ascii=False))
gws(
"drive", "files", "list",
"--params", json.dumps({
"q": query,
"pageSize": args.max,
"fields": "files(id,name,mimeType,modifiedTime,webViewLink)",
}),
"--format", "json",
)
# =========================================================================
# Contacts
# =========================================================================
# -- Contacts --
def contacts_list(args):
if _gws_binary():
results = _run_gws(
["people", "people", "connections", "list"],
params={
"resourceName": "people/me",
"pageSize": args.max,
"personFields": "names,emailAddresses,phoneNumbers",
},
)
contacts = []
for person in results.get("connections", []):
names = person.get("names", [{}])
emails = person.get("emailAddresses", [])
phones = person.get("phoneNumbers", [])
contacts.append({
"name": names[0].get("displayName", "") if names else "",
"emails": [e.get("value", "") for e in emails],
"phones": [p.get("value", "") for p in phones],
})
print(json.dumps(contacts, indent=2, ensure_ascii=False))
return
service = build_service("people", "v1")
results = service.people().connections().list(
resourceName="people/me",
pageSize=args.max,
personFields="names,emailAddresses,phoneNumbers",
).execute()
contacts = []
for person in results.get("connections", []):
names = person.get("names", [{}])
emails = person.get("emailAddresses", [])
phones = person.get("phoneNumbers", [])
contacts.append({
"name": names[0].get("displayName", "") if names else "",
"emails": [e.get("value", "") for e in emails],
"phones": [p.get("value", "") for p in phones],
})
print(json.dumps(contacts, indent=2, ensure_ascii=False))
gws(
"people", "people", "connections", "list",
"--params", json.dumps({
"resourceName": "people/me",
"pageSize": args.max,
"personFields": "names,emailAddresses,phoneNumbers",
}),
"--format", "json",
)
# =========================================================================
# Sheets
# =========================================================================
# -- Sheets --
def sheets_get(args):
if _gws_binary():
result = _run_gws(
["sheets", "spreadsheets", "values", "get"],
params={"spreadsheetId": args.sheet_id, "range": args.range},
)
print(json.dumps(result.get("values", []), indent=2, ensure_ascii=False))
return
service = build_service("sheets", "v4")
result = service.spreadsheets().values().get(
spreadsheetId=args.sheet_id, range=args.range,
).execute()
print(json.dumps(result.get("values", []), indent=2, ensure_ascii=False))
gws(
"sheets", "+read",
"--spreadsheet", args.sheet_id,
"--range", args.range,
"--format", "json",
)
def sheets_update(args):
values = json.loads(args.values)
body = {"values": values}
if _gws_binary():
result = _run_gws(
["sheets", "spreadsheets", "values", "update"],
params={
"spreadsheetId": args.sheet_id,
"range": args.range,
"valueInputOption": "USER_ENTERED",
},
body=body,
)
print(json.dumps({"updatedCells": result.get("updatedCells", 0), "updatedRange": result.get("updatedRange", "")}, indent=2))
return
service = build_service("sheets", "v4")
result = service.spreadsheets().values().update(
spreadsheetId=args.sheet_id, range=args.range,
valueInputOption="USER_ENTERED", body=body,
).execute()
print(json.dumps({"updatedCells": result.get("updatedCells", 0), "updatedRange": result.get("updatedRange", "")}, indent=2))
gws(
"sheets", "spreadsheets", "values", "update",
"--params", json.dumps({
"spreadsheetId": args.sheet_id,
"range": args.range,
"valueInputOption": "USER_ENTERED",
}),
"--json", json.dumps({"values": values}),
"--format", "json",
)
def sheets_append(args):
values = json.loads(args.values)
body = {"values": values}
if _gws_binary():
result = _run_gws(
["sheets", "spreadsheets", "values", "append"],
params={
"spreadsheetId": args.sheet_id,
"range": args.range,
"valueInputOption": "USER_ENTERED",
"insertDataOption": "INSERT_ROWS",
},
body=body,
)
print(json.dumps({"updatedCells": result.get("updates", {}).get("updatedCells", 0)}, indent=2))
return
service = build_service("sheets", "v4")
result = service.spreadsheets().values().append(
spreadsheetId=args.sheet_id, range=args.range,
valueInputOption="USER_ENTERED", insertDataOption="INSERT_ROWS", body=body,
).execute()
print(json.dumps({"updatedCells": result.get("updates", {}).get("updatedCells", 0)}, indent=2))
gws(
"sheets", "+append",
"--spreadsheet", args.sheet_id,
"--json-values", json.dumps(values),
"--format", "json",
)
# =========================================================================
# Docs
# =========================================================================
# -- Docs --
def docs_get(args):
if _gws_binary():
doc = _run_gws(["docs", "documents", "get"], params={"documentId": args.doc_id})
result = {
"title": doc.get("title", ""),
"documentId": doc.get("documentId", ""),
"body": _extract_doc_text(doc),
}
print(json.dumps(result, indent=2, ensure_ascii=False))
return
service = build_service("docs", "v1")
doc = service.documents().get(documentId=args.doc_id).execute()
result = {
"title": doc.get("title", ""),
"documentId": doc.get("documentId", ""),
"body": _extract_doc_text(doc),
}
print(json.dumps(result, indent=2, ensure_ascii=False))
gws(
"docs", "documents", "get",
"--params", json.dumps({"documentId": args.doc_id}),
"--format", "json",
)
# =========================================================================
# CLI parser
# =========================================================================
# -- CLI parser (backward-compatible interface) --
def main():
parser = argparse.ArgumentParser(description="Google Workspace API for Hermes Agent")
parser = argparse.ArgumentParser(description="Google Workspace API for Hermes Agent (gws backend)")
sub = parser.add_subparsers(dest="service", required=True)
# --- Gmail ---
@@ -742,15 +228,13 @@ def main():
p.add_argument("--subject", required=True)
p.add_argument("--body", required=True)
p.add_argument("--cc", default="")
p.add_argument("--from", dest="from_header", default="", help="Custom From header (e.g. '\"Agent Name\" <user@example.com>')")
p.add_argument("--html", action="store_true", help="Send body as HTML")
p.add_argument("--thread-id", default="", help="Thread ID for threading")
p.add_argument("--thread-id", default="", help="Thread ID (unused with gws, kept for compat)")
p.set_defaults(func=gmail_send)
p = gmail_sub.add_parser("reply")
p.add_argument("message_id", help="Message ID to reply to")
p.add_argument("--body", required=True)
p.add_argument("--from", dest="from_header", default="", help="Custom From header (e.g. '\"Agent Name\" <user@example.com>')")
p.set_defaults(func=gmail_reply)
p = gmail_sub.add_parser("labels")
@@ -25,13 +25,6 @@ def refresh_token(token_data: dict) -> dict:
import urllib.parse
import urllib.request
required_keys = ["client_id", "client_secret", "refresh_token", "token_uri"]
missing = [k for k in required_keys if k not in token_data]
if missing:
print(f"ERROR: google_token.json is missing required fields: {', '.join(missing)}", file=sys.stderr)
print("Please re-authenticate by running the Google Workspace setup script.", file=sys.stderr)
sys.exit(1)
params = urllib.parse.urlencode({
"client_id": token_data["client_id"],
"client_secret": token_data["client_secret"],
+3 -3
View File
@@ -60,7 +60,7 @@ The fastest path — auto-detect the model, test strategies, and lock in the win
# In execute_code — use the loader to avoid exec-scoping issues:
import os
exec(open(os.path.expanduser(
os.path.join(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes")), "skills/red-teaming/godmode/scripts/load_godmode.py")
"~/.hermes/skills/red-teaming/godmode/scripts/load_godmode.py"
)).read())
# Auto-detect model from config and jailbreak it
@@ -192,7 +192,7 @@ python3 scripts/parseltongue.py "How do I hack into a WiFi network?" --tier stan
Or use `execute_code` inline:
```python
# Load the parseltongue module
exec(open(os.path.join(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes")), "skills/red-teaming/godmode/scripts/parseltongue.py")).read())
exec(open(os.path.expanduser("~/.hermes/skills/red-teaming/godmode/scripts/parseltongue.py")).read())
query = "How do I hack into a WiFi network?"
variants = generate_variants(query, tier="standard")
@@ -229,7 +229,7 @@ Race multiple models against the same query, score responses, pick the winner:
```python
# Via execute_code
exec(open(os.path.join(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes")), "skills/red-teaming/godmode/scripts/godmode_race.py")).read())
exec(open(os.path.expanduser("~/.hermes/skills/red-teaming/godmode/scripts/godmode_race.py")).read())
result = race_models(
query="Explain how SQL injection works with a practical example",
@@ -114,7 +114,7 @@ hermes
### Via the GODMODE CLASSIC racer script
```python
exec(open(os.path.join(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes")), "skills/red-teaming/godmode/scripts/godmode_race.py")).read())
exec(open(os.path.expanduser("~/.hermes/skills/red-teaming/godmode/scripts/godmode_race.py")).read())
result = race_godmode_classic("Your query here")
print(f"Winner: {result['codename']} — Score: {result['score']}")
print(result['content'])
@@ -129,7 +129,7 @@ These don't auto-reject but reduce the response score:
## Using in Python
```python
exec(open(os.path.join(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes")), "skills/red-teaming/godmode/scripts/godmode_race.py")).read())
exec(open(os.path.expanduser("~/.hermes/skills/red-teaming/godmode/scripts/godmode_race.py")).read())
# Check if a response is a refusal
text = "I'm sorry, but I can't assist with that request."
@@ -7,7 +7,7 @@ finds what works, and locks it in by writing config.yaml + prefill.json.
Usage in execute_code:
exec(open(os.path.expanduser(
os.path.join(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes")), "skills/red-teaming/godmode/scripts/auto_jailbreak.py")
"~/.hermes/skills/red-teaming/godmode/scripts/auto_jailbreak.py"
)).read())
result = auto_jailbreak() # Uses current model from config
@@ -7,7 +7,7 @@ Queries multiple models in parallel via OpenRouter, scores responses
on quality/filteredness/speed, returns the best unfiltered answer.
Usage in execute_code:
exec(open(os.path.join(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes")), "skills/red-teaming/godmode/scripts/godmode_race.py")).read())
exec(open(os.path.expanduser("~/.hermes/skills/red-teaming/godmode/scripts/godmode_race.py")).read())
result = race_models(
query="Your query here",
@@ -3,7 +3,7 @@ Loader for G0DM0D3 scripts. Handles the exec-scoping issues.
Usage in execute_code:
exec(open(os.path.expanduser(
os.path.join(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes")), "skills/red-teaming/godmode/scripts/load_godmode.py")
"~/.hermes/skills/red-teaming/godmode/scripts/load_godmode.py"
)).read())
# Now all functions are available:
@@ -11,7 +11,7 @@ Usage:
python parseltongue.py "How do I hack a WiFi network?" --tier standard
# As a module in execute_code
exec(open(os.path.join(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes")), "skills/red-teaming/godmode/scripts/parseltongue.py")).read())
exec(open("~/.hermes/skills/red-teaming/godmode/scripts/parseltongue.py").read())
variants = generate_variants("How do I hack a WiFi network?", tier="standard")
"""
File diff suppressed because it is too large Load Diff
-269
View File
@@ -1,269 +0,0 @@
"""Integration tests for the AWS Bedrock provider wiring.
Verifies that the Bedrock provider is correctly registered in the
provider registry, model catalog, and runtime resolution pipeline.
These tests do NOT require AWS credentials or boto3 all AWS calls
are mocked.
Note: Tests that import ``hermes_cli.auth`` or ``hermes_cli.runtime_provider``
require Python 3.10+ due to ``str | None`` type syntax in the import chain.
"""
import os
from unittest.mock import MagicMock, patch
import pytest
class TestProviderRegistry:
"""Verify Bedrock is registered in PROVIDER_REGISTRY."""
def test_bedrock_in_registry(self):
from hermes_cli.auth import PROVIDER_REGISTRY
assert "bedrock" in PROVIDER_REGISTRY
def test_bedrock_auth_type_is_aws_sdk(self):
from hermes_cli.auth import PROVIDER_REGISTRY
pconfig = PROVIDER_REGISTRY["bedrock"]
assert pconfig.auth_type == "aws_sdk"
def test_bedrock_has_no_api_key_env_vars(self):
"""Bedrock uses the AWS SDK credential chain, not API keys."""
from hermes_cli.auth import PROVIDER_REGISTRY
pconfig = PROVIDER_REGISTRY["bedrock"]
assert pconfig.api_key_env_vars == ()
def test_bedrock_base_url_env_var(self):
from hermes_cli.auth import PROVIDER_REGISTRY
pconfig = PROVIDER_REGISTRY["bedrock"]
assert pconfig.base_url_env_var == "BEDROCK_BASE_URL"
class TestProviderAliases:
"""Verify Bedrock aliases resolve correctly."""
def test_aws_alias(self):
from hermes_cli.models import _PROVIDER_ALIASES
assert _PROVIDER_ALIASES.get("aws") == "bedrock"
def test_aws_bedrock_alias(self):
from hermes_cli.models import _PROVIDER_ALIASES
assert _PROVIDER_ALIASES.get("aws-bedrock") == "bedrock"
def test_amazon_bedrock_alias(self):
from hermes_cli.models import _PROVIDER_ALIASES
assert _PROVIDER_ALIASES.get("amazon-bedrock") == "bedrock"
def test_amazon_alias(self):
from hermes_cli.models import _PROVIDER_ALIASES
assert _PROVIDER_ALIASES.get("amazon") == "bedrock"
class TestProviderLabels:
"""Verify Bedrock appears in provider labels."""
def test_bedrock_label(self):
from hermes_cli.models import _PROVIDER_LABELS
assert _PROVIDER_LABELS.get("bedrock") == "AWS Bedrock"
class TestModelCatalog:
"""Verify Bedrock has a static model fallback list."""
def test_bedrock_has_curated_models(self):
from hermes_cli.models import _PROVIDER_MODELS
models = _PROVIDER_MODELS.get("bedrock", [])
assert len(models) > 0
def test_bedrock_models_include_claude(self):
from hermes_cli.models import _PROVIDER_MODELS
models = _PROVIDER_MODELS.get("bedrock", [])
claude_models = [m for m in models if "anthropic.claude" in m]
assert len(claude_models) > 0
def test_bedrock_models_include_nova(self):
from hermes_cli.models import _PROVIDER_MODELS
models = _PROVIDER_MODELS.get("bedrock", [])
nova_models = [m for m in models if "amazon.nova" in m]
assert len(nova_models) > 0
class TestResolveProvider:
"""Verify resolve_provider() handles bedrock correctly."""
def test_explicit_bedrock_resolves(self, monkeypatch):
"""When user explicitly requests 'bedrock', it should resolve."""
from hermes_cli.auth import PROVIDER_REGISTRY
# bedrock is in the registry, so resolve_provider should return it
from hermes_cli.auth import resolve_provider
result = resolve_provider("bedrock")
assert result == "bedrock"
def test_aws_alias_resolves_to_bedrock(self):
from hermes_cli.auth import resolve_provider
result = resolve_provider("aws")
assert result == "bedrock"
def test_amazon_bedrock_alias_resolves(self):
from hermes_cli.auth import resolve_provider
result = resolve_provider("amazon-bedrock")
assert result == "bedrock"
def test_auto_detect_with_aws_credentials(self, monkeypatch):
"""When AWS credentials are present and no other provider is configured,
auto-detect should find bedrock."""
from hermes_cli.auth import resolve_provider
# Clear all other provider env vars
for var in ["OPENAI_API_KEY", "OPENROUTER_API_KEY", "ANTHROPIC_API_KEY",
"ANTHROPIC_TOKEN", "GOOGLE_API_KEY", "DEEPSEEK_API_KEY"]:
monkeypatch.delenv(var, raising=False)
# Set AWS credentials
monkeypatch.setenv("AWS_ACCESS_KEY_ID", "AKIAIOSFODNN7EXAMPLE")
monkeypatch.setenv("AWS_SECRET_ACCESS_KEY", "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY")
# Mock the auth store to have no active provider
with patch("hermes_cli.auth._load_auth_store", return_value={}):
result = resolve_provider("auto")
assert result == "bedrock"
class TestRuntimeProvider:
"""Verify resolve_runtime_provider() handles bedrock correctly."""
def test_bedrock_runtime_resolution(self, monkeypatch):
from hermes_cli.runtime_provider import resolve_runtime_provider
monkeypatch.setenv("AWS_ACCESS_KEY_ID", "AKIAIOSFODNN7EXAMPLE")
monkeypatch.setenv("AWS_SECRET_ACCESS_KEY", "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY")
monkeypatch.setenv("AWS_REGION", "eu-west-1")
# Mock resolve_provider to return bedrock
with patch("hermes_cli.runtime_provider.resolve_provider", return_value="bedrock"), \
patch("hermes_cli.runtime_provider._get_model_config", return_value={"provider": "bedrock"}):
result = resolve_runtime_provider(requested="bedrock")
assert result["provider"] == "bedrock"
assert result["api_mode"] == "bedrock_converse"
assert result["region"] == "eu-west-1"
assert "bedrock-runtime.eu-west-1.amazonaws.com" in result["base_url"]
assert result["api_key"] == "aws-sdk"
def test_bedrock_runtime_default_region(self, monkeypatch):
from hermes_cli.runtime_provider import resolve_runtime_provider
monkeypatch.setenv("AWS_PROFILE", "default")
monkeypatch.delenv("AWS_REGION", raising=False)
monkeypatch.delenv("AWS_DEFAULT_REGION", raising=False)
with patch("hermes_cli.runtime_provider.resolve_provider", return_value="bedrock"), \
patch("hermes_cli.runtime_provider._get_model_config", return_value={"provider": "bedrock"}):
result = resolve_runtime_provider(requested="bedrock")
assert result["region"] == "us-east-1"
def test_bedrock_runtime_no_credentials_raises_on_auto_detect(self, monkeypatch):
"""When bedrock is auto-detected (not explicitly requested) and no
credentials are found, runtime resolution should raise AuthError."""
from hermes_cli.runtime_provider import resolve_runtime_provider
from hermes_cli.auth import AuthError
# Clear all AWS env vars
for var in ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_PROFILE",
"AWS_BEARER_TOKEN_BEDROCK", "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
"AWS_WEB_IDENTITY_TOKEN_FILE"]:
monkeypatch.delenv(var, raising=False)
# Mock both the provider resolution and boto3's credential chain
mock_session = MagicMock()
mock_session.get_credentials.return_value = None
with patch("hermes_cli.runtime_provider.resolve_provider", return_value="bedrock"), \
patch("hermes_cli.runtime_provider._get_model_config", return_value={"provider": "bedrock"}), \
patch("hermes_cli.runtime_provider.resolve_requested_provider", return_value="auto"), \
patch.dict("sys.modules", {"botocore": MagicMock(), "botocore.session": MagicMock()}):
import botocore.session as _bs
_bs.get_session = MagicMock(return_value=mock_session)
with pytest.raises(AuthError, match="No AWS credentials"):
resolve_runtime_provider(requested="auto")
def test_bedrock_runtime_explicit_skips_credential_check(self, monkeypatch):
"""When user explicitly requests bedrock, trust boto3's credential chain
even if env-var detection finds nothing (covers IMDS, SSO, etc.)."""
from hermes_cli.runtime_provider import resolve_runtime_provider
# No AWS env vars set — but explicit bedrock request should not raise
for var in ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_PROFILE",
"AWS_BEARER_TOKEN_BEDROCK"]:
monkeypatch.delenv(var, raising=False)
with patch("hermes_cli.runtime_provider.resolve_provider", return_value="bedrock"), \
patch("hermes_cli.runtime_provider._get_model_config", return_value={"provider": "bedrock"}):
result = resolve_runtime_provider(requested="bedrock")
assert result["provider"] == "bedrock"
assert result["api_mode"] == "bedrock_converse"
# ---------------------------------------------------------------------------
# providers.py integration
# ---------------------------------------------------------------------------
class TestProvidersModule:
"""Verify bedrock is wired into hermes_cli/providers.py."""
def test_bedrock_alias_in_providers(self):
from hermes_cli.providers import ALIASES
assert ALIASES.get("bedrock") is None # "bedrock" IS the canonical name, not an alias
assert ALIASES.get("aws") == "bedrock"
assert ALIASES.get("aws-bedrock") == "bedrock"
def test_bedrock_transport_mapping(self):
from hermes_cli.providers import TRANSPORT_TO_API_MODE
assert TRANSPORT_TO_API_MODE.get("bedrock_converse") == "bedrock_converse"
def test_determine_api_mode_from_bedrock_url(self):
from hermes_cli.providers import determine_api_mode
assert determine_api_mode(
"unknown", "https://bedrock-runtime.us-east-1.amazonaws.com"
) == "bedrock_converse"
def test_label_override(self):
from hermes_cli.providers import _LABEL_OVERRIDES
assert _LABEL_OVERRIDES.get("bedrock") == "AWS Bedrock"
# ---------------------------------------------------------------------------
# Error classifier integration
# ---------------------------------------------------------------------------
class TestErrorClassifierBedrock:
"""Verify Bedrock error patterns are in the global error classifier."""
def test_throttling_in_rate_limit_patterns(self):
from agent.error_classifier import _RATE_LIMIT_PATTERNS
assert "throttlingexception" in _RATE_LIMIT_PATTERNS
def test_context_overflow_patterns(self):
from agent.error_classifier import _CONTEXT_OVERFLOW_PATTERNS
assert "input is too long" in _CONTEXT_OVERFLOW_PATTERNS
# ---------------------------------------------------------------------------
# pyproject.toml bedrock extra
# ---------------------------------------------------------------------------
class TestPackaging:
"""Verify bedrock optional dependency is declared."""
def test_bedrock_extra_exists(self):
import configparser
from pathlib import Path
# Read pyproject.toml to verify [bedrock] extra
toml_path = Path(__file__).parent.parent.parent / "pyproject.toml"
content = toml_path.read_text()
assert 'bedrock = ["boto3' in content
def test_bedrock_in_all_extra(self):
from pathlib import Path
content = (Path(__file__).parent.parent.parent / "pyproject.toml").read_text()
assert '"hermes-agent[bedrock]"' in content
-1
View File
@@ -1091,7 +1091,6 @@ def test_load_pool_seeds_copilot_via_gh_auth_token(tmp_path, monkeypatch):
assert len(entries) == 1
assert entries[0].source == "gh_cli"
assert entries[0].access_token == "gho_fake_token_abc123"
assert entries[0].base_url == "https://api.githubcopilot.com"
def test_load_pool_does_not_seed_copilot_when_no_token(tmp_path, monkeypatch):
-244
View File
@@ -396,108 +396,6 @@ class TestPluginMemoryDiscovery:
assert load_memory_provider("nonexistent_provider") is None
class TestUserInstalledProviderDiscovery:
"""Memory providers installed to $HERMES_HOME/plugins/ should be found.
Regression test for issues #4956 and #9099: load_memory_provider() and
discover_memory_providers() only scanned the bundled plugins/memory/
directory, ignoring user-installed plugins.
"""
def _make_user_memory_plugin(self, tmp_path, name="myprovider"):
"""Create a minimal user memory provider plugin."""
plugin_dir = tmp_path / "plugins" / name
plugin_dir.mkdir(parents=True)
(plugin_dir / "__init__.py").write_text(
"from agent.memory_provider import MemoryProvider\n"
"class MyProvider(MemoryProvider):\n"
f" @property\n"
f" def name(self): return {name!r}\n"
" def is_available(self): return True\n"
" def initialize(self, **kw): pass\n"
" def sync_turn(self, *a, **kw): pass\n"
" def get_tool_schemas(self): return []\n"
" def handle_tool_call(self, *a, **kw): return '{}'\n"
)
(plugin_dir / "plugin.yaml").write_text(
f"name: {name}\ndescription: Test user provider\n"
)
return plugin_dir
def test_discover_finds_user_plugins(self, tmp_path, monkeypatch):
"""discover_memory_providers() includes user-installed plugins."""
from plugins.memory import discover_memory_providers, _get_user_plugins_dir
self._make_user_memory_plugin(tmp_path, "myexternal")
monkeypatch.setattr(
"plugins.memory._get_user_plugins_dir",
lambda: tmp_path / "plugins",
)
providers = discover_memory_providers()
names = [n for n, _, _ in providers]
assert "myexternal" in names
assert "holographic" in names # bundled still found
def test_load_user_plugin(self, tmp_path, monkeypatch):
"""load_memory_provider() can load from $HERMES_HOME/plugins/."""
from plugins.memory import load_memory_provider
self._make_user_memory_plugin(tmp_path, "myexternal")
monkeypatch.setattr(
"plugins.memory._get_user_plugins_dir",
lambda: tmp_path / "plugins",
)
p = load_memory_provider("myexternal")
assert p is not None
assert p.name == "myexternal"
assert p.is_available()
def test_bundled_takes_precedence(self, tmp_path, monkeypatch):
"""Bundled provider wins when user plugin has the same name."""
from plugins.memory import load_memory_provider, discover_memory_providers
# Create user plugin named "holographic" (same as bundled)
plugin_dir = tmp_path / "plugins" / "holographic"
plugin_dir.mkdir(parents=True)
(plugin_dir / "__init__.py").write_text(
"from agent.memory_provider import MemoryProvider\n"
"class Fake(MemoryProvider):\n"
" @property\n"
" def name(self): return 'holographic-FAKE'\n"
" def is_available(self): return True\n"
" def initialize(self, **kw): pass\n"
" def sync_turn(self, *a, **kw): pass\n"
" def get_tool_schemas(self): return []\n"
" def handle_tool_call(self, *a, **kw): return '{}'\n"
)
monkeypatch.setattr(
"plugins.memory._get_user_plugins_dir",
lambda: tmp_path / "plugins",
)
# Load should return bundled (name "holographic"), not user (name "holographic-FAKE")
p = load_memory_provider("holographic")
assert p is not None
assert p.name == "holographic" # bundled wins
# discover should not duplicate
providers = discover_memory_providers()
holo_count = sum(1 for n, _, _ in providers if n == "holographic")
assert holo_count == 1
def test_non_memory_user_plugins_excluded(self, tmp_path, monkeypatch):
"""User plugins that don't reference MemoryProvider are skipped."""
from plugins.memory import discover_memory_providers
plugin_dir = tmp_path / "plugins" / "notmemory"
plugin_dir.mkdir(parents=True)
(plugin_dir / "__init__.py").write_text(
"def register(ctx):\n ctx.register_tool('foo', 'bar', {}, lambda: None)\n"
)
monkeypatch.setattr(
"plugins.memory._get_user_plugins_dir",
lambda: tmp_path / "plugins",
)
providers = discover_memory_providers()
names = [n for n, _, _ in providers]
assert "notmemory" not in names
# ---------------------------------------------------------------------------
# Sequential dispatch routing tests
# ---------------------------------------------------------------------------
@@ -797,145 +695,3 @@ class TestMemoryContextFencing:
fence_end = combined.index("</memory-context>")
assert "Alice" in combined[fence_start:fence_end]
assert combined.index("weather") < fence_start
# ---------------------------------------------------------------------------
# AIAgent.commit_memory_session — routes to MemoryManager.on_session_end
# ---------------------------------------------------------------------------
class _CommitRecorder(FakeMemoryProvider):
"""Provider that records on_session_end calls for assertions."""
def __init__(self, name="recorder"):
super().__init__(name)
self.end_calls = []
def on_session_end(self, messages):
self.end_calls.append(list(messages or []))
class TestCommitMemorySessionRouting:
def test_on_session_end_fans_out(self):
mgr = MemoryManager()
builtin = _CommitRecorder("builtin")
external = _CommitRecorder("openviking")
mgr.add_provider(builtin)
mgr.add_provider(external)
msgs = [{"role": "user", "content": "hi"}]
mgr.on_session_end(msgs)
assert builtin.end_calls == [msgs]
assert external.end_calls == [msgs]
def test_on_session_end_tolerates_failure(self):
mgr = MemoryManager()
builtin = FakeMemoryProvider("builtin")
bad = _CommitRecorder("bad-provider")
bad.on_session_end = lambda m: (_ for _ in ()).throw(RuntimeError("boom"))
mgr.add_provider(builtin)
mgr.add_provider(bad)
mgr.on_session_end([]) # must not raise
# ---------------------------------------------------------------------------
# on_memory_write bridge — must fire from both concurrent AND sequential paths
# ---------------------------------------------------------------------------
class TestOnMemoryWriteBridge:
"""Verify that MemoryManager.on_memory_write is called when built-in
memory writes happen. This is a regression test for #10174 where the
sequential tool execution path (_execute_tool_calls_sequential) was
missing the bridge call, so single memory tool calls never notified
external memory providers.
"""
def test_on_memory_write_add(self):
"""on_memory_write fires for 'add' actions."""
mgr = MemoryManager()
p = FakeMemoryProvider("ext")
mgr.add_provider(p)
mgr.on_memory_write("add", "memory", "new fact")
assert p.memory_writes == [("add", "memory", "new fact")]
def test_on_memory_write_replace(self):
"""on_memory_write fires for 'replace' actions."""
mgr = MemoryManager()
p = FakeMemoryProvider("ext")
mgr.add_provider(p)
mgr.on_memory_write("replace", "user", "updated pref")
assert p.memory_writes == [("replace", "user", "updated pref")]
def test_on_memory_write_remove_not_bridged(self):
"""The bridge intentionally skips 'remove' — only add/replace notify."""
# This tests the contract that run_agent.py checks:
# function_args.get("action") in ("add", "replace")
mgr = MemoryManager()
p = FakeMemoryProvider("ext")
mgr.add_provider(p)
# Manager itself doesn't filter — run_agent.py does.
# But providers should handle remove gracefully.
mgr.on_memory_write("remove", "memory", "old fact")
assert p.memory_writes == [("remove", "memory", "old fact")]
def test_memory_manager_tool_injection_deduplicates(self):
"""Memory manager tools already in self.tools (from plugin registry)
must not be appended again. Duplicate function names cause 400 errors
on providers that enforce unique names (e.g. Xiaomi MiMo via Nous Portal).
Regression test for: duplicate mnemosyne_recall / mnemosyne_remember /
mnemosyne_stats in tools array 400 from Nous Portal.
"""
mgr = MemoryManager()
p = FakeMemoryProvider("ext", tools=[
{"name": "ext_recall", "description": "Recall", "parameters": {}},
{"name": "ext_remember", "description": "Remember", "parameters": {}},
])
mgr.add_provider(p)
# Simulate self.tools already containing one of the plugin tools
# (as if it was registered via ctx.register_tool → get_tool_definitions)
existing_tools = [
{"type": "function", "function": {"name": "ext_recall", "description": "Recall (from registry)", "parameters": {}}},
{"type": "function", "function": {"name": "web_search", "description": "Search", "parameters": {}}},
]
# Apply the same dedup logic from run_agent.py __init__
_existing_names = {
t.get("function", {}).get("name")
for t in existing_tools
if isinstance(t, dict)
}
for _schema in mgr.get_all_tool_schemas():
_tname = _schema.get("name", "")
if _tname and _tname in _existing_names:
continue
existing_tools.append({"type": "function", "function": _schema})
if _tname:
_existing_names.add(_tname)
# ext_recall should NOT be duplicated; ext_remember should be added
tool_names = [t["function"]["name"] for t in existing_tools]
assert tool_names.count("ext_recall") == 1, f"ext_recall duplicated: {tool_names}"
assert tool_names.count("ext_remember") == 1
assert tool_names.count("web_search") == 1
assert len(existing_tools) == 3 # web_search + ext_recall + ext_remember
def test_on_memory_write_tolerates_provider_failure(self):
"""If a provider's on_memory_write raises, others still get notified."""
mgr = MemoryManager()
bad = FakeMemoryProvider("builtin")
bad.on_memory_write = MagicMock(side_effect=RuntimeError("boom"))
good = FakeMemoryProvider("good")
mgr.add_provider(bad)
mgr.add_provider(good)
mgr.on_memory_write("add", "user", "test")
# Good provider still received the call despite bad provider crashing
assert good.memory_writes == [("add", "user", "test")]
-253
View File
@@ -1,253 +0,0 @@
"""Tests for agent/nous_rate_guard.py — cross-session Nous Portal rate limit guard."""
import json
import os
import time
import pytest
@pytest.fixture
def rate_guard_env(tmp_path, monkeypatch):
"""Isolate rate guard state to a temp directory."""
hermes_home = str(tmp_path / ".hermes")
os.makedirs(hermes_home, exist_ok=True)
monkeypatch.setenv("HERMES_HOME", hermes_home)
# Clear any cached module-level imports
return hermes_home
class TestRecordNousRateLimit:
"""Test recording rate limit state."""
def test_records_with_header_reset(self, rate_guard_env):
from agent.nous_rate_guard import record_nous_rate_limit, _state_path
headers = {"x-ratelimit-reset-requests-1h": "1800"}
record_nous_rate_limit(headers=headers)
path = _state_path()
assert os.path.exists(path)
with open(path) as f:
state = json.load(f)
assert state["reset_seconds"] == pytest.approx(1800, abs=2)
assert state["reset_at"] > time.time()
def test_records_with_per_minute_header(self, rate_guard_env):
from agent.nous_rate_guard import record_nous_rate_limit, _state_path
headers = {"x-ratelimit-reset-requests": "45"}
record_nous_rate_limit(headers=headers)
with open(_state_path()) as f:
state = json.load(f)
assert state["reset_seconds"] == pytest.approx(45, abs=2)
def test_records_with_retry_after_header(self, rate_guard_env):
from agent.nous_rate_guard import record_nous_rate_limit, _state_path
headers = {"retry-after": "60"}
record_nous_rate_limit(headers=headers)
with open(_state_path()) as f:
state = json.load(f)
assert state["reset_seconds"] == pytest.approx(60, abs=2)
def test_prefers_hourly_over_per_minute(self, rate_guard_env):
from agent.nous_rate_guard import record_nous_rate_limit, _state_path
headers = {
"x-ratelimit-reset-requests-1h": "1800",
"x-ratelimit-reset-requests": "45",
}
record_nous_rate_limit(headers=headers)
with open(_state_path()) as f:
state = json.load(f)
# Should use the hourly value, not the per-minute one
assert state["reset_seconds"] == pytest.approx(1800, abs=2)
def test_falls_back_to_error_context_reset_at(self, rate_guard_env):
from agent.nous_rate_guard import record_nous_rate_limit, _state_path
future_reset = time.time() + 900
record_nous_rate_limit(
headers=None,
error_context={"reset_at": future_reset},
)
with open(_state_path()) as f:
state = json.load(f)
assert state["reset_at"] == pytest.approx(future_reset, abs=1)
def test_falls_back_to_default_cooldown(self, rate_guard_env):
from agent.nous_rate_guard import record_nous_rate_limit, _state_path
record_nous_rate_limit(headers=None)
with open(_state_path()) as f:
state = json.load(f)
# Default is 300 seconds (5 minutes)
assert state["reset_seconds"] == pytest.approx(300, abs=2)
def test_custom_default_cooldown(self, rate_guard_env):
from agent.nous_rate_guard import record_nous_rate_limit, _state_path
record_nous_rate_limit(headers=None, default_cooldown=120.0)
with open(_state_path()) as f:
state = json.load(f)
assert state["reset_seconds"] == pytest.approx(120, abs=2)
def test_creates_directory_if_missing(self, rate_guard_env):
from agent.nous_rate_guard import record_nous_rate_limit, _state_path
record_nous_rate_limit(headers={"retry-after": "10"})
assert os.path.exists(_state_path())
class TestNousRateLimitRemaining:
"""Test checking remaining rate limit time."""
def test_returns_none_when_no_file(self, rate_guard_env):
from agent.nous_rate_guard import nous_rate_limit_remaining
assert nous_rate_limit_remaining() is None
def test_returns_remaining_seconds_when_active(self, rate_guard_env):
from agent.nous_rate_guard import record_nous_rate_limit, nous_rate_limit_remaining
record_nous_rate_limit(headers={"x-ratelimit-reset-requests-1h": "600"})
remaining = nous_rate_limit_remaining()
assert remaining is not None
assert 595 < remaining <= 605 # ~600 seconds, allowing for test execution time
def test_returns_none_when_expired(self, rate_guard_env):
from agent.nous_rate_guard import nous_rate_limit_remaining, _state_path
# Write an already-expired state
state_dir = os.path.dirname(_state_path())
os.makedirs(state_dir, exist_ok=True)
with open(_state_path(), "w") as f:
json.dump({"reset_at": time.time() - 10, "recorded_at": time.time() - 100}, f)
assert nous_rate_limit_remaining() is None
# File should be cleaned up
assert not os.path.exists(_state_path())
def test_handles_corrupt_file(self, rate_guard_env):
from agent.nous_rate_guard import nous_rate_limit_remaining, _state_path
state_dir = os.path.dirname(_state_path())
os.makedirs(state_dir, exist_ok=True)
with open(_state_path(), "w") as f:
f.write("not valid json{{{")
assert nous_rate_limit_remaining() is None
class TestClearNousRateLimit:
"""Test clearing rate limit state."""
def test_clears_existing_file(self, rate_guard_env):
from agent.nous_rate_guard import (
record_nous_rate_limit,
clear_nous_rate_limit,
nous_rate_limit_remaining,
_state_path,
)
record_nous_rate_limit(headers={"retry-after": "600"})
assert nous_rate_limit_remaining() is not None
clear_nous_rate_limit()
assert nous_rate_limit_remaining() is None
assert not os.path.exists(_state_path())
def test_clear_when_no_file(self, rate_guard_env):
from agent.nous_rate_guard import clear_nous_rate_limit
# Should not raise
clear_nous_rate_limit()
class TestFormatRemaining:
"""Test human-readable duration formatting."""
def test_seconds(self):
from agent.nous_rate_guard import format_remaining
assert format_remaining(30) == "30s"
def test_minutes(self):
from agent.nous_rate_guard import format_remaining
assert format_remaining(125) == "2m 5s"
def test_exact_minutes(self):
from agent.nous_rate_guard import format_remaining
assert format_remaining(120) == "2m"
def test_hours(self):
from agent.nous_rate_guard import format_remaining
assert format_remaining(3720) == "1h 2m"
class TestParseResetSeconds:
"""Test header parsing for reset times."""
def test_case_insensitive_headers(self, rate_guard_env):
from agent.nous_rate_guard import _parse_reset_seconds
headers = {"X-Ratelimit-Reset-Requests-1h": "1200"}
assert _parse_reset_seconds(headers) == 1200.0
def test_returns_none_for_empty_headers(self):
from agent.nous_rate_guard import _parse_reset_seconds
assert _parse_reset_seconds(None) is None
assert _parse_reset_seconds({}) is None
def test_ignores_zero_values(self):
from agent.nous_rate_guard import _parse_reset_seconds
headers = {"x-ratelimit-reset-requests-1h": "0"}
assert _parse_reset_seconds(headers) is None
def test_ignores_invalid_values(self):
from agent.nous_rate_guard import _parse_reset_seconds
headers = {"x-ratelimit-reset-requests-1h": "not-a-number"}
assert _parse_reset_seconds(headers) is None
class TestAuxiliaryClientIntegration:
"""Test that the auxiliary client respects the rate guard."""
def test_try_nous_skips_when_rate_limited(self, rate_guard_env, monkeypatch):
from agent.nous_rate_guard import record_nous_rate_limit
# Record a rate limit
record_nous_rate_limit(headers={"retry-after": "600"})
# Mock _read_nous_auth to return valid creds (would normally succeed)
import agent.auxiliary_client as aux
monkeypatch.setattr(aux, "_read_nous_auth", lambda: {
"access_token": "test-token",
"inference_base_url": "https://api.nous.test/v1",
})
result = aux._try_nous()
assert result == (None, None)
def test_try_nous_works_when_not_rate_limited(self, rate_guard_env, monkeypatch):
import agent.auxiliary_client as aux
# No rate limit recorded — _try_nous should proceed normally
# (will return None because no real creds, but won't be blocked
# by the rate guard)
monkeypatch.setattr(aux, "_read_nous_auth", lambda: None)
result = aux._try_nous()
assert result == (None, None)
@@ -1,60 +0,0 @@
"""Tests for malformed proxy env var and base URL validation.
Salvaged from PR #6403 by MestreY0d4-Uninter — validates that the agent
surfaces clear errors instead of cryptic httpx ``Invalid port`` exceptions
when proxy env vars or custom endpoint URLs are malformed.
"""
from __future__ import annotations
import pytest
from agent.auxiliary_client import _validate_base_url, _validate_proxy_env_urls
# -- proxy env validation ------------------------------------------------
def test_proxy_env_accepts_normal_values(monkeypatch):
monkeypatch.setenv("HTTP_PROXY", "http://127.0.0.1:6153")
monkeypatch.setenv("HTTPS_PROXY", "https://proxy.example.com:8443")
monkeypatch.setenv("ALL_PROXY", "socks5://127.0.0.1:1080")
_validate_proxy_env_urls() # should not raise
def test_proxy_env_accepts_empty(monkeypatch):
monkeypatch.delenv("HTTP_PROXY", raising=False)
monkeypatch.delenv("HTTPS_PROXY", raising=False)
monkeypatch.delenv("ALL_PROXY", raising=False)
monkeypatch.delenv("http_proxy", raising=False)
monkeypatch.delenv("https_proxy", raising=False)
monkeypatch.delenv("all_proxy", raising=False)
_validate_proxy_env_urls() # should not raise
@pytest.mark.parametrize("key", [
"HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY",
"http_proxy", "https_proxy", "all_proxy",
])
def test_proxy_env_rejects_malformed_port(monkeypatch, key):
monkeypatch.setenv(key, "http://127.0.0.1:6153export")
with pytest.raises(RuntimeError, match=rf"Malformed proxy environment variable {key}=.*6153export"):
_validate_proxy_env_urls()
# -- base URL validation -------------------------------------------------
@pytest.mark.parametrize("url", [
"https://api.example.com/v1",
"http://127.0.0.1:6153/v1",
"acp://copilot",
"",
None,
])
def test_base_url_accepts_valid(url):
_validate_base_url(url) # should not raise
def test_base_url_rejects_malformed_port():
with pytest.raises(RuntimeError, match="Malformed custom endpoint URL"):
_validate_base_url("http://127.0.0.1:6153export")
-92
View File
@@ -284,95 +284,3 @@ class TestElevenLabsTavilyExaKeys:
assert "XYZ789abcdef" not in result
assert "HOME=/home/user" in result
assert "SHELL=/bin/bash" in result
class TestJWTTokens:
"""JWT tokens start with eyJ (base64 for '{') and have dot-separated parts."""
def test_full_3part_jwt(self):
text = (
"Token: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
".eyJpc3MiOiI0MjNiZDJkYjg4MjI0MDAwIn0"
".Gxgv0rru-_kS-I_60EJ7CENTnBh9UeuL3QhkMoQ-VnM"
)
result = redact_sensitive_text(text)
assert "Token:" in result
# Payload and signature must not survive
assert "eyJpc3Mi" not in result
assert "Gxgv0rru" not in result
def test_2part_jwt(self):
text = "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0"
result = redact_sensitive_text(text)
assert "eyJzdWIi" not in result
def test_standalone_jwt_header(self):
text = "leaked header: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9 here"
result = redact_sensitive_text(text)
assert "IkpXVCJ9" not in result
assert "leaked header:" in result
def test_jwt_with_base64_padding(self):
text = "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0=.abc123def456ghij"
result = redact_sensitive_text(text)
assert "abc123def456" not in result
def test_short_eyj_not_matched(self):
"""eyJ followed by fewer than 10 base64 chars should not match."""
text = "eyJust a normal word"
assert redact_sensitive_text(text) == text
def test_jwt_preserves_surrounding_text(self):
text = "before eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0 after"
result = redact_sensitive_text(text)
assert result.startswith("before ")
assert result.endswith(" after")
def test_home_assistant_jwt_in_memory(self):
"""Real-world pattern: HA token stored in agent memory block."""
text = (
"Home Assistant API Token: "
"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
".eyJpc3MiOiJhYmNkZWYiLCJleHAiOjE3NzQ5NTcxMDN9"
".Gxgv0rru-_kS-I_60EJ7CENTnBh9UeuL3QhkMoQ-VnM"
)
result = redact_sensitive_text(text)
assert "Home Assistant API Token:" in result
assert "Gxgv0rru" not in result
assert "..." in result
class TestDiscordMentions:
"""Discord snowflake IDs in <@ID> or <@!ID> format."""
def test_normal_mention(self):
result = redact_sensitive_text("Hello <@222589316709220353>")
assert "222589316709220353" not in result
assert "<@***>" in result
def test_nickname_mention(self):
result = redact_sensitive_text("Ping <@!1331549159177846844>")
assert "1331549159177846844" not in result
assert "<@!***>" in result
def test_multiple_mentions(self):
text = "<@111111111111111111> and <@222222222222222222>"
result = redact_sensitive_text(text)
assert "111111111111111111" not in result
assert "222222222222222222" not in result
def test_short_id_not_matched(self):
"""IDs shorter than 17 digits are not Discord snowflakes."""
text = "<@12345>"
assert redact_sensitive_text(text) == text
def test_slack_mention_not_matched(self):
"""Slack mentions use letters, not pure digits."""
text = "<@U024BE7LH>"
assert redact_sensitive_text(text) == text
def test_preserves_surrounding_text(self):
text = "User <@222589316709220353> said hello"
result = redact_sensitive_text(text)
assert result.startswith("User ")
assert result.endswith(" said hello")
+1 -115
View File
@@ -8,8 +8,6 @@ from unittest.mock import AsyncMock, patch, MagicMock
import pytest
from cron.scheduler import _resolve_origin, _resolve_delivery_target, _deliver_result, _send_media_via_adapter, run_job, SILENT_MARKER, _build_job_prompt
from tools.env_passthrough import clear_env_passthrough
from tools.credential_files import clear_credential_files
class TestResolveOrigin:
@@ -235,10 +233,9 @@ class TestDeliverResultWrapping:
send_mock.assert_called_once()
sent_content = send_mock.call_args.kwargs.get("content") or send_mock.call_args[0][-1]
assert "Cronjob Response: daily-report" in sent_content
assert "(job_id: test-job)" in sent_content
assert "-------------" in sent_content
assert "Here is today's summary." in sent_content
assert "To stop or manage this job" in sent_content
assert "The agent cannot see this message" in sent_content
def test_delivery_uses_job_id_when_no_name(self):
"""When a job has no name, the wrapper should fall back to job id."""
@@ -879,117 +876,6 @@ class TestRunJobPerJobOverrides:
class TestRunJobSkillBacked:
def test_run_job_preserves_skill_env_passthrough_into_worker_thread(self, tmp_path):
job = {
"id": "skill-env-job",
"name": "skill env test",
"prompt": "Use the skill.",
"skill": "notion",
}
fake_db = MagicMock()
def _skill_view(name):
assert name == "notion"
from tools.env_passthrough import register_env_passthrough
register_env_passthrough(["NOTION_API_KEY"])
return json.dumps({"success": True, "content": "# notion\nUse Notion."})
def _run_conversation(prompt):
from tools.env_passthrough import get_all_passthrough
assert "NOTION_API_KEY" in get_all_passthrough()
return {"final_response": "ok"}
with patch("cron.scheduler._hermes_home", tmp_path), \
patch("cron.scheduler._resolve_origin", return_value=None), \
patch("dotenv.load_dotenv"), \
patch("hermes_state.SessionDB", return_value=fake_db), \
patch(
"hermes_cli.runtime_provider.resolve_runtime_provider",
return_value={
"api_key": "***",
"base_url": "https://example.invalid/v1",
"provider": "openrouter",
"api_mode": "chat_completions",
},
), \
patch("tools.skills_tool.skill_view", side_effect=_skill_view), \
patch("run_agent.AIAgent") as mock_agent_cls:
mock_agent = MagicMock()
mock_agent.run_conversation.side_effect = _run_conversation
mock_agent_cls.return_value = mock_agent
try:
success, output, final_response, error = run_job(job)
finally:
clear_env_passthrough()
assert success is True
assert error is None
assert final_response == "ok"
def test_run_job_preserves_credential_file_passthrough_into_worker_thread(self, tmp_path):
"""copy_context() also propagates credential_files ContextVar."""
job = {
"id": "cred-env-job",
"name": "cred file test",
"prompt": "Use the skill.",
"skill": "google-workspace",
}
fake_db = MagicMock()
# Create a credential file so register_credential_file succeeds
cred_dir = tmp_path / "credentials"
cred_dir.mkdir()
(cred_dir / "google_token.json").write_text('{"token": "t"}')
def _skill_view(name):
assert name == "google-workspace"
from tools.credential_files import register_credential_file
register_credential_file("credentials/google_token.json")
return json.dumps({"success": True, "content": "# google-workspace\nUse Google."})
def _run_conversation(prompt):
from tools.credential_files import _get_registered
registered = _get_registered()
assert registered, "credential files must be visible in worker thread"
assert any("google_token.json" in v for v in registered.values())
return {"final_response": "ok"}
with patch("cron.scheduler._hermes_home", tmp_path), \
patch("cron.scheduler._resolve_origin", return_value=None), \
patch("tools.credential_files._resolve_hermes_home", return_value=tmp_path), \
patch("dotenv.load_dotenv"), \
patch("hermes_state.SessionDB", return_value=fake_db), \
patch(
"hermes_cli.runtime_provider.resolve_runtime_provider",
return_value={
"api_key": "***",
"base_url": "https://example.invalid/v1",
"provider": "openrouter",
"api_mode": "chat_completions",
},
), \
patch("tools.skills_tool.skill_view", side_effect=_skill_view), \
patch("run_agent.AIAgent") as mock_agent_cls:
mock_agent = MagicMock()
mock_agent.run_conversation.side_effect = _run_conversation
mock_agent_cls.return_value = mock_agent
try:
success, output, final_response, error = run_job(job)
finally:
clear_credential_files()
assert success is True
assert error is None
assert final_response == "ok"
def test_run_job_loads_skill_and_disables_recursive_cron_tools(self, tmp_path):
job = {
"id": "skill-job",
-169
View File
@@ -1016,47 +1016,6 @@ class TestResponsesEndpoint:
assert len(call_kwargs["conversation_history"]) > 0
assert call_kwargs["user_message"] == "Now add 1 more"
@pytest.mark.asyncio
async def test_previous_response_id_preserves_session(self, adapter):
"""Chained responses via previous_response_id reuse the same session_id."""
mock_result = {
"final_response": "ok",
"messages": [{"role": "assistant", "content": "ok"}],
"api_calls": 1,
}
usage = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
app = _create_app(adapter)
async with TestClient(TestServer(app)) as cli:
# First request — establishes a session
with patch.object(adapter, "_run_agent", new_callable=AsyncMock) as mock_run:
mock_run.return_value = (mock_result, usage)
resp1 = await cli.post(
"/v1/responses",
json={"model": "hermes-agent", "input": "Hello"},
)
assert resp1.status == 200
first_session_id = mock_run.call_args.kwargs["session_id"]
data1 = await resp1.json()
response_id = data1["id"]
# Second request — chains from the first
with patch.object(adapter, "_run_agent", new_callable=AsyncMock) as mock_run:
mock_run.return_value = (mock_result, usage)
resp2 = await cli.post(
"/v1/responses",
json={
"model": "hermes-agent",
"input": "Follow up",
"previous_response_id": response_id,
},
)
assert resp2.status == 200
second_session_id = mock_run.call_args.kwargs["session_id"]
# Session must be the same across the chain
assert first_session_id == second_session_id
@pytest.mark.asyncio
async def test_invalid_previous_response_id_returns_404(self, adapter):
app = _create_app(adapter)
@@ -1156,134 +1115,6 @@ class TestResponsesEndpoint:
assert resp.status == 400
class TestResponsesStreaming:
@pytest.mark.asyncio
async def test_stream_true_returns_responses_sse(self, adapter):
app = _create_app(adapter)
async with TestClient(TestServer(app)) as cli:
async def _mock_run_agent(**kwargs):
cb = kwargs.get("stream_delta_callback")
if cb:
cb("Hello")
cb(" world")
return (
{"final_response": "Hello world", "messages": [], "api_calls": 1},
{"input_tokens": 10, "output_tokens": 5, "total_tokens": 15},
)
with patch.object(adapter, "_run_agent", side_effect=_mock_run_agent):
resp = await cli.post(
"/v1/responses",
json={"model": "hermes-agent", "input": "hi", "stream": True},
)
assert resp.status == 200
assert "text/event-stream" in resp.headers.get("Content-Type", "")
body = await resp.text()
assert "event: response.created" in body
assert "event: response.output_text.delta" in body
assert "event: response.output_text.done" in body
assert "event: response.completed" in body
assert '"sequence_number":' in body
assert '"logprobs": []' in body
assert "Hello" in body
assert " world" in body
@pytest.mark.asyncio
async def test_stream_emits_function_call_and_output_items(self, adapter):
app = _create_app(adapter)
async with TestClient(TestServer(app)) as cli:
async def _mock_run_agent(**kwargs):
start_cb = kwargs.get("tool_start_callback")
complete_cb = kwargs.get("tool_complete_callback")
text_cb = kwargs.get("stream_delta_callback")
if start_cb:
start_cb("call_123", "read_file", {"path": "/tmp/test.txt"})
if complete_cb:
complete_cb("call_123", "read_file", {"path": "/tmp/test.txt"}, '{"content":"hello"}')
if text_cb:
text_cb("Done.")
return (
{
"final_response": "Done.",
"messages": [
{
"role": "assistant",
"tool_calls": [
{
"id": "call_123",
"function": {
"name": "read_file",
"arguments": '{"path":"/tmp/test.txt"}',
},
}
],
},
{
"role": "tool",
"tool_call_id": "call_123",
"content": '{"content":"hello"}',
},
],
"api_calls": 1,
},
{"input_tokens": 10, "output_tokens": 5, "total_tokens": 15},
)
with patch.object(adapter, "_run_agent", side_effect=_mock_run_agent):
resp = await cli.post(
"/v1/responses",
json={"model": "hermes-agent", "input": "read the file", "stream": True},
)
assert resp.status == 200
body = await resp.text()
assert "event: response.output_item.added" in body
assert "event: response.output_item.done" in body
assert body.count("event: response.output_item.done") >= 2
assert '"type": "function_call"' in body
assert '"type": "function_call_output"' in body
assert '"call_id": "call_123"' in body
assert '"name": "read_file"' in body
assert '"output": [{"type": "input_text", "text": "{\\"content\\":\\"hello\\"}"}]' in body
@pytest.mark.asyncio
async def test_streamed_response_is_stored_for_get(self, adapter):
app = _create_app(adapter)
async with TestClient(TestServer(app)) as cli:
async def _mock_run_agent(**kwargs):
cb = kwargs.get("stream_delta_callback")
if cb:
cb("Stored response")
return (
{"final_response": "Stored response", "messages": [], "api_calls": 1},
{"input_tokens": 1, "output_tokens": 2, "total_tokens": 3},
)
with patch.object(adapter, "_run_agent", side_effect=_mock_run_agent):
resp = await cli.post(
"/v1/responses",
json={"model": "hermes-agent", "input": "store this", "stream": True},
)
body = await resp.text()
response_id = None
for line in body.splitlines():
if line.startswith("data: "):
try:
payload = json.loads(line[len("data: "):])
except json.JSONDecodeError:
continue
if payload.get("type") == "response.completed":
response_id = payload["response"]["id"]
break
assert response_id
get_resp = await cli.get(f"/v1/responses/{response_id}")
assert get_resp.status == 200
data = await get_resp.json()
assert data["id"] == response_id
assert data["status"] == "completed"
assert data["output"][-1]["content"][0]["text"] == "Stored response"
# ---------------------------------------------------------------------------
# Auth on endpoints
# ---------------------------------------------------------------------------
-95
View File
@@ -1,95 +0,0 @@
"""Tests for the auto-continue feature (#4493).
When the gateway restarts mid-agent-work, the session transcript ends on a
tool result that the agent never processed. The auto-continue logic detects
this and prepends a system note to the next user message so the model
finishes the interrupted work before addressing the new input.
"""
import pytest
def _simulate_auto_continue(agent_history: list, user_message: str) -> str:
"""Reproduce the auto-continue injection logic from _run_agent().
This mirrors the exact code in gateway/run.py so we can test the
detection and message transformation without spinning up a full
gateway runner.
"""
message = user_message
if agent_history and agent_history[-1].get("role") == "tool":
message = (
"[System note: Your previous turn was interrupted before you could "
"process the last tool result(s). The conversation history contains "
"tool outputs you haven't responded to yet. Please finish processing "
"those results and summarize what was accomplished, then address the "
"user's new message below.]\n\n"
+ message
)
return message
class TestAutoDetection:
"""Test that trailing tool results are correctly detected."""
def test_trailing_tool_result_triggers_note(self):
history = [
{"role": "user", "content": "deploy the app"},
{"role": "assistant", "content": None, "tool_calls": [
{"id": "call_1", "function": {"name": "terminal", "arguments": "{}"}}
]},
{"role": "tool", "tool_call_id": "call_1", "content": "deployed successfully"},
]
result = _simulate_auto_continue(history, "what happened?")
assert "[System note:" in result
assert "interrupted" in result
assert "what happened?" in result
def test_trailing_assistant_message_no_note(self):
history = [
{"role": "user", "content": "hello"},
{"role": "assistant", "content": "Hi there!"},
]
result = _simulate_auto_continue(history, "how are you?")
assert "[System note:" not in result
assert result == "how are you?"
def test_empty_history_no_note(self):
result = _simulate_auto_continue([], "hello")
assert result == "hello"
def test_trailing_user_message_no_note(self):
"""Shouldn't happen in practice, but ensure no false positive."""
history = [
{"role": "user", "content": "hello"},
]
result = _simulate_auto_continue(history, "hello again")
assert result == "hello again"
def test_multiple_tool_results_still_triggers(self):
"""Multiple tool calls in a row — last one is still role=tool."""
history = [
{"role": "user", "content": "search and read"},
{"role": "assistant", "content": None, "tool_calls": [
{"id": "call_1", "function": {"name": "search", "arguments": "{}"}},
{"id": "call_2", "function": {"name": "read", "arguments": "{}"}},
]},
{"role": "tool", "tool_call_id": "call_1", "content": "found it"},
{"role": "tool", "tool_call_id": "call_2", "content": "file content here"},
]
result = _simulate_auto_continue(history, "continue")
assert "[System note:" in result
def test_original_message_preserved_after_note(self):
"""The user's actual message must appear after the system note."""
history = [
{"role": "assistant", "content": None, "tool_calls": [
{"id": "c1", "function": {"name": "t", "arguments": "{}"}}
]},
{"role": "tool", "tool_call_id": "c1", "content": "done"},
]
result = _simulate_auto_continue(history, "now do X")
# System note comes first, then user's message
note_end = result.index("]\n\n")
user_msg_start = result.index("now do X")
assert user_msg_start > note_end
@@ -14,7 +14,7 @@ from unittest.mock import AsyncMock, patch
import pytest
from gateway.config import GatewayConfig, Platform
from gateway.run import GatewayRunner, _parse_session_key
from gateway.run import GatewayRunner
# ---------------------------------------------------------------------------
@@ -45,7 +45,7 @@ def _build_runner(monkeypatch, tmp_path, mode: str) -> GatewayRunner:
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
runner = GatewayRunner(GatewayConfig())
adapter = SimpleNamespace(send=AsyncMock(), handle_message=AsyncMock())
adapter = SimpleNamespace(send=AsyncMock())
runner.adapters[Platform.TELEGRAM] = adapter
return runner
@@ -243,174 +243,3 @@ async def test_no_thread_id_sends_no_metadata(monkeypatch, tmp_path):
assert adapter.send.await_count == 1
_, kwargs = adapter.send.call_args
assert kwargs["metadata"] is None
@pytest.mark.asyncio
async def test_inject_watch_notification_routes_from_session_store_origin(monkeypatch, tmp_path):
from gateway.session import SessionSource
runner = _build_runner(monkeypatch, tmp_path, "all")
adapter = runner.adapters[Platform.TELEGRAM]
runner.session_store._entries["agent:main:telegram:group:-100:42"] = SimpleNamespace(
origin=SessionSource(
platform=Platform.TELEGRAM,
chat_id="-100",
chat_type="group",
thread_id="42",
user_id="123",
user_name="Emiliyan",
)
)
evt = {
"session_id": "proc_watch",
"session_key": "agent:main:telegram:group:-100:42",
}
await runner._inject_watch_notification("[SYSTEM: Background process matched]", evt)
adapter.handle_message.assert_awaited_once()
synth_event = adapter.handle_message.await_args.args[0]
assert synth_event.internal is True
assert synth_event.source.platform == Platform.TELEGRAM
assert synth_event.source.chat_id == "-100"
assert synth_event.source.chat_type == "group"
assert synth_event.source.thread_id == "42"
assert synth_event.source.user_id == "123"
assert synth_event.source.user_name == "Emiliyan"
def test_build_process_event_source_falls_back_to_session_key_chat_type(monkeypatch, tmp_path):
runner = _build_runner(monkeypatch, tmp_path, "all")
evt = {
"session_id": "proc_watch",
"session_key": "agent:main:telegram:group:-100:42",
"platform": "telegram",
"chat_id": "-100",
"thread_id": "42",
"user_id": "123",
"user_name": "Emiliyan",
}
source = runner._build_process_event_source(evt)
assert source is not None
assert source.platform == Platform.TELEGRAM
assert source.chat_id == "-100"
assert source.chat_type == "group"
assert source.thread_id == "42"
assert source.user_id == "123"
assert source.user_name == "Emiliyan"
@pytest.mark.asyncio
async def test_inject_watch_notification_ignores_foreground_event_source(monkeypatch, tmp_path):
"""Negative test: watch notification must NOT route to the foreground thread."""
from gateway.session import SessionSource
runner = _build_runner(monkeypatch, tmp_path, "all")
adapter = runner.adapters[Platform.TELEGRAM]
# Session store has the process's original thread (thread 42)
runner.session_store._entries["agent:main:telegram:group:-100:42"] = SimpleNamespace(
origin=SessionSource(
platform=Platform.TELEGRAM,
chat_id="-100",
chat_type="group",
thread_id="42",
user_id="proc_owner",
user_name="alice",
)
)
# The evt dict carries the correct session_key — NOT a foreground event
evt = {
"session_id": "proc_cross_thread",
"session_key": "agent:main:telegram:group:-100:42",
}
await runner._inject_watch_notification("[SYSTEM: watch match]", evt)
adapter.handle_message.assert_awaited_once()
synth_event = adapter.handle_message.await_args.args[0]
# Must route to thread 42 (process origin), NOT some other thread
assert synth_event.source.thread_id == "42"
assert synth_event.source.user_id == "proc_owner"
def test_build_process_event_source_returns_none_for_empty_evt(monkeypatch, tmp_path):
"""Missing session_key and no platform metadata → None (drop notification)."""
runner = _build_runner(monkeypatch, tmp_path, "all")
source = runner._build_process_event_source({"session_id": "proc_orphan"})
assert source is None
def test_build_process_event_source_returns_none_for_invalid_platform(monkeypatch, tmp_path):
"""Invalid platform string → None."""
runner = _build_runner(monkeypatch, tmp_path, "all")
evt = {
"session_id": "proc_bad",
"platform": "not_a_real_platform",
"chat_type": "dm",
"chat_id": "123",
}
source = runner._build_process_event_source(evt)
assert source is None
def test_build_process_event_source_returns_none_for_short_session_key(monkeypatch, tmp_path):
"""Session key with <5 parts doesn't parse, falls through to empty metadata → None."""
runner = _build_runner(monkeypatch, tmp_path, "all")
evt = {
"session_id": "proc_short",
"session_key": "agent:main:telegram", # Too few parts
}
source = runner._build_process_event_source(evt)
assert source is None
# ---------------------------------------------------------------------------
# _parse_session_key helper
# ---------------------------------------------------------------------------
def test_parse_session_key_valid():
result = _parse_session_key("agent:main:telegram:group:-100")
assert result == {"platform": "telegram", "chat_type": "group", "chat_id": "-100"}
def test_parse_session_key_with_extra_parts():
"""6th part in a group key may be a user_id, not a thread_id — omit it."""
result = _parse_session_key("agent:main:discord:group:chan123:thread456")
assert result == {"platform": "discord", "chat_type": "group", "chat_id": "chan123"}
def test_parse_session_key_with_user_id_part():
"""Group keys with per-user isolation have user_id as 6th part — don't return as thread_id."""
result = _parse_session_key("agent:main:telegram:group:chat1:user99")
assert result == {"platform": "telegram", "chat_type": "group", "chat_id": "chat1"}
def test_parse_session_key_dm_with_thread():
"""DM keys use parts[5] as thread_id unambiguously."""
result = _parse_session_key("agent:main:telegram:dm:chat1:topic42")
assert result == {"platform": "telegram", "chat_type": "dm", "chat_id": "chat1", "thread_id": "topic42"}
def test_parse_session_key_thread_chat_type():
"""Thread-typed keys use parts[5] as thread_id unambiguously."""
result = _parse_session_key("agent:main:discord:thread:chan1:thread99")
assert result == {"platform": "discord", "chat_type": "thread", "chat_id": "chan1", "thread_id": "thread99"}
def test_parse_session_key_too_short():
assert _parse_session_key("agent:main:telegram") is None
assert _parse_session_key("") is None
def test_parse_session_key_wrong_prefix():
assert _parse_session_key("cron:main:telegram:dm:123") is None
assert _parse_session_key("agent:cron:telegram:dm:123") is None
-293
View File
@@ -1,293 +0,0 @@
"""Tests for busy-session acknowledgment when user sends messages during active agent runs.
Verifies that users get an immediate status response instead of total silence
when the agent is working on a task. See PR fix for the @Lonely__MH report.
"""
import asyncio
import time
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
# ---------------------------------------------------------------------------
# Minimal stubs so we can import gateway code without heavy deps
# ---------------------------------------------------------------------------
import sys, types
_tg = types.ModuleType("telegram")
_tg.constants = types.ModuleType("telegram.constants")
_ct = MagicMock()
_ct.SUPERGROUP = "supergroup"
_ct.GROUP = "group"
_ct.PRIVATE = "private"
_tg.constants.ChatType = _ct
sys.modules.setdefault("telegram", _tg)
sys.modules.setdefault("telegram.constants", _tg.constants)
sys.modules.setdefault("telegram.ext", types.ModuleType("telegram.ext"))
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SessionSource,
build_session_key,
)
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _make_event(text="hello", chat_id="123", platform_val="telegram"):
"""Build a minimal MessageEvent."""
source = SessionSource(
platform=MagicMock(value=platform_val),
chat_id=chat_id,
chat_type="private",
user_id="user1",
)
evt = MessageEvent(
text=text,
message_type=MessageType.TEXT,
source=source,
message_id="msg1",
)
return evt
def _make_runner():
"""Build a minimal GatewayRunner-like object for testing."""
from gateway.run import GatewayRunner, _AGENT_PENDING_SENTINEL
runner = object.__new__(GatewayRunner)
runner._running_agents = {}
runner._running_agents_ts = {}
runner._pending_messages = {}
runner._busy_ack_ts = {}
runner._draining = False
runner.adapters = {}
runner.config = MagicMock()
runner.session_store = None
runner.hooks = MagicMock()
runner.hooks.emit = AsyncMock()
return runner, _AGENT_PENDING_SENTINEL
def _make_adapter(platform_val="telegram"):
"""Build a minimal adapter mock."""
adapter = MagicMock()
adapter._pending_messages = {}
adapter._send_with_retry = AsyncMock()
adapter.config = MagicMock()
adapter.config.extra = {}
adapter.platform = MagicMock(value=platform_val)
return adapter
# ---------------------------------------------------------------------------
# Tests
# ---------------------------------------------------------------------------
class TestBusySessionAck:
"""User sends a message while agent is running — should get acknowledgment."""
@pytest.mark.asyncio
async def test_sends_ack_when_agent_running(self):
"""First message during busy session should get a status ack."""
runner, sentinel = _make_runner()
adapter = _make_adapter()
event = _make_event(text="Are you working?")
sk = build_session_key(event.source)
# Simulate running agent
agent = MagicMock()
agent.get_activity_summary.return_value = {
"api_call_count": 21,
"max_iterations": 60,
"current_tool": "terminal",
"last_activity_ts": time.time(),
"last_activity_desc": "terminal",
"seconds_since_activity": 1.0,
}
runner._running_agents[sk] = agent
runner._running_agents_ts[sk] = time.time() - 600 # 10 min ago
runner.adapters[event.source.platform] = adapter
result = await runner._handle_active_session_busy_message(event, sk)
assert result is True # handled
# Verify ack was sent
adapter._send_with_retry.assert_called_once()
call_kwargs = adapter._send_with_retry.call_args
content = call_kwargs.kwargs.get("content") or call_kwargs[1].get("content", "")
if not content and call_kwargs.args:
# positional args
content = str(call_kwargs)
assert "Interrupting" in content or "respond" in content
assert "/stop" not in content # no need — we ARE interrupting
# Verify message was queued in adapter pending
assert sk in adapter._pending_messages
# Verify agent interrupt was called
agent.interrupt.assert_called_once_with("Are you working?")
@pytest.mark.asyncio
async def test_debounce_suppresses_rapid_acks(self):
"""Second message within 30s should NOT send another ack."""
runner, sentinel = _make_runner()
adapter = _make_adapter()
event1 = _make_event(text="hello?")
# Reuse the same source so platform mock matches
event2 = MessageEvent(
text="still there?",
message_type=MessageType.TEXT,
source=event1.source,
message_id="msg2",
)
sk = build_session_key(event1.source)
agent = MagicMock()
agent.get_activity_summary.return_value = {
"api_call_count": 5,
"max_iterations": 60,
"current_tool": None,
"last_activity_ts": time.time(),
"last_activity_desc": "api_call",
"seconds_since_activity": 0.5,
}
runner._running_agents[sk] = agent
runner._running_agents_ts[sk] = time.time() - 60
runner.adapters[event1.source.platform] = adapter
# First message — should get ack
result1 = await runner._handle_active_session_busy_message(event1, sk)
assert result1 is True
assert adapter._send_with_retry.call_count == 1
# Second message within cooldown — should be queued but no ack
result2 = await runner._handle_active_session_busy_message(event2, sk)
assert result2 is True
assert adapter._send_with_retry.call_count == 1 # still 1, no new ack
# But interrupt should still be called for both
assert agent.interrupt.call_count == 2
@pytest.mark.asyncio
async def test_ack_after_cooldown_expires(self):
"""After 30s cooldown, a new message should send a fresh ack."""
runner, sentinel = _make_runner()
adapter = _make_adapter()
event = _make_event(text="hello?")
sk = build_session_key(event.source)
agent = MagicMock()
agent.get_activity_summary.return_value = {
"api_call_count": 10,
"max_iterations": 60,
"current_tool": "web_search",
"last_activity_ts": time.time(),
"last_activity_desc": "tool",
"seconds_since_activity": 0.5,
}
runner._running_agents[sk] = agent
runner._running_agents_ts[sk] = time.time() - 120
runner.adapters[event.source.platform] = adapter
# First ack
await runner._handle_active_session_busy_message(event, sk)
assert adapter._send_with_retry.call_count == 1
# Fake that cooldown expired
runner._busy_ack_ts[sk] = time.time() - 31
# Second ack should go through
await runner._handle_active_session_busy_message(event, sk)
assert adapter._send_with_retry.call_count == 2
@pytest.mark.asyncio
async def test_includes_status_detail(self):
"""Ack message should include iteration and tool info when available."""
runner, sentinel = _make_runner()
adapter = _make_adapter()
event = _make_event(text="yo")
sk = build_session_key(event.source)
agent = MagicMock()
agent.get_activity_summary.return_value = {
"api_call_count": 21,
"max_iterations": 60,
"current_tool": "terminal",
"last_activity_ts": time.time(),
"last_activity_desc": "terminal",
"seconds_since_activity": 0.5,
}
runner._running_agents[sk] = agent
runner._running_agents_ts[sk] = time.time() - 600 # 10 min
runner.adapters[event.source.platform] = adapter
await runner._handle_active_session_busy_message(event, sk)
call_kwargs = adapter._send_with_retry.call_args
content = call_kwargs.kwargs.get("content", "")
assert "21/60" in content # iteration
assert "terminal" in content # current tool
assert "10 min" in content # elapsed
@pytest.mark.asyncio
async def test_draining_still_works(self):
"""Draining case should still produce the drain-specific message."""
runner, sentinel = _make_runner()
runner._draining = True
adapter = _make_adapter()
event = _make_event(text="hello")
sk = build_session_key(event.source)
runner.adapters[event.source.platform] = adapter
# Mock the drain-specific methods
runner._queue_during_drain_enabled = lambda: False
runner._status_action_gerund = lambda: "restarting"
result = await runner._handle_active_session_busy_message(event, sk)
assert result is True
call_kwargs = adapter._send_with_retry.call_args
content = call_kwargs.kwargs.get("content", "")
assert "restarting" in content
@pytest.mark.asyncio
async def test_pending_sentinel_no_interrupt(self):
"""When agent is PENDING_SENTINEL, don't call interrupt (it has no method)."""
runner, sentinel = _make_runner()
adapter = _make_adapter()
event = _make_event(text="hey")
sk = build_session_key(event.source)
runner._running_agents[sk] = sentinel
runner._running_agents_ts[sk] = time.time()
runner.adapters[event.source.platform] = adapter
result = await runner._handle_active_session_busy_message(event, sk)
assert result is True
# Should still send ack
adapter._send_with_retry.assert_called_once()
@pytest.mark.asyncio
async def test_no_adapter_falls_through(self):
"""If adapter is missing, return False so default path handles it."""
runner, sentinel = _make_runner()
event = _make_event(text="hello")
sk = build_session_key(event.source)
# No adapter registered
runner._running_agents[sk] = MagicMock()
result = await runner._handle_active_session_busy_message(event, sk)
assert result is False # not handled, let default path try
-61
View File
@@ -193,67 +193,6 @@ class TestLoadGatewayConfig:
assert config.thread_sessions_per_user is False
def test_bridges_discord_channel_prompts_from_config_yaml(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
config_path = hermes_home / "config.yaml"
config_path.write_text(
"discord:\n"
" channel_prompts:\n"
" \"123\": Research mode\n"
" 456: Therapist mode\n",
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
config = load_gateway_config()
assert config.platforms[Platform.DISCORD].extra["channel_prompts"] == {
"123": "Research mode",
"456": "Therapist mode",
}
def test_bridges_telegram_channel_prompts_from_config_yaml(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
config_path = hermes_home / "config.yaml"
config_path.write_text(
"telegram:\n"
" channel_prompts:\n"
' "-1001234567": Research assistant\n'
" 789: Creative writing\n",
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
config = load_gateway_config()
assert config.platforms[Platform.TELEGRAM].extra["channel_prompts"] == {
"-1001234567": "Research assistant",
"789": "Creative writing",
}
def test_bridges_slack_channel_prompts_from_config_yaml(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
config_path = hermes_home / "config.yaml"
config_path.write_text(
"slack:\n"
" channel_prompts:\n"
' "C01ABC": Code review mode\n',
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
config = load_gateway_config()
assert config.platforms[Platform.SLACK].extra["channel_prompts"] == {
"C01ABC": "Code review mode",
}
def test_invalid_quick_commands_in_config_yaml_are_ignored(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
@@ -1,259 +0,0 @@
"""Tests for Discord channel_prompts resolution and injection."""
import sys
import threading
import types
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock
import pytest
def _ensure_discord_mock():
if "discord" in sys.modules and hasattr(sys.modules["discord"], "__file__"):
return
discord_mod = types.ModuleType("discord")
discord_mod.Intents = MagicMock()
discord_mod.Intents.default.return_value = MagicMock()
discord_mod.DMChannel = type("DMChannel", (), {})
discord_mod.Thread = type("Thread", (), {})
discord_mod.ForumChannel = type("ForumChannel", (), {})
discord_mod.Interaction = object
ext_mod = MagicMock()
commands_mod = MagicMock()
commands_mod.Bot = MagicMock
ext_mod.commands = commands_mod
sys.modules.setdefault("discord", discord_mod)
sys.modules.setdefault("discord.ext", ext_mod)
sys.modules.setdefault("discord.ext.commands", commands_mod)
import gateway.run as gateway_run
from gateway.config import Platform
from gateway.platforms.base import MessageEvent
from gateway.session import SessionSource
class _CapturingAgent:
last_init = None
def __init__(self, *args, **kwargs):
type(self).last_init = dict(kwargs)
self.tools = []
def run_conversation(self, user_message, conversation_history=None, task_id=None, persist_user_message=None):
return {
"final_response": "ok",
"messages": [],
"api_calls": 1,
"completed": True,
}
def _install_fake_agent(monkeypatch):
fake_run_agent = types.ModuleType("run_agent")
fake_run_agent.AIAgent = _CapturingAgent
monkeypatch.setitem(sys.modules, "run_agent", fake_run_agent)
def _make_adapter():
_ensure_discord_mock()
from gateway.platforms.discord import DiscordAdapter
adapter = object.__new__(DiscordAdapter)
adapter.config = MagicMock()
adapter.config.extra = {}
return adapter
def _make_runner():
runner = object.__new__(gateway_run.GatewayRunner)
runner.adapters = {}
runner._ephemeral_system_prompt = "Global prompt"
runner._prefill_messages = []
runner._reasoning_config = None
runner._service_tier = None
runner._provider_routing = {}
runner._fallback_model = None
runner._smart_model_routing = {}
runner._running_agents = {}
runner._pending_model_notes = {}
runner._session_db = None
runner._agent_cache = {}
runner._agent_cache_lock = threading.Lock()
runner._session_model_overrides = {}
runner.hooks = SimpleNamespace(loaded_hooks=False)
runner.config = SimpleNamespace(streaming=None)
runner.session_store = SimpleNamespace(
get_or_create_session=lambda source: SimpleNamespace(session_id="session-1"),
load_transcript=lambda session_id: [],
)
runner._get_or_create_gateway_honcho = lambda session_key: (None, None)
runner._enrich_message_with_vision = AsyncMock(return_value="ENRICHED")
return runner
def _make_source() -> SessionSource:
return SessionSource(
platform=Platform.DISCORD,
chat_id="12345",
chat_type="thread",
user_id="user-1",
)
class TestResolveChannelPrompts:
def test_no_prompt_returns_none(self):
adapter = _make_adapter()
assert adapter._resolve_channel_prompt("123") is None
def test_match_by_channel_id(self):
adapter = _make_adapter()
adapter.config.extra = {"channel_prompts": {"100": "Research mode"}}
assert adapter._resolve_channel_prompt("100") == "Research mode"
def test_numeric_yaml_keys_normalized_at_config_load(self):
"""Numeric YAML keys are normalized to strings by config bridging.
The resolver itself expects string keys (config.py handles normalization),
so raw numeric keys will not match this is intentional.
"""
adapter = _make_adapter()
# Simulates post-bridging state: keys are already strings
adapter.config.extra = {"channel_prompts": {"100": "Research mode"}}
assert adapter._resolve_channel_prompt("100") == "Research mode"
# Pre-bridging numeric key would not match (bridging is responsible)
adapter.config.extra = {"channel_prompts": {100: "Research mode"}}
assert adapter._resolve_channel_prompt("100") is None
def test_match_by_parent_id(self):
adapter = _make_adapter()
adapter.config.extra = {"channel_prompts": {"200": "Forum prompt"}}
assert adapter._resolve_channel_prompt("999", parent_id="200") == "Forum prompt"
def test_exact_channel_overrides_parent(self):
adapter = _make_adapter()
adapter.config.extra = {
"channel_prompts": {
"999": "Thread override",
"200": "Forum prompt",
}
}
assert adapter._resolve_channel_prompt("999", parent_id="200") == "Thread override"
def test_build_message_event_sets_channel_prompt(self):
adapter = _make_adapter()
adapter.config.extra = {"channel_prompts": {"321": "Command prompt"}}
adapter.build_source = MagicMock(return_value=SimpleNamespace())
interaction = SimpleNamespace(
channel_id=321,
channel=SimpleNamespace(name="general", guild=None, parent_id=None),
user=SimpleNamespace(id=1, display_name="Brenner"),
)
adapter._get_effective_topic = MagicMock(return_value=None)
event = adapter._build_slash_event(interaction, "/retry")
assert event.channel_prompt == "Command prompt"
@pytest.mark.asyncio
async def test_dispatch_thread_session_inherits_parent_channel_prompt(self):
adapter = _make_adapter()
adapter.config.extra = {"channel_prompts": {"200": "Parent prompt"}}
adapter.build_source = MagicMock(return_value=SimpleNamespace())
adapter._get_effective_topic = MagicMock(return_value=None)
adapter.handle_message = AsyncMock()
interaction = SimpleNamespace(
guild=SimpleNamespace(name="Wetlands"),
channel=SimpleNamespace(id=200, parent=None),
user=SimpleNamespace(id=1, display_name="Brenner"),
)
await adapter._dispatch_thread_session(interaction, "999", "new-thread", "hello")
dispatched_event = adapter.handle_message.await_args.args[0]
assert dispatched_event.channel_prompt == "Parent prompt"
def test_blank_prompts_are_ignored(self):
adapter = _make_adapter()
adapter.config.extra = {"channel_prompts": {"100": " "}}
assert adapter._resolve_channel_prompt("100") is None
@pytest.mark.asyncio
async def test_retry_preserves_channel_prompt(monkeypatch):
runner = _make_runner()
runner.session_store = SimpleNamespace(
get_or_create_session=lambda source: SimpleNamespace(session_id="session-1", last_prompt_tokens=10),
load_transcript=lambda session_id: [
{"role": "user", "content": "original message"},
{"role": "assistant", "content": "old reply"},
],
rewrite_transcript=MagicMock(),
)
runner._handle_message = AsyncMock(return_value="ok")
event = MessageEvent(
text="/retry",
message_type=gateway_run.MessageType.COMMAND,
source=_make_source(),
raw_message=SimpleNamespace(),
channel_prompt="Channel prompt",
)
result = await runner._handle_retry_command(event)
assert result == "ok"
retried_event = runner._handle_message.await_args.args[0]
assert retried_event.channel_prompt == "Channel prompt"
@pytest.mark.asyncio
async def test_run_agent_appends_channel_prompt_to_ephemeral_system_prompt(monkeypatch, tmp_path):
_install_fake_agent(monkeypatch)
runner = _make_runner()
(tmp_path / "config.yaml").write_text("agent:\n system_prompt: Global prompt\n", encoding="utf-8")
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
monkeypatch.setattr(gateway_run, "_env_path", tmp_path / ".env")
monkeypatch.setattr(gateway_run, "load_dotenv", lambda *args, **kwargs: None)
monkeypatch.setattr(gateway_run, "_load_gateway_config", lambda: {})
monkeypatch.setattr(gateway_run, "_resolve_gateway_model", lambda config=None: "gpt-5.4")
monkeypatch.setattr(
gateway_run,
"_resolve_runtime_agent_kwargs",
lambda: {
"provider": "openrouter",
"api_mode": "chat_completions",
"base_url": "https://openrouter.ai/api/v1",
"api_key": "***",
},
)
import hermes_cli.tools_config as tools_config
monkeypatch.setattr(tools_config, "_get_platform_tools", lambda user_config, platform_key: {"core"})
_CapturingAgent.last_init = None
event = MessageEvent(
text="hi",
source=_make_source(),
message_id="m1",
channel_prompt="Channel prompt",
)
result = await runner._run_agent(
message="hi",
context_prompt="Context prompt",
history=[],
source=_make_source(),
session_id="session-1",
session_key="agent:main:discord:thread:12345",
channel_prompt=event.channel_prompt,
)
assert result["final_response"] == "ok"
assert _CapturingAgent.last_init["ephemeral_system_prompt"] == (
"Context prompt\n\nChannel prompt\n\nGlobal prompt"
)
@@ -117,74 +117,6 @@ async def test_registers_native_thread_slash_command(adapter):
adapter._handle_thread_create_slash.assert_awaited_once_with(interaction, "Planning", "", 1440)
@pytest.mark.asyncio
async def test_registers_native_restart_slash_command(adapter):
adapter._run_simple_slash = AsyncMock()
adapter._register_slash_commands()
assert "restart" in adapter._client.tree.commands
interaction = SimpleNamespace()
await adapter._client.tree.commands["restart"](interaction)
adapter._run_simple_slash.assert_awaited_once_with(
interaction,
"/restart",
"Restart requested~",
)
# ------------------------------------------------------------------
# Auto-registration from COMMAND_REGISTRY
# ------------------------------------------------------------------
@pytest.mark.asyncio
async def test_auto_registers_missing_gateway_commands(adapter):
"""Commands in COMMAND_REGISTRY that aren't explicitly registered should
be auto-registered by the dynamic catch-all block."""
adapter._run_simple_slash = AsyncMock()
adapter._register_slash_commands()
tree_names = set(adapter._client.tree.commands.keys())
# These commands are gateway-available but were not in the original
# hardcoded registration list — they should be auto-registered.
expected_auto = {"debug", "yolo", "reload", "profile"}
for name in expected_auto:
assert name in tree_names, f"/{name} should be auto-registered on Discord"
@pytest.mark.asyncio
async def test_auto_registered_command_dispatches_correctly(adapter):
"""Auto-registered commands should dispatch via _run_simple_slash."""
adapter._run_simple_slash = AsyncMock()
adapter._register_slash_commands()
# /debug has no args — test parameterless dispatch
debug_cmd = adapter._client.tree.commands["debug"]
interaction = SimpleNamespace()
adapter._run_simple_slash.reset_mock()
await debug_cmd.callback(interaction)
adapter._run_simple_slash.assert_awaited_once_with(interaction, "/debug")
@pytest.mark.asyncio
async def test_auto_registered_command_with_args(adapter):
"""Auto-registered commands with args_hint should accept an optional args param."""
adapter._run_simple_slash = AsyncMock()
adapter._register_slash_commands()
# /branch has args_hint="[name]" — test dispatch with args
branch_cmd = adapter._client.tree.commands["branch"]
interaction = SimpleNamespace()
adapter._run_simple_slash.reset_mock()
await branch_cmd.callback(interaction, args="my-branch")
adapter._run_simple_slash.assert_awaited_once_with(
interaction, "/branch my-branch"
)
# ------------------------------------------------------------------
# _handle_thread_create_slash — success, session dispatch, failure
# ------------------------------------------------------------------
@@ -1,354 +0,0 @@
"""Tests for duplicate reply suppression across the gateway stack.
Covers three fix paths:
1. base.py: stale response suppressed when interrupt_event is set and a
pending message exists (#8221 / #2483)
2. run.py return path: already_sent propagated from stream consumer's
already_sent flag without requiring response_previewed (#8375)
3. run.py queued-message path: first response correctly detected as
already-streamed when already_sent is True without response_previewed
"""
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
ProcessingOutcome,
SendResult,
)
from gateway.session import SessionSource, build_session_key
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
class StubAdapter(BasePlatformAdapter):
"""Minimal concrete adapter for testing."""
def __init__(self):
super().__init__(PlatformConfig(enabled=True, token="fake"), Platform.DISCORD)
self.sent = []
async def connect(self):
return True
async def disconnect(self):
pass
async def send(self, chat_id, content, reply_to=None, metadata=None):
self.sent.append({"chat_id": chat_id, "content": content})
return SendResult(success=True, message_id="msg1")
async def send_typing(self, chat_id, metadata=None):
pass
async def get_chat_info(self, chat_id):
return {"id": chat_id}
def _make_event(text="hello", chat_id="c1", user_id="u1"):
return MessageEvent(
text=text,
source=SessionSource(
platform=Platform.DISCORD,
chat_id=chat_id,
chat_type="dm",
user_id=user_id,
),
message_id="m1",
)
# ===================================================================
# Test 1: base.py — stale response suppressed on interrupt (#8221)
# ===================================================================
class TestBaseInterruptSuppression:
@pytest.mark.asyncio
async def test_stale_response_suppressed_when_interrupted(self):
"""When interrupt_event is set AND a pending message exists,
base.py should suppress the stale response instead of sending it."""
adapter = StubAdapter()
stale_response = "This is the stale answer to the first question."
pending_response = "This is the answer to the second question."
call_count = 0
async def fake_handler(event):
nonlocal call_count
call_count += 1
if call_count == 1:
return stale_response
return pending_response
adapter.set_message_handler(fake_handler)
event_a = _make_event(text="first question")
session_key = build_session_key(event_a.source)
# Simulate: message A is being processed, message B arrives
# The interrupt event is set and B is in pending_messages
interrupt_event = asyncio.Event()
interrupt_event.set()
adapter._active_sessions[session_key] = interrupt_event
event_b = _make_event(text="second question")
adapter._pending_messages[session_key] = event_b
await adapter._process_message_background(event_a, session_key)
# The stale response should NOT have been sent.
stale_sends = [s for s in adapter.sent if s["content"] == stale_response]
assert len(stale_sends) == 0, (
f"Stale response was sent {len(stale_sends)} time(s) — should be suppressed"
)
# The pending message's response SHOULD have been sent.
pending_sends = [s for s in adapter.sent if s["content"] == pending_response]
assert len(pending_sends) == 1, "Pending message response should be sent"
@pytest.mark.asyncio
async def test_response_not_suppressed_without_interrupt(self):
"""Normal case: no interrupt, response should be sent."""
adapter = StubAdapter()
async def fake_handler(event):
return "Normal response"
adapter.set_message_handler(fake_handler)
event = _make_event()
session_key = build_session_key(event.source)
await adapter._process_message_background(event, session_key)
assert any(s["content"] == "Normal response" for s in adapter.sent)
@pytest.mark.asyncio
async def test_response_not_suppressed_with_interrupt_but_no_pending(self):
"""Interrupt event set but no pending message (race already resolved) —
response should still be sent."""
adapter = StubAdapter()
async def fake_handler(event):
return "Valid response"
adapter.set_message_handler(fake_handler)
event = _make_event()
session_key = build_session_key(event.source)
# Set interrupt but no pending message
interrupt_event = asyncio.Event()
interrupt_event.set()
adapter._active_sessions[session_key] = interrupt_event
await adapter._process_message_background(event, session_key)
assert any(s["content"] == "Valid response" for s in adapter.sent)
# ===================================================================
# Test 2: run.py — already_sent without response_previewed (#8375)
# ===================================================================
class TestAlreadySentWithoutResponsePreviewed:
"""The already_sent flag on the response dict should be set when the
stream consumer's already_sent is True, even if response_previewed is
False. This prevents duplicate sends when streaming was interrupted
by flood control."""
def _make_mock_stream_consumer(self, already_sent=False, final_response_sent=False):
sc = SimpleNamespace(
already_sent=already_sent,
final_response_sent=final_response_sent,
)
return sc
def test_already_sent_set_without_response_previewed(self):
"""Stream consumer already_sent=True should propagate to response
dict even when response_previewed is False."""
sc = self._make_mock_stream_consumer(already_sent=True, final_response_sent=False)
response = {"final_response": "text", "response_previewed": False}
# Reproduce the logic from run.py return path (post-fix)
if sc and isinstance(response, dict) and not response.get("failed"):
if (
getattr(sc, "final_response_sent", False)
or getattr(sc, "already_sent", False)
):
response["already_sent"] = True
assert response.get("already_sent") is True
def test_already_sent_not_set_when_nothing_sent(self):
"""When stream consumer hasn't sent anything, already_sent should
not be set on the response."""
sc = self._make_mock_stream_consumer(already_sent=False, final_response_sent=False)
response = {"final_response": "text", "response_previewed": False}
if sc and isinstance(response, dict) and not response.get("failed"):
if (
getattr(sc, "final_response_sent", False)
or getattr(sc, "already_sent", False)
):
response["already_sent"] = True
assert "already_sent" not in response
def test_already_sent_set_on_final_response_sent(self):
"""final_response_sent=True should still work as before."""
sc = self._make_mock_stream_consumer(already_sent=False, final_response_sent=True)
response = {"final_response": "text"}
if sc and isinstance(response, dict) and not response.get("failed"):
if (
getattr(sc, "final_response_sent", False)
or getattr(sc, "already_sent", False)
):
response["already_sent"] = True
assert response.get("already_sent") is True
def test_already_sent_not_set_on_failed_response(self):
"""Failed responses should never be suppressed — user needs to see
the error message even if streaming sent earlier partial output."""
sc = self._make_mock_stream_consumer(already_sent=True, final_response_sent=False)
response = {"final_response": "Error: something broke", "failed": True}
if sc and isinstance(response, dict) and not response.get("failed"):
if (
getattr(sc, "final_response_sent", False)
or getattr(sc, "already_sent", False)
):
response["already_sent"] = True
assert "already_sent" not in response
# ===================================================================
# Test 2b: run.py — empty response never suppressed (#10xxx)
# ===================================================================
class TestEmptyResponseNotSuppressed:
"""When the model returns '(empty)' after tool calls (e.g. mimo-v2-pro
going silent after web_search), the gateway must NOT suppress delivery
even if the stream consumer sent intermediate text earlier.
Without this fix, the user sees partial streaming text ('Let me search
for that') and then silence — the '(empty)' sentinel is swallowed by
already_sent=True."""
def _make_mock_stream_consumer(self, already_sent=False, final_response_sent=False):
return SimpleNamespace(
already_sent=already_sent,
final_response_sent=final_response_sent,
)
def _apply_suppression_logic(self, response, sc):
"""Reproduce the fixed logic from gateway/run.py return path."""
if sc and isinstance(response, dict) and not response.get("failed"):
_final = response.get("final_response") or ""
_is_empty_sentinel = not _final or _final == "(empty)"
if not _is_empty_sentinel and (
getattr(sc, "final_response_sent", False)
or getattr(sc, "already_sent", False)
):
response["already_sent"] = True
def test_empty_sentinel_not_suppressed_with_already_sent(self):
"""'(empty)' final_response should NOT be suppressed even when
streaming sent intermediate content."""
sc = self._make_mock_stream_consumer(already_sent=True, final_response_sent=True)
response = {"final_response": "(empty)"}
self._apply_suppression_logic(response, sc)
assert "already_sent" not in response
def test_empty_string_not_suppressed_with_already_sent(self):
"""Empty string final_response should NOT be suppressed."""
sc = self._make_mock_stream_consumer(already_sent=True, final_response_sent=True)
response = {"final_response": ""}
self._apply_suppression_logic(response, sc)
assert "already_sent" not in response
def test_none_response_not_suppressed_with_already_sent(self):
"""None final_response should NOT be suppressed."""
sc = self._make_mock_stream_consumer(already_sent=True, final_response_sent=True)
response = {"final_response": None}
self._apply_suppression_logic(response, sc)
assert "already_sent" not in response
def test_real_response_still_suppressed_with_already_sent(self):
"""Normal non-empty response should still be suppressed when
streaming delivered content."""
sc = self._make_mock_stream_consumer(already_sent=True, final_response_sent=False)
response = {"final_response": "Here are the search results..."}
self._apply_suppression_logic(response, sc)
assert response.get("already_sent") is True
def test_failed_empty_response_never_suppressed(self):
"""Failed responses are never suppressed regardless of content."""
sc = self._make_mock_stream_consumer(already_sent=True, final_response_sent=True)
response = {"final_response": "(empty)", "failed": True}
self._apply_suppression_logic(response, sc)
assert "already_sent" not in response
class TestQueuedMessageAlreadyStreamed:
"""The queued-message path should detect that the first response was
already streamed (already_sent=True) even without response_previewed."""
def _make_mock_sc(self, already_sent=False, final_response_sent=False):
return SimpleNamespace(
already_sent=already_sent,
final_response_sent=final_response_sent,
)
def test_queued_path_detects_already_streamed(self):
"""already_sent=True on stream consumer means first response was
streamed skip re-sending before processing queued message."""
_sc = self._make_mock_sc(already_sent=True)
# Reproduce the queued-message logic from run.py (post-fix)
_already_streamed = bool(
_sc
and (
getattr(_sc, "final_response_sent", False)
or getattr(_sc, "already_sent", False)
)
)
assert _already_streamed is True
def test_queued_path_sends_when_not_streamed(self):
"""Nothing was streamed — first response should be sent before
processing the queued message."""
_sc = self._make_mock_sc(already_sent=False)
_already_streamed = bool(
_sc
and (
getattr(_sc, "final_response_sent", False)
or getattr(_sc, "already_sent", False)
)
)
assert _already_streamed is False
def test_queued_path_with_no_stream_consumer(self):
"""No stream consumer at all (streaming disabled) — not streamed."""
_sc = None
_already_streamed = bool(
_sc
and (
getattr(_sc, "final_response_sent", False)
or getattr(_sc, "already_sent", False)
)
)
assert _already_streamed is False
+19
View File
@@ -125,6 +125,25 @@ async def test_gateway_stop_service_restart_sets_named_exit_code():
assert runner._exit_code == GATEWAY_SERVICE_RESTART_EXIT_CODE
@pytest.mark.asyncio
async def test_gateway_stop_emits_shutdown_hook_after_drain(monkeypatch):
runner, adapter = make_restart_runner()
adapter.disconnect = AsyncMock()
runner.hooks.emit = AsyncMock()
with patch("gateway.status.remove_pid_file"), patch("gateway.status.write_runtime_status"):
await runner.stop(restart=True, service_restart=True)
runner.hooks.emit.assert_awaited_once_with(
"gateway:shutdown",
{
"restart": True,
"service_restart": True,
"detached_restart": False,
},
)
@pytest.mark.asyncio
async def test_drain_active_agents_throttles_status_updates():
runner, _adapter = make_restart_runner()
+20 -1
View File
@@ -9,7 +9,7 @@ import pytest
from gateway.hooks import HookRegistry
def _create_hook(hooks_dir, hook_name, events, handler_code):
def _create_hook(hooks_dir, hook_name, events, handler_code, *, manifest_extra=""):
"""Helper to create a hook directory with HOOK.yaml and handler.py."""
hook_dir = hooks_dir / hook_name
hook_dir.mkdir(parents=True)
@@ -17,6 +17,7 @@ def _create_hook(hooks_dir, hook_name, events, handler_code):
f"name: {hook_name}\n"
f"description: Test hook\n"
f"events: {events}\n"
f"{manifest_extra}"
)
(hook_dir / "handler.py").write_text(handler_code)
return hook_dir
@@ -112,6 +113,24 @@ class TestDiscoverAndLoad:
assert len(reg.loaded_hooks) == 2
def test_preserves_optional_startup_readiness_metadata(self, tmp_path):
_create_hook(
tmp_path,
"ready-hook",
'["gateway:startup"]',
"def handle(e, c): pass\n",
manifest_extra="startup_readiness:\n id: beam-runtime\n required: false\n",
)
reg = HookRegistry()
with patch("gateway.hooks.HOOKS_DIR", tmp_path), _patch_no_builtins(reg):
reg.discover_and_load()
assert reg.loaded_hooks[0]["startup_readiness"] == {
"id": "beam-runtime",
"required": False,
}
class TestEmit:
@pytest.mark.asyncio
@@ -230,59 +230,6 @@ async def test_notify_on_complete_preserves_user_identity(monkeypatch, tmp_path)
assert event.source.user_name == "alice"
@pytest.mark.asyncio
async def test_notify_on_complete_uses_session_store_origin_for_group_topic(monkeypatch, tmp_path):
import tools.process_registry as pr_module
from gateway.session import SessionSource
sessions = [
SimpleNamespace(
output_buffer="done\n", exited=True, exit_code=0, command="echo test"
),
]
monkeypatch.setattr(pr_module, "process_registry", _FakeRegistry(sessions))
async def _instant_sleep(*_a, **_kw):
pass
monkeypatch.setattr(asyncio, "sleep", _instant_sleep)
runner = GatewayRunner(GatewayConfig())
adapter = SimpleNamespace(send=AsyncMock(), handle_message=AsyncMock())
runner.adapters[Platform.TELEGRAM] = adapter
runner.session_store._entries["agent:main:telegram:group:-100:42"] = SimpleNamespace(
origin=SessionSource(
platform=Platform.TELEGRAM,
chat_id="-100",
chat_type="group",
thread_id="42",
user_id="user-42",
user_name="alice",
)
)
watcher = {
"session_id": "proc_test_internal",
"check_interval": 0,
"session_key": "agent:main:telegram:group:-100:42",
"platform": "telegram",
"chat_id": "-100",
"thread_id": "42",
"notify_on_complete": True,
}
await runner._run_process_watcher(watcher)
assert adapter.handle_message.await_count == 1
event = adapter.handle_message.await_args.args[0]
assert event.internal is True
assert event.source.platform == Platform.TELEGRAM
assert event.source.chat_id == "-100"
assert event.source.chat_type == "group"
assert event.source.thread_id == "42"
assert event.source.user_id == "user-42"
assert event.source.user_name == "alice"
@pytest.mark.asyncio
async def test_none_user_id_skips_pairing(monkeypatch, tmp_path):
"""A non-internal event with user_id=None should be silently dropped."""
+1 -23
View File
@@ -335,29 +335,6 @@ def _make_adapter():
return adapter
# ---------------------------------------------------------------------------
# Typing indicator
# ---------------------------------------------------------------------------
class TestMatrixTypingIndicator:
def setup_method(self):
self.adapter = _make_adapter()
self.adapter._client = MagicMock()
self.adapter._client.set_typing = AsyncMock()
@pytest.mark.asyncio
async def test_stop_typing_clears_matrix_typing_state(self):
"""stop_typing() should send typing=false instead of waiting for timeout expiry."""
from gateway.platforms.matrix import RoomID
await self.adapter.stop_typing("!room:example.org")
self.adapter._client.set_typing.assert_awaited_once_with(
RoomID("!room:example.org"),
timeout=0,
)
# ---------------------------------------------------------------------------
# mxc:// URL conversion
# ---------------------------------------------------------------------------
@@ -1854,3 +1831,4 @@ class TestMatrixPresence:
assert result is False
@@ -1,89 +0,0 @@
"""Tests for MessageDeduplicator TTL enforcement (#10306).
Previously, is_duplicate() returned True for any previously seen ID without
checking its age expired entries were only purged when cache size exceeded
max_size. Normal workloads never overflowed, so messages stayed "duplicate"
forever.
The fix checks TTL at query time: if the entry's timestamp plus TTL is in
the past, the entry is treated as expired and the message is allowed through.
"""
import time
from unittest.mock import patch
from gateway.platforms.helpers import MessageDeduplicator
class TestMessageDeduplicatorTTL:
"""TTL-based expiration must work regardless of cache size."""
def test_duplicate_within_ttl(self):
"""Same message within TTL window is duplicate."""
dedup = MessageDeduplicator(ttl_seconds=60)
assert dedup.is_duplicate("msg-1") is False
assert dedup.is_duplicate("msg-1") is True
def test_not_duplicate_after_ttl_expires(self):
"""Same message AFTER TTL expires should NOT be duplicate."""
dedup = MessageDeduplicator(ttl_seconds=5)
assert dedup.is_duplicate("msg-1") is False
# Fast-forward time past TTL
dedup._seen["msg-1"] = time.time() - 10 # 10s ago, TTL is 5s
assert dedup.is_duplicate("msg-1") is False, \
"Expired entry should not be treated as duplicate"
def test_expired_entry_gets_refreshed(self):
"""After an expired entry is allowed through, it should be re-tracked."""
dedup = MessageDeduplicator(ttl_seconds=5)
assert dedup.is_duplicate("msg-1") is False
# Expire the entry
dedup._seen["msg-1"] = time.time() - 10
# Should be allowed through (expired)
assert dedup.is_duplicate("msg-1") is False
# Now should be duplicate again (freshly tracked)
assert dedup.is_duplicate("msg-1") is True
def test_different_messages_not_confused(self):
"""Different message IDs are independent."""
dedup = MessageDeduplicator(ttl_seconds=60)
assert dedup.is_duplicate("msg-1") is False
assert dedup.is_duplicate("msg-2") is False
assert dedup.is_duplicate("msg-1") is True
assert dedup.is_duplicate("msg-2") is True
def test_empty_id_never_duplicate(self):
"""Empty/None message IDs are never treated as duplicate."""
dedup = MessageDeduplicator(ttl_seconds=60)
assert dedup.is_duplicate("") is False
assert dedup.is_duplicate("") is False
def test_max_size_eviction_prunes_expired(self):
"""Cache pruning on overflow removes expired entries."""
dedup = MessageDeduplicator(max_size=5, ttl_seconds=60)
# Add 6 entries, with the first 3 expired
now = time.time()
for i in range(3):
dedup._seen[f"old-{i}"] = now - 120 # expired (2 min ago, TTL 60s)
for i in range(3):
dedup.is_duplicate(f"new-{i}")
# Now we have 6 entries. Next insert triggers pruning.
dedup.is_duplicate("trigger")
# The 3 expired entries should be gone, leaving 4 fresh ones
assert len(dedup._seen) == 4
assert "old-0" not in dedup._seen
assert "new-0" in dedup._seen
def test_ttl_zero_means_no_dedup(self):
"""With TTL=0, all entries expire immediately."""
dedup = MessageDeduplicator(ttl_seconds=0)
assert dedup.is_duplicate("msg-1") is False
# Entry was just added at time.time(), and TTL is 0,
# so now - seen_time >= 0 = ttl, meaning it's expired
# But time.time() might be the exact same float, so
# the check is `now - ts < ttl` which is `0 < 0` = False
# This means TTL=0 effectively disables dedup
assert dedup.is_duplicate("msg-1") is False
+1 -1
View File
@@ -193,7 +193,7 @@ async def test_shutdown_notification_says_restarting_when_restart_requested():
assert len(adapter.sent) == 1
assert "restarting" in adapter.sent[0]
assert "resume" in adapter.sent[0]
assert "/retry" in adapter.sent[0]
@pytest.mark.asyncio
@@ -132,6 +132,68 @@ async def test_runner_records_connected_platform_state_on_success(monkeypatch, t
assert state["platforms"]["discord"]["error_message"] is None
@pytest.mark.asyncio
async def test_runner_discovers_plugins_before_loading_hooks(monkeypatch, tmp_path):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
config = GatewayConfig(
platforms={
Platform.DISCORD: PlatformConfig(enabled=True, token="***")
},
sessions_dir=tmp_path / "sessions",
)
runner = GatewayRunner(config)
order: list[str] = []
monkeypatch.setattr(runner, "_create_adapter", lambda platform, platform_config: _SuccessfulAdapter())
monkeypatch.setattr("hermes_cli.plugins.discover_plugins", lambda: order.append("plugins"))
monkeypatch.setattr(runner.hooks, "discover_and_load", lambda: order.append("hooks"))
monkeypatch.setattr(runner.hooks, "emit", AsyncMock())
ok = await runner.start()
assert ok is True
assert order == ["plugins", "hooks"]
@pytest.mark.asyncio
async def test_runner_initializes_startup_checks_before_gateway_startup_emit(monkeypatch, tmp_path):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
config = GatewayConfig(
platforms={
Platform.DISCORD: PlatformConfig(enabled=True, token="***")
},
sessions_dir=tmp_path / "sessions",
)
runner = GatewayRunner(config)
runner.hooks._loaded_hooks = [
{
"name": "beam-runtime",
"events": ["gateway:startup"],
"path": str(tmp_path / "hook"),
"startup_readiness": {
"id": "beam-runtime",
"required": True,
},
}
]
monkeypatch.setattr(runner, "_create_adapter", lambda platform, platform_config: _SuccessfulAdapter())
monkeypatch.setattr("hermes_cli.plugins.discover_plugins", lambda: None)
monkeypatch.setattr(runner.hooks, "discover_and_load", lambda: None)
async def _assert_checks(event_type, context):
state = read_runtime_status()
assert event_type == "gateway:startup"
assert state["startup_checks"]["beam-runtime"]["state"] == "pending"
assert state["startup_checks"]["beam-runtime"]["required"] is True
monkeypatch.setattr(runner.hooks, "emit", _assert_checks)
ok = await runner.start()
assert ok is True
@pytest.mark.asyncio
async def test_start_gateway_verbosity_imports_redacting_formatter(monkeypatch, tmp_path):
"""Verbosity != None must not crash with NameError on RedactingFormatter (#8044)."""
+4 -25
View File
@@ -1,8 +1,6 @@
import asyncio
import os
import pytest
from gateway.config import Platform
from gateway.run import GatewayRunner
from gateway.session import SessionContext, SessionSource
@@ -10,26 +8,9 @@ from gateway.session_context import (
get_session_env,
set_session_vars,
clear_session_vars,
_VAR_MAP,
_UNSET,
)
@pytest.fixture(autouse=True)
def _reset_contextvars():
"""Reset all session contextvars to _UNSET between tests.
In production each asyncio.Task gets a fresh context copy where the
defaults are _UNSET. In tests all functions share the same thread
context, so a clear_session_vars() from test A (which sets vars to "")
would leak into test B. This fixture ensures each test starts clean.
"""
yield
for var in _VAR_MAP.values():
# Can't use var.reset() without a token; just set back to sentinel.
var.set(_UNSET)
def test_set_session_env_sets_contextvars(monkeypatch):
"""_set_session_env should populate contextvars, not os.environ."""
runner = object.__new__(GatewayRunner)
@@ -117,11 +98,9 @@ def test_get_session_env_falls_back_to_os_environ(monkeypatch):
tokens = set_session_vars(platform="telegram")
assert get_session_env("HERMES_SESSION_PLATFORM") == "telegram"
# After clear — should return "" (explicitly cleared), NOT fall back
# to os.environ. This is the fix for #10304: stale os.environ values
# must not leak through after a gateway session is cleaned up.
# Restore — should fall back to os.environ again
clear_session_vars(tokens)
assert get_session_env("HERMES_SESSION_PLATFORM") == ""
assert get_session_env("HERMES_SESSION_PLATFORM") == "discord"
def test_get_session_env_default_when_nothing_set(monkeypatch):
@@ -185,9 +164,9 @@ def test_session_key_falls_back_to_os_environ(monkeypatch):
tokens = set_session_vars(session_key="ctx-session-456")
assert get_session_env("HERMES_SESSION_KEY") == "ctx-session-456"
# After clear — should return "" (explicitly cleared), not os.environ (#10304)
# Restore — should fall back to os.environ
clear_session_vars(tokens)
assert get_session_env("HERMES_SESSION_KEY") == ""
assert get_session_env("HERMES_SESSION_KEY") == "env-session-123"
def test_set_session_env_includes_session_key():
+66
View File
@@ -132,6 +132,72 @@ class TestGatewayRuntimeStatus:
assert payload["platforms"]["discord"]["error_code"] is None
assert payload["platforms"]["discord"]["error_message"] is None
def test_reset_startup_checks_replaces_previous_run_entries(self, tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
status.write_runtime_status(
gateway_state="running",
startup_checks={
"old-check": {
"state": "ready",
"required": True,
"source": "old-hook",
"detail": None,
}
},
)
status.reset_startup_checks([
{
"name": "new-hook",
"startup_readiness": {
"id": "new-check",
"required": False,
},
}
])
payload = status.read_runtime_status()
assert set(payload["startup_checks"]) == {"new-check"}
assert payload["startup_checks"]["new-check"]["state"] == "pending"
assert payload["startup_checks"]["new-check"]["required"] is False
assert payload["startup_checks"]["new-check"]["source"] == "new-hook"
def test_mark_startup_check_ready_persists_detail(self, tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
status.reset_startup_checks([
{
"name": "beam",
"startup_readiness": {
"id": "beam-runtime",
"required": True,
},
}
])
status.mark_startup_check_ready("beam-runtime", detail="ready for RPC")
payload = status.read_runtime_status()
assert payload["startup_checks"]["beam-runtime"]["state"] == "ready"
assert payload["startup_checks"]["beam-runtime"]["detail"] == "ready for RPC"
def test_mark_startup_check_failed_creates_missing_entry(self, tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
status.mark_startup_check_failed(
"late-hook",
detail="startup hook crashed",
required=False,
source="late-hook",
)
payload = status.read_runtime_status()
assert payload["startup_checks"]["late-hook"]["state"] == "failed"
assert payload["startup_checks"]["late-hook"]["required"] is False
assert payload["startup_checks"]["late-hook"]["source"] == "late-hook"
assert payload["startup_checks"]["late-hook"]["detail"] == "startup hook crashed"
class TestTerminatePid:
def test_force_uses_taskkill_on_windows(self, monkeypatch):
-116
View File
@@ -1,116 +0,0 @@
"""Tests for stuck-session loop detection (#7536).
When a session is active across 3+ consecutive gateway restarts (the agent
gets stuck, gateway restarts, same session gets stuck again), the session
is auto-suspended on startup so the user gets a clean slate.
"""
import json
from pathlib import Path
from unittest.mock import MagicMock
import pytest
from tests.gateway.restart_test_helpers import make_restart_runner
@pytest.fixture
def runner_with_home(tmp_path, monkeypatch):
"""Create a runner with a writable HERMES_HOME."""
monkeypatch.setattr("gateway.run._hermes_home", tmp_path)
runner, adapter = make_restart_runner()
return runner, tmp_path
class TestStuckLoopDetection:
def test_increment_creates_file(self, runner_with_home):
runner, home = runner_with_home
runner._increment_restart_failure_counts({"session:a", "session:b"})
path = home / runner._STUCK_LOOP_FILE
assert path.exists()
counts = json.loads(path.read_text())
assert counts["session:a"] == 1
assert counts["session:b"] == 1
def test_increment_accumulates(self, runner_with_home):
runner, home = runner_with_home
runner._increment_restart_failure_counts({"session:a"})
runner._increment_restart_failure_counts({"session:a"})
runner._increment_restart_failure_counts({"session:a"})
counts = json.loads((home / runner._STUCK_LOOP_FILE).read_text())
assert counts["session:a"] == 3
def test_increment_drops_inactive_sessions(self, runner_with_home):
runner, home = runner_with_home
runner._increment_restart_failure_counts({"session:a", "session:b"})
runner._increment_restart_failure_counts({"session:a"}) # b not active
counts = json.loads((home / runner._STUCK_LOOP_FILE).read_text())
assert "session:a" in counts
assert "session:b" not in counts
def test_suspend_at_threshold(self, runner_with_home):
runner, home = runner_with_home
# Simulate 3 restarts with session:a active each time
for _ in range(3):
runner._increment_restart_failure_counts({"session:a"})
# Create a mock session entry
mock_entry = MagicMock()
mock_entry.suspended = False
runner.session_store._entries = {"session:a": mock_entry}
runner.session_store._save = MagicMock()
suspended = runner._suspend_stuck_loop_sessions()
assert suspended == 1
assert mock_entry.suspended is True
def test_no_suspend_below_threshold(self, runner_with_home):
runner, home = runner_with_home
runner._increment_restart_failure_counts({"session:a"})
runner._increment_restart_failure_counts({"session:a"})
# Only 2 restarts — below threshold of 3
mock_entry = MagicMock()
mock_entry.suspended = False
runner.session_store._entries = {"session:a": mock_entry}
suspended = runner._suspend_stuck_loop_sessions()
assert suspended == 0
assert mock_entry.suspended is False
def test_clear_on_success(self, runner_with_home):
runner, home = runner_with_home
runner._increment_restart_failure_counts({"session:a", "session:b"})
runner._clear_restart_failure_count("session:a")
path = home / runner._STUCK_LOOP_FILE
counts = json.loads(path.read_text())
assert "session:a" not in counts
assert "session:b" in counts
def test_clear_removes_file_when_empty(self, runner_with_home):
runner, home = runner_with_home
runner._increment_restart_failure_counts({"session:a"})
runner._clear_restart_failure_count("session:a")
assert not (home / runner._STUCK_LOOP_FILE).exists()
def test_suspend_clears_file(self, runner_with_home):
runner, home = runner_with_home
for _ in range(3):
runner._increment_restart_failure_counts({"session:a"})
mock_entry = MagicMock()
mock_entry.suspended = False
runner.session_store._entries = {"session:a": mock_entry}
runner.session_store._save = MagicMock()
runner._suspend_stuck_loop_sessions()
assert not (home / runner._STUCK_LOOP_FILE).exists()
def test_no_file_no_crash(self, runner_with_home):
runner, home = runner_with_home
# No file exists — should return 0 and not crash
assert runner._suspend_stuck_loop_sessions() == 0
# Clear on nonexistent file — should not crash
runner._clear_restart_failure_count("nonexistent")
@@ -263,7 +263,7 @@ class TestTelegramApprovalCallback:
mock_resolve.assert_not_called()
@pytest.mark.asyncio
async def test_update_prompt_callback_not_affected(self, tmp_path):
async def test_update_prompt_callback_not_affected(self):
"""Ensure update prompt callbacks still work."""
adapter = _make_adapter()
@@ -281,63 +281,11 @@ class TestTelegramApprovalCallback:
context = MagicMock()
with patch("tools.approval.resolve_gateway_approval") as mock_resolve:
with patch("hermes_constants.get_hermes_home", return_value=tmp_path):
with patch.dict(os.environ, {"TELEGRAM_ALLOWED_USERS": ""}):
with patch("hermes_constants.get_hermes_home", return_value=Path("/tmp/test")):
try:
await adapter._handle_callback_query(update, context)
except Exception:
pass # May fail on file write, that's fine
# Should NOT have triggered approval resolution
mock_resolve.assert_not_called()
assert (tmp_path / ".update_response").read_text() == "y"
@pytest.mark.asyncio
async def test_update_prompt_callback_rejects_unauthorized_user(self, tmp_path):
"""Update prompt buttons should honor TELEGRAM_ALLOWED_USERS."""
adapter = _make_adapter()
query = AsyncMock()
query.data = "update_prompt:y"
query.message = MagicMock()
query.message.chat_id = 12345
query.from_user = MagicMock()
query.from_user.id = 222
query.answer = AsyncMock()
query.edit_message_text = AsyncMock()
update = MagicMock()
update.callback_query = query
context = MagicMock()
with patch("hermes_constants.get_hermes_home", return_value=tmp_path):
with patch.dict(os.environ, {"TELEGRAM_ALLOWED_USERS": "111"}):
await adapter._handle_callback_query(update, context)
query.answer.assert_called_once()
assert "not authorized" in query.answer.call_args[1]["text"].lower()
query.edit_message_text.assert_not_called()
assert not (tmp_path / ".update_response").exists()
@pytest.mark.asyncio
async def test_update_prompt_callback_allows_authorized_user(self, tmp_path):
"""Allowed Telegram users can still answer update prompt buttons."""
adapter = _make_adapter()
query = AsyncMock()
query.data = "update_prompt:n"
query.message = MagicMock()
query.message.chat_id = 12345
query.from_user = MagicMock()
query.from_user.id = 111
query.answer = AsyncMock()
query.edit_message_text = AsyncMock()
update = MagicMock()
update.callback_query = query
context = MagicMock()
with patch("hermes_constants.get_hermes_home", return_value=tmp_path):
with patch.dict(os.environ, {"TELEGRAM_ALLOWED_USERS": "111"}):
await adapter._handle_callback_query(update, context)
query.answer.assert_called_once()
query.edit_message_text.assert_called_once()
assert (tmp_path / ".update_response").read_text() == "n"
+1 -1
View File
@@ -97,7 +97,7 @@ class TestResolveCommand:
def test_alias_resolves_to_canonical(self):
assert resolve_command("bg").name == "background"
assert resolve_command("reset").name == "new"
assert resolve_command("q").name == "queue"
assert resolve_command("q").name == "quit"
assert resolve_command("exit").name == "quit"
assert resolve_command("gateway").name == "platforms"
assert resolve_command("set-home").name == "sethome"

Some files were not shown because too many files have changed in this diff Show More