Compare commits

..

5 Commits

Author SHA1 Message Date
alt-glitch a4e414c832 fix: clean up stale test references to removed attributes
Remove tests for deleted should_compress_preflight/get_status/last_total_tokens
from test_context_compressor.py. Remove stale _reasoning_deltas_fired references
from test_reasoning_command.py (attribute removed, tests were passing vacuously).
2026-04-09 15:34:08 -07:00
alt-glitch 19e95307aa chore: remove spec-dead-code.md from tracked files 2026-04-09 15:31:49 -07:00
alt-glitch d079fe507b fix: restore 6 tests that tested live code but used deleted helpers
The dead test cleanup agent incorrectly removed tests that tested live
production functions but used deleted helpers (clear_session,
clear_nous_free_tier_cache) for setup/teardown. Replaced the deleted
helpers with direct internal state manipulation.
2026-04-09 15:26:39 -07:00
alt-glitch f2968ef609 merge: resolve conflict in browser_camofox.py (keep dead code removal) 2026-04-09 15:21:09 -07:00
alt-glitch af4ac8ce45 fix: remove 115 verified dead code symbols across 46 production files
Automated dead code audit using vulture + coverage.py + ast-grep intersection,
confirmed by Opus deep verification pass. Every symbol was verified to have
zero production callers (test imports excluded from reachability analysis).

Removes 1,534 lines of dead production code and 1,382 lines of stale test code
that exclusively tested the removed symbols.

Key removals:
- hermes_cli/checklist.py: entire dead module (superseded by curses_ui.py)
- agent/builtin_memory_provider.py: entire dead module (never instantiated)
- 28 dead functions including _setup_provider_model_selection (140 lines),
  llm_audit_skill (104 lines), set_token_counts (65 lines)
- 26 dead variables/constants (KAWAII arrays, URL constants, COMPACT_BANNER)
- 15 dead attributes (written to self but never read)
- 5 dead properties, 1 dead class

Methodology documented in spec-dead-code.md.
2026-04-09 15:18:09 -07:00
481 changed files with 7725 additions and 67399 deletions
-9
View File
@@ -89,15 +89,6 @@
# Optional base URL override:
# HERMES_QWEN_BASE_URL=https://portal.qwen.ai/v1
# =============================================================================
# LLM PROVIDER (Xiaomi MiMo)
# =============================================================================
# Xiaomi MiMo models (mimo-v2-pro, mimo-v2-omni, mimo-v2-flash).
# Get your key at: https://platform.xiaomimimo.com
# XIAOMI_API_KEY=your_key_here
# Optional base URL override:
# XIAOMI_BASE_URL=https://api.xiaomimimo.com/v1
# =============================================================================
# TOOL API KEYS
# =============================================================================
-2
View File
@@ -1,2 +0,0 @@
# Auto-generated files — collapse diffs and exclude from language stats
web/package-lock.json linguist-generated=true
+1 -9
View File
@@ -41,19 +41,11 @@ jobs:
python-version: '3.11'
- name: Install PyYAML for skill extraction
run: pip install pyyaml httpx
run: pip install pyyaml
- name: Extract skill metadata for dashboard
run: python3 website/scripts/extract-skills.py
- name: Build skills index (if not already present)
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
if [ ! -f website/static/api/skills-index.json ]; then
python3 scripts/build_skills_index.py || echo "Skills index build failed (non-fatal)"
fi
- name: Install dependencies
run: npm ci
working-directory: website
+7 -2
View File
@@ -69,7 +69,9 @@ jobs:
file: Dockerfile
push: true
platforms: linux/amd64,linux/arm64
tags: nousresearch/hermes-agent:latest
tags: |
nousresearch/hermes-agent:latest
nousresearch/hermes-agent:${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
@@ -81,6 +83,9 @@ jobs:
file: Dockerfile
push: true
platforms: linux/amd64,linux/arm64
tags: nousresearch/hermes-agent:${{ github.event.release.tag_name }}
tags: |
nousresearch/hermes-agent:latest
nousresearch/hermes-agent:${{ github.event.release.tag_name }}
nousresearch/hermes-agent:${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
-101
View File
@@ -1,101 +0,0 @@
name: Build Skills Index
on:
schedule:
# Run twice daily: 6 AM and 6 PM UTC
- cron: '0 6,18 * * *'
workflow_dispatch: # Manual trigger
push:
branches: [main]
paths:
- 'scripts/build_skills_index.py'
- '.github/workflows/skills-index.yml'
permissions:
contents: read
jobs:
build-index:
# Only run on the upstream repository, not on forks
if: github.repository == 'NousResearch/hermes-agent'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Install dependencies
run: pip install httpx pyyaml
- name: Build skills index
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: python scripts/build_skills_index.py
- name: Upload index artifact
uses: actions/upload-artifact@v4
with:
name: skills-index
path: website/static/api/skills-index.json
retention-days: 7
deploy-with-index:
needs: build-index
runs-on: ubuntu-latest
permissions:
pages: write
id-token: write
environment:
name: github-pages
url: ${{ steps.deploy.outputs.page_url }}
# Only deploy on schedule or manual trigger (not on every push to the script)
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
steps:
- uses: actions/checkout@v4
- uses: actions/download-artifact@v4
with:
name: skills-index
path: website/static/api/
- uses: actions/setup-node@v4
with:
node-version: 20
cache: npm
cache-dependency-path: website/package-lock.json
- uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Install PyYAML for skill extraction
run: pip install pyyaml
- name: Extract skill metadata for dashboard
run: python3 website/scripts/extract-skills.py
- name: Install dependencies
run: npm ci
working-directory: website
- name: Build Docusaurus
run: npm run build
working-directory: website
- name: Stage deployment
run: |
mkdir -p _site/docs
cp -r landingpage/* _site/
cp -r website/build/* _site/docs/
echo "hermes-agent.nousresearch.com" > _site/CNAME
- name: Upload artifact
uses: actions/upload-pages-artifact@v3
with:
path: _site
- name: Deploy to GitHub Pages
id: deploy
uses: actions/deploy-pages@v4
-4
View File
@@ -51,9 +51,6 @@ ignored/
.worktrees/
environments/benchmarks/evals/
# Web UI build output
hermes_cli/web_dist/
# Release script temp files
.release_notes.md
mini-swe-agent/
@@ -61,4 +58,3 @@ mini-swe-agent/
# Nix
.direnv/
result
website/static/api/skills-index.json
+2 -3
View File
@@ -351,9 +351,8 @@ Cache-breaking forces dramatically higher costs. The ONLY time we alter context
### Background Process Notifications (Gateway)
When `terminal(background=true, notify_on_complete=true)` is used, the gateway runs a watcher that
detects process completion and triggers a new agent turn. Control verbosity of background process
messages with `display.background_process_notifications`
When `terminal(background=true, check_interval=...)` is used, the gateway runs a watcher that
pushes status updates to the user's chat. Control verbosity with `display.background_process_notifications`
in config.yaml (or `HERMES_BACKGROUND_NOTIFICATIONS` env var):
- `all` — running-output updates + final message (default)
+5 -23
View File
@@ -1,44 +1,26 @@
FROM ghcr.io/astral-sh/uv:0.11.6-python3.13-trixie@sha256:b3c543b6c4f23a5f2df22866bd7857e5d304b67a564f4feab6ac22044dde719b AS uv_source
FROM tianon/gosu:1.19-trixie@sha256:3b176695959c71e123eb390d427efc665eeb561b1540e82679c15e992006b8b9 AS gosu_source
FROM debian:13.4
# Disable Python stdout buffering to ensure logs are printed immediately
ENV PYTHONUNBUFFERED=1
# Store Playwright browsers outside the volume mount so the build-time
# install survives the /opt/data volume overlay at runtime.
ENV PLAYWRIGHT_BROWSERS_PATH=/opt/hermes/.playwright
# Install system dependencies in one layer, clear APT cache
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential nodejs npm python3 ripgrep ffmpeg gcc python3-dev libffi-dev procps && \
build-essential nodejs npm python3 python3-pip ripgrep ffmpeg gcc python3-dev libffi-dev && \
rm -rf /var/lib/apt/lists/*
# Non-root user for runtime; UID can be overridden via HERMES_UID at runtime
RUN useradd -u 10000 -m -d /opt/data hermes
COPY --chmod=0755 --from=gosu_source /gosu /usr/local/bin/
COPY --chmod=0755 --from=uv_source /usr/local/bin/uv /usr/local/bin/uvx /usr/local/bin/
COPY . /opt/hermes
WORKDIR /opt/hermes
# Install Node dependencies and Playwright as root (--with-deps needs apt)
RUN npm install --prefer-offline --no-audit && \
# Install Python and Node dependencies in one layer, no cache
RUN pip install --no-cache-dir -e ".[all]" --break-system-packages && \
npm install --prefer-offline --no-audit && \
npx playwright install --with-deps chromium --only-shell && \
cd /opt/hermes/scripts/whatsapp-bridge && \
npm install --prefer-offline --no-audit && \
npm cache clean --force
# Hand ownership to hermes user, then install Python deps in a virtualenv
RUN chown -R hermes:hermes /opt/hermes
USER hermes
RUN uv venv && \
uv pip install --no-cache-dir -e ".[all]"
USER root
WORKDIR /opt/hermes
RUN chmod +x /opt/hermes/docker/entrypoint.sh
ENV HERMES_HOME=/opt/data
+1 -4
View File
@@ -33,10 +33,8 @@ Use any model you want — [Nous Portal](https://portal.nousresearch.com), [Open
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
```
Works on Linux, macOS, WSL2, and Android via Termux. The installer handles the platform-specific setup for you.
Works on Linux, macOS, and WSL2. The installer handles everything — Python, Node.js, dependencies, and the `hermes` command. No prerequisites except git.
> **Android / Termux:** The tested manual path is documented in the [Termux guide](https://hermes-agent.nousresearch.com/docs/getting-started/termux). On Termux, Hermes installs a curated `.[termux]` extra because the full `.[all]` extra currently pulls Android-incompatible voice dependencies.
>
> **Windows:** Native Windows is not supported. Please install [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install) and run the command above.
After installation:
@@ -167,7 +165,6 @@ python -m pytest tests/ -q
- 📚 [Skills Hub](https://agentskills.io)
- 🐛 [Issues](https://github.com/NousResearch/hermes-agent/issues)
- 💡 [Discussions](https://github.com/NousResearch/hermes-agent/discussions)
- 🔌 [HermesClaw](https://github.com/AaronWong1999/hermesclaw) — Community WeChat bridge: Run Hermes Agent and OpenClaw on the same WeChat account.
---
+7 -9
View File
@@ -36,7 +36,6 @@ from acp.schema import (
SessionCapabilities,
SessionForkCapabilities,
SessionListCapabilities,
SessionResumeCapabilities,
SessionInfo,
TextContentBlock,
UnstructuredCommandInput,
@@ -246,11 +245,9 @@ class HermesACPAgent(acp.Agent):
protocol_version=acp.PROTOCOL_VERSION,
agent_info=Implementation(name="hermes-agent", version=HERMES_VERSION),
agent_capabilities=AgentCapabilities(
load_session=True,
session_capabilities=SessionCapabilities(
fork=SessionForkCapabilities(),
list=SessionListCapabilities(),
resume=SessionResumeCapabilities(),
),
),
auth_methods=auth_methods,
@@ -454,13 +451,14 @@ class HermesACPAgent(acp.Agent):
await conn.session_update(session_id, update)
usage = None
if any(result.get(key) is not None for key in ("prompt_tokens", "completion_tokens", "total_tokens")):
usage_data = result.get("usage")
if usage_data and isinstance(usage_data, dict):
usage = Usage(
input_tokens=result.get("prompt_tokens", 0),
output_tokens=result.get("completion_tokens", 0),
total_tokens=result.get("total_tokens", 0),
thought_tokens=result.get("reasoning_tokens"),
cached_read_tokens=result.get("cache_read_tokens"),
input_tokens=usage_data.get("prompt_tokens", 0),
output_tokens=usage_data.get("completion_tokens", 0),
total_tokens=usage_data.get("total_tokens", 0),
thought_tokens=usage_data.get("reasoning_tokens"),
cached_read_tokens=usage_data.get("cached_tokens"),
)
stop_reason = "cancelled" if state.cancel_event and state.cancel_event.is_set() else "end_turn"
+17 -74
View File
@@ -60,8 +60,6 @@ _ANTHROPIC_OUTPUT_LIMITS = {
"claude-3-opus": 4_096,
"claude-3-sonnet": 4_096,
"claude-3-haiku": 4_096,
# Third-party Anthropic-compatible providers
"minimax": 131_072,
}
# For any model not in the table, assume the highest current limit.
@@ -76,11 +74,8 @@ def _get_anthropic_max_output(model: str) -> int:
model IDs (claude-sonnet-4-5-20250929) and variant suffixes (:1m, :fast)
resolve correctly. Longest-prefix match wins to avoid e.g. "claude-3-5"
matching before "claude-3-5-sonnet".
Normalizes dots to hyphens so that model names like
``anthropic/claude-opus-4.6`` match the ``claude-opus-4-6`` table key.
"""
m = model.lower().replace(".", "-")
m = model.lower()
best_key = ""
best_val = _ANTHROPIC_DEFAULT_OUTPUT_LIMIT
for key, val in _ANTHROPIC_OUTPUT_LIMITS.items():
@@ -100,15 +95,6 @@ _COMMON_BETAS = [
"interleaved-thinking-2025-05-14",
"fine-grained-tool-streaming-2025-05-14",
]
# MiniMax's Anthropic-compatible endpoints fail tool-use requests when
# the fine-grained tool streaming beta is present. Omit it so tool calls
# fall back to the provider's default response path.
_TOOL_STREAMING_BETA = "fine-grained-tool-streaming-2025-05-14"
# Fast mode beta — enables the ``speed: "fast"`` request parameter for
# significantly higher output token throughput on Opus 4.6 (~2.5x).
# See https://platform.claude.com/docs/en/build-with-claude/fast-mode
_FAST_MODE_BETA = "fast-mode-2026-02-01"
# Additional beta headers required for OAuth/subscription auth.
# Matches what Claude Code (and pi-ai / OpenCode) send.
@@ -163,27 +149,18 @@ def _get_claude_code_version() -> str:
def _is_oauth_token(key: str) -> bool:
"""Check if the key is an Anthropic OAuth/setup token.
"""Check if the key is an OAuth/setup token (not a regular Console API key).
Positively identifies Anthropic OAuth tokens by their key format:
- ``sk-ant-`` prefix (but NOT ``sk-ant-api``) → setup tokens, managed keys
- ``eyJ`` prefix → JWTs from the Anthropic OAuth flow
Non-Anthropic keys (MiniMax, Alibaba, etc.) don't match either pattern
and correctly return False.
Regular API keys start with 'sk-ant-api'. Everything else (setup-tokens
starting with 'sk-ant-oat', managed keys, JWTs, etc.) needs Bearer auth.
"""
if not key:
return False
# Regular Anthropic Console API keys x-api-key auth, never OAuth
# Regular Console API keys use x-api-key header
if key.startswith("sk-ant-api"):
return False
# Anthropic-issued tokens (setup-tokens sk-ant-oat-*, managed keys)
if key.startswith("sk-ant-"):
return True
# JWTs from Anthropic OAuth flow
if key.startswith("eyJ"):
return True
return False
# Everything else (setup-tokens, managed keys, JWTs) uses Bearer auth
return True
def _normalize_base_url_text(base_url) -> str:
@@ -227,19 +204,6 @@ def _requires_bearer_auth(base_url: str | None) -> bool:
return normalized.startswith(("https://api.minimax.io/anthropic", "https://api.minimaxi.com/anthropic"))
def _common_betas_for_base_url(base_url: str | None) -> list[str]:
"""Return the beta headers that are safe for the configured endpoint.
MiniMax's Anthropic-compatible endpoints (Bearer-auth) reject requests
that include Anthropic's ``fine-grained-tool-streaming`` beta — every
tool-use message triggers a connection error. Strip that beta for
Bearer-auth endpoints while keeping all other betas intact.
"""
if _requires_bearer_auth(base_url):
return [b for b in _COMMON_BETAS if b != _TOOL_STREAMING_BETA]
return _COMMON_BETAS
def build_anthropic_client(api_key: str, base_url: str = None):
"""Create an Anthropic client, auto-detecting setup-tokens vs API keys.
@@ -258,7 +222,6 @@ def build_anthropic_client(api_key: str, base_url: str = None):
}
if normalized_base_url:
kwargs["base_url"] = normalized_base_url
common_betas = _common_betas_for_base_url(normalized_base_url)
if _requires_bearer_auth(normalized_base_url):
# Some Anthropic-compatible providers (e.g. MiniMax) expect the API key in
@@ -268,21 +231,21 @@ def build_anthropic_client(api_key: str, base_url: str = None):
# not use Anthropic's sk-ant-api prefix and would otherwise be misread as
# Anthropic OAuth/setup tokens.
kwargs["auth_token"] = api_key
if common_betas:
kwargs["default_headers"] = {"anthropic-beta": ",".join(common_betas)}
if _COMMON_BETAS:
kwargs["default_headers"] = {"anthropic-beta": ",".join(_COMMON_BETAS)}
elif _is_third_party_anthropic_endpoint(base_url):
# Third-party proxies (Azure AI Foundry, AWS Bedrock, etc.) use their
# own API keys with x-api-key auth. Skip OAuth detection — their keys
# don't follow Anthropic's sk-ant-* prefix convention and would be
# misclassified as OAuth tokens.
kwargs["api_key"] = api_key
if common_betas:
kwargs["default_headers"] = {"anthropic-beta": ",".join(common_betas)}
if _COMMON_BETAS:
kwargs["default_headers"] = {"anthropic-beta": ",".join(_COMMON_BETAS)}
elif _is_oauth_token(api_key):
# OAuth access token / setup-token → Bearer auth + Claude Code identity.
# Anthropic routes OAuth requests based on user-agent and headers;
# without Claude Code's fingerprint, requests get intermittent 500s.
all_betas = common_betas + _OAUTH_ONLY_BETAS
all_betas = _COMMON_BETAS + _OAUTH_ONLY_BETAS
kwargs["auth_token"] = api_key
kwargs["default_headers"] = {
"anthropic-beta": ",".join(all_betas),
@@ -292,8 +255,8 @@ def build_anthropic_client(api_key: str, base_url: str = None):
else:
# Regular API key → x-api-key header + common betas
kwargs["api_key"] = api_key
if common_betas:
kwargs["default_headers"] = {"anthropic-beta": ",".join(common_betas)}
if _COMMON_BETAS:
kwargs["default_headers"] = {"anthropic-beta": ",".join(_COMMON_BETAS)}
return _anthropic_sdk.Anthropic(**kwargs)
@@ -1195,7 +1158,6 @@ def build_anthropic_kwargs(
preserve_dots: bool = False,
context_length: Optional[int] = None,
base_url: str | None = None,
fast_mode: bool = False,
) -> Dict[str, Any]:
"""Build kwargs for anthropic.messages.create().
@@ -1229,10 +1191,6 @@ def build_anthropic_kwargs(
When *base_url* points to a third-party Anthropic-compatible endpoint,
thinking block signatures are stripped (they are Anthropic-proprietary).
When *fast_mode* is True, adds ``speed: "fast"`` and the fast-mode beta
header for ~2.5x faster output throughput on Opus 4.6. Currently only
supported on native Anthropic endpoints (not third-party compatible ones).
"""
system, anthropic_messages = convert_messages_to_anthropic(messages, base_url=base_url)
anthropic_tools = convert_tools_to_anthropic(tools) if tools else []
@@ -1315,10 +1273,9 @@ def build_anthropic_kwargs(
# Map reasoning_config to Anthropic's thinking parameter.
# Claude 4.6 models use adaptive thinking + output_config.effort.
# Older models use manual thinking with budget_tokens.
# MiniMax Anthropic-compat endpoints support thinking (manual mode only,
# not adaptive). Haiku does NOT support extended thinking — skip entirely.
# Haiku and MiniMax models do NOT support extended thinking — skip entirely.
if reasoning_config and isinstance(reasoning_config, dict):
if reasoning_config.get("enabled") is not False and "haiku" not in model.lower():
if reasoning_config.get("enabled") is not False and "haiku" not in model.lower() and "minimax" not in model.lower():
effort = str(reasoning_config.get("effort", "medium")).lower()
budget = THINKING_BUDGET.get(effort, 8000)
if _supports_adaptive_thinking(model):
@@ -1332,20 +1289,6 @@ def build_anthropic_kwargs(
kwargs["temperature"] = 1
kwargs["max_tokens"] = max(effective_max_tokens, budget + 4096)
# ── Fast mode (Opus 4.6 only) ────────────────────────────────────
# Adds speed:"fast" + the fast-mode beta header for ~2.5x output speed.
# Only for native Anthropic endpoints — third-party providers would
# reject the unknown beta header and speed parameter.
if fast_mode and not _is_third_party_anthropic_endpoint(base_url):
kwargs["speed"] = "fast"
# Build extra_headers with ALL applicable betas (the per-request
# extra_headers override the client-level anthropic-beta header).
betas = list(_common_betas_for_base_url(base_url))
if is_oauth:
betas.extend(_OAUTH_ONLY_BETAS)
betas.append(_FAST_MODE_BETA)
kwargs["extra_headers"] = {"anthropic-beta": ",".join(betas)}
return kwargs
@@ -1407,4 +1350,4 @@ def normalize_anthropic_response(
reasoning_details=reasoning_details or None,
),
finish_reason,
)
)
+94 -473
View File
File diff suppressed because it is too large Load Diff
+74 -156
View File
@@ -4,12 +4,8 @@ Self-contained class with its own OpenAI client for summarization.
Uses auxiliary model (cheap/fast) to summarize middle turns while
protecting head and tail context.
Improvements over v2:
- Structured summary template with Resolved/Pending question tracking
- Summarizer preamble: "Do not respond to any questions" (from OpenCode)
- Handoff framing: "different assistant" (from Codex) to create separation
- "Remaining Work" replaces "Next Steps" to avoid reading as active instructions
- Clear separator when summary merges into tail message
Improvements over v1:
- Structured summary template (Goal, Progress, Decisions, Files, Next Steps)
- Iterative summary updates (preserves info across multiple compactions)
- Token-budget tail protection instead of fixed message count
- Tool output pruning before LLM summarization (cheap pre-pass)
@@ -22,9 +18,7 @@ import time
from typing import Any, Dict, List, Optional
from agent.auxiliary_client import call_llm
from agent.context_engine import ContextEngine
from agent.model_metadata import (
MINIMUM_CONTEXT_LENGTH,
get_model_context_length,
estimate_messages_tokens_rough,
)
@@ -32,13 +26,12 @@ from agent.model_metadata import (
logger = logging.getLogger(__name__)
SUMMARY_PREFIX = (
"[CONTEXT COMPACTION — REFERENCE ONLY] Earlier turns were compacted "
"into the summary below. This is a handoff from a previous context "
"window — treat it as background reference, NOT as active instructions. "
"Do NOT answer questions or fulfill requests mentioned in this summary; "
"they were already addressed. Respond ONLY to the latest user message "
"that appears AFTER this summary. The current session state (files, "
"config, etc.) may reflect work described here — avoid repeating it:"
"[CONTEXT COMPACTION] Earlier turns in this conversation were compacted "
"to save context space. The summary below describes work that was "
"already completed, and the current session state may still reflect "
"that work (for example, files may already be changed). Use the summary "
"and the current state to continue from where things left off, and "
"avoid repeating work:"
)
LEGACY_SUMMARY_PREFIX = "[CONTEXT SUMMARY]:"
@@ -57,8 +50,8 @@ _CHARS_PER_TOKEN = 4
_SUMMARY_FAILURE_COOLDOWN_SECONDS = 600
class ContextCompressor(ContextEngine):
"""Default context engine — compresses conversation context via lossy summarization.
class ContextCompressor:
"""Compresses conversation context when approaching the model's context limit.
Algorithm:
1. Prune old tool results (cheap, no LLM call)
@@ -68,38 +61,6 @@ class ContextCompressor(ContextEngine):
5. On subsequent compactions, iteratively update the previous summary
"""
@property
def name(self) -> str:
return "compressor"
def on_session_reset(self) -> None:
"""Reset all per-session state for /new or /reset."""
super().on_session_reset()
self._context_probed = False
self._context_probe_persistable = False
self._previous_summary = None
def update_model(
self,
model: str,
context_length: int,
base_url: str = "",
api_key: str = "",
provider: str = "",
api_mode: str = "",
) -> None:
"""Update model info after a model switch or fallback activation."""
self.model = model
self.base_url = base_url
self.api_key = api_key
self.provider = provider
self.api_mode = api_mode
self.context_length = context_length
self.threshold_tokens = max(
int(context_length * self.threshold_percent),
MINIMUM_CONTEXT_LENGTH,
)
def __init__(
self,
model: str,
@@ -113,13 +74,11 @@ class ContextCompressor(ContextEngine):
api_key: str = "",
config_context_length: int | None = None,
provider: str = "",
api_mode: str = "",
):
self.model = model
self.base_url = base_url
self.api_key = api_key
self.provider = provider
self.api_mode = api_mode
self.threshold_percent = threshold_percent
self.protect_first_n = protect_first_n
self.protect_last_n = protect_last_n
@@ -131,14 +90,7 @@ class ContextCompressor(ContextEngine):
config_context_length=config_context_length,
provider=provider,
)
# Floor: never compress below MINIMUM_CONTEXT_LENGTH tokens even if
# the percentage would suggest a lower value. This prevents premature
# compression on large-context models at 50% while keeping the % sane
# for models right at the minimum.
self.threshold_tokens = max(
int(self.context_length * threshold_percent),
MINIMUM_CONTEXT_LENGTH,
)
self.threshold_tokens = int(self.context_length * threshold_percent)
self.compression_count = 0
# Derive token budgets: ratio is relative to the threshold, not total context
@@ -315,20 +267,13 @@ class ContextCompressor(ContextEngine):
return "\n\n".join(parts)
def _generate_summary(self, turns_to_summarize: List[Dict[str, Any]], focus_topic: str = None) -> Optional[str]:
def _generate_summary(self, turns_to_summarize: List[Dict[str, Any]]) -> Optional[str]:
"""Generate a structured summary of conversation turns.
Uses a structured template (Goal, Progress, Decisions, Resolved/Pending
Questions, Files, Remaining Work) with explicit preamble telling the
summarizer not to answer questions. When a previous summary exists,
Uses a structured template (Goal, Progress, Decisions, Files, Next Steps)
inspired by Pi-mono and OpenCode. When a previous summary exists,
generates an iterative update instead of summarizing from scratch.
Args:
focus_topic: Optional focus string for guided compression. When
provided, the summariser prioritises preserving information
related to this topic and is more aggressive about compressing
everything else. Inspired by Claude Code's ``/compact``.
Returns None if all attempts fail — the caller should drop
the middle turns without a summary rather than inject a useless
placeholder.
@@ -344,27 +289,60 @@ class ContextCompressor(ContextEngine):
summary_budget = self._compute_summary_budget(turns_to_summarize)
content_to_summarize = self._serialize_for_summary(turns_to_summarize)
# Preamble shared by both first-compaction and iterative-update prompts.
# Inspired by OpenCode's "do not respond to any questions" instruction
# and Codex's "another language model" framing.
_summarizer_preamble = (
"You are a summarization agent creating a context checkpoint. "
"Your output will be injected as reference material for a DIFFERENT "
"assistant that continues the conversation. "
"Do NOT respond to any questions or requests in the conversation — "
"only output the structured summary. "
"Do NOT include any preamble, greeting, or prefix."
)
if self._previous_summary:
# Iterative update: preserve existing info, add new progress
prompt = f"""You are updating a context compaction summary. A previous compaction produced the summary below. New conversation turns have occurred since then and need to be incorporated.
# Shared structured template (used by both paths).
# Key changes vs v1:
# - "Pending User Asks" section (from Claude Code) explicitly tracks
# unanswered questions so the model knows what's resolved vs open
# - "Remaining Work" replaces "Next Steps" to avoid reading as active
# instructions
# - "Resolved Questions" makes it clear which questions were already
# answered (prevents model from re-answering them)
_template_sections = f"""## Goal
PREVIOUS SUMMARY:
{self._previous_summary}
NEW TURNS TO INCORPORATE:
{content_to_summarize}
Update the summary using this exact structure. PRESERVE all existing information that is still relevant. ADD new progress. Move items from "In Progress" to "Done" when completed. Remove information only if it is clearly obsolete.
## Goal
[What the user is trying to accomplish — preserve from previous summary, update if goal evolved]
## Constraints & Preferences
[User preferences, coding style, constraints, important decisions — accumulate across compactions]
## Progress
### Done
[Completed work — include specific file paths, commands run, results obtained]
### In Progress
[Work currently underway]
### Blocked
[Any blockers or issues encountered]
## Key Decisions
[Important technical decisions and why they were made]
## Relevant Files
[Files read, modified, or created — with brief note on each. Accumulate across compactions.]
## Next Steps
[What needs to happen next to continue the work]
## Critical Context
[Any specific values, error messages, configuration details, or data that would be lost without explicit preservation]
## Tools & Patterns
[Which tools were used, how they were used effectively, and any tool-specific discoveries. Accumulate across compactions.]
Target ~{summary_budget} tokens. Be specific — include file paths, command outputs, error messages, and concrete values rather than vague descriptions.
Write only the summary body. Do not include any preamble or prefix."""
else:
# First compaction: summarize from scratch
prompt = f"""Create a structured handoff summary for a later assistant that will continue this conversation after earlier turns are compacted.
TURNS TO SUMMARIZE:
{content_to_summarize}
Use this exact structure:
## Goal
[What the user is trying to accomplish]
## Constraints & Preferences
@@ -381,74 +359,25 @@ class ContextCompressor(ContextEngine):
## Key Decisions
[Important technical decisions and why they were made]
## Resolved Questions
[Questions the user asked that were ALREADY answered — include the answer so the next assistant does not re-answer them]
## Pending User Asks
[Questions or requests from the user that have NOT yet been answered or fulfilled. If none, write "None."]
## Relevant Files
[Files read, modified, or created — with brief note on each]
## Remaining Work
[What remains to be done — framed as context, not instructions]
## Next Steps
[What needs to happen next to continue the work]
## Critical Context
[Any specific values, error messages, configuration details, or data that would be lost without explicit preservation]
## Tools & Patterns
[Which tools were used, how they were used effectively, and any tool-specific discoveries]
[Which tools were used, how they were used effectively, and any tool-specific discoveries (e.g., preferred flags, working invocations, successful command patterns)]
Target ~{summary_budget} tokens. Be specific — include file paths, command outputs, error messages, and concrete values rather than vague descriptions.
Target ~{summary_budget} tokens. Be specific — include file paths, command outputs, error messages, and concrete values rather than vague descriptions. The goal is to prevent the next assistant from repeating work or losing important details.
Write only the summary body. Do not include any preamble or prefix."""
if self._previous_summary:
# Iterative update: preserve existing info, add new progress
prompt = f"""{_summarizer_preamble}
You are updating a context compaction summary. A previous compaction produced the summary below. New conversation turns have occurred since then and need to be incorporated.
PREVIOUS SUMMARY:
{self._previous_summary}
NEW TURNS TO INCORPORATE:
{content_to_summarize}
Update the summary using this exact structure. PRESERVE all existing information that is still relevant. ADD new progress. Move items from "In Progress" to "Done" when completed. Move answered questions to "Resolved Questions". Remove information only if it is clearly obsolete.
{_template_sections}"""
else:
# First compaction: summarize from scratch
prompt = f"""{_summarizer_preamble}
Create a structured handoff summary for a different assistant that will continue this conversation after earlier turns are compacted. The next assistant should be able to understand what happened without re-reading the original turns.
TURNS TO SUMMARIZE:
{content_to_summarize}
Use this exact structure:
{_template_sections}"""
# Inject focus topic guidance when the user provides one via /compress <focus>.
# This goes at the end of the prompt so it takes precedence.
if focus_topic:
prompt += f"""
FOCUS TOPIC: "{focus_topic}"
The user has requested that this compaction PRIORITISE preserving all information related to the focus topic above. For content related to "{focus_topic}", include full detail — exact values, file paths, command outputs, error messages, and decisions. For content NOT related to the focus topic, summarise more aggressively (brief one-liners or omit if truly irrelevant). The focus topic sections should receive roughly 60-70% of the summary token budget."""
try:
call_kwargs = {
"task": "compression",
"main_runtime": {
"model": self.model,
"provider": self.provider,
"base_url": self.base_url,
"api_key": self.api_key,
"api_mode": self.api_mode,
},
"messages": [{"role": "user", "content": prompt}],
"max_tokens": summary_budget * 2,
# timeout resolved from auxiliary.compression.timeout config by call_llm
@@ -663,7 +592,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
# Main compression entry point
# ------------------------------------------------------------------
def compress(self, messages: List[Dict[str, Any]], current_tokens: int = None, focus_topic: str = None) -> List[Dict[str, Any]]:
def compress(self, messages: List[Dict[str, Any]], current_tokens: int = None) -> List[Dict[str, Any]]:
"""Compress conversation messages by summarizing middle turns.
Algorithm:
@@ -675,12 +604,6 @@ The user has requested that this compaction PRIORITISE preserving all informatio
After compression, orphaned tool_call / tool_result pairs are cleaned
up so the API never receives mismatched IDs.
Args:
focus_topic: Optional focus string for guided compression. When
provided, the summariser will prioritise preserving information
related to this topic and be more aggressive about compressing
everything else. Inspired by Claude Code's ``/compact``.
"""
n_messages = len(messages)
# Only need head + 3 tail messages minimum (token budget decides the real tail size)
@@ -738,7 +661,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
)
# Phase 3: Generate structured summary
summary = self._generate_summary(turns_to_summarize, focus_topic=focus_topic)
summary = self._generate_summary(turns_to_summarize)
# Phase 4: Assemble compressed message list
compressed = []
@@ -793,12 +716,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
msg = messages[i].copy()
if _merge_summary_into_tail and i == compress_end:
original = msg.get("content") or ""
msg["content"] = (
summary
+ "\n\n--- END OF CONTEXT SUMMARY — "
"respond to the message below, not the summary above ---\n\n"
+ original
)
msg["content"] = summary + "\n\n" + original
_merge_summary_into_tail = False
compressed.append(msg)
-184
View File
@@ -1,184 +0,0 @@
"""Abstract base class for pluggable context engines.
A context engine controls how conversation context is managed when
approaching the model's token limit. The built-in ContextCompressor
is the default implementation. Third-party engines (e.g. LCM) can
replace it via the plugin system or by being placed in the
``plugins/context_engine/<name>/`` directory.
Selection is config-driven: ``context.engine`` in config.yaml.
Default is ``"compressor"`` (the built-in). Only one engine is active.
The engine is responsible for:
- Deciding when compaction should fire
- Performing compaction (summarization, DAG construction, etc.)
- Optionally exposing tools the agent can call (e.g. lcm_grep)
- Tracking token usage from API responses
Lifecycle:
1. Engine is instantiated and registered (plugin register() or default)
2. on_session_start() called when a conversation begins
3. update_from_response() called after each API response with usage data
4. should_compress() checked after each turn
5. compress() called when should_compress() returns True
6. on_session_end() called at real session boundaries (CLI exit, /reset,
gateway session expiry) — NOT per-turn
"""
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional
class ContextEngine(ABC):
"""Base class all context engines must implement."""
# -- Identity ----------------------------------------------------------
@property
@abstractmethod
def name(self) -> str:
"""Short identifier (e.g. 'compressor', 'lcm')."""
# -- Token state (read by run_agent.py for display/logging) ------------
#
# Engines MUST maintain these. run_agent.py reads them directly.
last_prompt_tokens: int = 0
last_completion_tokens: int = 0
last_total_tokens: int = 0
threshold_tokens: int = 0
context_length: int = 0
compression_count: int = 0
# -- Compaction parameters (read by run_agent.py for preflight) --------
#
# These control the preflight compression check. Subclasses may
# override via __init__ or property; defaults are sensible for most
# engines.
threshold_percent: float = 0.75
protect_first_n: int = 3
protect_last_n: int = 6
# -- Core interface ----------------------------------------------------
@abstractmethod
def update_from_response(self, usage: Dict[str, Any]) -> None:
"""Update tracked token usage from an API response.
Called after every LLM call with the usage dict from the response.
"""
@abstractmethod
def should_compress(self, prompt_tokens: int = None) -> bool:
"""Return True if compaction should fire this turn."""
@abstractmethod
def compress(
self,
messages: List[Dict[str, Any]],
current_tokens: int = None,
) -> List[Dict[str, Any]]:
"""Compact the message list and return the new message list.
This is the main entry point. The engine receives the full message
list and returns a (possibly shorter) list that fits within the
context budget. The implementation is free to summarize, build a
DAG, or do anything else — as long as the returned list is a valid
OpenAI-format message sequence.
"""
# -- Optional: pre-flight check ----------------------------------------
def should_compress_preflight(self, messages: List[Dict[str, Any]]) -> bool:
"""Quick rough check before the API call (no real token count yet).
Default returns False (skip pre-flight). Override if your engine
can do a cheap estimate.
"""
return False
# -- Optional: session lifecycle ---------------------------------------
def on_session_start(self, session_id: str, **kwargs) -> None:
"""Called when a new conversation session begins.
Use this to load persisted state (DAG, store) for the session.
kwargs may include hermes_home, platform, model, etc.
"""
def on_session_end(self, session_id: str, messages: List[Dict[str, Any]]) -> None:
"""Called at real session boundaries (CLI exit, /reset, gateway expiry).
Use this to flush state, close DB connections, etc.
NOT called per-turn — only when the session truly ends.
"""
def on_session_reset(self) -> None:
"""Called on /new or /reset. Reset per-session state.
Default resets compression_count and token tracking.
"""
self.last_prompt_tokens = 0
self.last_completion_tokens = 0
self.last_total_tokens = 0
self.compression_count = 0
# -- Optional: tools ---------------------------------------------------
def get_tool_schemas(self) -> List[Dict[str, Any]]:
"""Return tool schemas this engine provides to the agent.
Default returns empty list (no tools). LCM would return schemas
for lcm_grep, lcm_describe, lcm_expand here.
"""
return []
def handle_tool_call(self, name: str, args: Dict[str, Any], **kwargs) -> str:
"""Handle a tool call from the agent.
Only called for tool names returned by get_tool_schemas().
Must return a JSON string.
kwargs may include:
messages: the current in-memory message list (for live ingestion)
"""
import json
return json.dumps({"error": f"Unknown context engine tool: {name}"})
# -- Optional: status / display ----------------------------------------
def get_status(self) -> Dict[str, Any]:
"""Return status dict for display/logging.
Default returns the standard fields run_agent.py expects.
"""
return {
"last_prompt_tokens": self.last_prompt_tokens,
"threshold_tokens": self.threshold_tokens,
"context_length": self.context_length,
"usage_percent": (
min(100, self.last_prompt_tokens / self.context_length * 100)
if self.context_length else 0
),
"compression_count": self.compression_count,
}
# -- Optional: model switch support ------------------------------------
def update_model(
self,
model: str,
context_length: int,
base_url: str = "",
api_key: str = "",
provider: str = "",
) -> None:
"""Called when the user switches models or on fallback activation.
Default updates context_length and recalculates threshold_tokens
from threshold_percent. Override if your engine needs more
(e.g. recalculate DAG budgets, switch summary models).
"""
self.context_length = context_length
self.threshold_tokens = int(context_length * self.threshold_percent)
+7 -36
View File
@@ -13,9 +13,8 @@ from typing import Awaitable, Callable
from agent.model_metadata import estimate_tokens_rough
_QUOTED_REFERENCE_VALUE = r'(?:`[^`\n]+`|"[^"\n]+"|\'[^\'\n]+\')'
REFERENCE_PATTERN = re.compile(
rf"(?<![\w/])@(?:(?P<simple>diff|staged)\b|(?P<kind>file|folder|git|url):(?P<value>{_QUOTED_REFERENCE_VALUE}(?::\d+(?:-\d+)?)?|\S+))"
r"(?<![\w/])@(?:(?P<simple>diff|staged)\b|(?P<kind>file|folder|git|url):(?P<value>\S+))"
)
TRAILING_PUNCTUATION = ",.;!?"
_SENSITIVE_HOME_DIRS = (".ssh", ".aws", ".gnupg", ".kube", ".docker", ".azure", ".config/gh")
@@ -82,10 +81,14 @@ def parse_context_references(message: str) -> list[ContextReference]:
value = _strip_trailing_punctuation(match.group("value") or "")
line_start = None
line_end = None
target = _strip_reference_wrappers(value)
target = value
if kind == "file":
target, line_start, line_end = _parse_file_reference_value(value)
range_match = re.match(r"^(?P<path>.+?):(?P<start>\d+)(?:-(?P<end>\d+))?$", value)
if range_match:
target = range_match.group("path")
line_start = int(range_match.group("start"))
line_end = int(range_match.group("end") or range_match.group("start"))
refs.append(
ContextReference(
@@ -372,38 +375,6 @@ def _strip_trailing_punctuation(value: str) -> str:
return stripped
def _strip_reference_wrappers(value: str) -> str:
if len(value) >= 2 and value[0] == value[-1] and value[0] in "`\"'":
return value[1:-1]
return value
def _parse_file_reference_value(value: str) -> tuple[str, int | None, int | None]:
quoted_match = re.match(
r'^(?P<quote>`|"|\')(?P<path>.+?)(?P=quote)(?::(?P<start>\d+)(?:-(?P<end>\d+))?)?$',
value,
)
if quoted_match:
line_start = quoted_match.group("start")
line_end = quoted_match.group("end")
return (
quoted_match.group("path"),
int(line_start) if line_start is not None else None,
int(line_end or line_start) if line_start is not None else None,
)
range_match = re.match(r"^(?P<path>.+?):(?P<start>\d+)(?:-(?P<end>\d+))?$", value)
if range_match:
line_start = int(range_match.group("start"))
return (
range_match.group("path"),
line_start,
int(range_match.group("end") or range_match.group("start")),
)
return _strip_reference_wrappers(value), None, None
def _remove_reference_tokens(message: str, refs: list[ContextReference]) -> str:
pieces: list[str] = []
cursor = 0
-161
View File
@@ -20,17 +20,13 @@ from hermes_cli.auth import (
DEFAULT_AGENT_KEY_MIN_TTL_SECONDS,
KIMI_CODE_BASE_URL,
PROVIDER_REGISTRY,
_auth_store_lock,
_codex_access_token_is_expiring,
_decode_jwt_claims,
_import_codex_cli_tokens,
_write_codex_cli_tokens,
_load_auth_store,
_load_provider_state,
_resolve_kimi_base_url,
_resolve_zai_base_url,
_save_auth_store,
_save_provider_state,
read_credential_pool,
write_credential_pool,
)
@@ -483,67 +479,6 @@ class CredentialPool:
logger.debug("Failed to sync from ~/.codex/auth.json: %s", exc)
return entry
def _sync_device_code_entry_to_auth_store(self, entry: PooledCredential) -> None:
"""Write refreshed pool entry tokens back to auth.json providers.
After a pool-level refresh, the pool entry has fresh tokens but
auth.json's ``providers.<id>`` still holds the pre-refresh state.
On the next ``load_pool()``, ``_seed_from_singletons()`` reads that
stale state and can overwrite the fresh pool entry — potentially
re-seeding a consumed single-use refresh token.
Applies to any OAuth provider whose singleton lives in auth.json
(currently Nous and OpenAI Codex).
"""
if entry.source != "device_code":
return
try:
with _auth_store_lock():
auth_store = _load_auth_store()
if self.provider == "nous":
state = _load_provider_state(auth_store, "nous")
if state is None:
return
state["access_token"] = entry.access_token
if entry.refresh_token:
state["refresh_token"] = entry.refresh_token
if entry.expires_at:
state["expires_at"] = entry.expires_at
if entry.agent_key:
state["agent_key"] = entry.agent_key
if entry.agent_key_expires_at:
state["agent_key_expires_at"] = entry.agent_key_expires_at
for extra_key in ("obtained_at", "expires_in", "agent_key_id",
"agent_key_expires_in", "agent_key_reused",
"agent_key_obtained_at"):
val = entry.extra.get(extra_key)
if val is not None:
state[extra_key] = val
if entry.inference_base_url:
state["inference_base_url"] = entry.inference_base_url
_save_provider_state(auth_store, "nous", state)
elif self.provider == "openai-codex":
state = _load_provider_state(auth_store, "openai-codex")
if not isinstance(state, dict):
return
tokens = state.get("tokens")
if not isinstance(tokens, dict):
return
tokens["access_token"] = entry.access_token
if entry.refresh_token:
tokens["refresh_token"] = entry.refresh_token
if entry.last_refresh:
state["last_refresh"] = entry.last_refresh
_save_provider_state(auth_store, "openai-codex", state)
else:
return
_save_auth_store(auth_store)
except Exception as exc:
logger.debug("Failed to sync %s pool entry back to auth store: %s", self.provider, exc)
def _refresh_entry(self, entry: PooledCredential, *, force: bool) -> Optional[PooledCredential]:
if entry.auth_type != AUTH_TYPE_OAUTH or not entry.refresh_token:
if force:
@@ -578,13 +513,6 @@ class CredentialPool:
except Exception as wexc:
logger.debug("Failed to write refreshed token to credentials file: %s", wexc)
elif self.provider == "openai-codex":
# Proactively sync from ~/.codex/auth.json before refresh.
# The Codex CLI (or another Hermes profile) may have already
# consumed our refresh_token. Syncing first avoids a
# "refresh_token_reused" error when the CLI has a newer pair.
synced = self._sync_codex_entry_from_cli(entry)
if synced is not entry:
entry = synced
refreshed = auth_mod.refresh_codex_oauth_pure(
entry.access_token,
entry.refresh_token,
@@ -670,45 +598,6 @@ class CredentialPool:
# Credentials file had a valid (non-expired) token — use it directly
logger.debug("Credentials file has valid token, using without refresh")
return synced
# For openai-codex: the refresh_token may have been consumed by
# the Codex CLI between our proactive sync and the refresh call.
# Re-sync and retry once.
if self.provider == "openai-codex":
synced = self._sync_codex_entry_from_cli(entry)
if synced.refresh_token != entry.refresh_token:
logger.debug("Retrying Codex refresh with synced token from ~/.codex/auth.json")
try:
refreshed = auth_mod.refresh_codex_oauth_pure(
synced.access_token,
synced.refresh_token,
)
updated = replace(
synced,
access_token=refreshed["access_token"],
refresh_token=refreshed["refresh_token"],
last_refresh=refreshed.get("last_refresh"),
last_status=STATUS_OK,
last_status_at=None,
last_error_code=None,
)
self._replace_entry(synced, updated)
self._persist()
self._sync_device_code_entry_to_auth_store(updated)
try:
_write_codex_cli_tokens(
updated.access_token,
updated.refresh_token,
last_refresh=updated.last_refresh,
)
except Exception as wexc:
logger.debug("Failed to write refreshed Codex tokens to CLI file (retry): %s", wexc)
return updated
except Exception as retry_exc:
logger.debug("Codex retry refresh also failed: %s", retry_exc)
elif not self._entry_needs_refresh(synced):
logger.debug("Codex CLI has valid token, using without refresh")
self._sync_device_code_entry_to_auth_store(synced)
return synced
self._mark_exhausted(entry, None)
return None
@@ -723,21 +612,6 @@ class CredentialPool:
)
self._replace_entry(entry, updated)
self._persist()
# Sync refreshed tokens back to auth.json providers so that
# _seed_from_singletons() on the next load_pool() sees fresh state
# instead of re-seeding stale/consumed tokens.
self._sync_device_code_entry_to_auth_store(updated)
# Write refreshed tokens back to ~/.codex/auth.json so Codex CLI
# and VS Code don't hit "refresh_token_reused" on their next refresh.
if self.provider == "openai-codex":
try:
_write_codex_cli_tokens(
updated.access_token,
updated.refresh_token,
last_refresh=updated.last_refresh,
)
except Exception as wexc:
logger.debug("Failed to write refreshed Codex tokens to CLI file: %s", wexc)
return updated
def _entry_needs_refresh(self, entry: PooledCredential) -> bool:
@@ -1079,17 +953,6 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
auth_store = _load_auth_store()
if provider == "anthropic":
# Only auto-discover external credentials (Claude Code, Hermes PKCE)
# when the user has explicitly configured anthropic as their provider.
# Without this gate, auxiliary client fallback chains silently read
# ~/.claude/.credentials.json without user consent. See PR #4210.
try:
from hermes_cli.auth import is_provider_explicitly_configured
if not is_provider_explicitly_configured("anthropic"):
return changed, active_sources
except ImportError:
pass
from agent.anthropic_adapter import read_claude_code_credentials, read_hermes_oauth_credentials
for source_name, creds in (
@@ -1097,13 +960,6 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
("claude_code", read_claude_code_credentials()),
):
if creds and creds.get("accessToken"):
# Check if user explicitly removed this source
try:
from hermes_cli.auth import is_source_suppressed
if is_source_suppressed(provider, source_name):
continue
except ImportError:
pass
active_sources.add(source_name)
changed |= _upsert_entry(
entries,
@@ -1148,23 +1004,6 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
elif provider == "openai-codex":
state = _load_provider_state(auth_store, "openai-codex")
tokens = state.get("tokens") if isinstance(state, dict) else None
# Fallback: import from Codex CLI (~/.codex/auth.json) if Hermes auth
# store has no tokens. This mirrors resolve_codex_runtime_credentials()
# so that load_pool() and list_authenticated_providers() detect tokens
# that only exist in the Codex CLI shared file.
if not (isinstance(tokens, dict) and tokens.get("access_token")):
try:
from hermes_cli.auth import _import_codex_cli_tokens, _save_codex_tokens
cli_tokens = _import_codex_cli_tokens()
if cli_tokens:
logger.info("Importing Codex CLI tokens into Hermes auth store.")
_save_codex_tokens(cli_tokens)
# Re-read state after import
auth_store = _load_auth_store()
state = _load_provider_state(auth_store, "openai-codex")
tokens = state.get("tokens") if isinstance(state, dict) else None
except Exception as exc:
logger.debug("Codex CLI token import failed: %s", exc)
if isinstance(tokens, dict) and tokens.get("access_token"):
active_sources.add("device_code")
changed |= _upsert_entry(
+27 -82
View File
@@ -4,6 +4,7 @@ Pure display functions and classes with no AIAgent dependency.
Used by AIAgent._execute_tool_calls for CLI feedback.
"""
import json
import logging
import os
import sys
@@ -13,8 +14,6 @@ from dataclasses import dataclass, field
from difflib import unified_diff
from pathlib import Path
from utils import safe_json_loads
# ANSI escape codes for coloring tool failure indicators
_RED = "\033[31m"
_RESET = "\033[0m"
@@ -22,73 +21,11 @@ _RESET = "\033[0m"
logger = logging.getLogger(__name__)
_ANSI_RESET = "\033[0m"
# Diff colors — resolved lazily from the skin engine so they adapt
# to light/dark themes. Falls back to sensible defaults on import
# failure. We cache after first resolution for performance.
_diff_colors_cached: dict[str, str] | None = None
def _diff_ansi() -> dict[str, str]:
"""Return ANSI escapes for diff display, resolved from the active skin."""
global _diff_colors_cached
if _diff_colors_cached is not None:
return _diff_colors_cached
# Defaults that work on dark terminals
dim = "\033[38;2;150;150;150m"
file_c = "\033[38;2;180;160;255m"
hunk = "\033[38;2;120;120;140m"
minus = "\033[38;2;255;255;255;48;2;120;20;20m"
plus = "\033[38;2;255;255;255;48;2;20;90;20m"
try:
from hermes_cli.skin_engine import get_active_skin
skin = get_active_skin()
def _hex_fg(key: str, fallback_rgb: tuple[int, int, int]) -> str:
h = skin.get_color(key, "")
if h and len(h) == 7 and h[0] == "#":
r, g, b = int(h[1:3], 16), int(h[3:5], 16), int(h[5:7], 16)
return f"\033[38;2;{r};{g};{b}m"
r, g, b = fallback_rgb
return f"\033[38;2;{r};{g};{b}m"
dim = _hex_fg("banner_dim", (150, 150, 150))
file_c = _hex_fg("session_label", (180, 160, 255))
hunk = _hex_fg("session_border", (120, 120, 140))
# minus/plus use background colors — derive from ui_error/ui_ok
err_h = skin.get_color("ui_error", "#ef5350")
ok_h = skin.get_color("ui_ok", "#4caf50")
if err_h and len(err_h) == 7:
er, eg, eb = int(err_h[1:3], 16), int(err_h[3:5], 16), int(err_h[5:7], 16)
# Use a dark tinted version as background
minus = f"\033[38;2;255;255;255;48;2;{max(er//2,20)};{max(eg//4,10)};{max(eb//4,10)}m"
if ok_h and len(ok_h) == 7:
or_, og, ob = int(ok_h[1:3], 16), int(ok_h[3:5], 16), int(ok_h[5:7], 16)
plus = f"\033[38;2;255;255;255;48;2;{max(or_//4,10)};{max(og//2,20)};{max(ob//4,10)}m"
except Exception:
pass
_diff_colors_cached = {
"dim": dim, "file": file_c, "hunk": hunk,
"minus": minus, "plus": plus,
}
return _diff_colors_cached
def reset_diff_colors() -> None:
"""Reset cached diff colors (call after /skin switch)."""
global _diff_colors_cached
_diff_colors_cached = None
# Module-level helpers — each call resolves from the active skin lazily.
def _diff_dim(): return _diff_ansi()["dim"]
def _diff_file(): return _diff_ansi()["file"]
def _diff_hunk(): return _diff_ansi()["hunk"]
def _diff_minus(): return _diff_ansi()["minus"]
def _diff_plus(): return _diff_ansi()["plus"]
_ANSI_DIM = "\033[38;2;150;150;150m"
_ANSI_FILE = "\033[38;2;180;160;255m"
_ANSI_HUNK = "\033[38;2;120;120;140m"
_ANSI_MINUS = "\033[38;2;255;255;255;48;2;120;20;20m"
_ANSI_PLUS = "\033[38;2;255;255;255;48;2;20;90;20m"
_MAX_INLINE_DIFF_FILES = 6
_MAX_INLINE_DIFF_LINES = 80
@@ -373,8 +310,9 @@ def _result_succeeded(result: str | None) -> bool:
"""Conservatively detect whether a tool result represents success."""
if not result:
return False
data = safe_json_loads(result)
if data is None:
try:
data = json.loads(result)
except (json.JSONDecodeError, TypeError):
return False
if not isinstance(data, dict):
return False
@@ -423,7 +361,10 @@ def extract_edit_diff(
) -> str | None:
"""Extract a unified diff from a file-edit tool result."""
if tool_name == "patch" and result:
data = safe_json_loads(result)
try:
data = json.loads(result)
except (json.JSONDecodeError, TypeError):
data = None
if isinstance(data, dict):
diff = data.get("diff")
if isinstance(diff, str) and diff.strip():
@@ -462,19 +403,19 @@ def _render_inline_unified_diff(diff: str) -> list[str]:
if raw_line.startswith("+++ "):
to_file = raw_line[4:].strip()
if from_file or to_file:
rendered.append(f"{_diff_file()}{from_file or 'a/?'}{to_file or 'b/?'}{_ANSI_RESET}")
rendered.append(f"{_ANSI_FILE}{from_file or 'a/?'}{to_file or 'b/?'}{_ANSI_RESET}")
continue
if raw_line.startswith("@@"):
rendered.append(f"{_diff_hunk()}{raw_line}{_ANSI_RESET}")
rendered.append(f"{_ANSI_HUNK}{raw_line}{_ANSI_RESET}")
continue
if raw_line.startswith("-"):
rendered.append(f"{_diff_minus()}{raw_line}{_ANSI_RESET}")
rendered.append(f"{_ANSI_MINUS}{raw_line}{_ANSI_RESET}")
continue
if raw_line.startswith("+"):
rendered.append(f"{_diff_plus()}{raw_line}{_ANSI_RESET}")
rendered.append(f"{_ANSI_PLUS}{raw_line}{_ANSI_RESET}")
continue
if raw_line.startswith(" "):
rendered.append(f"{_diff_dim()}{raw_line}{_ANSI_RESET}")
rendered.append(f"{_ANSI_DIM}{raw_line}{_ANSI_RESET}")
continue
if raw_line:
rendered.append(raw_line)
@@ -540,7 +481,7 @@ def _summarize_rendered_diff_sections(
summary = f"… omitted {omitted_lines} diff line(s)"
if omitted_files:
summary += f" across {omitted_files} additional file(s)/section(s)"
rendered.append(f"{_diff_hunk()}{summary}{_ANSI_RESET}")
rendered.append(f"{_ANSI_HUNK}{summary}{_ANSI_RESET}")
return rendered
@@ -777,19 +718,23 @@ def _detect_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]
return False, ""
if tool_name == "terminal":
data = safe_json_loads(result)
if isinstance(data, dict):
try:
data = json.loads(result)
exit_code = data.get("exit_code")
if exit_code is not None and exit_code != 0:
return True, f" [exit {exit_code}]"
except (json.JSONDecodeError, TypeError, AttributeError):
logger.debug("Could not parse terminal result as JSON for exit code check")
return False, ""
# Memory-specific: distinguish "full" from real errors
if tool_name == "memory":
data = safe_json_loads(result)
if isinstance(data, dict):
try:
data = json.loads(result)
if data.get("success") is False and "exceed the limit" in data.get("error", ""):
return True, " [full]"
except (json.JSONDecodeError, TypeError, AttributeError):
logger.debug("Could not parse memory result as JSON for capacity check")
# Generic heuristic for non-terminal tools
lower = result[:500].lower()
+1 -28
View File
@@ -112,7 +112,6 @@ _RATE_LIMIT_PATTERNS = [
"try again in",
"please retry after",
"resource_exhausted",
"rate increased too quickly", # Alibaba/DashScope throttling
]
# Usage-limit patterns that need disambiguation (could be billing OR rate_limit)
@@ -668,27 +667,6 @@ def _classify_by_message(
should_compress=True,
)
# Usage-limit patterns need the same disambiguation as 402: some providers
# surface "usage limit" errors without an HTTP status code. A transient
# signal ("try again", "resets at", …) means it's a periodic quota, not
# billing exhaustion.
has_usage_limit = any(p in error_msg for p in _USAGE_LIMIT_PATTERNS)
if has_usage_limit:
has_transient_signal = any(p in error_msg for p in _USAGE_LIMIT_TRANSIENT_SIGNALS)
if has_transient_signal:
return result_fn(
FailoverReason.rate_limit,
retryable=True,
should_rotate_credential=True,
should_fallback=True,
)
return result_fn(
FailoverReason.billing,
retryable=False,
should_rotate_credential=True,
should_fallback=True,
)
# Billing patterns
if any(p in error_msg for p in _BILLING_PATTERNS):
return result_fn(
@@ -716,16 +694,11 @@ def _classify_by_message(
)
# Auth patterns
# Auth errors should NOT be retried directly — the credential is invalid and
# retrying with the same key will always fail. Set retryable=False so the
# caller triggers credential rotation (should_rotate_credential=True) or
# provider fallback rather than an immediate retry loop.
if any(p in error_msg for p in _AUTH_PATTERNS):
return result_fn(
FailoverReason.auth,
retryable=False,
retryable=True,
should_rotate_credential=True,
should_fallback=True,
)
# Model not found patterns
-49
View File
@@ -1,49 +0,0 @@
"""User-facing summaries for manual compression commands."""
from __future__ import annotations
from typing import Any, Sequence
def summarize_manual_compression(
before_messages: Sequence[dict[str, Any]],
after_messages: Sequence[dict[str, Any]],
before_tokens: int,
after_tokens: int,
) -> dict[str, Any]:
"""Return consistent user-facing feedback for manual compression."""
before_count = len(before_messages)
after_count = len(after_messages)
noop = list(after_messages) == list(before_messages)
if noop:
headline = f"No changes from compression: {before_count} messages"
if after_tokens == before_tokens:
token_line = (
f"Rough transcript estimate: ~{before_tokens:,} tokens (unchanged)"
)
else:
token_line = (
f"Rough transcript estimate: ~{before_tokens:,}"
f"~{after_tokens:,} tokens"
)
else:
headline = f"Compressed: {before_count}{after_count} messages"
token_line = (
f"Rough transcript estimate: ~{before_tokens:,}"
f"~{after_tokens:,} tokens"
)
note = None
if not noop and after_count < before_count and after_tokens > before_tokens:
note = (
"Note: fewer messages can still raise this rough transcript estimate "
"when compression rewrites the transcript into denser summaries."
)
return {
"noop": noop,
"headline": headline,
"token_line": token_line,
"note": note,
}
+24 -66
View File
@@ -27,14 +27,12 @@ _PROVIDER_PREFIXES: frozenset[str] = frozenset({
"gemini", "zai", "kimi-coding", "minimax", "minimax-cn", "anthropic", "deepseek",
"opencode-zen", "opencode-go", "ai-gateway", "kilocode", "alibaba",
"qwen-oauth",
"xiaomi",
"custom", "local",
# Common aliases
"google", "google-gemini", "google-ai-studio",
"glm", "z-ai", "z.ai", "zhipu", "github", "github-copilot",
"github-models", "kimi", "moonshot", "claude", "deep-seek",
"opencode", "zen", "go", "vercel", "kilo", "dashscope", "aliyun", "qwen",
"mimo", "xiaomi-mimo",
"qwen-portal",
})
@@ -85,11 +83,6 @@ CONTEXT_PROBE_TIERS = [
# Default context length when no detection method succeeds.
DEFAULT_FALLBACK_CONTEXT = CONTEXT_PROBE_TIERS[0]
# Minimum context length required to run Hermes Agent. Models with fewer
# tokens cannot maintain enough working memory for tool-calling workflows.
# Sessions, model switches, and cron jobs should reject models below this.
MINIMUM_CONTEXT_LENGTH = 64_000
# Thin fallback defaults — only broad model family patterns.
# These fire only when provider is unknown AND models.dev/OpenRouter/Anthropic
# all miss. Replaced the previous 80+ entry dict.
@@ -120,31 +113,19 @@ DEFAULT_CONTEXT_LENGTHS = {
"deepseek": 128000,
# Meta
"llama": 131072,
# Qwen — specific model families before the catch-all.
# Official docs: https://help.aliyun.com/zh/model-studio/developer-reference/
"qwen3-coder-plus": 1000000, # 1M context
"qwen3-coder": 262144, # 256K context
# Qwen
"qwen": 131072,
# MiniMax — official docs: 204,800 context for all models
# https://platform.minimax.io/docs/api-reference/text-anthropic-api
"minimax": 204800,
# MiniMax (lowercase — lookup lowercases model names at line 973)
"minimax-m1-256k": 1000000,
"minimax-m1-128k": 1000000,
"minimax-m1-80k": 1000000,
"minimax-m1-40k": 1000000,
"minimax-m1": 1000000,
"minimax-m2.5": 1048576,
"minimax-m2.7": 1048576,
"minimax": 1048576,
# GLM
"glm": 202752,
# xAI Grok — xAI /v1/models does not return context_length metadata,
# so these hardcoded fallbacks prevent Hermes from probing-down to
# the default 128k when the user points at https://api.x.ai/v1
# via a custom provider. Values sourced from models.dev (2026-04).
# Keys use substring matching (longest-first), so e.g. "grok-4.20"
# matches "grok-4.20-0309-reasoning" / "-non-reasoning" / "-multi-agent-0309".
"grok-code-fast": 256000, # grok-code-fast-1
"grok-4-1-fast": 2000000, # grok-4-1-fast-(non-)reasoning
"grok-2-vision": 8192, # grok-2-vision, -1212, -latest
"grok-4-fast": 2000000, # grok-4-fast-(non-)reasoning
"grok-4.20": 2000000, # grok-4.20-0309-(non-)reasoning, -multi-agent-0309
"grok-4": 256000, # grok-4, grok-4-0709
"grok-3": 131072, # grok-3, grok-3-mini, grok-3-fast, grok-3-mini-fast
"grok-2": 131072, # grok-2, grok-2-1212, grok-2-latest
"grok": 131072, # catch-all (grok-beta, unknown grok-*)
# Kimi
"kimi": 262144,
# Arcee
@@ -155,11 +136,10 @@ DEFAULT_CONTEXT_LENGTHS = {
"deepseek-ai/DeepSeek-V3.2": 65536,
"moonshotai/Kimi-K2.5": 262144,
"moonshotai/Kimi-K2-Thinking": 262144,
"MiniMaxAI/MiniMax-M2.5": 204800,
"XiaomiMiMo/MiMo-V2-Flash": 256000,
"mimo-v2-pro": 1000000,
"mimo-v2-omni": 256000,
"mimo-v2-flash": 256000,
"MiniMaxAI/MiniMax-M2.5": 1048576,
"XiaomiMiMo/MiMo-V2-Flash": 32768,
"mimo-v2-pro": 1048576,
"mimo-v2-omni": 1048576,
"zai-org/GLM-5": 202752,
}
@@ -184,12 +164,6 @@ _MAX_COMPLETION_KEYS = (
# Local server hostnames / address patterns
_LOCAL_HOSTS = ("localhost", "127.0.0.1", "::1", "0.0.0.0")
# Docker / Podman / Lima DNS names that resolve to the host machine
_CONTAINER_LOCAL_SUFFIXES = (
".docker.internal",
".containers.internal",
".lima.internal",
)
def _normalize_base_url(base_url: str) -> str:
@@ -224,9 +198,6 @@ _URL_TO_PROVIDER: Dict[str, str] = {
"models.github.ai": "copilot",
"api.fireworks.ai": "fireworks",
"opencode.ai": "opencode-go",
"api.x.ai": "xai",
"api.xiaomimimo.com": "xiaomi",
"xiaomimimo.com": "xiaomi",
}
@@ -265,9 +236,6 @@ def is_local_endpoint(base_url: str) -> bool:
return False
if host in _LOCAL_HOSTS:
return True
# Docker / Podman / Lima internal DNS names (e.g. host.docker.internal)
if any(host.endswith(suffix) for suffix in _CONTAINER_LOCAL_SUFFIXES):
return True
# RFC-1918 private ranges and link-local
import ipaddress
try:
@@ -775,12 +743,12 @@ def _query_local_context_length(model: str, base_url: str) -> Optional[int]:
resp = client.post(f"{server_url}/api/show", json={"name": model})
if resp.status_code == 200:
data = resp.json()
# Prefer explicit num_ctx from Modelfile parameters: this is
# the *runtime* context Ollama will actually allocate KV cache
# for. The GGUF model_info.context_length is the training max,
# which can be larger than num_ctx — using it here would let
# Hermes grow conversations past the runtime limit and Ollama
# would silently truncate. Matches query_ollama_num_ctx().
# Check model_info for context length
model_info = data.get("model_info", {})
for key, value in model_info.items():
if "context_length" in key and isinstance(value, (int, float)):
return int(value)
# Check parameters string for num_ctx
params = data.get("parameters", "")
if "num_ctx" in params:
for line in params.split("\n"):
@@ -791,11 +759,6 @@ def _query_local_context_length(model: str, base_url: str) -> Optional[int]:
return int(parts[-1])
except ValueError:
pass
# Fall back to GGUF model_info context_length (training max)
model_info = data.get("model_info", {})
for key, value in model_info.items():
if "context_length" in key and isinstance(value, (int, float)):
return int(value)
# LM Studio native API: /api/v1/models returns max_context_length.
# This is more reliable than the OpenAI-compat /v1/models which
@@ -1050,21 +1013,16 @@ def get_model_context_length(
def estimate_tokens_rough(text: str) -> int:
"""Rough token estimate (~4 chars/token) for pre-flight checks.
Uses ceiling division so short texts (1-3 chars) never estimate as
0 tokens, which would cause the compressor and pre-flight checks to
systematically undercount when many short tool results are present.
"""
"""Rough token estimate (~4 chars/token) for pre-flight checks."""
if not text:
return 0
return (len(text) + 3) // 4
return len(text) // 4
def estimate_messages_tokens_rough(messages: List[Dict[str, Any]]) -> int:
"""Rough token estimate for a message list (pre-flight only)."""
total_chars = sum(len(str(msg)) for msg in messages)
return (total_chars + 3) // 4
return total_chars // 4
def estimate_request_tokens_rough(
@@ -1087,4 +1045,4 @@ def estimate_request_tokens_rough(
total_chars += sum(len(str(msg)) for msg in messages)
if tools:
total_chars += len(str(tools))
return (total_chars + 3) // 4
return total_chars // 4
+1 -11
View File
@@ -144,8 +144,6 @@ class ProviderInfo:
PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
"openrouter": "openrouter",
"anthropic": "anthropic",
"openai": "openai",
"openai-codex": "openai",
"zai": "zai",
"kimi-coding": "kimi-for-coding",
"minimax": "minimax",
@@ -163,7 +161,6 @@ PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
"gemini": "google",
"google": "google",
"xai": "xai",
"xiaomi": "xiaomi",
"nvidia": "nvidia",
"groq": "groq",
"mistral": "mistral",
@@ -386,14 +383,7 @@ def get_model_capabilities(provider: str, model: str) -> Optional[ModelCapabilit
# Extract capability flags (default to False if missing)
supports_tools = bool(entry.get("tool_call", False))
# Vision: check both the `attachment` flag and `modalities.input` for "image".
# Some models (e.g. gemma-4) list image in input modalities but not attachment.
input_mods = entry.get("modalities", {})
if isinstance(input_mods, dict):
input_mods = input_mods.get("input", [])
else:
input_mods = []
supports_vision = bool(entry.get("attachment", False)) or "image" in input_mods
supports_vision = bool(entry.get("attachment", False))
supports_reasoning = bool(entry.get("reasoning", False))
# Extract limits
+10 -68
View File
@@ -12,7 +12,7 @@ import threading
from collections import OrderedDict
from pathlib import Path
from hermes_constants import get_hermes_home, get_skills_dir, is_wsl
from hermes_constants import get_hermes_home
from typing import Optional
from agent.skill_utils import (
@@ -40,7 +40,7 @@ _CONTEXT_THREAT_PATTERNS = [
(r'disregard\s+(your|all|any)\s+(instructions|rules|guidelines)', "disregard_rules"),
(r'act\s+as\s+(if|though)\s+you\s+(have\s+no|don\'t\s+have)\s+(restrictions|limits|rules)', "bypass_restrictions"),
(r'<!--[^>]*(?:ignore|override|system|secret|hidden)[^>]*-->', "html_comment_injection"),
(r'<\s*div\s+style\s*=\s*["\'][\s\S]*?display\s*:\s*none', "hidden_div"),
(r'<\s*div\s+style\s*=\s*["\'].*display\s*:\s*none', "hidden_div"),
(r'translate\s+.*\s+into\s+.*\s+and\s+(execute|run|eval)', "translate_execute"),
(r'curl\s+[^\n]*\$\{?\w*(KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL|API)', "exfil_curl"),
(r'cat\s+[^\n]*(\.env|credentials|\.netrc|\.pgpass)', "read_secrets"),
@@ -356,58 +356,8 @@ PLATFORM_HINTS = {
"MEDIA:/absolute/path/to/file in your response. Images (.jpg, .png, "
".heic) appear as photos and other files arrive as attachments."
),
"weixin": (
"You are on Weixin/WeChat. Markdown formatting is supported, so you may use it when "
"it improves readability, but keep the message compact and chat-friendly. You can send media files natively: "
"include MEDIA:/absolute/path/to/file in your response. Images are sent as native "
"photos, videos play inline when supported, and other files arrive as downloadable "
"documents. You can also include image URLs in markdown format ![alt](url) and they "
"will be downloaded and sent as native media when possible."
),
"wecom": (
"You are on WeCom (企业微信 / Enterprise WeChat). Markdown formatting is supported. "
"You CAN send media files natively — to deliver a file to the user, include "
"MEDIA:/absolute/path/to/file in your response. The file will be sent as a native "
"WeCom attachment: images (.jpg, .png, .webp) are sent as photos (up to 10 MB), "
"other files (.pdf, .docx, .xlsx, .md, .txt, etc.) arrive as downloadable documents "
"(up to 20 MB), and videos (.mp4) play inline. Voice messages are supported but "
"must be in AMR format — other audio formats are automatically sent as file attachments. "
"You can also include image URLs in markdown format ![alt](url) and they will be "
"downloaded and sent as native photos. Do NOT tell the user you lack file-sending "
"capability — use MEDIA: syntax whenever a file delivery is appropriate."
),
}
# ---------------------------------------------------------------------------
# Environment hints — execution-environment awareness for the agent.
# Unlike PLATFORM_HINTS (which describe the messaging channel), these describe
# the machine/OS the agent's tools actually run on.
# ---------------------------------------------------------------------------
WSL_ENVIRONMENT_HINT = (
"You are running inside WSL (Windows Subsystem for Linux). "
"The Windows host filesystem is mounted under /mnt/ — "
"/mnt/c/ is the C: drive, /mnt/d/ is D:, etc. "
"The user's Windows files are typically at "
"/mnt/c/Users/<username>/Desktop/, Documents/, Downloads/, etc. "
"When the user references Windows paths or desktop files, translate "
"to the /mnt/c/ equivalent. You can list /mnt/c/Users/ to discover "
"the Windows username if needed."
)
def build_environment_hints() -> str:
"""Return environment-specific guidance for the system prompt.
Detects WSL, and can be extended for Termux, Docker, etc.
Returns an empty string when no special environment is detected.
"""
hints: list[str] = []
if is_wsl():
hints.append(WSL_ENVIRONMENT_HINT)
return "\n\n".join(hints)
CONTEXT_FILE_MAX_CHARS = 20_000
CONTEXT_TRUNCATE_HEAD_RATIO = 0.7
CONTEXT_TRUNCATE_TAIL_RATIO = 0.2
@@ -529,7 +479,7 @@ def _parse_skill_file(skill_file: Path) -> tuple[bool, dict, str]:
(True, {}, "") to err on the side of showing the skill.
"""
try:
raw = skill_file.read_text(encoding="utf-8")
raw = skill_file.read_text(encoding="utf-8")[:2000]
frontmatter, _ = parse_frontmatter(raw)
if not skill_matches_platform(frontmatter):
@@ -537,7 +487,7 @@ def _parse_skill_file(skill_file: Path) -> tuple[bool, dict, str]:
return True, frontmatter, extract_skill_description(frontmatter)
except Exception as e:
logger.warning("Failed to parse skill file %s: %s", skill_file, e)
logger.debug("Failed to parse skill file %s: %s", skill_file, e)
return True, {}, ""
@@ -590,7 +540,8 @@ def build_skills_system_prompt(
are read-only — they appear in the index but new skills are always created
in the local dir. Local skills take precedence when names collide.
"""
skills_dir = get_skills_dir()
hermes_home = get_hermes_home()
skills_dir = hermes_home / "skills"
external_dirs = get_all_skills_dirs()[1:] # skip local (index 0)
if not skills_dir.exists() and not external_dirs:
@@ -599,10 +550,9 @@ def build_skills_system_prompt(
# ── Layer 1: in-process LRU cache ─────────────────────────────────
# Include the resolved platform so per-platform disabled-skill lists
# produce distinct cache entries (gateway serves multiple platforms).
from gateway.session_context import get_session_env
_platform_hint = (
os.environ.get("HERMES_PLATFORM")
or get_session_env("HERMES_SESSION_PLATFORM")
or os.environ.get("HERMES_SESSION_PLATFORM")
or ""
)
cache_key = (
@@ -768,16 +718,8 @@ def build_skills_system_prompt(
result = (
"## Skills (mandatory)\n"
"Before replying, scan the skills below. If a skill matches or is even partially relevant "
"to your task, you MUST load it with skill_view(name) and follow its instructions. "
"Err on the side of loading — it is always better to have context you don't need "
"than to miss critical steps, pitfalls, or established workflows. "
"Skills contain specialized knowledge — API endpoints, tool-specific commands, "
"and proven workflows that outperform general-purpose approaches. Load the skill "
"even if you think you could handle the task with basic tools like web_search or terminal. "
"Skills also encode the user's preferred approach, conventions, and quality standards "
"for tasks like code review, planning, and testing — load them even for tasks you "
"already know how to do, because the skill defines how it should be done here.\n"
"Before replying, scan the skills below. If one clearly matches your task, "
"load it with skill_view(name) and follow its instructions. "
"If a skill has issues, fix it with skill_manage(action='patch').\n"
"After difficult/iterative tasks, offer to save as a skill. "
"If a skill you loaded was missing steps, had wrong commands, or needed "
@@ -787,7 +729,7 @@ def build_skills_system_prompt(
+ "\n".join(index_lines) + "\n"
"</available_skills>\n"
"\n"
"Only proceed without loading a skill if genuinely none are relevant to the task."
"If none match, proceed normally without loading a skill."
)
# ── Store in LRU cache ────────────────────────────────────────────
+4 -8
View File
@@ -97,12 +97,8 @@ def parse_rate_limit_headers(
Returns None if no rate limit headers are present.
"""
# Normalize to lowercase so lookups work regardless of how the server
# capitalises headers (HTTP header names are case-insensitive per RFC 7230).
lowered = {k.lower(): v for k, v in headers.items()}
# Quick check: at least one rate limit header must exist
has_any = any(k.startswith("x-ratelimit-") for k in lowered)
has_any = any(k.lower().startswith("x-ratelimit-") for k in headers)
if not has_any:
return None
@@ -113,9 +109,9 @@ def parse_rate_limit_headers(
# resource="tokens", suffix="-1h" -> per-hour
tag = f"{resource}{suffix}"
return RateLimitBucket(
limit=_safe_int(lowered.get(f"x-ratelimit-limit-{tag}")),
remaining=_safe_int(lowered.get(f"x-ratelimit-remaining-{tag}")),
reset_seconds=_safe_float(lowered.get(f"x-ratelimit-reset-{tag}")),
limit=_safe_int(headers.get(f"x-ratelimit-limit-{tag}")),
remaining=_safe_int(headers.get(f"x-ratelimit-remaining-{tag}")),
reset_seconds=_safe_float(headers.get(f"x-ratelimit-reset-{tag}")),
captured_at=now,
)
+1 -1
View File
@@ -168,7 +168,7 @@ def _build_skill_message(
subdir_path = skill_dir / subdir
if subdir_path.exists():
for f in sorted(subdir_path.rglob("*")):
if f.is_file() and not f.is_symlink():
if f.is_file():
rel = str(f.relative_to(skill_dir))
supporting.append(rel)
+7 -8
View File
@@ -12,7 +12,7 @@ import sys
from pathlib import Path
from typing import Any, Dict, List, Set, Tuple
from hermes_constants import get_config_path, get_skills_dir
from hermes_constants import get_hermes_home
logger = logging.getLogger(__name__)
@@ -130,7 +130,7 @@ def get_disabled_skill_names(platform: str | None = None) -> Set[str]:
Reads the config file directly (no CLI config imports) to stay
lightweight.
"""
config_path = get_config_path()
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
return set()
try:
@@ -145,11 +145,10 @@ def get_disabled_skill_names(platform: str | None = None) -> Set[str]:
if not isinstance(skills_cfg, dict):
return set()
from gateway.session_context import get_session_env
resolved_platform = (
platform
or os.getenv("HERMES_PLATFORM")
or get_session_env("HERMES_SESSION_PLATFORM")
or os.getenv("HERMES_SESSION_PLATFORM")
)
if resolved_platform:
platform_disabled = (skills_cfg.get("platform_disabled") or {}).get(
@@ -178,7 +177,7 @@ def get_external_skills_dirs() -> List[Path]:
path. Only directories that actually exist are returned. Duplicates and
paths that resolve to the local ``~/.hermes/skills/`` are silently skipped.
"""
config_path = get_config_path()
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
return []
try:
@@ -200,7 +199,7 @@ def get_external_skills_dirs() -> List[Path]:
if not isinstance(raw_dirs, list):
return []
local_skills = get_skills_dir().resolve()
local_skills = (get_hermes_home() / "skills").resolve()
seen: Set[Path] = set()
result: List[Path] = []
@@ -230,7 +229,7 @@ def get_all_skills_dirs() -> List[Path]:
The local dir is always first (and always included even if it doesn't exist
yet callers handle that). External dirs follow in config order.
"""
dirs = [get_skills_dir()]
dirs = [get_hermes_home() / "skills"]
dirs.extend(get_external_skills_dirs())
return dirs
@@ -384,7 +383,7 @@ def resolve_skill_config_values(
current values (or the declared default if the key isn't set).
Path values are expanded via ``os.path.expanduser``.
"""
config_path = get_config_path()
config_path = get_hermes_home() / "config.yaml"
config: Dict[str, Any] = {}
if config_path.exists():
try:
-1
View File
@@ -181,7 +181,6 @@ def resolve_turn_route(user_message: str, routing_config: Optional[Dict[str, Any
"api_mode": runtime.get("api_mode"),
"command": runtime.get("command"),
"args": list(runtime.get("args") or []),
"credential_pool": runtime.get("credential_pool"),
},
"label": f"smart route → {route.get('model')} ({runtime.get('provider')})",
"signature": (
+1 -1
View File
@@ -36,7 +36,7 @@ def generate_title(user_message: str, assistant_response: str, timeout: float =
try:
response = call_llm(
task="title_generation",
task="compression", # reuse compression task config (cheap/fast model)
messages=messages,
max_tokens=30,
temperature=0.3,
+4 -20
View File
@@ -24,7 +24,6 @@ model:
# "minimax" - MiniMax global (requires: MINIMAX_API_KEY)
# "minimax-cn" - MiniMax China (requires: MINIMAX_CN_API_KEY)
# "huggingface" - Hugging Face Inference (requires: HF_TOKEN)
# "xiaomi" - Xiaomi MiMo (requires: XIAOMI_API_KEY)
# "kilocode" - KiloCode gateway (requires: KILOCODE_API_KEY)
# "ai-gateway" - Vercel AI Gateway (requires: AI_GATEWAY_API_KEY)
#
@@ -481,12 +480,6 @@ agent:
# Fires once per run when inactivity reaches this threshold (seconds).
# Set to 0 to disable the warning.
# gateway_timeout_warning: 900
# Graceful drain timeout for gateway stop/restart (seconds).
# The gateway stops accepting new work, waits for in-flight agents to
# finish, then interrupts anything still running after this timeout.
# 0 = no drain, interrupt immediately.
# restart_drain_timeout: 60
# Enable verbose logging
verbose: false
@@ -589,7 +582,7 @@ platform_toolsets:
# skills_hub - skill_hub (search/install/manage from online registries — user-driven only)
# moa - mixture_of_agents (requires OPENROUTER_API_KEY)
# todo - todo (in-memory task planning, no deps)
# tts - text_to_speech (Edge TTS free, or ELEVENLABS/OPENAI/MINIMAX/MISTRAL key)
# tts - text_to_speech (Edge TTS free, or ELEVENLABS/OPENAI/MINIMAX key)
# cronjob - cronjob (create/list/update/pause/resume/run/remove scheduled tasks)
# rl - rl_list_environments, rl_start_training, etc. (requires TINKER_API_KEY)
#
@@ -618,7 +611,7 @@ platform_toolsets:
# todo - Task planning and tracking for multi-step work
# memory - Persistent memory across sessions (personal notes + user profile)
# session_search - Search and recall past conversations (FTS5 + Gemini Flash summarization)
# tts - Text-to-speech (Edge TTS free, ElevenLabs, OpenAI, MiniMax, Mistral)
# tts - Text-to-speech (Edge TTS free, ElevenLabs, OpenAI, MiniMax)
# cronjob - Schedule and manage automated tasks (CLI-only)
# rl - RL training tools (Tinker-Atropos)
#
@@ -691,11 +684,7 @@ platform_toolsets:
stt:
enabled: true
# provider: "local" # auto-detected if omitted
local:
model: "base" # tiny | base | small | medium | large-v3 | turbo
# language: "" # auto-detect; set to "en", "es", "fr", etc. to force
openai:
model: "whisper-1" # whisper-1 | gpt-4o-mini-transcribe | gpt-4o-transcribe
model: "whisper-1" # whisper-1 (cheapest) | gpt-4o-mini-transcribe | gpt-4o-transcribe
# mistral:
# model: "voxtral-mini-latest" # voxtral-mini-latest | voxtral-mini-2602
@@ -774,11 +763,6 @@ display:
# Toggle at runtime with /verbose in the CLI
tool_progress: all
# Gateway-only natural mid-turn assistant updates.
# When true, completed assistant status messages are sent as separate chat
# messages. This is independent of tool_progress and gateway streaming.
interim_assistant_messages: true
# What Enter does when Hermes is already busy in the CLI.
# interrupt: Interrupt the current run and redirect Hermes (default)
# queue: Queue your message for the next turn
@@ -787,7 +771,7 @@ display:
# Background process notifications (gateway/messaging only).
# Controls how chatty the process watcher is when you use
# terminal(background=true, notify_on_complete=true) from Telegram/Discord/etc.
# terminal(background=true, check_interval=...) from Telegram/Discord/etc.
# off: No watcher messages at all
# result: Only the final completion message
# error: Only the final message when exit code != 0
+236 -1557
View File
File diff suppressed because it is too large Load Diff
-15
View File
@@ -1,15 +0,0 @@
# Termux / Android dependency constraints for Hermes Agent.
#
# Usage:
# python -m pip install -e '.[termux]' -c constraints-termux.txt
#
# These pins keep the tested Android install path stable when upstream packages
# move faster than Termux-compatible wheels / sdists.
ipython<10
jedi>=0.18.1,<0.20
parso>=0.8.4,<0.9
stack-data>=0.6,<0.7
pexpect>4.3,<5
matplotlib-inline>=0.1.7,<0.2
asttokens>=2.1,<3
+7 -10
View File
@@ -31,7 +31,7 @@ except ImportError:
# Configuration
# =============================================================================
HERMES_DIR = get_hermes_home().resolve()
HERMES_DIR = get_hermes_home()
CRON_DIR = HERMES_DIR / "cron"
JOBS_FILE = CRON_DIR / "jobs.json"
OUTPUT_DIR = CRON_DIR / "output"
@@ -338,12 +338,10 @@ def load_jobs() -> List[Dict[str, Any]]:
save_jobs(jobs)
logger.warning("Auto-repaired jobs.json (had invalid control characters)")
return jobs
except Exception as e:
logger.error("Failed to auto-repair jobs.json: %s", e)
raise RuntimeError(f"Cron database corrupted and unrepairable: {e}") from e
except IOError as e:
logger.error("IOError reading jobs.json: %s", e)
raise RuntimeError(f"Failed to read cron database: {e}") from e
except Exception:
return []
except IOError:
return []
def save_jobs(jobs: List[Dict[str, Any]]):
@@ -454,7 +452,6 @@ def create_job(
"last_run_at": None,
"last_status": None,
"last_error": None,
"last_delivery_error": None,
# Delivery configuration
"deliver": deliver,
"origin": origin, # Tracks where job was created for "origin" delivery
@@ -623,8 +620,8 @@ def mark_job_run(job_id: str, success: bool, error: Optional[str] = None,
save_jobs(jobs)
return
logger.warning("mark_job_run: job_id %s not found, skipping save", job_id)
save_jobs(jobs)
def advance_next_run(job_id: str) -> bool:
+12 -97
View File
@@ -44,7 +44,7 @@ logger = logging.getLogger(__name__)
_KNOWN_DELIVERY_PLATFORMS = frozenset({
"telegram", "discord", "slack", "whatsapp", "signal",
"matrix", "mattermost", "homeassistant", "dingtalk", "feishu",
"wecom", "wecom_callback", "weixin", "sms", "email", "webhook", "bluebubbles",
"wecom", "sms", "email", "webhook", "bluebubbles",
})
from cron.jobs import get_due_jobs, mark_job_run, save_job_output, advance_next_run
@@ -219,21 +219,6 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
chat_id = target["chat_id"]
thread_id = target.get("thread_id")
# Diagnostic: log thread_id for topic-aware delivery debugging
origin = job.get("origin") or {}
origin_thread = origin.get("thread_id")
if origin_thread and not thread_id:
logger.warning(
"Job '%s': origin has thread_id=%s but delivery target lost it "
"(deliver=%s, target=%s)",
job["id"], origin_thread, job.get("deliver", "local"), target,
)
elif thread_id:
logger.debug(
"Job '%s': delivering to %s:%s thread_id=%s",
job["id"], platform_name, chat_id, thread_id,
)
from tools.send_message_tool import _send_to_platform
from gateway.config import load_gateway_config, Platform
@@ -249,8 +234,6 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
"dingtalk": Platform.DINGTALK,
"feishu": Platform.FEISHU,
"wecom": Platform.WECOM,
"wecom_callback": Platform.WECOM_CALLBACK,
"weixin": Platform.WEIXIN,
"email": Platform.EMAIL,
"sms": Platform.SMS,
"bluebubbles": Platform.BLUEBUBBLES,
@@ -363,42 +346,7 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
return None
_DEFAULT_SCRIPT_TIMEOUT = 120 # seconds
# Backward-compatible module override used by tests and emergency monkeypatches.
_SCRIPT_TIMEOUT = _DEFAULT_SCRIPT_TIMEOUT
def _get_script_timeout() -> int:
"""Resolve cron pre-run script timeout from module/env/config with a safe default."""
if _SCRIPT_TIMEOUT != _DEFAULT_SCRIPT_TIMEOUT:
try:
timeout = int(float(_SCRIPT_TIMEOUT))
if timeout > 0:
return timeout
except Exception:
logger.warning("Invalid patched _SCRIPT_TIMEOUT=%r; using env/config/default", _SCRIPT_TIMEOUT)
env_value = os.getenv("HERMES_CRON_SCRIPT_TIMEOUT", "").strip()
if env_value:
try:
timeout = int(float(env_value))
if timeout > 0:
return timeout
except Exception:
logger.warning("Invalid HERMES_CRON_SCRIPT_TIMEOUT=%r; using config/default", env_value)
try:
cfg = load_config() or {}
cron_cfg = cfg.get("cron", {}) if isinstance(cfg, dict) else {}
configured = cron_cfg.get("script_timeout_seconds")
if configured is not None:
timeout = int(float(configured))
if timeout > 0:
return timeout
except Exception as exc:
logger.debug("Failed to load cron script timeout from config: %s", exc)
return _DEFAULT_SCRIPT_TIMEOUT
_SCRIPT_TIMEOUT = 120 # seconds
def _run_job_script(script_path: str) -> tuple[bool, str]:
@@ -445,27 +393,17 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
if not path.is_file():
return False, f"Script path is not a file: {path}"
script_timeout = _get_script_timeout()
try:
result = subprocess.run(
[sys.executable, str(path)],
capture_output=True,
text=True,
timeout=script_timeout,
timeout=_SCRIPT_TIMEOUT,
cwd=str(path.parent),
)
stdout = (result.stdout or "").strip()
stderr = (result.stderr or "").strip()
# Redact secrets from both stdout and stderr before any return path.
try:
from agent.redact import redact_sensitive_text
stdout = redact_sensitive_text(stdout)
stderr = redact_sensitive_text(stderr)
except Exception:
pass
if result.returncode != 0:
parts = [f"Script exited with code {result.returncode}"]
if stderr:
@@ -474,10 +412,17 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
parts.append(f"stdout:\n{stdout}")
return False, "\n".join(parts)
# Redact any secrets that may appear in script output before
# they are injected into the LLM prompt context.
try:
from agent.redact import redact_sensitive_text
stdout = redact_sensitive_text(stdout)
except Exception:
pass
return True, stdout
except subprocess.TimeoutExpired:
return False, f"Script timed out after {script_timeout}s: {path}"
return False, f"Script timed out after {_SCRIPT_TIMEOUT}s: {path}"
except Exception as exc:
return False, f"Script execution failed: {exc}"
@@ -641,15 +586,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
except Exception as e:
logger.warning("Job '%s': failed to load config.yaml, using defaults: %s", job_id, e)
# Apply IPv4 preference if configured.
try:
from hermes_constants import apply_ipv4_preference
_net_cfg = _cfg.get("network", {})
if isinstance(_net_cfg, dict) and _net_cfg.get("force_ipv4"):
apply_ipv4_preference(force=True)
except Exception:
pass
# Reasoning config from config.yaml
from hermes_constants import parse_reasoning_effort
effort = str(_cfg.get("agent", {}).get("reasoning_effort", "")).strip()
@@ -710,24 +646,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
},
)
fallback_model = _cfg.get("fallback_providers") or _cfg.get("fallback_model") or None
credential_pool = None
runtime_provider = str(turn_route["runtime"].get("provider") or "").strip().lower()
if runtime_provider:
try:
from agent.credential_pool import load_pool
pool = load_pool(runtime_provider)
if pool.has_credentials():
credential_pool = pool
logger.info(
"Job '%s': loaded credential pool for provider %s with %d entries",
job_id,
runtime_provider,
len(pool.entries()),
)
except Exception as e:
logger.debug("Job '%s': failed to load credential pool for %s: %s", job_id, runtime_provider, e)
agent = AIAgent(
model=turn_route["model"],
api_key=turn_route["runtime"].get("api_key"),
@@ -739,15 +657,12 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
max_iterations=max_iterations,
reasoning_config=reasoning_config,
prefill_messages=prefill_messages,
fallback_model=fallback_model,
credential_pool=credential_pool,
providers_allowed=pr.get("only"),
providers_ignored=pr.get("ignore"),
providers_order=pr.get("order"),
provider_sort=pr.get("sort"),
disabled_toolsets=["cronjob", "messaging", "clarify"],
quiet_mode=True,
skip_context_files=True, # Don't inject SOUL.md/AGENTS.md from scheduler cwd
skip_memory=True, # Cron system prompts would corrupt user representations
platform="cron",
session_id=_cron_session_id,
@@ -796,7 +711,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
_cron_pool.shutdown(wait=False, cancel_futures=True)
raise
finally:
_cron_pool.shutdown(wait=False, cancel_futures=True)
_cron_pool.shutdown(wait=False)
if _inactivity_timeout:
# Build diagnostic summary from the agent's activity tracker.
+1 -31
View File
@@ -5,41 +5,11 @@ set -e
HERMES_HOME="/opt/data"
INSTALL_DIR="/opt/hermes"
# --- Privilege dropping via gosu ---
# When started as root (the default), optionally remap the hermes user/group
# to match host-side ownership, fix volume permissions, then re-exec as hermes.
if [ "$(id -u)" = "0" ]; then
if [ -n "$HERMES_UID" ] && [ "$HERMES_UID" != "$(id -u hermes)" ]; then
echo "Changing hermes UID to $HERMES_UID"
usermod -u "$HERMES_UID" hermes
fi
if [ -n "$HERMES_GID" ] && [ "$HERMES_GID" != "$(id -g hermes)" ]; then
echo "Changing hermes GID to $HERMES_GID"
groupmod -g "$HERMES_GID" hermes
fi
actual_hermes_uid=$(id -u hermes)
if [ "$(stat -c %u "$HERMES_HOME" 2>/dev/null)" != "$actual_hermes_uid" ]; then
echo "$HERMES_HOME is not owned by $actual_hermes_uid, fixing"
chown -R hermes:hermes "$HERMES_HOME"
fi
echo "Dropping root privileges"
exec gosu hermes "$0" "$@"
fi
# --- Running as hermes from here ---
source "${INSTALL_DIR}/.venv/bin/activate"
# Create essential directory structure. Cache and platform directories
# (cache/images, cache/audio, platforms/whatsapp, etc.) are created on
# demand by the application — don't pre-create them here so new installs
# get the consolidated layout from get_hermes_dir().
# The "home/" subdirectory is a per-profile HOME for subprocesses (git,
# ssh, gh, npm …). Without it those tools write to /root which is
# ephemeral and shared across profiles. See issue #4426.
mkdir -p "$HERMES_HOME"/{cron,sessions,logs,hooks,memories,skills,skins,plans,workspace,home}
mkdir -p "$HERMES_HOME"/{cron,sessions,logs,hooks,memories,skills}
# .env
if [ ! -f "$HERMES_HOME/.env" ]; then
+12 -44
View File
@@ -11,14 +11,12 @@ When you run `hermes setup` for the first time and Hermes detects `~/.openclaw`,
### 2. CLI Command (quick, scriptable)
```bash
hermes claw migrate # Preview then migrate (always shows preview first)
hermes claw migrate --dry-run # Preview only, no changes
hermes claw migrate # Full migration with confirmation prompt
hermes claw migrate --dry-run # Preview what would happen
hermes claw migrate --preset user-data # Migrate without API keys/secrets
hermes claw migrate --yes # Skip confirmation prompt
```
The migration always shows a full preview of what will be imported before making any changes. You review the preview and confirm before anything is written.
**All options:**
| Flag | Description |
@@ -41,7 +39,7 @@ Ask the agent to run the migration for you:
```
The agent will use the `openclaw-migration` skill to:
1. Run a preview first to show what would change
1. Run a dry-run first to preview changes
2. Ask about conflict resolution (SOUL.md, skills, etc.)
3. Let you choose between `user-data` and `full` presets
4. Execute the migration with your choices
@@ -60,31 +58,16 @@ The agent will use the `openclaw-migration` skill to:
| Messaging settings | `~/.openclaw/config.yaml` (TELEGRAM_ALLOWED_USERS, MESSAGING_CWD) | `~/.hermes/.env` |
| TTS assets | `~/.openclaw/workspace/tts/` | `~/.hermes/tts/` |
Workspace files are also checked at `workspace.default/` and `workspace-main/` as fallback paths (OpenClaw renamed `workspace/` to `workspace-main/` in recent versions).
### `full` preset (adds to `user-data`)
| Item | Source | Destination |
|------|--------|-------------|
| Telegram bot token | `openclaw.json` channels config | `~/.hermes/.env` |
| OpenRouter API key | `.env`, `openclaw.json`, or `openclaw.json["env"]` | `~/.hermes/.env` |
| OpenAI API key | `.env`, `openclaw.json`, or `openclaw.json["env"]` | `~/.hermes/.env` |
| Anthropic API key | `.env`, `openclaw.json`, or `openclaw.json["env"]` | `~/.hermes/.env` |
| ElevenLabs API key | `.env`, `openclaw.json`, or `openclaw.json["env"]` | `~/.hermes/.env` |
| Telegram bot token | `~/.openclaw/config.yaml` | `~/.hermes/.env` |
| OpenRouter API key | `~/.openclaw/.env` or config | `~/.hermes/.env` |
| OpenAI API key | `~/.openclaw/.env` or config | `~/.hermes/.env` |
| Anthropic API key | `~/.openclaw/.env` or config | `~/.hermes/.env` |
| ElevenLabs API key | `~/.openclaw/.env` or config | `~/.hermes/.env` |
API keys are searched across four sources: inline config values, `~/.openclaw/.env`, the `openclaw.json` `"env"` sub-object, and per-agent auth profiles.
Only allowlisted secrets are ever imported. Other credentials are skipped and reported.
## OpenClaw Schema Compatibility
The migration handles both old and current OpenClaw config layouts:
- **Channel tokens**: Reads from flat paths (`channels.telegram.botToken`) and the newer `accounts.default` layout (`channels.telegram.accounts.default.botToken`)
- **TTS provider**: OpenClaw renamed "edge" to "microsoft" — both are recognized and mapped to Hermes' "edge"
- **Provider API types**: Both short (`openai`, `anthropic`) and hyphenated (`openai-completions`, `anthropic-messages`, `google-generative-ai`) values are mapped correctly
- **thinkingDefault**: All enum values are handled including newer ones (`minimal`, `xhigh`, `adaptive`)
- **Matrix**: Uses `accessToken` field (not `botToken`)
- **SecretRef formats**: Plain strings, env templates (`${VAR}`), and `source: "env"` SecretRefs are resolved. `source: "file"` and `source: "exec"` SecretRefs produce a warning — add those keys manually after migration.
Only these 6 allowlisted secrets are ever imported. Other credentials are skipped and reported.
## Conflict Handling
@@ -101,24 +84,18 @@ For skills, you can also use `--skill-conflict rename` to import conflicting ski
## Migration Report
Every migration produces a report showing:
Every migration (including dry runs) produces a report showing:
- **Migrated items** — what was successfully imported
- **Conflicts** — items skipped because they already exist
- **Skipped items** — items not found in the source
- **Errors** — items that failed to import
For executed migrations, the full report is saved to `~/.hermes/migration/openclaw/<timestamp>/`.
## Post-Migration Notes
- **Skills require a new session** — imported skills take effect after restarting your agent or starting a new chat.
- **WhatsApp requires re-pairing** — WhatsApp uses QR-code pairing, not token-based auth. Run `hermes whatsapp` to pair.
- **Archive cleanup** — after migration, you'll be offered to rename `~/.openclaw/` to `.openclaw.pre-migration/` to prevent state confusion. You can also run `hermes claw cleanup` later.
For execute runs, the full report is saved to `~/.hermes/migration/openclaw/<timestamp>/`.
## Troubleshooting
### "OpenClaw directory not found"
The migration looks for `~/.openclaw` by default, then tries `~/.clawdbot` and `~/.moltbot`. If your OpenClaw is installed elsewhere, use `--source`:
The migration looks for `~/.openclaw` by default. If your OpenClaw is installed elsewhere, use `--source`:
```bash
hermes claw migrate --source /path/to/.openclaw
```
@@ -131,12 +108,3 @@ hermes skills install openclaw-migration
### Memory overflow
If your OpenClaw MEMORY.md or USER.md exceeds Hermes' character limits, excess entries are exported to an overflow file in the migration report directory. You can manually review and add the most important ones.
### API keys not found
Keys might be stored in different places depending on your OpenClaw setup:
- `~/.openclaw/.env` file
- Inline in `openclaw.json` under `models.providers.*.apiKey`
- In `openclaw.json` under the `"env"` or `"env.vars"` sub-objects
- In `~/.openclaw/agents/main/agent/auth-profiles.json`
The migration checks all four. If keys use `source: "file"` or `source: "exec"` SecretRefs, they can't be resolved automatically — add them via `hermes config set`.
-329
View File
@@ -1,329 +0,0 @@
# Container-Aware CLI Review Fixes Spec
**PR:** NousResearch/hermes-agent#7543
**Review:** cursor[bot] bugbot review (4094049442) + two prior rounds
**Date:** 2026-04-12
**Branch:** `feat/container-aware-cli-clean`
## Review Issues Summary
Six issues were raised across three bugbot review rounds. Three were fixed in intermediate commits (38277a6a, 726cf90f). This spec addresses remaining design concerns surfaced by those reviews and simplifies the implementation based on interview decisions.
| # | Issue | Severity | Status |
|---|-------|----------|--------|
| 1 | `os.execvp` retry loop unreachable | Medium | Fixed in 79e8cd12 (switched to subprocess.run) |
| 2 | Redundant `shutil.which("sudo")` | Medium | Fixed in 38277a6a (reuses `sudo` var) |
| 3 | Missing `chown -h` on symlink update | Low | Fixed in 38277a6a |
| 4 | Container routing after `parse_args()` | High | Fixed in 726cf90f |
| 5 | Hardcoded `/home/${user}` | Medium | Fixed in 726cf90f |
| 6 | Group membership not gated on `container.enable` | Low | Fixed in 726cf90f |
The mechanical fixes are in place but the overall design needs revision. The retry loop, error swallowing, and process model have deeper issues than what the bugbot flagged.
---
## Spec: Revised `_exec_in_container`
### Design Principles
1. **Let it crash.** No silent fallbacks. If `.container-mode` exists but something goes wrong, the error propagates naturally (Python traceback). The only case where container routing is skipped is when `.container-mode` doesn't exist or `HERMES_DEV=1`.
2. **No retries.** Probe once for sudo, exec once. If it fails, docker/podman's stderr reaches the user verbatim.
3. **Completely transparent.** No error wrapping, no prefixes, no spinners. Docker's output goes straight through.
4. **`os.execvp` on the happy path.** Replace the Python process entirely so there's no idle parent during interactive sessions. Note: `execvp` never returns on success (process is replaced) and raises `OSError` on failure (it does not return a value). The container process's exit code becomes the process exit code by definition — no explicit propagation needed.
5. **One human-readable exception to "let it crash".** `subprocess.TimeoutExpired` from the sudo probe gets a specific catch with a readable message, since a raw traceback for "your Docker daemon is slow" is confusing. All other exceptions propagate naturally.
### Execution Flow
```
1. get_container_exec_info()
- HERMES_DEV=1 → return None (skip routing)
- Inside container → return None (skip routing)
- .container-mode doesn't exist → return None (skip routing)
- .container-mode exists → parse and return dict
- .container-mode exists but malformed/unreadable → LET IT CRASH (no try/except)
2. _exec_in_container(container_info, sys.argv[1:])
a. shutil.which(backend) → if None, print "{backend} not found on PATH" and sys.exit(1)
b. Sudo probe: subprocess.run([runtime, "inspect", "--format", "ok", container_name], timeout=15)
- If succeeds → needs_sudo = False
- If fails → try subprocess.run([sudo, "-n", runtime, "inspect", ...], timeout=15)
- If succeeds → needs_sudo = True
- If fails → print error with sudoers hint (including why -n is required) and sys.exit(1)
- If TimeoutExpired → catch specifically, print human-readable message about slow daemon
c. Build exec_cmd: [sudo? + runtime, "exec", tty_flags, "-u", exec_user, env_flags, container, hermes_bin, *cli_args]
d. os.execvp(exec_cmd[0], exec_cmd)
- On success: process is replaced — Python is gone, container exit code IS the process exit code
- On OSError: let it crash (natural traceback)
```
### Changes to `hermes_cli/main.py`
#### `_exec_in_container` — rewrite
Remove:
- The entire retry loop (`max_retries`, `for attempt in range(...)`)
- Spinner logic (`"Waiting for container..."`, dots)
- Exit code classification (125/126/127 handling)
- `subprocess.run` for the exec call (keep it only for the sudo probe)
- Special TTY vs non-TTY retry counts
- The `time` import (no longer needed)
Change:
- Use `os.execvp(exec_cmd[0], exec_cmd)` as the final call
- Keep the `subprocess` import only for the sudo probe
- Keep TTY detection for the `-it` vs `-i` flag
- Keep env var forwarding (TERM, COLORTERM, LANG, LC_ALL)
- Keep the sudo probe as-is (it's the one "smart" part)
- Bump probe `timeout` from 5s to 15s — cold podman on a loaded machine needs headroom
- Catch `subprocess.TimeoutExpired` specifically on both probe calls — print a readable message about the daemon being unresponsive instead of a raw traceback
- Expand the sudoers hint error message to explain *why* `-n` (non-interactive) is required: a password prompt would hang the CLI or break piped commands
The function becomes roughly:
```python
def _exec_in_container(container_info: dict, cli_args: list):
"""Replace the current process with a command inside the managed container.
Probes whether sudo is needed (rootful containers), then os.execvp
into the container. If exec fails, the OS error propagates naturally.
"""
import shutil
import subprocess
backend = container_info["backend"]
container_name = container_info["container_name"]
exec_user = container_info["exec_user"]
hermes_bin = container_info["hermes_bin"]
runtime = shutil.which(backend)
if not runtime:
print(f"Error: {backend} not found on PATH. Cannot route to container.",
file=sys.stderr)
sys.exit(1)
# Probe whether we need sudo to see the rootful container.
# Timeout is 15s — cold podman on a loaded machine can take a while.
# TimeoutExpired is caught specifically for a human-readable message;
# all other exceptions propagate naturally.
needs_sudo = False
sudo = None
try:
probe = subprocess.run(
[runtime, "inspect", "--format", "ok", container_name],
capture_output=True, text=True, timeout=15,
)
except subprocess.TimeoutExpired:
print(
f"Error: timed out waiting for {backend} to respond.\n"
f"The {backend} daemon may be unresponsive or starting up.",
file=sys.stderr,
)
sys.exit(1)
if probe.returncode != 0:
sudo = shutil.which("sudo")
if sudo:
try:
probe2 = subprocess.run(
[sudo, "-n", runtime, "inspect", "--format", "ok", container_name],
capture_output=True, text=True, timeout=15,
)
except subprocess.TimeoutExpired:
print(
f"Error: timed out waiting for sudo {backend} to respond.",
file=sys.stderr,
)
sys.exit(1)
if probe2.returncode == 0:
needs_sudo = True
else:
print(
f"Error: container '{container_name}' not found via {backend}.\n"
f"\n"
f"The NixOS service runs the container as root. Your user cannot\n"
f"see it because {backend} uses per-user namespaces.\n"
f"\n"
f"Fix: grant passwordless sudo for {backend}. The -n (non-interactive)\n"
f"flag is required because the CLI calls sudo non-interactively —\n"
f"a password prompt would hang or break piped commands:\n"
f"\n"
f' security.sudo.extraRules = [{{\n'
f' users = [ "{os.getenv("USER", "your-user")}" ];\n'
f' commands = [{{ command = "{runtime}"; options = [ "NOPASSWD" ]; }}];\n'
f' }}];\n'
f"\n"
f"Or run: sudo hermes {' '.join(cli_args)}",
file=sys.stderr,
)
sys.exit(1)
else:
print(
f"Error: container '{container_name}' not found via {backend}.\n"
f"The container may be running under root. Try: sudo hermes {' '.join(cli_args)}",
file=sys.stderr,
)
sys.exit(1)
is_tty = sys.stdin.isatty()
tty_flags = ["-it"] if is_tty else ["-i"]
env_flags = []
for var in ("TERM", "COLORTERM", "LANG", "LC_ALL"):
val = os.environ.get(var)
if val:
env_flags.extend(["-e", f"{var}={val}"])
cmd_prefix = [sudo, "-n", runtime] if needs_sudo else [runtime]
exec_cmd = (
cmd_prefix + ["exec"]
+ tty_flags
+ ["-u", exec_user]
+ env_flags
+ [container_name, hermes_bin]
+ cli_args
)
# execvp replaces this process entirely — it never returns on success.
# On failure it raises OSError, which propagates naturally.
os.execvp(exec_cmd[0], exec_cmd)
```
#### Container routing call site in `main()` — remove try/except
Current:
```python
try:
from hermes_cli.config import get_container_exec_info
container_info = get_container_exec_info()
if container_info:
_exec_in_container(container_info, sys.argv[1:])
sys.exit(1) # exec failed if we reach here
except SystemExit:
raise
except Exception:
pass # Container routing unavailable, proceed locally
```
Revised:
```python
from hermes_cli.config import get_container_exec_info
container_info = get_container_exec_info()
if container_info:
_exec_in_container(container_info, sys.argv[1:])
# Unreachable: os.execvp never returns on success (process is replaced)
# and raises OSError on failure (which propagates as a traceback).
# This line exists only as a defensive assertion.
sys.exit(1)
```
No try/except. If `.container-mode` doesn't exist, `get_container_exec_info()` returns `None` and we skip routing. If it exists but is broken, the exception propagates with a natural traceback.
Note: `sys.exit(1)` after `_exec_in_container` is dead code in all paths — `os.execvp` either replaces the process or raises. It's kept as a belt-and-suspenders assertion with a comment marking it unreachable, not as actual error handling.
### Changes to `hermes_cli/config.py`
#### `get_container_exec_info` — remove inner try/except
Current code catches `(OSError, IOError)` and returns `None`. This silently hides permission errors, corrupt files, etc.
Change: Remove the try/except around file reading. Keep the early returns for `HERMES_DEV=1` and `_is_inside_container()`. The `FileNotFoundError` from `open()` when `.container-mode` doesn't exist should still return `None` (this is the "container mode not enabled" case). All other exceptions propagate.
```python
def get_container_exec_info() -> Optional[dict]:
if os.environ.get("HERMES_DEV") == "1":
return None
if _is_inside_container():
return None
container_mode_file = get_hermes_home() / ".container-mode"
try:
with open(container_mode_file, "r") as f:
# ... parse key=value lines ...
except FileNotFoundError:
return None
# All other exceptions (PermissionError, malformed data, etc.) propagate
return { ... }
```
---
## Spec: NixOS Module Changes
### Symlink creation — simplify to two branches
Current: 4 branches (symlink exists, directory exists, other file, doesn't exist).
Revised: 2 branches.
```bash
if [ -d "${symlinkPath}" ] && [ ! -L "${symlinkPath}" ]; then
# Real directory — back it up, then create symlink
_backup="${symlinkPath}.bak.$(date +%s)"
echo "hermes-agent: backing up existing ${symlinkPath} to $_backup"
mv "${symlinkPath}" "$_backup"
fi
# For everything else (symlink, doesn't exist, etc.) — just force-create
ln -sfn "${target}" "${symlinkPath}"
chown -h ${user}:${cfg.group} "${symlinkPath}"
```
`ln -sfn` handles: existing symlink (replaces), doesn't exist (creates), and after the `mv` above (creates). The only case that needs special handling is a real directory, because `ln -sfn` cannot atomically replace a directory.
Note: there is a theoretical race between the `[ -d ... ]` check and the `mv` (something could create/remove the directory in between). In practice this is a NixOS activation script running as root during `nixos-rebuild switch` — no other process should be touching `~/.hermes` at that moment. Not worth adding locking for.
### Sudoers — document, don't auto-configure
Do NOT add `security.sudo.extraRules` to the module. Document the sudoers requirement in the module's description/comments and in the error message the CLI prints when sudo probe fails.
### Group membership gating — keep as-is
The fix in 726cf90f (`cfg.container.enable && cfg.container.hostUsers != []`) is correct. Leftover group membership when container mode is disabled is harmless. No cleanup needed.
---
## Spec: Test Rewrite
The existing test file (`tests/hermes_cli/test_container_aware_cli.py`) has 16 tests. With the simplified exec model, several are obsolete.
### Tests to keep (update as needed)
- `test_is_inside_container_dockerenv` — unchanged
- `test_is_inside_container_containerenv` — unchanged
- `test_is_inside_container_cgroup_docker` — unchanged
- `test_is_inside_container_false_on_host` — unchanged
- `test_get_container_exec_info_returns_metadata` — unchanged
- `test_get_container_exec_info_none_inside_container` — unchanged
- `test_get_container_exec_info_none_without_file` — unchanged
- `test_get_container_exec_info_skipped_when_hermes_dev` — unchanged
- `test_get_container_exec_info_not_skipped_when_hermes_dev_zero` — unchanged
- `test_get_container_exec_info_defaults` — unchanged
- `test_get_container_exec_info_docker_backend` — unchanged
### Tests to add
- `test_get_container_exec_info_crashes_on_permission_error` — verify that `PermissionError` propagates (no silent `None` return)
- `test_exec_in_container_calls_execvp` — verify `os.execvp` is called with correct args (runtime, tty flags, user, env, container, binary, cli args)
- `test_exec_in_container_sudo_probe_sets_prefix` — verify that when first probe fails and sudo probe succeeds, `os.execvp` is called with `sudo -n` prefix
- `test_exec_in_container_no_runtime_hard_fails` — keep existing, verify `sys.exit(1)` when `shutil.which` returns None
- `test_exec_in_container_non_tty_uses_i_only` — update to check `os.execvp` args instead of `subprocess.run` args
- `test_exec_in_container_probe_timeout_prints_message` — verify that `subprocess.TimeoutExpired` from the probe produces a human-readable error and `sys.exit(1)`, not a raw traceback
- `test_exec_in_container_container_not_running_no_sudo` — verify the path where runtime exists (`shutil.which` returns a path) but probe returns non-zero and no sudo is available. Should print the "container may be running under root" error. This is distinct from `no_runtime_hard_fails` which covers `shutil.which` returning None.
### Tests to delete
- `test_exec_in_container_tty_retries_on_container_failure` — retry loop removed
- `test_exec_in_container_non_tty_retries_silently_exits_126` — retry loop removed
- `test_exec_in_container_propagates_hermes_exit_code` — no subprocess.run to check exit codes; execvp replaces the process. Note: exit code propagation still works correctly — when `os.execvp` succeeds, the container's process *becomes* this process, so its exit code is the process exit code by OS semantics. No application code needed, no test needed. A comment in the function docstring documents this intent for future readers.
---
## Out of Scope
- Auto-configuring sudoers rules in the NixOS module
- Any changes to `get_container_exec_info` parsing logic beyond the try/except narrowing
- Changes to `.container-mode` file format
- Changes to the `HERMES_DEV=1` bypass
- Changes to container detection logic (`_is_inside_container`)
@@ -49,8 +49,6 @@ class HermesToolCallParser(ToolCallParser):
continue
tc_data = json.loads(raw_json)
if "name" not in tc_data:
continue
tool_calls.append(
ChatCompletionMessageToolCall(
id=f"call_{uuid.uuid4().hex[:8]}",
@@ -89,8 +89,6 @@ class MistralToolCallParser(ToolCallParser):
parsed = [parsed]
for tc in parsed:
if "name" not in tc:
continue
args = tc.get("arguments", {})
if isinstance(args, dict):
args = json.dumps(args, ensure_ascii=False)
+4 -9
View File
@@ -76,15 +76,10 @@ def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
except Exception as e:
logger.warning("Channel directory: failed to build %s: %s", platform.value, e)
# Platforms that don't support direct channel enumeration get session-based
# discovery automatically. Skip infrastructure entries that aren't messaging
# platforms — everything else falls through to _build_from_sessions().
_SKIP_SESSION_DISCOVERY = frozenset({"local", "api_server", "webhook"})
for plat in Platform:
plat_name = plat.value
if plat_name in _SKIP_SESSION_DISCOVERY or plat_name in platforms:
continue
platforms[plat_name] = _build_from_sessions(plat_name)
# Telegram, WhatsApp & Signal can't enumerate chats -- pull from session history
for plat_name in ("telegram", "whatsapp", "signal", "email", "sms", "bluebubbles"):
if plat_name not in platforms:
platforms[plat_name] = _build_from_sessions(plat_name)
directory = {
"updated_at": datetime.now().isoformat(),
+4 -123
View File
@@ -63,8 +63,6 @@ class Platform(Enum):
WEBHOOK = "webhook"
FEISHU = "feishu"
WECOM = "wecom"
WECOM_CALLBACK = "wecom_callback"
WEIXIN = "weixin"
BLUEBUBBLES = "bluebubbles"
@@ -191,7 +189,7 @@ class StreamingConfig:
"""Configuration for real-time token streaming to messaging platforms."""
enabled: bool = False
transport: str = "edit" # "edit" (progressive editMessageText) or "off"
edit_interval: float = 1.0 # Seconds between message edits (Telegram rate-limits at ~1/s)
edit_interval: float = 0.3 # Seconds between message edits
buffer_threshold: int = 40 # Chars before forcing an edit
cursor: str = "" # Cursor shown during streaming
@@ -211,7 +209,7 @@ class StreamingConfig:
return cls(
enabled=data.get("enabled", False),
transport=data.get("transport", "edit"),
edit_interval=float(data.get("edit_interval", 1.0)),
edit_interval=float(data.get("edit_interval", 0.3)),
buffer_threshold=int(data.get("buffer_threshold", 40)),
cursor=data.get("cursor", ""),
)
@@ -263,11 +261,6 @@ class GatewayConfig:
for platform, config in self.platforms.items():
if not config.enabled:
continue
# Weixin requires both a token and an account_id
if platform == Platform.WEIXIN:
if config.extra.get("account_id") and (config.token or config.extra.get("token")):
connected.append(platform)
continue
# Platforms that use token/api_key auth
if config.token or config.api_key:
connected.append(platform)
@@ -292,14 +285,9 @@ class GatewayConfig:
# Feishu uses extra dict for app credentials
elif platform == Platform.FEISHU and config.extra.get("app_id"):
connected.append(platform)
# WeCom bot mode uses extra dict for bot credentials
# WeCom uses extra dict for bot credentials
elif platform == Platform.WECOM and config.extra.get("bot_id"):
connected.append(platform)
# WeCom callback mode uses corp_id or apps list
elif platform == Platform.WECOM_CALLBACK and (
config.extra.get("corp_id") or config.extra.get("apps")
):
connected.append(platform)
# BlueBubbles uses extra dict for local server config
elif platform == Platform.BLUEBUBBLES and config.extra.get("server_url") and config.extra.get("password"):
connected.append(platform)
@@ -548,8 +536,6 @@ def load_gateway_config() -> GatewayConfig:
bridged["free_response_channels"] = platform_cfg["free_response_channels"]
if "mention_patterns" in platform_cfg:
bridged["mention_patterns"] = platform_cfg["mention_patterns"]
if plat == Platform.DISCORD and "channel_skill_bindings" in platform_cfg:
bridged["channel_skill_bindings"] = platform_cfg["channel_skill_bindings"]
if not bridged:
continue
plat_data = platforms_data.setdefault(plat.value, {})
@@ -595,12 +581,6 @@ def load_gateway_config() -> GatewayConfig:
if isinstance(ic, list):
ic = ",".join(str(v) for v in ic)
os.environ["DISCORD_IGNORED_CHANNELS"] = str(ic)
# allowed_channels: if set, bot ONLY responds in these channels (whitelist)
ac = discord_cfg.get("allowed_channels")
if ac is not None and not os.getenv("DISCORD_ALLOWED_CHANNELS"):
if isinstance(ac, list):
ac = ",".join(str(v) for v in ac)
os.environ["DISCORD_ALLOWED_CHANNELS"] = str(ac)
# no_thread_channels: channels where bot responds directly without creating thread
ntc = discord_cfg.get("no_thread_channels")
if ntc is not None and not os.getenv("DISCORD_NO_THREAD_CHANNELS"):
@@ -648,8 +628,6 @@ def load_gateway_config() -> GatewayConfig:
os.environ["MATRIX_FREE_RESPONSE_ROOMS"] = str(frc)
if "auto_thread" in matrix_cfg and not os.getenv("MATRIX_AUTO_THREAD"):
os.environ["MATRIX_AUTO_THREAD"] = str(matrix_cfg["auto_thread"]).lower()
if "dm_mention_threads" in matrix_cfg and not os.getenv("MATRIX_DM_MENTION_THREADS"):
os.environ["MATRIX_DM_MENTION_THREADS"] = str(matrix_cfg["dm_mention_threads"]).lower()
except Exception as e:
logger.warning(
@@ -665,17 +643,6 @@ def load_gateway_config() -> GatewayConfig:
_apply_env_overrides(config)
# --- Validate loaded values ---
_validate_gateway_config(config)
return config
def _validate_gateway_config(config: "GatewayConfig") -> None:
"""Validate and sanitize a loaded GatewayConfig in place.
Called by ``load_gateway_config()`` after all config sources are merged.
Extracted as a separate function for testability.
"""
policy = config.default_reset_policy
if not (0 <= policy.at_hour <= 23):
@@ -699,7 +666,6 @@ def _validate_gateway_config(config: "GatewayConfig") -> None:
Platform.SLACK: "SLACK_BOT_TOKEN",
Platform.MATTERMOST: "MATTERMOST_TOKEN",
Platform.MATRIX: "MATRIX_ACCESS_TOKEN",
Platform.WEIXIN: "WEIXIN_TOKEN",
}
for platform, pconfig in config.platforms.items():
if not pconfig.enabled:
@@ -712,31 +678,7 @@ def _validate_gateway_config(config: "GatewayConfig") -> None:
platform.value, env_name,
)
# Reject known-weak placeholder tokens.
# Ported from openclaw/openclaw#64586: users who copy .env.example
# without changing placeholder values get a clear startup error instead
# of a confusing "auth failed" from the platform API.
try:
from hermes_cli.auth import has_usable_secret
except ImportError:
has_usable_secret = None # type: ignore[assignment]
if has_usable_secret is not None:
for platform, pconfig in config.platforms.items():
if not pconfig.enabled:
continue
env_name = _token_env_names.get(platform)
if not env_name:
continue
token = pconfig.token
if token and token.strip() and not has_usable_secret(token, min_length=4):
logger.error(
"%s is enabled but %s is set to a placeholder value ('%s'). "
"Set a real bot token before starting the gateway. "
"The adapter will NOT be started.",
platform.value, env_name, token.strip()[:6] + "...",
)
pconfig.enabled = False
return config
def _apply_env_overrides(config: GatewayConfig) -> None:
@@ -959,9 +901,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
pass
if api_server_host:
config.platforms[Platform.API_SERVER].extra["host"] = api_server_host
api_server_model_name = os.getenv("API_SERVER_MODEL_NAME", "")
if api_server_model_name:
config.platforms[Platform.API_SERVER].extra["model_name"] = api_server_model_name
# Webhook platform
webhook_enabled = os.getenv("WEBHOOK_ENABLED", "").lower() in ("true", "1", "yes")
@@ -1028,64 +967,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
name=os.getenv("WECOM_HOME_CHANNEL_NAME", "Home"),
)
# WeCom callback mode (self-built apps)
wecom_callback_corp_id = os.getenv("WECOM_CALLBACK_CORP_ID")
wecom_callback_corp_secret = os.getenv("WECOM_CALLBACK_CORP_SECRET")
if wecom_callback_corp_id and wecom_callback_corp_secret:
if Platform.WECOM_CALLBACK not in config.platforms:
config.platforms[Platform.WECOM_CALLBACK] = PlatformConfig()
config.platforms[Platform.WECOM_CALLBACK].enabled = True
config.platforms[Platform.WECOM_CALLBACK].extra.update({
"corp_id": wecom_callback_corp_id,
"corp_secret": wecom_callback_corp_secret,
"agent_id": os.getenv("WECOM_CALLBACK_AGENT_ID", ""),
"token": os.getenv("WECOM_CALLBACK_TOKEN", ""),
"encoding_aes_key": os.getenv("WECOM_CALLBACK_ENCODING_AES_KEY", ""),
"host": os.getenv("WECOM_CALLBACK_HOST", "0.0.0.0"),
"port": int(os.getenv("WECOM_CALLBACK_PORT", "8645")),
})
# Weixin (personal WeChat via iLink Bot API)
weixin_token = os.getenv("WEIXIN_TOKEN")
weixin_account_id = os.getenv("WEIXIN_ACCOUNT_ID")
if weixin_token or weixin_account_id:
if Platform.WEIXIN not in config.platforms:
config.platforms[Platform.WEIXIN] = PlatformConfig()
config.platforms[Platform.WEIXIN].enabled = True
if weixin_token:
config.platforms[Platform.WEIXIN].token = weixin_token
extra = config.platforms[Platform.WEIXIN].extra
if weixin_account_id:
extra["account_id"] = weixin_account_id
weixin_base_url = os.getenv("WEIXIN_BASE_URL", "").strip()
if weixin_base_url:
extra["base_url"] = weixin_base_url.rstrip("/")
weixin_cdn_base_url = os.getenv("WEIXIN_CDN_BASE_URL", "").strip()
if weixin_cdn_base_url:
extra["cdn_base_url"] = weixin_cdn_base_url.rstrip("/")
weixin_dm_policy = os.getenv("WEIXIN_DM_POLICY", "").strip().lower()
if weixin_dm_policy:
extra["dm_policy"] = weixin_dm_policy
weixin_group_policy = os.getenv("WEIXIN_GROUP_POLICY", "").strip().lower()
if weixin_group_policy:
extra["group_policy"] = weixin_group_policy
weixin_allowed_users = os.getenv("WEIXIN_ALLOWED_USERS", "").strip()
if weixin_allowed_users:
extra["allow_from"] = weixin_allowed_users
weixin_group_allowed_users = os.getenv("WEIXIN_GROUP_ALLOWED_USERS", "").strip()
if weixin_group_allowed_users:
extra["group_allow_from"] = weixin_group_allowed_users
weixin_split_multiline = os.getenv("WEIXIN_SPLIT_MULTILINE_MESSAGES", "").strip()
if weixin_split_multiline:
extra["split_multiline_messages"] = weixin_split_multiline
weixin_home = os.getenv("WEIXIN_HOME_CHANNEL", "").strip()
if weixin_home:
config.platforms[Platform.WEIXIN].home_channel = HomeChannel(
platform=Platform.WEIXIN,
chat_id=weixin_home,
name=os.getenv("WEIXIN_HOME_CHANNEL_NAME", "Home"),
)
# BlueBubbles (iMessage)
bluebubbles_server_url = os.getenv("BLUEBUBBLES_SERVER_URL")
bluebubbles_password = os.getenv("BLUEBUBBLES_PASSWORD")
-206
View File
@@ -1,206 +0,0 @@
"""Per-platform display/verbosity configuration resolver.
Provides ``resolve_display_setting()`` the single entry-point for reading
display settings with platform-specific overrides and sensible defaults.
Resolution order (first non-None wins):
1. ``display.platforms.<platform>.<key>`` explicit per-platform user override
2. ``display.<key>`` global user setting
3. ``_PLATFORM_DEFAULTS[<platform>][<key>]`` built-in sensible default
4. ``_GLOBAL_DEFAULTS[<key>]`` built-in global default
Backward compatibility: ``display.tool_progress_overrides`` is still read as a
fallback for ``tool_progress`` when no ``display.platforms`` entry exists. A
config migration (version bump) automatically moves the old format into the new
``display.platforms`` structure.
"""
from __future__ import annotations
from typing import Any
# ---------------------------------------------------------------------------
# Overrideable display settings and their global defaults
# ---------------------------------------------------------------------------
# These are the settings that can be configured per-platform.
# Other display settings (compact, personality, skin, etc.) are CLI-only
# and don't participate in per-platform resolution.
_GLOBAL_DEFAULTS: dict[str, Any] = {
"tool_progress": "all",
"show_reasoning": False,
"tool_preview_length": 0,
"streaming": None, # None = follow top-level streaming config
}
# ---------------------------------------------------------------------------
# Sensible per-platform defaults — tiered by platform capability
# ---------------------------------------------------------------------------
# Tier 1 (high): Supports message editing, typically personal/team use
# Tier 2 (medium): Supports editing but often workspace/customer-facing
# Tier 3 (low): No edit support — each progress msg is permanent
# Tier 4 (minimal): Batch/non-interactive delivery
_TIER_HIGH = {
"tool_progress": "all",
"show_reasoning": False,
"tool_preview_length": 40,
"streaming": None, # follow global
}
_TIER_MEDIUM = {
"tool_progress": "new",
"show_reasoning": False,
"tool_preview_length": 40,
"streaming": None,
}
_TIER_LOW = {
"tool_progress": "off",
"show_reasoning": False,
"tool_preview_length": 40,
"streaming": False,
}
_TIER_MINIMAL = {
"tool_progress": "off",
"show_reasoning": False,
"tool_preview_length": 0,
"streaming": False,
}
_PLATFORM_DEFAULTS: dict[str, dict[str, Any]] = {
# Tier 1 — full edit support, personal/team use
"telegram": _TIER_HIGH,
"discord": _TIER_HIGH,
# Tier 2 — edit support, often customer/workspace channels
"slack": _TIER_MEDIUM,
"mattermost": _TIER_MEDIUM,
"matrix": _TIER_MEDIUM,
"feishu": _TIER_MEDIUM,
# Tier 3 — no edit support, progress messages are permanent
"signal": _TIER_LOW,
"whatsapp": _TIER_MEDIUM, # Baileys bridge supports /edit
"bluebubbles": _TIER_LOW,
"weixin": _TIER_LOW,
"wecom": _TIER_LOW,
"wecom_callback": _TIER_LOW,
"dingtalk": _TIER_LOW,
# Tier 4 — batch or non-interactive delivery
"email": _TIER_MINIMAL,
"sms": _TIER_MINIMAL,
"webhook": _TIER_MINIMAL,
"homeassistant": _TIER_MINIMAL,
"api_server": {**_TIER_HIGH, "tool_preview_length": 0},
}
# Canonical set of per-platform overrideable keys (for validation).
OVERRIDEABLE_KEYS = frozenset(_GLOBAL_DEFAULTS.keys())
def resolve_display_setting(
user_config: dict,
platform_key: str,
setting: str,
fallback: Any = None,
) -> Any:
"""Resolve a display setting with per-platform override support.
Parameters
----------
user_config : dict
The full parsed config.yaml dict.
platform_key : str
Platform config key (e.g. ``"telegram"``, ``"slack"``). Use
``_platform_config_key(source.platform)`` from gateway/run.py.
setting : str
Display setting name (e.g. ``"tool_progress"``, ``"show_reasoning"``).
fallback : Any
Fallback value when the setting isn't found anywhere.
Returns
-------
The resolved value, or *fallback* if nothing is configured.
"""
display_cfg = user_config.get("display") or {}
# 1. Explicit per-platform override (display.platforms.<platform>.<key>)
platforms = display_cfg.get("platforms") or {}
plat_overrides = platforms.get(platform_key)
if isinstance(plat_overrides, dict):
val = plat_overrides.get(setting)
if val is not None:
return _normalise(setting, val)
# 1b. Backward compat: display.tool_progress_overrides.<platform>
if setting == "tool_progress":
legacy = display_cfg.get("tool_progress_overrides")
if isinstance(legacy, dict):
val = legacy.get(platform_key)
if val is not None:
return _normalise(setting, val)
# 2. Global user setting (display.<key>)
val = display_cfg.get(setting)
if val is not None:
return _normalise(setting, val)
# 3. Built-in platform default
plat_defaults = _PLATFORM_DEFAULTS.get(platform_key)
if plat_defaults:
val = plat_defaults.get(setting)
if val is not None:
return val
# 4. Built-in global default
val = _GLOBAL_DEFAULTS.get(setting)
if val is not None:
return val
return fallback
def get_platform_defaults(platform_key: str) -> dict[str, Any]:
"""Return the built-in default display settings for a platform.
Falls back to ``_GLOBAL_DEFAULTS`` for unknown platforms.
"""
return dict(_PLATFORM_DEFAULTS.get(platform_key, _GLOBAL_DEFAULTS))
def get_effective_display(user_config: dict, platform_key: str) -> dict[str, Any]:
"""Return the fully-resolved display settings for a platform.
Useful for status commands that want to show all effective settings.
"""
return {
key: resolve_display_setting(user_config, platform_key, key)
for key in OVERRIDEABLE_KEYS
}
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _normalise(setting: str, value: Any) -> Any:
"""Normalise YAML quirks (bare ``off`` → False in YAML 1.1)."""
if setting == "tool_progress":
if value is False:
return "off"
if value is True:
return "all"
return str(value).lower()
if setting in ("show_reasoning", "streaming"):
if isinstance(value, str):
return value.lower() in ("true", "1", "yes", "on")
return bool(value)
if setting == "tool_preview_length":
try:
return int(value)
except (TypeError, ValueError):
return 0
return value
+39 -247
View File
@@ -20,13 +20,10 @@ Requires:
"""
import asyncio
import hashlib
import hmac
import json
import logging
import os
import socket as _socket
import re
import sqlite3
import time
import uuid
@@ -43,7 +40,6 @@ from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
SendResult,
is_network_accessible,
)
logger = logging.getLogger(__name__)
@@ -53,67 +49,6 @@ DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = 8642
MAX_STORED_RESPONSES = 100
MAX_REQUEST_BYTES = 1_000_000 # 1 MB default limit for POST bodies
CHAT_COMPLETIONS_SSE_KEEPALIVE_SECONDS = 30.0
MAX_NORMALIZED_TEXT_LENGTH = 65_536 # 64 KB cap for normalized content parts
MAX_CONTENT_LIST_SIZE = 1_000 # Max items when content is an array
def _normalize_chat_content(
content: Any, *, _max_depth: int = 10, _depth: int = 0,
) -> str:
"""Normalize OpenAI chat message content into a plain text string.
Some clients (Open WebUI, LobeChat, etc.) send content as an array of
typed parts instead of a plain string::
[{"type": "text", "text": "hello"}, {"type": "input_text", "text": "..."}]
This function flattens those into a single string so the agent pipeline
(which expects strings) doesn't choke.
Defensive limits prevent abuse: recursion depth, list size, and output
length are all bounded.
"""
if _depth > _max_depth:
return ""
if content is None:
return ""
if isinstance(content, str):
return content[:MAX_NORMALIZED_TEXT_LENGTH] if len(content) > MAX_NORMALIZED_TEXT_LENGTH else content
if isinstance(content, list):
parts: List[str] = []
items = content[:MAX_CONTENT_LIST_SIZE] if len(content) > MAX_CONTENT_LIST_SIZE else content
for item in items:
if isinstance(item, str):
if item:
parts.append(item[:MAX_NORMALIZED_TEXT_LENGTH])
elif isinstance(item, dict):
item_type = str(item.get("type") or "").strip().lower()
if item_type in {"text", "input_text", "output_text"}:
text = item.get("text", "")
if text:
try:
parts.append(str(text)[:MAX_NORMALIZED_TEXT_LENGTH])
except Exception:
pass
# Silently skip image_url / other non-text parts
elif isinstance(item, list):
nested = _normalize_chat_content(item, _max_depth=_max_depth, _depth=_depth + 1)
if nested:
parts.append(nested)
# Check accumulated size
if sum(len(p) for p in parts) >= MAX_NORMALIZED_TEXT_LENGTH:
break
result = "\n".join(parts)
return result[:MAX_NORMALIZED_TEXT_LENGTH] if len(result) > MAX_NORMALIZED_TEXT_LENGTH else result
# Fallback for unexpected types (int, float, bool, etc.)
try:
result = str(content)
return result[:MAX_NORMALIZED_TEXT_LENGTH] if len(result) > MAX_NORMALIZED_TEXT_LENGTH else result
except Exception:
return ""
def check_api_server_requirements() -> bool:
@@ -347,24 +282,6 @@ def _make_request_fingerprint(body: Dict[str, Any], keys: List[str]) -> str:
return sha256(repr(subset).encode("utf-8")).hexdigest()
def _derive_chat_session_id(
system_prompt: Optional[str],
first_user_message: str,
) -> str:
"""Derive a stable session ID from the conversation's first user message.
OpenAI-compatible frontends (Open WebUI, LibreChat, etc.) send the full
conversation history with every request. The system prompt and first user
message are constant across all turns of the same conversation, so hashing
them produces a deterministic session ID that lets the API server reuse
the same Hermes session (and therefore the same Docker container sandbox
directory) across turns.
"""
seed = f"{system_prompt or ''}\n{first_user_message}"
digest = hashlib.sha256(seed.encode("utf-8")).hexdigest()[:16]
return f"api-{digest}"
class APIServerAdapter(BasePlatformAdapter):
"""
OpenAI-compatible HTTP API server adapter.
@@ -382,9 +299,6 @@ class APIServerAdapter(BasePlatformAdapter):
self._cors_origins: tuple[str, ...] = self._parse_cors_origins(
extra.get("cors_origins", os.getenv("API_SERVER_CORS_ORIGINS", "")),
)
self._model_name: str = self._resolve_model_name(
extra.get("model_name", os.getenv("API_SERVER_MODEL_NAME", "")),
)
self._app: Optional["web.Application"] = None
self._runner: Optional["web.AppRunner"] = None
self._site: Optional["web.TCPSite"] = None
@@ -410,26 +324,6 @@ class APIServerAdapter(BasePlatformAdapter):
return tuple(str(item).strip() for item in items if str(item).strip())
@staticmethod
def _resolve_model_name(explicit: str) -> str:
"""Derive the advertised model name for /v1/models.
Priority:
1. Explicit override (config extra or API_SERVER_MODEL_NAME env var)
2. Active profile name (so each profile advertises a distinct model)
3. Fallback: "hermes-agent"
"""
if explicit and explicit.strip():
return explicit.strip()
try:
from hermes_cli.profiles import get_active_profile_name
profile = get_active_profile_name()
if profile and profile not in ("default", "custom"):
return profile
except Exception:
pass
return "hermes-agent"
def _cors_headers_for_origin(self, origin: str) -> Optional[Dict[str, str]]:
"""Return CORS headers for an allowed browser origin."""
if not origin or not self._cors_origins:
@@ -469,8 +363,7 @@ class APIServerAdapter(BasePlatformAdapter):
Validate Bearer token from Authorization header.
Returns None if auth is OK, or a 401 web.Response on failure.
If no API key is configured, all requests are allowed (only when API
server is local).
If no API key is configured, all requests are allowed.
"""
if not self._api_key:
return None # No key configured — allow all (local-only use)
@@ -575,12 +468,12 @@ class APIServerAdapter(BasePlatformAdapter):
"object": "list",
"data": [
{
"id": self._model_name,
"id": "hermes-agent",
"object": "model",
"created": int(time.time()),
"owned_by": "hermes",
"permission": [],
"root": self._model_name,
"root": "hermes-agent",
"parent": None,
}
],
@@ -613,7 +506,7 @@ class APIServerAdapter(BasePlatformAdapter):
for msg in messages:
role = msg.get("role", "")
content = _normalize_chat_content(msg.get("content", ""))
content = msg.get("content", "")
if role == "system":
# Accumulate system messages
if system_prompt is None:
@@ -638,32 +531,8 @@ class APIServerAdapter(BasePlatformAdapter):
# Allow caller to continue an existing session by passing X-Hermes-Session-Id.
# When provided, history is loaded from state.db instead of from the request body.
#
# Security: session continuation exposes conversation history, so it is
# only allowed when the API key is configured and the request is
# authenticated. Without this gate, any unauthenticated client could
# read arbitrary session history by guessing/enumerating session IDs.
provided_session_id = request.headers.get("X-Hermes-Session-Id", "").strip()
if provided_session_id:
if not self._api_key:
logger.warning(
"Session continuation via X-Hermes-Session-Id rejected: "
"no API key configured. Set API_SERVER_KEY to enable "
"session continuity."
)
return web.json_response(
_openai_error(
"Session continuation requires API key authentication. "
"Configure API_SERVER_KEY to enable this feature."
),
status=403,
)
# Sanitize: reject control characters that could enable header injection.
if re.search(r'[\r\n\x00]', provided_session_id):
return web.json_response(
{"error": {"message": "Invalid session ID", "type": "invalid_request_error"}},
status=400,
)
session_id = provided_session_id
try:
db = self._ensure_session_db()
@@ -673,20 +542,11 @@ class APIServerAdapter(BasePlatformAdapter):
logger.warning("Failed to load session history for %s: %s", session_id, e)
history = []
else:
# Derive a stable session ID from the conversation fingerprint so
# that consecutive messages from the same Open WebUI (or similar)
# conversation map to the same Hermes session. The first user
# message + system prompt are constant across all turns.
first_user = ""
for cm in conversation_messages:
if cm.get("role") == "user":
first_user = cm.get("content", "")
break
session_id = _derive_chat_session_id(system_prompt, first_user)
session_id = str(uuid.uuid4())
# history already set from request body above
completion_id = f"chatcmpl-{uuid.uuid4().hex[:29]}"
model_name = body.get("model", self._model_name)
model_name = body.get("model", "hermes-agent")
created = int(time.time())
if stream:
@@ -705,35 +565,15 @@ class APIServerAdapter(BasePlatformAdapter):
_stream_q.put(delta)
def _on_tool_progress(event_type, name, preview, args, **kwargs):
"""Send tool progress as a separate SSE event.
Previously, progress markers like `` list`` were injected
directly into ``delta.content``. OpenAI-compatible frontends
(Open WebUI, LobeChat, ) store ``delta.content`` verbatim as
the assistant message and send it back on subsequent requests.
After enough turns the model learns to *emit* the markers as
plain text instead of issuing real tool calls silently
hallucinating tool results. See #6972.
The fix: push a tagged tuple ``("__tool_progress__", payload)``
onto the stream queue. The SSE writer emits it as a custom
``event: hermes.tool.progress`` line that compliant frontends
can render for UX but will *not* persist into conversation
history. Clients that don't understand the custom event type
silently ignore it per the SSE specification.
"""
"""Inject tool progress into the SSE stream for Open WebUI."""
if event_type != "tool.started":
return
return # Only show tool start events in chat stream
if name.startswith("_"):
return
return # Skip internal events (_thinking)
from agent.display import get_tool_emoji
emoji = get_tool_emoji(name)
label = preview or name
_stream_q.put(("__tool_progress__", {
"tool": name,
"emoji": emoji,
"label": label,
}))
_stream_q.put(f"\n`{emoji} {label}`\n")
# Start agent in background. agent_ref is a mutable container
# so the SSE writer can interrupt the agent on client disconnect.
@@ -823,11 +663,7 @@ class APIServerAdapter(BasePlatformAdapter):
"""
import queue as _q
sse_headers = {
"Content-Type": "text/event-stream",
"Cache-Control": "no-cache",
"X-Accel-Buffering": "no",
}
sse_headers = {"Content-Type": "text/event-stream", "Cache-Control": "no-cache"}
# CORS middleware can't inject headers into StreamResponse after
# prepare() flushes them, so resolve CORS headers up front.
origin = request.headers.get("Origin", "")
@@ -840,8 +676,6 @@ class APIServerAdapter(BasePlatformAdapter):
await response.prepare(request)
try:
last_activity = time.monotonic()
# Role chunk
role_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
@@ -849,31 +683,6 @@ class APIServerAdapter(BasePlatformAdapter):
"choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(role_chunk)}\n\n".encode())
last_activity = time.monotonic()
# Helper — route a queue item to the correct SSE event.
async def _emit(item):
"""Write a single queue item to the SSE stream.
Plain strings are sent as normal ``delta.content`` chunks.
Tagged tuples ``("__tool_progress__", payload)`` are sent
as a custom ``event: hermes.tool.progress`` SSE event so
frontends can display them without storing the markers in
conversation history. See #6972.
"""
if isinstance(item, tuple) and len(item) == 2 and item[0] == "__tool_progress__":
event_data = json.dumps(item[1])
await response.write(
f"event: hermes.tool.progress\ndata: {event_data}\n\n".encode()
)
else:
content_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {"content": item}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
return time.monotonic()
# Stream content chunks as they arrive from the agent
loop = asyncio.get_event_loop()
@@ -888,19 +697,26 @@ class APIServerAdapter(BasePlatformAdapter):
delta = stream_q.get_nowait()
if delta is None:
break
last_activity = await _emit(delta)
content_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
except _q.Empty:
break
break
if time.monotonic() - last_activity >= CHAT_COMPLETIONS_SSE_KEEPALIVE_SECONDS:
await response.write(b": keepalive\n\n")
last_activity = time.monotonic()
continue
if delta is None: # End of stream sentinel
break
last_activity = await _emit(delta)
content_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
# Get usage from completed agent
usage = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
@@ -986,7 +802,18 @@ class APIServerAdapter(BasePlatformAdapter):
input_messages.append({"role": "user", "content": item})
elif isinstance(item, dict):
role = item.get("role", "user")
content = _normalize_chat_content(item.get("content", ""))
content = item.get("content", "")
# Handle content that may be a list of content parts
if isinstance(content, list):
text_parts = []
for part in content:
if isinstance(part, dict) and part.get("type") == "input_text":
text_parts.append(part.get("text", ""))
elif isinstance(part, dict) and part.get("type") == "output_text":
text_parts.append(part.get("text", ""))
elif isinstance(part, str):
text_parts.append(part)
content = "\n".join(text_parts)
input_messages.append({"role": role, "content": content})
else:
return web.json_response(_openai_error("'input' must be a string or array"), status=400)
@@ -1096,7 +923,7 @@ class APIServerAdapter(BasePlatformAdapter):
"object": "response",
"status": "completed",
"created_at": created_at,
"model": body.get("model", self._model_name),
"model": body.get("model", "hermes-agent"),
"output": output_items,
"usage": {
"input_tokens": usage.get("input_tokens", 0),
@@ -1491,7 +1318,6 @@ class APIServerAdapter(BasePlatformAdapter):
result = agent.run_conversation(
user_message=user_message,
conversation_history=conversation_history,
task_id="default",
)
usage = {
"input_tokens": getattr(agent, "session_prompt_tokens", 0) or 0,
@@ -1658,7 +1484,6 @@ class APIServerAdapter(BasePlatformAdapter):
r = agent.run_conversation(
user_message=user_message,
conversation_history=conversation_history,
task_id="default",
)
u = {
"input_tokens": getattr(agent, "session_prompt_tokens", 0) or 0,
@@ -1810,33 +1635,8 @@ class APIServerAdapter(BasePlatformAdapter):
if hasattr(sweep_task, "add_done_callback"):
sweep_task.add_done_callback(self._background_tasks.discard)
# Refuse to start network-accessible without authentication
if is_network_accessible(self._host) and not self._api_key:
logger.error(
"[%s] Refusing to start: binding to %s requires API_SERVER_KEY. "
"Set API_SERVER_KEY or use the default 127.0.0.1.",
self.name, self._host,
)
return False
# Refuse to start network-accessible with a placeholder key.
# Ported from openclaw/openclaw#64586.
if is_network_accessible(self._host) and self._api_key:
try:
from hermes_cli.auth import has_usable_secret
if not has_usable_secret(self._api_key, min_length=8):
logger.error(
"[%s] Refusing to start: API_SERVER_KEY is set to a "
"placeholder value. Generate a real secret "
"(e.g. `openssl rand -hex 32`) and set API_SERVER_KEY "
"before exposing the API server on %s.",
self.name, self._host,
)
return False
except ImportError:
pass
# Port conflict detection — fail fast if port is already in use
import socket as _socket
try:
with _socket.socket(_socket.AF_INET, _socket.SOCK_STREAM) as _s:
_s.settimeout(1)
@@ -1852,17 +1652,9 @@ class APIServerAdapter(BasePlatformAdapter):
await self._site.start()
self._mark_connected()
if not self._api_key:
logger.warning(
"[%s] ⚠️ No API key configured (API_SERVER_KEY / platforms.api_server.key). "
"All requests will be accepted without authentication. "
"Set an API key for production deployments to prevent "
"unauthorized access to sessions, responses, and cron jobs.",
self.name,
)
logger.info(
"[%s] API server listening on http://%s:%d (model: %s)",
self.name, self._host, self._port, self._model_name,
"[%s] API server listening on http://%s:%d",
self.name, self._host, self._port,
)
return True
+33 -335
View File
@@ -6,12 +6,10 @@ and implement the required methods.
"""
import asyncio
import ipaddress
import logging
import os
import random
import re
import socket as _socket
import subprocess
import sys
import uuid
@@ -21,94 +19,6 @@ from urllib.parse import urlsplit
logger = logging.getLogger(__name__)
def utf16_len(s: str) -> int:
"""Count UTF-16 code units in *s*.
Telegram's message-length limit (4 096) is measured in UTF-16 code units,
**not** Unicode code-points. Characters outside the Basic Multilingual
Plane (emoji like 😀, CJK Extension B, musical symbols, ) are encoded as
surrogate pairs and therefore consume **two** UTF-16 code units each, even
though Python's ``len()`` counts them as one.
Ported from nearai/ironclaw#2304 which discovered the same discrepancy in
Rust's ``chars().count()``.
"""
return len(s.encode("utf-16-le")) // 2
def _prefix_within_utf16_limit(s: str, limit: int) -> str:
"""Return the longest prefix of *s* whose UTF-16 length ≤ *limit*.
Unlike a plain ``s[:limit]``, this respects surrogate-pair boundaries so
we never slice a multi-code-unit character in half.
"""
if utf16_len(s) <= limit:
return s
# Binary search for the longest safe prefix
lo, hi = 0, len(s)
while lo < hi:
mid = (lo + hi + 1) // 2
if utf16_len(s[:mid]) <= limit:
lo = mid
else:
hi = mid - 1
return s[:lo]
def _custom_unit_to_cp(s: str, budget: int, len_fn) -> int:
"""Return the largest codepoint offset *n* such that ``len_fn(s[:n]) <= budget``.
Used by :meth:`BasePlatformAdapter.truncate_message` when *len_fn* measures
length in units different from Python codepoints (e.g. UTF-16 code units).
Falls back to binary search which is O(log n) calls to *len_fn*.
"""
if len_fn(s) <= budget:
return len(s)
lo, hi = 0, len(s)
while lo < hi:
mid = (lo + hi + 1) // 2
if len_fn(s[:mid]) <= budget:
lo = mid
else:
hi = mid - 1
return lo
def is_network_accessible(host: str) -> bool:
"""Return True if *host* would expose the server beyond loopback.
Loopback addresses (127.0.0.1, ::1, IPv4-mapped ::ffff:127.0.0.1)
are local-only. Unspecified addresses (0.0.0.0, ::) bind all
interfaces. Hostnames are resolved; DNS failure fails closed.
"""
try:
addr = ipaddress.ip_address(host)
if addr.is_loopback:
return False
# ::ffff:127.0.0.1 — Python reports is_loopback=False for mapped
# addresses, so check the underlying IPv4 explicitly.
if getattr(addr, "ipv4_mapped", None) and addr.ipv4_mapped.is_loopback:
return False
return True
except ValueError:
# when host variable is a hostname, we should try to resolve below
pass
try:
resolved = _socket.getaddrinfo(
host, None, _socket.AF_UNSPEC, _socket.SOCK_STREAM,
)
# if the hostname resolves into at least one non-loopback address,
# then we consider it to be network accessible
for _family, _type, _proto, _canonname, sockaddr in resolved:
addr = ipaddress.ip_address(sockaddr[0])
if not addr.is_loopback:
return True
return False
except (_socket.gaierror, OSError):
return True
def _detect_macos_system_proxy() -> str | None:
"""Read the macOS system HTTP(S) proxy via ``scutil --proxy``.
@@ -250,7 +160,7 @@ GATEWAY_SECRET_CAPTURE_UNSUPPORTED_MESSAGE = (
)
def safe_url_for_log(url: str, max_len: int = 80) -> str:
def _safe_url_for_log(url: str, max_len: int = 80) -> str:
"""Return a URL string safe for logs (no query/fragment/userinfo)."""
if max_len <= 0:
return ""
@@ -287,23 +197,6 @@ def safe_url_for_log(url: str, max_len: int = 80) -> str:
return f"{safe[:max_len - 3]}..."
async def _ssrf_redirect_guard(response):
"""Re-validate each redirect target to prevent redirect-based SSRF.
Without this, an attacker can host a public URL that 302-redirects to
http://169.254.169.254/ and bypass the pre-flight is_safe_url() check.
Must be async because httpx.AsyncClient awaits response event hooks.
"""
if response.is_redirect and response.next_request:
redirect_url = str(response.next_request.url)
from tools.url_safety import is_safe_url
if not is_safe_url(redirect_url):
raise ValueError(
f"Blocked redirect to private/internal address: {safe_url_for_log(redirect_url)}"
)
# ---------------------------------------------------------------------------
# Image cache utilities
#
@@ -323,23 +216,6 @@ def get_image_cache_dir() -> Path:
return IMAGE_CACHE_DIR
def _looks_like_image(data: bytes) -> bool:
"""Return True if *data* starts with a known image magic-byte sequence."""
if len(data) < 4:
return False
if data[:8] == b"\x89PNG\r\n\x1a\n":
return True
if data[:3] == b"\xff\xd8\xff":
return True
if data[:6] in (b"GIF87a", b"GIF89a"):
return True
if data[:2] == b"BM":
return True
if data[:4] == b"RIFF" and len(data) >= 12 and data[8:12] == b"WEBP":
return True
return False
def cache_image_from_bytes(data: bytes, ext: str = ".jpg") -> str:
"""
Save raw image bytes to the cache and return the absolute file path.
@@ -350,17 +226,7 @@ def cache_image_from_bytes(data: bytes, ext: str = ".jpg") -> str:
Returns:
Absolute path to the cached image file as a string.
Raises:
ValueError: If *data* does not look like a valid image (e.g. an HTML
error page returned by the upstream server).
"""
if not _looks_like_image(data):
snippet = data[:80].decode("utf-8", errors="replace")
raise ValueError(
f"Refusing to cache non-image data as {ext} "
f"(starts with: {snippet!r})"
)
cache_dir = get_image_cache_dir()
filename = f"img_{uuid.uuid4().hex[:12]}{ext}"
filepath = cache_dir / filename
@@ -388,7 +254,7 @@ async def cache_image_from_url(url: str, ext: str = ".jpg", retries: int = 2) ->
"""
from tools.url_safety import is_safe_url
if not is_safe_url(url):
raise ValueError(f"Blocked unsafe URL (SSRF protection): {safe_url_for_log(url)}")
raise ValueError(f"Blocked unsafe URL (SSRF protection): {_safe_url_for_log(url)}")
import asyncio
import httpx
@@ -396,11 +262,7 @@ async def cache_image_from_url(url: str, ext: str = ".jpg", retries: int = 2) ->
_log = _logging.getLogger(__name__)
last_exc = None
async with httpx.AsyncClient(
timeout=30.0,
follow_redirects=True,
event_hooks={"response": [_ssrf_redirect_guard]},
) as client:
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
for attempt in range(retries + 1):
try:
response = await client.get(
@@ -422,7 +284,7 @@ async def cache_image_from_url(url: str, ext: str = ".jpg", retries: int = 2) ->
"Media cache retry %d/%d for %s (%.1fs): %s",
attempt + 1,
retries,
safe_url_for_log(url),
_safe_url_for_log(url),
wait,
exc,
)
@@ -507,7 +369,7 @@ async def cache_audio_from_url(url: str, ext: str = ".ogg", retries: int = 2) ->
"""
from tools.url_safety import is_safe_url
if not is_safe_url(url):
raise ValueError(f"Blocked unsafe URL (SSRF protection): {safe_url_for_log(url)}")
raise ValueError(f"Blocked unsafe URL (SSRF protection): {_safe_url_for_log(url)}")
import asyncio
import httpx
@@ -515,11 +377,7 @@ async def cache_audio_from_url(url: str, ext: str = ".ogg", retries: int = 2) ->
_log = _logging.getLogger(__name__)
last_exc = None
async with httpx.AsyncClient(
timeout=30.0,
follow_redirects=True,
event_hooks={"response": [_ssrf_redirect_guard]},
) as client:
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
for attempt in range(retries + 1):
try:
response = await client.get(
@@ -541,7 +399,7 @@ async def cache_audio_from_url(url: str, ext: str = ".ogg", retries: int = 2) ->
"Audio cache retry %d/%d for %s (%.1fs): %s",
attempt + 1,
retries,
safe_url_for_log(url),
_safe_url_for_log(url),
wait,
exc,
)
@@ -644,14 +502,6 @@ class MessageType(Enum):
COMMAND = "command" # /command style
class ProcessingOutcome(Enum):
"""Result classification for message-processing lifecycle hooks."""
SUCCESS = "success"
FAILURE = "failure"
CANCELLED = "cancelled"
@dataclass
class MessageEvent:
"""
@@ -679,9 +529,8 @@ class MessageEvent:
reply_to_message_id: Optional[str] = None
reply_to_text: Optional[str] = None # Text of the replied-to message (for context injection)
# Auto-loaded skill(s) for topic/channel bindings (e.g., Telegram DM Topics,
# Discord channel_skill_bindings). A single name or ordered list.
auto_skill: Optional[str | list[str]] = None
# Auto-loaded skill for topic/channel bindings (e.g., Telegram DM Topics)
auto_skill: Optional[str] = None
# Internal flag — set for synthetic events (e.g. background process
# completion notifications) that must bypass user authorization checks.
@@ -703,9 +552,6 @@ class MessageEvent:
raw = parts[0][1:].lower() if parts else None
if raw and "@" in raw:
raw = raw.split("@", 1)[0]
# Reject file paths: valid command names never contain /
if raw and "/" in raw:
return None
return raw
def get_command_args(self) -> str:
@@ -726,32 +572,6 @@ class SendResult:
retryable: bool = False # True for transient connection errors — base will retry automatically
def merge_pending_message_event(
pending_messages: Dict[str, MessageEvent],
session_key: str,
event: MessageEvent,
) -> None:
"""Store or merge a pending event for a session.
Photo bursts/albums often arrive as multiple near-simultaneous PHOTO
events. Merge those into the existing queued event so the next turn sees
the whole burst, while non-photo follow-ups still replace the pending
event normally.
"""
existing = pending_messages.get(session_key)
if (
existing
and getattr(existing, "message_type", None) == MessageType.PHOTO
and event.message_type == MessageType.PHOTO
):
existing.media_urls.extend(event.media_urls)
existing.media_types.extend(event.media_types)
if event.text:
existing.text = BasePlatformAdapter._merge_caption(existing.text, event.text)
return
pending_messages[session_key] = event
# Error substrings that indicate a transient *connection* failure worth retrying.
# "timeout" / "timed out" / "readtimeout" / "writetimeout" are intentionally
# excluded: a read/write timeout on a non-idempotent call (e.g. send_message)
@@ -805,8 +625,6 @@ class BasePlatformAdapter(ABC):
# Gateway shutdown cancels these so an old gateway instance doesn't keep
# working on a task after --replace or manual restarts.
self._background_tasks: set[asyncio.Task] = set()
self._expected_cancelled_tasks: set[asyncio.Task] = set()
self._busy_session_handler: Optional[Callable[[MessageEvent, str], Awaitable[bool]]] = None
# Chats where auto-TTS on voice input is disabled (set by /voice off)
self._auto_tts_disabled_chats: set = set()
# Chats where typing indicator is paused (e.g. during approval waits).
@@ -876,36 +694,7 @@ class BasePlatformAdapter(ABC):
result = handler(self)
if asyncio.iscoroutine(result):
await result
def _acquire_platform_lock(self, scope: str, identity: str, resource_desc: str) -> bool:
"""Acquire a scoped lock for this adapter. Returns True on success."""
from gateway.status import acquire_scoped_lock
self._platform_lock_scope = scope
self._platform_lock_identity = identity
acquired, existing = acquire_scoped_lock(
scope, identity, metadata={'platform': self.platform.value}
)
if acquired:
return True
owner_pid = existing.get('pid') if isinstance(existing, dict) else None
message = (
f'{resource_desc} already in use'
+ (f' (PID {owner_pid})' if owner_pid else '')
+ '. Stop the other gateway first.'
)
logger.error('[%s] %s', self.name, message)
self._set_fatal_error(f'{scope}_lock', message, retryable=False)
return False
def _release_platform_lock(self) -> None:
"""Release the scoped lock acquired by _acquire_platform_lock."""
identity = getattr(self, '_platform_lock_identity', None)
if not identity:
return
from gateway.status import release_scoped_lock
release_scoped_lock(self._platform_lock_scope, identity)
self._platform_lock_identity = None
@property
def name(self) -> str:
"""Human-readable name for this adapter."""
@@ -924,10 +713,6 @@ class BasePlatformAdapter(ABC):
an optional response string.
"""
self._message_handler = handler
def set_busy_session_handler(self, handler: Optional[Callable[[MessageEvent, str], Awaitable[bool]]]) -> None:
"""Set an optional handler for messages arriving during active sessions."""
self._busy_session_handler = handler
def set_session_store(self, session_store: Any) -> None:
"""
@@ -1230,32 +1015,6 @@ class BasePlatformAdapter(ABC):
return media, cleaned
def extract_inline_keyboard(
self, text: str,
) -> Tuple[Optional[list], str]:
"""Extract inline keyboard directives from response text.
Returns:
Tuple of (parsed keyboard data or None, cleaned text with tags removed).
The base implementation is a no-op override in platform adapters
that support inline keyboards (e.g. Telegram).
"""
return None, text
async def attach_inline_keyboard(
self,
chat_id: str,
message_id: str,
buttons: list,
) -> "SendResult":
"""Attach an inline keyboard to an existing message.
The base implementation is a no-op override in platform adapters
that support inline keyboards (e.g. Telegram).
"""
return SendResult(success=False, error="Not supported")
@staticmethod
def extract_local_files(content: str) -> Tuple[List[str], str]:
"""
@@ -1374,7 +1133,7 @@ class BasePlatformAdapter(ABC):
async def on_processing_start(self, event: MessageEvent) -> None:
"""Hook called when background processing begins."""
async def on_processing_complete(self, event: MessageEvent, outcome: ProcessingOutcome) -> None:
async def on_processing_complete(self, event: MessageEvent, success: bool) -> None:
"""Hook called when background processing completes."""
async def _run_processing_hook(self, hook_name: str, *args: Any, **kwargs: Any) -> None:
@@ -1535,7 +1294,7 @@ class BasePlatformAdapter(ABC):
# session lifecycle and its cleanup races with the running task
# (see PR #4926).
cmd = event.get_command()
if cmd in ("approve", "deny", "status", "stop", "new", "reset", "background", "restart"):
if cmd in ("approve", "deny", "status", "stop", "new", "reset"):
logger.debug(
"[%s] Command '/%s' bypassing active-session guard for %s",
self.name, cmd, session_key,
@@ -1554,19 +1313,19 @@ class BasePlatformAdapter(ABC):
logger.error("[%s] Command '/%s' dispatch failed: %s", self.name, cmd, e, exc_info=True)
return
if self._busy_session_handler is not None:
try:
if await self._busy_session_handler(event, session_key):
return
except Exception as e:
logger.error("[%s] Busy-session handler failed: %s", self.name, e, exc_info=True)
# Special case: photo bursts/albums frequently arrive as multiple near-
# simultaneous messages. Queue them without interrupting the active run,
# then process them immediately after the current task finishes.
if event.message_type == MessageType.PHOTO:
logger.debug("[%s] Queuing photo follow-up for session %s without interrupt", self.name, session_key)
merge_pending_message_event(self._pending_messages, session_key, event)
existing = self._pending_messages.get(session_key)
if existing and existing.message_type == MessageType.PHOTO:
existing.media_urls.extend(event.media_urls)
existing.media_types.extend(event.media_types)
if event.text:
existing.text = self._merge_caption(existing.text, event.text)
else:
self._pending_messages[session_key] = event
return # Don't interrupt now - will run after current task completes
# Default behavior for non-photo follow-ups: interrupt the running agent
@@ -1593,7 +1352,6 @@ class BasePlatformAdapter(ABC):
return
if hasattr(task, "add_done_callback"):
task.add_done_callback(self._background_tasks.discard)
task.add_done_callback(self._expected_cancelled_tasks.discard)
@staticmethod
def _get_human_delay() -> float:
@@ -1661,19 +1419,6 @@ class BasePlatformAdapter(ABC):
# Strip any remaining internal directives from message body (fixes #1561)
text_content = text_content.replace("[[audio_as_voice]]", "").strip()
text_content = re.sub(r"MEDIA:\s*\S+", "", text_content).strip()
# Extract inline keyboard tags (e.g. [KEYBOARD: ...] blocks)
inline_keyboard = None
if text_content:
try:
inline_keyboard, text_content = self.extract_inline_keyboard(text_content)
except Exception as kb_err:
logger.warning("[%s] Failed to extract inline keyboard: %s", self.name, kb_err)
# If the response was only a keyboard block, provide minimal
# placeholder text so there's a message to attach buttons to.
if inline_keyboard and not text_content:
text_content = "Please choose:"
if images:
logger.info("[%s] extract_images found %d image(s) in response (%d chars)", self.name, len(images), len(response))
@@ -1730,23 +1475,6 @@ class BasePlatformAdapter(ABC):
)
_record_delivery(result)
# Attach inline keyboard to the last message chunk
if inline_keyboard and result.success and result.message_id:
target_id = result.message_id
# For chunked messages, attach to the very last chunk
if isinstance(getattr(result, "raw_response", None), dict):
ids = result.raw_response.get("message_ids")
if isinstance(ids, list) and ids:
target_id = ids[-1]
try:
await self.attach_inline_keyboard(
chat_id=event.source.chat_id,
message_id=str(target_id),
buttons=inline_keyboard,
)
except Exception as kb_err:
logger.warning("[%s] Failed to attach inline keyboard: %s", self.name, kb_err)
# Human-like pacing delay between text and media
human_delay = self._get_human_delay()
@@ -1760,7 +1488,7 @@ class BasePlatformAdapter(ABC):
logger.info(
"[%s] Sending image: %s (alt=%s)",
self.name,
safe_url_for_log(image_url),
_safe_url_for_log(image_url),
alt_text[:30] if alt_text else "",
)
# Route animated GIFs through send_animation for proper playback
@@ -1852,11 +1580,7 @@ class BasePlatformAdapter(ABC):
# Determine overall success for the processing hook
processing_ok = delivery_succeeded if delivery_attempted else not bool(response)
await self._run_processing_hook(
"on_processing_complete",
event,
ProcessingOutcome.SUCCESS if processing_ok else ProcessingOutcome.FAILURE,
)
await self._run_processing_hook("on_processing_complete", event, processing_ok)
# Check if there's a pending message that was queued during our processing
if session_key in self._pending_messages:
@@ -1875,14 +1599,10 @@ class BasePlatformAdapter(ABC):
return # Already cleaned up
except asyncio.CancelledError:
current_task = asyncio.current_task()
outcome = ProcessingOutcome.CANCELLED
if current_task is None or current_task not in self._expected_cancelled_tasks:
outcome = ProcessingOutcome.FAILURE
await self._run_processing_hook("on_processing_complete", event, outcome)
await self._run_processing_hook("on_processing_complete", event, False)
raise
except Exception as e:
await self._run_processing_hook("on_processing_complete", event, ProcessingOutcome.FAILURE)
await self._run_processing_hook("on_processing_complete", event, False)
logger.error("[%s] Error handling message: %s", self.name, e, exc_info=True)
# Send the error to the user so they aren't left with radio silence
try:
@@ -1926,12 +1646,10 @@ class BasePlatformAdapter(ABC):
"""
tasks = [task for task in self._background_tasks if not task.done()]
for task in tasks:
self._expected_cancelled_tasks.add(task)
task.cancel()
if tasks:
await asyncio.gather(*tasks, return_exceptions=True)
self._background_tasks.clear()
self._expected_cancelled_tasks.clear()
self._pending_messages.clear()
self._active_sessions.clear()
@@ -1995,11 +1713,7 @@ class BasePlatformAdapter(ABC):
return content
@staticmethod
def truncate_message(
content: str,
max_length: int = 4096,
len_fn: Optional["Callable[[str], int]"] = None,
) -> List[str]:
def truncate_message(content: str, max_length: int = 4096) -> List[str]:
"""
Split a long message into chunks, preserving code block boundaries.
@@ -2011,16 +1725,11 @@ class BasePlatformAdapter(ABC):
Args:
content: The full message content
max_length: Maximum length per chunk (platform-specific)
len_fn: Optional length function for measuring string length.
Defaults to ``len`` (Unicode code-points). Pass
``utf16_len`` for platforms that measure message
length in UTF-16 code units (e.g. Telegram).
Returns:
List of message chunks
"""
_len = len_fn or len
if _len(content) <= max_length:
if len(content) <= max_length:
return [content]
INDICATOR_RESERVE = 10 # room for " (XX/XX)"
@@ -2039,33 +1748,22 @@ class BasePlatformAdapter(ABC):
# How much body text we can fit after accounting for the prefix,
# a potential closing fence, and the chunk indicator.
headroom = max_length - INDICATOR_RESERVE - _len(prefix) - _len(FENCE_CLOSE)
headroom = max_length - INDICATOR_RESERVE - len(prefix) - len(FENCE_CLOSE)
if headroom < 1:
headroom = max_length // 2
# Everything remaining fits in one final chunk
if _len(prefix) + _len(remaining) <= max_length - INDICATOR_RESERVE:
if len(prefix) + len(remaining) <= max_length - INDICATOR_RESERVE:
chunks.append(prefix + remaining)
break
# Find a natural split point (prefer newlines, then spaces).
# When _len != len (e.g. utf16_len for Telegram), headroom is
# measured in the custom unit. We need codepoint-based slice
# positions that stay within the custom-unit budget.
#
# _safe_slice_pos() maps a custom-unit budget to the largest
# codepoint offset whose custom length ≤ budget.
if _len is not len:
# Map headroom (custom units) → codepoint slice length
_cp_limit = _custom_unit_to_cp(remaining, headroom, _len)
else:
_cp_limit = headroom
region = remaining[:_cp_limit]
# Find a natural split point (prefer newlines, then spaces)
region = remaining[:headroom]
split_at = region.rfind("\n")
if split_at < _cp_limit // 2:
if split_at < headroom // 2:
split_at = region.rfind(" ")
if split_at < 1:
split_at = _cp_limit
split_at = headroom
# Avoid splitting inside an inline code span (`...`).
# If the text before split_at has an odd number of unescaped
@@ -2085,7 +1783,7 @@ class BasePlatformAdapter(ABC):
safe_split = candidate.rfind(" ", 0, last_bt)
nl_split = candidate.rfind("\n", 0, last_bt)
safe_split = max(safe_split, nl_split)
if safe_split > _cp_limit // 4:
if safe_split > headroom // 4:
split_at = safe_split
chunk_body = remaining[:split_at]
+14 -112
View File
@@ -30,7 +30,6 @@ from gateway.platforms.base import (
cache_audio_from_bytes,
cache_document_from_bytes,
)
from gateway.platforms.helpers import strip_markdown
logger = logging.getLogger(__name__)
@@ -90,7 +89,18 @@ def _normalize_server_url(raw: str) -> str:
return value.rstrip("/")
def _strip_markdown(text: str) -> str:
"""Strip common markdown formatting for iMessage plain-text delivery."""
text = re.sub(r"\*\*(.+?)\*\*", r"\1", text, flags=re.DOTALL)
text = re.sub(r"\*(.+?)\*", r"\1", text, flags=re.DOTALL)
text = re.sub(r"__(.+?)__", r"\1", text, flags=re.DOTALL)
text = re.sub(r"_(.+?)_", r"\1", text, flags=re.DOTALL)
text = re.sub(r"```[a-zA-Z0-9_+-]*\n?", "", text)
text = re.sub(r"`(.+?)`", r"\1", text)
text = re.sub(r"^#{1,6}\s+", "", text, flags=re.MULTILINE)
text = re.sub(r"\[([^\]]+)\]\(([^\)]+)\)", r"\1", text)
text = re.sub(r"\n{3,}", "\n\n", text)
return text.strip()
# ---------------------------------------------------------------------------
@@ -197,17 +207,9 @@ class BlueBubblesAdapter(BasePlatformAdapter):
self.webhook_port,
self.webhook_path,
)
# Register webhook with BlueBubbles server
# This is required for the server to know where to send events
await self._register_webhook()
return True
async def disconnect(self) -> None:
# Unregister webhook before cleaning up
await self._unregister_webhook()
if self.client:
await self.client.aclose()
self.client = None
@@ -216,105 +218,6 @@ class BlueBubblesAdapter(BasePlatformAdapter):
self._runner = None
self._mark_disconnected()
@property
def _webhook_url(self) -> str:
"""Compute the external webhook URL for BlueBubbles registration."""
host = self.webhook_host
if host in ("0.0.0.0", "127.0.0.1", "localhost", "::"):
host = "localhost"
return f"http://{host}:{self.webhook_port}{self.webhook_path}"
async def _find_registered_webhooks(self, url: str) -> list:
"""Return list of BB webhook entries matching *url*."""
try:
res = await self._api_get("/api/v1/webhook")
data = res.get("data")
if isinstance(data, list):
return [wh for wh in data if wh.get("url") == url]
except Exception:
pass
return []
async def _register_webhook(self) -> bool:
"""Register this webhook URL with the BlueBubbles server.
BlueBubbles requires webhooks to be registered via API before
it will send events. Checks for an existing registration first
to avoid duplicates (e.g. after a crash without clean shutdown).
"""
if not self.client:
return False
webhook_url = self._webhook_url
# Crash resilience — reuse an existing registration if present
existing = await self._find_registered_webhooks(webhook_url)
if existing:
logger.info(
"[bluebubbles] webhook already registered: %s", webhook_url
)
return True
payload = {
"url": webhook_url,
"events": ["new-message", "updated-message", "message"],
}
try:
res = await self._api_post("/api/v1/webhook", payload)
status = res.get("status", 0)
if 200 <= status < 300:
logger.info(
"[bluebubbles] webhook registered with server: %s",
webhook_url,
)
return True
else:
logger.warning(
"[bluebubbles] webhook registration returned status %s: %s",
status,
res.get("message"),
)
return False
except Exception as exc:
logger.warning(
"[bluebubbles] failed to register webhook with server: %s",
exc,
)
return False
async def _unregister_webhook(self) -> bool:
"""Unregister this webhook URL from the BlueBubbles server.
Removes *all* matching registrations to clean up any duplicates
left by prior crashes.
"""
if not self.client:
return False
webhook_url = self._webhook_url
removed = False
try:
for wh in await self._find_registered_webhooks(webhook_url):
wh_id = wh.get("id")
if wh_id:
res = await self.client.delete(
self._api_url(f"/api/v1/webhook/{wh_id}")
)
res.raise_for_status()
removed = True
if removed:
logger.info(
"[bluebubbles] webhook unregistered: %s", webhook_url
)
except Exception as exc:
logger.debug(
"[bluebubbles] failed to unregister webhook (non-critical): %s",
exc,
)
return removed
# ------------------------------------------------------------------
# Chat GUID resolution
# ------------------------------------------------------------------
@@ -383,7 +286,7 @@ class BlueBubblesAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
text = strip_markdown(content or "")
text = _strip_markdown(content or "")
if not text:
return SendResult(success=False, error="BlueBubbles send requires text")
chunks = self.truncate_message(text, max_length=self.MAX_MESSAGE_LENGTH)
@@ -669,7 +572,7 @@ class BlueBubblesAdapter(BasePlatformAdapter):
return info
def format_message(self, content: str) -> str:
return strip_markdown(content)
return _strip_markdown(content)
# ------------------------------------------------------------------
# Inbound attachment downloading (from #4588)
@@ -923,4 +826,3 @@ class BlueBubblesAdapter(BasePlatformAdapter):
asyncio.create_task(self.mark_read(session_chat_id))
return web.Response(text="ok")
+22 -16
View File
@@ -20,7 +20,6 @@ Configuration in config.yaml:
import asyncio
import logging
import os
import re
import time
import uuid
from datetime import datetime, timezone
@@ -42,7 +41,6 @@ except ImportError:
httpx = None # type: ignore[assignment]
from gateway.config import Platform, PlatformConfig
from gateway.platforms.helpers import MessageDeduplicator
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
@@ -53,9 +51,9 @@ from gateway.platforms.base import (
logger = logging.getLogger(__name__)
MAX_MESSAGE_LENGTH = 20000
DEDUP_WINDOW_SECONDS = 300
DEDUP_MAX_SIZE = 1000
RECONNECT_BACKOFF = [2, 5, 10, 30, 60]
_SESSION_WEBHOOKS_MAX = 500
_DINGTALK_WEBHOOK_RE = re.compile(r'^https://api\.dingtalk\.com/')
def check_dingtalk_requirements() -> bool:
@@ -88,8 +86,8 @@ class DingTalkAdapter(BasePlatformAdapter):
self._stream_task: Optional[asyncio.Task] = None
self._http_client: Optional["httpx.AsyncClient"] = None
# Message deduplication
self._dedup = MessageDeduplicator(max_size=1000)
# Message deduplication: msg_id -> timestamp
self._seen_messages: Dict[str, float] = {}
# Map chat_id -> session_webhook for reply routing
self._session_webhooks: Dict[str, str] = {}
@@ -169,7 +167,7 @@ class DingTalkAdapter(BasePlatformAdapter):
self._stream_client = None
self._session_webhooks.clear()
self._dedup.clear()
self._seen_messages.clear()
logger.info("[%s] Disconnected", self.name)
# -- Inbound message processing -----------------------------------------
@@ -177,7 +175,7 @@ class DingTalkAdapter(BasePlatformAdapter):
async def _on_message(self, message: "ChatbotMessage") -> None:
"""Process an incoming DingTalk chatbot message."""
msg_id = getattr(message, "message_id", None) or uuid.uuid4().hex
if self._dedup.is_duplicate(msg_id):
if self._is_duplicate(msg_id):
logger.debug("[%s] Duplicate message %s, skipping", self.name, msg_id)
return
@@ -197,15 +195,9 @@ class DingTalkAdapter(BasePlatformAdapter):
chat_id = conversation_id or sender_id
chat_type = "group" if is_group else "dm"
# Store session webhook for reply routing (validate origin to prevent SSRF)
# Store session webhook for reply routing
session_webhook = getattr(message, "session_webhook", None) or ""
if session_webhook and chat_id and _DINGTALK_WEBHOOK_RE.match(session_webhook):
if len(self._session_webhooks) >= _SESSION_WEBHOOKS_MAX:
# Evict oldest entry to cap memory growth
try:
self._session_webhooks.pop(next(iter(self._session_webhooks)))
except StopIteration:
pass
if session_webhook and chat_id:
self._session_webhooks[chat_id] = session_webhook
source = self.build_source(
@@ -255,6 +247,20 @@ class DingTalkAdapter(BasePlatformAdapter):
content = " ".join(parts).strip()
return content
# -- Deduplication ------------------------------------------------------
def _is_duplicate(self, msg_id: str) -> bool:
"""Check and record a message ID. Returns True if already seen."""
now = time.time()
if len(self._seen_messages) > DEDUP_MAX_SIZE:
cutoff = now - DEDUP_WINDOW_SECONDS
self._seen_messages = {k: v for k, v in self._seen_messages.items() if v > cutoff}
if msg_id in self._seen_messages:
return True
self._seen_messages[msg_id] = now
return False
# -- Outbound messaging -------------------------------------------------
async def send(
+135 -236
View File
@@ -45,12 +45,10 @@ sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
import re
from gateway.platforms.helpers import MessageDeduplicator, ThreadParticipationTracker
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
ProcessingOutcome,
SendResult,
cache_image_from_url,
cache_audio_from_url,
@@ -424,7 +422,6 @@ class DiscordAdapter(BasePlatformAdapter):
# Discord message limits
MAX_MESSAGE_LENGTH = 2000
_SPLIT_THRESHOLD = 1900 # near the 2000-char split point
# Auto-disconnect from voice channel after this many seconds of inactivity
VOICE_TIMEOUT = 300
@@ -436,13 +433,7 @@ class DiscordAdapter(BasePlatformAdapter):
self._allowed_user_ids: set = set() # For button approval authorization
# Voice channel state (per-guild)
self._voice_clients: Dict[int, Any] = {} # guild_id -> VoiceClient
# Text batching: merge rapid successive messages (Telegram-style)
self._text_batch_delay_seconds = float(os.getenv("HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS", "0.6"))
self._text_batch_split_delay_seconds = float(os.getenv("HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0"))
self._pending_text_batches: Dict[str, MessageEvent] = {}
self._pending_text_batch_tasks: Dict[str, asyncio.Task] = {}
self._voice_text_channels: Dict[int, int] = {} # guild_id -> text_channel_id
self._voice_sources: Dict[int, Dict[str, Any]] = {} # guild_id -> linked text channel source metadata
self._voice_timeout_tasks: Dict[int, asyncio.Task] = {} # guild_id -> timeout task
# Phase 2: voice listening
self._voice_receivers: Dict[int, VoiceReceiver] = {} # guild_id -> VoiceReceiver
@@ -452,15 +443,18 @@ class DiscordAdapter(BasePlatformAdapter):
# Track threads where the bot has participated so follow-up messages
# in those threads don't require @mention. Persisted to disk so the
# set survives gateway restarts.
self._threads = ThreadParticipationTracker("discord")
self._bot_participated_threads: set = self._load_participated_threads()
# Persistent typing indicator loops per channel (DMs don't reliably
# show the standard typing gateway event for bots)
self._typing_tasks: Dict[str, asyncio.Task] = {}
self._bot_task: Optional[asyncio.Task] = None
self._post_connect_task: Optional[asyncio.Task] = None
# Dedup cache: prevents duplicate bot responses when Discord
# RESUME replays events after reconnects.
self._dedup = MessageDeduplicator()
# Cap to prevent unbounded growth (Discord threads get archived).
self._MAX_TRACKED_THREADS = 500
# Dedup cache: message_id → timestamp. Prevents duplicate bot
# responses when Discord RESUME replays events after reconnects.
self._seen_messages: Dict[str, float] = {}
self._SEEN_TTL = 300 # 5 minutes
self._SEEN_MAX = 2000 # prune threshold
# Reply threading mode: "off" (no replies), "first" (reply on first
# chunk only, default), "all" (reply-reference on every chunk).
self._reply_to_mode: str = getattr(config, 'reply_to_mode', 'first') or 'first'
@@ -501,9 +495,18 @@ class DiscordAdapter(BasePlatformAdapter):
return False
try:
if not self._acquire_platform_lock('discord-bot-token', self.config.token, 'Discord bot token'):
# Acquire scoped lock to prevent duplicate bot token usage
from gateway.status import acquire_scoped_lock
self._token_lock_identity = self.config.token
acquired, existing = acquire_scoped_lock('discord-bot-token', self._token_lock_identity, metadata={'platform': 'discord'})
if not acquired:
owner_pid = existing.get('pid') if isinstance(existing, dict) else None
message = f'Discord bot token already in use' + (f' (PID {owner_pid})' if owner_pid else '') + '. Stop the other gateway first.'
logger.error('[%s] %s', self.name, message)
self._set_fatal_error('discord_token_lock', message, retryable=False)
return False
# Parse allowed user entries (may contain usernames or IDs)
allowed_env = os.getenv("DISCORD_ALLOWED_USERS", "")
if allowed_env:
@@ -547,19 +550,29 @@ class DiscordAdapter(BasePlatformAdapter):
# Resolve any usernames in the allowed list to numeric IDs
await adapter_self._resolve_allowed_usernames()
adapter_self._ready_event.set()
if adapter_self._post_connect_task and not adapter_self._post_connect_task.done():
adapter_self._post_connect_task.cancel()
adapter_self._post_connect_task = asyncio.create_task(
adapter_self._run_post_connect_initialization()
)
# Sync slash commands with Discord
try:
synced = await adapter_self._client.tree.sync()
logger.info("[%s] Synced %d slash command(s)", adapter_self.name, len(synced))
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[%s] Slash command sync failed: %s", adapter_self.name, e, exc_info=True)
adapter_self._ready_event.set()
@self._client.event
async def on_message(message: DiscordMessage):
# Dedup: Discord RESUME replays events after reconnects (#4777)
if adapter_self._dedup.is_duplicate(str(message.id)):
msg_id = str(message.id)
now = time.time()
if msg_id in adapter_self._seen_messages:
return
adapter_self._seen_messages[msg_id] = now
if len(adapter_self._seen_messages) > adapter_self._SEEN_MAX:
cutoff = now - adapter_self._SEEN_TTL
adapter_self._seen_messages = {
k: v for k, v in adapter_self._seen_messages.items()
if v > cutoff
}
# Always ignore our own messages
if message.author == self._client.user:
@@ -586,35 +599,22 @@ class DiscordAdapter(BasePlatformAdapter):
if not self._client.user or self._client.user not in message.mentions:
return
# "all" falls through to handle_message
# Multi-agent filtering: if the message mentions specific bots
# but NOT this bot, the sender is talking to another agent —
# stay silent. Messages with no bot mentions (general chat)
# still fall through to _handle_message for the existing
# DISCORD_REQUIRE_MENTION check.
#
# This replaces the older DISCORD_IGNORE_NO_MENTION logic
# with bot-aware filtering that works correctly when multiple
# agents share a channel.
if not isinstance(message.channel, discord.DMChannel) and message.mentions:
_self_mentioned = (
# If the message @mentions other users but NOT the bot, the
# sender is talking to someone else — stay silent. Only
# applies in server channels; in DMs the user is always
# talking to the bot (mentions are just references).
# Controlled by DISCORD_IGNORE_NO_MENTION (default: true).
_ignore_no_mention = os.getenv(
"DISCORD_IGNORE_NO_MENTION", "true"
).lower() in ("true", "1", "yes")
if _ignore_no_mention and message.mentions and not isinstance(message.channel, discord.DMChannel):
_bot_mentioned = (
self._client.user is not None
and self._client.user in message.mentions
)
_other_bots_mentioned = any(
m.bot and m != self._client.user
for m in message.mentions
)
# If other bots are mentioned but we're not → not for us
if _other_bots_mentioned and not _self_mentioned:
return
# If humans are mentioned but we're not → not for us
# (preserves old DISCORD_IGNORE_NO_MENTION=true behavior)
_ignore_no_mention = os.getenv(
"DISCORD_IGNORE_NO_MENTION", "true"
).lower() in ("true", "1", "yes")
if _ignore_no_mention and not _self_mentioned and not _other_bots_mentioned:
return
if not _bot_mentioned:
return # Talking to someone else, don't interrupt
await self._handle_message(message)
@@ -665,11 +665,23 @@ class DiscordAdapter(BasePlatformAdapter):
except asyncio.TimeoutError:
logger.error("[%s] Timeout waiting for connection to Discord", self.name, exc_info=True)
self._release_platform_lock()
try:
from gateway.status import release_scoped_lock
if getattr(self, '_token_lock_identity', None):
release_scoped_lock('discord-bot-token', self._token_lock_identity)
self._token_lock_identity = None
except Exception:
pass
return False
except Exception as e: # pragma: no cover - defensive logging
logger.error("[%s] Failed to connect to Discord: %s", self.name, e, exc_info=True)
self._release_platform_lock()
try:
from gateway.status import release_scoped_lock
if getattr(self, '_token_lock_identity', None):
release_scoped_lock('discord-bot-token', self._token_lock_identity)
self._token_lock_identity = None
except Exception:
pass
return False
async def disconnect(self) -> None:
@@ -687,36 +699,21 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[%s] Error during disconnect: %s", self.name, e, exc_info=True)
if self._post_connect_task and not self._post_connect_task.done():
self._post_connect_task.cancel()
try:
await self._post_connect_task
except asyncio.CancelledError:
pass
self._running = False
self._client = None
self._ready_event.clear()
self._post_connect_task = None
self._release_platform_lock()
# Release the token lock
try:
from gateway.status import release_scoped_lock
if getattr(self, '_token_lock_identity', None):
release_scoped_lock('discord-bot-token', self._token_lock_identity)
self._token_lock_identity = None
except Exception:
pass
logger.info("[%s] Disconnected", self.name)
async def _run_post_connect_initialization(self) -> None:
"""Finish non-critical startup work after Discord is connected."""
if not self._client:
return
try:
synced = await asyncio.wait_for(self._client.tree.sync(), timeout=30)
logger.info("[%s] Synced %d slash command(s)", self.name, len(synced))
except asyncio.TimeoutError:
logger.warning("[%s] Slash command sync timed out after 30s", self.name)
except asyncio.CancelledError:
raise
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[%s] Slash command sync failed: %s", self.name, e, exc_info=True)
async def _add_reaction(self, message: Any, emoji: str) -> bool:
"""Add an emoji reaction to a Discord message."""
if not message or not hasattr(message, "add_reaction"):
@@ -751,17 +748,14 @@ class DiscordAdapter(BasePlatformAdapter):
if hasattr(message, "add_reaction"):
await self._add_reaction(message, "👀")
async def on_processing_complete(self, event: MessageEvent, outcome: ProcessingOutcome) -> None:
async def on_processing_complete(self, event: MessageEvent, success: bool) -> None:
"""Swap the in-progress reaction for a final success/failure reaction."""
if not self._reactions_enabled():
return
message = event.raw_message
if hasattr(message, "add_reaction"):
await self._remove_reaction(message, "👀")
if outcome == ProcessingOutcome.SUCCESS:
await self._add_reaction(message, "")
elif outcome == ProcessingOutcome.FAILURE:
await self._add_reaction(message, "")
await self._add_reaction(message, "" if success else "")
async def send(
self,
@@ -770,34 +764,18 @@ class DiscordAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None
) -> SendResult:
"""Send a message to a Discord channel or thread.
When metadata contains a thread_id, the message is sent to that
thread instead of the parent channel identified by chat_id.
"""
"""Send a message to a Discord channel."""
if not self._client:
return SendResult(success=False, error="Not connected")
try:
# Determine target channel: thread_id in metadata takes precedence.
thread_id = None
if metadata and metadata.get("thread_id"):
thread_id = metadata["thread_id"]
# Get the channel
channel = self._client.get_channel(int(chat_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
if thread_id:
# Fetch the thread directly — threads are addressed by their own ID.
channel = self._client.get_channel(int(thread_id))
if not channel:
channel = await self._client.fetch_channel(int(thread_id))
if not channel:
return SendResult(success=False, error=f"Thread {thread_id} not found")
else:
# Get the parent channel
channel = self._client.get_channel(int(chat_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
if not channel:
return SendResult(success=False, error=f"Channel {chat_id} not found")
if not channel:
return SendResult(success=False, error=f"Channel {chat_id} not found")
# Format and split message if needed
formatted = self.format_message(content)
@@ -1046,7 +1024,6 @@ class DiscordAdapter(BasePlatformAdapter):
if task:
task.cancel()
self._voice_text_channels.pop(guild_id, None)
self._voice_sources.pop(guild_id, None)
# Maximum seconds to wait for voice playback before giving up
PLAYBACK_TIMEOUT = 120
@@ -1261,8 +1238,9 @@ class DiscordAdapter(BasePlatformAdapter):
try:
await asyncio.to_thread(VoiceReceiver.pcm_to_wav, pcm_data, wav_path)
from tools.transcription_tools import transcribe_audio
result = await asyncio.to_thread(transcribe_audio, wav_path)
from tools.transcription_tools import transcribe_audio, get_stt_model_from_config
stt_model = get_stt_model_from_config()
result = await asyncio.to_thread(transcribe_audio, wav_path, model=stt_model)
if not result.get("success"):
return
@@ -1854,7 +1832,7 @@ class DiscordAdapter(BasePlatformAdapter):
# Track thread participation so follow-ups don't require @mention
if thread_id:
self._threads.mark(thread_id)
self._track_thread(thread_id)
# If a message was provided, kick off a new Hermes session in the thread
starter = (message or "").strip()
@@ -1889,42 +1867,14 @@ class DiscordAdapter(BasePlatformAdapter):
chat_topic=chat_topic,
)
_parent_id = str(getattr(getattr(interaction, "channel", None), "parent_id", "") or "")
_skills = self._resolve_channel_skills(thread_id, _parent_id or None)
event = MessageEvent(
text=text,
message_type=MessageType.TEXT,
source=source,
raw_message=interaction,
auto_skill=_skills,
)
await self.handle_message(event)
def _resolve_channel_skills(self, channel_id: str, parent_id: str | None = None) -> list[str] | None:
"""Look up auto-skill bindings for a Discord channel/forum thread.
Config format (in platform extra):
channel_skill_bindings:
- id: "123456"
skills: ["skill-a", "skill-b"]
Also checks parent_id so forum threads inherit the forum's bindings.
"""
bindings = self.config.extra.get("channel_skill_bindings", [])
if not bindings:
return None
ids_to_check = {channel_id}
if parent_id:
ids_to_check.add(parent_id)
for entry in bindings:
entry_id = str(entry.get("id", ""))
if entry_id in ids_to_check:
skills = entry.get("skills") or entry.get("skill")
if isinstance(skills, str):
return [skills]
if isinstance(skills, list) and skills:
return list(dict.fromkeys(skills)) # dedup, preserve order
return None
def _thread_parent_channel(self, channel: Any) -> Any:
"""Return the parent text channel when invoked from a thread."""
return getattr(channel, "parent", None) or channel
@@ -2225,6 +2175,49 @@ class DiscordAdapter(BasePlatformAdapter):
return f"{parent_name} / {thread_name}"
return thread_name
# ------------------------------------------------------------------
# Thread participation persistence
# ------------------------------------------------------------------
@staticmethod
def _thread_state_path() -> Path:
"""Path to the persisted thread participation set."""
from hermes_cli.config import get_hermes_home
return get_hermes_home() / "discord_threads.json"
@classmethod
def _load_participated_threads(cls) -> set:
"""Load persisted thread IDs from disk."""
path = cls._thread_state_path()
try:
if path.exists():
data = json.loads(path.read_text(encoding="utf-8"))
if isinstance(data, list):
return set(data)
except Exception as e:
logger.debug("Could not load discord thread state: %s", e)
return set()
def _save_participated_threads(self) -> None:
"""Persist the current thread set to disk (best-effort)."""
path = self._thread_state_path()
try:
# Trim to most recent entries if over cap
thread_list = list(self._bot_participated_threads)
if len(thread_list) > self._MAX_TRACKED_THREADS:
thread_list = thread_list[-self._MAX_TRACKED_THREADS:]
self._bot_participated_threads = set(thread_list)
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(thread_list), encoding="utf-8")
except Exception as e:
logger.debug("Could not save discord thread state: %s", e)
def _track_thread(self, thread_id: str) -> None:
"""Add a thread to the participation set and persist."""
if thread_id not in self._bot_participated_threads:
self._bot_participated_threads.add(thread_id)
self._save_participated_threads()
async def _handle_message(self, message: DiscordMessage) -> None:
"""Handle incoming Discord messages."""
# In server channels (not DMs), require the bot to be @mentioned
@@ -2235,7 +2228,6 @@ class DiscordAdapter(BasePlatformAdapter):
# discord.require_mention: Require @mention in server channels (default: true)
# discord.free_response_channels: Channel IDs where bot responds without mention
# discord.ignored_channels: Channel IDs where bot NEVER responds (even when mentioned)
# discord.allowed_channels: If set, bot ONLY responds in these channels (whitelist)
# discord.no_thread_channels: Channel IDs where bot responds directly without creating thread
# discord.auto_thread: Auto-create thread on @mention in channels (default: true)
@@ -2246,23 +2238,13 @@ class DiscordAdapter(BasePlatformAdapter):
thread_id = str(message.channel.id)
parent_channel_id = self._get_parent_channel_id(message.channel)
is_voice_linked_channel = False
if not isinstance(message.channel, discord.DMChannel):
# Check ignored channels first - never respond even when mentioned
ignored_channels_raw = os.getenv("DISCORD_IGNORED_CHANNELS", "")
ignored_channels = {ch.strip() for ch in ignored_channels_raw.split(",") if ch.strip()}
channel_ids = {str(message.channel.id)}
if parent_channel_id:
channel_ids.add(parent_channel_id)
# Check allowed channels - if set, only respond in these channels
allowed_channels_raw = os.getenv("DISCORD_ALLOWED_CHANNELS", "")
if allowed_channels_raw:
allowed_channels = {ch.strip() for ch in allowed_channels_raw.split(",") if ch.strip()}
if not (channel_ids & allowed_channels):
logger.debug("[%s] Ignoring message in non-allowed channel: %s", self.name, channel_ids)
return
# Check ignored channels - never respond even when mentioned
ignored_channels_raw = os.getenv("DISCORD_IGNORED_CHANNELS", "")
ignored_channels = {ch.strip() for ch in ignored_channels_raw.split(",") if ch.strip()}
if channel_ids & ignored_channels:
logger.debug("[%s] Ignoring message in ignored channel: %s", self.name, channel_ids)
return
@@ -2273,16 +2255,11 @@ class DiscordAdapter(BasePlatformAdapter):
channel_ids.add(parent_channel_id)
require_mention = os.getenv("DISCORD_REQUIRE_MENTION", "true").lower() not in ("false", "0", "no")
# Voice-linked text channels act as free-response while voice is active.
# Only the exact bound channel gets the exemption, not sibling threads.
voice_linked_ids = {str(ch_id) for ch_id in self._voice_text_channels.values()}
current_channel_id = str(message.channel.id)
is_voice_linked_channel = current_channel_id in voice_linked_ids
is_free_channel = bool(channel_ids & free_channels) or is_voice_linked_channel
is_free_channel = bool(channel_ids & free_channels)
# Skip the mention check if the message is in a thread where
# the bot has previously participated (auto-created or replied in).
in_bot_thread = is_thread and thread_id in self._threads
in_bot_thread = is_thread and thread_id in self._bot_participated_threads
if require_mention and not is_free_channel and not in_bot_thread:
if self._client.user not in message.mentions:
@@ -2302,13 +2279,13 @@ class DiscordAdapter(BasePlatformAdapter):
no_thread_channels = {ch.strip() for ch in no_thread_channels_raw.split(",") if ch.strip()}
skip_thread = bool(channel_ids & no_thread_channels)
auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in ("true", "1", "yes")
if auto_thread and not skip_thread and not is_voice_linked_channel:
if auto_thread and not skip_thread:
thread = await self._auto_create_thread(message)
if thread:
is_thread = True
thread_id = str(thread.id)
auto_threaded_channel = thread
self._threads.mark(thread_id)
self._track_thread(thread_id)
# Determine message type
msg_type = MessageType.TEXT
@@ -2472,10 +2449,6 @@ class DiscordAdapter(BasePlatformAdapter):
if not event_text or not event_text.strip():
event_text = "(The user sent a message with no text content)"
_chan = message.channel
_parent_id = str(getattr(_chan, "parent_id", "") or "")
_chan_id = str(getattr(_chan, "id", ""))
_skills = self._resolve_channel_skills(_chan_id, _parent_id or None)
event = MessageEvent(
text=event_text,
message_type=msg_type,
@@ -2486,88 +2459,14 @@ class DiscordAdapter(BasePlatformAdapter):
media_types=media_types,
reply_to_message_id=str(message.reference.message_id) if message.reference else None,
timestamp=message.created_at,
auto_skill=_skills,
)
# Track thread participation so the bot won't require @mention for
# follow-up messages in threads it has already engaged in.
if thread_id:
self._threads.mark(thread_id)
self._track_thread(thread_id)
# Only batch plain text messages — commands, media, etc. dispatch
# immediately since they won't be split by the Discord client.
if msg_type == MessageType.TEXT and self._text_batch_delay_seconds > 0:
self._enqueue_text_event(event)
else:
await self.handle_message(event)
# ------------------------------------------------------------------
# Text message aggregation (handles Discord client-side splits)
# ------------------------------------------------------------------
def _text_batch_key(self, event: MessageEvent) -> str:
"""Session-scoped key for text message batching."""
from gateway.session import build_session_key
return build_session_key(
event.source,
group_sessions_per_user=self.config.extra.get("group_sessions_per_user", True),
thread_sessions_per_user=self.config.extra.get("thread_sessions_per_user", False),
)
def _enqueue_text_event(self, event: MessageEvent) -> None:
"""Buffer a text event and reset the flush timer.
When Discord splits a long user message at 2000 chars, the chunks
arrive within a few hundred milliseconds. This merges them into
a single event before dispatching.
"""
key = self._text_batch_key(event)
existing = self._pending_text_batches.get(key)
chunk_len = len(event.text or "")
if existing is None:
event._last_chunk_len = chunk_len # type: ignore[attr-defined]
self._pending_text_batches[key] = event
else:
if event.text:
existing.text = f"{existing.text}\n{event.text}" if existing.text else event.text
existing._last_chunk_len = chunk_len # type: ignore[attr-defined]
if event.media_urls:
existing.media_urls.extend(event.media_urls)
existing.media_types.extend(event.media_types)
prior_task = self._pending_text_batch_tasks.get(key)
if prior_task and not prior_task.done():
prior_task.cancel()
self._pending_text_batch_tasks[key] = asyncio.create_task(
self._flush_text_batch(key)
)
async def _flush_text_batch(self, key: str) -> None:
"""Wait for the quiet period then dispatch the aggregated text.
Uses a longer delay when the latest chunk is near Discord's 2000-char
split point, since a continuation chunk is almost certain.
"""
current_task = asyncio.current_task()
try:
pending = self._pending_text_batches.get(key)
last_len = getattr(pending, "_last_chunk_len", 0) if pending else 0
if last_len >= self._SPLIT_THRESHOLD:
delay = self._text_batch_split_delay_seconds
else:
delay = self._text_batch_delay_seconds
await asyncio.sleep(delay)
event = self._pending_text_batches.pop(key, None)
if not event:
return
logger.info(
"[Discord] Flushing text batch %s (%d chars)",
key, len(event.text or ""),
)
await self.handle_message(event)
finally:
if self._pending_text_batch_tasks.get(key) is current_task:
self._pending_text_batch_tasks.pop(key, None)
await self.handle_message(event)
# ---------------------------------------------------------------------------
+1 -5
View File
@@ -195,11 +195,7 @@ def _extract_attachments(
ext = Path(filename).suffix.lower()
if ext in _IMAGE_EXTS:
try:
cached_path = cache_image_from_bytes(payload, ext)
except ValueError:
logger.debug("Skipping non-image attachment %s (invalid magic bytes)", filename)
continue
cached_path = cache_image_from_bytes(payload, ext)
attachments.append({
"path": cached_path,
"filename": filename,
+17 -392
View File
@@ -34,9 +34,6 @@ from datetime import datetime
from pathlib import Path
from types import SimpleNamespace
from typing import Any, Dict, List, Optional
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import Request, urlopen
# aiohttp/websockets are independent optional deps — import outside lark_oapi
# so they remain available for tests and webhook mode even if lark_oapi is missing.
@@ -172,19 +169,6 @@ _FEISHU_CARD_ACTION_DEDUP_TTL_SECONDS = 15 * 60 # card action token dedup win
_FEISHU_BOT_MSG_TRACK_SIZE = 512 # LRU size for tracking sent message IDs
_FEISHU_REPLY_FALLBACK_CODES = frozenset({230011, 231003}) # reply target withdrawn/missing → create fallback
_FEISHU_ACK_EMOJI = "OK"
# QR onboarding constants
_ONBOARD_ACCOUNTS_URLS = {
"feishu": "https://accounts.feishu.cn",
"lark": "https://accounts.larksuite.com",
}
_ONBOARD_OPEN_URLS = {
"feishu": "https://open.feishu.cn",
"lark": "https://open.larksuite.com",
}
_REGISTRATION_PATH = "/oauth/v1/app/registration"
_ONBOARD_REQUEST_TIMEOUT_S = 10
# ---------------------------------------------------------------------------
# Fallback display strings
# ---------------------------------------------------------------------------
@@ -280,7 +264,6 @@ class FeishuAdapterSettings:
bot_name: str
dedup_cache_size: int
text_batch_delay_seconds: float
text_batch_split_delay_seconds: float
text_batch_max_messages: int
text_batch_max_chars: int
media_batch_delay_seconds: float
@@ -376,21 +359,19 @@ def _render_code_block_element(element: Dict[str, Any]) -> str:
def _strip_markdown_to_plain_text(text: str) -> str:
"""Strip markdown formatting to plain text for Feishu text fallbacks.
Delegates common markdown stripping to the shared helper and adds
Feishu-specific patterns (blockquotes, strikethrough, underline tags,
horizontal rules, \\r\\n normalisation).
"""
from gateway.platforms.helpers import strip_markdown
plain = text.replace("\r\n", "\n")
plain = _MARKDOWN_LINK_RE.sub(lambda m: f"{m.group(1)} ({m.group(2).strip()})", plain)
plain = re.sub(r"^#{1,6}\s+", "", plain, flags=re.MULTILINE)
plain = re.sub(r"^>\s?", "", plain, flags=re.MULTILINE)
plain = re.sub(r"^\s*---+\s*$", "---", plain, flags=re.MULTILINE)
plain = re.sub(r"```(?:[^\n]*\n)?([\s\S]*?)```", lambda m: m.group(1).strip("\n"), plain)
plain = re.sub(r"`([^`\n]+)`", r"\1", plain)
plain = re.sub(r"\*\*([^*\n]+)\*\*", r"\1", plain)
plain = re.sub(r"\*([^*\n]+)\*", r"\1", plain)
plain = re.sub(r"~~([^~\n]+)~~", r"\1", plain)
plain = re.sub(r"<u>([\s\S]*?)</u>", r"\1", plain)
plain = strip_markdown(plain)
return plain
plain = re.sub(r"\n{3,}", "\n\n", plain)
return plain.strip()
def _coerce_int(value: Any, default: Optional[int] = None, min_value: int = 0) -> Optional[int]:
@@ -991,8 +972,7 @@ def _run_official_feishu_ws_client(ws_client: Any, adapter: Any) -> None:
return await original_connect(*args, **kwargs)
def _configure_with_overrides(conf: Any) -> Any:
if original_configure is None:
raise RuntimeError("Feishu _configure_with_overrides called but original_configure is None")
assert original_configure is not None
result = original_configure(conf)
_apply_runtime_ws_overrides()
return result
@@ -1034,10 +1014,6 @@ class FeishuAdapter(BasePlatformAdapter):
"""Feishu/Lark bot adapter."""
MAX_MESSAGE_LENGTH = 8000
# Threshold for detecting Feishu client-side message splits.
# When a chunk is near the ~4096-char practical limit, a continuation
# is almost certain.
_SPLIT_THRESHOLD = 4000
# =========================================================================
# Lifecycle — init / settings / connect / disconnect
@@ -1129,9 +1105,6 @@ class FeishuAdapter(BasePlatformAdapter):
text_batch_delay_seconds=float(
os.getenv("HERMES_FEISHU_TEXT_BATCH_DELAY_SECONDS", str(_DEFAULT_TEXT_BATCH_DELAY_SECONDS))
),
text_batch_split_delay_seconds=float(
os.getenv("HERMES_FEISHU_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0")
),
text_batch_max_messages=max(
1,
int(os.getenv("HERMES_FEISHU_TEXT_BATCH_MAX_MESSAGES", str(_DEFAULT_TEXT_BATCH_MAX_MESSAGES))),
@@ -1179,7 +1152,6 @@ class FeishuAdapter(BasePlatformAdapter):
self._bot_name = settings.bot_name
self._dedup_cache_size = settings.dedup_cache_size
self._text_batch_delay_seconds = settings.text_batch_delay_seconds
self._text_batch_split_delay_seconds = settings.text_batch_split_delay_seconds
self._text_batch_max_messages = settings.text_batch_max_messages
self._text_batch_max_chars = settings.text_batch_max_chars
self._media_batch_delay_seconds = settings.media_batch_delay_seconds
@@ -1208,8 +1180,6 @@ class FeishuAdapter(BasePlatformAdapter):
lambda data: self._on_reaction_event("im.message.reaction.deleted_v1", data)
)
.register_p2_card_action_trigger(self._on_card_action_trigger)
.register_p2_im_chat_member_bot_added_v1(self._on_bot_added_to_chat)
.register_p2_im_chat_member_bot_deleted_v1(self._on_bot_removed_from_chat)
.build()
)
@@ -1600,18 +1570,13 @@ class FeishuAdapter(BasePlatformAdapter):
return SendResult(success=False, error=f"Image file not found: {image_path}")
try:
import io as _io
with open(image_path, "rb") as f:
image_bytes = f.read()
# Wrap in BytesIO so lark SDK's MultipartEncoder can read .name and .tell()
image_file = _io.BytesIO(image_bytes)
image_file.name = os.path.basename(image_path)
body = self._build_image_upload_body(
image_type=_FEISHU_IMAGE_UPLOAD_TYPE,
image=image_file,
)
request = self._build_image_upload_request(body)
upload_response = await asyncio.to_thread(self._client.im.v1.image.create, request)
with open(image_path, "rb") as image_file:
body = self._build_image_upload_body(
image_type=_FEISHU_IMAGE_UPLOAD_TYPE,
image=image_file,
)
request = self._build_image_upload_request(body)
upload_response = await asyncio.to_thread(self._client.im.v1.image.create, request)
image_key = self._extract_response_field(upload_response, "image_key")
if not image_key:
return self._response_error_result(
@@ -2513,10 +2478,8 @@ class FeishuAdapter(BasePlatformAdapter):
async def _enqueue_text_event(self, event: MessageEvent) -> None:
"""Debounce rapid Feishu text bursts into a single MessageEvent."""
key = self._text_batch_key(event)
chunk_len = len(event.text or "")
existing = self._pending_text_batches.get(key)
if existing is None:
event._last_chunk_len = chunk_len # type: ignore[attr-defined]
self._pending_text_batches[key] = event
self._pending_text_batch_counts[key] = 1
self._schedule_text_batch_flush(key)
@@ -2541,7 +2504,6 @@ class FeishuAdapter(BasePlatformAdapter):
return
existing.text = next_text
existing._last_chunk_len = chunk_len # type: ignore[attr-defined]
existing.timestamp = event.timestamp
if event.message_id:
existing.message_id = event.message_id
@@ -2568,22 +2530,10 @@ class FeishuAdapter(BasePlatformAdapter):
task_map[key] = asyncio.create_task(flush_fn(key))
async def _flush_text_batch(self, key: str) -> None:
"""Flush a pending text batch after the quiet period.
Uses a longer delay when the latest chunk is near Feishu's ~4096-char
split point, since a continuation chunk is almost certain.
"""
"""Flush a pending text batch after the quiet period."""
current_task = asyncio.current_task()
try:
# Adaptive delay: if the latest chunk is near the split threshold,
# a continuation is almost certain — wait longer.
pending = self._pending_text_batches.get(key)
last_len = getattr(pending, "_last_chunk_len", 0) if pending else 0
if last_len >= self._SPLIT_THRESHOLD:
delay = self._text_batch_split_delay_seconds
else:
delay = self._text_batch_delay_seconds
await asyncio.sleep(delay)
await asyncio.sleep(self._text_batch_delay_seconds)
await self._flush_text_batch_now(key)
finally:
if self._pending_text_batch_tasks.get(key) is current_task:
@@ -3637,328 +3587,3 @@ class FeishuAdapter(BasePlatformAdapter):
return _FEISHU_FILE_UPLOAD_TYPE, "file"
return _FEISHU_FILE_UPLOAD_TYPE, "file"
# =============================================================================
# QR scan-to-create onboarding
#
# Device-code flow: user scans a QR code with Feishu/Lark mobile app and the
# platform creates a fully configured bot application automatically.
# Called by `hermes gateway setup` via _setup_feishu() in hermes_cli/gateway.py.
# =============================================================================
def _accounts_base_url(domain: str) -> str:
return _ONBOARD_ACCOUNTS_URLS.get(domain, _ONBOARD_ACCOUNTS_URLS["feishu"])
def _onboard_open_base_url(domain: str) -> str:
return _ONBOARD_OPEN_URLS.get(domain, _ONBOARD_OPEN_URLS["feishu"])
def _post_registration(base_url: str, body: Dict[str, str]) -> dict:
"""POST form-encoded data to the registration endpoint, return parsed JSON.
The registration endpoint returns JSON even on 4xx (e.g. poll returns
authorization_pending as a 400). We always parse the body regardless of
HTTP status.
"""
url = f"{base_url}{_REGISTRATION_PATH}"
data = urlencode(body).encode("utf-8")
req = Request(url, data=data, headers={"Content-Type": "application/x-www-form-urlencoded"})
try:
with urlopen(req, timeout=_ONBOARD_REQUEST_TIMEOUT_S) as resp:
return json.loads(resp.read().decode("utf-8"))
except HTTPError as exc:
body_bytes = exc.read()
if body_bytes:
try:
return json.loads(body_bytes.decode("utf-8"))
except (ValueError, json.JSONDecodeError):
raise exc from None
raise
def _init_registration(domain: str = "feishu") -> None:
"""Verify the environment supports client_secret auth.
Raises RuntimeError if not supported.
"""
base_url = _accounts_base_url(domain)
res = _post_registration(base_url, {"action": "init"})
methods = res.get("supported_auth_methods") or []
if "client_secret" not in methods:
raise RuntimeError(
f"Feishu / Lark registration environment does not support client_secret auth. "
f"Supported: {methods}"
)
def _begin_registration(domain: str = "feishu") -> dict:
"""Start the device-code flow. Returns device_code, qr_url, user_code, interval, expire_in."""
base_url = _accounts_base_url(domain)
res = _post_registration(base_url, {
"action": "begin",
"archetype": "PersonalAgent",
"auth_method": "client_secret",
"request_user_info": "open_id",
})
device_code = res.get("device_code")
if not device_code:
raise RuntimeError("Feishu / Lark registration did not return a device_code")
qr_url = res.get("verification_uri_complete", "")
if "?" in qr_url:
qr_url += "&from=hermes&tp=hermes"
else:
qr_url += "?from=hermes&tp=hermes"
return {
"device_code": device_code,
"qr_url": qr_url,
"user_code": res.get("user_code", ""),
"interval": res.get("interval") or 5,
"expire_in": res.get("expire_in") or 600,
}
def _poll_registration(
*,
device_code: str,
interval: int,
expire_in: int,
domain: str = "feishu",
) -> Optional[dict]:
"""Poll until the user scans the QR code, or timeout/denial.
Returns dict with app_id, app_secret, domain, open_id on success.
Returns None on failure.
"""
deadline = time.time() + expire_in
current_domain = domain
domain_switched = False
poll_count = 0
while time.time() < deadline:
base_url = _accounts_base_url(current_domain)
try:
res = _post_registration(base_url, {
"action": "poll",
"device_code": device_code,
"tp": "ob_app",
})
except (URLError, OSError, json.JSONDecodeError):
time.sleep(interval)
continue
poll_count += 1
if poll_count == 1:
print(" Fetching configuration results...", end="", flush=True)
elif poll_count % 6 == 0:
print(".", end="", flush=True)
# Domain auto-detection
user_info = res.get("user_info") or {}
tenant_brand = user_info.get("tenant_brand")
if tenant_brand == "lark" and not domain_switched:
current_domain = "lark"
domain_switched = True
# Fall through — server may return credentials in this same response.
# Success
if res.get("client_id") and res.get("client_secret"):
if poll_count > 0:
print() # newline after "Fetching configuration results..." dots
return {
"app_id": res["client_id"],
"app_secret": res["client_secret"],
"domain": current_domain,
"open_id": user_info.get("open_id"),
}
# Terminal errors
error = res.get("error", "")
if error in ("access_denied", "expired_token"):
if poll_count > 0:
print()
logger.warning("[Feishu onboard] Registration %s", error)
return None
# authorization_pending or unknown — keep polling
time.sleep(interval)
if poll_count > 0:
print()
logger.warning("[Feishu onboard] Poll timed out after %ds", expire_in)
return None
try:
import qrcode as _qrcode_mod
except (ImportError, TypeError):
_qrcode_mod = None # type: ignore[assignment]
def _render_qr(url: str) -> bool:
"""Try to render a QR code in the terminal. Returns True if successful."""
if _qrcode_mod is None:
return False
try:
qr = _qrcode_mod.QRCode()
qr.add_data(url)
qr.make(fit=True)
qr.print_ascii(invert=True)
return True
except Exception:
return False
def probe_bot(app_id: str, app_secret: str, domain: str) -> Optional[dict]:
"""Verify bot connectivity via /open-apis/bot/v3/info.
Uses lark_oapi SDK when available, falls back to raw HTTP otherwise.
Returns {"bot_name": ..., "bot_open_id": ...} on success, None on failure.
"""
if FEISHU_AVAILABLE:
return _probe_bot_sdk(app_id, app_secret, domain)
return _probe_bot_http(app_id, app_secret, domain)
def _build_onboard_client(app_id: str, app_secret: str, domain: str) -> Any:
"""Build a lark Client for the given credentials and domain."""
sdk_domain = LARK_DOMAIN if domain == "lark" else FEISHU_DOMAIN
return (
lark.Client.builder()
.app_id(app_id)
.app_secret(app_secret)
.domain(sdk_domain)
.log_level(lark.LogLevel.WARNING)
.build()
)
def _parse_bot_response(data: dict) -> Optional[dict]:
"""Extract bot_name and bot_open_id from a /bot/v3/info response."""
if data.get("code") != 0:
return None
bot = data.get("bot") or data.get("data", {}).get("bot") or {}
return {
"bot_name": bot.get("bot_name"),
"bot_open_id": bot.get("open_id"),
}
def _probe_bot_sdk(app_id: str, app_secret: str, domain: str) -> Optional[dict]:
"""Probe bot info using lark_oapi SDK."""
try:
client = _build_onboard_client(app_id, app_secret, domain)
resp = client.request(
method="GET",
url="/open-apis/bot/v3/info",
body=None,
raw_response=True,
)
return _parse_bot_response(json.loads(resp.content))
except Exception as exc:
logger.debug("[Feishu onboard] SDK probe failed: %s", exc)
return None
def _probe_bot_http(app_id: str, app_secret: str, domain: str) -> Optional[dict]:
"""Fallback probe using raw HTTP (when lark_oapi is not installed)."""
base_url = _onboard_open_base_url(domain)
try:
token_data = json.dumps({"app_id": app_id, "app_secret": app_secret}).encode("utf-8")
token_req = Request(
f"{base_url}/open-apis/auth/v3/tenant_access_token/internal",
data=token_data,
headers={"Content-Type": "application/json"},
)
with urlopen(token_req, timeout=_ONBOARD_REQUEST_TIMEOUT_S) as resp:
token_res = json.loads(resp.read().decode("utf-8"))
access_token = token_res.get("tenant_access_token")
if not access_token:
return None
bot_req = Request(
f"{base_url}/open-apis/bot/v3/info",
headers={
"Authorization": f"Bearer {access_token}",
"Content-Type": "application/json",
},
)
with urlopen(bot_req, timeout=_ONBOARD_REQUEST_TIMEOUT_S) as resp:
bot_res = json.loads(resp.read().decode("utf-8"))
return _parse_bot_response(bot_res)
except (URLError, OSError, KeyError, json.JSONDecodeError) as exc:
logger.debug("[Feishu onboard] HTTP probe failed: %s", exc)
return None
def qr_register(
*,
initial_domain: str = "feishu",
timeout_seconds: int = 600,
) -> Optional[dict]:
"""Run the Feishu / Lark scan-to-create QR registration flow.
Returns on success::
{
"app_id": str,
"app_secret": str,
"domain": "feishu" | "lark",
"open_id": str | None,
"bot_name": str | None,
"bot_open_id": str | None,
}
Returns None on expected failures (network, auth denied, timeout).
Unexpected errors (bugs, protocol regressions) propagate to the caller.
"""
try:
return _qr_register_inner(initial_domain=initial_domain, timeout_seconds=timeout_seconds)
except (RuntimeError, URLError, OSError, json.JSONDecodeError) as exc:
logger.warning("[Feishu onboard] Registration failed: %s", exc)
return None
def _qr_register_inner(
*,
initial_domain: str,
timeout_seconds: int,
) -> Optional[dict]:
"""Run init → begin → poll → probe. Raises on network/protocol errors."""
print(" Connecting to Feishu / Lark...", end="", flush=True)
_init_registration(initial_domain)
begin = _begin_registration(initial_domain)
print(" done.")
print()
qr_url = begin["qr_url"]
if _render_qr(qr_url):
print(f"\n Scan the QR code above, or open this URL directly:\n {qr_url}")
else:
print(f" Open this URL in Feishu / Lark on your phone:\n\n {qr_url}\n")
print(" Tip: pip install qrcode to display a scannable QR code here next time")
print()
result = _poll_registration(
device_code=begin["device_code"],
interval=begin["interval"],
expire_in=min(begin["expire_in"], timeout_seconds),
domain=initial_domain,
)
if not result:
return None
# Probe bot — best-effort, don't fail the registration
bot_info = probe_bot(result["app_id"], result["app_secret"], result["domain"])
if bot_info:
result["bot_name"] = bot_info.get("bot_name")
result["bot_open_id"] = bot_info.get("bot_open_id")
else:
result["bot_name"] = None
result["bot_open_id"] = None
return result
-261
View File
@@ -1,261 +0,0 @@
"""Shared helper classes for gateway platform adapters.
Extracts common patterns that were duplicated across 5-7 adapters:
message deduplication, text batch aggregation, markdown stripping,
and thread participation tracking.
"""
import asyncio
import json
import logging
import re
import time
from pathlib import Path
from typing import TYPE_CHECKING, Dict, Optional
if TYPE_CHECKING:
from gateway.platforms.base import BasePlatformAdapter, MessageEvent
logger = logging.getLogger(__name__)
# ─── Message Deduplication ────────────────────────────────────────────────────
class MessageDeduplicator:
"""TTL-based message deduplication cache.
Replaces the identical ``_seen_messages`` / ``_is_duplicate()`` pattern
previously duplicated in discord, slack, dingtalk, wecom, weixin,
mattermost, and feishu adapters.
Usage::
self._dedup = MessageDeduplicator()
# In message handler:
if self._dedup.is_duplicate(msg_id):
return
"""
def __init__(self, max_size: int = 2000, ttl_seconds: float = 300):
self._seen: Dict[str, float] = {}
self._max_size = max_size
self._ttl = ttl_seconds
def is_duplicate(self, msg_id: str) -> bool:
"""Return True if *msg_id* was already seen within the TTL window."""
if not msg_id:
return False
now = time.time()
if msg_id in self._seen:
return True
self._seen[msg_id] = now
if len(self._seen) > self._max_size:
cutoff = now - self._ttl
self._seen = {k: v for k, v in self._seen.items() if v > cutoff}
return False
def clear(self):
"""Clear all tracked messages."""
self._seen.clear()
# ─── Text Batch Aggregation ──────────────────────────────────────────────────
class TextBatchAggregator:
"""Aggregates rapid-fire text events into single messages.
Replaces the ``_enqueue_text_event`` / ``_flush_text_batch`` pattern
previously duplicated in telegram, discord, matrix, wecom, and feishu.
Usage::
self._text_batcher = TextBatchAggregator(
handler=self._message_handler,
batch_delay=0.6,
split_threshold=1900,
)
# In message dispatch:
if msg_type == MessageType.TEXT and self._text_batcher.is_enabled():
self._text_batcher.enqueue(event, session_key)
return
"""
def __init__(
self,
handler,
*,
batch_delay: float = 0.6,
split_delay: float = 2.0,
split_threshold: int = 4000,
):
self._handler = handler
self._batch_delay = batch_delay
self._split_delay = split_delay
self._split_threshold = split_threshold
self._pending: Dict[str, "MessageEvent"] = {}
self._pending_tasks: Dict[str, asyncio.Task] = {}
def is_enabled(self) -> bool:
"""Return True if batching is active (delay > 0)."""
return self._batch_delay > 0
def enqueue(self, event: "MessageEvent", key: str) -> None:
"""Add *event* to the pending batch for *key*."""
chunk_len = len(event.text or "")
existing = self._pending.get(key)
if not existing:
event._last_chunk_len = chunk_len # type: ignore[attr-defined]
self._pending[key] = event
else:
existing.text = f"{existing.text}\n{event.text}"
existing._last_chunk_len = chunk_len # type: ignore[attr-defined]
# Cancel prior flush timer, start a new one
prior = self._pending_tasks.get(key)
if prior and not prior.done():
prior.cancel()
self._pending_tasks[key] = asyncio.create_task(self._flush(key))
async def _flush(self, key: str) -> None:
"""Wait then dispatch the batched event for *key*."""
current_task = self._pending_tasks.get(key)
pending = self._pending.get(key)
last_len = getattr(pending, "_last_chunk_len", 0) if pending else 0
# Use longer delay when the last chunk looks like a split message
delay = self._split_delay if last_len >= self._split_threshold else self._batch_delay
await asyncio.sleep(delay)
event = self._pending.pop(key, None)
if event:
try:
await self._handler(event)
except Exception:
logger.exception("[TextBatchAggregator] Error dispatching batched event for %s", key)
if self._pending_tasks.get(key) is current_task:
self._pending_tasks.pop(key, None)
def cancel_all(self) -> None:
"""Cancel all pending flush tasks."""
for task in self._pending_tasks.values():
if not task.done():
task.cancel()
self._pending_tasks.clear()
self._pending.clear()
# ─── Markdown Stripping ──────────────────────────────────────────────────────
# Pre-compiled regexes for performance
_RE_BOLD = re.compile(r"\*\*(.+?)\*\*", re.DOTALL)
_RE_ITALIC_STAR = re.compile(r"\*(.+?)\*", re.DOTALL)
_RE_BOLD_UNDER = re.compile(r"__(.+?)__", re.DOTALL)
_RE_ITALIC_UNDER = re.compile(r"_(.+?)_", re.DOTALL)
_RE_CODE_BLOCK = re.compile(r"```[a-zA-Z0-9_+-]*\n?")
_RE_INLINE_CODE = re.compile(r"`(.+?)`")
_RE_HEADING = re.compile(r"^#{1,6}\s+", re.MULTILINE)
_RE_LINK = re.compile(r"\[([^\]]+)\]\([^\)]+\)")
_RE_MULTI_NEWLINE = re.compile(r"\n{3,}")
def strip_markdown(text: str) -> str:
"""Strip markdown formatting for plain-text platforms (SMS, iMessage, etc.).
Replaces the identical ``_strip_markdown()`` functions previously
duplicated in sms.py, bluebubbles.py, and feishu.py.
"""
text = _RE_BOLD.sub(r"\1", text)
text = _RE_ITALIC_STAR.sub(r"\1", text)
text = _RE_BOLD_UNDER.sub(r"\1", text)
text = _RE_ITALIC_UNDER.sub(r"\1", text)
text = _RE_CODE_BLOCK.sub("", text)
text = _RE_INLINE_CODE.sub(r"\1", text)
text = _RE_HEADING.sub("", text)
text = _RE_LINK.sub(r"\1", text)
text = _RE_MULTI_NEWLINE.sub("\n\n", text)
return text.strip()
# ─── Thread Participation Tracking ───────────────────────────────────────────
class ThreadParticipationTracker:
"""Persistent tracking of threads the bot has participated in.
Replaces the identical ``_load/_save_participated_threads`` +
``_mark_thread_participated`` pattern previously duplicated in
discord.py and matrix.py.
Usage::
self._threads = ThreadParticipationTracker("discord")
# Check membership:
if thread_id in self._threads:
...
# Mark participation:
self._threads.mark(thread_id)
"""
_MAX_TRACKED = 500
def __init__(self, platform_name: str, max_tracked: int = 500):
self._platform = platform_name
self._max_tracked = max_tracked
self._threads: set = self._load()
def _state_path(self) -> Path:
from hermes_constants import get_hermes_home
return get_hermes_home() / f"{self._platform}_threads.json"
def _load(self) -> set:
path = self._state_path()
if path.exists():
try:
return set(json.loads(path.read_text(encoding="utf-8")))
except Exception:
pass
return set()
def _save(self) -> None:
path = self._state_path()
path.parent.mkdir(parents=True, exist_ok=True)
thread_list = list(self._threads)
if len(thread_list) > self._max_tracked:
thread_list = thread_list[-self._max_tracked:]
self._threads = set(thread_list)
path.write_text(json.dumps(thread_list), encoding="utf-8")
def mark(self, thread_id: str) -> None:
"""Mark *thread_id* as participated and persist."""
if thread_id not in self._threads:
self._threads.add(thread_id)
self._save()
def __contains__(self, thread_id: str) -> bool:
return thread_id in self._threads
def clear(self) -> None:
self._threads.clear()
# ─── Phone Number Redaction ──────────────────────────────────────────────────
def redact_phone(phone: str) -> str:
"""Redact a phone number for logging, preserving country code and last 4.
Replaces the identical ``_redact_phone()`` functions in signal.py,
sms.py, and bluebubbles.py.
"""
if not phone:
return "<none>"
if len(phone) <= 8:
return phone[:2] + "****" + phone[-2:] if len(phone) > 4 else "****"
return phone[:4] + "****" + phone[-4:]
File diff suppressed because it is too large Load Diff
+18 -5
View File
@@ -18,11 +18,11 @@ import json
import logging
import os
import re
import time
from pathlib import Path
from typing import Any, Dict, List, Optional
from gateway.config import Platform, PlatformConfig
from gateway.platforms.helpers import MessageDeduplicator
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
@@ -96,8 +96,10 @@ class MattermostAdapter(BasePlatformAdapter):
or os.getenv("MATTERMOST_REPLY_MODE", "off")
).lower()
# Dedup cache (prevent reprocessing)
self._dedup = MessageDeduplicator()
# Dedup cache: post_id → timestamp (prevent reprocessing)
self._seen_posts: Dict[str, float] = {}
self._SEEN_MAX = 2000
self._SEEN_TTL = 300 # 5 minutes
# ------------------------------------------------------------------
# HTTP helpers
@@ -602,8 +604,10 @@ class MattermostAdapter(BasePlatformAdapter):
post_id = post.get("id", "")
# Dedup.
if self._dedup.is_duplicate(post_id):
self._prune_seen()
if post_id in self._seen_posts:
return
self._seen_posts[post_id] = time.time()
# Build message event.
channel_id = post.get("channel_id", "")
@@ -730,4 +734,13 @@ class MattermostAdapter(BasePlatformAdapter):
await self.handle_message(msg_event)
def _prune_seen(self) -> None:
"""Remove expired entries from the dedup cache."""
if len(self._seen_posts) < self._SEEN_MAX:
return
now = time.time()
self._seen_posts = {
pid: ts
for pid, ts in self._seen_posts.items()
if now - ts < self._SEEN_TTL
}
+40 -5
View File
@@ -37,7 +37,6 @@ from gateway.platforms.base import (
cache_document_from_bytes,
cache_image_from_url,
)
from gateway.platforms.helpers import redact_phone
logger = logging.getLogger(__name__)
@@ -52,10 +51,22 @@ SSE_RETRY_DELAY_MAX = 60.0
HEALTH_CHECK_INTERVAL = 30.0 # seconds between health checks
HEALTH_CHECK_STALE_THRESHOLD = 120.0 # seconds without SSE activity before concern
# E.164 phone number pattern for redaction
_PHONE_RE = re.compile(r"\+[1-9]\d{6,14}")
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _redact_phone(phone: str) -> str:
"""Redact a phone number for logging: +15551234567 -> +155****4567."""
if not phone:
return "<none>"
if len(phone) <= 8:
return phone[:2] + "****" + phone[-2:] if len(phone) > 4 else "****"
return phone[:4] + "****" + phone[-4:]
def _parse_comma_list(value: str) -> List[str]:
"""Split a comma-separated string into a list, stripping whitespace."""
@@ -173,8 +184,10 @@ class SignalAdapter(BasePlatformAdapter):
self._recent_sent_timestamps: set = set()
self._max_recent_timestamps = 50
self._phone_lock_identity: Optional[str] = None
logger.info("Signal adapter initialized: url=%s account=%s groups=%s",
self.http_url, redact_phone(self.account),
self.http_url, _redact_phone(self.account),
"enabled" if self.group_allow_from else "disabled")
# ------------------------------------------------------------------
@@ -189,7 +202,23 @@ class SignalAdapter(BasePlatformAdapter):
# Acquire scoped lock to prevent duplicate Signal listeners for the same phone
try:
if not self._acquire_platform_lock('signal-phone', self.account, 'Signal account'):
from gateway.status import acquire_scoped_lock
self._phone_lock_identity = self.account
acquired, existing = acquire_scoped_lock(
"signal-phone",
self._phone_lock_identity,
metadata={"platform": self.platform.value},
)
if not acquired:
owner_pid = existing.get("pid") if isinstance(existing, dict) else None
message = (
"Another local Hermes gateway is already using this Signal account"
+ (f" (PID {owner_pid})." if owner_pid else ".")
+ " Stop the other gateway before starting a second Signal listener."
)
logger.error("Signal: %s", message)
self._set_fatal_error("signal_phone_lock", message, retryable=False)
return False
except Exception as e:
logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)
@@ -241,7 +270,13 @@ class SignalAdapter(BasePlatformAdapter):
await self.client.aclose()
self.client = None
self._release_platform_lock()
if self._phone_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("signal-phone", self._phone_lock_identity)
except Exception as e:
logger.warning("Signal: Error releasing phone lock: %s", e, exc_info=True)
self._phone_lock_identity = None
logger.info("Signal: disconnected")
@@ -507,7 +542,7 @@ class SignalAdapter(BasePlatformAdapter):
)
logger.debug("Signal: message from %s in %s: %s",
redact_phone(sender), chat_id[:20], (text or "")[:50])
_redact_phone(sender), chat_id[:20], (text or "")[:50])
await self.handle_message(event)
+35 -34
View File
@@ -33,14 +33,12 @@ from pathlib import Path as _Path
sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
from gateway.platforms.helpers import MessageDeduplicator
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
SUPPORTED_DOCUMENT_TYPES,
safe_url_for_log,
cache_document_from_bytes,
)
@@ -90,9 +88,11 @@ class SlackAdapter(BasePlatformAdapter):
self._team_clients: Dict[str, AsyncWebClient] = {} # team_id → WebClient
self._team_bot_user_ids: Dict[str, str] = {} # team_id → bot_user_id
self._channel_team: Dict[str, str] = {} # channel_id → team_id
# Dedup cache: prevents duplicate bot responses when Socket Mode
# reconnects redeliver events.
self._dedup = MessageDeduplicator()
# Dedup cache: event_ts → timestamp. Prevents duplicate bot
# responses when Socket Mode reconnects redeliver events.
self._seen_messages: Dict[str, float] = {}
self._SEEN_TTL = 300 # 5 minutes
self._SEEN_MAX = 2000 # prune threshold
# Track pending approval message_ts → resolved flag to prevent
# double-clicks on approval buttons.
self._approval_resolved: Dict[str, bool] = {}
@@ -151,7 +151,15 @@ class SlackAdapter(BasePlatformAdapter):
logger.warning("[Slack] Failed to read %s: %s", tokens_file, e)
try:
if not self._acquire_platform_lock('slack-app-token', app_token, 'Slack app token'):
# Acquire scoped lock to prevent duplicate app token usage
from gateway.status import acquire_scoped_lock
self._token_lock_identity = app_token
acquired, existing = acquire_scoped_lock('slack-app-token', app_token, metadata={'platform': 'slack'})
if not acquired:
owner_pid = existing.get('pid') if isinstance(existing, dict) else None
message = f'Slack app token already in use' + (f' (PID {owner_pid})' if owner_pid else '') + '. Stop the other gateway first.'
logger.error('[%s] %s', self.name, message)
self._set_fatal_error('slack_token_lock', message, retryable=False)
return False
# First token is the primary — used for AsyncApp / Socket Mode
@@ -238,7 +246,14 @@ class SlackAdapter(BasePlatformAdapter):
logger.warning("[Slack] Error while closing Socket Mode handler: %s", e, exc_info=True)
self._running = False
self._release_platform_lock()
# Release the token lock (use stored identity, not re-read env)
try:
from gateway.status import release_scoped_lock
if getattr(self, '_token_lock_identity', None):
release_scoped_lock('slack-app-token', self._token_lock_identity)
self._token_lock_identity = None
except Exception:
pass
logger.info("[Slack] Disconnected")
@@ -641,19 +656,8 @@ class SlackAdapter(BasePlatformAdapter):
try:
import httpx
async def _ssrf_redirect_guard(response):
"""Re-check redirect targets so public URLs cannot bounce into private IPs."""
if response.is_redirect and response.next_request:
redirect_url = str(response.next_request.url)
if not is_safe_url(redirect_url):
raise ValueError("Blocked redirect to private/internal address")
# Download the image first
async with httpx.AsyncClient(
timeout=30.0,
follow_redirects=True,
event_hooks={"response": [_ssrf_redirect_guard]},
) as client:
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(image_url)
response.raise_for_status()
@@ -670,7 +674,7 @@ class SlackAdapter(BasePlatformAdapter):
except Exception as e: # pragma: no cover - defensive logging
logger.warning(
"[Slack] Failed to upload image from URL %s, falling back to text: %s",
safe_url_for_log(image_url),
image_url,
e,
exc_info=True,
)
@@ -937,8 +941,17 @@ class SlackAdapter(BasePlatformAdapter):
"""Handle an incoming Slack message event."""
# Dedup: Slack Socket Mode can redeliver events after reconnects (#4777)
event_ts = event.get("ts", "")
if event_ts and self._dedup.is_duplicate(event_ts):
return
if event_ts:
now = time.time()
if event_ts in self._seen_messages:
return
self._seen_messages[event_ts] = now
if len(self._seen_messages) > self._SEEN_MAX:
cutoff = now - self._SEEN_TTL
self._seen_messages = {
k: v for k, v in self._seen_messages.items()
if v > cutoff
}
# Bot message filtering (SLACK_ALLOW_BOTS / config allow_bots):
# "none" — ignore all bot messages (default, backward-compatible)
@@ -1583,18 +1596,6 @@ class SlackAdapter(BasePlatformAdapter):
)
response.raise_for_status()
# Slack may return an HTML sign-in/redirect page
# instead of actual media bytes (e.g. expired token,
# restricted file access). Detect this early so we
# don't cache bogus data and confuse downstream tools.
ct = response.headers.get("content-type", "")
if "text/html" in ct:
raise ValueError(
"Slack returned HTML instead of media "
f"(content-type: {ct}); "
"check bot token scopes and file permissions"
)
if audio:
from gateway.platforms.base import cache_audio_from_bytes
return cache_audio_from_bytes(response.content, ext)
+32 -129
View File
@@ -10,9 +10,6 @@ Shares credentials with the optional telephony skill — same env vars:
Gateway-specific env vars:
- SMS_WEBHOOK_PORT (default 8080)
- SMS_WEBHOOK_HOST (default 0.0.0.0)
- SMS_WEBHOOK_URL (public URL for Twilio signature validation required)
- SMS_INSECURE_NO_SIGNATURE (true to disable signature validation dev only)
- SMS_ALLOWED_USERS (comma-separated E.164 phone numbers)
- SMS_ALLOW_ALL_USERS (true/false)
- SMS_HOME_CHANNEL (phone number for cron delivery)
@@ -20,10 +17,9 @@ Gateway-specific env vars:
import asyncio
import base64
import hashlib
import hmac
import logging
import os
import re
import urllib.parse
from typing import Any, Dict, Optional
@@ -34,14 +30,24 @@ from gateway.platforms.base import (
MessageType,
SendResult,
)
from gateway.platforms.helpers import redact_phone, strip_markdown
logger = logging.getLogger(__name__)
TWILIO_API_BASE = "https://api.twilio.com/2010-04-01/Accounts"
MAX_SMS_LENGTH = 1600 # ~10 SMS segments
DEFAULT_WEBHOOK_PORT = 8080
DEFAULT_WEBHOOK_HOST = "0.0.0.0"
# E.164 phone number pattern for redaction
_PHONE_RE = re.compile(r"\+[1-9]\d{6,14}")
def _redact_phone(phone: str) -> str:
"""Redact a phone number for logging: +15551234567 -> +1555***4567."""
if not phone:
return "<none>"
if len(phone) <= 8:
return phone[:2] + "***" + phone[-2:] if len(phone) > 4 else "****"
return phone[:5] + "***" + phone[-4:]
def check_sms_requirements() -> bool:
@@ -71,8 +77,6 @@ class SmsAdapter(BasePlatformAdapter):
self._webhook_port: int = int(
os.getenv("SMS_WEBHOOK_PORT", str(DEFAULT_WEBHOOK_PORT))
)
self._webhook_host: str = os.getenv("SMS_WEBHOOK_HOST", DEFAULT_WEBHOOK_HOST)
self._webhook_url: str = os.getenv("SMS_WEBHOOK_URL", "").strip()
self._runner = None
self._http_session: Optional["aiohttp.ClientSession"] = None
@@ -94,33 +98,13 @@ class SmsAdapter(BasePlatformAdapter):
logger.error("[sms] TWILIO_PHONE_NUMBER not set — cannot send replies")
return False
insecure_no_sig = os.getenv("SMS_INSECURE_NO_SIGNATURE", "").lower() == "true"
if not self._webhook_url and not insecure_no_sig:
logger.error(
"[sms] Refusing to start: SMS_WEBHOOK_URL is required for Twilio "
"signature validation. Set it to the public URL configured in your "
"Twilio console (e.g. https://example.com/webhooks/twilio). "
"For local development without validation, set "
"SMS_INSECURE_NO_SIGNATURE=true (NOT recommended for production).",
)
return False
if insecure_no_sig and not self._webhook_url:
logger.warning(
"[sms] SMS_INSECURE_NO_SIGNATURE=true — Twilio signature validation "
"is DISABLED. Any client that can reach port %d can inject messages. "
"Do NOT use this in production.",
self._webhook_port,
)
app = web.Application()
app.router.add_post("/webhooks/twilio", self._handle_webhook)
app.router.add_get("/health", lambda _: web.Response(text="ok"))
self._runner = web.AppRunner(app)
await self._runner.setup()
site = web.TCPSite(self._runner, self._webhook_host, self._webhook_port)
site = web.TCPSite(self._runner, "0.0.0.0", self._webhook_port)
await site.start()
self._http_session = aiohttp.ClientSession(
timeout=aiohttp.ClientTimeout(total=30),
@@ -128,10 +112,9 @@ class SmsAdapter(BasePlatformAdapter):
self._running = True
logger.info(
"[sms] Twilio webhook server listening on %s:%d, from: %s",
self._webhook_host,
"[sms] Twilio webhook server listening on port %d, from: %s",
self._webhook_port,
redact_phone(self._from_number),
_redact_phone(self._from_number),
)
return True
@@ -180,7 +163,7 @@ class SmsAdapter(BasePlatformAdapter):
error_msg = body.get("message", str(body))
logger.error(
"[sms] send failed to %s: %s %s",
redact_phone(chat_id),
_redact_phone(chat_id),
resp.status,
error_msg,
)
@@ -191,7 +174,7 @@ class SmsAdapter(BasePlatformAdapter):
msg_sid = body.get("sid", "")
last_result = SendResult(success=True, message_id=msg_sid)
except Exception as e:
logger.error("[sms] send error to %s: %s", redact_phone(chat_id), e)
logger.error("[sms] send error to %s: %s", _redact_phone(chat_id), e)
return SendResult(success=False, error=str(e))
finally:
# Close session only if we created a fallback (no persistent session)
@@ -209,75 +192,16 @@ class SmsAdapter(BasePlatformAdapter):
def format_message(self, content: str) -> str:
"""Strip markdown — SMS renders it as literal characters."""
return strip_markdown(content)
# ------------------------------------------------------------------
# Twilio signature validation
# ------------------------------------------------------------------
def _validate_twilio_signature(
self, url: str, post_params: dict, signature: str,
) -> bool:
"""Validate ``X-Twilio-Signature`` header (HMAC-SHA1, base64).
Tries both with and without the default port for the URL scheme,
since Twilio may sign with either variant.
Algorithm: https://www.twilio.com/docs/usage/security#validating-requests
"""
if self._check_signature(url, post_params, signature):
return True
variant = self._port_variant_url(url)
if variant and self._check_signature(variant, post_params, signature):
return True
return False
def _check_signature(
self, url: str, post_params: dict, signature: str,
) -> bool:
"""Compute and compare a single Twilio signature."""
data_to_sign = url
for key in sorted(post_params.keys()):
data_to_sign += key + post_params[key]
mac = hmac.new(
self._auth_token.encode("utf-8"),
data_to_sign.encode("utf-8"),
hashlib.sha1,
)
computed = base64.b64encode(mac.digest()).decode("utf-8")
return hmac.compare_digest(computed, signature)
@staticmethod
def _port_variant_url(url: str) -> str | None:
"""Return the URL with the default port toggled, or None.
Only toggles default ports (443 for https, 80 for http).
Non-standard ports are never modified.
"""
parsed = urllib.parse.urlparse(url)
default_ports = {"https": 443, "http": 80}
default_port = default_ports.get(parsed.scheme)
if default_port is None:
return None
if parsed.port == default_port:
# Has explicit default port → strip it
return urllib.parse.urlunparse(
(parsed.scheme, parsed.hostname, parsed.path,
parsed.params, parsed.query, parsed.fragment)
)
elif parsed.port is None:
# No port → add default
netloc = f"{parsed.hostname}:{default_port}"
return urllib.parse.urlunparse(
(parsed.scheme, netloc, parsed.path,
parsed.params, parsed.query, parsed.fragment)
)
# Non-standard port — no variant
return None
content = re.sub(r"\*\*(.+?)\*\*", r"\1", content, flags=re.DOTALL)
content = re.sub(r"\*(.+?)\*", r"\1", content, flags=re.DOTALL)
content = re.sub(r"__(.+?)__", r"\1", content, flags=re.DOTALL)
content = re.sub(r"_(.+?)_", r"\1", content, flags=re.DOTALL)
content = re.sub(r"```[a-z]*\n?", "", content)
content = re.sub(r"`(.+?)`", r"\1", content)
content = re.sub(r"^#{1,6}\s+", "", content, flags=re.MULTILINE)
content = re.sub(r"\[([^\]]+)\]\([^\)]+\)", r"\1", content)
content = re.sub(r"\n{3,}", "\n\n", content)
return content.strip()
# ------------------------------------------------------------------
# Twilio webhook handler
@@ -289,7 +213,7 @@ class SmsAdapter(BasePlatformAdapter):
try:
raw = await request.read()
# Twilio sends form-encoded data, not JSON
form = urllib.parse.parse_qs(raw.decode("utf-8"), keep_blank_values=True)
form = urllib.parse.parse_qs(raw.decode("utf-8"))
except Exception as e:
logger.error("[sms] webhook parse error: %s", e)
return web.Response(
@@ -298,27 +222,6 @@ class SmsAdapter(BasePlatformAdapter):
status=400,
)
# Validate Twilio request signature when SMS_WEBHOOK_URL is configured
if self._webhook_url:
twilio_sig = request.headers.get("X-Twilio-Signature", "")
if not twilio_sig:
logger.warning("[sms] Rejected: missing X-Twilio-Signature header")
return web.Response(
text='<?xml version="1.0" encoding="UTF-8"?><Response></Response>',
content_type="application/xml",
status=403,
)
flat_params = {k: v[0] for k, v in form.items() if v}
if not self._validate_twilio_signature(
self._webhook_url, flat_params, twilio_sig
):
logger.warning("[sms] Rejected: invalid Twilio signature")
return web.Response(
text='<?xml version="1.0" encoding="UTF-8"?><Response></Response>',
content_type="application/xml",
status=403,
)
# Extract fields (parse_qs returns lists)
from_number = (form.get("From", [""]))[0].strip()
to_number = (form.get("To", [""]))[0].strip()
@@ -333,7 +236,7 @@ class SmsAdapter(BasePlatformAdapter):
# Ignore messages from our own number (echo prevention)
if from_number == self._from_number:
logger.debug("[sms] ignoring echo from own number %s", redact_phone(from_number))
logger.debug("[sms] ignoring echo from own number %s", _redact_phone(from_number))
return web.Response(
text='<?xml version="1.0" encoding="UTF-8"?><Response></Response>',
content_type="application/xml",
@@ -341,8 +244,8 @@ class SmsAdapter(BasePlatformAdapter):
logger.info(
"[sms] inbound from %s -> %s: %s",
redact_phone(from_number),
redact_phone(to_number),
_redact_phone(from_number),
_redact_phone(to_number),
text[:80],
)
+46 -266
View File
@@ -12,7 +12,7 @@ import json
import logging
import os
import re
from typing import Dict, List, Optional, Any, Tuple
from typing import Dict, List, Optional, Any
logger = logging.getLogger(__name__)
@@ -60,15 +60,11 @@ from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
ProcessingOutcome,
SendResult,
cache_image_from_bytes,
cache_audio_from_bytes,
cache_document_from_bytes,
resolve_proxy_url,
SUPPORTED_DOCUMENT_TYPES,
utf16_len,
_prefix_within_utf16_limit,
)
from gateway.platforms.telegram_network import (
TelegramFallbackTransport,
@@ -125,9 +121,6 @@ class TelegramAdapter(BasePlatformAdapter):
# Telegram message limits
MAX_MESSAGE_LENGTH = 4096
# Threshold for detecting Telegram client-side message splits.
# When a chunk is near this limit, a continuation is almost certain.
_SPLIT_THRESHOLD = 4000
MEDIA_GROUP_WAIT_SECONDS = 0.8
def __init__(self, config: PlatformConfig):
@@ -147,9 +140,9 @@ class TelegramAdapter(BasePlatformAdapter):
# Buffer rapid text messages so Telegram client-side splits of long
# messages are aggregated into a single MessageEvent.
self._text_batch_delay_seconds = float(os.getenv("HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS", "0.6"))
self._text_batch_split_delay_seconds = float(os.getenv("HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0"))
self._pending_text_batches: Dict[str, MessageEvent] = {}
self._pending_text_batch_tasks: Dict[str, asyncio.Task] = {}
self._token_lock_identity: Optional[str] = None
self._polling_error_task: Optional[asyncio.Task] = None
self._polling_conflict_count: int = 0
self._polling_network_error_count: int = 0
@@ -302,11 +295,9 @@ class TelegramAdapter(BasePlatformAdapter):
# Exhausted retries — fatal
message = (
"Another process is already polling this Telegram bot token "
"(possibly OpenClaw or another Hermes instance). "
"Another Telegram bot poller is already using this token. "
"Hermes stopped Telegram polling after %d retries. "
"Only one poller can run per token — stop the other process "
"and restart with 'hermes start'."
"Make sure only one gateway instance is running for this bot token."
% MAX_CONFLICT_RETRIES
)
logger.error("[%s] %s Original error: %s", self.name, message, error)
@@ -501,47 +492,27 @@ class TelegramAdapter(BasePlatformAdapter):
return False
try:
if not self._acquire_platform_lock('telegram-bot-token', self.config.token, 'Telegram bot token'):
from gateway.status import acquire_scoped_lock
self._token_lock_identity = self.config.token
acquired, existing = acquire_scoped_lock(
"telegram-bot-token",
self._token_lock_identity,
metadata={"platform": self.platform.value},
)
if not acquired:
owner_pid = existing.get("pid") if isinstance(existing, dict) else None
message = (
"Another local Hermes gateway is already using this Telegram bot token"
+ (f" (PID {owner_pid})." if owner_pid else ".")
+ " Stop the other gateway before starting a second Telegram poller."
)
logger.error("[%s] %s", self.name, message)
self._set_fatal_error("telegram_token_lock", message, retryable=False)
return False
# Build the application
builder = Application.builder().token(self.config.token)
custom_base_url = self.config.extra.get("base_url")
if custom_base_url:
builder = builder.base_url(custom_base_url)
builder = builder.base_file_url(
self.config.extra.get("base_file_url", custom_base_url)
)
logger.info(
"[%s] Using custom Telegram base_url: %s",
self.name, custom_base_url,
)
# PTB defaults (pool_timeout=1s) are too aggressive on flaky networks and
# can trigger "Pool timeout: All connections in the connection pool are occupied"
# during reconnect/bootstrap. Use safer defaults and allow env overrides.
def _env_int(name: str, default: int) -> int:
try:
return int(os.getenv(name, str(default)))
except (TypeError, ValueError):
return default
def _env_float(name: str, default: float) -> float:
try:
return float(os.getenv(name, str(default)))
except (TypeError, ValueError):
return default
request_kwargs = {
"connection_pool_size": _env_int("HERMES_TELEGRAM_HTTP_POOL_SIZE", 512),
"pool_timeout": _env_float("HERMES_TELEGRAM_HTTP_POOL_TIMEOUT", 8.0),
"connect_timeout": _env_float("HERMES_TELEGRAM_HTTP_CONNECT_TIMEOUT", 10.0),
"read_timeout": _env_float("HERMES_TELEGRAM_HTTP_READ_TIMEOUT", 20.0),
"write_timeout": _env_float("HERMES_TELEGRAM_HTTP_WRITE_TIMEOUT", 20.0),
}
proxy_url = resolve_proxy_url()
disable_fallback = (os.getenv("HERMES_TELEGRAM_DISABLE_FALLBACK_IPS", "").strip().lower() in ("1", "true", "yes", "on"))
fallback_ips = self._fallback_ips()
if not fallback_ips:
fallback_ips = await discover_fallback_ips()
@@ -550,34 +521,16 @@ class TelegramAdapter(BasePlatformAdapter):
self.name,
", ".join(fallback_ips),
)
if fallback_ips and not proxy_url and not disable_fallback:
if fallback_ips:
logger.info(
"[%s] Telegram fallback IPs active: %s",
self.name,
", ".join(fallback_ips),
)
# Keep request/update pools separate to reduce contention during
# polling reconnect + bot API bootstrap/delete_webhook calls.
request = HTTPXRequest(
**request_kwargs,
httpx_kwargs={"transport": TelegramFallbackTransport(fallback_ips)},
)
get_updates_request = HTTPXRequest(
**request_kwargs,
httpx_kwargs={"transport": TelegramFallbackTransport(fallback_ips)},
)
elif proxy_url:
logger.info("[%s] Proxy detected; passing explicitly to HTTPXRequest: %s", self.name, proxy_url)
request = HTTPXRequest(**request_kwargs, proxy=proxy_url)
get_updates_request = HTTPXRequest(**request_kwargs, proxy=proxy_url)
else:
if disable_fallback:
logger.info("[%s] Telegram fallback-IP transport disabled via env", self.name)
request = HTTPXRequest(**request_kwargs)
get_updates_request = HTTPXRequest(**request_kwargs)
builder = builder.request(request).get_updates_request(get_updates_request)
transport = TelegramFallbackTransport(fallback_ips)
request = HTTPXRequest(httpx_kwargs={"transport": transport})
get_updates_request = HTTPXRequest(httpx_kwargs={"transport": transport})
builder = builder.request(request).get_updates_request(get_updates_request)
self._app = builder.build()
self._bot = self._app.bot
@@ -724,7 +677,12 @@ class TelegramAdapter(BasePlatformAdapter):
return True
except Exception as e:
self._release_platform_lock()
if self._token_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("telegram-bot-token", self._token_lock_identity)
except Exception:
pass
message = f"Telegram startup failed: {e}"
self._set_fatal_error("telegram_connect_error", message, retryable=True)
logger.error("[%s] Failed to connect to Telegram: %s", self.name, e, exc_info=True)
@@ -750,7 +708,12 @@ class TelegramAdapter(BasePlatformAdapter):
await self._app.shutdown()
except Exception as e:
logger.warning("[%s] Error during Telegram disconnect: %s", self.name, e, exc_info=True)
self._release_platform_lock()
if self._token_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("telegram-bot-token", self._token_lock_identity)
except Exception as e:
logger.warning("[%s] Error releasing Telegram token lock: %s", self.name, e, exc_info=True)
for task in self._pending_photo_batch_tasks.values():
if task and not task.done():
@@ -761,6 +724,7 @@ class TelegramAdapter(BasePlatformAdapter):
self._mark_disconnected()
self._app = None
self._bot = None
self._token_lock_identity = None
logger.info("[%s] Disconnected from Telegram", self.name)
def _should_thread_reply(self, reply_to: Optional[str], chunk_index: int) -> bool:
@@ -801,9 +765,7 @@ class TelegramAdapter(BasePlatformAdapter):
try:
# Format and split message if needed
formatted = self.format_message(content)
chunks = self.truncate_message(
formatted, self.MAX_MESSAGE_LENGTH, len_fn=utf16_len,
)
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
if len(chunks) > 1:
# truncate_message appends a raw " (1/2)" suffix. Escape the
# MarkdownV2-special parentheses so Telegram doesn't reject the
@@ -974,9 +936,7 @@ class TelegramAdapter(BasePlatformAdapter):
# streaming). Truncate and succeed so the stream consumer can
# split the overflow into a new message instead of dying.
if "message_too_long" in err_str or "too long" in err_str:
truncated = _prefix_within_utf16_limit(
content, self.MAX_MESSAGE_LENGTH - 20
) + ""
truncated = content[: self.MAX_MESSAGE_LENGTH - 20] + ""
try:
await self._bot.edit_message_text(
chat_id=int(chat_id),
@@ -1021,92 +981,6 @@ class TelegramAdapter(BasePlatformAdapter):
)
return SendResult(success=False, error=str(e))
# ------------------------------------------------------------------
# Inline keyboard support (extract from agent output → attach to msg)
# ------------------------------------------------------------------
def extract_inline_keyboard(
self, text: str,
) -> Tuple[Optional[List[List[Tuple[str, str]]]], str]:
"""Extract a ``[KEYBOARD: ...]`` block from agent output.
Syntax produced by the agent::
[KEYBOARD:
Confirm=ck:confirm | Cancel=ck:cancel
🔄 Later=ck:later
]
Each line becomes one row; ``|`` separates buttons within a row.
Only ``ck:``-prefixed callback data is accepted (custom-keyboard
namespace). Callbacks exceeding Telegram's 64-byte limit are
silently skipped.
Returns:
``(buttons, cleaned_text)`` *buttons* is ``None`` when no
valid keyboard block was found.
"""
match = re.search(r'\[KEYBOARD:\s*\n(.*?)\]', text, re.DOTALL)
if not match:
return None, text
raw = match.group(1).strip()
cleaned = (text[:match.start()] + text[match.end():]).strip()
buttons: List[List[Tuple[str, str]]] = []
for line in raw.splitlines():
line = line.strip()
if not line:
continue
row: List[Tuple[str, str]] = []
for item in line.split("|"):
item = item.strip()
if "=" not in item:
continue
label, callback = item.rsplit("=", 1)
label, callback = label.strip(), callback.strip()
if not label or not callback.startswith("ck:"):
continue
if len(callback.encode("utf-8")) > 64:
logger.warning(
"[%s] Skipping keyboard callback over 64 bytes: %s",
self.name, callback,
)
continue
row.append((label, callback))
if row:
buttons.append(row)
return (buttons or None), cleaned
async def attach_inline_keyboard(
self,
chat_id: str,
message_id: str,
buttons: List[List[Tuple[str, str]]],
) -> SendResult:
"""Attach an inline keyboard to an existing Telegram message."""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
keyboard = InlineKeyboardMarkup([
[InlineKeyboardButton(label, callback_data=data)
for label, data in row]
for row in buttons
])
await self._bot.edit_message_reply_markup(
chat_id=int(chat_id),
message_id=int(message_id),
reply_markup=keyboard,
)
return SendResult(success=True, message_id=message_id)
except Exception as e:
logger.error(
"[%s] Failed to attach inline keyboard to msg %s: %s",
self.name, message_id, e, exc_info=True,
)
return SendResult(success=False, error=str(e))
async def send_update_prompt(
self, chat_id: str, prompt: str, default: str = "",
session_key: str = "",
@@ -1497,76 +1371,6 @@ class TelegramAdapter(BasePlatformAdapter):
# Catch-all (e.g. page counter button "mx:noop")
await query.answer()
async def _handle_custom_keyboard_callback(
self, query: Any, data: str,
) -> None:
"""Re-inject a ``ck:`` callback as a virtual user message.
When the user taps an inline keyboard button whose callback_data
starts with ``ck:``, we acknowledge the tap, edit the message to
show which option was selected (removing the keyboard), and feed
the raw callback string back into the normal message handler so
the agent can act on it.
"""
if not query or not query.message:
return
try:
await query.answer()
except Exception:
pass
# Find the label of the clicked button from the original keyboard
clicked_label = data
try:
markup = query.message.reply_markup
if markup and markup.inline_keyboard:
for row in markup.inline_keyboard:
for btn in row:
if btn.callback_data == data:
clicked_label = btn.text
break
except Exception:
pass
# Edit the message: keep original text, append selection, remove keyboard
try:
original_text = query.message.text or ""
user_display = getattr(query.from_user, "first_name", "User")
updated_text = f"{original_text}\n\n> {user_display} selected: {clicked_label}"
await query.edit_message_text(
text=updated_text,
reply_markup=None,
)
except Exception:
# Fallback: just remove the keyboard
try:
await query.edit_message_reply_markup(reply_markup=None)
except Exception:
pass
# Re-inject as a virtual user message.
# We use _build_message_event(query.message) to inherit the correct
# chat/thread context (the bot's reply lives in the same chat).
# message_id is intentionally kept as the bot's message so the
# agent's response threads off the keyboard message (better UX).
from_user = getattr(query, "from_user", None)
if not from_user:
logger.warning("[%s] ck: callback has no from_user, cannot re-inject", self.name)
return
try:
event = self._build_message_event(query.message, MessageType.TEXT)
event.text = data
event.source.user_id = str(from_user.id)
event.source.user_name = from_user.full_name
await self.handle_message(event)
except Exception as exc:
logger.error(
"[%s] Failed to re-inject keyboard callback %r: %s",
self.name, data, exc, exc_info=True,
)
async def _handle_callback_query(
self, update: "Update", context: "ContextTypes.DEFAULT_TYPE"
) -> None:
@@ -1583,11 +1387,6 @@ class TelegramAdapter(BasePlatformAdapter):
await self._handle_model_picker_callback(query, data, chat_id)
return
# --- Custom inline keyboard callbacks (ck:...) ---
if data.startswith("ck:"):
await self._handle_custom_keyboard_callback(query, data)
return
# --- Exec approval callbacks (ea:choice:id) ---
if data.startswith("ea:"):
parts = data.split(":", 2)
@@ -2361,15 +2160,12 @@ class TelegramAdapter(BasePlatformAdapter):
"""
key = self._text_batch_key(event)
existing = self._pending_text_batches.get(key)
chunk_len = len(event.text or "")
if existing is None:
event._last_chunk_len = chunk_len # type: ignore[attr-defined]
self._pending_text_batches[key] = event
else:
# Append text from the follow-up chunk
if event.text:
existing.text = f"{existing.text}\n{event.text}" if existing.text else event.text
existing._last_chunk_len = chunk_len # type: ignore[attr-defined]
# Merge any media that might be attached
if event.media_urls:
existing.media_urls.extend(event.media_urls)
@@ -2384,22 +2180,10 @@ class TelegramAdapter(BasePlatformAdapter):
)
async def _flush_text_batch(self, key: str) -> None:
"""Wait for the quiet period then dispatch the aggregated text.
Uses a longer delay when the latest chunk is near Telegram's 4096-char
split point, since a continuation chunk is almost certain.
"""
"""Wait for the quiet period then dispatch the aggregated text."""
current_task = asyncio.current_task()
try:
# Adaptive delay: if the latest chunk is near Telegram's 4096-char
# split point, a continuation is almost certain — wait longer.
pending = self._pending_text_batches.get(key)
last_len = getattr(pending, "_last_chunk_len", 0) if pending else 0
if last_len >= self._SPLIT_THRESHOLD:
delay = self._text_batch_split_delay_seconds
else:
delay = self._text_batch_delay_seconds
await asyncio.sleep(delay)
await asyncio.sleep(self._text_batch_delay_seconds)
event = self._pending_text_batches.pop(key, None)
if not event:
return
@@ -2929,7 +2713,7 @@ class TelegramAdapter(BasePlatformAdapter):
if chat_id and message_id:
await self._set_reaction(chat_id, message_id, "\U0001f440")
async def on_processing_complete(self, event: MessageEvent, outcome: ProcessingOutcome) -> None:
async def on_processing_complete(self, event: MessageEvent, success: bool) -> None:
"""Swap the in-progress reaction for a final success/failure reaction.
Unlike Discord (additive reactions), Telegram's set_message_reaction
@@ -2939,9 +2723,5 @@ class TelegramAdapter(BasePlatformAdapter):
return
chat_id = getattr(event.source, "chat_id", None)
message_id = getattr(event, "message_id", None)
if chat_id and message_id and outcome != ProcessingOutcome.CANCELLED:
await self._set_reaction(
chat_id,
message_id,
"\U0001f44d" if outcome == ProcessingOutcome.SUCCESS else "\U0001f44e",
)
if chat_id and message_id:
await self._set_reaction(chat_id, message_id, "\u2705" if success else "\u274c")
+1 -2
View File
@@ -110,8 +110,7 @@ class TelegramFallbackTransport(httpx.AsyncBaseTransport):
logger.warning("[Telegram] Fallback IP %s failed: %s", ip, exc)
continue
if last_error is None:
raise RuntimeError("All Telegram fallback IPs exhausted but no error was recorded")
assert last_error is not None
raise last_error
async def aclose(self) -> None:
+2 -13
View File
@@ -186,24 +186,13 @@ class WebhookAdapter(BasePlatformAdapter):
if deliver_type == "github_comment":
return await self._deliver_github_comment(content, delivery)
# Cross-platform delivery — any platform with a gateway adapter
# Cross-platform delivery (telegram, discord, etc.)
if self.gateway_runner and deliver_type in (
"telegram",
"discord",
"slack",
"signal",
"sms",
"whatsapp",
"matrix",
"mattermost",
"homeassistant",
"email",
"dingtalk",
"feishu",
"wecom",
"wecom_callback",
"weixin",
"bluebubbles",
):
return await self._deliver_cross_platform(
deliver_type, content, delivery
@@ -273,7 +262,7 @@ class WebhookAdapter(BasePlatformAdapter):
", ".join(self._dynamic_routes.keys()) or "(none)",
)
except Exception as e:
logger.error("[webhook] Failed to reload dynamic routes: %s", e)
logger.warning("[webhook] Failed to reload dynamic routes: %s", e)
async def _handle_webhook(self, request: "web.Request") -> "web.Response":
"""POST /webhooks/{route_name} — receive and process a webhook event."""
+26 -115
View File
@@ -59,7 +59,6 @@ except ImportError:
httpx = None # type: ignore[assignment]
from gateway.config import Platform, PlatformConfig
from gateway.platforms.helpers import MessageDeduplicator
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
@@ -93,6 +92,7 @@ REQUEST_TIMEOUT_SECONDS = 15.0
HEARTBEAT_INTERVAL_SECONDS = 30.0
RECONNECT_BACKOFF = [2, 5, 10, 30, 60]
DEDUP_WINDOW_SECONDS = 300
DEDUP_MAX_SIZE = 1000
IMAGE_MAX_BYTES = 10 * 1024 * 1024
@@ -143,9 +143,6 @@ class WeComAdapter(BasePlatformAdapter):
"""WeCom AI Bot adapter backed by a persistent WebSocket connection."""
MAX_MESSAGE_LENGTH = MAX_MESSAGE_LENGTH
# Threshold for detecting WeCom client-side message splits.
# When a chunk is near the 4000-char limit, a continuation is almost certain.
_SPLIT_THRESHOLD = 3900
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.WECOM)
@@ -172,16 +169,9 @@ class WeComAdapter(BasePlatformAdapter):
self._listen_task: Optional[asyncio.Task] = None
self._heartbeat_task: Optional[asyncio.Task] = None
self._pending_responses: Dict[str, asyncio.Future] = {}
self._dedup = MessageDeduplicator(max_size=DEDUP_MAX_SIZE)
self._seen_messages: Dict[str, float] = {}
self._reply_req_ids: Dict[str, str] = {}
# Text batching: merge rapid successive messages (Telegram-style).
# WeCom clients split long messages around 4000 chars.
self._text_batch_delay_seconds = float(os.getenv("HERMES_WECOM_TEXT_BATCH_DELAY_SECONDS", "0.6"))
self._text_batch_split_delay_seconds = float(os.getenv("HERMES_WECOM_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0"))
self._pending_text_batches: Dict[str, MessageEvent] = {}
self._pending_text_batch_tasks: Dict[str, asyncio.Task] = {}
# ------------------------------------------------------------------
# Connection lifecycle
# ------------------------------------------------------------------
@@ -250,7 +240,7 @@ class WeComAdapter(BasePlatformAdapter):
await self._http_client.aclose()
self._http_client = None
self._dedup.clear()
self._seen_messages.clear()
logger.info("[%s] Disconnected", self.name)
async def _cleanup_ws(self) -> None:
@@ -266,7 +256,7 @@ class WeComAdapter(BasePlatformAdapter):
async def _open_connection(self) -> None:
"""Open and authenticate a websocket connection."""
await self._cleanup_ws()
self._session = aiohttp.ClientSession(trust_env=True)
self._session = aiohttp.ClientSession()
self._ws = await self._session.ws_connect(
self._ws_url,
heartbeat=HEARTBEAT_INTERVAL_SECONDS * 2,
@@ -476,7 +466,7 @@ class WeComAdapter(BasePlatformAdapter):
return
msg_id = str(body.get("msgid") or self._payload_req_id(payload) or uuid.uuid4().hex)
if self._dedup.is_duplicate(msg_id):
if self._is_duplicate(msg_id):
logger.debug("[%s] Duplicate message %s ignored", self.name, msg_id)
return
self._remember_reply_req_id(msg_id, self._payload_req_id(payload))
@@ -529,82 +519,7 @@ class WeComAdapter(BasePlatformAdapter):
timestamp=datetime.now(tz=timezone.utc),
)
# Only batch plain text messages — commands, media, etc. dispatch
# immediately since they won't be split by the WeCom client.
if message_type == MessageType.TEXT and self._text_batch_delay_seconds > 0:
self._enqueue_text_event(event)
else:
await self.handle_message(event)
# ------------------------------------------------------------------
# Text message aggregation (handles WeCom client-side splits)
# ------------------------------------------------------------------
def _text_batch_key(self, event: MessageEvent) -> str:
"""Session-scoped key for text message batching."""
from gateway.session import build_session_key
return build_session_key(
event.source,
group_sessions_per_user=self.config.extra.get("group_sessions_per_user", True),
thread_sessions_per_user=self.config.extra.get("thread_sessions_per_user", False),
)
def _enqueue_text_event(self, event: MessageEvent) -> None:
"""Buffer a text event and reset the flush timer.
When WeCom splits a long user message at 4000 chars, the chunks
arrive within a few hundred milliseconds. This merges them into
a single event before dispatching.
"""
key = self._text_batch_key(event)
existing = self._pending_text_batches.get(key)
chunk_len = len(event.text or "")
if existing is None:
event._last_chunk_len = chunk_len # type: ignore[attr-defined]
self._pending_text_batches[key] = event
else:
if event.text:
existing.text = f"{existing.text}\n{event.text}" if existing.text else event.text
existing._last_chunk_len = chunk_len # type: ignore[attr-defined]
# Merge any media that might be attached
if event.media_urls:
existing.media_urls.extend(event.media_urls)
existing.media_types.extend(event.media_types)
# Cancel any pending flush and restart the timer
prior_task = self._pending_text_batch_tasks.get(key)
if prior_task and not prior_task.done():
prior_task.cancel()
self._pending_text_batch_tasks[key] = asyncio.create_task(
self._flush_text_batch(key)
)
async def _flush_text_batch(self, key: str) -> None:
"""Wait for the quiet period then dispatch the aggregated text.
Uses a longer delay when the latest chunk is near WeCom's 4000-char
split point, since a continuation chunk is almost certain.
"""
current_task = asyncio.current_task()
try:
pending = self._pending_text_batches.get(key)
last_len = getattr(pending, "_last_chunk_len", 0) if pending else 0
if last_len >= self._SPLIT_THRESHOLD:
delay = self._text_batch_split_delay_seconds
else:
delay = self._text_batch_delay_seconds
await asyncio.sleep(delay)
event = self._pending_text_batches.pop(key, None)
if not event:
return
logger.info(
"[WeCom] Flushing text batch %s (%d chars)",
key, len(event.text or ""),
)
await self.handle_message(event)
finally:
if self._pending_text_batch_tasks.get(key) is current_task:
self._pending_text_batch_tasks.pop(key, None)
await self.handle_message(event)
@staticmethod
def _extract_text(body: Dict[str, Any]) -> Tuple[str, Optional[str]]:
@@ -636,13 +551,6 @@ class WeComAdapter(BasePlatformAdapter):
if voice_text:
text_parts.append(voice_text)
# Extract appmsg title (filename) for WeCom AI Bot attachments
if msgtype == "appmsg":
appmsg = body.get("appmsg") if isinstance(body.get("appmsg"), dict) else {}
title = str(appmsg.get("title") or "").strip()
if title:
text_parts.append(title)
quote = body.get("quote") if isinstance(body.get("quote"), dict) else {}
quote_type = str(quote.get("msgtype") or "").lower()
if quote_type == "text":
@@ -675,13 +583,6 @@ class WeComAdapter(BasePlatformAdapter):
refs.append(("image", body["image"]))
if msgtype == "file" and isinstance(body.get("file"), dict):
refs.append(("file", body["file"]))
# Handle appmsg (WeCom AI Bot attachments with PDF/Word/Excel)
if msgtype == "appmsg" and isinstance(body.get("appmsg"), dict):
appmsg = body["appmsg"]
if isinstance(appmsg.get("file"), dict):
refs.append(("file", appmsg["file"]))
elif isinstance(appmsg.get("image"), dict):
refs.append(("image", appmsg["image"]))
quote = body.get("quote") if isinstance(body.get("quote"), dict) else {}
quote_type = str(quote.get("msgtype") or "").lower()
@@ -710,11 +611,7 @@ class WeComAdapter(BasePlatformAdapter):
if kind == "image":
ext = self._detect_image_ext(raw)
try:
return cache_image_from_bytes(raw, ext), self._mime_for_ext(ext, fallback="image/jpeg")
except ValueError as exc:
logger.warning("[%s] Rejected non-image bytes: %s", self.name, exc)
return None
return cache_image_from_bytes(raw, ext), self._mime_for_ext(ext, fallback="image/jpeg")
filename = str(media.get("filename") or media.get("name") or "wecom_file")
return cache_document_from_bytes(raw, filename), mimetypes.guess_type(filename)[0] or "application/octet-stream"
@@ -740,11 +637,7 @@ class WeComAdapter(BasePlatformAdapter):
content_type = str(headers.get("content-type") or "").split(";", 1)[0].strip() or "application/octet-stream"
if kind == "image":
ext = self._guess_extension(url, content_type, fallback=self._detect_image_ext(raw))
try:
return cache_image_from_bytes(raw, ext), content_type or self._mime_for_ext(ext, fallback="image/jpeg")
except ValueError as exc:
logger.warning("[%s] Rejected non-image bytes from %s: %s", self.name, url, exc)
return None
return cache_image_from_bytes(raw, ext), content_type or self._mime_for_ext(ext, fallback="image/jpeg")
filename = self._guess_filename(url, headers.get("content-disposition"), content_type)
return cache_document_from_bytes(raw, filename), content_type
@@ -839,6 +732,24 @@ class WeComAdapter(BasePlatformAdapter):
wildcard = self._groups.get("*")
return wildcard if isinstance(wildcard, dict) else {}
def _is_duplicate(self, msg_id: str) -> bool:
now = time.time()
if len(self._seen_messages) > DEDUP_MAX_SIZE:
cutoff = now - DEDUP_WINDOW_SECONDS
self._seen_messages = {
key: ts for key, ts in self._seen_messages.items() if ts > cutoff
}
if self._reply_req_ids:
self._reply_req_ids = {
key: value for key, value in self._reply_req_ids.items() if key in self._seen_messages
}
if msg_id in self._seen_messages:
return True
self._seen_messages[msg_id] = now
return False
def _remember_reply_req_id(self, message_id: str, req_id: str) -> None:
normalized_message_id = str(message_id or "").strip()
normalized_req_id = str(req_id or "").strip()
-387
View File
@@ -1,387 +0,0 @@
"""WeCom callback-mode adapter for self-built enterprise applications.
Unlike the bot/websocket adapter in ``wecom.py``, this handles the standard
WeCom callback flow: WeCom POSTs encrypted XML to an HTTP endpoint, the
adapter decrypts it, queues the message for the agent, and immediately
acknowledges. The agent's reply is delivered later via the proactive
``message/send`` API using an access-token.
Supports multiple self-built apps under one gateway instance, scoped by
``corp_id:user_id`` to avoid cross-corp collisions.
"""
from __future__ import annotations
import asyncio
import logging
import socket as _socket
import time
from typing import Any, Dict, List, Optional
from xml.etree import ElementTree as ET
try:
from aiohttp import web
AIOHTTP_AVAILABLE = True
except ImportError:
web = None # type: ignore[assignment]
AIOHTTP_AVAILABLE = False
try:
import httpx
HTTPX_AVAILABLE = True
except ImportError:
httpx = None # type: ignore[assignment]
HTTPX_AVAILABLE = False
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import BasePlatformAdapter, MessageEvent, MessageType, SendResult
from gateway.platforms.wecom_crypto import WXBizMsgCrypt, WeComCryptoError
logger = logging.getLogger(__name__)
DEFAULT_HOST = "0.0.0.0"
DEFAULT_PORT = 8645
DEFAULT_PATH = "/wecom/callback"
ACCESS_TOKEN_TTL_SECONDS = 7200
MESSAGE_DEDUP_TTL_SECONDS = 300
def check_wecom_callback_requirements() -> bool:
return AIOHTTP_AVAILABLE and HTTPX_AVAILABLE
class WecomCallbackAdapter(BasePlatformAdapter):
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.WECOM_CALLBACK)
extra = config.extra or {}
self._host = str(extra.get("host") or DEFAULT_HOST)
self._port = int(extra.get("port") or DEFAULT_PORT)
self._path = str(extra.get("path") or DEFAULT_PATH)
self._apps: List[Dict[str, Any]] = self._normalize_apps(extra)
self._runner: Optional[web.AppRunner] = None
self._site: Optional[web.TCPSite] = None
self._app: Optional[web.Application] = None
self._http_client: Optional[httpx.AsyncClient] = None
self._message_queue: asyncio.Queue[MessageEvent] = asyncio.Queue()
self._poll_task: Optional[asyncio.Task] = None
self._seen_messages: Dict[str, float] = {}
self._user_app_map: Dict[str, str] = {}
self._access_tokens: Dict[str, Dict[str, Any]] = {}
# ------------------------------------------------------------------
# App normalisation
# ------------------------------------------------------------------
@staticmethod
def _user_app_key(corp_id: str, user_id: str) -> str:
return f"{corp_id}:{user_id}" if corp_id else user_id
@staticmethod
def _normalize_apps(extra: Dict[str, Any]) -> List[Dict[str, Any]]:
apps = extra.get("apps")
if isinstance(apps, list) and apps:
return [dict(app) for app in apps if isinstance(app, dict)]
if extra.get("corp_id"):
return [
{
"name": extra.get("name") or "default",
"corp_id": extra.get("corp_id", ""),
"corp_secret": extra.get("corp_secret", ""),
"agent_id": str(extra.get("agent_id", "")),
"token": extra.get("token", ""),
"encoding_aes_key": extra.get("encoding_aes_key", ""),
}
]
return []
# ------------------------------------------------------------------
# Lifecycle
# ------------------------------------------------------------------
async def connect(self) -> bool:
if not self._apps:
logger.warning("[WecomCallback] No callback apps configured")
return False
if not check_wecom_callback_requirements():
logger.warning("[WecomCallback] aiohttp/httpx not installed")
return False
# Quick port-in-use check.
try:
with _socket.socket(_socket.AF_INET, _socket.SOCK_STREAM) as sock:
sock.settimeout(1)
sock.connect(("127.0.0.1", self._port))
logger.error("[WecomCallback] Port %d already in use", self._port)
return False
except (ConnectionRefusedError, OSError):
pass
try:
self._http_client = httpx.AsyncClient(timeout=20.0)
self._app = web.Application()
self._app.router.add_get("/health", self._handle_health)
self._app.router.add_get(self._path, self._handle_verify)
self._app.router.add_post(self._path, self._handle_callback)
self._runner = web.AppRunner(self._app)
await self._runner.setup()
self._site = web.TCPSite(self._runner, self._host, self._port)
await self._site.start()
self._poll_task = asyncio.create_task(self._poll_loop())
self._mark_connected()
logger.info(
"[WecomCallback] HTTP server listening on %s:%s%s",
self._host, self._port, self._path,
)
for app in self._apps:
try:
await self._refresh_access_token(app)
except Exception as exc:
logger.warning(
"[WecomCallback] Initial token refresh failed for app '%s': %s",
app.get("name", "default"), exc,
)
return True
except Exception:
await self._cleanup()
logger.exception("[WecomCallback] Failed to start")
return False
async def disconnect(self) -> None:
self._running = False
if self._poll_task:
self._poll_task.cancel()
try:
await self._poll_task
except asyncio.CancelledError:
pass
self._poll_task = None
await self._cleanup()
self._mark_disconnected()
logger.info("[WecomCallback] Disconnected")
async def _cleanup(self) -> None:
self._site = None
if self._runner:
await self._runner.cleanup()
self._runner = None
self._app = None
if self._http_client:
await self._http_client.aclose()
self._http_client = None
# ------------------------------------------------------------------
# Outbound: proactive send via access-token API
# ------------------------------------------------------------------
async def send(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
app = self._resolve_app_for_chat(chat_id)
touser = chat_id.split(":", 1)[1] if ":" in chat_id else chat_id
try:
token = await self._get_access_token(app)
payload = {
"touser": touser,
"msgtype": "text",
"agentid": int(str(app.get("agent_id") or 0)),
"text": {"content": content[:2048]},
"safe": 0,
}
resp = await self._http_client.post(
f"https://qyapi.weixin.qq.com/cgi-bin/message/send?access_token={token}",
json=payload,
)
data = resp.json()
if data.get("errcode") != 0:
return SendResult(success=False, error=str(data))
return SendResult(
success=True,
message_id=str(data.get("msgid", "")),
raw_response=data,
)
except Exception as exc:
return SendResult(success=False, error=str(exc))
def _resolve_app_for_chat(self, chat_id: str) -> Dict[str, Any]:
"""Pick the app associated with *chat_id*, falling back sensibly."""
app_name = self._user_app_map.get(chat_id)
if not app_name and ":" not in chat_id:
# Legacy bare user_id — try to find a unique match.
matching = [k for k in self._user_app_map if k.endswith(f":{chat_id}")]
if len(matching) == 1:
app_name = self._user_app_map.get(matching[0])
app = self._get_app_by_name(app_name) if app_name else None
return app or self._apps[0]
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
return {"name": chat_id, "type": "dm"}
# ------------------------------------------------------------------
# Inbound: HTTP callback handlers
# ------------------------------------------------------------------
async def _handle_health(self, request: web.Request) -> web.Response:
return web.json_response({"status": "ok", "platform": "wecom_callback"})
async def _handle_verify(self, request: web.Request) -> web.Response:
"""GET endpoint — WeCom URL verification handshake."""
msg_signature = request.query.get("msg_signature", "")
timestamp = request.query.get("timestamp", "")
nonce = request.query.get("nonce", "")
echostr = request.query.get("echostr", "")
for app in self._apps:
try:
crypt = self._crypt_for_app(app)
plain = crypt.verify_url(msg_signature, timestamp, nonce, echostr)
return web.Response(text=plain, content_type="text/plain")
except Exception:
continue
return web.Response(status=403, text="signature verification failed")
async def _handle_callback(self, request: web.Request) -> web.Response:
"""POST endpoint — receive an encrypted message callback."""
msg_signature = request.query.get("msg_signature", "")
timestamp = request.query.get("timestamp", "")
nonce = request.query.get("nonce", "")
body = await request.text()
for app in self._apps:
try:
decrypted = self._decrypt_request(
app, body, msg_signature, timestamp, nonce,
)
event = self._build_event(app, decrypted)
if event is not None:
# Record which app this user belongs to.
if event.source and event.source.user_id:
map_key = self._user_app_key(
str(app.get("corp_id") or ""), event.source.user_id,
)
self._user_app_map[map_key] = app["name"]
await self._message_queue.put(event)
# Immediately acknowledge — the agent's reply will arrive
# later via the proactive message/send API.
return web.Response(text="success", content_type="text/plain")
except WeComCryptoError:
continue
except Exception:
logger.exception("[WecomCallback] Error handling message")
break
return web.Response(status=400, text="invalid callback payload")
async def _poll_loop(self) -> None:
"""Drain the message queue and dispatch to the gateway runner."""
while True:
event = await self._message_queue.get()
try:
task = asyncio.create_task(self.handle_message(event))
self._background_tasks.add(task)
task.add_done_callback(self._background_tasks.discard)
except Exception:
logger.exception("[WecomCallback] Failed to enqueue event")
# ------------------------------------------------------------------
# XML / crypto helpers
# ------------------------------------------------------------------
def _decrypt_request(
self, app: Dict[str, Any], body: str,
msg_signature: str, timestamp: str, nonce: str,
) -> str:
root = ET.fromstring(body)
encrypt = root.findtext("Encrypt", default="")
crypt = self._crypt_for_app(app)
return crypt.decrypt(msg_signature, timestamp, nonce, encrypt).decode("utf-8")
def _build_event(self, app: Dict[str, Any], xml_text: str) -> Optional[MessageEvent]:
root = ET.fromstring(xml_text)
msg_type = (root.findtext("MsgType") or "").lower()
# Silently acknowledge lifecycle events.
if msg_type == "event":
event_name = (root.findtext("Event") or "").lower()
if event_name in {"enter_agent", "subscribe"}:
return None
if msg_type not in {"text", "event"}:
return None
user_id = root.findtext("FromUserName", default="")
corp_id = root.findtext("ToUserName", default=app.get("corp_id", ""))
scoped_chat_id = self._user_app_key(corp_id, user_id)
content = root.findtext("Content", default="").strip()
if not content and msg_type == "event":
content = "/start"
msg_id = (
root.findtext("MsgId")
or f"{user_id}:{root.findtext('CreateTime', default='0')}"
)
source = self.build_source(
chat_id=scoped_chat_id,
chat_name=user_id,
chat_type="dm",
user_id=user_id,
user_name=user_id,
)
return MessageEvent(
text=content,
message_type=MessageType.TEXT,
source=source,
raw_message=xml_text,
message_id=msg_id,
)
def _crypt_for_app(self, app: Dict[str, Any]) -> WXBizMsgCrypt:
return WXBizMsgCrypt(
token=str(app.get("token") or ""),
encoding_aes_key=str(app.get("encoding_aes_key") or ""),
receive_id=str(app.get("corp_id") or ""),
)
def _get_app_by_name(self, name: Optional[str]) -> Optional[Dict[str, Any]]:
if not name:
return None
for app in self._apps:
if app.get("name") == name:
return app
return None
# ------------------------------------------------------------------
# Access-token management
# ------------------------------------------------------------------
async def _get_access_token(self, app: Dict[str, Any]) -> str:
cached = self._access_tokens.get(app["name"])
now = time.time()
if cached and cached.get("expires_at", 0) > now + 60:
return cached["token"]
return await self._refresh_access_token(app)
async def _refresh_access_token(self, app: Dict[str, Any]) -> str:
resp = await self._http_client.get(
"https://qyapi.weixin.qq.com/cgi-bin/gettoken",
params={
"corpid": app.get("corp_id"),
"corpsecret": app.get("corp_secret"),
},
)
data = resp.json()
if data.get("errcode") != 0:
raise RuntimeError(f"WeCom token refresh failed: {data}")
token = data["access_token"]
expires_in = int(data.get("expires_in", ACCESS_TOKEN_TTL_SECONDS))
self._access_tokens[app["name"]] = {
"token": token,
"expires_at": time.time() + expires_in,
}
logger.info(
"[WecomCallback] Token refreshed for app '%s' (corp=%s), expires in %ss",
app.get("name", "default"),
app.get("corp_id", ""),
expires_in,
)
return token
-142
View File
@@ -1,142 +0,0 @@
"""WeCom BizMsgCrypt-compatible AES-CBC encryption for callback mode.
Implements the same wire format as Tencent's official ``WXBizMsgCrypt``
SDK so that WeCom can verify, encrypt, and decrypt callback payloads.
"""
from __future__ import annotations
import base64
import hashlib
import os
import secrets
import socket
import struct
from typing import Optional
from xml.etree import ElementTree as ET
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
class WeComCryptoError(Exception):
pass
class SignatureError(WeComCryptoError):
pass
class DecryptError(WeComCryptoError):
pass
class EncryptError(WeComCryptoError):
pass
class PKCS7Encoder:
block_size = 32
@classmethod
def encode(cls, text: bytes) -> bytes:
amount_to_pad = cls.block_size - (len(text) % cls.block_size)
if amount_to_pad == 0:
amount_to_pad = cls.block_size
pad = bytes([amount_to_pad]) * amount_to_pad
return text + pad
@classmethod
def decode(cls, decrypted: bytes) -> bytes:
if not decrypted:
raise DecryptError("empty decrypted payload")
pad = decrypted[-1]
if pad < 1 or pad > cls.block_size:
raise DecryptError("invalid PKCS7 padding")
if decrypted[-pad:] != bytes([pad]) * pad:
raise DecryptError("malformed PKCS7 padding")
return decrypted[:-pad]
def _sha1_signature(token: str, timestamp: str, nonce: str, encrypt: str) -> str:
parts = sorted([token, timestamp, nonce, encrypt])
return hashlib.sha1("".join(parts).encode("utf-8")).hexdigest()
class WXBizMsgCrypt:
"""Minimal WeCom callback crypto helper compatible with BizMsgCrypt semantics."""
def __init__(self, token: str, encoding_aes_key: str, receive_id: str):
if not token:
raise ValueError("token is required")
if not encoding_aes_key:
raise ValueError("encoding_aes_key is required")
if len(encoding_aes_key) != 43:
raise ValueError("encoding_aes_key must be 43 chars")
if not receive_id:
raise ValueError("receive_id is required")
self.token = token
self.receive_id = receive_id
self.key = base64.b64decode(encoding_aes_key + "=")
self.iv = self.key[:16]
def verify_url(self, msg_signature: str, timestamp: str, nonce: str, echostr: str) -> str:
plain = self.decrypt(msg_signature, timestamp, nonce, echostr)
return plain.decode("utf-8")
def decrypt(self, msg_signature: str, timestamp: str, nonce: str, encrypt: str) -> bytes:
expected = _sha1_signature(self.token, timestamp, nonce, encrypt)
if expected != msg_signature:
raise SignatureError("signature mismatch")
try:
cipher_text = base64.b64decode(encrypt)
except Exception as exc:
raise DecryptError(f"invalid base64 payload: {exc}") from exc
try:
cipher = Cipher(algorithms.AES(self.key), modes.CBC(self.iv), backend=default_backend())
decryptor = cipher.decryptor()
padded = decryptor.update(cipher_text) + decryptor.finalize()
plain = PKCS7Encoder.decode(padded)
content = plain[16:] # skip 16-byte random prefix
xml_length = socket.ntohl(struct.unpack("I", content[:4])[0])
xml_content = content[4:4 + xml_length]
receive_id = content[4 + xml_length:].decode("utf-8")
except WeComCryptoError:
raise
except Exception as exc:
raise DecryptError(f"decrypt failed: {exc}") from exc
if receive_id != self.receive_id:
raise DecryptError("receive_id mismatch")
return xml_content
def encrypt(self, plaintext: str, nonce: Optional[str] = None, timestamp: Optional[str] = None) -> str:
nonce = nonce or self._random_nonce()
timestamp = timestamp or str(int(__import__("time").time()))
encrypt = self._encrypt_bytes(plaintext.encode("utf-8"))
signature = _sha1_signature(self.token, timestamp, nonce, encrypt)
root = ET.Element("xml")
ET.SubElement(root, "Encrypt").text = encrypt
ET.SubElement(root, "MsgSignature").text = signature
ET.SubElement(root, "TimeStamp").text = timestamp
ET.SubElement(root, "Nonce").text = nonce
return ET.tostring(root, encoding="unicode")
def _encrypt_bytes(self, raw: bytes) -> str:
try:
random_prefix = os.urandom(16)
msg_len = struct.pack("I", socket.htonl(len(raw)))
payload = random_prefix + msg_len + raw + self.receive_id.encode("utf-8")
padded = PKCS7Encoder.encode(payload)
cipher = Cipher(algorithms.AES(self.key), modes.CBC(self.iv), backend=default_backend())
encryptor = cipher.encryptor()
encrypted = encryptor.update(padded) + encryptor.finalize()
return base64.b64encode(encrypted).decode("utf-8")
except Exception as exc:
raise EncryptError(f"encrypt failed: {exc}") from exc
@staticmethod
def _random_nonce(length: int = 10) -> str:
alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
return "".join(secrets.choice(alphabet) for _ in range(length))
File diff suppressed because it is too large Load Diff
+57 -106
View File
@@ -120,9 +120,8 @@ class WhatsAppAdapter(BasePlatformAdapter):
- session_path: Path to store WhatsApp session data
"""
# WhatsApp message limits — practical UX limit, not protocol max.
# WhatsApp allows ~65K but long messages are unreadable on mobile.
MAX_MESSAGE_LENGTH = 4096
# WhatsApp message limits
MAX_MESSAGE_LENGTH = 65536 # WhatsApp allows longer messages
# Default bridge location relative to the hermes-agent install
_DEFAULT_BRIDGE_DIR = Path(__file__).resolve().parents[2] / "scripts" / "whatsapp-bridge"
@@ -146,6 +145,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
self._bridge_log: Optional[Path] = None
self._poll_task: Optional[asyncio.Task] = None
self._http_session: Optional["aiohttp.ClientSession"] = None
self._session_lock_identity: Optional[str] = None
def _whatsapp_require_mention(self) -> bool:
configured = self.config.extra.get("require_mention")
@@ -290,7 +290,23 @@ class WhatsAppAdapter(BasePlatformAdapter):
# Acquire scoped lock to prevent duplicate sessions
try:
if not self._acquire_platform_lock('whatsapp-session', str(self._session_path), 'WhatsApp session'):
from gateway.status import acquire_scoped_lock
self._session_lock_identity = str(self._session_path)
acquired, existing = acquire_scoped_lock(
"whatsapp-session",
self._session_lock_identity,
metadata={"platform": self.platform.value},
)
if not acquired:
owner_pid = existing.get("pid") if isinstance(existing, dict) else None
message = (
"Another local Hermes gateway is already using this WhatsApp session"
+ (f" (PID {owner_pid})." if owner_pid else ".")
+ " Stop the other gateway before starting a second WhatsApp bridge."
)
logger.error("[%s] %s", self.name, message)
self._set_fatal_error("whatsapp_session_lock", message, retryable=False)
return False
except Exception as e:
logger.warning("[%s] Could not acquire session lock (non-fatal): %s", self.name, e)
@@ -452,7 +468,12 @@ class WhatsAppAdapter(BasePlatformAdapter):
return True
except Exception as e:
self._release_platform_lock()
if self._session_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("whatsapp-session", self._session_lock_identity)
except Exception:
pass
logger.error("[%s] Failed to start bridge: %s", self.name, e, exc_info=True)
self._close_bridge_log()
return False
@@ -525,70 +546,19 @@ class WhatsAppAdapter(BasePlatformAdapter):
await self._http_session.close()
self._http_session = None
self._release_platform_lock()
if self._session_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("whatsapp-session", self._session_lock_identity)
except Exception as e:
logger.warning("[%s] Error releasing WhatsApp session lock: %s", self.name, e, exc_info=True)
self._mark_disconnected()
self._bridge_process = None
self._close_bridge_log()
self._session_lock_identity = None
print(f"[{self.name}] Disconnected")
def format_message(self, content: str) -> str:
"""Convert standard markdown to WhatsApp-compatible formatting.
WhatsApp supports: *bold*, _italic_, ~strikethrough~, ```code```,
and monospaced `inline`. Standard markdown uses different syntax
for bold/italic/strikethrough, so we convert here.
Code blocks (``` fenced) and inline code (`) are protected from
conversion via placeholder substitution.
"""
if not content:
return content
# --- 1. Protect fenced code blocks from formatting changes ---
_FENCE_PH = "\x00FENCE"
fences: list[str] = []
def _save_fence(m: re.Match) -> str:
fences.append(m.group(0))
return f"{_FENCE_PH}{len(fences) - 1}\x00"
result = re.sub(r"```[\s\S]*?```", _save_fence, content)
# --- 2. Protect inline code ---
_CODE_PH = "\x00CODE"
codes: list[str] = []
def _save_code(m: re.Match) -> str:
codes.append(m.group(0))
return f"{_CODE_PH}{len(codes) - 1}\x00"
result = re.sub(r"`[^`\n]+`", _save_code, result)
# --- 3. Convert markdown formatting to WhatsApp syntax ---
# Bold: **text** or __text__ → *text*
result = re.sub(r"\*\*(.+?)\*\*", r"*\1*", result)
result = re.sub(r"__(.+?)__", r"*\1*", result)
# Strikethrough: ~~text~~ → ~text~
result = re.sub(r"~~(.+?)~~", r"~\1~", result)
# Italic: *text* is already WhatsApp italic — leave as-is
# _text_ is already WhatsApp italic — leave as-is
# --- 4. Convert markdown headers to bold text ---
# # Header → *Header*
result = re.sub(r"^#{1,6}\s+(.+)$", r"*\1*", result, flags=re.MULTILINE)
# --- 5. Convert markdown links: [text](url) → text (url) ---
result = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r"\1 (\2)", result)
# --- 6. Restore protected sections ---
for i, fence in enumerate(fences):
result = result.replace(f"{_FENCE_PH}{i}\x00", fence)
for i, code in enumerate(codes):
result = result.replace(f"{_CODE_PH}{i}\x00", code)
return result
async def send(
self,
chat_id: str,
@@ -596,57 +566,38 @@ class WhatsAppAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None
) -> SendResult:
"""Send a message via the WhatsApp bridge.
Formats markdown for WhatsApp, splits long messages into chunks
that preserve code block boundaries, and sends each chunk sequentially.
"""
"""Send a message via the WhatsApp bridge."""
if not self._running or not self._http_session:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
return SendResult(success=False, error=bridge_exit)
if not content or not content.strip():
return SendResult(success=True, message_id=None)
try:
import aiohttp
# Format and chunk the message
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
last_message_id = None
for chunk in chunks:
payload: Dict[str, Any] = {
"chatId": chat_id,
"message": chunk,
}
if reply_to and last_message_id is None:
# Only reply-to on the first chunk
payload["replyTo"] = reply_to
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/send",
json=payload,
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
data = await resp.json()
last_message_id = data.get("messageId")
else:
error = await resp.text()
return SendResult(success=False, error=error)
# Small delay between chunks to avoid rate limiting
if len(chunks) > 1:
await asyncio.sleep(0.3)
return SendResult(
success=True,
message_id=last_message_id,
)
payload = {
"chatId": chat_id,
"message": content,
}
if reply_to:
payload["replyTo"] = reply_to
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/send",
json=payload,
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
-20
View File
@@ -1,20 +0,0 @@
"""Shared gateway restart constants and parsing helpers."""
from hermes_cli.config import DEFAULT_CONFIG
# EX_TEMPFAIL from sysexits.h — used to ask the service manager to restart
# the gateway after a graceful drain/reload path completes.
GATEWAY_SERVICE_RESTART_EXIT_CODE = 75
DEFAULT_GATEWAY_RESTART_DRAIN_TIMEOUT = float(
DEFAULT_CONFIG["agent"]["restart_drain_timeout"]
)
def parse_restart_drain_timeout(raw: object) -> float:
"""Parse a configured drain timeout, falling back to the shared default."""
try:
value = float(raw) if str(raw or "").strip() else DEFAULT_GATEWAY_RESTART_DRAIN_TIMEOUT
except (TypeError, ValueError):
return DEFAULT_GATEWAY_RESTART_DRAIN_TIMEOUT
return max(0.0, value)
+498 -1818
View File
File diff suppressed because it is too large Load Diff
+36 -51
View File
@@ -368,11 +368,6 @@ class SessionEntry:
# survives gateway restarts (the old in-memory _pre_flushed_sessions
# set was lost on restart, causing redundant re-flushes).
memory_flushed: bool = False
# When True the next call to get_or_create_session() will auto-reset
# this session (create a new session_id) so the user starts fresh.
# Set by /stop to break stuck-resume loops (#7536).
suspended: bool = False
def to_dict(self) -> Dict[str, Any]:
result = {
@@ -392,7 +387,6 @@ class SessionEntry:
"estimated_cost_usd": self.estimated_cost_usd,
"cost_status": self.cost_status,
"memory_flushed": self.memory_flushed,
"suspended": self.suspended,
}
if self.origin:
result["origin"] = self.origin.to_dict()
@@ -429,7 +423,6 @@ class SessionEntry:
estimated_cost_usd=data.get("estimated_cost_usd", 0.0),
cost_status=data.get("cost_status", "unknown"),
memory_flushed=data.get("memory_flushed", False),
suspended=data.get("suspended", False),
)
@@ -705,12 +698,7 @@ class SessionStore:
if session_key in self._entries and not force_new:
entry = self._entries[session_key]
# Auto-reset sessions marked as suspended (e.g. after /stop
# broke a stuck loop — #7536).
if entry.suspended:
reset_reason = "suspended"
else:
reset_reason = self._should_reset(entry, source)
reset_reason = self._should_reset(entry, source)
if not reset_reason:
entry.updated_at = now
self._save()
@@ -765,6 +753,41 @@ class SessionStore:
except Exception as e:
print(f"[gateway] Warning: Failed to create SQLite session: {e}")
# Seed new DM thread sessions with parent DM session history.
# When a bot reply creates a Slack thread and the user responds in it,
# the thread gets a new session (keyed by thread_ts). Without seeding,
# the thread session starts with zero context — the user's original
# question and the bot's answer are invisible. Fix: copy the parent
# DM session's transcript into the new thread session so context carries
# over while still keeping threads isolated from each other.
if (
source.chat_type == "dm"
and source.thread_id
and entry.created_at == entry.updated_at # brand-new session
and not was_auto_reset
):
parent_source = SessionSource(
platform=source.platform,
chat_id=source.chat_id,
chat_type="dm",
user_id=source.user_id,
# no thread_id — this is the parent DM session
)
parent_key = self._generate_session_key(parent_source)
with self._lock:
parent_entry = self._entries.get(parent_key)
if parent_entry and parent_entry.session_id != entry.session_id:
try:
parent_history = self.load_transcript(parent_entry.session_id)
if parent_history:
self.rewrite_transcript(entry.session_id, parent_history)
logger.info(
"[Session] Seeded DM thread session %s with %d messages from parent %s",
entry.session_id, len(parent_history), parent_entry.session_id,
)
except Exception as e:
logger.warning("[Session] Failed to seed thread session: %s", e)
return entry
def update_session(
@@ -783,44 +806,6 @@ class SessionStore:
entry.last_prompt_tokens = last_prompt_tokens
self._save()
def suspend_session(self, session_key: str) -> bool:
"""Mark a session as suspended so it auto-resets on next access.
Used by ``/stop`` to prevent stuck sessions from being resumed
after a gateway restart (#7536). Returns True if the session
existed and was marked.
"""
with self._lock:
self._ensure_loaded_locked()
if session_key in self._entries:
self._entries[session_key].suspended = True
self._save()
return True
return False
def suspend_recently_active(self, max_age_seconds: int = 120) -> int:
"""Mark recently-active sessions as suspended.
Called on gateway startup to prevent sessions that were likely
in-flight when the gateway last exited from being blindly resumed
(#7536). Only suspends sessions updated within *max_age_seconds*
to avoid resetting long-idle sessions that are harmless to resume.
Returns the number of sessions that were suspended.
"""
from datetime import timedelta
cutoff = _now() - timedelta(seconds=max_age_seconds)
count = 0
with self._lock:
self._ensure_loaded_locked()
for entry in self._entries.values():
if not entry.suspended and entry.updated_at >= cutoff:
entry.suspended = True
count += 1
if count:
self._save()
return count
def reset_session(self, session_key: str) -> Optional[SessionEntry]:
"""Force reset a session, creating a new session ID."""
db_end_session_id = None
-128
View File
@@ -1,128 +0,0 @@
"""
Session-scoped context variables for the Hermes gateway.
Replaces the previous ``os.environ``-based session state
(``HERMES_SESSION_PLATFORM``, ``HERMES_SESSION_CHAT_ID``, etc.) with
Python's ``contextvars.ContextVar``.
**Why this matters**
The gateway processes messages concurrently via ``asyncio``. When two
messages arrive at the same time the old code did:
os.environ["HERMES_SESSION_THREAD_ID"] = str(context.source.thread_id)
Because ``os.environ`` is *process-global*, Message A's value was
silently overwritten by Message B before Message A's agent finished
running. Background-task notifications and tool calls therefore routed
to the wrong thread.
``contextvars.ContextVar`` values are *task-local*: each ``asyncio``
task (and any ``run_in_executor`` thread it spawns) gets its own copy,
so concurrent messages never interfere.
**Backward compatibility**
The public helper ``get_session_env(name, default="")`` mirrors the old
``os.getenv("HERMES_SESSION_*", ...)`` calls. Existing tool code only
needs to replace the import + call site:
# before
import os
platform = os.getenv("HERMES_SESSION_PLATFORM", "")
# after
from gateway.session_context import get_session_env
platform = get_session_env("HERMES_SESSION_PLATFORM", "")
"""
from contextvars import ContextVar
# ---------------------------------------------------------------------------
# Per-task session variables
# ---------------------------------------------------------------------------
_SESSION_PLATFORM: ContextVar[str] = ContextVar("HERMES_SESSION_PLATFORM", default="")
_SESSION_CHAT_ID: ContextVar[str] = ContextVar("HERMES_SESSION_CHAT_ID", default="")
_SESSION_CHAT_NAME: ContextVar[str] = ContextVar("HERMES_SESSION_CHAT_NAME", default="")
_SESSION_THREAD_ID: ContextVar[str] = ContextVar("HERMES_SESSION_THREAD_ID", default="")
_SESSION_USER_ID: ContextVar[str] = ContextVar("HERMES_SESSION_USER_ID", default="")
_SESSION_USER_NAME: ContextVar[str] = ContextVar("HERMES_SESSION_USER_NAME", default="")
_SESSION_KEY: ContextVar[str] = ContextVar("HERMES_SESSION_KEY", default="")
_VAR_MAP = {
"HERMES_SESSION_PLATFORM": _SESSION_PLATFORM,
"HERMES_SESSION_CHAT_ID": _SESSION_CHAT_ID,
"HERMES_SESSION_CHAT_NAME": _SESSION_CHAT_NAME,
"HERMES_SESSION_THREAD_ID": _SESSION_THREAD_ID,
"HERMES_SESSION_USER_ID": _SESSION_USER_ID,
"HERMES_SESSION_USER_NAME": _SESSION_USER_NAME,
"HERMES_SESSION_KEY": _SESSION_KEY,
}
def set_session_vars(
platform: str = "",
chat_id: str = "",
chat_name: str = "",
thread_id: str = "",
user_id: str = "",
user_name: str = "",
session_key: str = "",
) -> list:
"""Set all session context variables and return reset tokens.
Call ``clear_session_vars(tokens)`` in a ``finally`` block to restore
the previous values when the handler exits.
Returns a list of ``Token`` objects (one per variable) that can be
passed to ``clear_session_vars``.
"""
tokens = [
_SESSION_PLATFORM.set(platform),
_SESSION_CHAT_ID.set(chat_id),
_SESSION_CHAT_NAME.set(chat_name),
_SESSION_THREAD_ID.set(thread_id),
_SESSION_USER_ID.set(user_id),
_SESSION_USER_NAME.set(user_name),
_SESSION_KEY.set(session_key),
]
return tokens
def clear_session_vars(tokens: list) -> None:
"""Restore session context variables to their pre-handler values."""
if not tokens:
return
vars_in_order = [
_SESSION_PLATFORM,
_SESSION_CHAT_ID,
_SESSION_CHAT_NAME,
_SESSION_THREAD_ID,
_SESSION_USER_ID,
_SESSION_USER_NAME,
_SESSION_KEY,
]
for var, token in zip(vars_in_order, tokens):
var.reset(token)
def get_session_env(name: str, default: str = "") -> str:
"""Read a session context variable by its legacy ``HERMES_SESSION_*`` name.
Drop-in replacement for ``os.getenv("HERMES_SESSION_*", default)``.
Resolution order:
1. Context variable (set by the gateway for concurrency-safe access)
2. ``os.environ`` (used by CLI, cron scheduler, and tests)
3. *default*
"""
import os
var = _VAR_MAP.get(name)
if var is not None:
value = var.get()
if value:
return value
# Fall back to os.environ for CLI, cron, and test compatibility
return os.getenv(name, default)
+12 -51
View File
@@ -14,8 +14,6 @@ concurrently under distinct configurations).
import hashlib
import json
import os
import signal
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path
@@ -25,8 +23,6 @@ from typing import Any, Optional
_GATEWAY_KIND = "hermes-gateway"
_RUNTIME_STATUS_FILE = "gateway_state.json"
_LOCKS_DIRNAME = "gateway-locks"
_IS_WINDOWS = sys.platform == "win32"
_UNSET = object()
def _get_pid_path() -> Path:
@@ -53,33 +49,6 @@ def _utc_now_iso() -> str:
return datetime.now(timezone.utc).isoformat()
def terminate_pid(pid: int, *, force: bool = False) -> None:
"""Terminate a PID with platform-appropriate force semantics.
POSIX uses SIGTERM/SIGKILL. Windows uses taskkill /T /F for true force-kill
because os.kill(..., SIGTERM) is not equivalent to a tree-killing hard stop.
"""
if force and _IS_WINDOWS:
try:
result = subprocess.run(
["taskkill", "/PID", str(pid), "/T", "/F"],
capture_output=True,
text=True,
timeout=10,
)
except FileNotFoundError:
os.kill(pid, signal.SIGTERM)
return
if result.returncode != 0:
details = (result.stderr or result.stdout or "").strip()
raise OSError(details or f"taskkill failed for PID {pid}")
return
sig = signal.SIGTERM if not force else getattr(signal, "SIGKILL", signal.SIGTERM)
os.kill(pid, sig)
def _scope_hash(identity: str) -> str:
return hashlib.sha256(identity.encode("utf-8")).hexdigest()[:16]
@@ -159,8 +128,6 @@ def _build_runtime_status_record() -> dict[str, Any]:
payload.update({
"gateway_state": "starting",
"exit_reason": None,
"restart_requested": False,
"active_agents": 0,
"platforms": {},
"updated_at": _utc_now_iso(),
})
@@ -219,14 +186,12 @@ def write_pid_file() -> None:
def write_runtime_status(
*,
gateway_state: Any = _UNSET,
exit_reason: Any = _UNSET,
restart_requested: Any = _UNSET,
active_agents: Any = _UNSET,
platform: Any = _UNSET,
platform_state: Any = _UNSET,
error_code: Any = _UNSET,
error_message: Any = _UNSET,
gateway_state: Optional[str] = None,
exit_reason: Optional[str] = None,
platform: Optional[str] = None,
platform_state: Optional[str] = None,
error_code: Optional[str] = None,
error_message: Optional[str] = None,
) -> None:
"""Persist gateway runtime health information for diagnostics/status."""
path = _get_runtime_status_path()
@@ -237,22 +202,18 @@ def write_runtime_status(
payload["start_time"] = _get_process_start_time(os.getpid())
payload["updated_at"] = _utc_now_iso()
if gateway_state is not _UNSET:
if gateway_state is not None:
payload["gateway_state"] = gateway_state
if exit_reason is not _UNSET:
if exit_reason is not None:
payload["exit_reason"] = exit_reason
if restart_requested is not _UNSET:
payload["restart_requested"] = bool(restart_requested)
if active_agents is not _UNSET:
payload["active_agents"] = max(0, int(active_agents))
if platform is not _UNSET:
if platform is not None:
platform_payload = payload["platforms"].get(platform, {})
if platform_state is not _UNSET:
if platform_state is not None:
platform_payload["state"] = platform_state
if error_code is not _UNSET:
if error_code is not None:
platform_payload["error_code"] = error_code
if error_message is not _UNSET:
if error_message is not None:
platform_payload["error_message"] = error_message
platform_payload["updated_at"] = _utc_now_iso()
payload["platforms"][platform] = platform_payload
+54 -249
View File
@@ -32,15 +32,11 @@ _DONE = object()
# new one so that subsequent text appears below tool progress messages.
_NEW_SEGMENT = object()
# Queue marker for a completed assistant commentary message emitted between
# API/tool iterations (for example: "I'll inspect the repo first.").
_COMMENTARY = object()
@dataclass
class StreamConsumerConfig:
"""Runtime config for a single stream consumer instance."""
edit_interval: float = 1.0
edit_interval: float = 0.3
buffer_threshold: int = 40
cursor: str = ""
@@ -60,10 +56,6 @@ class GatewayStreamConsumer:
await task # wait for final edit
"""
# After this many consecutive flood-control failures, permanently disable
# progressive edits for the remainder of the stream.
_MAX_FLOOD_STRIKES = 3
def __init__(
self,
adapter: Any,
@@ -79,43 +71,18 @@ class GatewayStreamConsumer:
self._accumulated = ""
self._message_id: Optional[str] = None
self._already_sent = False
self._edit_supported = True # Disabled when progressive edits are no longer usable
self._edit_supported = True # Disabled on first edit failure (Signal/Email/HA)
self._last_edit_time = 0.0
self._last_sent_text = "" # Track last-sent text to skip redundant edits
self._fallback_final_send = False
self._fallback_prefix = ""
self._flood_strikes = 0 # Consecutive flood-control edit failures
self._current_edit_interval = self.cfg.edit_interval # Adaptive backoff
self._final_response_sent = False
@property
def already_sent(self) -> bool:
"""True if at least one message was sent or edited during the run."""
"""True if at least one message was sent/edited — signals the base
adapter to skip re-sending the final response."""
return self._already_sent
@property
def final_response_sent(self) -> bool:
"""True when the stream consumer delivered the final assistant reply."""
return self._final_response_sent
def on_segment_break(self) -> None:
"""Finalize the current stream segment and start a fresh message."""
self._queue.put(_NEW_SEGMENT)
def on_commentary(self, text: str) -> None:
"""Queue a completed interim assistant commentary message."""
if text:
self._queue.put((_COMMENTARY, text))
def _reset_segment_state(self, *, preserve_no_edit: bool = False) -> None:
if preserve_no_edit and self._message_id == "__no_edit__":
return
self._message_id = None
self._accumulated = ""
self._last_sent_text = ""
self._fallback_final_send = False
self._fallback_prefix = ""
def on_delta(self, text: str) -> None:
"""Thread-safe callback — called from the agent's worker thread.
@@ -126,7 +93,7 @@ class GatewayStreamConsumer:
if text:
self._queue.put(text)
elif text is None:
self.on_segment_break()
self._queue.put(_NEW_SEGMENT)
def finish(self) -> None:
"""Signal that the stream is complete."""
@@ -143,7 +110,6 @@ class GatewayStreamConsumer:
# Drain all available items from the queue
got_done = False
got_segment_break = False
commentary_text = None
while True:
try:
item = self._queue.get_nowait()
@@ -153,9 +119,6 @@ class GatewayStreamConsumer:
if item is _NEW_SEGMENT:
got_segment_break = True
break
if isinstance(item, tuple) and len(item) == 2 and item[0] is _COMMENTARY:
commentary_text = item[1]
break
self._accumulated += item
except queue.Empty:
break
@@ -166,13 +129,11 @@ class GatewayStreamConsumer:
should_edit = (
got_done
or got_segment_break
or commentary_text is not None
or (elapsed >= self._current_edit_interval
or (elapsed >= self.cfg.edit_interval
and self._accumulated)
or len(self._accumulated) >= self.cfg.buffer_threshold
)
current_update_visible = False
if should_edit and self._accumulated:
# Split overflow: if accumulated text exceeds the platform
# limit, split into properly sized chunks.
@@ -194,7 +155,6 @@ class GatewayStreamConsumer:
self._last_sent_text = ""
self._last_edit_time = time.monotonic()
if got_done:
self._final_response_sent = self._already_sent
return
if got_segment_break:
self._message_id = None
@@ -213,39 +173,25 @@ class GatewayStreamConsumer:
if split_at < _safe_limit // 2:
split_at = _safe_limit
chunk = self._accumulated[:split_at]
ok = await self._send_or_edit(chunk)
if self._fallback_final_send or not ok:
# Edit failed (or backed off due to flood control)
# while attempting to split an oversized message.
# Keep the full accumulated text intact so the
# fallback final-send path can deliver the remaining
# continuation without dropping content.
await self._send_or_edit(chunk)
if self._fallback_final_send:
# Edit failed while attempting to split an oversized
# message. Keep the full accumulated text intact so
# the fallback final-send path can deliver the
# remaining continuation without dropping content.
break
self._accumulated = self._accumulated[split_at:].lstrip("\n")
self._message_id = None
self._last_sent_text = ""
display_text = self._accumulated
if not got_done and not got_segment_break and commentary_text is None:
if not got_done and not got_segment_break:
display_text += self.cfg.cursor
current_update_visible = await self._send_or_edit(display_text)
await self._send_or_edit(display_text)
self._last_edit_time = time.monotonic()
if got_done:
# Extract inline keyboard tags before the final delivery
# so the raw [KEYBOARD:...] block is never shown to users.
inline_keyboard = None
if self._accumulated:
try:
inline_keyboard, self._accumulated = self.adapter.extract_inline_keyboard(self._accumulated)
except Exception as kb_err:
logger.warning("Stream: failed to extract keyboard tag: %s", kb_err)
# Keyboard-only response: provide placeholder so there's
# a message to attach buttons to.
if inline_keyboard and not self._accumulated:
self._accumulated = "Please choose:"
# Final edit without cursor. If progressive editing failed
# mid-stream, send a single continuation/fallback message
# here instead of letting the base gateway path send the
@@ -253,50 +199,22 @@ class GatewayStreamConsumer:
if self._accumulated:
if self._fallback_final_send:
await self._send_fallback_final(self._accumulated)
elif current_update_visible:
self._final_response_sent = True
elif self._message_id:
self._final_response_sent = await self._send_or_edit(self._accumulated)
await self._send_or_edit(self._accumulated)
elif not self._already_sent:
self._final_response_sent = await self._send_or_edit(self._accumulated)
# Attach inline keyboard to the final streamed message
if (inline_keyboard
and self._message_id
and self._message_id != "__no_edit__"):
try:
await self.adapter.attach_inline_keyboard(
chat_id=self.chat_id,
message_id=str(self._message_id),
buttons=inline_keyboard,
)
except Exception as kb_err:
logger.warning("Stream: failed to attach keyboard: %s", kb_err)
await self._send_or_edit(self._accumulated)
return
if commentary_text is not None:
self._reset_segment_state()
await self._send_commentary(commentary_text)
self._last_edit_time = time.monotonic()
self._reset_segment_state()
# Tool boundary: reset message state so the next text chunk
# creates a fresh message below any tool-progress messages.
#
# Exception: when _message_id is "__no_edit__" the platform
# never returned a real message ID (e.g. Signal, webhook with
# github_comment delivery). Resetting to None would re-enter
# the "first send" path on every tool boundary and post one
# platform message per tool call — that is what caused 155
# comments under a single PR. Instead, preserve the sentinel
# so the full continuation is delivered once via
# _send_fallback_final.
# (When editing fails mid-stream due to flood control the id is
# a real string like "msg_1", not "__no_edit__", so that case
# still resets and creates a fresh segment as intended.)
# Tool boundary: the should_edit block above already flushed
# accumulated text without a cursor. Reset state so the next
# text chunk creates a fresh message below any tool-progress
# messages the gateway sent in between.
if got_segment_break:
self._reset_segment_state(preserve_no_edit=True)
self._message_id = None
self._accumulated = ""
self._last_sent_text = ""
self._fallback_final_send = False
self._fallback_prefix = ""
await asyncio.sleep(0.05) # Small yield to not busy-loop
@@ -314,8 +232,6 @@ class GatewayStreamConsumer:
# Matches the simple cleanup regex used by the non-streaming path in
# gateway/platforms/base.py for post-processing.
_MEDIA_RE = re.compile(r'''[`"']?MEDIA:\s*\S+[`"']?''')
# Pattern to strip complete [KEYBOARD: ...] blocks from streaming display.
_KEYBOARD_RE = re.compile(r'\[KEYBOARD:\s*\n.*?\]', re.DOTALL)
@staticmethod
def _clean_for_display(text: str) -> str:
@@ -328,17 +244,10 @@ class GatewayStreamConsumer:
stream finishes we just need to hide the raw directives from the
user.
"""
if ("MEDIA:" not in text
and "[[audio_as_voice]]" not in text
and "[KEYBOARD:" not in text):
if "MEDIA:" not in text and "[[audio_as_voice]]" not in text:
return text
cleaned = text.replace("[[audio_as_voice]]", "")
cleaned = GatewayStreamConsumer._MEDIA_RE.sub("", cleaned)
# Strip complete [KEYBOARD:...] blocks; if the closing ']' hasn't
# arrived yet, hide everything from the opening tag onward.
cleaned = GatewayStreamConsumer._KEYBOARD_RE.sub("", cleaned)
if "[KEYBOARD:" in cleaned:
cleaned = cleaned.split("[KEYBOARD:", 1)[0]
# Collapse excessive blank lines left behind by removed tags
cleaned = re.sub(r'\n{3,}', '\n\n', cleaned)
# Strip trailing whitespace/newlines but preserve leading content
@@ -404,17 +313,13 @@ class GatewayStreamConsumer:
return chunks
async def _send_fallback_final(self, text: str) -> None:
"""Send the final continuation after streaming edits stop working.
Retries each chunk once on flood-control failures with a short delay.
"""
"""Send the final continuation after streaming edits stop working."""
final_text = self._clean_for_display(text)
continuation = self._continuation_text(final_text)
self._fallback_final_send = False
if not continuation.strip():
# Nothing new to send — the visible partial already matches final text.
self._already_sent = True
self._final_response_sent = True
return
raw_limit = getattr(self.adapter, "MAX_MESSAGE_LENGTH", 4096)
@@ -425,31 +330,17 @@ class GatewayStreamConsumer:
last_successful_chunk = ""
sent_any_chunk = False
for chunk in chunks:
# Try sending with one retry on flood-control errors.
result = None
for attempt in range(2):
result = await self.adapter.send(
chat_id=self.chat_id,
content=chunk,
metadata=self.metadata,
)
if result.success:
break
if attempt == 0 and self._is_flood_error(result):
logger.debug(
"Flood control on fallback send, retrying in 3s"
)
await asyncio.sleep(3.0)
else:
break # non-flood error or second attempt failed
if not result or not result.success:
result = await self.adapter.send(
chat_id=self.chat_id,
content=chunk,
metadata=self.metadata,
)
if not result.success:
if sent_any_chunk:
# Some continuation text already reached the user. Suppress
# the base gateway final-send path so we don't resend the
# full response and create another duplicate.
self._already_sent = True
self._final_response_sent = True
self._message_id = last_message_id
self._last_sent_text = last_successful_chunk
self._fallback_prefix = ""
@@ -467,74 +358,23 @@ class GatewayStreamConsumer:
self._message_id = last_message_id
self._already_sent = True
self._final_response_sent = True
self._last_sent_text = chunks[-1]
self._fallback_prefix = ""
def _is_flood_error(self, result) -> bool:
"""Check if a SendResult failure is due to flood control / rate limiting."""
err = getattr(result, "error", "") or ""
err_lower = err.lower()
return "flood" in err_lower or "retry after" in err_lower or "rate" in err_lower
async def _try_strip_cursor(self) -> None:
"""Best-effort edit to remove the cursor from the last visible message.
Called when entering fallback mode so the user doesn't see a stuck
cursor () in the partial message.
"""
if not self._message_id or self._message_id == "__no_edit__":
return
prefix = self._visible_prefix()
if not prefix or not prefix.strip():
return
try:
await self.adapter.edit_message(
chat_id=self.chat_id,
message_id=self._message_id,
content=prefix,
)
self._last_sent_text = prefix
except Exception:
pass # best-effort — don't let this block the fallback path
async def _send_commentary(self, text: str) -> bool:
"""Send a completed interim assistant commentary message."""
text = self._clean_for_display(text)
if not text.strip():
return False
try:
result = await self.adapter.send(
chat_id=self.chat_id,
content=text,
metadata=self.metadata,
)
if result.success:
self._already_sent = True
return True
except Exception as e:
logger.error("Commentary send error: %s", e)
return False
async def _send_or_edit(self, text: str) -> bool:
"""Send or edit the streaming message.
Returns True if the text was successfully delivered (sent or edited),
False otherwise. Callers like the overflow split loop use this to
decide whether to advance past the delivered chunk.
"""
async def _send_or_edit(self, text: str) -> None:
"""Send or edit the streaming message."""
# Strip MEDIA: directives so they don't appear as visible text.
# Media files are delivered as native attachments after the stream
# finishes (via _deliver_media_from_response in gateway/run.py).
text = self._clean_for_display(text)
if not text.strip():
return True # nothing to send is "success"
return
try:
if self._message_id is not None:
if self._edit_supported:
# Skip if text is identical to what we last sent
if text == self._last_sent_text:
return True
return
# Edit existing message
result = await self.adapter.edit_message(
chat_id=self.chat_id,
@@ -544,52 +384,19 @@ class GatewayStreamConsumer:
if result.success:
self._already_sent = True
self._last_sent_text = text
# Successful edit — reset flood strike counter
self._flood_strikes = 0
return True
else:
# Edit failed. If this looks like flood control / rate
# limiting, use adaptive backoff: double the edit interval
# and retry on the next cycle. Only permanently disable
# edits after _MAX_FLOOD_STRIKES consecutive failures.
if self._is_flood_error(result):
self._flood_strikes += 1
self._current_edit_interval = min(
self._current_edit_interval * 2, 10.0,
)
logger.debug(
"Flood control on edit (strike %d/%d), "
"backoff interval → %.1fs",
self._flood_strikes,
self._MAX_FLOOD_STRIKES,
self._current_edit_interval,
)
if self._flood_strikes < self._MAX_FLOOD_STRIKES:
# Don't disable edits yet — just slow down.
# Update _last_edit_time so the next edit
# respects the new interval.
self._last_edit_time = time.monotonic()
return False
# Non-flood error OR flood strikes exhausted: enter
# fallback mode — send only the missing tail once the
# If an edit fails mid-stream (especially Telegram flood control),
# stop progressive edits and send only the missing tail once the
# final response is available.
logger.debug(
"Edit failed (strikes=%d), entering fallback mode",
self._flood_strikes,
)
logger.debug("Edit failed, disabling streaming for this adapter")
self._fallback_prefix = self._visible_prefix()
self._fallback_final_send = True
self._edit_supported = False
self._already_sent = True
# Best-effort: strip the cursor from the last visible
# message so the user doesn't see a stuck ▉.
await self._try_strip_cursor()
return False
else:
# Editing not supported — skip intermediate updates.
# The final response will be sent by the fallback path.
return False
pass
else:
# First message — send new
result = await self.adapter.send(
@@ -597,25 +404,23 @@ class GatewayStreamConsumer:
content=text,
metadata=self.metadata,
)
if result.success:
if result.message_id:
self._message_id = result.message_id
else:
self._edit_supported = False
if result.success and result.message_id:
self._message_id = result.message_id
self._already_sent = True
self._last_sent_text = text
if not result.message_id:
self._fallback_prefix = self._visible_prefix()
self._fallback_final_send = True
# Sentinel prevents re-entering the first-send path on
# every delta/tool boundary when platforms accept a
# message but do not return an editable message id.
self._message_id = "__no_edit__"
return True
elif result.success:
# Platform accepted the message but returned no message_id
# (e.g. Signal). Can't edit without an ID — switch to
# fallback mode: suppress intermediate deltas, send only
# the missing tail once the final response is ready.
self._already_sent = True
self._edit_supported = False
self._fallback_prefix = self._clean_for_display(text)
self._fallback_final_send = True
# Sentinel prevents re-entering this branch on every delta
self._message_id = "__no_edit__"
else:
# Initial send failed — disable streaming for this session
self._edit_supported = False
return False
except Exception as e:
logger.error("Stream send/edit error: %s", e)
return False
+3 -173
View File
@@ -198,14 +198,6 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
api_key_env_vars=("DEEPSEEK_API_KEY",),
base_url_env_var="DEEPSEEK_BASE_URL",
),
"xai": ProviderConfig(
id="xai",
name="xAI",
auth_type="api_key",
inference_base_url="https://api.x.ai/v1",
api_key_env_vars=("XAI_API_KEY",),
base_url_env_var="XAI_BASE_URL",
),
"ai-gateway": ProviderConfig(
id="ai-gateway",
name="AI Gateway",
@@ -250,39 +242,9 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
api_key_env_vars=("HF_TOKEN",),
base_url_env_var="HF_BASE_URL",
),
"xiaomi": ProviderConfig(
id="xiaomi",
name="Xiaomi MiMo",
auth_type="api_key",
inference_base_url="https://api.xiaomimimo.com/v1",
api_key_env_vars=("XIAOMI_API_KEY",),
base_url_env_var="XIAOMI_BASE_URL",
),
}
# =============================================================================
# Anthropic Key Helper
# =============================================================================
def get_anthropic_key() -> str:
"""Return the first usable Anthropic credential, or ``""``.
Checks both the ``.env`` file (via ``get_env_value``) and the process
environment (``os.getenv``). The fallback order mirrors the
``PROVIDER_REGISTRY["anthropic"].api_key_env_vars`` tuple:
ANTHROPIC_API_KEY -> ANTHROPIC_TOKEN -> CLAUDE_CODE_OAUTH_TOKEN
"""
from hermes_cli.config import get_env_value
for var in PROVIDER_REGISTRY["anthropic"].api_key_env_vars:
value = get_env_value(var) or os.getenv(var, "")
if value:
return value
return ""
# =============================================================================
# Kimi Code Endpoint Detection
# =============================================================================
@@ -742,27 +704,6 @@ def write_credential_pool(provider_id: str, entries: List[Dict[str, Any]]) -> Pa
return _save_auth_store(auth_store)
def suppress_credential_source(provider_id: str, source: str) -> None:
"""Mark a credential source as suppressed so it won't be re-seeded."""
with _auth_store_lock():
auth_store = _load_auth_store()
suppressed = auth_store.setdefault("suppressed_sources", {})
provider_list = suppressed.setdefault(provider_id, [])
if source not in provider_list:
provider_list.append(source)
_save_auth_store(auth_store)
def is_source_suppressed(provider_id: str, source: str) -> bool:
"""Check if a credential source has been suppressed by the user."""
try:
auth_store = _load_auth_store()
suppressed = auth_store.get("suppressed_sources", {})
return source in suppressed.get(provider_id, [])
except Exception:
return False
def get_provider_auth_state(provider_id: str) -> Optional[Dict[str, Any]]:
"""Return persisted auth state for a provider, or None."""
auth_store = _load_auth_store()
@@ -775,57 +716,6 @@ def get_active_provider() -> Optional[str]:
return auth_store.get("active_provider")
def is_provider_explicitly_configured(provider_id: str) -> bool:
"""Return True only if the user has explicitly configured this provider.
Checks:
1. active_provider in auth.json matches
2. model.provider in config.yaml matches
3. Provider-specific env vars are set (e.g. ANTHROPIC_API_KEY)
This is used to gate auto-discovery of external credentials (e.g.
Claude Code's ~/.claude/.credentials.json) so they are never used
without the user's explicit choice. See PR #4210 for the same
pattern applied to the setup wizard gate.
"""
normalized = (provider_id or "").strip().lower()
# 1. Check auth.json active_provider
try:
auth_store = _load_auth_store()
active = (auth_store.get("active_provider") or "").strip().lower()
if active and active == normalized:
return True
except Exception:
pass
# 2. Check config.yaml model.provider
try:
from hermes_cli.config import load_config
cfg = load_config()
model_cfg = cfg.get("model")
if isinstance(model_cfg, dict):
cfg_provider = (model_cfg.get("provider") or "").strip().lower()
if cfg_provider == normalized:
return True
except Exception:
pass
# 3. Check provider-specific env vars
# Exclude CLAUDE_CODE_OAUTH_TOKEN — it's set by Claude Code itself,
# not by the user explicitly configuring anthropic in Hermes.
_IMPLICIT_ENV_VARS = {"CLAUDE_CODE_OAUTH_TOKEN"}
pconfig = PROVIDER_REGISTRY.get(normalized)
if pconfig and pconfig.auth_type == "api_key":
for env_var in pconfig.api_key_env_vars:
if env_var in _IMPLICIT_ENV_VARS:
continue
if has_usable_secret(os.getenv(env_var, "")):
return True
return False
def clear_provider_auth(provider_id: Optional[str] = None) -> bool:
"""
Clear auth state for a provider. Used by `hermes logout`.
@@ -928,7 +818,7 @@ def resolve_provider(
_PROVIDER_ALIASES = {
"glm": "zai", "z-ai": "zai", "z.ai": "zai", "zhipu": "zai",
"google": "gemini", "google-gemini": "gemini", "google-ai-studio": "gemini",
"kimi": "kimi-coding", "kimi-for-coding": "kimi-coding", "moonshot": "kimi-coding",
"kimi": "kimi-coding", "moonshot": "kimi-coding",
"minimax-china": "minimax-cn", "minimax_cn": "minimax-cn",
"claude": "anthropic", "claude-code": "anthropic",
"github": "copilot", "github-copilot": "copilot",
@@ -938,7 +828,6 @@ def resolve_provider(
"opencode": "opencode-zen", "zen": "opencode-zen",
"qwen-portal": "qwen-oauth", "qwen-cli": "qwen-oauth", "qwen-oauth": "qwen-oauth",
"hf": "huggingface", "hugging-face": "huggingface", "huggingface-hub": "huggingface",
"mimo": "xiaomi", "xiaomi-mimo": "xiaomi",
"go": "opencode-go", "opencode-go-sub": "opencode-go",
"kilo": "kilocode", "kilo-code": "kilocode", "kilo-gateway": "kilocode",
# Local server aliases — route through the generic custom provider
@@ -1303,49 +1192,6 @@ def _read_codex_tokens(*, _lock: bool = True) -> Dict[str, Any]:
}
def _write_codex_cli_tokens(
access_token: str,
refresh_token: str,
*,
last_refresh: Optional[str] = None,
) -> None:
"""Write refreshed tokens back to ~/.codex/auth.json.
OpenAI OAuth refresh tokens are single-use and rotate on every refresh.
When Hermes refreshes a token it consumes the old refresh_token; if we
don't write the new pair back, the Codex CLI (or VS Code extension) will
fail with ``refresh_token_reused`` on its next refresh attempt.
This mirrors the Anthropic write-back to ~/.claude/.credentials.json
via ``_write_claude_code_credentials()``.
"""
codex_home = os.getenv("CODEX_HOME", "").strip()
if not codex_home:
codex_home = str(Path.home() / ".codex")
auth_path = Path(codex_home).expanduser() / "auth.json"
try:
existing: Dict[str, Any] = {}
if auth_path.is_file():
existing = json.loads(auth_path.read_text(encoding="utf-8"))
if not isinstance(existing, dict):
existing = {}
tokens_dict = existing.get("tokens")
if not isinstance(tokens_dict, dict):
tokens_dict = {}
tokens_dict["access_token"] = access_token
tokens_dict["refresh_token"] = refresh_token
existing["tokens"] = tokens_dict
if last_refresh is not None:
existing["last_refresh"] = last_refresh
auth_path.parent.mkdir(parents=True, exist_ok=True)
auth_path.write_text(json.dumps(existing, indent=2), encoding="utf-8")
auth_path.chmod(0o600)
except (OSError, IOError) as exc:
logger.debug("Failed to write refreshed tokens to %s: %s", auth_path, exc)
def _save_codex_tokens(tokens: Dict[str, str], last_refresh: str = None) -> None:
"""Save Codex OAuth tokens to Hermes auth store (~/.hermes/auth.json)."""
if last_refresh is None:
@@ -1468,12 +1314,6 @@ def _refresh_codex_auth_tokens(
updated_tokens["refresh_token"] = refreshed["refresh_token"]
_save_codex_tokens(updated_tokens)
# Write back to ~/.codex/auth.json so Codex CLI / VS Code stay in sync.
_write_codex_cli_tokens(
refreshed["access_token"],
refreshed["refresh_token"],
last_refresh=refreshed.get("last_refresh"),
)
return updated_tokens
@@ -1601,15 +1441,7 @@ def _resolve_verify(
if effective_insecure:
return False
if effective_ca:
ca_path = str(effective_ca)
if not os.path.isfile(ca_path):
import logging
logging.getLogger("hermes.auth").warning(
"CA bundle path does not exist: %s — falling back to default certificates",
ca_path,
)
return True
return ca_path
return str(effective_ca)
return True
@@ -2712,8 +2544,6 @@ def _prompt_model_selection(
title=effective_title,
)
idx = menu.show()
from hermes_cli.curses_ui import flush_stdin
flush_stdin()
if idx is None:
return None
print()
@@ -2723,7 +2553,7 @@ def _prompt_model_selection(
custom = input("Enter model name: ").strip()
return custom if custom else None
return None
except (ImportError, NotImplementedError, OSError, subprocess.SubprocessError):
except (ImportError, NotImplementedError):
pass
# Fallback: numbered list
+2 -5
View File
@@ -347,11 +347,8 @@ def auth_remove_command(args) -> None:
print("Cleared Hermes Anthropic OAuth credentials")
elif removed.source == "claude_code" and provider == "anthropic":
from hermes_cli.auth import suppress_credential_source
suppress_credential_source(provider, "claude_code")
print("Suppressed claude_code credential — it will not be re-seeded.")
print("Note: Claude Code credentials still live in ~/.claude/.credentials.json")
print("Run `hermes auth add anthropic` to re-enable if needed.")
print("Note: Claude Code credentials live in ~/.claude/.credentials.json")
print(" Remove them manually if you want to deauthorize Claude Code.")
def auth_reset_command(args) -> None:
-655
View File
@@ -1,655 +0,0 @@
"""
Backup and import commands for hermes CLI.
`hermes backup` creates a zip archive of the entire ~/.hermes/ directory
(excluding the hermes-agent repo and transient files).
`hermes import` restores from a backup zip, overlaying onto the current
HERMES_HOME root.
"""
import json
import logging
import os
import shutil
import sqlite3
import sys
import tempfile
import time
import zipfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional
from hermes_constants import get_default_hermes_root, get_hermes_home, display_hermes_home
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Exclusion rules
# ---------------------------------------------------------------------------
# Directory names to skip entirely (matched against each path component)
_EXCLUDED_DIRS = {
"hermes-agent", # the codebase repo — re-clone instead
"__pycache__", # bytecode caches — regenerated on import
".git", # nested git dirs (profiles shouldn't have these, but safety)
"node_modules", # js deps if website/ somehow leaks in
}
# File-name suffixes to skip
_EXCLUDED_SUFFIXES = (
".pyc",
".pyo",
)
# File names to skip (runtime state that's meaningless on another machine)
_EXCLUDED_NAMES = {
"gateway.pid",
"cron.pid",
}
def _should_exclude(rel_path: Path) -> bool:
"""Return True if *rel_path* (relative to hermes root) should be skipped."""
parts = rel_path.parts
# Any path component matches an excluded dir name
for part in parts:
if part in _EXCLUDED_DIRS:
return True
name = rel_path.name
if name in _EXCLUDED_NAMES:
return True
if name.endswith(_EXCLUDED_SUFFIXES):
return True
return False
# ---------------------------------------------------------------------------
# SQLite safe copy
# ---------------------------------------------------------------------------
def _safe_copy_db(src: Path, dst: Path) -> bool:
"""Copy a SQLite database safely using the backup() API.
Handles WAL mode produces a consistent snapshot even while
the DB is being written to. Falls back to raw copy on failure.
"""
try:
conn = sqlite3.connect(f"file:{src}?mode=ro", uri=True)
backup_conn = sqlite3.connect(str(dst))
conn.backup(backup_conn)
backup_conn.close()
conn.close()
return True
except Exception as exc:
logger.warning("SQLite safe copy failed for %s: %s", src, exc)
try:
shutil.copy2(src, dst)
return True
except Exception as exc2:
logger.error("Raw copy also failed for %s: %s", src, exc2)
return False
# ---------------------------------------------------------------------------
# Backup
# ---------------------------------------------------------------------------
def _format_size(nbytes: int) -> str:
"""Human-readable file size."""
for unit in ("B", "KB", "MB", "GB"):
if nbytes < 1024:
return f"{nbytes:.1f} {unit}" if unit != "B" else f"{nbytes} {unit}"
nbytes /= 1024
return f"{nbytes:.1f} TB"
def run_backup(args) -> None:
"""Create a zip backup of the Hermes home directory."""
hermes_root = get_default_hermes_root()
if not hermes_root.is_dir():
print(f"Error: Hermes home directory not found at {hermes_root}")
sys.exit(1)
# Determine output path
if args.output:
out_path = Path(args.output).expanduser().resolve()
# If user gave a directory, put the zip inside it
if out_path.is_dir():
stamp = datetime.now().strftime("%Y-%m-%d-%H%M%S")
out_path = out_path / f"hermes-backup-{stamp}.zip"
else:
stamp = datetime.now().strftime("%Y-%m-%d-%H%M%S")
out_path = Path.home() / f"hermes-backup-{stamp}.zip"
# Ensure the suffix is .zip
if out_path.suffix.lower() != ".zip":
out_path = out_path.with_suffix(out_path.suffix + ".zip")
# Ensure parent directory exists
out_path.parent.mkdir(parents=True, exist_ok=True)
# Collect files
print(f"Scanning {display_hermes_home()} ...")
files_to_add: list[tuple[Path, Path]] = [] # (absolute, relative)
skipped_dirs = set()
for dirpath, dirnames, filenames in os.walk(hermes_root, followlinks=False):
dp = Path(dirpath)
rel_dir = dp.relative_to(hermes_root)
# Prune excluded directories in-place so os.walk doesn't descend
orig_dirnames = dirnames[:]
dirnames[:] = [
d for d in dirnames
if d not in _EXCLUDED_DIRS
]
for removed in set(orig_dirnames) - set(dirnames):
skipped_dirs.add(str(rel_dir / removed))
for fname in filenames:
fpath = dp / fname
rel = fpath.relative_to(hermes_root)
if _should_exclude(rel):
continue
# Skip the output zip itself if it happens to be inside hermes root
try:
if fpath.resolve() == out_path.resolve():
continue
except (OSError, ValueError):
pass
files_to_add.append((fpath, rel))
if not files_to_add:
print("No files to back up.")
return
# Create the zip
file_count = len(files_to_add)
print(f"Backing up {file_count} files ...")
total_bytes = 0
errors = []
t0 = time.monotonic()
with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED, compresslevel=6) as zf:
for i, (abs_path, rel_path) in enumerate(files_to_add, 1):
try:
# Safe copy for SQLite databases (handles WAL mode)
if abs_path.suffix == ".db":
with tempfile.NamedTemporaryFile(suffix=".db", delete=False) as tmp:
tmp_db = Path(tmp.name)
if _safe_copy_db(abs_path, tmp_db):
zf.write(tmp_db, arcname=str(rel_path))
total_bytes += tmp_db.stat().st_size
tmp_db.unlink(missing_ok=True)
else:
tmp_db.unlink(missing_ok=True)
errors.append(f" {rel_path}: SQLite safe copy failed")
continue
else:
zf.write(abs_path, arcname=str(rel_path))
total_bytes += abs_path.stat().st_size
except (PermissionError, OSError) as exc:
errors.append(f" {rel_path}: {exc}")
continue
# Progress every 500 files
if i % 500 == 0:
print(f" {i}/{file_count} files ...")
elapsed = time.monotonic() - t0
zip_size = out_path.stat().st_size
# Summary
print()
print(f"Backup complete: {out_path}")
print(f" Files: {file_count}")
print(f" Original: {_format_size(total_bytes)}")
print(f" Compressed: {_format_size(zip_size)}")
print(f" Time: {elapsed:.1f}s")
if skipped_dirs:
print(f"\n Excluded directories:")
for d in sorted(skipped_dirs):
print(f" {d}/")
if errors:
print(f"\n Warnings ({len(errors)} files skipped):")
for e in errors[:10]:
print(e)
if len(errors) > 10:
print(f" ... and {len(errors) - 10} more")
print(f"\nRestore with: hermes import {out_path.name}")
# ---------------------------------------------------------------------------
# Import
# ---------------------------------------------------------------------------
def _validate_backup_zip(zf: zipfile.ZipFile) -> tuple[bool, str]:
"""Check that a zip looks like a Hermes backup.
Returns (ok, reason).
"""
names = zf.namelist()
if not names:
return False, "zip archive is empty"
# Look for telltale files that a hermes home would have
markers = {"config.yaml", ".env", "state.db"}
found = set()
for n in names:
# Could be at the root or one level deep (if someone zipped the directory)
basename = Path(n).name
if basename in markers:
found.add(basename)
if not found:
return False, (
"zip does not appear to be a Hermes backup "
"(no config.yaml, .env, or state databases found)"
)
return True, ""
def _detect_prefix(zf: zipfile.ZipFile) -> str:
"""Detect if the zip has a common directory prefix wrapping all entries.
Some tools zip as `.hermes/config.yaml` instead of `config.yaml`.
Returns the prefix to strip (empty string if none).
"""
names = [n for n in zf.namelist() if not n.endswith("/")]
if not names:
return ""
# Find common prefix
parts_list = [Path(n).parts for n in names]
# Check if all entries share a common first directory
first_parts = {p[0] for p in parts_list if len(p) > 1}
if len(first_parts) == 1:
prefix = first_parts.pop()
# Only strip if it looks like a hermes dir name
if prefix in (".hermes", "hermes"):
return prefix + "/"
return ""
def run_import(args) -> None:
"""Restore a Hermes backup from a zip file."""
zip_path = Path(args.zipfile).expanduser().resolve()
if not zip_path.is_file():
print(f"Error: File not found: {zip_path}")
sys.exit(1)
if not zipfile.is_zipfile(zip_path):
print(f"Error: Not a valid zip file: {zip_path}")
sys.exit(1)
hermes_root = get_default_hermes_root()
with zipfile.ZipFile(zip_path, "r") as zf:
# Validate
ok, reason = _validate_backup_zip(zf)
if not ok:
print(f"Error: {reason}")
sys.exit(1)
prefix = _detect_prefix(zf)
members = [n for n in zf.namelist() if not n.endswith("/")]
file_count = len(members)
print(f"Backup contains {file_count} files")
print(f"Target: {display_hermes_home()}")
if prefix:
print(f"Detected archive prefix: {prefix!r} (will be stripped)")
# Check for existing installation
has_config = (hermes_root / "config.yaml").exists()
has_env = (hermes_root / ".env").exists()
if (has_config or has_env) and not args.force:
print()
print("Warning: Target directory already has Hermes configuration.")
print("Importing will overwrite existing files with backup contents.")
print()
try:
answer = input("Continue? [y/N] ").strip().lower()
except (EOFError, KeyboardInterrupt):
print("\nAborted.")
sys.exit(1)
if answer not in ("y", "yes"):
print("Aborted.")
return
# Extract
print(f"\nImporting {file_count} files ...")
hermes_root.mkdir(parents=True, exist_ok=True)
errors = []
restored = 0
t0 = time.monotonic()
for member in members:
# Strip prefix if detected
if prefix and member.startswith(prefix):
rel = member[len(prefix):]
else:
rel = member
if not rel:
continue
target = hermes_root / rel
# Security: reject absolute paths and traversals
try:
target.resolve().relative_to(hermes_root.resolve())
except ValueError:
errors.append(f" {rel}: path traversal blocked")
continue
try:
target.parent.mkdir(parents=True, exist_ok=True)
with zf.open(member) as src, open(target, "wb") as dst:
dst.write(src.read())
restored += 1
except (PermissionError, OSError) as exc:
errors.append(f" {rel}: {exc}")
if restored % 500 == 0:
print(f" {restored}/{file_count} files ...")
elapsed = time.monotonic() - t0
# Summary
print()
print(f"Import complete: {restored} files restored in {elapsed:.1f}s")
print(f" Target: {display_hermes_home()}")
if errors:
print(f"\n Warnings ({len(errors)} files skipped):")
for e in errors[:10]:
print(e)
if len(errors) > 10:
print(f" ... and {len(errors) - 10} more")
# Post-import: restore profile wrapper scripts
profiles_dir = hermes_root / "profiles"
restored_profiles = []
if profiles_dir.is_dir():
try:
from hermes_cli.profiles import (
create_wrapper_script, check_alias_collision,
_is_wrapper_dir_in_path, _get_wrapper_dir,
)
for entry in sorted(profiles_dir.iterdir()):
if not entry.is_dir():
continue
profile_name = entry.name
# Only create wrappers for directories with config
if not (entry / "config.yaml").exists() and not (entry / ".env").exists():
continue
collision = check_alias_collision(profile_name)
if collision:
print(f" Skipped alias '{profile_name}': {collision}")
restored_profiles.append((profile_name, False))
else:
wrapper = create_wrapper_script(profile_name)
restored_profiles.append((profile_name, wrapper is not None))
if restored_profiles:
created = [n for n, ok in restored_profiles if ok]
skipped = [n for n, ok in restored_profiles if not ok]
if created:
print(f"\n Profile aliases restored: {', '.join(created)}")
if skipped:
print(f" Profile aliases skipped: {', '.join(skipped)}")
if not _is_wrapper_dir_in_path():
print(f"\n Note: {_get_wrapper_dir()} is not in your PATH.")
print(' Add to your shell config (~/.bashrc or ~/.zshrc):')
print(' export PATH="$HOME/.local/bin:$PATH"')
except ImportError:
# hermes_cli.profiles might not be available (fresh install)
if any(profiles_dir.iterdir()):
print(f"\n Profiles detected but aliases could not be created.")
print(f" Run: hermes profile list (after installing hermes)")
# Guidance
print()
if not (hermes_root / "hermes-agent").is_dir():
print("Note: The hermes-agent codebase was not included in the backup.")
print(" If this is a fresh install, run: hermes update")
if restored_profiles:
gw_profiles = [n for n, _ in restored_profiles]
print("\nTo re-enable gateway services for profiles:")
for pname in gw_profiles:
print(f" hermes -p {pname} gateway install")
print("Done. Your Hermes configuration has been restored.")
# ---------------------------------------------------------------------------
# Quick state snapshots (used by /snapshot slash command and hermes backup --quick)
# ---------------------------------------------------------------------------
# Critical state files to include in quick snapshots (relative to HERMES_HOME).
# Everything else is either regeneratable (logs, cache) or managed separately
# (skills, repo, sessions/).
_QUICK_STATE_FILES = (
"state.db",
"config.yaml",
".env",
"auth.json",
"cron/jobs.json",
"gateway_state.json",
"channel_directory.json",
"processes.json",
)
_QUICK_SNAPSHOTS_DIR = "state-snapshots"
_QUICK_DEFAULT_KEEP = 20
def _quick_snapshot_root(hermes_home: Optional[Path] = None) -> Path:
home = hermes_home or get_hermes_home()
return home / _QUICK_SNAPSHOTS_DIR
def create_quick_snapshot(
label: Optional[str] = None,
hermes_home: Optional[Path] = None,
) -> Optional[str]:
"""Create a quick state snapshot of critical files.
Copies STATE_FILES to a timestamped directory under state-snapshots/.
Auto-prunes old snapshots beyond the keep limit.
Returns:
Snapshot ID (timestamp-based), or None if no files found.
"""
home = hermes_home or get_hermes_home()
root = _quick_snapshot_root(home)
ts = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
snap_id = f"{ts}-{label}" if label else ts
snap_dir = root / snap_id
snap_dir.mkdir(parents=True, exist_ok=True)
manifest: Dict[str, int] = {} # rel_path -> file size
for rel in _QUICK_STATE_FILES:
src = home / rel
if not src.exists() or not src.is_file():
continue
dst = snap_dir / rel
dst.parent.mkdir(parents=True, exist_ok=True)
try:
if src.suffix == ".db":
if not _safe_copy_db(src, dst):
continue
else:
shutil.copy2(src, dst)
manifest[rel] = dst.stat().st_size
except (OSError, PermissionError) as exc:
logger.warning("Could not snapshot %s: %s", rel, exc)
if not manifest:
shutil.rmtree(snap_dir, ignore_errors=True)
return None
# Write manifest
meta = {
"id": snap_id,
"timestamp": ts,
"label": label,
"file_count": len(manifest),
"total_size": sum(manifest.values()),
"files": manifest,
}
with open(snap_dir / "manifest.json", "w") as f:
json.dump(meta, f, indent=2)
# Auto-prune
_prune_quick_snapshots(root, keep=_QUICK_DEFAULT_KEEP)
logger.info("State snapshot created: %s (%d files)", snap_id, len(manifest))
return snap_id
def list_quick_snapshots(
limit: int = 20,
hermes_home: Optional[Path] = None,
) -> List[Dict[str, Any]]:
"""List existing quick state snapshots, most recent first."""
root = _quick_snapshot_root(hermes_home)
if not root.exists():
return []
results = []
for d in sorted(root.iterdir(), reverse=True):
if not d.is_dir():
continue
manifest_path = d / "manifest.json"
if manifest_path.exists():
try:
with open(manifest_path) as f:
results.append(json.load(f))
except (json.JSONDecodeError, OSError):
results.append({"id": d.name, "file_count": 0, "total_size": 0})
if len(results) >= limit:
break
return results
def restore_quick_snapshot(
snapshot_id: str,
hermes_home: Optional[Path] = None,
) -> bool:
"""Restore state from a quick snapshot.
Overwrites current state files with the snapshot's copies.
Returns True if at least one file was restored.
"""
home = hermes_home or get_hermes_home()
root = _quick_snapshot_root(home)
snap_dir = root / snapshot_id
if not snap_dir.is_dir():
return False
manifest_path = snap_dir / "manifest.json"
if not manifest_path.exists():
return False
with open(manifest_path) as f:
meta = json.load(f)
restored = 0
for rel in meta.get("files", {}):
src = snap_dir / rel
if not src.exists():
continue
dst = home / rel
dst.parent.mkdir(parents=True, exist_ok=True)
try:
if dst.suffix == ".db":
# Atomic-ish replace for databases
tmp = dst.parent / f".{dst.name}.snap_restore"
shutil.copy2(src, tmp)
dst.unlink(missing_ok=True)
shutil.move(str(tmp), str(dst))
else:
shutil.copy2(src, dst)
restored += 1
except (OSError, PermissionError) as exc:
logger.error("Failed to restore %s: %s", rel, exc)
logger.info("Restored %d files from snapshot %s", restored, snapshot_id)
return restored > 0
def _prune_quick_snapshots(root: Path, keep: int = _QUICK_DEFAULT_KEEP) -> int:
"""Remove oldest quick snapshots beyond the keep limit. Returns count deleted."""
if not root.exists():
return 0
dirs = sorted(
(d for d in root.iterdir() if d.is_dir()),
key=lambda d: d.name,
reverse=True,
)
deleted = 0
for d in dirs[keep:]:
try:
shutil.rmtree(d)
deleted += 1
except OSError as exc:
logger.warning("Failed to prune snapshot %s: %s", d.name, exc)
return deleted
def prune_quick_snapshots(
keep: int = _QUICK_DEFAULT_KEEP,
hermes_home: Optional[Path] = None,
) -> int:
"""Manually prune quick snapshots. Returns count deleted."""
return _prune_quick_snapshots(_quick_snapshot_root(hermes_home), keep=keep)
def run_quick_backup(args) -> None:
"""CLI entry point for hermes backup --quick."""
label = getattr(args, "label", None)
snap_id = create_quick_snapshot(label=label)
if snap_id:
print(f"State snapshot created: {snap_id}")
snaps = list_quick_snapshots()
print(f" {len(snaps)} snapshot(s) stored in {display_hermes_home()}/state-snapshots/")
print(f" Restore with: /snapshot restore {snap_id}")
else:
print("No state files found to snapshot.")
+67 -235
View File
@@ -1,9 +1,8 @@
"""hermes claw — OpenClaw migration commands.
Usage:
hermes claw migrate # Preview then migrate (always shows preview first)
hermes claw migrate --dry-run # Preview only, no changes
hermes claw migrate --yes # Skip confirmation prompt
hermes claw migrate # Interactive migration from ~/.openclaw
hermes claw migrate --dry-run # Preview what would be migrated
hermes claw migrate --preset full --overwrite # Full migration, overwrite conflicts
hermes claw cleanup # Archive leftover OpenClaw directories
hermes claw cleanup --dry-run # Preview what would be archived
@@ -11,7 +10,6 @@ Usage:
import importlib.util
import logging
import subprocess
import sys
from datetime import datetime
from pathlib import Path
@@ -51,138 +49,10 @@ _OPENCLAW_SCRIPT_INSTALLED = (
)
# Known OpenClaw directory names (current + legacy)
_OPENCLAW_DIR_NAMES = (".openclaw", ".clawdbot", ".moltbot")
_OPENCLAW_DIR_NAMES = (".openclaw", ".clawdbot", ".moldbot")
def _detect_openclaw_processes() -> list[str]:
"""Detect running OpenClaw processes and services.
Returns a list of human-readable descriptions of what was found.
An empty list means nothing was detected.
"""
found: list[str] = []
# -- systemd service (Linux) ------------------------------------------
if sys.platform != "win32":
try:
result = subprocess.run(
["systemctl", "--user", "is-active", "openclaw-gateway.service"],
capture_output=True, text=True, timeout=5,
)
if result.stdout.strip() == "active":
found.append("systemd service: openclaw-gateway.service")
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
# -- process scan ------------------------------------------------------
if sys.platform == "win32":
try:
for exe in ("openclaw.exe", "clawd.exe"):
result = subprocess.run(
["tasklist", "/FI", f"IMAGENAME eq {exe}"],
capture_output=True, text=True, timeout=5,
)
if exe in result.stdout.lower():
found.append(f"process: {exe}")
# Node.js-hosted OpenClaw — tasklist doesn't show command lines,
# so fall back to PowerShell.
ps_cmd = (
'Get-CimInstance Win32_Process -Filter "Name = \'node.exe\'" | '
'Where-Object { $_.CommandLine -match "openclaw|clawd" } | '
'Select-Object -First 1 ProcessId'
)
result = subprocess.run(
["powershell", "-NoProfile", "-Command", ps_cmd],
capture_output=True, text=True, timeout=5,
)
if result.stdout.strip():
found.append(f"node.exe process with openclaw in command line (PID {result.stdout.strip()})")
except Exception:
pass
else:
try:
result = subprocess.run(
["pgrep", "-f", "openclaw"],
capture_output=True, text=True, timeout=3,
)
if result.returncode == 0:
pids = result.stdout.strip().split()
found.append(f"openclaw process(es) (PIDs: {', '.join(pids)})")
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
return found
def _warn_if_openclaw_running(auto_yes: bool) -> None:
"""Warn if OpenClaw is still running before migration.
Telegram, Discord, and Slack only allow one active connection per bot
token. Migrating while OpenClaw is running causes both to fight for the
same token.
"""
running = _detect_openclaw_processes()
if not running:
return
print()
print_error("OpenClaw appears to be running:")
for detail in running:
print_info(f" * {detail}")
print_info(
"Messaging platforms (Telegram, Discord, Slack) only allow one "
"active session per bot token. If you continue, both OpenClaw and "
"Hermes may try to use the same token, causing disconnects."
)
print_info("Recommendation: stop OpenClaw before migrating.")
print()
if auto_yes:
return
if not sys.stdin.isatty():
print_info("Non-interactive session — continuing to preview only.")
return
if not prompt_yes_no("Continue anyway?", default=False):
print_info("Migration cancelled. Stop OpenClaw and try again.")
sys.exit(0)
def _warn_if_gateway_running(auto_yes: bool) -> None:
"""Check if a Hermes gateway is running with connected platforms.
Migrating bot tokens while the gateway is polling will cause conflicts
(e.g. Telegram 409 "terminated by other getUpdates request"). Warn the
user and let them decide whether to continue.
"""
from gateway.status import get_running_pid, read_runtime_status
if not get_running_pid():
return
data = read_runtime_status() or {}
platforms = data.get("platforms") or {}
connected = [name for name, info in platforms.items()
if isinstance(info, dict) and info.get("state") == "connected"]
if not connected:
return
print()
print_error(
"Hermes gateway is running with active connections: "
+ ", ".join(connected)
)
print_info(
"Migrating bot tokens while the gateway is active will cause "
"conflicts (Telegram, Discord, and Slack only allow one active "
"session per token)."
)
print_info("Recommendation: stop the gateway first with 'hermes stop'.")
print()
if not auto_yes and not prompt_yes_no("Continue anyway?", default=False):
print_info("Migration cancelled. Stop the gateway and try again.")
sys.exit(0)
# State files commonly found in OpenClaw workspace directories — listed
# during cleanup to help the user decide whether to archive
# State files commonly found in OpenClaw workspace directories that cause
# confusion after migration (the agent discovers them and writes to them)
_WORKSPACE_STATE_GLOBS = (
"*/todo.json",
"*/sessions/*",
@@ -227,7 +97,7 @@ def _find_openclaw_dirs() -> list[Path]:
def _scan_workspace_state(source_dir: Path) -> list[tuple[Path, str]]:
"""Scan an OpenClaw directory for workspace state files.
"""Scan an OpenClaw directory for workspace state files that cause confusion.
Returns a list of (path, description) tuples.
"""
@@ -310,7 +180,7 @@ def _cmd_migrate(args):
source_dir = Path.home() / ".openclaw"
if not source_dir.is_dir():
# Try legacy directory names
for legacy in (".clawdbot", ".moltbot"):
for legacy in (".clawdbot", ".moldbot"):
candidate = Path.home() / legacy
if candidate.is_dir():
source_dir = candidate
@@ -367,12 +237,12 @@ def _cmd_migrate(args):
# Show what we're doing
hermes_home = get_hermes_home()
auto_yes = getattr(args, "yes", False)
print()
print_header("Migration Settings")
print_info(f"Source: {source_dir}")
print_info(f"Target: {hermes_home}")
print_info(f"Preset: {preset}")
print_info(f"Mode: {'dry run (preview only)' if dry_run else 'execute'}")
print_info(f"Overwrite: {'yes' if overwrite else 'no (skip conflicts)'}")
print_info(f"Secrets: {'yes (allowlisted only)' if migrate_secrets else 'no'}")
if skill_conflict != "skip":
@@ -381,88 +251,31 @@ def _cmd_migrate(args):
print_info(f"Workspace: {workspace_target}")
print()
# Check if OpenClaw is still running — migrating tokens while both are
# active will cause conflicts (e.g. Telegram 409).
_warn_if_openclaw_running(auto_yes)
# Check if a Hermes gateway is running with connected platforms.
_warn_if_gateway_running(auto_yes)
# For execute mode (non-dry-run), confirm unless --yes was passed
if not dry_run and not getattr(args, "yes", False):
if not prompt_yes_no("Proceed with migration?", default=True):
print_info("Migration cancelled.")
return
# Ensure config.yaml exists before migration tries to read it
config_path = get_config_path()
if not config_path.exists():
save_config(load_config())
# Load the migration module
# Load and run the migration
try:
mod = _load_migration_module(script_path)
if mod is None:
print_error("Could not load migration script.")
return
except Exception as e:
print()
print_error(f"Could not load migration script: {e}")
logger.debug("OpenClaw migration error", exc_info=True)
return
selected = mod.resolve_selected_options(None, None, preset=preset)
ws_target = Path(workspace_target).resolve() if workspace_target else None
selected = mod.resolve_selected_options(None, None, preset=preset)
ws_target = Path(workspace_target).resolve() if workspace_target else None
# ── Phase 1: Always preview first ──────────────────────────
try:
preview = mod.Migrator(
source_root=source_dir.resolve(),
target_root=hermes_home.resolve(),
execute=False,
workspace_target=ws_target,
overwrite=overwrite,
migrate_secrets=migrate_secrets,
output_dir=None,
selected_options=selected,
preset_name=preset,
skill_conflict_mode=skill_conflict,
)
preview_report = preview.migrate()
except Exception as e:
print()
print_error(f"Migration preview failed: {e}")
logger.debug("OpenClaw migration preview error", exc_info=True)
return
preview_summary = preview_report.get("summary", {})
preview_count = preview_summary.get("migrated", 0)
if preview_count == 0:
print()
print_info("Nothing to migrate from OpenClaw.")
_print_migration_report(preview_report, dry_run=True)
return
print()
print_header(f"Migration Preview — {preview_count} item(s) would be imported")
print_info("No changes have been made yet. Review the list below:")
_print_migration_report(preview_report, dry_run=True)
# If --dry-run, stop here
if dry_run:
return
# ── Phase 2: Confirm and execute ───────────────────────────
print()
if not auto_yes:
if not sys.stdin.isatty():
print_info("Non-interactive session — preview only.")
print_info("To execute, re-run with: hermes claw migrate --yes")
return
if not prompt_yes_no("Proceed with migration?", default=True):
print_info("Migration cancelled.")
return
try:
migrator = mod.Migrator(
source_root=source_dir.resolve(),
target_root=hermes_home.resolve(),
execute=True,
execute=not dry_run,
workspace_target=ws_target,
overwrite=overwrite,
migrate_secrets=migrate_secrets,
@@ -479,18 +292,62 @@ def _cmd_migrate(args):
return
# Print results
_print_migration_report(report, dry_run=False)
_print_migration_report(report, dry_run)
# Source directory is left untouched — archiving is not the migration
# tool's responsibility. Users who want to clean up can run
# 'hermes claw cleanup' separately.
# After successful non-dry-run migration, offer to archive the source directory
if not dry_run and report.get("summary", {}).get("migrated", 0) > 0:
_offer_source_archival(source_dir, getattr(args, "yes", False))
def _offer_source_archival(source_dir: Path, auto_yes: bool = False):
"""After migration, offer to rename the source directory to prevent state fragmentation.
OpenClaw workspace directories contain state files (todo.json, sessions, etc.)
that the agent may discover and write to, causing confusion. Renaming the
directory prevents this.
"""
if not source_dir.is_dir():
return
# Scan for state files that could cause problems
state_files = _scan_workspace_state(source_dir)
print()
print_header("Post-Migration Cleanup")
print_info("The OpenClaw directory still exists and contains workspace state files")
print_info("that can confuse the agent (todo lists, sessions, logs).")
if state_files:
print()
print(color(" Found state files:", Colors.YELLOW))
# Show up to 10 most relevant findings
for path, desc in state_files[:10]:
print(f" {desc}")
if len(state_files) > 10:
print(f" ... and {len(state_files) - 10} more")
print()
print_info(f"Recommend: rename {source_dir.name}/ to {source_dir.name}.pre-migration/")
print_info("This prevents the agent from discovering old workspace directories.")
print_info("You can always rename it back if needed.")
print()
if auto_yes or prompt_yes_no(f"Archive {source_dir} now?", default=True):
try:
archive_path = _archive_directory(source_dir)
print_success(f"Archived: {source_dir}{archive_path}")
print_info("The original directory has been renamed, not deleted.")
print_info(f"To undo: mv {archive_path} {source_dir}")
except OSError as e:
print_error(f"Could not archive: {e}")
print_info(f"You can do it manually: mv {source_dir} {source_dir}.pre-migration")
else:
print_info("Skipped. You can archive later with: hermes claw cleanup")
def _cmd_cleanup(args):
"""Archive leftover OpenClaw directories after migration.
Scans for OpenClaw directories that still exist after migration and offers
to rename them to .pre-migration to free disk space.
to rename them to .pre-migration to prevent state fragmentation.
"""
dry_run = getattr(args, "dry_run", False)
auto_yes = getattr(args, "yes", False)
@@ -527,28 +384,6 @@ def _cmd_cleanup(args):
print_success("No OpenClaw directories found. Nothing to clean up.")
return
# Warn if OpenClaw is still running — archiving while the service is
# active causes it to recreate an empty skeleton directory (#8502).
running = _detect_openclaw_processes()
if running:
print()
print_error("OpenClaw appears to be still running:")
for detail in running:
print_info(f" * {detail}")
print_info(
"Archiving .openclaw/ while the service is active may cause it to "
"immediately recreate an empty skeleton directory, destroying your config."
)
print_info("Stop OpenClaw first: systemctl --user stop openclaw-gateway.service")
print()
if not auto_yes:
if not sys.stdin.isatty():
print_info("Non-interactive session — aborting. Stop OpenClaw and re-run.")
return
if not prompt_yes_no("Proceed anyway?", default=False):
print_info("Aborted. Stop OpenClaw first, then re-run: hermes claw cleanup")
return
total_archived = 0
for source_dir in dirs_to_check:
@@ -587,7 +422,7 @@ def _cmd_cleanup(args):
if state_files:
print()
print(color(f" {len(state_files)} state file(s) found:", Colors.YELLOW))
print(color(f" {len(state_files)} state file(s) that could cause confusion:", Colors.YELLOW))
for path, desc in state_files[:8]:
print(f" {desc}")
if len(state_files) > 8:
@@ -598,9 +433,6 @@ def _cmd_cleanup(args):
if dry_run:
archive_path = _archive_directory(source_dir, dry_run=True)
print_info(f"Would archive: {source_dir}{archive_path}")
elif not auto_yes and not sys.stdin.isatty():
print_info(f"Non-interactive session — would archive: {source_dir}")
print_info("To execute, re-run with: hermes claw cleanup --yes")
else:
if auto_yes or prompt_yes_no(f"Archive {source_dir}?", default=True):
try:
-79
View File
@@ -1,79 +0,0 @@
"""Shared CLI output helpers for Hermes CLI modules.
Extracts the identical ``print_info/success/warning/error`` and ``prompt()``
functions previously duplicated across setup.py, tools_config.py,
mcp_config.py, and memory_setup.py.
"""
import getpass
import sys
from hermes_cli.colors import Colors, color
# ─── Print Helpers ────────────────────────────────────────────────────────────
def print_info(text: str) -> None:
"""Print a dim informational message."""
print(color(f" {text}", Colors.DIM))
def print_success(text: str) -> None:
"""Print a green success message with ✓ prefix."""
print(color(f"{text}", Colors.GREEN))
def print_warning(text: str) -> None:
"""Print a yellow warning message with ⚠ prefix."""
print(color(f"{text}", Colors.YELLOW))
def print_error(text: str) -> None:
"""Print a red error message with ✗ prefix."""
print(color(f"{text}", Colors.RED))
def print_header(text: str) -> None:
"""Print a bold yellow header."""
print(color(f"\n {text}", Colors.YELLOW))
# ─── Input Prompts ────────────────────────────────────────────────────────────
def prompt(
question: str,
default: str | None = None,
password: bool = False,
) -> str:
"""Prompt the user for input with optional default and password masking.
Replaces the four independent ``_prompt()`` / ``prompt()`` implementations
in setup.py, tools_config.py, mcp_config.py, and memory_setup.py.
Returns the user's input (stripped), or *default* if the user presses Enter.
Returns empty string on Ctrl-C or EOF.
"""
suffix = f" [{default}]" if default else ""
display = color(f" {question}{suffix}: ", Colors.YELLOW)
try:
if password:
value = getpass.getpass(display)
else:
value = input(display)
value = value.strip()
return value if value else (default or "")
except (KeyboardInterrupt, EOFError):
print()
return ""
def prompt_yes_no(question: str, default: bool = True) -> bool:
"""Prompt for a yes/no answer. Returns bool."""
hint = "Y/n" if default else "y/N"
answer = prompt(f"{question} ({hint})")
if not answer:
return default
return answer.lower().startswith("y")
+16 -2
View File
@@ -19,10 +19,11 @@ import subprocess
import sys
from pathlib import Path
from hermes_constants import is_wsl as _is_wsl
logger = logging.getLogger(__name__)
# Cache WSL detection (checked once per process)
_wsl_detected: bool | None = None
def save_clipboard_image(dest: Path) -> bool:
"""Extract an image from the system clipboard and save it as PNG.
@@ -216,6 +217,19 @@ def _windows_save(dest: Path) -> bool:
# ── Linux ────────────────────────────────────────────────────────────────
def _is_wsl() -> bool:
"""Detect if running inside WSL (1 or 2)."""
global _wsl_detected
if _wsl_detected is not None:
return _wsl_detected
try:
with open("/proc/version", "r") as f:
_wsl_detected = "microsoft" in f.read().lower()
except Exception:
_wsl_detected = False
return _wsl_detected
def _linux_save(dest: Path) -> bool:
"""Try clipboard backends in priority order: WSL → Wayland → X11."""
if _is_wsl():
+6 -43
View File
@@ -16,18 +16,8 @@ from collections.abc import Callable, Mapping
from dataclasses import dataclass
from typing import Any
# prompt_toolkit is an optional CLI dependency — only needed for
# SlashCommandCompleter and SlashCommandAutoSuggest. Gateway and test
# environments that lack it must still be able to import this module
# for resolve_command, gateway_help_lines, and COMMAND_REGISTRY.
try:
from prompt_toolkit.auto_suggest import AutoSuggest, Suggestion
from prompt_toolkit.completion import Completer, Completion
except ImportError: # pragma: no cover
AutoSuggest = object # type: ignore[assignment,misc]
Completer = object # type: ignore[assignment,misc]
Suggestion = None # type: ignore[assignment]
Completion = None # type: ignore[assignment]
from prompt_toolkit.auto_suggest import AutoSuggest, Suggestion
from prompt_toolkit.completion import Completer, Completion
# ---------------------------------------------------------------------------
@@ -69,12 +59,9 @@ COMMAND_REGISTRY: list[CommandDef] = [
args_hint="[name]"),
CommandDef("branch", "Branch the current session (explore a different path)", "Session",
aliases=("fork",), args_hint="[name]"),
CommandDef("compress", "Manually compress conversation context", "Session",
args_hint="[focus topic]"),
CommandDef("compress", "Manually compress conversation context", "Session"),
CommandDef("rollback", "List or restore filesystem checkpoints", "Session",
args_hint="[number]"),
CommandDef("snapshot", "Create or restore state snapshots of Hermes config/state", "Session",
aliases=("snap",), args_hint="[create|restore <id>|prune]"),
CommandDef("stop", "Kill all running background processes", "Session"),
CommandDef("approve", "Approve a pending dangerous command", "Session",
gateway_only=True, args_hint="[session|always]"),
@@ -86,7 +73,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
args_hint="<question>"),
CommandDef("queue", "Queue a prompt for the next turn (doesn't interrupt)", "Session",
aliases=("q",), args_hint="<prompt>"),
CommandDef("status", "Show session info", "Session"),
CommandDef("status", "Show session info", "Session",
gateway_only=True),
CommandDef("profile", "Show active profile name and home directory", "Info"),
CommandDef("sethome", "Set this chat as the home channel", "Session",
gateway_only=True, aliases=("set-home",)),
@@ -112,9 +100,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("reasoning", "Manage reasoning effort and display", "Configuration",
args_hint="[level|show|hide]",
subcommands=("none", "minimal", "low", "medium", "high", "xhigh", "show", "hide", "on", "off")),
CommandDef("fast", "Toggle fast mode — OpenAI Priority Processing / Anthropic Fast Mode (Normal/Fast)", "Configuration",
args_hint="[normal|fast|status]",
subcommands=("normal", "fast", "status", "on", "off")),
CommandDef("skin", "Show or change the display skin/theme", "Configuration",
cli_only=True, args_hint="[name]"),
CommandDef("voice", "Toggle voice mode", "Configuration",
@@ -131,7 +116,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("cron", "Manage scheduled tasks", "Tools & Skills",
cli_only=True, args_hint="[subcommand]",
subcommands=("list", "add", "create", "edit", "pause", "resume", "run", "remove")),
CommandDef("reload", "Reload .env variables into the running session", "Tools & Skills"),
CommandDef("reload-mcp", "Reload MCP servers from config", "Tools & Skills",
aliases=("reload_mcp",)),
CommandDef("browser", "Connect browser tools to your live Chrome via CDP", "Tools & Skills",
@@ -144,8 +128,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("commands", "Browse all commands and skills (paginated)", "Info",
gateway_only=True, args_hint="[page]"),
CommandDef("help", "Show available commands", "Info"),
CommandDef("restart", "Gracefully restart the gateway after draining active runs", "Session",
gateway_only=True),
CommandDef("usage", "Show token usage and rate limits for the current session", "Info"),
CommandDef("insights", "Show usage insights and analytics", "Info",
args_hint="[days]"),
@@ -153,11 +135,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
cli_only=True, aliases=("gateway",)),
CommandDef("paste", "Check clipboard for an image and attach it", "Info",
cli_only=True),
CommandDef("image", "Attach a local image file for your next prompt", "Info",
cli_only=True, args_hint="<path>"),
CommandDef("update", "Update Hermes Agent to the latest version", "Info",
gateway_only=True),
CommandDef("debug", "Upload debug report (system info + logs) and get shareable links", "Info"),
# Exit
CommandDef("quit", "Exit the CLI", "Exit",
@@ -652,18 +631,8 @@ class SlashCommandCompleter(Completer):
def __init__(
self,
skill_commands_provider: Callable[[], Mapping[str, dict[str, Any]]] | None = None,
command_filter: Callable[[str], bool] | None = None,
) -> None:
self._skill_commands_provider = skill_commands_provider
self._command_filter = command_filter
def _command_allowed(self, slash_command: str) -> bool:
if self._command_filter is None:
return True
try:
return bool(self._command_filter(slash_command))
except Exception:
return True
def _iter_skill_commands(self) -> Mapping[str, dict[str, Any]]:
if self._skill_commands_provider is None:
@@ -941,7 +910,7 @@ class SlashCommandCompleter(Completer):
return
# Static subcommand completions
if " " not in sub_text and base_cmd in SUBCOMMANDS and self._command_allowed(base_cmd):
if " " not in sub_text and base_cmd in SUBCOMMANDS:
for sub in SUBCOMMANDS[base_cmd]:
if sub.startswith(sub_lower) and sub != sub_lower:
yield Completion(
@@ -954,8 +923,6 @@ class SlashCommandCompleter(Completer):
word = text[1:]
for cmd, desc in COMMANDS.items():
if not self._command_allowed(cmd):
continue
cmd_name = cmd[1:]
if cmd_name.startswith(word):
yield Completion(
@@ -1014,8 +981,6 @@ class SlashCommandAutoSuggest(AutoSuggest):
# Still typing the command name: /upd → suggest "ate"
word = text[1:].lower()
for cmd in COMMANDS:
if self._completer is not None and not self._completer._command_allowed(cmd):
continue
cmd_name = cmd[1:] # strip leading /
if cmd_name.startswith(word) and cmd_name != word:
return Suggestion(cmd_name[len(word):])
@@ -1026,8 +991,6 @@ class SlashCommandAutoSuggest(AutoSuggest):
sub_lower = sub_text.lower()
# Static subcommands
if self._completer is not None and not self._completer._command_allowed(base_cmd):
return None
if base_cmd in SUBCOMMANDS and SUBCOMMANDS[base_cmd]:
if " " not in sub_text:
for sub in SUBCOMMANDS[base_cmd]:
+22 -288
View File
@@ -32,25 +32,19 @@ _ENV_VAR_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_EXTRA_ENV_KEYS = frozenset({
"OPENAI_API_KEY", "OPENAI_BASE_URL",
"ANTHROPIC_API_KEY", "ANTHROPIC_TOKEN",
"AUXILIARY_VISION_MODEL",
"DISCORD_HOME_CHANNEL", "TELEGRAM_HOME_CHANNEL",
"SIGNAL_ACCOUNT", "SIGNAL_HTTP_URL",
"SIGNAL_ALLOWED_USERS", "SIGNAL_GROUP_ALLOWED_USERS",
"DINGTALK_CLIENT_ID", "DINGTALK_CLIENT_SECRET",
"FEISHU_APP_ID", "FEISHU_APP_SECRET", "FEISHU_ENCRYPT_KEY", "FEISHU_VERIFICATION_TOKEN",
"WECOM_BOT_ID", "WECOM_SECRET",
"WECOM_CALLBACK_CORP_ID", "WECOM_CALLBACK_CORP_SECRET", "WECOM_CALLBACK_AGENT_ID",
"WECOM_CALLBACK_TOKEN", "WECOM_CALLBACK_ENCODING_AES_KEY",
"WECOM_CALLBACK_HOST", "WECOM_CALLBACK_PORT",
"WEIXIN_ACCOUNT_ID", "WEIXIN_TOKEN", "WEIXIN_BASE_URL", "WEIXIN_CDN_BASE_URL",
"WEIXIN_HOME_CHANNEL", "WEIXIN_HOME_CHANNEL_NAME", "WEIXIN_DM_POLICY", "WEIXIN_GROUP_POLICY",
"WEIXIN_ALLOWED_USERS", "WEIXIN_GROUP_ALLOWED_USERS", "WEIXIN_ALLOW_ALL_USERS",
"BLUEBUBBLES_SERVER_URL", "BLUEBUBBLES_PASSWORD",
"TERMINAL_ENV", "TERMINAL_SSH_KEY", "TERMINAL_SSH_PORT",
"WHATSAPP_MODE", "WHATSAPP_ENABLED",
"MATTERMOST_HOME_CHANNEL", "MATTERMOST_REPLY_MODE",
"MATRIX_PASSWORD", "MATRIX_ENCRYPTION", "MATRIX_DEVICE_ID", "MATRIX_HOME_ROOM",
"MATRIX_REQUIRE_MENTION", "MATRIX_FREE_RESPONSE_ROOMS", "MATRIX_AUTO_THREAD",
"MATRIX_RECOVERY_KEY",
})
import yaml
@@ -144,55 +138,6 @@ def managed_error(action: str = "modify configuration"):
print(format_managed_message(action), file=sys.stderr)
# =============================================================================
# Container-aware CLI (NixOS container mode)
# =============================================================================
def get_container_exec_info() -> Optional[dict]:
"""Read container mode metadata from HERMES_HOME/.container-mode.
Returns a dict with keys: backend, container_name, exec_user, hermes_bin
or None if container mode is not active, we're already inside the
container, or HERMES_DEV=1 is set.
The .container-mode file is written by the NixOS activation script when
container.enable = true. It tells the host CLI to exec into the container
instead of running locally.
"""
if os.environ.get("HERMES_DEV") == "1":
return None
from hermes_constants import is_container
if is_container():
return None
container_mode_file = get_hermes_home() / ".container-mode"
try:
info = {}
with open(container_mode_file, "r") as f:
for line in f:
line = line.strip()
if "=" in line and not line.startswith("#"):
key, _, value = line.partition("=")
info[key.strip()] = value.strip()
except FileNotFoundError:
return None
# All other exceptions (PermissionError, malformed data, etc.) propagate
backend = info.get("backend", "docker")
container_name = info.get("container_name", "hermes-agent")
exec_user = info.get("exec_user", "hermes")
hermes_bin = info.get("hermes_bin", "/data/current-package/bin/hermes")
return {
"backend": backend,
"container_name": container_name,
"exec_user": exec_user,
"hermes_bin": hermes_bin,
}
# =============================================================================
# Config paths
# =============================================================================
@@ -213,27 +158,16 @@ def get_project_root() -> Path:
return Path(__file__).parent.parent.resolve()
def _secure_dir(path):
"""Set directory to owner-only access (0700 by default). No-op on Windows.
"""Set directory to owner-only access (0700). No-op on Windows.
Skipped in managed mode the NixOS module sets group-readable
permissions (0750) so interactive users in the hermes group can
share state with the gateway service.
The mode can be overridden via the HERMES_HOME_MODE environment variable
(e.g. HERMES_HOME_MODE=0701) for deployments where a web server (nginx,
caddy, etc.) needs to traverse HERMES_HOME to reach a served subdirectory.
The execute-only bit on a directory permits cd-through without exposing
directory listings.
"""
if is_managed():
return
try:
mode_str = os.environ.get("HERMES_HOME_MODE", "").strip()
mode = int(mode_str, 8) if mode_str else 0o700
except ValueError:
mode = 0o700
try:
os.chmod(path, mode)
os.chmod(path, 0o700)
except (OSError, NotImplementedError):
pass
@@ -321,12 +255,6 @@ DEFAULT_CONFIG = {
# tools or receiving API responses. Only fires when the agent has
# been completely idle for this duration. 0 = unlimited.
"gateway_timeout": 1800,
# Graceful drain timeout for gateway stop/restart (seconds).
# The gateway stops accepting new work, waits for running agents
# to finish, then interrupts any remaining runs after the timeout.
# 0 = no drain, interrupt immediately.
"restart_drain_timeout": 60,
"service_tier": "",
# Tool-use enforcement: injects system prompt guidance that tells the
# model to actually call tools instead of describing intended actions.
# Values: "auto" (default — applies to gpt/codex models), true/false
@@ -337,10 +265,6 @@ DEFAULT_CONFIG = {
# threshold before escalating to a full timeout. The warning fires
# once per run and does not interrupt the agent. 0 = disable warning.
"gateway_timeout_warning": 900,
# Periodic "still working" notification interval (seconds).
# Sends a status message every N seconds so the user knows the
# agent hasn't died during long tasks. 0 = disable notifications.
"gateway_notify_interval": 600,
},
"terminal": {
@@ -437,7 +361,7 @@ DEFAULT_CONFIG = {
"model": "", # e.g. "google/gemini-2.5-flash", "gpt-4o"
"base_url": "", # direct OpenAI-compatible endpoint (takes precedence over provider)
"api_key": "", # API key for base_url (falls back to OPENAI_API_KEY)
"timeout": 120, # seconds — LLM API call timeout; vision payloads need generous timeout
"timeout": 30, # seconds — LLM API call timeout; increase for slow local vision models
"download_timeout": 30, # seconds — image HTTP download timeout; increase for slow connections
},
"web_extract": {
@@ -502,11 +426,9 @@ DEFAULT_CONFIG = {
"inline_diffs": True, # Show inline diff previews for write actions (write_file, patch, skill_manage)
"show_cost": False, # Show $ cost in the status bar (off by default)
"skin": "default",
"interim_assistant_messages": True, # Gateway: show natural mid-turn assistant status messages
"tool_progress_command": False, # Enable /verbose command in messaging gateway
"tool_progress_overrides": {}, # DEPRECATED — use display.platforms instead
"tool_progress_overrides": {}, # Per-platform overrides: {"signal": "off", "telegram": "all"}
"tool_preview_length": 0, # Max chars for tool call previews (0 = no limit, show full paths/commands)
"platforms": {}, # Per-platform display overrides: {"telegram": {"tool_progress": "all"}, "slack": {"tool_progress": "off"}}
},
# Privacy settings
@@ -516,7 +438,7 @@ DEFAULT_CONFIG = {
# Text-to-speech configuration
"tts": {
"provider": "edge", # "edge" (free) | "elevenlabs" (premium) | "openai" | "minimax" | "mistral" | "neutts" (local)
"provider": "edge", # "edge" (free) | "elevenlabs" (premium) | "openai" | "neutts" (local)
"edge": {
"voice": "en-US-AriaNeural",
# Popular: AriaNeural, JennyNeural, AndrewNeural, BrianNeural, SoniaNeural
@@ -530,10 +452,6 @@ DEFAULT_CONFIG = {
"voice": "alloy",
# Voices: alloy, echo, fable, onyx, nova, shimmer
},
"mistral": {
"model": "voxtral-mini-tts-2603",
"voice_id": "c69964a6-ab8b-4f8a-9465-ec0925096ec8", # Paul - Neutral
},
"neutts": {
"ref_audio": "", # Path to reference voice audio (empty = bundled default)
"ref_text": "", # Path to reference voice transcript (empty = bundled default)
@@ -571,16 +489,6 @@ DEFAULT_CONFIG = {
"max_ms": 2500,
},
# Context engine -- controls how the context window is managed when
# approaching the model's token limit.
# "compressor" = built-in lossy summarization (default).
# Set to a plugin name to activate an alternative engine (e.g. "lcm"
# for Lossless Context Management). The engine must be installed as
# a plugin in plugins/context_engine/<name>/ or ~/.hermes/plugins/.
"context": {
"engine": "compressor",
},
# Persistent memory -- bounded curated memory injected into system prompt
"memory": {
"memory_enabled": True,
@@ -605,8 +513,6 @@ DEFAULT_CONFIG = {
"api_key": "", # API key for delegation.base_url (falls back to OPENAI_API_KEY)
"max_iterations": 50, # per-subagent iteration cap (each subagent gets its own budget,
# independent of the parent's max_iterations)
"reasoning_effort": "", # reasoning effort for subagents: "xhigh", "high", "medium",
# "low", "minimal", "none" (empty = inherit parent's level)
},
# Ephemeral prefill messages file — JSON list of {role, content} dicts
@@ -634,7 +540,6 @@ DEFAULT_CONFIG = {
"discord": {
"require_mention": True, # Require @mention to respond in server channels
"free_response_channels": "", # Comma-separated channel IDs where bot responds without mention
"allowed_channels": "", # If set, bot ONLY responds in these channel IDs (whitelist)
"auto_thread": True, # Auto-create threads on @mention in channels (like Slack)
"reactions": True, # Add 👀/✅/❌ reactions to messages during processing
},
@@ -693,16 +598,8 @@ DEFAULT_CONFIG = {
"backup_count": 3, # Number of rotated backup files to keep
},
# Network settings — workarounds for connectivity issues.
"network": {
# Force IPv4 connections. On servers with broken or unreachable IPv6,
# Python tries AAAA records first and hangs for the full TCP timeout
# before falling back to IPv4. Set to true to skip IPv6 entirely.
"force_ipv4": False,
},
# Config schema version - bump this when adding new required fields
"_config_version": 16,
"_config_version": 13,
}
# =============================================================================
@@ -934,21 +831,6 @@ OPTIONAL_ENV_VARS = {
"category": "provider",
"advanced": True,
},
"XIAOMI_API_KEY": {
"description": "Xiaomi MiMo API key for MiMo models (mimo-v2-pro, mimo-v2-omni, mimo-v2-flash)",
"prompt": "Xiaomi MiMo API Key",
"url": "https://platform.xiaomimimo.com",
"password": True,
"category": "provider",
},
"XIAOMI_BASE_URL": {
"description": "Xiaomi MiMo base URL override (default: https://api.xiaomimimo.com/v1)",
"prompt": "Xiaomi base URL (leave empty for default)",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
# ── Tool API keys ──
"EXA_API_KEY": {
@@ -1101,13 +983,6 @@ OPTIONAL_ENV_VARS = {
"password": True,
"category": "tool",
},
"MISTRAL_API_KEY": {
"description": "Mistral API key for Voxtral TTS and transcription (STT)",
"prompt": "Mistral API key",
"url": "https://console.mistral.ai/",
"password": True,
"category": "tool",
},
"GITHUB_TOKEN": {
"description": "GitHub token for Skills Hub (higher API rate limits, skill publish)",
"prompt": "GitHub Token",
@@ -1280,14 +1155,6 @@ OPTIONAL_ENV_VARS = {
"category": "messaging",
"advanced": True,
},
"MATRIX_RECOVERY_KEY": {
"description": "Matrix recovery key for cross-signing verification after device key rotation (from Element: Settings → Security → Recovery Key)",
"prompt": "Matrix recovery key",
"url": None,
"password": True,
"category": "messaging",
"advanced": True,
},
"BLUEBUBBLES_SERVER_URL": {
"description": "BlueBubbles server URL for iMessage integration (e.g. http://192.168.1.10:1234)",
"prompt": "BlueBubbles server URL",
@@ -1326,8 +1193,8 @@ OPTIONAL_ENV_VARS = {
"advanced": True,
},
"API_SERVER_KEY": {
"description": "Bearer token for API server authentication. Required for non-loopback binding; server refuses to start without it. On loopback (127.0.0.1), all requests are allowed if empty.",
"prompt": "API server auth key (required for network access)",
"description": "Bearer token for API server authentication. If empty, all requests are allowed (local use only).",
"prompt": "API server auth key (optional)",
"url": None,
"password": True,
"category": "messaging",
@@ -1342,21 +1209,13 @@ OPTIONAL_ENV_VARS = {
"advanced": True,
},
"API_SERVER_HOST": {
"description": "Host/bind address for the API server (default: 127.0.0.1). Use 0.0.0.0 for network access — server refuses to start without API_SERVER_KEY.",
"description": "Host/bind address for the API server (default: 127.0.0.1). Use 0.0.0.0 for network access — requires API_SERVER_KEY for security.",
"prompt": "API server host",
"url": None,
"password": False,
"category": "messaging",
"advanced": True,
},
"API_SERVER_MODEL_NAME": {
"description": "Model name advertised on /v1/models. Defaults to the profile name (or 'hermes-agent' for the default profile). Useful for multi-user setups with OpenWebUI.",
"prompt": "API server model name",
"url": None,
"password": False,
"category": "messaging",
"advanced": True,
},
"WEBHOOK_ENABLED": {
"description": "Enable the webhook platform adapter for receiving events from GitHub, GitLab, etc.",
"prompt": "Enable webhooks (true/false)",
@@ -1567,12 +1426,12 @@ _KNOWN_ROOT_KEYS = {
"_config_version", "model", "providers", "fallback_model",
"fallback_providers", "credential_pool_strategies", "toolsets",
"agent", "terminal", "display", "compression", "delegation",
"auxiliary", "custom_providers", "context", "memory", "gateway",
"auxiliary", "custom_providers", "memory", "gateway",
}
# Valid fields inside a custom_providers list entry
_VALID_CUSTOM_PROVIDER_FIELDS = {
"name", "base_url", "api_key", "api_mode", "model", "models",
"name", "base_url", "api_key", "api_mode", "models",
"context_length", "rate_limit_delay",
}
@@ -1887,94 +1746,6 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
except Exception:
pass
# ── Version 13 → 14: migrate legacy flat stt.model to provider section ──
# Old configs (and cli-config.yaml.example) had a flat `stt.model` key
# that was provider-agnostic. When the provider was "local" this caused
# OpenAI model names (e.g. "whisper-1") to be fed to faster-whisper,
# crashing with "Invalid model size". Move the value into the correct
# provider-specific section and remove the flat key.
if current_ver < 14:
# Read raw config (no defaults merged) to check what the user actually
# wrote, then apply changes to the merged config for saving.
raw = read_raw_config()
raw_stt = raw.get("stt", {})
if isinstance(raw_stt, dict) and "model" in raw_stt:
legacy_model = raw_stt["model"]
provider = raw_stt.get("provider", "local")
config = load_config()
stt = config.get("stt", {})
# Remove the legacy flat key
stt.pop("model", None)
# Place it in the appropriate provider section only if the
# user didn't already set a model there
if provider in ("local", "local_command"):
# Don't migrate an OpenAI model name into the local section
_local_models = {
"tiny.en", "tiny", "base.en", "base", "small.en", "small",
"medium.en", "medium", "large-v1", "large-v2", "large-v3",
"large", "distil-large-v2", "distil-medium.en",
"distil-small.en", "distil-large-v3", "distil-large-v3.5",
"large-v3-turbo", "turbo",
}
if legacy_model in _local_models:
# Check raw config — only set if user didn't already
# have a nested local.model
raw_local = raw_stt.get("local", {})
if not isinstance(raw_local, dict) or "model" not in raw_local:
local_cfg = stt.setdefault("local", {})
local_cfg["model"] = legacy_model
# else: drop it — it was an OpenAI model name, local section
# already defaults to "base" via DEFAULT_CONFIG
else:
# Cloud provider — put it in that provider's section only
# if user didn't already set a nested model
raw_provider = raw_stt.get(provider, {})
if not isinstance(raw_provider, dict) or "model" not in raw_provider:
provider_cfg = stt.setdefault(provider, {})
provider_cfg["model"] = legacy_model
config["stt"] = stt
save_config(config)
if not quiet:
print(f" ✓ Migrated legacy stt.model to provider-specific config")
# ── Version 14 → 15: add explicit gateway interim-message gate ──
if current_ver < 15:
config = read_raw_config()
display = config.get("display", {})
if not isinstance(display, dict):
display = {}
if "interim_assistant_messages" not in display:
display["interim_assistant_messages"] = True
config["display"] = display
results["config_added"].append("display.interim_assistant_messages=true (default)")
save_config(config)
if not quiet:
print(" ✓ Added display.interim_assistant_messages=true")
# ── Version 15 → 16: migrate tool_progress_overrides into display.platforms ──
if current_ver < 16:
config = read_raw_config()
display = config.get("display", {})
if not isinstance(display, dict):
display = {}
old_overrides = display.get("tool_progress_overrides")
if isinstance(old_overrides, dict) and old_overrides:
platforms = display.get("platforms", {})
if not isinstance(platforms, dict):
platforms = {}
for plat, mode in old_overrides.items():
if plat not in platforms:
platforms[plat] = {}
if "tool_progress" not in platforms[plat]:
platforms[plat]["tool_progress"] = mode
display["platforms"] = platforms
config["display"] = display
save_config(config)
if not quiet:
migrated = ", ".join(f"{p}={m}" for p, m in old_overrides.items())
print(f" ✓ Migrated tool_progress_overrides → display.platforms: {migrated}")
results["config_added"].append("display.platforms (migrated from tool_progress_overrides)")
if current_ver < latest_ver and not quiet:
print(f"Config version: {current_ver}{latest_ver}")
@@ -2384,13 +2155,7 @@ def save_config(config: Dict[str, Any]):
def load_env() -> Dict[str, str]:
"""Load environment variables from ~/.hermes/.env.
Sanitizes lines before parsing so that corrupted files (e.g.
concatenated KEY=VALUE pairs on a single line) are handled
gracefully instead of producing mangled values such as duplicated
bot tokens. See #8908.
"""
"""Load environment variables from ~/.hermes/.env."""
env_path = get_env_path()
env_vars = {}
@@ -2399,21 +2164,17 @@ def load_env() -> Dict[str, str]:
# fail on UTF-8 .env files. Use explicit UTF-8 only on Windows.
open_kw = {"encoding": "utf-8", "errors": "replace"} if _IS_WINDOWS else {}
with open(env_path, **open_kw) as f:
raw_lines = f.readlines()
# Sanitize before parsing: split concatenated lines & drop stale
# placeholders so corrupted .env files don't produce invalid tokens.
lines = _sanitize_env_lines(raw_lines)
for line in lines:
line = line.strip()
if line and not line.startswith('#') and '=' in line:
key, _, value = line.partition('=')
env_vars[key.strip()] = value.strip().strip('"\'')
for line in f:
line = line.strip()
if line and not line.startswith('#') and '=' in line:
key, _, value = line.partition('=')
env_vars[key.strip()] = value.strip().strip('"\'')
return env_vars
def _sanitize_env_lines(lines: list) -> list:
"""Fix corrupted .env lines before reading or writing.
"""Fix corrupted .env lines before writing.
Handles two known corruption patterns:
1. Concatenated KEY=VALUE pairs on a single line (missing newline between
@@ -2646,28 +2407,6 @@ def save_env_value_secure(key: str, value: str) -> Dict[str, Any]:
def reload_env() -> int:
"""Re-read ~/.hermes/.env into os.environ. Returns count of vars updated.
Adds/updates vars that changed and removes vars that were deleted from
the .env file (but only vars known to Hermes OPTIONAL_ENV_VARS and
_EXTRA_ENV_KEYS to avoid clobbering unrelated environment).
"""
env_vars = load_env()
known_keys = set(OPTIONAL_ENV_VARS.keys()) | _EXTRA_ENV_KEYS
count = 0
for key, value in env_vars.items():
if os.environ.get(key) != value:
os.environ[key] = value
count += 1
# Remove known Hermes vars that are no longer in .env
for key in known_keys:
if key not in env_vars and key in os.environ:
del os.environ[key]
count += 1
return count
def get_env_value(key: str) -> Optional[str]:
"""Get a value from ~/.hermes/.env or environment."""
# Check environment first
@@ -2727,8 +2466,7 @@ def show_config():
for env_key, name in keys:
value = get_env_value(env_key)
print(f" {name:<14} {redact_key(value)}")
from hermes_cli.auth import get_anthropic_key
anthropic_value = get_anthropic_key()
anthropic_value = get_env_value("ANTHROPIC_TOKEN") or get_env_value("ANTHROPIC_API_KEY")
print(f" {'Anthropic':<14} {redact_key(anthropic_value)}")
# Model settings
@@ -2944,8 +2682,8 @@ def set_config_value(key: str, value: str):
# Write only user config back (not the full merged defaults)
ensure_hermes_home()
from utils import atomic_yaml_write
atomic_yaml_write(config_path, user_config, sort_keys=False)
with open(config_path, 'w', encoding="utf-8") as f:
yaml.dump(user_config, f, default_flow_style=False, sort_keys=False)
# Keep .env in sync for keys that terminal_tool reads directly from env vars.
# config.yaml is authoritative, but terminal_tool only reads TERMINAL_ENV etc.
@@ -2961,10 +2699,6 @@ def set_config_value(key: str, value: str):
"terminal.timeout": "TERMINAL_TIMEOUT",
"terminal.sandbox_dir": "TERMINAL_SANDBOX_DIR",
"terminal.persistent_shell": "TERMINAL_PERSISTENT_SHELL",
"terminal.container_cpu": "TERMINAL_CONTAINER_CPU",
"terminal.container_memory": "TERMINAL_CONTAINER_MEMORY",
"terminal.container_disk": "TERMINAL_CONTAINER_DISK",
"terminal.container_persistent": "TERMINAL_CONTAINER_PERSISTENT",
}
if key in _config_to_env_sync:
save_env_value(_config_to_env_sync[key], str(value))
-1
View File
@@ -273,7 +273,6 @@ def copilot_request_headers(
headers: dict[str, str] = {
"Editor-Version": "vscode/1.104.1",
"User-Agent": "HermesAgent/1.0",
"Copilot-Integration-Id": "vscode-chat",
"Openai-Intent": "conversation-edits",
"x-initiator": "agent" if is_agent_turn else "user",
}
-273
View File
@@ -10,28 +10,6 @@ from typing import Callable, List, Optional, Set
from hermes_cli.colors import Colors, color
def flush_stdin() -> None:
"""Flush any stray bytes from the stdin input buffer.
Must be called after ``curses.wrapper()`` (or any terminal-mode library
like simple_term_menu) returns, **before** the next ``input()`` /
``getpass.getpass()`` call. ``curses.endwin()`` restores the terminal
but does NOT drain the OS input buffer leftover escape-sequence bytes
(from arrow keys, terminal mode-switch responses, or rapid keypresses)
remain buffered and silently get consumed by the next ``input()`` call,
corrupting user data (e.g. writing ``^[^[`` into .env files).
On non-TTY stdin (piped, redirected) or Windows, this is a no-op.
"""
try:
if not sys.stdin.isatty():
return
import termios
termios.tcflush(sys.stdin, termios.TCIFLUSH)
except Exception:
pass
def curses_checklist(
title: str,
items: List[str],
@@ -153,263 +131,12 @@ def curses_checklist(
return
curses.wrapper(_draw)
flush_stdin()
return result_holder[0] if result_holder[0] is not None else cancel_returns
except Exception:
return _numbered_fallback(title, items, selected, cancel_returns, status_fn)
def curses_radiolist(
title: str,
items: List[str],
selected: int = 0,
*,
cancel_returns: int | None = None,
) -> int:
"""Curses single-select radio list. Returns the selected index.
Args:
title: Header line displayed above the list.
items: Display labels for each row.
selected: Index that starts selected (pre-selected).
cancel_returns: Returned on ESC/q. Defaults to the original *selected*.
"""
if cancel_returns is None:
cancel_returns = selected
if not sys.stdin.isatty():
return cancel_returns
try:
import curses
result_holder: list = [None]
def _draw(stdscr):
curses.curs_set(0)
if curses.has_colors():
curses.start_color()
curses.use_default_colors()
curses.init_pair(1, curses.COLOR_GREEN, -1)
curses.init_pair(2, curses.COLOR_YELLOW, -1)
cursor = selected
scroll_offset = 0
while True:
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
# Header
try:
hattr = curses.A_BOLD
if curses.has_colors():
hattr |= curses.color_pair(2)
stdscr.addnstr(0, 0, title, max_x - 1, hattr)
stdscr.addnstr(
1, 0,
" \u2191\u2193 navigate ENTER/SPACE select ESC cancel",
max_x - 1, curses.A_DIM,
)
except curses.error:
pass
# Scrollable item list
visible_rows = max_y - 4
if cursor < scroll_offset:
scroll_offset = cursor
elif cursor >= scroll_offset + visible_rows:
scroll_offset = cursor - visible_rows + 1
for draw_i, i in enumerate(
range(scroll_offset, min(len(items), scroll_offset + visible_rows))
):
y = draw_i + 3
if y >= max_y - 1:
break
radio = "\u25cf" if i == selected else "\u25cb"
arrow = "\u2192" if i == cursor else " "
line = f" {arrow} ({radio}) {items[i]}"
attr = curses.A_NORMAL
if i == cursor:
attr = curses.A_BOLD
if curses.has_colors():
attr |= curses.color_pair(1)
try:
stdscr.addnstr(y, 0, line, max_x - 1, attr)
except curses.error:
pass
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
cursor = (cursor - 1) % len(items)
elif key in (curses.KEY_DOWN, ord("j")):
cursor = (cursor + 1) % len(items)
elif key in (ord(" "), curses.KEY_ENTER, 10, 13):
result_holder[0] = cursor
return
elif key in (27, ord("q")):
result_holder[0] = cancel_returns
return
curses.wrapper(_draw)
flush_stdin()
return result_holder[0] if result_holder[0] is not None else cancel_returns
except Exception:
return _radio_numbered_fallback(title, items, selected, cancel_returns)
def _radio_numbered_fallback(
title: str,
items: List[str],
selected: int,
cancel_returns: int,
) -> int:
"""Text-based numbered fallback for radio selection."""
print(color(f"\n {title}", Colors.YELLOW))
print(color(" Select by number, Enter to confirm.\n", Colors.DIM))
for i, label in enumerate(items):
marker = color("(\u25cf)", Colors.GREEN) if i == selected else "(\u25cb)"
print(f" {marker} {i + 1:>2}. {label}")
print()
try:
val = input(color(f" Choice [default {selected + 1}]: ", Colors.DIM)).strip()
if not val:
return selected
idx = int(val) - 1
if 0 <= idx < len(items):
return idx
return selected
except (ValueError, KeyboardInterrupt, EOFError):
return cancel_returns
def curses_single_select(
title: str,
items: List[str],
default_index: int = 0,
*,
cancel_label: str = "Cancel",
) -> int | None:
"""Curses single-select menu. Returns selected index or None on cancel.
Works inside prompt_toolkit because curses.wrapper() restores the terminal
safely, unlike simple_term_menu which conflicts with /dev/tty.
"""
if not sys.stdin.isatty():
return None
try:
import curses
result_holder: list = [None]
all_items = list(items) + [cancel_label]
cancel_idx = len(items)
def _draw(stdscr):
curses.curs_set(0)
if curses.has_colors():
curses.start_color()
curses.use_default_colors()
curses.init_pair(1, curses.COLOR_GREEN, -1)
curses.init_pair(2, curses.COLOR_YELLOW, -1)
cursor = min(default_index, len(all_items) - 1)
scroll_offset = 0
while True:
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
try:
hattr = curses.A_BOLD
if curses.has_colors():
hattr |= curses.color_pair(2)
stdscr.addnstr(0, 0, title, max_x - 1, hattr)
stdscr.addnstr(
1, 0,
" ↑↓ navigate ENTER confirm ESC/q cancel",
max_x - 1, curses.A_DIM,
)
except curses.error:
pass
visible_rows = max_y - 3
if cursor < scroll_offset:
scroll_offset = cursor
elif cursor >= scroll_offset + visible_rows:
scroll_offset = cursor - visible_rows + 1
for draw_i, i in enumerate(
range(scroll_offset, min(len(all_items), scroll_offset + visible_rows))
):
y = draw_i + 3
if y >= max_y - 1:
break
arrow = "" if i == cursor else " "
line = f" {arrow} {all_items[i]}"
attr = curses.A_NORMAL
if i == cursor:
attr = curses.A_BOLD
if curses.has_colors():
attr |= curses.color_pair(1)
try:
stdscr.addnstr(y, 0, line, max_x - 1, attr)
except curses.error:
pass
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
cursor = (cursor - 1) % len(all_items)
elif key in (curses.KEY_DOWN, ord("j")):
cursor = (cursor + 1) % len(all_items)
elif key in (curses.KEY_ENTER, 10, 13):
result_holder[0] = cursor
return
elif key in (27, ord("q")):
result_holder[0] = None
return
curses.wrapper(_draw)
flush_stdin()
if result_holder[0] is not None and result_holder[0] >= cancel_idx:
return None
return result_holder[0]
except Exception:
all_items = list(items) + [cancel_label]
cancel_idx = len(items)
return _numbered_single_fallback(title, all_items, cancel_idx)
def _numbered_single_fallback(
title: str,
items: List[str],
cancel_idx: int,
) -> int | None:
"""Text-based numbered fallback for single-select."""
print(f"\n {title}\n")
for i, label in enumerate(items, 1):
print(f" {i}. {label}")
print()
try:
val = input(f" Choice [1-{len(items)}]: ").strip()
if not val:
return None
idx = int(val) - 1
if 0 <= idx < len(items) and idx < cancel_idx:
return idx
if idx == cancel_idx:
return None
except (ValueError, KeyboardInterrupt, EOFError):
pass
return None
def _numbered_fallback(
title: str,
items: List[str],
-336
View File
@@ -1,336 +0,0 @@
"""``hermes debug`` — debug tools for Hermes Agent.
Currently supports:
hermes debug share Upload debug report (system info + logs) to a
paste service and print a shareable URL.
"""
import io
import sys
import urllib.error
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Optional
from hermes_constants import get_hermes_home
# ---------------------------------------------------------------------------
# Paste services — try paste.rs first, dpaste.com as fallback.
# ---------------------------------------------------------------------------
_PASTE_RS_URL = "https://paste.rs/"
_DPASTE_COM_URL = "https://dpaste.com/api/"
# Maximum bytes to read from a single log file for upload.
# paste.rs caps at ~1 MB; we stay under that with headroom.
_MAX_LOG_BYTES = 512_000
def _upload_paste_rs(content: str) -> str:
"""Upload to paste.rs. Returns the paste URL.
paste.rs accepts a plain POST body and returns the URL directly.
"""
data = content.encode("utf-8")
req = urllib.request.Request(
_PASTE_RS_URL, data=data, method="POST",
headers={
"Content-Type": "text/plain; charset=utf-8",
"User-Agent": "hermes-agent/debug-share",
},
)
with urllib.request.urlopen(req, timeout=30) as resp:
url = resp.read().decode("utf-8").strip()
if not url.startswith("http"):
raise ValueError(f"Unexpected response from paste.rs: {url[:200]}")
return url
def _upload_dpaste_com(content: str, expiry_days: int = 7) -> str:
"""Upload to dpaste.com. Returns the paste URL.
dpaste.com uses multipart form data.
"""
boundary = "----HermesDebugBoundary9f3c"
def _field(name: str, value: str) -> str:
return (
f"--{boundary}\r\n"
f'Content-Disposition: form-data; name="{name}"\r\n'
f"\r\n"
f"{value}\r\n"
)
body = (
_field("content", content)
+ _field("syntax", "text")
+ _field("expiry_days", str(expiry_days))
+ f"--{boundary}--\r\n"
).encode("utf-8")
req = urllib.request.Request(
_DPASTE_COM_URL, data=body, method="POST",
headers={
"Content-Type": f"multipart/form-data; boundary={boundary}",
"User-Agent": "hermes-agent/debug-share",
},
)
with urllib.request.urlopen(req, timeout=30) as resp:
url = resp.read().decode("utf-8").strip()
if not url.startswith("http"):
raise ValueError(f"Unexpected response from dpaste.com: {url[:200]}")
return url
def upload_to_pastebin(content: str, expiry_days: int = 7) -> str:
"""Upload *content* to a paste service, trying paste.rs then dpaste.com.
Returns the paste URL on success, raises on total failure.
"""
errors: list[str] = []
# Try paste.rs first (simple, fast)
try:
return _upload_paste_rs(content)
except Exception as exc:
errors.append(f"paste.rs: {exc}")
# Fallback: dpaste.com (supports expiry)
try:
return _upload_dpaste_com(content, expiry_days=expiry_days)
except Exception as exc:
errors.append(f"dpaste.com: {exc}")
raise RuntimeError(
"Failed to upload to any paste service:\n " + "\n ".join(errors)
)
# ---------------------------------------------------------------------------
# Log file reading
# ---------------------------------------------------------------------------
def _resolve_log_path(log_name: str) -> Optional[Path]:
"""Find the log file for *log_name*, falling back to the .1 rotation.
Returns the path if found, or None.
"""
from hermes_cli.logs import LOG_FILES
filename = LOG_FILES.get(log_name)
if not filename:
return None
log_dir = get_hermes_home() / "logs"
primary = log_dir / filename
if primary.exists() and primary.stat().st_size > 0:
return primary
# Fall back to the most recent rotated file (.1).
rotated = log_dir / f"{filename}.1"
if rotated.exists() and rotated.stat().st_size > 0:
return rotated
return None
def _read_log_tail(log_name: str, num_lines: int) -> str:
"""Read the last *num_lines* from a log file, or return a placeholder."""
from hermes_cli.logs import _read_last_n_lines
log_path = _resolve_log_path(log_name)
if log_path is None:
return "(file not found)"
try:
lines = _read_last_n_lines(log_path, num_lines)
return "".join(lines).rstrip("\n")
except Exception as exc:
return f"(error reading: {exc})"
def _read_full_log(log_name: str, max_bytes: int = _MAX_LOG_BYTES) -> Optional[str]:
"""Read a log file for standalone upload.
Returns the file content (last *max_bytes* if truncated), or None if the
file doesn't exist or is empty.
"""
log_path = _resolve_log_path(log_name)
if log_path is None:
return None
try:
size = log_path.stat().st_size
if size == 0:
return None
if size <= max_bytes:
return log_path.read_text(encoding="utf-8", errors="replace")
# File is larger than max_bytes — read the tail.
with open(log_path, "rb") as f:
f.seek(size - max_bytes)
# Skip partial line at the seek point.
f.readline()
content = f.read().decode("utf-8", errors="replace")
return f"[... truncated — showing last ~{max_bytes // 1024}KB ...]\n{content}"
except Exception:
return None
# ---------------------------------------------------------------------------
# Debug report collection
# ---------------------------------------------------------------------------
def _capture_dump() -> str:
"""Run ``hermes dump`` and return its stdout as a string."""
from hermes_cli.dump import run_dump
class _FakeArgs:
show_keys = False
old_stdout = sys.stdout
sys.stdout = capture = io.StringIO()
try:
run_dump(_FakeArgs())
except SystemExit:
pass
finally:
sys.stdout = old_stdout
return capture.getvalue()
def collect_debug_report(*, log_lines: int = 200, dump_text: str = "") -> str:
"""Build the summary debug report: system dump + log tails.
Parameters
----------
log_lines
Number of recent lines to include per log file.
dump_text
Pre-captured dump output. If empty, ``hermes dump`` is run
internally.
Returns the report as a plain-text string ready for upload.
"""
buf = io.StringIO()
if not dump_text:
dump_text = _capture_dump()
buf.write(dump_text)
# ── Recent log tails (summary only) ──────────────────────────────────
buf.write("\n\n")
buf.write(f"--- agent.log (last {log_lines} lines) ---\n")
buf.write(_read_log_tail("agent", log_lines))
buf.write("\n\n")
errors_lines = min(log_lines, 100)
buf.write(f"--- errors.log (last {errors_lines} lines) ---\n")
buf.write(_read_log_tail("errors", errors_lines))
buf.write("\n\n")
buf.write(f"--- gateway.log (last {errors_lines} lines) ---\n")
buf.write(_read_log_tail("gateway", errors_lines))
buf.write("\n")
return buf.getvalue()
# ---------------------------------------------------------------------------
# CLI entry points
# ---------------------------------------------------------------------------
def run_debug_share(args):
"""Collect debug report + full logs, upload each, print URLs."""
log_lines = getattr(args, "lines", 200)
expiry = getattr(args, "expire", 7)
local_only = getattr(args, "local", False)
print("Collecting debug report...")
# Capture dump once — prepended to every paste for context.
dump_text = _capture_dump()
report = collect_debug_report(log_lines=log_lines, dump_text=dump_text)
agent_log = _read_full_log("agent")
gateway_log = _read_full_log("gateway")
# Prepend dump header to each full log so every paste is self-contained.
if agent_log:
agent_log = dump_text + "\n\n--- full agent.log ---\n" + agent_log
if gateway_log:
gateway_log = dump_text + "\n\n--- full gateway.log ---\n" + gateway_log
if local_only:
print(report)
if agent_log:
print(f"\n\n{'=' * 60}")
print("FULL agent.log")
print(f"{'=' * 60}\n")
print(agent_log)
if gateway_log:
print(f"\n\n{'=' * 60}")
print("FULL gateway.log")
print(f"{'=' * 60}\n")
print(gateway_log)
return
print("Uploading...")
urls: dict[str, str] = {}
failures: list[str] = []
# 1. Summary report (required)
try:
urls["Report"] = upload_to_pastebin(report, expiry_days=expiry)
except RuntimeError as exc:
print(f"\nUpload failed: {exc}", file=sys.stderr)
print("\nFull report printed below — copy-paste it manually:\n")
print(report)
sys.exit(1)
# 2. Full agent.log (optional)
if agent_log:
try:
urls["agent.log"] = upload_to_pastebin(agent_log, expiry_days=expiry)
except Exception as exc:
failures.append(f"agent.log: {exc}")
# 3. Full gateway.log (optional)
if gateway_log:
try:
urls["gateway.log"] = upload_to_pastebin(gateway_log, expiry_days=expiry)
except Exception as exc:
failures.append(f"gateway.log: {exc}")
# Print results
label_width = max(len(k) for k in urls)
print(f"\nDebug report uploaded:")
for label, url in urls.items():
print(f" {label:<{label_width}} {url}")
if failures:
print(f"\n (failed to upload: {', '.join(failures)})")
print(f"\nShare these links with the Hermes team for support.")
def run_debug(args):
"""Route debug subcommands."""
subcmd = getattr(args, "debug_command", None)
if subcmd == "share":
run_debug_share(args)
else:
# Default: show help
print("Usage: hermes debug share [--lines N] [--expire N] [--local]")
print()
print("Commands:")
print(" share Upload debug report to a paste service and print URL")
print()
print("Options:")
print(" --lines N Number of log lines to include (default: 200)")
print(" --expire N Paste expiry in days (default: 7)")
print(" --local Print report locally instead of uploading")
+14 -65
View File
@@ -51,36 +51,9 @@ _PROVIDER_ENV_HINTS = (
"AI_GATEWAY_API_KEY",
"OPENCODE_ZEN_API_KEY",
"OPENCODE_GO_API_KEY",
"XIAOMI_API_KEY",
)
from hermes_constants import is_termux as _is_termux
def _python_install_cmd() -> str:
return "python -m pip install" if _is_termux() else "uv pip install"
def _system_package_install_cmd(pkg: str) -> str:
if _is_termux():
return f"pkg install {pkg}"
if sys.platform == "darwin":
return f"brew install {pkg}"
return f"sudo apt install {pkg}"
def _termux_browser_setup_steps(node_installed: bool) -> list[str]:
steps: list[str] = []
step = 1
if not node_installed:
steps.append(f"{step}) pkg install nodejs")
step += 1
steps.append(f"{step}) npm install -g agent-browser")
steps.append(f"{step + 1}) agent-browser install")
return steps
def _has_provider_env_config(content: str) -> bool:
"""Return True when ~/.hermes/.env contains provider auth/base URL settings."""
return any(key in content for key in _PROVIDER_ENV_HINTS)
@@ -227,7 +200,7 @@ def run_doctor(args):
check_ok(name)
except ImportError:
check_fail(name, "(missing)")
issues.append(f"Install {name}: {_python_install_cmd()} {module}")
issues.append(f"Install {name}: uv pip install {module}")
for module, name in optional_packages:
try:
@@ -336,8 +309,8 @@ def run_doctor(args):
model_section[k] = raw_config.pop(k)
else:
raw_config.pop(k)
from utils import atomic_yaml_write
atomic_yaml_write(config_path, raw_config)
with open(config_path, "w") as f:
yaml.dump(raw_config, f, default_flow_style=False)
check_ok("Migrated stale root-level keys into model section")
fixed_count += 1
else:
@@ -530,7 +503,7 @@ def run_doctor(args):
check_ok("ripgrep (rg)", "(faster file search)")
else:
check_warn("ripgrep (rg) not found", "(file search uses grep fallback)")
check_info(f"Install for faster search: {_system_package_install_cmd('ripgrep')}")
check_info("Install for faster search: sudo apt install ripgrep")
# Docker (optional)
terminal_env = os.getenv("TERMINAL_ENV", "local")
@@ -553,10 +526,7 @@ def run_doctor(args):
if shutil.which("docker"):
check_ok("docker", "(optional)")
else:
if _is_termux():
check_info("Docker backend is not available inside Termux (expected on Android)")
else:
check_warn("docker not found", "(optional)")
check_warn("docker not found", "(optional)")
# SSH (if using ssh backend)
if terminal_env == "ssh":
@@ -604,23 +574,9 @@ def run_doctor(args):
if agent_browser_path.exists():
check_ok("agent-browser (Node.js)", "(browser automation)")
else:
if _is_termux():
check_info("agent-browser is not installed (expected in the tested Termux path)")
check_info("Install it manually later with: npm install -g agent-browser && agent-browser install")
check_info("Termux browser setup:")
for step in _termux_browser_setup_steps(node_installed=True):
check_info(step)
else:
check_warn("agent-browser not installed", "(run: npm install)")
check_warn("agent-browser not installed", "(run: npm install)")
else:
if _is_termux():
check_info("Node.js not found (browser tools are optional in the tested Termux path)")
check_info("Install Node.js on Termux with: pkg install nodejs")
check_info("Termux browser setup:")
for step in _termux_browser_setup_steps(node_installed=False):
check_info(step)
else:
check_warn("Node.js not found", "(optional, needed for browser tools)")
check_warn("Node.js not found", "(optional, needed for browser tools)")
# npm audit for all Node.js packages
if shutil.which("npm"):
@@ -686,8 +642,7 @@ def run_doctor(args):
else:
check_warn("OpenRouter API", "(not configured)")
from hermes_cli.auth import get_anthropic_key
anthropic_key = get_anthropic_key()
anthropic_key = os.getenv("ANTHROPIC_TOKEN") or os.getenv("ANTHROPIC_API_KEY")
if anthropic_key:
print(" Checking Anthropic API...", end="", flush=True)
try:
@@ -724,9 +679,9 @@ def run_doctor(args):
("DeepSeek", ("DEEPSEEK_API_KEY",), "https://api.deepseek.com/v1/models", "DEEPSEEK_BASE_URL", True),
("Hugging Face", ("HF_TOKEN",), "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
("Alibaba/DashScope", ("DASHSCOPE_API_KEY",), "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
# MiniMax: the /anthropic endpoint doesn't support /models, but the /v1 endpoint does.
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL", True),
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL", True),
# MiniMax APIs don't support /models endpoint — https://github.com/NousResearch/hermes-agent/issues/811
("MiniMax", ("MINIMAX_API_KEY",), None, "MINIMAX_BASE_URL", False),
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), None, "MINIMAX_CN_BASE_URL", False),
("AI Gateway", ("AI_GATEWAY_API_KEY",), "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
("Kilo Code", ("KILOCODE_API_KEY",), "https://api.kilo.ai/api/gateway/models", "KILOCODE_BASE_URL", True),
("OpenCode Zen", ("OPENCODE_ZEN_API_KEY",), "https://opencode.ai/zen/v1/models", "OPENCODE_ZEN_BASE_URL", True),
@@ -751,15 +706,10 @@ def run_doctor(args):
# Auto-detect Kimi Code keys (sk-kimi-) → api.kimi.com
if not _base and _key.startswith("sk-kimi-"):
_base = "https://api.kimi.com/coding/v1"
# Anthropic-compat endpoints (/anthropic) don't support /models.
# Rewrite to the OpenAI-compat /v1 surface for health checks.
if _base and _base.rstrip("/").endswith("/anthropic"):
from agent.auxiliary_client import _to_openai_base_url
_base = _to_openai_base_url(_base)
_url = (_base.rstrip("/") + "/models") if _base else _default_url
_headers = {"Authorization": f"Bearer {_key}"}
if "api.kimi.com" in _url.lower():
_headers["User-Agent"] = "KimiCLI/1.30.0"
_headers["User-Agent"] = "KimiCLI/1.0"
_resp = httpx.get(
_url,
headers=_headers,
@@ -789,9 +739,8 @@ def run_doctor(args):
__import__("tinker_atropos")
check_ok("tinker-atropos", "(RL training backend)")
except ImportError:
install_cmd = f"{_python_install_cmd()} -e ./tinker-atropos"
check_warn("tinker-atropos found but not installed", f"(run: {install_cmd})")
issues.append(f"Install tinker-atropos: {install_cmd}")
check_warn("tinker-atropos found but not installed", "(run: uv pip install -e ./tinker-atropos)")
issues.append("Install tinker-atropos: uv pip install -e ./tinker-atropos")
else:
check_warn("tinker-atropos requires Python 3.11+", f"(current: {py_version.major}.{py_version.minor})")
else:
-12
View File
@@ -44,16 +44,6 @@ def _redact(value: str) -> str:
def _gateway_status() -> str:
"""Return a short gateway status string."""
if sys.platform.startswith("linux"):
from hermes_constants import is_container
if is_container():
try:
from hermes_cli.gateway import find_gateway_pids
pids = find_gateway_pids()
if pids:
return f"running (docker, pid {pids[0]})"
return "stopped (docker)"
except Exception:
return "stopped (docker)"
try:
from hermes_cli.gateway import get_service_name
svc = get_service_name()
@@ -129,8 +119,6 @@ def _configured_platforms() -> list[str]:
"dingtalk": "DINGTALK_CLIENT_ID",
"feishu": "FEISHU_APP_ID",
"wecom": "WECOM_BOT_ID",
"wecom_callback": "WECOM_CALLBACK_CORP_ID",
"weixin": "WEIXIN_ACCOUNT_ID",
}
return [name for name, env in checks.items() if os.getenv(env)]
-49
View File
@@ -15,51 +15,6 @@ def _load_dotenv_with_fallback(path: Path, *, override: bool) -> None:
load_dotenv(dotenv_path=path, override=override, encoding="latin-1")
def _sanitize_env_file_if_needed(path: Path) -> None:
"""Pre-sanitize a .env file before python-dotenv reads it.
python-dotenv does not handle corrupted lines where multiple
KEY=VALUE pairs are concatenated on a single line (missing newline).
This produces mangled values e.g. a bot token duplicated 8×
(see #8908).
We delegate to ``hermes_cli.config._sanitize_env_lines`` which
already knows all valid Hermes env-var names and can split
concatenated lines correctly.
"""
if not path.exists():
return
try:
from hermes_cli.config import _sanitize_env_lines
except ImportError:
return # early bootstrap — config module not available yet
read_kw = {"encoding": "utf-8", "errors": "replace"}
try:
with open(path, **read_kw) as f:
original = f.readlines()
sanitized = _sanitize_env_lines(original)
if sanitized != original:
import tempfile
fd, tmp = tempfile.mkstemp(
dir=str(path.parent), suffix=".tmp", prefix=".env_"
)
try:
with os.fdopen(fd, "w", encoding="utf-8") as f:
f.writelines(sanitized)
f.flush()
os.fsync(f.fileno())
os.replace(tmp, path)
except BaseException:
try:
os.unlink(tmp)
except OSError:
pass
raise
except Exception:
pass # best-effort — don't block gateway startup
def load_hermes_dotenv(
*,
hermes_home: str | os.PathLike | None = None,
@@ -79,10 +34,6 @@ def load_hermes_dotenv(
user_env = home_path / ".env"
project_env_path = Path(project_env) if project_env else None
# Fix corrupted .env files before python-dotenv parses them (#8908).
if user_env.exists():
_sanitize_env_file_if_needed(user_env)
if user_env.exists():
_load_dotenv_with_fallback(user_env, override=True)
loaded.append(user_env)
+112 -849
View File
File diff suppressed because it is too large Load Diff
+9 -64
View File
@@ -1,18 +1,16 @@
"""``hermes logs`` — view and filter Hermes log files.
Supports tailing, following, session filtering, level filtering,
component filtering, and relative time ranges. All log files live
under ``~/.hermes/logs/``.
Supports tailing, following, session filtering, level filtering, and
relative time ranges. All log files live under ``~/.hermes/logs/``.
Usage examples::
hermes logs # last 50 lines of agent.log
hermes logs -f # follow agent.log in real time
hermes logs errors # last 50 lines of errors.log
hermes logs gateway -n 100 # last 100 lines of gateway.log
hermes logs gateway -n 100 # last 100 lines of gateway.log
hermes logs --level WARNING # only WARNING+ lines
hermes logs --session abc123 # filter by session ID substring
hermes logs --component tools # only tool-related lines
hermes logs --since 1h # lines from the last hour
hermes logs --since 30m -f # follow, starting 30 min ago
"""
@@ -22,7 +20,7 @@ import sys
import time
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional, Sequence
from typing import Optional
from hermes_constants import get_hermes_home, display_hermes_home
@@ -40,15 +38,6 @@ _TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})")
# Level extraction — matches " INFO ", " WARNING ", " ERROR ", " DEBUG ", " CRITICAL "
_LEVEL_RE = re.compile(r"\s(DEBUG|INFO|WARNING|ERROR|CRITICAL)\s")
# Logger name extraction — after level and optional session tag, the next
# non-space token before ":" is the logger name.
# Matches: "INFO gateway.run:" or "INFO [sess_abc] tools.terminal_tool:"
_LOGGER_NAME_RE = re.compile(
r"\s(?:DEBUG|INFO|WARNING|ERROR|CRITICAL)" # level
r"(?:\s+\[.*?\])?" # optional session tag
r"\s+(\S+):" # logger name
)
# Level ordering for >= filtering
_LEVEL_ORDER = {"DEBUG": 0, "INFO": 1, "WARNING": 2, "ERROR": 3, "CRITICAL": 4}
@@ -90,27 +79,12 @@ def _extract_level(line: str) -> Optional[str]:
return m.group(1) if m else None
def _extract_logger_name(line: str) -> Optional[str]:
"""Extract the logger name from a log line."""
m = _LOGGER_NAME_RE.search(line)
return m.group(1) if m else None
def _line_matches_component(line: str, prefixes: Sequence[str]) -> bool:
"""Check if a log line's logger name starts with any of *prefixes*."""
name = _extract_logger_name(line)
if name is None:
return False
return name.startswith(tuple(prefixes))
def _matches_filters(
line: str,
*,
min_level: Optional[str] = None,
session_filter: Optional[str] = None,
since: Optional[datetime] = None,
component_prefixes: Optional[Sequence[str]] = None,
) -> bool:
"""Check if a log line passes all active filters."""
if since is not None:
@@ -128,10 +102,6 @@ def _matches_filters(
if session_filter not in line:
return False
if component_prefixes is not None:
if not _line_matches_component(line, component_prefixes):
return False
return True
@@ -143,7 +113,6 @@ def tail_log(
level: Optional[str] = None,
session: Optional[str] = None,
since: Optional[str] = None,
component: Optional[str] = None,
) -> None:
"""Read and display log lines, optionally following in real time.
@@ -161,8 +130,6 @@ def tail_log(
Session ID substring to filter on.
since
Relative time string (e.g. ``"1h"``, ``"30m"``).
component
Component name to filter by (e.g. ``"gateway"``, ``"tools"``).
"""
filename = LOG_FILES.get(log_name)
if filename is None:
@@ -188,29 +155,13 @@ def tail_log(
print(f"Invalid --level: {level!r}. Use DEBUG, INFO, WARNING, ERROR, or CRITICAL.")
sys.exit(1)
# Resolve component to logger name prefixes
component_prefixes = None
if component:
from hermes_logging import COMPONENT_PREFIXES
component_lower = component.lower()
if component_lower not in COMPONENT_PREFIXES:
available = ", ".join(sorted(COMPONENT_PREFIXES))
print(f"Unknown component: {component!r}. Available: {available}")
sys.exit(1)
component_prefixes = COMPONENT_PREFIXES[component_lower]
has_filters = (
min_level is not None
or session is not None
or since_dt is not None
or component_prefixes is not None
)
has_filters = min_level is not None or session is not None or since_dt is not None
# Read and display the tail
try:
lines = _read_tail(log_path, num_lines, has_filters=has_filters,
min_level=min_level, session_filter=session,
since=since_dt, component_prefixes=component_prefixes)
since=since_dt)
except PermissionError:
print(f"Permission denied: {log_path}")
sys.exit(1)
@@ -221,8 +172,6 @@ def tail_log(
filter_parts.append(f"level>={min_level}")
if session:
filter_parts.append(f"session={session}")
if component:
filter_parts.append(f"component={component}")
if since:
filter_parts.append(f"since={since}")
filter_desc = f" [{', '.join(filter_parts)}]" if filter_parts else ""
@@ -241,7 +190,7 @@ def tail_log(
# Follow mode — poll for new content
try:
_follow_log(log_path, min_level=min_level, session_filter=session,
since=since_dt, component_prefixes=component_prefixes)
since=since_dt)
except KeyboardInterrupt:
print("\n--- stopped ---")
@@ -254,7 +203,6 @@ def _read_tail(
min_level: Optional[str] = None,
session_filter: Optional[str] = None,
since: Optional[datetime] = None,
component_prefixes: Optional[Sequence[str]] = None,
) -> list:
"""Read the last *num_lines* matching lines from a log file.
@@ -267,8 +215,7 @@ def _read_tail(
filtered = [
l for l in raw_lines
if _matches_filters(l, min_level=min_level,
session_filter=session_filter, since=since,
component_prefixes=component_prefixes)
session_filter=session_filter, since=since)
]
return filtered[-num_lines:]
else:
@@ -337,7 +284,6 @@ def _follow_log(
min_level: Optional[str] = None,
session_filter: Optional[str] = None,
since: Optional[datetime] = None,
component_prefixes: Optional[Sequence[str]] = None,
) -> None:
"""Poll a log file for new content and print matching lines."""
with open(path, "r", encoding="utf-8", errors="replace") as f:
@@ -347,8 +293,7 @@ def _follow_log(
line = f.readline()
if line:
if _matches_filters(line, min_level=min_level,
session_filter=session_filter, since=since,
component_prefixes=component_prefixes):
session_filter=session_filter, since=since):
print(line, end="")
sys.stdout.flush()
else:
+104 -512
View File
@@ -97,11 +97,10 @@ def _apply_profile_override() -> None:
consume = 1
break
# 2. If no flag, check active_profile in the hermes root
# 2. If no flag, check ~/.hermes/active_profile
if profile_name is None:
try:
from hermes_constants import get_default_hermes_root
active_path = get_default_hermes_root() / "active_profile"
active_path = Path.home() / ".hermes" / "active_profile"
if active_path.exists():
name = active_path.read_text().strip()
if name and name != "default":
@@ -151,18 +150,6 @@ try:
except Exception:
pass # best-effort — don't crash the CLI if logging setup fails
# Apply IPv4 preference early, before any HTTP clients are created.
try:
from hermes_cli.config import load_config as _load_config_early
from hermes_constants import apply_ipv4_preference as _apply_ipv4
_early_cfg = _load_config_early()
_net = _early_cfg.get("network", {})
if isinstance(_net, dict) and _net.get("force_ipv4"):
_apply_ipv4(force=True)
del _early_cfg, _net
except Exception:
pass # best-effort — don't crash if config isn't available yet
import logging
import time as _time
from datetime import datetime
@@ -540,113 +527,6 @@ def _resolve_last_cli_session() -> Optional[str]:
return None
def _probe_container(cmd: list, backend: str, via_sudo: bool = False):
"""Run a container inspect probe, returning the CompletedProcess.
Catches TimeoutExpired specifically for a human-readable message;
all other exceptions propagate naturally.
"""
try:
return subprocess.run(cmd, capture_output=True, text=True, timeout=15)
except subprocess.TimeoutExpired:
label = f"sudo {backend}" if via_sudo else backend
print(
f"Error: timed out waiting for {label} to respond.\n"
f"The {backend} daemon may be unresponsive or starting up.",
file=sys.stderr,
)
sys.exit(1)
def _exec_in_container(container_info: dict, cli_args: list):
"""Replace the current process with a command inside the managed container.
Probes whether sudo is needed (rootful containers), then os.execvp
into the container. On success the Python process is replaced entirely
and the container's exit code becomes the process exit code (OS semantics).
On failure, OSError propagates naturally.
Args:
container_info: dict with backend, container_name, exec_user, hermes_bin
cli_args: the original CLI arguments (everything after 'hermes')
"""
import shutil
backend = container_info["backend"]
container_name = container_info["container_name"]
exec_user = container_info["exec_user"]
hermes_bin = container_info["hermes_bin"]
runtime = shutil.which(backend)
if not runtime:
print(f"Error: {backend} not found on PATH. Cannot route to container.",
file=sys.stderr)
sys.exit(1)
# Rootful containers (NixOS systemd service) are invisible to unprivileged
# users — Podman uses per-user namespaces, Docker needs group access.
# Probe whether the runtime can see the container; if not, try via sudo.
sudo_path = None
probe = _probe_container(
[runtime, "inspect", "--format", "ok", container_name], backend,
)
if probe.returncode != 0:
sudo_path = shutil.which("sudo")
if sudo_path:
probe2 = _probe_container(
[sudo_path, "-n", runtime, "inspect", "--format", "ok", container_name],
backend, via_sudo=True,
)
if probe2.returncode != 0:
print(
f"Error: container '{container_name}' not found via {backend}.\n"
f"\n"
f"The container is likely running as root. Your user cannot see it\n"
f"because {backend} uses per-user namespaces. Grant passwordless\n"
f"sudo for {backend} — the -n (non-interactive) flag is required\n"
f"because a password prompt would hang or break piped commands.\n"
f"\n"
f"On NixOS:\n"
f"\n"
f' security.sudo.extraRules = [{{\n'
f' users = [ "{os.getenv("USER", "your-user")}" ];\n'
f' commands = [{{ command = "{runtime}"; options = [ "NOPASSWD" ]; }}];\n'
f' }}];\n'
f"\n"
f"Or run: sudo hermes {' '.join(cli_args)}",
file=sys.stderr,
)
sys.exit(1)
else:
print(
f"Error: container '{container_name}' not found via {backend}.\n"
f"The container may be running under root. Try: sudo hermes {' '.join(cli_args)}",
file=sys.stderr,
)
sys.exit(1)
is_tty = sys.stdin.isatty()
tty_flags = ["-it"] if is_tty else ["-i"]
env_flags = []
for var in ("TERM", "COLORTERM", "LANG", "LC_ALL"):
val = os.environ.get(var)
if val:
env_flags.extend(["-e", f"{var}={val}"])
cmd_prefix = [sudo_path, "-n", runtime] if sudo_path else [runtime]
exec_cmd = (
cmd_prefix + ["exec"]
+ tty_flags
+ ["-u", exec_user]
+ env_flags
+ [container_name, hermes_bin]
+ cli_args
)
os.execvp(exec_cmd[0], exec_cmd)
def _resolve_session_by_name_or_id(name_or_id: str) -> Optional[str]:
"""Resolve a session name (title) or ID to a session ID.
@@ -766,7 +646,6 @@ def cmd_chat(args):
"verbose": args.verbose,
"quiet": getattr(args, "quiet", False),
"query": args.query,
"image": getattr(args, "image", None),
"resume": getattr(args, "resume", None),
"worktree": getattr(args, "worktree", False),
"checkpoints": getattr(args, "checkpoints", False),
@@ -978,6 +857,7 @@ def cmd_whatsapp(args):
def cmd_setup(args):
"""Interactive setup wizard."""
_require_tty("setup")
from hermes_cli.setup import run_setup_wizard
run_setup_wizard(args)
@@ -1053,7 +933,6 @@ def select_provider_and_model(args=None):
"kilocode": "Kilo Code",
"alibaba": "Alibaba Cloud (DashScope)",
"huggingface": "Hugging Face",
"xiaomi": "Xiaomi MiMo",
"custom": "Custom endpoint",
}
active_label = provider_labels.get(active, active) if active else "none"
@@ -1086,14 +965,12 @@ def select_provider_and_model(args=None):
("opencode-go", "OpenCode Go (open models, $10/month subscription)"),
("ai-gateway", "AI Gateway (Vercel — 200+ models, pay-per-use)"),
("alibaba", "Alibaba Cloud / DashScope Coding (Qwen + multi-provider)"),
("xiaomi", "Xiaomi MiMo (MiMo-V2 models — pro, omni, flash)"),
]
def _named_custom_provider_map(cfg) -> dict[str, dict[str, str]]:
custom_providers_cfg = cfg.get("custom_providers") or []
custom_provider_map = {}
if not isinstance(custom_providers_cfg, list):
return custom_provider_map
# Add user-defined custom providers from config.yaml
custom_providers_cfg = config.get("custom_providers") or []
_custom_provider_map = {} # key → {name, base_url, api_key}
if isinstance(custom_providers_cfg, list):
for entry in custom_providers_cfg:
if not isinstance(entry, dict):
continue
@@ -1102,24 +979,16 @@ def select_provider_and_model(args=None):
if not name or not base_url:
continue
key = "custom:" + name.lower().replace(" ", "-")
custom_provider_map[key] = {
short_url = base_url.replace("https://", "").replace("http://", "").rstrip("/")
saved_model = entry.get("model", "")
model_hint = f"{saved_model}" if saved_model else ""
top_providers.append((key, f"{name} ({short_url}){model_hint}"))
_custom_provider_map[key] = {
"name": name,
"base_url": base_url,
"api_key": entry.get("api_key", ""),
"model": entry.get("model", ""),
"api_mode": entry.get("api_mode", ""),
"model": saved_model,
}
return custom_provider_map
# Add user-defined custom providers from config.yaml
_custom_provider_map = _named_custom_provider_map(config) # key → {name, base_url, api_key}
for key, provider_info in _custom_provider_map.items():
name = provider_info["name"]
base_url = provider_info["base_url"]
short_url = base_url.replace("https://", "").replace("http://", "").rstrip("/")
saved_model = provider_info.get("model", "")
model_hint = f"{saved_model}" if saved_model else ""
top_providers.append((key, f"{name} ({short_url}){model_hint}"))
top_keys = {k for k, _ in top_providers}
extended_keys = {k for k, _ in extended_providers}
@@ -1184,60 +1053,17 @@ def select_provider_and_model(args=None):
_model_flow_copilot(config, current_model)
elif selected_provider == "custom":
_model_flow_custom(config)
elif selected_provider.startswith("custom:"):
provider_info = _named_custom_provider_map(load_config()).get(selected_provider)
if provider_info is None:
print(
"Warning: the selected saved custom provider is no longer available. "
"It may have been removed from config.yaml. No change."
)
return
_model_flow_named_custom(config, provider_info)
elif selected_provider.startswith("custom:") and selected_provider in _custom_provider_map:
_model_flow_named_custom(config, _custom_provider_map[selected_provider])
elif selected_provider == "remove-custom":
_remove_custom_provider(config)
elif selected_provider == "anthropic":
_model_flow_anthropic(config, current_model)
elif selected_provider == "kimi-coding":
_model_flow_kimi(config, current_model)
elif selected_provider in ("gemini", "zai", "minimax", "minimax-cn", "kilocode", "opencode-zen", "opencode-go", "ai-gateway", "alibaba", "huggingface", "xiaomi"):
elif selected_provider in ("gemini", "zai", "minimax", "minimax-cn", "kilocode", "opencode-zen", "opencode-go", "ai-gateway", "alibaba", "huggingface"):
_model_flow_api_key_provider(config, selected_provider, current_model)
# ── Post-switch cleanup: clear stale OPENAI_BASE_URL ──────────────
# When the user switches to a named provider (anything except "custom"),
# a leftover OPENAI_BASE_URL in ~/.hermes/.env can poison auxiliary
# clients that use provider:auto. Clear it proactively. (#5161)
if selected_provider not in ("custom", "cancel", "remove-custom") \
and not selected_provider.startswith("custom:"):
_clear_stale_openai_base_url()
def _clear_stale_openai_base_url():
"""Remove OPENAI_BASE_URL from ~/.hermes/.env if the active provider is not 'custom'.
After a provider switch, a leftover OPENAI_BASE_URL causes auxiliary
clients (compression, vision, delegation) with provider:auto to route
requests to the old custom endpoint instead of the newly selected
provider. See issue #5161.
"""
from hermes_cli.config import get_env_value, save_env_value, load_config
cfg = load_config()
model_cfg = cfg.get("model", {})
if isinstance(model_cfg, dict):
provider = (model_cfg.get("provider") or "").strip().lower()
else:
provider = ""
if provider == "custom" or not provider:
return # custom provider legitimately uses OPENAI_BASE_URL
stale_url = get_env_value("OPENAI_BASE_URL")
if stale_url:
save_env_value("OPENAI_BASE_URL", "")
print(f"Cleared stale OPENAI_BASE_URL from .env (was: {stale_url[:40]}...)"
if len(stale_url) > 40
else f"Cleared stale OPENAI_BASE_URL from .env (was: {stale_url})")
def _prompt_provider_choice(choices, *, default=0):
"""Show provider selection menu with curses arrow-key navigation.
@@ -1301,10 +1127,10 @@ def _model_flow_openrouter(config, current_model=""):
print()
from hermes_cli.models import model_ids, get_pricing_for_provider
openrouter_models = model_ids(force_refresh=True)
openrouter_models = model_ids()
# Fetch live pricing (non-blocking — returns empty dict on failure)
pricing = get_pricing_for_provider("openrouter", force_refresh=True)
pricing = get_pricing_for_provider("openrouter")
selected = _prompt_model_selection(openrouter_models, current_model=current_model, pricing=pricing)
if selected:
@@ -1831,10 +1657,8 @@ def _remove_custom_provider(config):
title="Select provider to remove:",
)
idx = menu.show()
from hermes_cli.curses_ui import flush_stdin
flush_stdin()
print()
except (ImportError, NotImplementedError, OSError, subprocess.SubprocessError):
except (ImportError, NotImplementedError):
for i, c in enumerate(choices, 1):
print(f" {i}. {c}")
print()
@@ -1858,9 +1682,8 @@ def _remove_custom_provider(config):
def _model_flow_named_custom(config, provider_info):
"""Handle a named custom provider from config.yaml custom_providers list.
Always probes the endpoint's /models API to let the user pick a model.
If a model was previously saved, it is pre-selected in the menu.
Falls back to the saved model if probing fails.
If the entry has a saved model name, activates it immediately.
Otherwise probes the endpoint's /models API to let the user pick one.
"""
from hermes_cli.auth import _save_model_choice, deactivate_provider
from hermes_cli.config import load_config, save_config
@@ -1871,46 +1694,54 @@ def _model_flow_named_custom(config, provider_info):
api_key = provider_info.get("api_key", "")
saved_model = provider_info.get("model", "")
# If a model is saved, just activate immediately — no probing needed
if saved_model:
_save_model_choice(saved_model)
cfg = load_config()
model = cfg.get("model")
if not isinstance(model, dict):
model = {"default": model} if model else {}
cfg["model"] = model
model["provider"] = "custom"
model["base_url"] = base_url
if api_key:
model["api_key"] = api_key
save_config(cfg)
deactivate_provider()
print(f"✅ Switched to: {saved_model}")
print(f" Provider: {name} ({base_url})")
return
# No saved model — probe endpoint and let user pick
print(f" Provider: {name}")
print(f" URL: {base_url}")
if saved_model:
print(f" Current: {saved_model}")
print()
print("Fetching available models...")
print("No model saved for this provider. Fetching available models...")
models = fetch_api_models(api_key, base_url, timeout=8.0)
if models:
default_idx = 0
if saved_model and saved_model in models:
default_idx = models.index(saved_model)
print(f"Found {len(models)} model(s):\n")
try:
from simple_term_menu import TerminalMenu
menu_items = [
f" {m} (current)" if m == saved_model else f" {m}"
for m in models
] + [" Cancel"]
menu_items = [f" {m}" for m in models] + [" Cancel"]
menu = TerminalMenu(
menu_items, cursor_index=default_idx,
menu_items, cursor_index=0,
menu_cursor="-> ", menu_cursor_style=("fg_green", "bold"),
menu_highlight_style=("fg_green",),
cycle_cursor=True, clear_screen=False,
title=f"Select model from {name}:",
)
idx = menu.show()
from hermes_cli.curses_ui import flush_stdin
flush_stdin()
print()
if idx is None or idx >= len(models):
print("Cancelled.")
return
model_name = models[idx]
except (ImportError, NotImplementedError, OSError, subprocess.SubprocessError):
except (ImportError, NotImplementedError):
for i, m in enumerate(models, 1):
suffix = " (current)" if m == saved_model else ""
print(f" {i}. {m}{suffix}")
print(f" {i}. {m}")
print(f" {len(models) + 1}. Cancel")
print()
try:
@@ -1926,13 +1757,6 @@ def _model_flow_named_custom(config, provider_info):
except (ValueError, KeyboardInterrupt, EOFError):
print("\nCancelled.")
return
elif saved_model:
print("Could not fetch models from endpoint.")
try:
model_name = input(f"Model name [{saved_model}]: ").strip() or saved_model
except (KeyboardInterrupt, EOFError):
print("\nCancelled.")
return
else:
print("Could not fetch models from endpoint. Enter model name manually.")
try:
@@ -1956,12 +1780,6 @@ def _model_flow_named_custom(config, provider_info):
model["base_url"] = base_url
if api_key:
model["api_key"] = api_key
# Apply api_mode from custom_providers entry, or clear stale value
custom_api_mode = provider_info.get("api_mode", "")
if custom_api_mode:
model["api_mode"] = custom_api_mode
else:
model.pop("api_mode", None) # let runtime auto-detect from URL
save_config(cfg)
deactivate_provider()
@@ -2034,8 +1852,6 @@ def _prompt_reasoning_effort_selection(efforts, current_effort=""):
title="Select reasoning effort:",
)
idx = menu.show()
from hermes_cli.curses_ui import flush_stdin
flush_stdin()
if idx is None:
return None
print()
@@ -2044,7 +1860,7 @@ def _prompt_reasoning_effort_selection(efforts, current_effort=""):
if idx == len(ordered):
return "none"
return None
except (ImportError, NotImplementedError, OSError, subprocess.SubprocessError):
except (ImportError, NotImplementedError):
pass
print("Select reasoning effort:")
@@ -2499,11 +2315,8 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):
print()
override = ""
if override and base_url_env:
if not override.startswith(("http://", "https://")):
print(" Invalid URL — must start with http:// or https://. Keeping current value.")
else:
save_env_value(base_url_env, override)
effective_base = override
save_env_value(base_url_env, override)
effective_base = override
# Model selection — resolution order:
# 1. models.dev registry (cached, filtered for agentic/tool-capable models)
@@ -2678,8 +2491,13 @@ def _model_flow_anthropic(config, current_model=""):
from hermes_cli.models import _PROVIDER_MODELS
# Check ALL credential sources
from hermes_cli.auth import get_anthropic_key
existing_key = get_anthropic_key()
existing_key = (
get_env_value("ANTHROPIC_TOKEN")
or os.getenv("ANTHROPIC_TOKEN", "")
or get_env_value("ANTHROPIC_API_KEY")
or os.getenv("ANTHROPIC_API_KEY", "")
or os.getenv("CLAUDE_CODE_OAUTH_TOKEN", "")
)
cc_available = False
try:
from agent.anthropic_adapter import read_claude_code_credentials, is_claude_code_token_valid
@@ -2834,34 +2652,12 @@ def cmd_dump(args):
run_dump(args)
def cmd_debug(args):
"""Debug tools (share report, etc.)."""
from hermes_cli.debug import run_debug
run_debug(args)
def cmd_config(args):
"""Configuration management."""
from hermes_cli.config import config_command
config_command(args)
def cmd_backup(args):
"""Back up Hermes home directory to a zip file."""
if getattr(args, "quick", False):
from hermes_cli.backup import run_quick_backup
run_quick_backup(args)
else:
from hermes_cli.backup import run_backup
run_backup(args)
def cmd_import(args):
"""Restore a Hermes backup from a zip file."""
from hermes_cli.backup import run_import
run_import(args)
def cmd_version(args):
"""Show version."""
print(f"Hermes Agent v{__version__} ({__release_date__})")
@@ -2980,44 +2776,6 @@ def _gateway_prompt(prompt_text: str, default: str = "", timeout: float = 300.0)
return default
def _build_web_ui(web_dir: Path, *, fatal: bool = False) -> bool:
"""Build the web UI frontend if npm is available.
Args:
web_dir: Path to the ``web/`` source directory.
fatal: If True, print error guidance and return False on failure
instead of a soft warning (used by ``hermes web``).
Returns True if the build succeeded or was skipped (no package.json).
"""
if not (web_dir / "package.json").exists():
return True
import shutil
npm = shutil.which("npm")
if not npm:
if fatal:
print("Web UI frontend not built and npm is not available.")
print("Install Node.js, then run: cd web && npm install && npm run build")
return not fatal
print("→ Building web UI...")
r1 = subprocess.run([npm, "install", "--silent"], cwd=web_dir, capture_output=True)
if r1.returncode != 0:
print(f" {'' if fatal else ''} Web UI npm install failed"
+ ("" if fatal else " (hermes web will not be available)"))
if fatal:
print(" Run manually: cd web && npm install && npm run build")
return False
r2 = subprocess.run([npm, "run", "build"], cwd=web_dir, capture_output=True)
if r2.returncode != 0:
print(f" {'' if fatal else ''} Web UI build failed"
+ ("" if fatal else " (hermes web will not be available)"))
if fatal:
print(" Run manually: cd web && npm install && npm run build")
return False
print(" ✓ Web UI built")
return True
def _update_via_zip(args):
"""Update Hermes Agent by downloading a ZIP archive.
@@ -3112,10 +2870,7 @@ def _update_via_zip(args):
check=True,
)
_install_python_dependencies_with_optional_fallback(pip_cmd)
# Build web UI frontend (optional — requires npm)
_build_web_ui(PROJECT_ROOT / "web")
# Sync skills
try:
from tools.skills_sync import sync_skills
@@ -3266,19 +3021,33 @@ def _restore_stashed_changes(
print("\nYour stashed changes are preserved — nothing is lost.")
print(f" Stash ref: {stash_ref}")
# Always reset to clean state — leaving conflict markers in source
# files makes hermes completely unrunnable (SyntaxError on import).
# The user's changes are safe in the stash for manual recovery.
subprocess.run(
git_cmd + ["reset", "--hard", "HEAD"],
cwd=cwd,
capture_output=True,
)
print("Working tree reset to clean state.")
print(f"Restore your changes later with: git stash apply {stash_ref}")
# Don't sys.exit — the code update itself succeeded, only the stash
# restore had conflicts. Let cmd_update continue with pip install,
# skill sync, and gateway restart.
# Ask before resetting (if interactive)
do_reset = True
if prompt_user:
print("\nReset working tree to clean state so Hermes can run?")
print(" (You can re-apply your changes later with: git stash apply)")
print("[Y/n] ", end="", flush=True)
response = input().strip().lower()
if response not in ("", "y", "yes"):
do_reset = False
if do_reset:
subprocess.run(
git_cmd + ["reset", "--hard", "HEAD"],
cwd=cwd,
capture_output=True,
)
print("Working tree reset to clean state.")
else:
print("Working tree left as-is (may have conflict markers).")
print("Resolve conflicts manually, then run: git stash drop")
print(f"Restore your changes with: git stash apply {stash_ref}")
# In non-interactive mode (gateway /update), don't abort — the code
# update itself succeeded, only the stash restore had conflicts.
# Aborting would report the entire update as failed.
if prompt_user:
sys.exit(1)
return False
stash_selector = _resolve_stash_selector(git_cmd, cwd, stash_ref)
@@ -3539,11 +3308,10 @@ def _invalidate_update_cache():
``hermes update``, every profile is now current.
"""
homes = []
# Default profile home (Docker-aware — uses /opt/data in Docker)
from hermes_constants import get_default_hermes_root
default_home = get_default_hermes_root()
# Default profile home
default_home = Path.home() / ".hermes"
homes.append(default_home)
# Named profiles under <root>/profiles/
# Named profiles under ~/.hermes/profiles/
profiles_root = default_home / "profiles"
if profiles_root.is_dir():
for entry in profiles_root.iterdir():
@@ -3862,10 +3630,7 @@ def cmd_update(args):
if shutil.which("npm"):
print("→ Updating Node.js dependencies...")
subprocess.run(["npm", "install", "--silent"], cwd=PROJECT_ROOT, check=False)
# Build web UI frontend (optional — requires npm)
_build_web_ui(PROJECT_ROOT / "web")
print()
print("✓ Code updated!")
@@ -3993,32 +3758,12 @@ def cmd_update(args):
print()
print("✓ Update complete!")
# Write exit code *before* the gateway restart attempt.
# When running as ``hermes update --gateway`` (spawned by the gateway's
# /update command), this process lives inside the gateway's systemd
# cgroup. ``systemctl restart hermes-gateway`` kills everything in the
# cgroup (KillMode=mixed → SIGKILL to remaining processes), including
# us and the wrapping bash shell. The shell never reaches its
# ``printf $status > .update_exit_code`` epilogue, so the exit-code
# marker file is never created. The new gateway's update watcher then
# polls for 30 minutes and sends a spurious timeout message.
#
# Writing the marker here — after git pull + pip install succeed but
# before we attempt the restart — ensures the new gateway sees it
# regardless of how we die.
if gateway_mode:
_exit_code_path = get_hermes_home() / ".update_exit_code"
try:
_exit_code_path.write_text("0")
except OSError:
pass
# Auto-restart ALL gateways after update.
# The code update (git pull) is shared across all profiles, so every
# running gateway needs restarting to pick up the new code.
try:
from hermes_cli.gateway import (
is_macos, supports_systemd_services, _ensure_user_systemd_env,
is_macos, is_linux, _ensure_user_systemd_env,
find_gateway_pids,
_get_service_pids,
)
@@ -4029,7 +3774,7 @@ def cmd_update(args):
# --- Systemd services (Linux) ---
# Discover all hermes-gateway* units (default + profiles)
if supports_systemd_services():
if is_linux():
try:
_ensure_user_systemd_env()
except Exception:
@@ -4091,7 +3836,7 @@ def cmd_update(args):
# Exclude PIDs that belong to just-restarted services so we don't
# immediately kill the process that systemd/launchd just spawned.
service_pids = _get_service_pids()
manual_pids = find_gateway_pids(exclude_pids=service_pids, all_profiles=True)
manual_pids = find_gateway_pids(exclude_pids=service_pids)
for pid in manual_pids:
try:
os.kill(pid, _signal.SIGTERM)
@@ -4147,7 +3892,7 @@ def _coalesce_session_name_args(argv: list) -> list:
"chat", "model", "gateway", "setup", "whatsapp", "login", "logout", "auth",
"status", "cron", "doctor", "config", "pairing", "skills", "tools",
"mcp", "sessions", "insights", "version", "update", "uninstall",
"profile", "dashboard",
"profile",
}
_SESSION_FLAGS = {"-c", "--continue", "-r", "--resume"}
@@ -4297,24 +4042,15 @@ def cmd_profile(args):
print(f' Add to your shell config (~/.bashrc or ~/.zshrc):')
print(f' export PATH="$HOME/.local/bin:$PATH"')
# Profile dir for display
try:
profile_dir_display = "~/" + str(profile_dir.relative_to(Path.home()))
except ValueError:
profile_dir_display = str(profile_dir)
# Next steps
print(f"\nNext steps:")
print(f" {name} setup Configure API keys and model")
print(f" {name} chat Start chatting")
print(f" {name} gateway start Start the messaging gateway")
if clone or clone_all:
profile_dir_display = f"~/.hermes/profiles/{name}"
print(f"\n Edit {profile_dir_display}/.env for different API keys")
print(f" Edit {profile_dir_display}/SOUL.md for different personality")
else:
print(f"\n ⚠ This profile has no API keys yet. Run '{name} setup' first,")
print(f" or it will inherit keys from your shell environment.")
print(f" Edit {profile_dir_display}/SOUL.md to customize personality")
print()
except (ValueError, FileExistsError, FileNotFoundError) as e:
@@ -4425,27 +4161,6 @@ def cmd_profile(args):
sys.exit(1)
def cmd_dashboard(args):
"""Start the web UI server."""
try:
import fastapi # noqa: F401
import uvicorn # noqa: F401
except ImportError:
print("Web UI dependencies not installed.")
print("Install them with: pip install hermes-agent[web]")
sys.exit(1)
if not _build_web_ui(PROJECT_ROOT / "web", fatal=True):
sys.exit(1)
from hermes_cli.web_server import start_server
start_server(
host=args.host,
port=args.port,
open_browser=not args.no_open,
)
def cmd_completion(args):
"""Print shell completion script."""
from hermes_cli.profiles import generate_bash_completion, generate_zsh_completion
@@ -4473,7 +4188,6 @@ def cmd_logs(args):
level=getattr(args, "level", None),
session=getattr(args, "session", None),
since=getattr(args, "since", None),
component=getattr(args, "component", None),
)
@@ -4511,7 +4225,6 @@ Examples:
hermes logs -f Follow agent.log in real time
hermes logs errors View errors.log
hermes logs --since 1h Lines from the last hour
hermes debug share Upload debug report for support
hermes update Update to latest version
For more help on a command:
@@ -4578,10 +4291,6 @@ For more help on a command:
"-q", "--query",
help="Single query (non-interactive mode)"
)
chat_parser.add_argument(
"--image",
help="Optional local image path to attach to a single query"
)
chat_parser.add_argument(
"-m", "--model",
help="Model to use (e.g., anthropic/claude-sonnet-4)"
@@ -4598,7 +4307,7 @@ For more help on a command:
)
chat_parser.add_argument(
"--provider",
choices=["auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot", "anthropic", "gemini", "huggingface", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "xiaomi"],
choices=["auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot", "anthropic", "gemini", "huggingface", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode"],
default=None,
help="Inference provider (default: auto)"
)
@@ -4724,7 +4433,7 @@ For more help on a command:
gateway_subparsers = gateway_parser.add_subparsers(dest="gateway_command")
# gateway run (default)
gateway_run = gateway_subparsers.add_parser("run", help="Run gateway in foreground (recommended for WSL, Docker, Termux)")
gateway_run = gateway_subparsers.add_parser("run", help="Run gateway in foreground")
gateway_run.add_argument("-v", "--verbose", action="count", default=0,
help="Increase stderr log verbosity (-v=INFO, -vv=DEBUG)")
gateway_run.add_argument("-q", "--quiet", action="store_true",
@@ -4733,7 +4442,7 @@ For more help on a command:
help="Replace any existing gateway instance (useful for systemd)")
# gateway start
gateway_start = gateway_subparsers.add_parser("start", help="Start the installed systemd/launchd background service")
gateway_start = gateway_subparsers.add_parser("start", help="Start gateway service")
gateway_start.add_argument("--system", action="store_true", help="Target the Linux system-level gateway service")
# gateway stop
@@ -4751,7 +4460,7 @@ For more help on a command:
gateway_status.add_argument("--system", action="store_true", help="Target the Linux system-level gateway service")
# gateway install
gateway_install = gateway_subparsers.add_parser("install", help="Install gateway as a systemd/launchd background service")
gateway_install = gateway_subparsers.add_parser("install", help="Install gateway as service")
gateway_install.add_argument("--force", action="store_true", help="Force reinstall")
gateway_install.add_argument("--system", action="store_true", help="Install as a Linux system-level service (starts at boot)")
gateway_install.add_argument("--run-as-user", dest="run_as_user", help="User account the Linux system service should run as")
@@ -4772,12 +4481,12 @@ For more help on a command:
"setup",
help="Interactive setup wizard",
description="Configure Hermes Agent with an interactive wizard. "
"Run a specific section: hermes setup model|tts|terminal|gateway|tools|agent"
"Run a specific section: hermes setup model|terminal|gateway|tools|agent"
)
setup_parser.add_argument(
"section",
nargs="?",
choices=["model", "tts", "terminal", "gateway", "tools", "agent"],
choices=["model", "terminal", "gateway", "tools", "agent"],
default=None,
help="Run a specific setup section instead of the full wizard"
)
@@ -5040,90 +4749,7 @@ For more help on a command:
help="Show redacted API key prefixes (first/last 4 chars) instead of just set/not set"
)
dump_parser.set_defaults(func=cmd_dump)
# =========================================================================
# debug command
# =========================================================================
debug_parser = subparsers.add_parser(
"debug",
help="Debug tools — upload logs and system info for support",
description="Debug utilities for Hermes Agent. Use 'hermes debug share' to "
"upload a debug report (system info + recent logs) to a paste "
"service and get a shareable URL.",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""\
Examples:
hermes debug share Upload debug report and print URL
hermes debug share --lines 500 Include more log lines
hermes debug share --expire 30 Keep paste for 30 days
hermes debug share --local Print report locally (no upload)
""",
)
debug_sub = debug_parser.add_subparsers(dest="debug_command")
share_parser = debug_sub.add_parser(
"share",
help="Upload debug report to a paste service and print a shareable URL",
)
share_parser.add_argument(
"--lines", type=int, default=200,
help="Number of log lines to include per log file (default: 200)",
)
share_parser.add_argument(
"--expire", type=int, default=7,
help="Paste expiry in days (default: 7)",
)
share_parser.add_argument(
"--local", action="store_true",
help="Print the report locally instead of uploading",
)
debug_parser.set_defaults(func=cmd_debug)
# =========================================================================
# backup command
# =========================================================================
backup_parser = subparsers.add_parser(
"backup",
help="Back up Hermes home directory to a zip file",
description="Create a zip archive of your entire Hermes configuration, "
"skills, sessions, and data (excludes the hermes-agent codebase). "
"Use --quick for a fast snapshot of just critical state files."
)
backup_parser.add_argument(
"-o", "--output",
help="Output path for the zip file (default: ~/hermes-backup-<timestamp>.zip)"
)
backup_parser.add_argument(
"-q", "--quick",
action="store_true",
help="Quick snapshot: only critical state files (config, state.db, .env, auth, cron)"
)
backup_parser.add_argument(
"-l", "--label",
help="Label for the snapshot (only used with --quick)"
)
backup_parser.set_defaults(func=cmd_backup)
# =========================================================================
# import command
# =========================================================================
import_parser = subparsers.add_parser(
"import",
help="Restore a Hermes backup from a zip file",
description="Extract a previously created Hermes backup into your "
"Hermes home directory, restoring configuration, skills, "
"sessions, and data"
)
import_parser.add_argument(
"zipfile",
help="Path to the backup zip file"
)
import_parser.add_argument(
"--force", "-f",
action="store_true",
help="Overwrite existing files without confirmation"
)
import_parser.set_defaults(func=cmd_import)
# =========================================================================
# config command
# =========================================================================
@@ -5473,8 +5099,6 @@ Examples:
mcp_add_p.add_argument("--command", help="Stdio command (e.g. npx)")
mcp_add_p.add_argument("--args", nargs="*", default=[], help="Arguments for stdio command")
mcp_add_p.add_argument("--auth", choices=["oauth", "header"], help="Auth method")
mcp_add_p.add_argument("--preset", help="Known MCP preset name")
mcp_add_p.add_argument("--env", nargs="*", default=[], help="Environment variables for stdio servers (KEY=VALUE)")
mcp_rm_p = mcp_sub.add_parser("remove", aliases=["rm"], help="Remove an MCP server")
mcp_rm_p.add_argument("name", help="Server name to remove")
@@ -5737,8 +5361,7 @@ Examples:
claw_migrate = claw_subparsers.add_parser(
"migrate",
help="Migrate from OpenClaw to Hermes",
description="Import settings, memories, skills, and API keys from an OpenClaw installation. "
"Always shows a preview before making changes."
description="Import settings, memories, skills, and API keys from an OpenClaw installation"
)
claw_migrate.add_argument(
"--source",
@@ -5747,7 +5370,7 @@ Examples:
claw_migrate.add_argument(
"--dry-run",
action="store_true",
help="Preview only — stop after showing what would be migrated"
help="Preview what would be migrated without making changes"
)
claw_migrate.add_argument(
"--preset",
@@ -5941,19 +5564,6 @@ Examples:
)
completion_parser.set_defaults(func=cmd_completion)
# =========================================================================
# dashboard command
# =========================================================================
dashboard_parser = subparsers.add_parser(
"dashboard",
help="Start the web UI dashboard",
description="Launch the Hermes Agent web dashboard for managing config, API keys, and sessions",
)
dashboard_parser.add_argument("--port", type=int, default=9119, help="Port (default 9119)")
dashboard_parser.add_argument("--host", default="127.0.0.1", help="Host (default 127.0.0.1)")
dashboard_parser.add_argument("--no-open", action="store_true", help="Don't open browser automatically")
dashboard_parser.set_defaults(func=cmd_dashboard)
# =========================================================================
# logs command
# =========================================================================
@@ -5970,7 +5580,6 @@ Examples:
hermes logs gateway -n 100 Show last 100 lines of gateway.log
hermes logs --level WARNING Only show WARNING and above
hermes logs --session abc123 Filter by session ID
hermes logs --component tools Only show tool-related lines
hermes logs --since 1h Lines from the last hour
hermes logs --since 30m -f Follow, starting from 30 min ago
hermes logs list List available log files with sizes
@@ -6000,10 +5609,6 @@ Examples:
"--since", metavar="TIME",
help="Show lines since TIME ago (e.g. 1h, 30m, 2d)",
)
logs_parser.add_argument(
"--component", metavar="NAME",
help="Filter by component: gateway, agent, tools, cli, cron",
)
logs_parser.set_defaults(func=cmd_logs)
# =========================================================================
@@ -6012,22 +5617,9 @@ Examples:
# Pre-process argv so unquoted multi-word session names after -c / -r
# are merged into a single token before argparse sees them.
# e.g. ``hermes -c Pokemon Agent Dev`` → ``hermes -c 'Pokemon Agent Dev'``
# ── Container-aware routing ────────────────────────────────────────
# When NixOS container mode is active, route ALL subcommands into
# the managed container. This MUST run before parse_args() so that
# --help, unrecognised flags, and every subcommand are forwarded
# transparently instead of being intercepted by argparse on the host.
from hermes_cli.config import get_container_exec_info
container_info = get_container_exec_info()
if container_info:
_exec_in_container(container_info, sys.argv[1:])
# Unreachable: os.execvp never returns on success (process is replaced)
# and raises OSError on failure (which propagates as a traceback).
sys.exit(1)
_processed_argv = _coalesce_session_name_args(sys.argv[1:])
args = parser.parse_args(_processed_argv)
# Handle --version flag
if args.version:
cmd_version(args)
+16 -87
View File
@@ -9,6 +9,7 @@ configuration in ~/.hermes/config.yaml under the ``mcp_servers`` key.
"""
import asyncio
import getpass
import logging
import os
import re
@@ -27,11 +28,6 @@ from hermes_constants import display_hermes_home
logger = logging.getLogger(__name__)
_ENV_VAR_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_MCP_PRESETS: Dict[str, Dict[str, Any]] = {}
# ─── UI Helpers ───────────────────────────────────────────────────────────────
@@ -61,8 +57,19 @@ def _confirm(question: str, default: bool = True) -> bool:
def _prompt(question: str, *, password: bool = False, default: str = "") -> str:
from hermes_cli.cli_output import prompt as _shared_prompt
return _shared_prompt(question, default=default, password=password)
display = f" {question}"
if default:
display += f" [{default}]"
display += ": "
try:
if password:
value = getpass.getpass(color(display, Colors.YELLOW))
else:
value = input(color(display, Colors.YELLOW))
return value.strip() or default
except (KeyboardInterrupt, EOFError):
print()
return default
# ─── Config Helpers ───────────────────────────────────────────────────────────
@@ -102,59 +109,6 @@ def _env_key_for_server(name: str) -> str:
return f"MCP_{name.upper().replace('-', '_')}_API_KEY"
def _parse_env_assignments(raw_env: Optional[List[str]]) -> Dict[str, str]:
"""Parse ``KEY=VALUE`` strings from CLI args into an env dict."""
parsed: Dict[str, str] = {}
for item in raw_env or []:
text = str(item or "").strip()
if not text:
continue
if "=" not in text:
raise ValueError(f"Invalid --env value '{text}' (expected KEY=VALUE)")
key, value = text.split("=", 1)
key = key.strip()
if not key:
raise ValueError(f"Invalid --env value '{text}' (missing variable name)")
if not _ENV_VAR_NAME_RE.match(key):
raise ValueError(f"Invalid --env variable name '{key}'")
parsed[key] = value
return parsed
def _apply_mcp_preset(
name: str,
*,
preset_name: Optional[str],
url: Optional[str],
command: Optional[str],
cmd_args: List[str],
server_config: Dict[str, Any],
) -> tuple[Optional[str], Optional[str], List[str], bool]:
"""Apply a known MCP preset when transport details were omitted."""
if not preset_name:
return url, command, cmd_args, False
preset = _MCP_PRESETS.get(preset_name)
if not preset:
raise ValueError(f"Unknown MCP preset: {preset_name}")
if url or command:
return url, command, cmd_args, False
url = preset.get("url")
command = preset.get("command")
cmd_args = list(preset.get("args") or [])
if url:
server_config["url"] = url
if command:
server_config["command"] = command
if cmd_args:
server_config["args"] = cmd_args
return url, command, cmd_args, True
# ─── Discovery (temporary connect) ───────────────────────────────────────────
def _probe_single_server(
@@ -223,35 +177,13 @@ def cmd_mcp_add(args):
command = getattr(args, "command", None)
cmd_args = getattr(args, "args", None) or []
auth_type = getattr(args, "auth", None)
preset_name = getattr(args, "preset", None)
raw_env = getattr(args, "env", None)
server_config: Dict[str, Any] = {}
try:
explicit_env = _parse_env_assignments(raw_env)
url, command, cmd_args, _preset_applied = _apply_mcp_preset(
name,
preset_name=preset_name,
url=url,
command=command,
cmd_args=list(cmd_args),
server_config=server_config,
)
except ValueError as exc:
_error(str(exc))
return
if url and explicit_env:
_error("--env is only supported for stdio MCP servers (--command or stdio presets)")
return
# Validate transport
if not url and not command:
_error("Must specify --url <endpoint>, --command <cmd>, or --preset <name>")
_error("Must specify --url <endpoint> or --command <cmd>")
_info("Examples:")
_info(' hermes mcp add ink --url "https://mcp.ml.ink/mcp"')
_info(' hermes mcp add github --command npx --args @modelcontextprotocol/server-github')
_info(' hermes mcp add myserver --preset mypreset')
return
# Check if server already exists
@@ -262,15 +194,13 @@ def cmd_mcp_add(args):
return
# Build initial config
server_config: Dict[str, Any] = {}
if url:
server_config["url"] = url
else:
server_config["command"] = command
if cmd_args:
server_config["args"] = cmd_args
if explicit_env:
server_config["env"] = explicit_env
# ── Authentication ────────────────────────────────────────────────
@@ -708,7 +638,6 @@ def mcp_command(args):
_info("hermes mcp serve Run as MCP server")
_info("hermes mcp add <name> --url <endpoint> Add an MCP server")
_info("hermes mcp add <name> --command <cmd> Add a stdio server")
_info("hermes mcp add <name> --preset <preset> Add from a known preset")
_info("hermes mcp remove <name> Remove a server")
_info("hermes mcp list List servers")
_info("hermes mcp test <name> Test connection")
+79 -7
View File
@@ -25,13 +25,85 @@ def _curses_select(title: str, items: list[tuple[str, str]], default: int = 0) -
items: list of (label, description) tuples.
Returns selected index, or default on escape/quit.
"""
from hermes_cli.curses_ui import curses_radiolist
# Format (label, desc) tuples into display strings
display_items = [
f"{label} {desc}" if desc else label
for label, desc in items
]
return curses_radiolist(title, display_items, selected=default, cancel_returns=default)
try:
import curses
result = [default]
def _menu(stdscr):
curses.curs_set(0)
if curses.has_colors():
curses.start_color()
curses.use_default_colors()
curses.init_pair(1, curses.COLOR_GREEN, -1)
curses.init_pair(2, curses.COLOR_YELLOW, -1)
curses.init_pair(3, curses.COLOR_CYAN, -1)
cursor = default
while True:
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
# Title
try:
stdscr.addnstr(0, 0, title, max_x - 1,
curses.A_BOLD | (curses.color_pair(2) if curses.has_colors() else 0))
stdscr.addnstr(1, 0, " ↑↓ navigate ⏎ select q quit", max_x - 1,
curses.color_pair(3) if curses.has_colors() else curses.A_DIM)
except curses.error:
pass
for i, (label, desc) in enumerate(items):
y = i + 3
if y >= max_y - 1:
break
arrow = "" if i == cursor else " "
line = f" {arrow} {label}"
if desc:
line += f" {desc}"
attr = curses.A_NORMAL
if i == cursor:
attr = curses.A_BOLD
if curses.has_colors():
attr |= curses.color_pair(1)
try:
stdscr.addnstr(y, 0, line[:max_x - 1], max_x - 1, attr)
except curses.error:
pass
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord('k')):
cursor = (cursor - 1) % len(items)
elif key in (curses.KEY_DOWN, ord('j')):
cursor = (cursor + 1) % len(items)
elif key in (curses.KEY_ENTER, 10, 13):
result[0] = cursor
return
elif key in (27, ord('q')):
return
curses.wrapper(_menu)
return result[0]
except Exception:
# Fallback: numbered input
print(f"\n {title}\n")
for i, (label, desc) in enumerate(items):
marker = "" if i == default else " "
d = f" {desc}" if desc else ""
print(f" {marker} {i + 1}. {label}{d}")
while True:
try:
val = input(f"\n Select [1-{len(items)}] ({default + 1}): ")
if not val:
return default
idx = int(val) - 1
if 0 <= idx < len(items):
return idx
except (ValueError, EOFError):
return default
def _prompt(label: str, default: str | None = None, secret: bool = False) -> str:
+13 -82
View File
@@ -8,9 +8,8 @@ Different LLM providers expect model identifiers in different formats:
hyphens: ``claude-sonnet-4-6``.
- **Copilot** expects bare names *with* dots preserved:
``claude-sonnet-4.6``.
- **OpenCode Zen** preserves dots for GPT/GLM/Gemini/Kimi/MiniMax-style
model IDs, but Claude still uses hyphenated native names like
``claude-sonnet-4-6``.
- **OpenCode Zen** follows the same dot-to-hyphen convention as
Anthropic: ``claude-sonnet-4-6``.
- **OpenCode Go** preserves dots in model names: ``minimax-m2.7``.
- **DeepSeek** only accepts two model identifiers:
``deepseek-chat`` and ``deepseek-reasoner``.
@@ -68,31 +67,26 @@ _AGGREGATOR_PROVIDERS: frozenset[str] = frozenset({
# Providers that want bare names with dots replaced by hyphens.
_DOT_TO_HYPHEN_PROVIDERS: frozenset[str] = frozenset({
"anthropic",
"opencode-zen",
})
# Providers that want bare names with dots preserved.
_STRIP_VENDOR_ONLY_PROVIDERS: frozenset[str] = frozenset({
"copilot",
"copilot-acp",
"openai-codex",
})
# Providers whose native naming is authoritative -- pass through unchanged.
_AUTHORITATIVE_NATIVE_PROVIDERS: frozenset[str] = frozenset({
# Providers whose own naming is authoritative -- pass through unchanged.
_PASSTHROUGH_PROVIDERS: frozenset[str] = frozenset({
"gemini",
"huggingface",
})
# Direct providers that accept bare native names but should repair a matching
# provider/ prefix when users copy the aggregator form into config.yaml.
_MATCHING_PREFIX_STRIP_PROVIDERS: frozenset[str] = frozenset({
"zai",
"kimi-coding",
"minimax",
"minimax-cn",
"alibaba",
"qwen-oauth",
"xiaomi",
"huggingface",
"openai-codex",
"custom",
})
@@ -174,40 +168,6 @@ def _dots_to_hyphens(model_name: str) -> str:
return model_name.replace(".", "-")
def _normalize_provider_alias(provider_name: str) -> str:
"""Resolve provider aliases to Hermes' canonical ids."""
raw = (provider_name or "").strip().lower()
if not raw:
return raw
try:
from hermes_cli.models import normalize_provider
return normalize_provider(raw)
except Exception:
return raw
def _strip_matching_provider_prefix(model_name: str, target_provider: str) -> str:
"""Strip ``provider/`` only when the prefix matches the target provider.
This prevents arbitrary slash-bearing model IDs from being mangled on
native providers while still repairing manual config values like
``zai/glm-5.1`` for the ``zai`` provider.
"""
if "/" not in model_name:
return model_name
prefix, remainder = model_name.split("/", 1)
if not prefix.strip() or not remainder.strip():
return model_name
normalized_prefix = _normalize_provider_alias(prefix)
normalized_target = _normalize_provider_alias(target_provider)
if normalized_prefix and normalized_prefix == normalized_target:
return remainder.strip()
return model_name
def detect_vendor(model_name: str) -> Optional[str]:
"""Detect the vendor slug from a bare model name.
@@ -329,9 +289,6 @@ def normalize_model_for_provider(model_input: str, target_provider: str) -> str:
>>> normalize_model_for_provider("claude-sonnet-4.6", "opencode-zen")
'claude-sonnet-4-6'
>>> normalize_model_for_provider("minimax-m2.5-free", "opencode-zen")
'minimax-m2.5-free'
>>> normalize_model_for_provider("deepseek-v3", "deepseek")
'deepseek-chat'
@@ -348,50 +305,24 @@ def normalize_model_for_provider(model_input: str, target_provider: str) -> str:
if not name:
return name
provider = _normalize_provider_alias(target_provider)
provider = (target_provider or "").strip().lower()
# --- Aggregators: need vendor/model format ---
if provider in _AGGREGATOR_PROVIDERS:
return _prepend_vendor(name)
# --- OpenCode Zen: Claude stays hyphenated; other models keep dots ---
if provider == "opencode-zen":
bare = _strip_matching_provider_prefix(name, provider)
if "/" in bare:
return bare
if bare.lower().startswith("claude-"):
return _dots_to_hyphens(bare)
return bare
# --- Anthropic: strip matching provider prefix, dots -> hyphens ---
# --- Anthropic / OpenCode: strip vendor, dots -> hyphens ---
if provider in _DOT_TO_HYPHEN_PROVIDERS:
bare = _strip_matching_provider_prefix(name, provider)
if "/" in bare:
return bare
bare = _strip_vendor_prefix(name)
return _dots_to_hyphens(bare)
# --- Copilot: strip matching provider prefix, keep dots ---
# --- Copilot: strip vendor, keep dots ---
if provider in _STRIP_VENDOR_ONLY_PROVIDERS:
stripped = _strip_matching_provider_prefix(name, provider)
if stripped == name and name.startswith("openai/"):
# openai-codex maps openai/gpt-5.4 -> gpt-5.4
return name.split("/", 1)[1]
return stripped
return _strip_vendor_prefix(name)
# --- DeepSeek: map to one of two canonical names ---
if provider == "deepseek":
bare = _strip_matching_provider_prefix(name, provider)
if "/" in bare:
return bare
return _normalize_for_deepseek(bare)
# --- Direct providers: repair matching provider prefixes only ---
if provider in _MATCHING_PREFIX_STRIP_PROVIDERS:
return _strip_matching_provider_prefix(name, provider)
# --- Authoritative native providers: preserve user-facing slugs as-is ---
if provider in _AUTHORITATIVE_NATIVE_PROVIDERS:
return name
return _normalize_for_deepseek(name)
# --- Custom & all others: pass through as-is ---
return name
+12 -157
View File
@@ -21,12 +21,10 @@ OpenRouter variant suffixes (``:free``, ``:extended``, ``:fast``).
from __future__ import annotations
import logging
import re
from dataclasses import dataclass
from typing import List, NamedTuple, Optional
from hermes_cli.providers import (
custom_provider_slug,
determine_api_mode,
get_label,
is_aggregator,
@@ -58,36 +56,10 @@ _HERMES_MODEL_WARNING = (
"(Claude, GPT, Gemini, DeepSeek, etc.)."
)
# Match only the real Nous Research Hermes 3 / Hermes 4 chat families.
# The previous substring check (`"hermes" in name.lower()`) false-positived on
# unrelated local Modelfiles like ``hermes-brain:qwen3-14b-ctx16k`` that just
# happen to carry "hermes" in their tag but are fully tool-capable.
#
# Positive examples the regex must match:
# NousResearch/Hermes-3-Llama-3.1-70B, hermes-4-405b, openrouter/hermes3:70b
# Negative examples it must NOT match:
# hermes-brain:qwen3-14b-ctx16k, qwen3:14b, claude-opus-4-6
_NOUS_HERMES_NON_AGENTIC_RE = re.compile(
r"(?:^|[/:])hermes[-_ ]?[34](?:[-_.:]|$)",
re.IGNORECASE,
)
def is_nous_hermes_non_agentic(model_name: str) -> bool:
"""Return True if *model_name* is a real Nous Hermes 3/4 chat model.
Used to decide whether to surface the non-agentic warning at startup.
Callers in :mod:`cli.py` and here should go through this single helper
so the two sites don't drift.
"""
if not model_name:
return False
return bool(_NOUS_HERMES_NON_AGENTIC_RE.search(model_name))
def _check_hermes_model_warning(model_name: str) -> str:
"""Return a warning string if *model_name* is a Nous Hermes 3/4 chat model."""
if is_nous_hermes_non_agentic(model_name):
"""Return a warning string if *model_name* looks like a Hermes LLM model."""
if "hermes" in model_name.lower():
return _HERMES_MODEL_WARNING
return ""
@@ -364,7 +336,6 @@ def resolve_alias(
def get_authenticated_provider_slugs(
current_provider: str = "",
user_providers: dict = None,
custom_providers: list | None = None,
) -> list[str]:
"""Return slugs of providers that have credentials.
@@ -375,7 +346,6 @@ def get_authenticated_provider_slugs(
providers = list_authenticated_providers(
current_provider=current_provider,
user_providers=user_providers,
custom_providers=custom_providers,
max_models=0,
)
return [p["slug"] for p in providers]
@@ -413,7 +383,6 @@ def switch_model(
is_global: bool = False,
explicit_provider: str = "",
user_providers: dict = None,
custom_providers: list | None = None,
) -> ModelSwitchResult:
"""Core model-switching pipeline shared between CLI and gateway.
@@ -447,7 +416,6 @@ def switch_model(
is_global: Whether to persist the switch.
explicit_provider: From --provider flag (empty = no explicit provider).
user_providers: The ``providers:`` dict from config.yaml (for user endpoints).
custom_providers: The ``custom_providers:`` list from config.yaml.
Returns:
ModelSwitchResult with all information the caller needs.
@@ -468,11 +436,7 @@ def switch_model(
# =================================================================
if explicit_provider:
# Resolve the provider
pdef = resolve_provider_full(
explicit_provider,
user_providers,
custom_providers,
)
pdef = resolve_provider_full(explicit_provider, user_providers)
if pdef is None:
_switch_err = (
f"Unknown provider '{explicit_provider}'. "
@@ -552,7 +516,6 @@ def switch_model(
authed = get_authenticated_provider_slugs(
current_provider=current_provider,
user_providers=user_providers,
custom_providers=custom_providers,
)
fallback_result = _resolve_alias_fallback(raw_input, authed)
if fallback_result is not None:
@@ -627,14 +590,6 @@ def switch_model(
provider_changed = target_provider != current_provider
provider_label = get_label(target_provider)
if target_provider.startswith("custom:"):
custom_pdef = resolve_provider_full(
target_provider,
user_providers,
custom_providers,
)
if custom_pdef is not None:
provider_label = custom_pdef.name
# --- Resolve credentials ---
api_key = current_api_key
@@ -753,7 +708,6 @@ def switch_model(
def list_authenticated_providers(
current_provider: str = "",
user_providers: dict = None,
custom_providers: list | None = None,
max_models: int = 8,
) -> List[dict]:
"""Detect which providers have credentials and list their curated models.
@@ -836,104 +790,42 @@ def list_authenticated_providers(
})
seen_slugs.add(slug)
# --- 2. Check Hermes-only providers (nous, openai-codex, copilot, opencode-go) ---
# --- 2. Check Hermes-only providers (nous, openai-codex, copilot) ---
from hermes_cli.providers import HERMES_OVERLAYS
from hermes_cli.auth import PROVIDER_REGISTRY as _auth_registry
# Build reverse mapping: models.dev ID → Hermes provider ID.
# HERMES_OVERLAYS keys may be models.dev IDs (e.g. "github-copilot")
# while _PROVIDER_MODELS and config.yaml use Hermes IDs ("copilot").
_mdev_to_hermes = {v: k for k, v in PROVIDER_TO_MODELS_DEV.items()}
for pid, overlay in HERMES_OVERLAYS.items():
if pid in seen_slugs:
continue
# Resolve Hermes slug — e.g. "github-copilot" → "copilot"
hermes_slug = _mdev_to_hermes.get(pid, pid)
if hermes_slug in seen_slugs:
continue
# Check if credentials exist
has_creds = False
if overlay.extra_env_vars:
has_creds = any(os.environ.get(ev) for ev in overlay.extra_env_vars)
# Also check api_key_env_vars from PROVIDER_REGISTRY for api_key auth_type
if not has_creds and overlay.auth_type == "api_key":
for _key in (pid, hermes_slug):
pcfg = _auth_registry.get(_key)
if pcfg and pcfg.api_key_env_vars:
if any(os.environ.get(ev) for ev in pcfg.api_key_env_vars):
has_creds = True
break
# Check auth store and credential pool for non-env-var credentials.
# This applies to OAuth providers AND api_key providers that also
# support OAuth (e.g. anthropic supports both API key and Claude Code
# OAuth via external credential files).
if not has_creds:
if overlay.auth_type in ("oauth_device_code", "oauth_external", "external_process"):
# These use auth stores, not env vars — check for auth.json entries
try:
from hermes_cli.auth import _load_auth_store
store = _load_auth_store()
providers_store = store.get("providers", {})
pool_store = store.get("credential_pool", {})
if store and (
pid in providers_store or hermes_slug in providers_store
or pid in pool_store or hermes_slug in pool_store
):
if store and (pid in store.get("providers", {}) or pid in store.get("credential_pool", {})):
has_creds = True
except Exception as exc:
logger.debug("Auth store check failed for %s: %s", pid, exc)
# Fallback: check the credential pool with full auto-seeding.
# This catches credentials that exist in external stores (e.g.
# Codex CLI ~/.codex/auth.json) which _seed_from_singletons()
# imports on demand but aren't in the raw auth.json yet.
if not has_creds:
try:
from agent.credential_pool import load_pool
pool = load_pool(hermes_slug)
if pool.has_credentials():
has_creds = True
except Exception as exc:
logger.debug("Credential pool check failed for %s: %s", hermes_slug, exc)
# Fallback: check external credential files directly.
# The credential pool gates anthropic behind
# is_provider_explicitly_configured() to prevent auxiliary tasks
# from silently consuming Claude Code tokens (PR #4210).
# But the /model picker is discovery-oriented — we WANT to show
# providers the user can switch to, even if they aren't currently
# configured.
if not has_creds and hermes_slug == "anthropic":
try:
from agent.anthropic_adapter import (
read_claude_code_credentials,
read_hermes_oauth_credentials,
)
hermes_creds = read_hermes_oauth_credentials()
cc_creds = read_claude_code_credentials()
if (hermes_creds and hermes_creds.get("accessToken")) or \
(cc_creds and cc_creds.get("accessToken")):
has_creds = True
except Exception as exc:
logger.debug("Anthropic external creds check failed: %s", exc)
if not has_creds:
continue
# Use curated list — look up by Hermes slug, fall back to overlay key
model_ids = curated.get(hermes_slug, []) or curated.get(pid, [])
# Use curated list
model_ids = curated.get(pid, [])
total = len(model_ids)
top = model_ids[:max_models]
results.append({
"slug": hermes_slug,
"name": get_label(hermes_slug),
"is_current": hermes_slug == current_provider or pid == current_provider,
"slug": pid,
"name": get_label(pid),
"is_current": pid == current_provider,
"is_user_defined": False,
"models": top,
"total_models": total,
"source": "hermes",
})
seen_slugs.add(pid)
seen_slugs.add(hermes_slug)
# --- 3. User-defined endpoints from config ---
if user_providers and isinstance(user_providers, dict):
@@ -961,43 +853,6 @@ def list_authenticated_providers(
"api_url": api_url,
})
# --- 4. Saved custom providers from config ---
if custom_providers and isinstance(custom_providers, list):
for entry in custom_providers:
if not isinstance(entry, dict):
continue
display_name = (entry.get("name") or "").strip()
api_url = (
entry.get("base_url", "")
or entry.get("url", "")
or entry.get("api", "")
or ""
).strip()
if not display_name or not api_url:
continue
slug = custom_provider_slug(display_name)
if slug in seen_slugs:
continue
models_list = []
default_model = (entry.get("model") or "").strip()
if default_model:
models_list.append(default_model)
results.append({
"slug": slug,
"name": display_name,
"is_current": slug == current_provider,
"is_user_defined": True,
"models": models_list,
"total_models": len(models_list),
"source": "user-config",
"api_url": api_url,
})
seen_slugs.add(slug)
# Sort: current provider first, then by model count descending
results.sort(key=lambda r: (not r["is_current"], -r["total_models"]))
+32 -254
View File
@@ -20,20 +20,18 @@ COPILOT_EDITOR_VERSION = "vscode/1.104.1"
COPILOT_REASONING_EFFORTS_GPT5 = ["minimal", "low", "medium", "high"]
COPILOT_REASONING_EFFORTS_O_SERIES = ["low", "medium", "high"]
# Fallback OpenRouter snapshot used when the live catalog is unavailable.
# (model_id, display description shown in menus)
OPENROUTER_MODELS: list[tuple[str, str]] = [
("anthropic/claude-opus-4.6", "recommended"),
("anthropic/claude-sonnet-4.6", ""),
("qwen/qwen3.6-plus", ""),
("qwen/qwen3.6-plus:free", "free"),
("anthropic/claude-sonnet-4.5", ""),
("anthropic/claude-haiku-4.5", ""),
("openai/gpt-5.4", ""),
("openai/gpt-5.4-mini", ""),
("xiaomi/mimo-v2-pro", ""),
("openai/gpt-5.3-codex", ""),
("google/gemini-3-pro-image-preview", ""),
("google/gemini-3-pro-preview", ""),
("google/gemini-3-flash-preview", ""),
("google/gemini-3.1-pro-preview", ""),
("google/gemini-3.1-flash-lite-preview", ""),
@@ -45,7 +43,7 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("z-ai/glm-5.1", ""),
("z-ai/glm-5-turbo", ""),
("moonshotai/kimi-k2.5", ""),
("x-ai/grok-4.20", ""),
("x-ai/grok-4.20-beta", ""),
("nvidia/nemotron-3-super-120b-a12b", ""),
("nvidia/nemotron-3-super-120b-a12b:free", "free"),
("arcee-ai/trinity-large-preview:free", "free"),
@@ -54,29 +52,15 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("openai/gpt-5.4-nano", ""),
]
_openrouter_catalog_cache: list[tuple[str, str]] | None = None
def _codex_curated_models() -> list[str]:
"""Derive the openai-codex curated list from codex_models.py.
Single source of truth: DEFAULT_CODEX_MODELS + forward-compat synthesis.
This keeps the gateway /model picker in sync with the CLI `hermes model`
flow without maintaining a separate static list.
"""
from hermes_cli.codex_models import DEFAULT_CODEX_MODELS, _add_forward_compat_models
return _add_forward_compat_models(list(DEFAULT_CODEX_MODELS))
_PROVIDER_MODELS: dict[str, list[str]] = {
"nous": [
"xiaomi/mimo-v2-pro",
"anthropic/claude-opus-4.6",
"anthropic/claude-sonnet-4.6",
"anthropic/claude-sonnet-4.5",
"anthropic/claude-haiku-4.5",
"openai/gpt-5.4",
"openai/gpt-5.4-mini",
"xiaomi/mimo-v2-pro",
"openai/gpt-5.3-codex",
"google/gemini-3-pro-preview",
"google/gemini-3-flash-preview",
@@ -98,7 +82,12 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"openai/gpt-5.4-pro",
"openai/gpt-5.4-nano",
],
"openai-codex": _codex_curated_models(),
"openai-codex": [
"gpt-5.3-codex",
"gpt-5.2-codex",
"gpt-5.1-codex-mini",
"gpt-5.1-codex-max",
],
"copilot-acp": [
"copilot-acp",
],
@@ -130,26 +119,12 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"gemma-4-26b-it",
],
"zai": [
"glm-5.1",
"glm-5",
"glm-5-turbo",
"glm-4.7",
"glm-4.5",
"glm-4.5-flash",
],
"xai": [
"grok-4.20-0309-reasoning",
"grok-4.20-0309-non-reasoning",
"grok-4.20-multi-agent-0309",
"grok-4-1-fast-reasoning",
"grok-4-1-fast-non-reasoning",
"grok-4-fast-reasoning",
"grok-4-fast-non-reasoning",
"grok-4-0709",
"grok-code-fast-1",
"grok-3",
"grok-3-mini",
],
"kimi-coding": [
"kimi-for-coding",
"kimi-k2.5",
@@ -165,16 +140,22 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"kimi-k2-0905-preview",
],
"minimax": [
"MiniMax-M2.7",
"MiniMax-M1",
"MiniMax-M1-40k",
"MiniMax-M1-80k",
"MiniMax-M1-128k",
"MiniMax-M1-256k",
"MiniMax-M2.5",
"MiniMax-M2.1",
"MiniMax-M2",
"MiniMax-M2.7",
],
"minimax-cn": [
"MiniMax-M2.7",
"MiniMax-M1",
"MiniMax-M1-40k",
"MiniMax-M1-80k",
"MiniMax-M1-128k",
"MiniMax-M1-256k",
"MiniMax-M2.5",
"MiniMax-M2.1",
"MiniMax-M2",
"MiniMax-M2.7",
],
"anthropic": [
"claude-opus-4-6",
@@ -189,11 +170,6 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"deepseek-chat",
"deepseek-reasoner",
],
"xiaomi": [
"mimo-v2-pro",
"mimo-v2-omni",
"mimo-v2-flash",
],
"opencode-zen": [
"gpt-5.4-pro",
"gpt-5.4",
@@ -499,7 +475,6 @@ _PROVIDER_LABELS = {
"alibaba": "Alibaba Cloud (DashScope)",
"qwen-oauth": "Qwen OAuth (Portal)",
"huggingface": "Hugging Face",
"xiaomi": "Xiaomi MiMo",
"custom": "Custom endpoint",
}
@@ -542,101 +517,12 @@ _PROVIDER_ALIASES = {
"hf": "huggingface",
"hugging-face": "huggingface",
"huggingface-hub": "huggingface",
"mimo": "xiaomi",
"xiaomi-mimo": "xiaomi",
}
def get_default_model_for_provider(provider: str) -> str:
"""Return the default model for a provider, or empty string if unknown.
Uses the first entry in _PROVIDER_MODELS as the default. This is the
model a user would be offered first in the ``hermes model`` picker.
Used as a fallback when the user has configured a provider but never
selected a model (e.g. ``hermes auth add openai-codex`` without
``hermes model``).
"""
models = _PROVIDER_MODELS.get(provider, [])
return models[0] if models else ""
def _openrouter_model_is_free(pricing: Any) -> bool:
"""Return True when both prompt and completion pricing are zero."""
if not isinstance(pricing, dict):
return False
try:
return float(pricing.get("prompt", "0")) == 0 and float(pricing.get("completion", "0")) == 0
except (TypeError, ValueError):
return False
def fetch_openrouter_models(
timeout: float = 8.0,
*,
force_refresh: bool = False,
) -> list[tuple[str, str]]:
"""Return the curated OpenRouter picker list, refreshed from the live catalog when possible."""
global _openrouter_catalog_cache
if _openrouter_catalog_cache is not None and not force_refresh:
return list(_openrouter_catalog_cache)
fallback = list(OPENROUTER_MODELS)
preferred_ids = [mid for mid, _ in fallback]
try:
req = urllib.request.Request(
"https://openrouter.ai/api/v1/models",
headers={"Accept": "application/json"},
)
with urllib.request.urlopen(req, timeout=timeout) as resp:
payload = json.loads(resp.read().decode())
except Exception:
return list(_openrouter_catalog_cache or fallback)
live_items = payload.get("data", [])
if not isinstance(live_items, list):
return list(_openrouter_catalog_cache or fallback)
live_by_id: dict[str, dict[str, Any]] = {}
for item in live_items:
if not isinstance(item, dict):
continue
mid = str(item.get("id") or "").strip()
if not mid:
continue
live_by_id[mid] = item
curated: list[tuple[str, str]] = []
for preferred_id in preferred_ids:
live_item = live_by_id.get(preferred_id)
if live_item is None:
continue
desc = "free" if _openrouter_model_is_free(live_item.get("pricing")) else ""
curated.append((preferred_id, desc))
if not curated:
return list(_openrouter_catalog_cache or fallback)
first_id, _ = curated[0]
curated[0] = (first_id, "recommended")
_openrouter_catalog_cache = curated
return list(curated)
def model_ids(*, force_refresh: bool = False) -> list[str]:
def model_ids() -> list[str]:
"""Return just the OpenRouter model-id strings."""
return [mid for mid, _ in fetch_openrouter_models(force_refresh=force_refresh)]
def menu_labels(*, force_refresh: bool = False) -> list[str]:
"""Return display labels like 'anthropic/claude-opus-4.6 (recommended)'."""
labels = []
for mid, desc in fetch_openrouter_models(force_refresh=force_refresh):
labels.append(f"{mid} ({desc})" if desc else mid)
return labels
return [mid for mid, _ in OPENROUTER_MODELS]
# ---------------------------------------------------------------------------
@@ -798,14 +684,13 @@ def _resolve_nous_pricing_credentials() -> tuple[str, str]:
return ("", "")
def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> dict[str, dict[str, str]]:
def get_pricing_for_provider(provider: str) -> dict[str, dict[str, str]]:
"""Return live pricing for providers that support it (openrouter, nous)."""
normalized = normalize_provider(provider)
if normalized == "openrouter":
return fetch_models_with_pricing(
api_key=_resolve_openrouter_api_key(),
base_url="https://openrouter.ai/api",
force_refresh=force_refresh,
)
if normalized == "nous":
api_key, base_url = _resolve_nous_pricing_credentials()
@@ -818,7 +703,6 @@ def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> d
return fetch_models_with_pricing(
api_key=api_key,
base_url=stripped,
force_refresh=force_refresh,
)
return {}
@@ -842,7 +726,7 @@ def list_available_providers() -> list[dict[str, str]]:
"openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
"gemini", "huggingface",
"zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic", "alibaba",
"qwen-oauth", "xiaomi",
"qwen-oauth",
"opencode-zen", "opencode-go",
"ai-gateway", "deepseek", "custom",
]
@@ -927,11 +811,7 @@ def _get_custom_base_url() -> str:
return ""
def curated_models_for_provider(
provider: Optional[str],
*,
force_refresh: bool = False,
) -> list[tuple[str, str]]:
def curated_models_for_provider(provider: Optional[str]) -> list[tuple[str, str]]:
"""Return ``(model_id, description)`` tuples for a provider's model list.
Tries to fetch the live model list from the provider's API first,
@@ -940,7 +820,7 @@ def curated_models_for_provider(
"""
normalized = normalize_provider(provider)
if normalized == "openrouter":
return fetch_openrouter_models(force_refresh=force_refresh)
return list(OPENROUTER_MODELS)
# Try live API first (Codex, Nous, etc. all support /models)
live = provider_model_ids(normalized)
@@ -1059,12 +939,12 @@ def _find_openrouter_slug(model_name: str) -> Optional[str]:
return None
# Exact match (already has provider/ prefix)
for mid in model_ids():
for mid, _ in OPENROUTER_MODELS:
if name_lower == mid.lower():
return mid
# Try matching just the model part (after the /)
for mid in model_ids():
for mid, _ in OPENROUTER_MODELS:
if "/" in mid:
_, model_part = mid.split("/", 1)
if name_lower == model_part.lower():
@@ -1094,79 +974,6 @@ def provider_label(provider: Optional[str]) -> str:
return _PROVIDER_LABELS.get(normalized, original or "OpenRouter")
# Models that support OpenAI Priority Processing (service_tier="priority").
# See https://openai.com/api-priority-processing/ for the canonical list.
# Only the bare model slug is stored (no vendor prefix).
_PRIORITY_PROCESSING_MODELS: frozenset[str] = frozenset({
"gpt-5.4",
"gpt-5.4-mini",
"gpt-5.2",
"gpt-5.1",
"gpt-5",
"gpt-5-mini",
"gpt-4.1",
"gpt-4.1-mini",
"gpt-4.1-nano",
"gpt-4o",
"gpt-4o-mini",
"o3",
"o4-mini",
})
# Models that support Anthropic Fast Mode (speed="fast").
# See https://platform.claude.com/docs/en/build-with-claude/fast-mode
# Currently only Claude Opus 4.6. Both hyphen and dot variants are stored
# to handle native Anthropic (claude-opus-4-6) and OpenRouter (claude-opus-4.6).
_ANTHROPIC_FAST_MODE_MODELS: frozenset[str] = frozenset({
"claude-opus-4-6",
"claude-opus-4.6",
})
def _strip_vendor_prefix(model_id: str) -> str:
"""Strip vendor/ prefix from a model ID (e.g. 'anthropic/claude-opus-4-6' -> 'claude-opus-4-6')."""
raw = str(model_id or "").strip().lower()
if "/" in raw:
raw = raw.split("/", 1)[1]
return raw
def model_supports_fast_mode(model_id: Optional[str]) -> bool:
"""Return whether Hermes should expose the /fast toggle for this model."""
raw = _strip_vendor_prefix(str(model_id or ""))
if raw in _PRIORITY_PROCESSING_MODELS:
return True
# Anthropic fast mode — strip date suffixes (e.g. claude-opus-4-6-20260401)
# and OpenRouter variant tags (:fast, :beta) for matching.
base = raw.split(":")[0]
return base in _ANTHROPIC_FAST_MODE_MODELS
def _is_anthropic_fast_model(model_id: Optional[str]) -> bool:
"""Return True if the model supports Anthropic's fast mode (speed='fast')."""
raw = _strip_vendor_prefix(str(model_id or ""))
base = raw.split(":")[0]
return base in _ANTHROPIC_FAST_MODE_MODELS
def resolve_fast_mode_overrides(model_id: Optional[str]) -> dict[str, Any] | None:
"""Return request_overrides for fast/priority mode, or None if unsupported.
Returns provider-appropriate overrides:
- OpenAI models: ``{"service_tier": "priority"}`` (Priority Processing)
- Anthropic models: ``{"speed": "fast"}`` (Anthropic Fast Mode beta)
The overrides are injected into the API request kwargs by
``_build_api_kwargs`` in run_agent.py each API path handles its own
keys (service_tier for OpenAI/Codex, speed for Anthropic Messages).
"""
if not model_supports_fast_mode(model_id):
return None
if _is_anthropic_fast_model(model_id):
return {"speed": "fast"}
return {"service_tier": "priority"}
def _resolve_copilot_catalog_api_key() -> str:
"""Best-effort GitHub token for fetching the Copilot model catalog."""
try:
@@ -1178,7 +985,7 @@ def _resolve_copilot_catalog_api_key() -> str:
return ""
def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False) -> list[str]:
def provider_model_ids(provider: Optional[str]) -> list[str]:
"""Return the best known model catalog for a provider.
Tries live API endpoints for providers that support them (Codex, Nous),
@@ -1186,7 +993,7 @@ def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False)
"""
normalized = normalize_provider(provider)
if normalized == "openrouter":
return model_ids(force_refresh=force_refresh)
return model_ids()
if normalized == "openai-codex":
from hermes_cli.codex_models import get_codex_model_ids
@@ -1824,35 +1631,6 @@ def validate_requested_model(
"message": message,
}
# OpenAI Codex has its own catalog path; /v1/models probing is not the right validation path.
if normalized == "openai-codex":
try:
codex_models = provider_model_ids("openai-codex")
except Exception:
codex_models = []
if codex_models:
if requested_for_lookup in set(codex_models):
return {
"accepted": True,
"persist": True,
"recognized": True,
"message": None,
}
suggestions = get_close_matches(requested_for_lookup, codex_models, n=3, cutoff=0.5)
suggestion_text = ""
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
return {
"accepted": True,
"persist": True,
"recognized": False,
"message": (
f"Note: `{requested}` was not found in the OpenAI Codex model listing. "
f"It may still work if your account has access to it."
f"{suggestion_text}"
),
}
# Probe the live API to check if the model actually exists
api_models = fetch_api_models(api_key, base_url)
-2
View File
@@ -143,7 +143,6 @@ def _tts_label(current_provider: str) -> str:
"openai": "OpenAI TTS",
"elevenlabs": "ElevenLabs",
"edge": "Edge TTS",
"mistral": "Mistral Voxtral TTS",
"neutts": "NeuTTS",
}
return mapping.get(current_provider or "edge", current_provider or "Edge TTS")
@@ -310,7 +309,6 @@ def get_nous_subscription_features(
tts_current_provider in {"edge", "neutts"}
or (tts_current_provider == "openai" and (managed_tts_available or direct_openai_tts))
or (tts_current_provider == "elevenlabs" and direct_elevenlabs)
or (tts_current_provider == "mistral" and bool(get_env_value("MISTRAL_API_KEY")))
)
tts_active = bool(tts_tool_enabled and tts_available)
-46
View File
@@ -1,46 +0,0 @@
"""
Shared platform registry for Hermes Agent.
Single source of truth for platform metadata consumed by both
skills_config (label display) and tools_config (default toolset
resolution). Import ``PLATFORMS`` from here instead of maintaining
duplicate dicts in each module.
"""
from collections import OrderedDict
from typing import NamedTuple
class PlatformInfo(NamedTuple):
"""Metadata for a single platform entry."""
label: str
default_toolset: str
# Ordered so that TUI menus are deterministic.
PLATFORMS: OrderedDict[str, PlatformInfo] = OrderedDict([
("cli", PlatformInfo(label="🖥️ CLI", default_toolset="hermes-cli")),
("telegram", PlatformInfo(label="📱 Telegram", default_toolset="hermes-telegram")),
("discord", PlatformInfo(label="💬 Discord", default_toolset="hermes-discord")),
("slack", PlatformInfo(label="💼 Slack", default_toolset="hermes-slack")),
("whatsapp", PlatformInfo(label="📱 WhatsApp", default_toolset="hermes-whatsapp")),
("signal", PlatformInfo(label="📡 Signal", default_toolset="hermes-signal")),
("bluebubbles", PlatformInfo(label="💙 BlueBubbles", default_toolset="hermes-bluebubbles")),
("email", PlatformInfo(label="📧 Email", default_toolset="hermes-email")),
("homeassistant", PlatformInfo(label="🏠 Home Assistant", default_toolset="hermes-homeassistant")),
("mattermost", PlatformInfo(label="💬 Mattermost", default_toolset="hermes-mattermost")),
("matrix", PlatformInfo(label="💬 Matrix", default_toolset="hermes-matrix")),
("dingtalk", PlatformInfo(label="💬 DingTalk", default_toolset="hermes-dingtalk")),
("feishu", PlatformInfo(label="🪽 Feishu", default_toolset="hermes-feishu")),
("wecom", PlatformInfo(label="💬 WeCom", default_toolset="hermes-wecom")),
("wecom_callback", PlatformInfo(label="💬 WeCom Callback", default_toolset="hermes-wecom-callback")),
("weixin", PlatformInfo(label="💬 Weixin", default_toolset="hermes-weixin")),
("webhook", PlatformInfo(label="🔗 Webhook", default_toolset="hermes-webhook")),
("api_server", PlatformInfo(label="🌐 API Server", default_toolset="hermes-api-server")),
])
def platform_label(key: str, default: str = "") -> str:
"""Return the display label for a platform key, or *default*."""
info = PLATFORMS.get(key)
return info.label if info is not None else default
+2 -39
View File
@@ -201,7 +201,8 @@ class PluginContext:
The *setup_fn* receives an argparse subparser and should add any
arguments/sub-subparsers. If *handler_fn* is provided it is set
as the default dispatch function via ``set_defaults(func=...)``."""
as the default dispatch function via ``set_defaults(func=...)``.
"""
self._manager._cli_commands[name] = {
"name": name,
"help": help,
@@ -212,38 +213,6 @@ class PluginContext:
}
logger.debug("Plugin %s registered CLI command: %s", self.manifest.name, name)
# -- context engine registration -----------------------------------------
def register_context_engine(self, engine) -> None:
"""Register a context engine to replace the built-in ContextCompressor.
Only one context engine plugin is allowed. If a second plugin tries
to register one, it is rejected with a warning.
The engine must be an instance of ``agent.context_engine.ContextEngine``.
"""
if self._manager._context_engine is not None:
logger.warning(
"Plugin '%s' tried to register a context engine, but one is "
"already registered. Only one context engine plugin is allowed.",
self.manifest.name,
)
return
# Defer the import to avoid circular deps at module level
from agent.context_engine import ContextEngine
if not isinstance(engine, ContextEngine):
logger.warning(
"Plugin '%s' tried to register a context engine that does not "
"inherit from ContextEngine. Ignoring.",
self.manifest.name,
)
return
self._manager._context_engine = engine
logger.info(
"Plugin '%s' registered context engine: %s",
self.manifest.name, engine.name,
)
# -- hook registration --------------------------------------------------
def register_hook(self, hook_name: str, callback: Callable) -> None:
@@ -276,7 +245,6 @@ class PluginManager:
self._hooks: Dict[str, List[Callable]] = {}
self._plugin_tool_names: Set[str] = set()
self._cli_commands: Dict[str, dict] = {}
self._context_engine = None # Set by a plugin via register_context_engine()
self._discovered: bool = False
self._cli_ref = None # Set by CLI after plugin discovery
@@ -598,11 +566,6 @@ def get_plugin_cli_commands() -> Dict[str, dict]:
return dict(get_plugin_manager()._cli_commands)
def get_plugin_context_engine():
"""Return the plugin-registered context engine, or None."""
return get_plugin_manager()._context_engine
def get_plugin_toolsets() -> List[tuple]:
"""Return plugin toolsets as ``(key, label, description)`` tuples.
+29 -467
View File
@@ -531,7 +531,7 @@ def cmd_disable(name: str) -> None:
disabled.add(name)
_save_disabled_set(disabled)
console.print(f"[yellow]\u2298[/yellow] Plugin [bold]{name}[/bold] disabled. Takes effect on next session.")
console.print(f"[yellow][/yellow] Plugin [bold]{name}[/bold] disabled. Takes effect on next session.")
def cmd_list() -> None:
@@ -594,152 +594,8 @@ def cmd_list() -> None:
console.print("[dim]Enable/disable:[/dim] hermes plugins enable/disable <name>")
# ---------------------------------------------------------------------------
# Provider plugin discovery helpers
# ---------------------------------------------------------------------------
def _discover_memory_providers() -> list[tuple[str, str]]:
"""Return [(name, description), ...] for available memory providers."""
try:
from plugins.memory import discover_memory_providers
return [(name, desc) for name, desc, _avail in discover_memory_providers()]
except Exception:
return []
def _discover_context_engines() -> list[tuple[str, str]]:
"""Return [(name, description), ...] for available context engines."""
try:
from plugins.context_engine import discover_context_engines
return [(name, desc) for name, desc, _avail in discover_context_engines()]
except Exception:
return []
def _get_current_memory_provider() -> str:
"""Return the current memory.provider from config (empty = built-in)."""
try:
from hermes_cli.config import load_config
config = load_config()
return config.get("memory", {}).get("provider", "") or ""
except Exception:
return ""
def _get_current_context_engine() -> str:
"""Return the current context.engine from config."""
try:
from hermes_cli.config import load_config
config = load_config()
return config.get("context", {}).get("engine", "compressor") or "compressor"
except Exception:
return "compressor"
def _save_memory_provider(name: str) -> None:
"""Persist memory.provider to config.yaml."""
from hermes_cli.config import load_config, save_config
config = load_config()
if "memory" not in config:
config["memory"] = {}
config["memory"]["provider"] = name
save_config(config)
def _save_context_engine(name: str) -> None:
"""Persist context.engine to config.yaml."""
from hermes_cli.config import load_config, save_config
config = load_config()
if "context" not in config:
config["context"] = {}
config["context"]["engine"] = name
save_config(config)
def _configure_memory_provider() -> bool:
"""Launch a radio picker for memory providers. Returns True if changed."""
from hermes_cli.curses_ui import curses_radiolist
current = _get_current_memory_provider()
providers = _discover_memory_providers()
# Build items: "built-in" first, then discovered providers
items = ["built-in (default)"]
names = [""] # empty string = built-in
selected = 0
for name, desc in providers:
names.append(name)
label = f"{name} \u2014 {desc}" if desc else name
items.append(label)
if name == current:
selected = len(items) - 1
# If current provider isn't in discovered list, add it
if current and current not in names:
names.append(current)
items.append(f"{current} (not found)")
selected = len(items) - 1
choice = curses_radiolist(
title="Memory Provider (select one)",
items=items,
selected=selected,
)
new_provider = names[choice]
if new_provider != current:
_save_memory_provider(new_provider)
return True
return False
def _configure_context_engine() -> bool:
"""Launch a radio picker for context engines. Returns True if changed."""
from hermes_cli.curses_ui import curses_radiolist
current = _get_current_context_engine()
engines = _discover_context_engines()
# Build items: "compressor" first (built-in), then discovered engines
items = ["compressor (default)"]
names = ["compressor"]
selected = 0
for name, desc in engines:
names.append(name)
label = f"{name} \u2014 {desc}" if desc else name
items.append(label)
if name == current:
selected = len(items) - 1
# If current engine isn't in discovered list and isn't compressor, add it
if current != "compressor" and current not in names:
names.append(current)
items.append(f"{current} (not found)")
selected = len(items) - 1
choice = curses_radiolist(
title="Context Engine (select one)",
items=items,
selected=selected,
)
new_engine = names[choice]
if new_engine != current:
_save_context_engine(new_engine)
return True
return False
# ---------------------------------------------------------------------------
# Composite plugins UI
# ---------------------------------------------------------------------------
def cmd_toggle() -> None:
"""Interactive composite UI — general plugins + provider plugin categories."""
"""Interactive curses checklist to enable/disable installed plugins."""
from rich.console import Console
try:
@@ -750,13 +606,18 @@ def cmd_toggle() -> None:
console = Console()
plugins_dir = _plugins_dir()
# -- General plugins discovery --
dirs = sorted(d for d in plugins_dir.iterdir() if d.is_dir())
if not dirs:
console.print("[dim]No plugins installed.[/dim]")
console.print("[dim]Install with:[/dim] hermes plugins install owner/repo")
return
disabled = _get_disabled_set()
plugin_names = []
plugin_labels = []
plugin_selected = set()
# Build items list: "name — description" for display
names = []
labels = []
selected = set()
for i, d in enumerate(dirs):
manifest_file = d / "plugin.yaml"
@@ -772,335 +633,36 @@ def cmd_toggle() -> None:
except Exception:
pass
plugin_names.append(name)
label = f"{name} \u2014 {description}" if description else name
plugin_labels.append(label)
names.append(name)
label = f"{name} {description}" if description else name
labels.append(label)
if name not in disabled and d.name not in disabled:
plugin_selected.add(i)
selected.add(i)
# -- Provider categories --
current_memory = _get_current_memory_provider() or "built-in"
current_context = _get_current_context_engine()
categories = [
("Memory Provider", current_memory, _configure_memory_provider),
("Context Engine", current_context, _configure_context_engine),
]
from hermes_cli.curses_ui import curses_checklist
has_plugins = bool(plugin_names)
has_categories = bool(categories)
result = curses_checklist(
title="Plugins — toggle enabled/disabled",
items=labels,
selected=selected,
)
if not has_plugins and not has_categories:
console.print("[dim]No plugins installed and no provider categories available.[/dim]")
console.print("[dim]Install with:[/dim] hermes plugins install owner/repo")
return
# Non-TTY fallback
if not sys.stdin.isatty():
console.print("[dim]Interactive mode requires a terminal.[/dim]")
return
# Launch the composite curses UI
try:
import curses
_run_composite_ui(curses, plugin_names, plugin_labels, plugin_selected,
disabled, categories, console)
except ImportError:
_run_composite_fallback(plugin_names, plugin_labels, plugin_selected,
disabled, categories, console)
def _run_composite_ui(curses, plugin_names, plugin_labels, plugin_selected,
disabled, categories, console):
"""Custom curses screen with checkboxes + category action rows."""
from hermes_cli.curses_ui import flush_stdin
chosen = set(plugin_selected)
n_plugins = len(plugin_names)
# Total rows: plugins + separator + categories
# separator is not navigable
n_categories = len(categories)
total_items = n_plugins + n_categories # navigable items
result_holder = {"plugins_changed": False, "providers_changed": False}
def _draw(stdscr):
curses.curs_set(0)
if curses.has_colors():
curses.start_color()
curses.use_default_colors()
curses.init_pair(1, curses.COLOR_GREEN, -1)
curses.init_pair(2, curses.COLOR_YELLOW, -1)
curses.init_pair(3, curses.COLOR_CYAN, -1)
curses.init_pair(4, 8, -1) # dim gray
cursor = 0
scroll_offset = 0
while True:
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
# Header
try:
hattr = curses.A_BOLD
if curses.has_colors():
hattr |= curses.color_pair(2)
stdscr.addnstr(0, 0, "Plugins", max_x - 1, hattr)
stdscr.addnstr(
1, 0,
" \u2191\u2193 navigate SPACE toggle ENTER configure/confirm ESC done",
max_x - 1, curses.A_DIM,
)
except curses.error:
pass
# Build display rows
# Row layout:
# [plugins section header] (not navigable, skipped in scroll math)
# plugin checkboxes (navigable, indices 0..n_plugins-1)
# [separator] (not navigable)
# [categories section header] (not navigable)
# category action rows (navigable, indices n_plugins..total_items-1)
visible_rows = max_y - 4
if cursor < scroll_offset:
scroll_offset = cursor
elif cursor >= scroll_offset + visible_rows:
scroll_offset = cursor - visible_rows + 1
y = 3 # start drawing after header
# Determine which items are visible based on scroll
# We need to map logical cursor positions to screen rows
# accounting for non-navigable separator/headers
draw_row = 0 # tracks navigable item index
# --- General Plugins section ---
if n_plugins > 0:
# Section header
if y < max_y - 1:
try:
sattr = curses.A_BOLD
if curses.has_colors():
sattr |= curses.color_pair(2)
stdscr.addnstr(y, 0, " General Plugins", max_x - 1, sattr)
except curses.error:
pass
y += 1
for i in range(n_plugins):
if y >= max_y - 1:
break
check = "\u2713" if i in chosen else " "
arrow = "\u2192" if i == cursor else " "
line = f" {arrow} [{check}] {plugin_labels[i]}"
attr = curses.A_NORMAL
if i == cursor:
attr = curses.A_BOLD
if curses.has_colors():
attr |= curses.color_pair(1)
try:
stdscr.addnstr(y, 0, line, max_x - 1, attr)
except curses.error:
pass
y += 1
# --- Separator ---
if y < max_y - 1:
y += 1 # blank line
# --- Provider Plugins section ---
if n_categories > 0 and y < max_y - 1:
try:
sattr = curses.A_BOLD
if curses.has_colors():
sattr |= curses.color_pair(2)
stdscr.addnstr(y, 0, " Provider Plugins", max_x - 1, sattr)
except curses.error:
pass
y += 1
for ci, (cat_name, cat_current, _cat_fn) in enumerate(categories):
if y >= max_y - 1:
break
cat_idx = n_plugins + ci
arrow = "\u2192" if cat_idx == cursor else " "
line = f" {arrow} {cat_name:<24} \u25b8 {cat_current}"
attr = curses.A_NORMAL
if cat_idx == cursor:
attr = curses.A_BOLD
if curses.has_colors():
attr |= curses.color_pair(3)
try:
stdscr.addnstr(y, 0, line, max_x - 1, attr)
except curses.error:
pass
y += 1
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
if total_items > 0:
cursor = (cursor - 1) % total_items
elif key in (curses.KEY_DOWN, ord("j")):
if total_items > 0:
cursor = (cursor + 1) % total_items
elif key == ord(" "):
if cursor < n_plugins:
# Toggle general plugin
chosen.symmetric_difference_update({cursor})
else:
# Provider category — launch sub-screen
ci = cursor - n_plugins
if 0 <= ci < n_categories:
curses.endwin()
_cat_name, _cat_cur, cat_fn = categories[ci]
changed = cat_fn()
if changed:
result_holder["providers_changed"] = True
# Refresh current values
categories[ci] = (
_cat_name,
_get_current_memory_provider() or "built-in" if ci == 0
else _get_current_context_engine(),
cat_fn,
)
# Re-enter curses
stdscr = curses.initscr()
curses.noecho()
curses.cbreak()
stdscr.keypad(True)
if curses.has_colors():
curses.start_color()
curses.use_default_colors()
curses.init_pair(1, curses.COLOR_GREEN, -1)
curses.init_pair(2, curses.COLOR_YELLOW, -1)
curses.init_pair(3, curses.COLOR_CYAN, -1)
curses.init_pair(4, 8, -1)
curses.curs_set(0)
elif key in (curses.KEY_ENTER, 10, 13):
if cursor < n_plugins:
# ENTER on a plugin checkbox — confirm and exit
result_holder["plugins_changed"] = True
return
else:
# ENTER on a category — same as SPACE, launch sub-screen
ci = cursor - n_plugins
if 0 <= ci < n_categories:
curses.endwin()
_cat_name, _cat_cur, cat_fn = categories[ci]
changed = cat_fn()
if changed:
result_holder["providers_changed"] = True
categories[ci] = (
_cat_name,
_get_current_memory_provider() or "built-in" if ci == 0
else _get_current_context_engine(),
cat_fn,
)
stdscr = curses.initscr()
curses.noecho()
curses.cbreak()
stdscr.keypad(True)
if curses.has_colors():
curses.start_color()
curses.use_default_colors()
curses.init_pair(1, curses.COLOR_GREEN, -1)
curses.init_pair(2, curses.COLOR_YELLOW, -1)
curses.init_pair(3, curses.COLOR_CYAN, -1)
curses.init_pair(4, 8, -1)
curses.curs_set(0)
elif key in (27, ord("q")):
# Save plugin changes on exit
result_holder["plugins_changed"] = True
return
curses.wrapper(_draw)
flush_stdin()
# Persist general plugin changes
# Compute new disabled set from deselected items
new_disabled = set()
for i, name in enumerate(plugin_names):
if i not in chosen:
for i, name in enumerate(names):
if i not in result:
new_disabled.add(name)
if new_disabled != disabled:
_save_disabled_set(new_disabled)
enabled_count = len(plugin_names) - len(new_disabled)
enabled_count = len(names) - len(new_disabled)
console.print(
f"\n[green]\u2713[/green] General plugins: {enabled_count} enabled, "
f"{len(new_disabled)} disabled."
f"\n[green]✓[/green] {enabled_count} enabled, {len(new_disabled)} disabled. "
f"Takes effect on next session."
)
elif n_plugins > 0:
console.print("\n[dim]General plugins unchanged.[/dim]")
if result_holder["providers_changed"]:
new_memory = _get_current_memory_provider() or "built-in"
new_context = _get_current_context_engine()
console.print(
f"[green]\u2713[/green] Memory provider: [bold]{new_memory}[/bold] "
f"Context engine: [bold]{new_context}[/bold]"
)
if n_plugins > 0 or result_holder["providers_changed"]:
console.print("[dim]Changes take effect on next session.[/dim]")
console.print()
def _run_composite_fallback(plugin_names, plugin_labels, plugin_selected,
disabled, categories, console):
"""Text-based fallback for the composite plugins UI."""
from hermes_cli.colors import Colors, color
print(color("\n Plugins", Colors.YELLOW))
# General plugins
if plugin_names:
chosen = set(plugin_selected)
print(color("\n General Plugins", Colors.YELLOW))
print(color(" Toggle by number, Enter to confirm.\n", Colors.DIM))
while True:
for i, label in enumerate(plugin_labels):
marker = color("[\u2713]", Colors.GREEN) if i in chosen else "[ ]"
print(f" {marker} {i + 1:>2}. {label}")
print()
try:
val = input(color(" Toggle # (or Enter to confirm): ", Colors.DIM)).strip()
if not val:
break
idx = int(val) - 1
if 0 <= idx < len(plugin_names):
chosen.symmetric_difference_update({idx})
except (ValueError, KeyboardInterrupt, EOFError):
return
print()
new_disabled = set()
for i, name in enumerate(plugin_names):
if i not in chosen:
new_disabled.add(name)
if new_disabled != disabled:
_save_disabled_set(new_disabled)
# Provider categories
if categories:
print(color("\n Provider Plugins", Colors.YELLOW))
for ci, (cat_name, cat_current, cat_fn) in enumerate(categories):
print(f" {ci + 1}. {cat_name} [{cat_current}]")
print()
try:
val = input(color(" Configure # (or Enter to skip): ", Colors.DIM)).strip()
if val:
ci = int(val) - 1
if 0 <= ci < len(categories):
categories[ci][2]() # call the configure function
except (ValueError, KeyboardInterrupt, EOFError):
pass
print()
else:
console.print("\n[dim]No changes.[/dim]")
def plugins_command(args) -> None:
+6 -31
View File
@@ -42,11 +42,6 @@ _PROFILE_DIRS = [
"plans",
"workspace",
"cron",
# Per-profile HOME for subprocesses: isolates system tool configs (git,
# ssh, gh, npm …) so credentials don't bleed between profiles. In Docker
# this also ensures tool configs land inside the persistent volume.
# See hermes_constants.get_subprocess_home() and issue #4426.
"home",
]
# Files copied during --clone (if they exist in the source)
@@ -120,26 +115,16 @@ _HERMES_SUBCOMMANDS = frozenset({
def _get_profiles_root() -> Path:
"""Return the directory where named profiles are stored.
Anchored to the hermes root, NOT to the current HERMES_HOME
(which may itself be a profile). This ensures ``coder profile list``
can see all profiles.
In Docker/custom deployments where HERMES_HOME points outside
``~/.hermes``, profiles live under ``HERMES_HOME/profiles/`` so
they persist on the mounted volume.
Always ``~/.hermes/profiles/`` anchored to the user's home,
NOT to the current HERMES_HOME (which may itself be a profile).
This ensures ``coder profile list`` can see all profiles.
"""
return _get_default_hermes_home() / "profiles"
return Path.home() / ".hermes" / "profiles"
def _get_default_hermes_home() -> Path:
"""Return the default (pre-profile) HERMES_HOME path.
In standard deployments this is ``~/.hermes``.
In Docker/custom deployments where HERMES_HOME is outside ``~/.hermes``
(e.g. ``/opt/data``), returns HERMES_HOME directly.
"""
from hermes_constants import get_default_hermes_root
return get_default_hermes_root()
"""Return the default (pre-profile) HERMES_HOME path."""
return Path.home() / ".hermes"
def _get_active_profile_path() -> Path:
@@ -459,16 +444,6 @@ def create_profile(
dst.parent.mkdir(parents=True, exist_ok=True)
shutil.copy2(src, dst)
# Seed a default SOUL.md so the user has a file to customize immediately.
# Skipped when the profile already has one (from --clone / --clone-all).
soul_path = profile_dir / "SOUL.md"
if not soul_path.exists():
try:
from hermes_cli.default_soul import DEFAULT_SOUL_MD
soul_path.write_text(DEFAULT_SOUL_MD, encoding="utf-8")
except Exception:
pass # best-effort — don't fail profile creation over this
return profile_dir
+2 -82
View File
@@ -88,11 +88,11 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
base_url_env_var="KIMI_BASE_URL",
),
"minimax": HermesOverlay(
transport="anthropic_messages",
transport="openai_chat",
base_url_env_var="MINIMAX_BASE_URL",
),
"minimax-cn": HermesOverlay(
transport="anthropic_messages",
transport="openai_chat",
base_url_env_var="MINIMAX_CN_BASE_URL",
),
"deepseek": HermesOverlay(
@@ -127,15 +127,6 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
is_aggregator=True,
base_url_env_var="HF_BASE_URL",
),
"xai": HermesOverlay(
transport="openai_chat",
base_url_override="https://api.x.ai/v1",
base_url_env_var="XAI_BASE_URL",
),
"xiaomi": HermesOverlay(
transport="openai_chat",
base_url_env_var="XIAOMI_BASE_URL",
),
}
@@ -172,10 +163,6 @@ ALIASES: Dict[str, str] = {
"z.ai": "zai",
"zhipu": "zai",
# xai
"x-ai": "xai",
"x.ai": "xai",
# kimi-for-coding (models.dev ID)
"kimi": "kimi-for-coding",
"kimi-coding": "kimi-for-coding",
@@ -226,10 +213,6 @@ ALIASES: Dict[str, str] = {
"hugging-face": "huggingface",
"huggingface-hub": "huggingface",
# xiaomi
"mimo": "xiaomi",
"xiaomi-mimo": "xiaomi",
# Local server aliases → virtual "local" concept (resolved via user config)
"lmstudio": "lmstudio",
"lm-studio": "lmstudio",
@@ -250,7 +233,6 @@ _LABEL_OVERRIDES: Dict[str, str] = {
"nous": "Nous Portal",
"openai-codex": "OpenAI Codex",
"copilot-acp": "GitHub Copilot ACP",
"xiaomi": "Xiaomi MiMo",
"local": "Local endpoint",
}
@@ -359,7 +341,6 @@ def get_label(provider_id: str) -> str:
def is_aggregator(provider: str) -> bool:
"""Return True when the provider is a multi-model aggregator."""
pdef = get_provider(provider)
@@ -430,64 +411,9 @@ def resolve_user_provider(name: str, user_config: Dict[str, Any]) -> Optional[Pr
)
def custom_provider_slug(display_name: str) -> str:
"""Build a canonical slug for a custom_providers entry.
Matches the convention used by runtime_provider and credential_pool
(``custom:<normalized-name>``). Centralised here so all call-sites
produce identical slugs.
"""
return "custom:" + display_name.strip().lower().replace(" ", "-")
def resolve_custom_provider(
name: str,
custom_providers: Optional[List[Dict[str, Any]]],
) -> Optional[ProviderDef]:
"""Resolve a provider from the user's config.yaml ``custom_providers`` list."""
if not custom_providers or not isinstance(custom_providers, list):
return None
requested = (name or "").strip().lower()
if not requested:
return None
for entry in custom_providers:
if not isinstance(entry, dict):
continue
display_name = (entry.get("name") or "").strip()
api_url = (
entry.get("base_url", "")
or entry.get("url", "")
or entry.get("api", "")
or ""
).strip()
if not display_name or not api_url:
continue
slug = custom_provider_slug(display_name)
if requested not in {display_name.lower(), slug}:
continue
return ProviderDef(
id=slug,
name=display_name,
transport="openai_chat",
api_key_env_vars=(),
base_url=api_url,
is_aggregator=False,
auth_type="api_key",
source="user-config",
)
return None
def resolve_provider_full(
name: str,
user_providers: Optional[Dict[str, Any]] = None,
custom_providers: Optional[List[Dict[str, Any]]] = None,
) -> Optional[ProviderDef]:
"""Full resolution chain: built-in → models.dev → user config.
@@ -496,7 +422,6 @@ def resolve_provider_full(
Args:
name: Provider name or alias.
user_providers: The ``providers:`` dict from config.yaml (optional).
custom_providers: The ``custom_providers:`` list from config.yaml (optional).
Returns:
ProviderDef if found, else None.
@@ -519,11 +444,6 @@ def resolve_provider_full(
if user_pdef is not None:
return user_pdef
# 2b. Saved custom providers from config
custom_pdef = resolve_custom_provider(name, custom_providers)
if custom_pdef is not None:
return custom_pdef
# 3. Try models.dev directly (for providers not in our ALIASES)
try:
from agent.models_dev import get_provider_info as _mdev_provider
+1 -30
View File
@@ -16,7 +16,6 @@ from hermes_cli.auth import (
DEFAULT_CODEX_BASE_URL,
DEFAULT_QWEN_BASE_URL,
PROVIDER_REGISTRY,
_agent_key_is_usable,
format_auth_error,
resolve_provider,
resolve_nous_runtime_credentials,
@@ -304,9 +303,6 @@ def _get_named_custom_provider(requested_provider: str) -> Optional[Dict[str, An
api_mode = _parse_api_mode(entry.get("api_mode"))
if api_mode:
result["api_mode"] = api_mode
model_name = str(entry.get("model", "") or "").strip()
if model_name:
result["model"] = model_name
return result
return None
@@ -332,11 +328,6 @@ def _resolve_named_custom_runtime(
# Check if a credential pool exists for this custom endpoint
pool_result = _try_resolve_from_custom_pool(base_url, "custom", custom_provider.get("api_mode"))
if pool_result:
# Propagate the model name even when using pooled credentials —
# the pool doesn't know about the custom_providers model field.
model_name = custom_provider.get("model")
if model_name:
pool_result["model"] = model_name
return pool_result
api_key_candidates = [
@@ -347,7 +338,7 @@ def _resolve_named_custom_runtime(
]
api_key = next((candidate for candidate in api_key_candidates if has_usable_secret(candidate)), "")
result = {
return {
"provider": "custom",
"api_mode": custom_provider.get("api_mode")
or _detect_api_mode_for_url(base_url)
@@ -356,11 +347,6 @@ def _resolve_named_custom_runtime(
"api_key": api_key or "no-key-required",
"source": f"custom_provider:{custom_provider.get('name', requested_provider)}",
}
# Propagate the model name so callers can override self.model when the
# provider name differs from the actual model string the API expects.
if custom_provider.get("model"):
result["model"] = custom_provider["model"]
return result
def _resolve_openrouter_runtime(
@@ -658,21 +644,6 @@ def resolve_runtime_provider(
getattr(entry, "runtime_api_key", None)
or getattr(entry, "access_token", "")
)
# For Nous, the pool entry's runtime_api_key is the agent_key — a
# short-lived inference credential (~30 min TTL). The pool doesn't
# refresh it during selection (that would trigger network calls in
# non-runtime contexts like `hermes auth list`). If the key is
# expired, clear pool_api_key so we fall through to
# resolve_nous_runtime_credentials() which handles refresh + mint.
if provider == "nous" and entry is not None and pool_api_key:
min_ttl = max(60, int(os.getenv("HERMES_NOUS_MIN_KEY_TTL_SECONDS", "1800")))
nous_state = {
"agent_key": getattr(entry, "agent_key", None),
"agent_key_expires_at": getattr(entry, "agent_key_expires_at", None),
}
if not _agent_key_is_usable(nous_state, min_ttl):
logger.debug("Nous pool entry agent_key expired/missing, falling through to runtime resolution")
pool_api_key = ""
if entry is not None and pool_api_key:
return _resolve_runtime_from_pool_entry(
provider=provider,
+134 -175
View File
@@ -16,7 +16,6 @@ import logging
import os
import shutil
import sys
import copy
from pathlib import Path
from typing import Optional, Dict, Any
@@ -104,10 +103,10 @@ _DEFAULT_PROVIDER_MODELS = {
"gemini-2.5-pro", "gemini-2.5-flash", "gemini-2.5-flash-lite",
"gemma-4-31b-it", "gemma-4-26b-it",
],
"zai": ["glm-5.1", "glm-5", "glm-4.7", "glm-4.5", "glm-4.5-flash"],
"zai": ["glm-5", "glm-4.7", "glm-4.5", "glm-4.5-flash"],
"kimi-coding": ["kimi-k2.5", "kimi-k2-thinking", "kimi-k2-turbo-preview"],
"minimax": ["MiniMax-M2.7", "MiniMax-M2.5", "MiniMax-M2.1", "MiniMax-M2"],
"minimax-cn": ["MiniMax-M2.7", "MiniMax-M2.5", "MiniMax-M2.1", "MiniMax-M2"],
"minimax": ["MiniMax-M1", "MiniMax-M1-40k", "MiniMax-M1-80k", "MiniMax-M1-128k", "MiniMax-M1-256k", "MiniMax-M2.5", "MiniMax-M2.7"],
"minimax-cn": ["MiniMax-M1", "MiniMax-M1-40k", "MiniMax-M1-80k", "MiniMax-M1-128k", "MiniMax-M1-256k", "MiniMax-M2.5", "MiniMax-M2.7"],
"ai-gateway": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5", "google/gemini-3-flash"],
"kilocode": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5.4", "google/gemini-3-pro-preview", "google/gemini-3-flash-preview"],
"opencode-zen": ["gpt-5.4", "gpt-5.3-codex", "claude-sonnet-4-6", "gemini-3-flash", "glm-5", "kimi-k2.5", "minimax-m2.7"],
@@ -176,7 +175,6 @@ def _setup_copilot_reasoning_selection(
# Import config helpers
from hermes_cli.config import (
DEFAULT_CONFIG,
get_hermes_home,
get_config_path,
get_env_path,
@@ -197,12 +195,24 @@ def print_header(title: str):
print(color(f"{title}", Colors.CYAN, Colors.BOLD))
from hermes_cli.cli_output import ( # noqa: E402
print_error,
print_info,
print_success,
print_warning,
)
def print_info(text: str):
"""Print info text."""
print(color(f" {text}", Colors.DIM))
def print_success(text: str):
"""Print success message."""
print(color(f"{text}", Colors.GREEN))
def print_warning(text: str):
"""Print warning message."""
print(color(f"{text}", Colors.YELLOW))
def print_error(text: str):
"""Print error message."""
print(color(f"{text}", Colors.RED))
def is_interactive_stdin() -> bool:
@@ -257,9 +267,78 @@ def prompt(question: str, default: str = None, password: bool = False) -> str:
def _curses_prompt_choice(question: str, choices: list, default: int = 0) -> int:
"""Single-select menu using curses. Delegates to curses_radiolist."""
from hermes_cli.curses_ui import curses_radiolist
return curses_radiolist(question, choices, selected=default, cancel_returns=-1)
"""Single-select menu using curses to avoid simple_term_menu rendering bugs."""
try:
import curses
result_holder = [default]
def _curses_menu(stdscr):
curses.curs_set(0)
if curses.has_colors():
curses.start_color()
curses.use_default_colors()
curses.init_pair(1, curses.COLOR_GREEN, -1)
curses.init_pair(2, curses.COLOR_YELLOW, -1)
cursor = default
scroll_offset = 0
while True:
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
# Rows available for list items: rows 2..(max_y-2) inclusive.
visible = max(1, max_y - 3)
# Scroll the viewport so the cursor is always visible.
if cursor < scroll_offset:
scroll_offset = cursor
elif cursor >= scroll_offset + visible:
scroll_offset = cursor - visible + 1
scroll_offset = max(0, min(scroll_offset, max(0, len(choices) - visible)))
try:
stdscr.addnstr(
0,
0,
question,
max_x - 1,
curses.A_BOLD | (curses.color_pair(2) if curses.has_colors() else 0),
)
except curses.error:
pass
for row, i in enumerate(range(scroll_offset, min(scroll_offset + visible, len(choices)))):
y = row + 2
if y >= max_y - 1:
break
arrow = "" if i == cursor else " "
line = f" {arrow} {choices[i]}"
attr = curses.A_NORMAL
if i == cursor:
attr = curses.A_BOLD
if curses.has_colors():
attr |= curses.color_pair(1)
try:
stdscr.addnstr(y, 0, line, max_x - 1, attr)
except curses.error:
pass
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
cursor = (cursor - 1) % len(choices)
elif key in (curses.KEY_DOWN, ord("j")):
cursor = (cursor + 1) % len(choices)
elif key in (curses.KEY_ENTER, 10, 13):
result_holder[0] = cursor
return
elif key in (27, ord("q")):
return
curses.wrapper(_curses_menu)
return result_holder[0]
except Exception:
return -1
@@ -474,8 +553,6 @@ def _print_setup_summary(config: dict, hermes_home):
tool_status.append(("Text-to-Speech (OpenAI)", True, None))
elif tts_provider == "minimax" and get_env_value("MINIMAX_API_KEY"):
tool_status.append(("Text-to-Speech (MiniMax)", True, None))
elif tts_provider == "mistral" and get_env_value("MISTRAL_API_KEY"):
tool_status.append(("Text-to-Speech (Mistral Voxtral)", True, None))
elif tts_provider == "neutts":
try:
import importlib.util
@@ -703,10 +780,8 @@ def setup_model_provider(config: dict, *, quick: bool = False):
# changes with stale values (#4172).
_refreshed = load_config()
config["model"] = _refreshed.get("model", config.get("model"))
if "custom_providers" in _refreshed:
if _refreshed.get("custom_providers"):
config["custom_providers"] = _refreshed["custom_providers"]
else:
config.pop("custom_providers", None)
# Derive the selected provider for downstream steps (vision setup).
selected_provider = None
@@ -790,6 +865,8 @@ def setup_model_provider(config: dict, *, quick: bool = False):
strategy_value = ["fill_first", "round_robin", "random"][strategy_idx]
_set_credential_pool_strategy(config, selected_provider, strategy_value)
print_success(f"Saved {selected_provider} rotation strategy: {strategy_value}")
else:
_set_credential_pool_strategy(config, selected_provider, "fill_first")
except Exception as exc:
logger.debug("Could not configure same-provider fallback in setup: %s", exc)
@@ -963,7 +1040,6 @@ def _setup_tts_provider(config: dict):
"elevenlabs": "ElevenLabs",
"openai": "OpenAI TTS",
"minimax": "MiniMax TTS",
"mistral": "Mistral Voxtral TTS",
"neutts": "NeuTTS",
}
current_label = provider_labels.get(current_provider, current_provider)
@@ -984,11 +1060,10 @@ def _setup_tts_provider(config: dict):
"ElevenLabs (premium quality, needs API key)",
"OpenAI TTS (good quality, needs API key)",
"MiniMax TTS (high quality with voice cloning, needs API key)",
"Mistral Voxtral TTS (multilingual, native Opus, needs API key)",
"NeuTTS (local on-device, free, ~300MB model download)",
]
)
providers.extend(["edge", "elevenlabs", "openai", "minimax", "mistral", "neutts"])
providers.extend(["edge", "elevenlabs", "openai", "minimax", "neutts"])
choices.append(f"Keep current ({current_label})")
keep_current_idx = len(choices) - 1
idx = prompt_choice("Select TTS provider:", choices, keep_current_idx)
@@ -1066,18 +1141,6 @@ def _setup_tts_provider(config: dict):
print_warning("No API key provided. Falling back to Edge TTS.")
selected = "edge"
elif selected == "mistral":
existing = get_env_value("MISTRAL_API_KEY")
if not existing:
print()
api_key = prompt("Mistral API key for TTS", password=True)
if api_key:
save_env_value("MISTRAL_API_KEY", api_key)
print_success("Mistral TTS API key saved")
else:
print_warning("No API key provided. Falling back to Edge TTS.")
selected = "edge"
# Save the selection
if "tts" not in config:
config["tts"] = {}
@@ -1858,9 +1921,9 @@ def _setup_matrix():
save_env_value("MATRIX_ENCRYPTION", "true")
print_success("E2EE enabled")
matrix_pkg = "mautrix[encryption]" if want_e2ee else "mautrix"
matrix_pkg = "matrix-nio[e2e]" if want_e2ee else "matrix-nio"
try:
__import__("mautrix")
__import__("nio")
except ImportError:
print_info(f"Installing {matrix_pkg}...")
import subprocess
@@ -1963,54 +2026,6 @@ def _setup_whatsapp():
print_info("or personal self-chat) and pair via QR code.")
def _setup_weixin():
"""Configure Weixin (personal WeChat) via iLink Bot API QR login."""
from hermes_cli.gateway import _setup_weixin as _gateway_setup_weixin
_gateway_setup_weixin()
def _setup_signal():
"""Configure Signal via gateway setup."""
from hermes_cli.gateway import _setup_signal as _gateway_setup_signal
_gateway_setup_signal()
def _setup_email():
"""Configure Email via gateway setup."""
from hermes_cli.gateway import _setup_email as _gateway_setup_email
_gateway_setup_email()
def _setup_sms():
"""Configure SMS (Twilio) via gateway setup."""
from hermes_cli.gateway import _setup_sms as _gateway_setup_sms
_gateway_setup_sms()
def _setup_dingtalk():
"""Configure DingTalk via gateway setup."""
from hermes_cli.gateway import _setup_dingtalk as _gateway_setup_dingtalk
_gateway_setup_dingtalk()
def _setup_feishu():
"""Configure Feishu / Lark via gateway setup."""
from hermes_cli.gateway import _setup_feishu as _gateway_setup_feishu
_gateway_setup_feishu()
def _setup_wecom():
"""Configure WeCom (Enterprise WeChat) via gateway setup."""
from hermes_cli.gateway import _setup_wecom as _gateway_setup_wecom
_gateway_setup_wecom()
def _setup_wecom_callback():
"""Configure WeCom Callback (self-built app) via gateway setup."""
from hermes_cli.gateway import _setup_wecom_callback as _gw_setup
_gw_setup()
def _setup_bluebubbles():
"""Configure BlueBubbles iMessage gateway."""
print_header("BlueBubbles (iMessage)")
@@ -2127,17 +2142,9 @@ _GATEWAY_PLATFORMS = [
("Telegram", "TELEGRAM_BOT_TOKEN", _setup_telegram),
("Discord", "DISCORD_BOT_TOKEN", _setup_discord),
("Slack", "SLACK_BOT_TOKEN", _setup_slack),
("Signal", "SIGNAL_HTTP_URL", _setup_signal),
("Email", "EMAIL_ADDRESS", _setup_email),
("SMS (Twilio)", "TWILIO_ACCOUNT_SID", _setup_sms),
("Matrix", "MATRIX_ACCESS_TOKEN", _setup_matrix),
("Mattermost", "MATTERMOST_TOKEN", _setup_mattermost),
("WhatsApp", "WHATSAPP_ENABLED", _setup_whatsapp),
("DingTalk", "DINGTALK_CLIENT_ID", _setup_dingtalk),
("Feishu / Lark", "FEISHU_APP_ID", _setup_feishu),
("WeCom (Enterprise WeChat)", "WECOM_BOT_ID", _setup_wecom),
("WeCom Callback (Self-Built App)", "WECOM_CALLBACK_CORP_ID", _setup_wecom_callback),
("Weixin (WeChat)", "WEIXIN_ACCOUNT_ID", _setup_weixin),
("BlueBubbles (iMessage)", "BLUEBUBBLES_SERVER_URL", _setup_bluebubbles),
("Webhooks (GitHub, GitLab, etc.)", "WEBHOOK_ENABLED", _setup_webhooks),
]
@@ -2178,17 +2185,10 @@ def setup_gateway(config: dict):
get_env_value("TELEGRAM_BOT_TOKEN")
or get_env_value("DISCORD_BOT_TOKEN")
or get_env_value("SLACK_BOT_TOKEN")
or get_env_value("SIGNAL_HTTP_URL")
or get_env_value("EMAIL_ADDRESS")
or get_env_value("TWILIO_ACCOUNT_SID")
or get_env_value("MATTERMOST_TOKEN")
or get_env_value("MATRIX_ACCESS_TOKEN")
or get_env_value("MATRIX_PASSWORD")
or get_env_value("WHATSAPP_ENABLED")
or get_env_value("DINGTALK_CLIENT_ID")
or get_env_value("FEISHU_APP_ID")
or get_env_value("WECOM_BOT_ID")
or get_env_value("WEIXIN_ACCOUNT_ID")
or get_env_value("BLUEBUBBLES_SERVER_URL")
or get_env_value("WEBHOOK_ENABLED")
)
@@ -2232,7 +2232,6 @@ def setup_gateway(config: dict):
from hermes_cli.gateway import (
_is_service_installed,
_is_service_running,
supports_systemd_services,
has_conflicting_systemd_units,
install_linux_gateway_from_setup,
print_systemd_scope_conflict_warning,
@@ -2245,18 +2244,16 @@ def setup_gateway(config: dict):
service_installed = _is_service_installed()
service_running = _is_service_running()
supports_systemd = supports_systemd_services()
supports_service_manager = supports_systemd or _is_macos
print()
if supports_systemd and has_conflicting_systemd_units():
if _is_linux and has_conflicting_systemd_units():
print_systemd_scope_conflict_warning()
print()
if service_running:
if prompt_yes_no(" Restart the gateway to pick up changes?", True):
try:
if supports_systemd:
if _is_linux:
systemd_restart()
elif _is_macos:
launchd_restart()
@@ -2265,14 +2262,14 @@ def setup_gateway(config: dict):
elif service_installed:
if prompt_yes_no(" Start the gateway service?", True):
try:
if supports_systemd:
if _is_linux:
systemd_start()
elif _is_macos:
launchd_start()
except Exception as e:
print_error(f" Start failed: {e}")
elif supports_service_manager:
svc_name = "systemd" if supports_systemd else "launchd"
elif _is_linux or _is_macos:
svc_name = "systemd" if _is_linux else "launchd"
if prompt_yes_no(
f" Install the gateway as a {svc_name} service? (runs in background, starts on boot)",
True,
@@ -2280,7 +2277,7 @@ def setup_gateway(config: dict):
try:
installed_scope = None
did_install = False
if supports_systemd:
if _is_linux:
installed_scope, did_install = install_linux_gateway_from_setup(force=False)
else:
launchd_install(force=False)
@@ -2288,7 +2285,7 @@ def setup_gateway(config: dict):
print()
if did_install and prompt_yes_no(" Start the service now?", True):
try:
if supports_systemd:
if _is_linux:
systemd_start(system=installed_scope == "system")
elif _is_macos:
launchd_start()
@@ -2299,21 +2296,12 @@ def setup_gateway(config: dict):
print_info(" You can try manually: hermes gateway install")
else:
print_info(" You can install later: hermes gateway install")
if supports_systemd:
if _is_linux:
print_info(" Or as a boot-time service: sudo hermes gateway install --system")
print_info(" Or run in foreground: hermes gateway")
else:
from hermes_constants import is_container
if is_container():
print_info("Start the gateway to bring your bots online:")
print_info(" hermes gateway run # Run as container main process")
print_info("")
print_info("For automatic restarts, use a Docker restart policy:")
print_info(" docker run --restart unless-stopped ...")
print_info(" docker restart <container> # Manual restart")
else:
print_info("Start the gateway to bring your bots online:")
print_info(" hermes gateway # Run in foreground")
print_info("Start the gateway to bring your bots online:")
print_info(" hermes gateway # Run in foreground")
print_info("" * 50)
@@ -2389,30 +2377,12 @@ def _get_section_config_summary(config: dict, section_key: str) -> Optional[str]
platforms.append("Discord")
if get_env_value("SLACK_BOT_TOKEN"):
platforms.append("Slack")
if get_env_value("SIGNAL_ACCOUNT"):
platforms.append("Signal")
if get_env_value("EMAIL_ADDRESS"):
platforms.append("Email")
if get_env_value("TWILIO_ACCOUNT_SID"):
platforms.append("SMS")
if get_env_value("MATRIX_ACCESS_TOKEN") or get_env_value("MATRIX_PASSWORD"):
platforms.append("Matrix")
if get_env_value("MATTERMOST_TOKEN"):
platforms.append("Mattermost")
if get_env_value("WHATSAPP_PHONE_NUMBER_ID"):
platforms.append("WhatsApp")
if get_env_value("DINGTALK_CLIENT_ID"):
platforms.append("DingTalk")
if get_env_value("FEISHU_APP_ID"):
platforms.append("Feishu")
if get_env_value("WECOM_BOT_ID"):
platforms.append("WeCom")
if get_env_value("WEIXIN_ACCOUNT_ID"):
platforms.append("Weixin")
if get_env_value("SIGNAL_ACCOUNT"):
platforms.append("Signal")
if get_env_value("BLUEBUBBLES_SERVER_URL"):
platforms.append("BlueBubbles")
if get_env_value("WEBHOOK_ENABLED"):
platforms.append("Webhooks")
if platforms:
return ", ".join(platforms)
return None # No platforms configured — section must run
@@ -2733,7 +2703,6 @@ def run_setup_wizard(args):
Supports full, quick, and section-specific setup:
hermes setup full or quick (auto-detected)
hermes setup model just model/provider
hermes setup tts just text-to-speech
hermes setup terminal just terminal backend
hermes setup gateway just messaging platforms
hermes setup tools just tool configuration
@@ -2745,11 +2714,6 @@ def run_setup_wizard(args):
return
ensure_hermes_home()
reset_requested = bool(getattr(args, "reset", False))
if reset_requested:
save_config(copy.deepcopy(DEFAULT_CONFIG))
print_success("Configuration reset to defaults.")
config = load_config()
hermes_home = get_hermes_home()
@@ -2850,13 +2814,18 @@ def run_setup_wizard(args):
menu_choices = [
"Quick Setup - configure missing items only",
"Full Setup - reconfigure everything",
"---",
"Model & Provider",
"Terminal Backend",
"Messaging Platforms (Gateway)",
"Tools",
"Agent Settings",
"---",
"Exit",
]
# Separator indices (not selectable, but prompt_choice doesn't filter them,
# so we handle them below)
choice = prompt_choice("What would you like to do?", menu_choices, 0)
if choice == 0:
@@ -2866,14 +2835,18 @@ def run_setup_wizard(args):
elif choice == 1:
# Full setup — fall through to run all sections
pass
elif choice == 7:
elif choice in (2, 8):
# Separator — treat as exit
print_info("Exiting. Run 'hermes setup' again when ready.")
return
elif 2 <= choice <= 6:
elif choice == 9:
print_info("Exiting. Run 'hermes setup' again when ready.")
return
elif 3 <= choice <= 7:
# Individual section — map by key, not by position.
# SETUP_SECTIONS includes TTS but the returning-user menu skips it,
# so positional indexing (choice - 2) would dispatch the wrong section.
section_key = RETURNING_USER_MENU_SECTION_KEYS[choice - 2]
# so positional indexing (choice - 3) would dispatch the wrong section.
section_key = RETURNING_USER_MENU_SECTION_KEYS[choice - 3]
section = next((s for s in SETUP_SECTIONS if s[0] == section_key), None)
if section:
_, label, func = section
@@ -2941,33 +2914,19 @@ def run_setup_wizard(args):
_offer_launch_chat()
def _resolve_hermes_chat_argv() -> Optional[list[str]]:
"""Resolve argv for launching ``hermes chat`` in a fresh process."""
hermes_bin = shutil.which("hermes")
if hermes_bin:
return [hermes_bin, "chat"]
try:
if importlib.util.find_spec("hermes_cli") is not None:
return [sys.executable, "-m", "hermes_cli.main", "chat"]
except Exception:
pass
return None
def _offer_launch_chat():
"""Prompt the user to jump straight into chat after setup."""
print()
if not prompt_yes_no("Launch hermes chat now?", True):
return
chat_argv = _resolve_hermes_chat_argv()
if not chat_argv:
print_info("Could not relaunch Hermes automatically. Run 'hermes chat' manually.")
return
os.execvp(chat_argv[0], chat_argv)
if prompt_yes_no("Launch hermes chat now?", True):
from hermes_cli.main import cmd_chat
from types import SimpleNamespace
cmd_chat(SimpleNamespace(
query=None, resume=None, continue_last=None, model=None,
provider=None, effort=None, skin=None, oneshot=False,
quiet=False, verbose=False, toolsets=None, skills=None,
yolo=False, source=None, worktree=False, checkpoints=False,
pass_session_id=False, max_turns=None,
))
def _run_first_time_quick_setup(config: dict, hermes_home, is_existing: bool):
+17 -5
View File
@@ -15,12 +15,24 @@ from typing import List, Optional, Set
from hermes_cli.config import load_config, save_config
from hermes_cli.colors import Colors, color
from hermes_cli.platforms import PLATFORMS as _PLATFORMS, platform_label
# Backward-compatible view: {key: label_string} so existing code that
# iterates ``PLATFORMS.items()`` or calls ``PLATFORMS.get(key)`` keeps
# working without changes to every call site.
PLATFORMS = {k: info.label for k, info in _PLATFORMS.items() if k != "api_server"}
PLATFORMS = {
"cli": "🖥️ CLI",
"telegram": "📱 Telegram",
"discord": "💬 Discord",
"slack": "💼 Slack",
"whatsapp": "📱 WhatsApp",
"signal": "📡 Signal",
"bluebubbles": "💬 BlueBubbles",
"email": "📧 Email",
"homeassistant": "🏠 Home Assistant",
"mattermost": "💬 Mattermost",
"matrix": "💬 Matrix",
"dingtalk": "💬 DingTalk",
"feishu": "🪽 Feishu",
"wecom": "💬 WeCom",
"webhook": "🔗 Webhook",
}
# ─── Config Helpers ───────────────────────────────────────────────────────────
+25 -44
View File
@@ -151,8 +151,7 @@ def do_search(query: str, source: str = "all", limit: int = 10,
auth = GitHubAuth()
sources = create_source_router(auth)
with c.status("[bold]Searching registries..."):
results = unified_search(query, sources, source_filter=source, limit=limit)
results = unified_search(query, sources, source_filter=source, limit=limit)
if not results:
c.print("[dim]No skills found matching your query.[/]\n")
@@ -188,7 +187,7 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all",
Official skills are always shown first, regardless of source filter.
"""
from tools.skills_hub import (
GitHubAuth, create_source_router, parallel_search_sources,
GitHubAuth, create_source_router,
)
# Clamp page_size to safe range
@@ -199,23 +198,27 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all",
auth = GitHubAuth()
sources = create_source_router(auth)
# Collect results from all (or filtered) sources in parallel.
# Per-source limits are generous — parallelism + 30s timeout cap prevents hangs.
# Collect results from all (or filtered) sources
# Use empty query to get everything; per-source limits prevent overload
_TRUST_RANK = {"builtin": 3, "trusted": 2, "community": 1}
_PER_SOURCE_LIMIT = {
"official": 200, "skills-sh": 200, "well-known": 50,
"github": 200, "clawhub": 500, "claude-marketplace": 100,
"lobehub": 500,
}
_PER_SOURCE_LIMIT = {"official": 100, "skills-sh": 100, "well-known": 25, "github": 100, "clawhub": 50,
"claude-marketplace": 50, "lobehub": 50}
with c.status("[bold]Fetching skills from registries..."):
all_results, source_counts, timed_out = parallel_search_sources(
sources,
query="",
per_source_limits=_PER_SOURCE_LIMIT,
source_filter=source,
overall_timeout=30,
)
all_results: list = []
source_counts: dict = {}
for src in sources:
sid = src.source_id()
if source != "all" and sid != source and sid != "official":
# Always include official source for the "first" placement
continue
try:
limit = _PER_SOURCE_LIMIT.get(sid, 50)
results = src.search("", limit=limit)
source_counts[sid] = len(results)
all_results.extend(results)
except Exception:
continue
if not all_results:
c.print("[dim]No skills found in the Skills Hub.[/]\n")
@@ -249,11 +252,8 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all",
# Build header
source_label = f"{source}" if source != "all" else "— all sources"
loaded_label = f"{total} skills loaded"
if timed_out:
loaded_label += f", {len(timed_out)} source(s) still loading"
c.print(f"\n[bold]Skills Hub — Browse {source_label}[/]"
f" [dim]({loaded_label}, page {page}/{total_pages})[/]")
f" [dim]({total} skills, page {page}/{total_pages})[/]")
if official_count > 0 and page == 1:
c.print(f"[bright_cyan]★ {official_count} official optional skill(s) from Nous Research[/]")
c.print()
@@ -300,11 +300,8 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all",
parts = [f"{sid}: {ct}" for sid, ct in sorted(source_counts.items())]
c.print(f" [dim]Sources: {', '.join(parts)}[/]")
if timed_out:
c.print(f" [yellow]⚡ Slow sources skipped: {', '.join(timed_out)} "
f"— run again for cached results[/]")
c.print("[dim]Tip: 'hermes skills search <query>' searches deeper across all registries[/]\n")
c.print("[dim]Use: hermes skills inspect <identifier> to preview, "
"hermes skills install <identifier> to install[/]\n")
def do_install(identifier: str, category: str = "", force: bool = False,
@@ -335,23 +332,7 @@ def do_install(identifier: str, category: str = "", force: bool = False,
meta, bundle, _matched_source = _resolve_source_meta_and_bundle(identifier, sources)
if not bundle:
# Check if any source hit GitHub API rate limit
rate_limited = any(
getattr(src, "is_rate_limited", False)
or getattr(getattr(src, "github", None), "is_rate_limited", False)
for src in sources
)
c.print(f"[bold red]Error:[/] Could not fetch '{identifier}' from any source.")
if rate_limited:
c.print(
"[yellow]Hint:[/] GitHub API rate limit exhausted "
"(unauthenticated: 60 requests/hour).\n"
"Set [bold]GITHUB_TOKEN[/] in your .env or install the "
"[bold]gh[/] CLI and run [bold]gh auth login[/] "
"to raise the limit to 5,000/hr.\n"
)
else:
c.print()
c.print(f"[bold red]Error:[/] Could not fetch '{identifier}' from any source.\n")
return
# Auto-detect category for official skills (e.g. "official/autonomous-ai-agents/blackbox")
+21 -53
View File
@@ -79,9 +79,6 @@ def _effective_provider_label() -> str:
return provider_label(effective)
from hermes_constants import is_termux as _is_termux
def show_status(args):
"""Show status of all Hermes Agent components."""
show_all = getattr(args, 'all', False)
@@ -141,8 +138,11 @@ def show_status(args):
display = redact_key(value) if not show_all else value
print(f" {name:<12} {check_mark(has_key)} {display}")
from hermes_cli.auth import get_anthropic_key
anthropic_value = get_anthropic_key()
anthropic_value = (
get_env_value("ANTHROPIC_TOKEN")
or get_env_value("ANTHROPIC_API_KEY")
or ""
)
anthropic_display = redact_key(anthropic_value) if not show_all else anthropic_value
print(f" {'Anthropic':<12} {check_mark(bool(anthropic_value))} {anthropic_display}")
@@ -302,8 +302,6 @@ def show_status(args):
"DingTalk": ("DINGTALK_CLIENT_ID", None),
"Feishu": ("FEISHU_APP_ID", "FEISHU_HOME_CHANNEL"),
"WeCom": ("WECOM_BOT_ID", "WECOM_HOME_CHANNEL"),
"WeCom Callback": ("WECOM_CALLBACK_CORP_ID", None),
"Weixin": ("WEIXIN_ACCOUNT_ID", "WEIXIN_HOME_CHANNEL"),
"BlueBubbles": ("BLUEBUBBLES_SERVER_URL", "BLUEBUBBLES_HOME_CHANNEL"),
}
@@ -327,54 +325,24 @@ def show_status(args):
print()
print(color("◆ Gateway Service", Colors.CYAN, Colors.BOLD))
if _is_termux():
if sys.platform.startswith('linux'):
try:
from hermes_cli.gateway import find_gateway_pids
gateway_pids = find_gateway_pids()
from hermes_cli.gateway import get_service_name
_gw_svc = get_service_name()
except Exception:
gateway_pids = []
is_running = bool(gateway_pids)
print(f" Status: {check_mark(is_running)} {'running' if is_running else 'stopped'}")
print(" Manager: Termux / manual process")
if gateway_pids:
rendered = ", ".join(str(pid) for pid in gateway_pids[:3])
if len(gateway_pids) > 3:
rendered += ", ..."
print(f" PID(s): {rendered}")
else:
print(" Start with: hermes gateway")
print(" Note: Android may stop background jobs when Termux is suspended")
elif sys.platform.startswith('linux'):
from hermes_constants import is_container
if is_container():
# Docker/Podman: no systemd — check for running gateway processes
try:
from hermes_cli.gateway import find_gateway_pids
gateway_pids = find_gateway_pids()
is_active = len(gateway_pids) > 0
except Exception:
is_active = False
print(f" Status: {check_mark(is_active)} {'running' if is_active else 'stopped'}")
print(" Manager: docker (foreground)")
else:
try:
from hermes_cli.gateway import get_service_name
_gw_svc = get_service_name()
except Exception:
_gw_svc = "hermes-gateway"
try:
result = subprocess.run(
["systemctl", "--user", "is-active", _gw_svc],
capture_output=True,
text=True,
timeout=5
)
is_active = result.stdout.strip() == "active"
except (FileNotFoundError, subprocess.TimeoutExpired):
is_active = False
print(f" Status: {check_mark(is_active)} {'running' if is_active else 'stopped'}")
print(" Manager: systemd (user)")
_gw_svc = "hermes-gateway"
try:
result = subprocess.run(
["systemctl", "--user", "is-active", _gw_svc],
capture_output=True,
text=True,
timeout=5
)
is_active = result.stdout.strip() == "active"
except subprocess.TimeoutExpired:
is_active = False
print(f" Status: {check_mark(is_active)} {'running' if is_active else 'stopped'}")
print(" Manager: systemd (user)")
elif sys.platform == 'darwin':
from hermes_cli.gateway import get_launchd_label

Some files were not shown because too many files have changed in this diff Show More