Compare commits

243 Commits

Author SHA1 Message Date
Shannon Sands f8ba6a4a3e fix(setup): use npm ci instead of npm install in hermes update
npm install re-resolves the dependency graph and rewrites package-lock.json,
leaving a dirty working tree after every update. npm ci installs exactly
from the committed lockfile without mutating it, which is the correct
command for reproducible installs in update/deployment contexts.

Closes #4048
2026-03-31 10:54:58 +10:00
Teknium ffd5d37f9b fix: treat non-sk-ant- keys as regular API keys, not OAuth tokens (#4093)
* fix: treat non-sk-ant- prefixed keys (Azure AI Foundry) as regular API keys, not OAuth tokens

* fix: treat non-sk-ant- keys as regular API keys, not OAuth tokens

_is_oauth_token() returned True for any key not starting with
sk-ant-api, misclassifying Azure AI Foundry keys as OAuth tokens
and sending Bearer auth instead of x-api-key → 401 rejection.

Real Anthropic OAuth tokens all start with sk-ant-oat (confirmed
from live .credentials.json). Non-sk-ant- keys are third-party
provider keys that should use x-api-key.

Test fixtures updated to use realistic sk-ant-oat01- prefixed
tokens instead of fake strings.
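The classification described above can be sketched as follows. This is a minimal illustration rather than the project's code: the function names `is_oauth_token` and `auth_headers` are hypothetical, and only the `sk-ant-oat` prefix and the two header styles are taken from the commit message.

```python
def is_oauth_token(key: str) -> bool:
    """Return True only for keys that carry the Anthropic OAuth prefix."""
    return key.startswith("sk-ant-oat")


def auth_headers(key: str) -> dict:
    """Choose the header style: Bearer for OAuth tokens, x-api-key otherwise."""
    if is_oauth_token(key):
        return {"Authorization": f"Bearer {key}"}
    # Regular Anthropic API keys and third-party keys (e.g. Azure AI Foundry)
    # both land here.
    return {"x-api-key": key}
```

Matching on the positive OAuth prefix, instead of on "anything that is not `sk-ant-api`", keeps unknown third-party key formats on the `x-api-key` path by default.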

Salvaged from PR #4075 by @HangGlidersRule.

---------

Co-authored-by: Clawdbot <clawdbot@openclaw.ai>
2026-03-30 17:41:13 -07:00
Teknium 720507efac feat: add post-migration cleanup for OpenClaw directories (#4100)
After migrating from OpenClaw, leftover workspace directories contain
state files (todo.json, sessions, logs) that confuse the agent — it
discovers them and reads/writes to stale locations instead of the
Hermes state directory, causing issues like cron jobs reading a
different todo list than interactive sessions.

Changes:
- hermes claw migrate now offers to archive the source directory after
  successful migration (rename to .pre-migration, not delete)
- New `hermes claw cleanup` subcommand for users who already migrated
  and need to archive leftover OpenClaw directories
- Migration notes updated with explicit cleanup guidance
- 42 tests covering all new functionality

Reported by SteveSkedasticity — multiple todo.json files across
~/.hermes/, ~/.openclaw/workspace/, and ~/.openclaw/workspace-assistant/
caused cron jobs to read from wrong locations.
2026-03-30 17:39:08 -07:00
Teknium 8a794d029d fix(ci): add repo conditionals to prevent fork workflow failures (#4107)
Add github.repository checks to docker-publish and deploy-site
workflows so they skip on forks where upstream-specific resources
(Docker Hub org, custom domain) are unavailable.

Co-authored-by: StreamOfRon <StreamOfRon@users.noreply.github.com>
2026-03-30 17:38:32 -07:00
Teknium e64b047663 chore: prepare Hermes for Homebrew packaging (#4099)
Co-authored-by: Yabuku-xD <78594762+Yabuku-xD@users.noreply.github.com>
2026-03-30 17:34:43 -07:00
Teknium 11aa44d34d docs(telegram): add webhook mode documentation (#4089)
Documents the Telegram webhook mode from #3880:
- New 'Webhook Mode' section in telegram.md with polling vs webhook
  comparison, config table, Fly.io deployment example, troubleshooting
- Add TELEGRAM_WEBHOOK_URL/PORT/SECRET to environment-variables.md
- Add Telegram section to .env.example (existing + webhook vars)

Co-authored-by: raulbcs <raulbcs@users.noreply.github.com>
2026-03-30 17:21:59 -07:00
Teknium 07746dca0c fix(matrix): E2EE decryption — request keys, auto-trust devices, retry buffered events (#4083)
When the Matrix adapter receives encrypted events it can't decrypt
(MegolmEvent), it now:

1. Requests the missing room key from other devices via
   client.request_room_key(event) instead of silently dropping the message

2. Buffers undecrypted events (bounded to 100, 5 min TTL) and retries
   decryption after each E2EE maintenance cycle when new keys arrive

3. Auto-trusts/verifies all devices after key queries so other clients
   share session keys with the bot proactively

4. Exports Megolm keys on disconnect and imports them on connect, so
   session keys survive gateway restarts

This addresses the 'could not decrypt event' warnings that caused the
bot to miss messages in encrypted rooms.
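The buffering step (item 2) might look roughly like the sketch below. Only the bound of 100 and the 5 minute TTL come from the commit message; the class name `UndecryptedBuffer`, the `try_decrypt` callback, and the injectable clock are assumptions made for illustration.

```python
import time
from collections import deque

MAX_BUFFERED = 100      # bound stated in the commit message
TTL_SECONDS = 5 * 60    # 5 min TTL stated in the commit message


class UndecryptedBuffer:
    """Hypothetical holder for events that could not be decrypted yet."""

    def __init__(self, now=time.monotonic):
        self._now = now
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._items = deque(maxlen=MAX_BUFFERED)

    def add(self, event):
        self._items.append((self._now(), event))

    def retry(self, try_decrypt):
        """Retry every unexpired event; keep the ones that still fail."""
        cutoff = self._now() - TTL_SECONDS
        pending = [(t, e) for t, e in self._items if t >= cutoff]
        self._items.clear()
        decrypted = []
        for stamp, event in pending:
            result = try_decrypt(event)  # assumed to return None on failure
            if result is None:
                self._items.append((stamp, event))
            else:
                decrypted.append(result)
        return decrypted
```

In this sketch `retry()` would be called after each maintenance cycle, so expired entries are discarded at the same moment new keys are tried.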
2026-03-30 17:16:09 -07:00
Teknium 7e0c2c3ce3 docs: comprehensive documentation audit — fix 9 HIGH, 20+ MEDIUM gaps (#4087)
Reference docs fixes:
- cli-commands.md: remove non-existent --provider alibaba, add hermes
  profile/completion/plugins/mcp to top-level table, add --profile/-p
  global flag, add --source chat option
- slash-commands.md: add /yolo and /commands, fix /q alias conflict
  (resolves to /queue not /quit), add missing aliases (/bg, /set-home,
  /reload_mcp, /gateway)
- toolsets-reference.md: fix hermes-api-server (not same as hermes-cli,
  omits clarify/send_message/text_to_speech)
- profile-commands.md: fix show name required not optional, --clone-from
  not --from, add --remove/--name to alias, fix alias path, fix export/
  import arg types, remove non-existent fish completion
- tools-reference.md: add EXA_API_KEY to web tools requires_env
- mcp-config-reference.md: add auth key for OAuth, tool name sanitization
- environment-variables.md: add EXA_API_KEY, update provider values
- plugins.md: remove non-existent ctx.register_command(), add
  ctx.inject_message()

Feature docs additions:
- security.md: add /yolo mode, approval modes (manual/smart/off),
  configurable timeout, expanded dangerous patterns table
- cron.md: add wrap_response config, [SILENT] suppression
- mcp.md: add dynamic tool discovery, MCP sampling support
- cli.md: add Ctrl+Z suspend, busy_input_mode, tool_preview_length
- docker.md: add skills/credential file mounting

Messaging platform docs:
- telegram.md: add webhook mode, DoH fallback IPs
- slack.md: add multi-workspace OAuth support
- discord.md: add DISCORD_IGNORE_NO_MENTION
- matrix.md: add MSC3245 native voice messages
- feishu.md: expand from 129 to 365 lines (encrypt key, verification
  token, group policy, card actions, media, rate limiting, markdown,
  troubleshooting)
- wecom.md: expand from 86 to 264 lines (per-group allowlists, media,
  AES decryption, stream replies, reconnection, troubleshooting)

Configuration docs:
- quickstart.md: add DeepSeek, Copilot, Copilot ACP providers
- configuration.md: add DeepSeek provider, Exa web backend, terminal
  env_passthrough/images, browser.command_timeout, compression params,
  discord config, security/tirith config, timezone, auxiliary models

21 files changed, ~1000 lines added
2026-03-30 17:15:21 -07:00
SHL0MS 3c8f910973 feat: respect NO_COLOR env var and TERM=dumb (#4079)
Add should_use_color() function to hermes_cli/colors.py that checks
NO_COLOR (https://no-color.org/) and TERM=dumb before emitting ANSI
escapes. The existing color() helper now uses this function instead
of a bare isatty() check.
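A minimal sketch of such a check, assuming the helper signatures shown here (the real `hermes_cli/colors.py` may differ). The NO_COLOR convention treats any non-empty value as a request to disable color.

```python
import os
import sys


def should_use_color(stream=None) -> bool:
    """Decide whether ANSI escapes should be emitted on the given stream."""
    stream = sys.stdout if stream is None else stream
    if os.environ.get("NO_COLOR"):          # present and non-empty
        return False
    if os.environ.get("TERM") == "dumb":
        return False
    return hasattr(stream, "isatty") and stream.isatty()


def color(text: str, code: str = "32") -> str:
    """Wrap text in an ANSI color code only when color is allowed."""
    if not should_use_color():
        return text
    return f"\033[{code}m{text}\033[0m"
```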

This is the foundation — cli.py and banner.py still have inline ANSI
constants that bypass this module (tracked in #4071).

Closes #4066

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
2026-03-30 17:07:21 -07:00
Teknium 13f3e67165 ux: show 'Initializing agent...' on first message (#4086)
Display a brief status message before the heavy agent initialization
(OpenAI client setup, tool loading, memory init, etc.) so users
aren't staring at a blank screen for several seconds.

Only prints when self.agent is None (first use or after model switch).

Closes #4060

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
2026-03-30 17:05:40 -07:00
Teknium 4a7c17fca5 fix(gateway): read custom_providers context_length in hygiene compression (#4085)
Gateway hygiene pre-compression only checked model.context_length from
the top-level config, missing per-model context_length defined in
custom_providers entries. This caused premature compression for custom
provider users (e.g. 128K default instead of 200K configured).

The AIAgent's own compressor already reads custom_providers correctly
(run_agent.py lines 1171-1189). This adds the same fallback to the
gateway hygiene path, running after runtime provider resolution so
the base_url is available for matching.
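The fallback could be sketched as below. The config shape is an assumption: `custom_providers` is taken to be a list of entries that each carry a `base_url` and an optional `context_length`, and the 128K default mirrors the example in the message. The function name is hypothetical.

```python
DEFAULT_CONTEXT_LENGTH = 128_000  # assumed default, per the "128K default" example


def resolve_context_length(config: dict, base_url: str) -> int:
    """Top-level model.context_length first, then a matching custom provider."""
    model_cfg = config.get("model")
    if isinstance(model_cfg, dict) and model_cfg.get("context_length"):
        return int(model_cfg["context_length"])

    wanted = base_url.rstrip("/")
    for entry in config.get("custom_providers", []):
        # Match on the resolved base_url, ignoring a trailing slash.
        if entry.get("base_url", "").rstrip("/") == wanted:
            if entry.get("context_length"):
                return int(entry["context_length"])
    return DEFAULT_CONTEXT_LENGTH
```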
2026-03-30 17:04:31 -07:00
Teknium f007284d05 fix: rate-limit pairing rejection messages to prevent spam (#4081)
* fix: rate-limit pairing rejection messages to prevent spam

When generate_code() returns None (rate limited or max pending), the
"Too many pairing requests" message was sent on every subsequent DM
with no cooldown. A user sending 30 messages would get 30 rejection
replies, which was reported as a potential hack on WhatsApp.

Now check _is_rate_limited() before any pairing response, and record
rate limit after sending a rejection. Subsequent messages from the
same user are silently ignored until the rate limit window expires.
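The flow described above can be sketched as follows. The class, the window length, and the reply strings are hypothetical; only the order of operations (check the rate limit first, record it after sending a rejection) follows the commit message.

```python
import time

RATE_LIMIT_WINDOW = 600.0  # seconds; an assumed value, not from the source


class PairingResponder:
    def __init__(self, now=time.monotonic):
        self._now = now
        self._limited_until = {}

    def _is_rate_limited(self, user_id) -> bool:
        until = self._limited_until.get(user_id)
        return until is not None and self._now() < until

    def _record_rate_limit(self, user_id) -> None:
        self._limited_until[user_id] = self._now() + RATE_LIMIT_WINDOW

    def handle_dm(self, user_id, generate_code):
        """Return the reply to send, or None to stay silent."""
        if self._is_rate_limited(user_id):
            return None
        code = generate_code()
        if code is None:
            # Send the rejection once, then suppress further replies.
            self._record_rate_limit(user_id)
            return "Too many pairing requests"
        return f"Pairing code: {code}"
```

With this shape, a user who sends 30 messages while no code can be generated receives a single rejection until the window expires.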

* test: add coverage for pairing response rate limiting

Follow-up to cherry-picked PR #4042 — adds tests verifying:
- Rate-limited users get silently ignored (no response sent)
- Rejection messages record rate limit for subsequent suppression

---------

Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
2026-03-30 16:48:00 -07:00
Teknium 3d47af01c3 fix(honcho): write config to instance-local path for profile isolation (#4037)
Multiple agents/profiles running 'hermes honcho setup' all wrote to
the shared global ~/.honcho/config.json, overwriting each other's
configuration.

Root cause: _write_config() defaulted to resolve_config_path() which
returns the global path when no instance-local file exists yet (i.e.
on first setup).

Fix: _write_config() now defaults to _local_config_path() which always
returns $HERMES_HOME/honcho.json. Each profile gets its own config file.
Reading still falls back to global for cross-app interop and seeding.
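A rough sketch of the write-local, read-with-fallback split. The function names are illustrative, and the `~/.hermes` default for an unset `HERMES_HOME` is an assumption; the two file locations come from the commit message.

```python
import json
import os
from pathlib import Path

GLOBAL_CONFIG = Path.home() / ".honcho" / "config.json"


def local_config_path() -> Path:
    """Always the instance-local file, whether or not it exists yet."""
    home = os.environ.get("HERMES_HOME", str(Path.home() / ".hermes"))
    return Path(home) / "honcho.json"


def resolve_read_path() -> Path:
    """Reads prefer the local file and fall back to the global one."""
    local = local_config_path()
    return local if local.exists() else GLOBAL_CONFIG


def write_config(data: dict) -> Path:
    """Writes never touch the shared global file."""
    path = local_config_path()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2))
    return path
```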

Also updates cmd_setup and cmd_status messaging to show the actual
write path.

Includes 10 new tests verifying profile isolation, global fallback
reads, and multi-profile independence.
2026-03-30 16:41:19 -07:00
SHL0MS 275fcc6673 Merge pull request #4054 from NousResearch/ascii-video/text-readability-and-layout-oracle
ascii-video skill: text readability techniques and external layout oracle
2026-03-30 15:52:14 -07:00
SHL0MS ab62614a89 ascii-video: add text readability techniques and external layout oracle pattern
- composition.md: add text backdrop (gaussian dark mask behind glyphs) and
  external layout oracle pattern (browser-based text layout → JSON → Python
  renderer pipeline for obstacle-aware text reflow)
- shaders.md: add reverse vignette shader (center-darkening for text readability)
- troubleshooting.md: add diagnostic entries for text-over-busy-background
  readability and kaleidoscope-destroys-text pitfall
2026-03-30 18:48:22 -04:00
Teknium de368cac54 fix(tools): show browser and TTS in reconfigure menu (#4041)
* fix(gateway): honor default for invalid bool-like config values

* refactor: simplify web backend priority detection

Replace cascading boolean conditions with a priority-ordered loop.
Same behavior (verified against all 16 env var combinations),
half the lines, trivially extensible for new backends.

* fix(tools): show browser and TTS in reconfigure menu

_toolset_has_keys() returned False for toolsets with no-key providers
(Local Browser, Edge TTS) because it only checked providers with
env_vars. Users couldn't find these tools in the reconfigure list
and had no obvious way to switch browser/TTS backends.

Now treats providers with empty env_vars as always-configured, so
toolsets with free/local options always appear in the reconfigure menu.
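The changed check might look like this sketch. The provider data shape (a list of dicts with an `env_vars` list) and the `LOCAL_BROWSER_EXAMPLE_KEY` style names in the usage are assumptions for illustration.

```python
def toolset_has_keys(providers, env) -> bool:
    """True if any provider of the toolset is usable with the given env."""
    for provider in providers:
        env_vars = provider.get("env_vars", [])
        if not env_vars:
            # No-key providers (local or free backends) count as configured.
            return True
        if all(env.get(var) for var in env_vars):
            return True
    return False
```

For example, a toolset whose providers are `[{"name": "Local Browser", "env_vars": []}]` is reported as configured even with an empty environment.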

---------

Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
2026-03-30 14:11:39 -07:00
Teknium 0d1003559d refactor: simplify web backend priority detection (#4036)
* fix(gateway): honor default for invalid bool-like config values

* refactor: simplify web backend priority detection

Replace cascading boolean conditions with a priority-ordered loop.
Same behavior (verified against all 16 env var combinations),
half the lines, trivially extensible for new backends.
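A priority-ordered loop of this kind can be sketched as below. The backend names and the order are placeholders; apart from `EXA_API_KEY`, which appears elsewhere in this log, the environment variable names are hypothetical.

```python
import os

# Highest priority first. Names and order are illustrative only.
BACKEND_PRIORITY = [
    ("backend_a", "BACKEND_A_API_KEY"),
    ("backend_b", "BACKEND_B_API_KEY"),
    ("exa", "EXA_API_KEY"),
    ("backend_c", "BACKEND_C_API_KEY"),
]


def detect_web_backend(env=None):
    """Return the first backend whose env var is set, or None."""
    env = os.environ if env is None else env
    for name, var in BACKEND_PRIORITY:
        if env.get(var):
            return name
    return None
```

Adding a backend then means adding one tuple at the right position, with no new boolean branches.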

---------

Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
2026-03-30 13:37:25 -07:00
Teknium eba8d52d54 fix: show correct shell config path for macOS/zsh in install script (#4025)
- print_success() hardcoded 'source ~/.bashrc' regardless of user's shell
- On macOS (default zsh), ~/.bashrc doesn't exist, leaving users unable to
  find the hermes command after install
- Now detects $SHELL and shows the correct file (zshrc/bashrc)
- Also captures .[all] install failure output instead of silencing with
  2>/dev/null, so users can diagnose why full extras failed
2026-03-30 13:25:11 -07:00
Teknium 72104eb06f fix(gateway): honor default for invalid bool-like config values (#4029)
Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
2026-03-30 13:24:48 -07:00
Teknium 4b35836ba4 fix(auth): use bearer auth for MiniMax Anthropic endpoints (#4028)
MiniMax's /anthropic endpoints implement Anthropic's Messages API but
require Authorization: Bearer instead of x-api-key. Without this fix,
MiniMax users get 401 errors in gateway sessions.

Adds _requires_bearer_auth() to detect MiniMax endpoints and route
through auth_token in the Anthropic SDK. Check runs before OAuth
token detection so MiniMax keys aren't misclassified as setup tokens.
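The ordering of the two checks can be sketched as follows. How MiniMax endpoints are actually detected is not stated, so the host and path test here is an assumption, as are the function names and the example URL. The returned keys mirror the `api_key` and `auth_token` constructor arguments of the Anthropic Python SDK.

```python
from urllib.parse import urlparse


def requires_bearer_auth(base_url) -> bool:
    """Assumed heuristic: a MiniMax host serving an /anthropic path."""
    parsed = urlparse(base_url or "")
    host = parsed.hostname or ""
    return "minimax" in host and parsed.path.rstrip("/").endswith("/anthropic")


def client_auth_kwargs(key: str, base_url=None) -> dict:
    """Pick the SDK auth argument; the endpoint check runs first."""
    if requires_bearer_auth(base_url):
        return {"auth_token": key}       # sent as Authorization: Bearer
    if key.startswith("sk-ant-oat"):
        return {"auth_token": key}       # OAuth token
    return {"api_key": key}              # sent as x-api-key
```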

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-30 13:21:39 -07:00
Teknium bd376fe976 fix(docs): improve mobile sidebar navigation
The sidebar had all categories expanded by default (collapsed: false),
which on mobile created a 60+ item flat list when opening the sidebar.
Reported by danny on Discord.

Changes:
- Set all top-level categories to collapsed: true (tap to expand)
- Enable autoCollapseCategories: true (accordion — opening one section
  closes others, prevents the overwhelming flat list)
- Enable hideable sidebar (swipe-to-dismiss on mobile)
- Add mobile CSS: larger touch targets (0.75rem padding), bolder
  category headers, visible subcategory indentation with left border,
  wider sidebar (85vw / 360px max), darker backdrop overlay
2026-03-30 13:20:55 -07:00
Teknium f93637b3a1 feat: add /profile slash command to show active profile (#4027)
Adds /profile to COMMAND_REGISTRY (Info category) with handlers in
both CLI and gateway. Shows the active profile name and home directory.

Works on all platforms — CLI, Telegram, Discord, Slack, etc.
Detects profile by checking if HERMES_HOME is under ~/.hermes/profiles/.
Shows 'default' when running without a profile.
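One way to express that detection, as a sketch: the function name and the `~/.hermes` fallback for an unset `HERMES_HOME` are assumptions.

```python
import os
from pathlib import Path


def active_profile(hermes_home=None) -> str:
    """Name of the profile HERMES_HOME points into, or 'default'."""
    raw = hermes_home or os.environ.get("HERMES_HOME", "~/.hermes")
    home = Path(raw).expanduser().resolve()
    profiles_root = Path("~/.hermes/profiles").expanduser().resolve()
    try:
        relative = home.relative_to(profiles_root)
    except ValueError:
        # HERMES_HOME is not under the profiles directory.
        return "default"
    return relative.parts[0] if relative.parts else "default"
```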
2026-03-30 13:20:06 -07:00
Teknium 7b4fe0528f fix(auth): use bearer auth for MiniMax Anthropic endpoints (#4028)
MiniMax's /anthropic endpoints implement Anthropic's Messages API but
require Authorization: Bearer instead of x-api-key. Without this fix,
MiniMax users get 401 errors in gateway sessions.

Adds _requires_bearer_auth() to detect MiniMax endpoints and route
through auth_token in the Anthropic SDK. Check runs before OAuth
token detection so MiniMax keys aren't misclassified as setup tokens.

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-30 13:19:44 -07:00
Teknium 950f69475f feat(browser): add Camofox local anti-detection browser backend (#4008)
Camofox-browser is a self-hosted Node.js server wrapping Camoufox
(Firefox fork with C++ fingerprint spoofing). When CAMOFOX_URL is set,
all 11 browser tools route through the Camofox REST API instead of
the agent-browser CLI.

Maps 1:1 to the existing browser tool interface:
- Navigate, snapshot, click, type, scroll, back, press, close
- Get images, vision (screenshot + LLM analysis)
- Console (returns empty with note — camofox limitation)

Setup: npm start in camofox-browser dir, or docker run -p 9377:9377
Then: CAMOFOX_URL=http://localhost:9377 in ~/.hermes/.env

Advantages over Browserbase (cloud):
- Free (no per-session API costs)
- Local (zero network latency for browser ops)
- Anti-detection at C++ level (bypasses Cloudflare/Google bot detection)
- Works offline, Docker-ready

Files:
- tools/browser_camofox.py: Full REST backend (~400 lines)
- tools/browser_tool.py: Routing at each tool function
- hermes_cli/config.py: CAMOFOX_URL env var entry
- tests/tools/test_browser_camofox.py: 20 tests
2026-03-30 13:18:42 -07:00
Teknium 7dac75f2ae fix: prevent context pressure warning spam after compression (#4012)
* feat: add /yolo slash command to toggle dangerous command approvals

Adds a /yolo command that toggles HERMES_YOLO_MODE at runtime, skipping
all dangerous command approval prompts for the current session. Works in
both CLI and gateway (Telegram, Discord, etc.).

- /yolo -> ON: all commands auto-approved, no confirmation prompts
- /yolo -> OFF: normal approval flow restored

The --yolo CLI flag already existed for launch-time opt-in. This adds
the ability to toggle mid-session without restarting.

Session-scoped — resets when the process ends. Uses the existing
HERMES_YOLO_MODE env var that check_all_command_guards() already
respects.

* fix: prevent context pressure warning spam (agent loop + gateway rate-limit)

Two complementary fixes for repeated context pressure warnings spamming
gateway users (Telegram, Discord, etc.):

1. Agent-level loop fix (run_agent.py):
   After compression, only reset _context_pressure_warned if the
   post-compression estimate is actually below the 85% warning level.
   Previously the flag was unconditionally reset, causing the warning
   to re-fire every loop iteration when compression couldn't reduce
   below 85% of the threshold (e.g. very low threshold like 15%,
   or system prompt alone exceeds the warning level).

2. Gateway-level rate-limit (gateway/run.py, salvaged from PR #3786):
   Per-chat_id cooldown of 1 hour on compression warning messages.
   Both warning paths ('still large after compression' and 'compression
   failed') are gated. Defense-in-depth — even if the agent-level fix
   has edge cases, users won't see more than one warning per hour.
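The agent-level half of the fix can be illustrated with a small state holder. The class and method names are hypothetical; the 85% warning level and the conditional reset follow the description in item 1.

```python
WARNING_FRACTION = 0.85  # warning level as a fraction of the threshold


class PressureTracker:
    def __init__(self, threshold_tokens: int):
        self.warning_level = threshold_tokens * WARNING_FRACTION
        self.warned = False

    def check(self, estimated_tokens: int) -> bool:
        """Return True when a warning should be emitted right now."""
        if estimated_tokens >= self.warning_level and not self.warned:
            self.warned = True
            return True
        return False

    def after_compression(self, estimated_tokens: int) -> None:
        # Re-arm the warning only if compression got below the warning level.
        if estimated_tokens < self.warning_level:
            self.warned = False
```

If compression cannot get below the warning level, the flag stays set and the warning does not fire again on the next loop iteration.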

Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com>

---------

Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com>
2026-03-30 13:18:21 -07:00
Teknium ed9af6e589 fix: create AsyncOpenAI lazily in trajectory_compressor to avoid closed event loop (#4013)
The AsyncOpenAI client was created once at __init__ and stored as an
instance attribute. process_directory() calls asyncio.run() which creates
and closes a fresh event loop. On a second call, the client's httpx
transport is still bound to the closed loop, raising RuntimeError:
"Event loop is closed" — the same pattern fixed by PR #3398 for the
main agent loop.

Create the client lazily in _get_async_client() so each asyncio.run()
gets a client bound to the current loop.
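A minimal sketch of the lazy pattern, using a stand-in class instead of the real `AsyncOpenAI` so it runs without network access or extra packages. Caching the client per running loop is one possible way to implement it and is an assumption here.

```python
import asyncio


class FakeAsyncClient:
    """Stand-in for a client whose transport binds to the running loop."""

    def __init__(self):
        self.loop = asyncio.get_running_loop()


class Compressor:
    def __init__(self):
        # No client is created here, so nothing binds to a loop too early.
        self._client = None
        self._client_loop = None

    def _get_async_client(self):
        loop = asyncio.get_running_loop()
        if self._client is None or self._client_loop is not loop:
            self._client = FakeAsyncClient()
            self._client_loop = loop
        return self._client

    async def _work(self):
        return self._get_async_client()

    def process_directory(self):
        # Each asyncio.run() creates and then closes a fresh event loop.
        return asyncio.run(self._work())
```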

Co-authored-by: binhnt92 <binhnt.ht.92@gmail.com>
2026-03-30 13:16:16 -07:00
Teknium 158f49f19a fix: enforce priority order in Telegram menu — core > plugins > skills (#4023)
The menu now has explicit priority tiers:
1. Core CommandDef commands (always included, never bumped)
2. Plugin slash commands (take precedence over skills)
3. Built-in skill commands (fill remaining slots alphabetically)

Only skills get trimmed when the 100-command cap is hit. Adding new
core commands or plugin commands automatically pushes skills out,
not the other way around.
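The tiering can be sketched as a short function. The name `build_menu` and the plain string lists are illustrative; the sketch does not handle the case where core and plugin commands alone exceed the cap.

```python
MENU_CAP = 100  # Telegram command cap referenced above


def build_menu(core, plugins, skills, cap=MENU_CAP):
    """Core first, then plugins, then skills alphabetically in what is left."""
    menu = list(core) + list(plugins)
    remaining = max(cap - len(menu), 0)
    # Only the skill tier is trimmed when the cap is reached.
    menu += sorted(skills)[:remaining]
    return menu
```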
2026-03-30 13:04:06 -07:00
Teknium 86250a3e45 docs: expand terminal backends section + fix docs build (#4016)
* feat(telegram): add webhook mode as alternative to polling

When TELEGRAM_WEBHOOK_URL is set, the adapter starts an HTTP webhook
server (via python-telegram-bot's start_webhook()) instead of long
polling. This enables cloud platforms like Fly.io and Railway to
auto-wake suspended machines on inbound HTTP traffic.

Polling remains the default — no behavior change unless the env var
is set.

Env vars:
  TELEGRAM_WEBHOOK_URL    Public HTTPS URL for Telegram to push to
  TELEGRAM_WEBHOOK_PORT   Local listen port (default 8443)
  TELEGRAM_WEBHOOK_SECRET Secret token for update verification

Cherry-picked and adapted from PR #2022 by SHL0MS. Preserved all
current main enhancements (network error recovery, polling conflict
detection, DM topics setup).

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>

* fix: send_document call in background task delivery + vision download timeout

Two fixes salvaged from PR #2269 by amethystani:

1. gateway/run.py: adapter.send_file() → adapter.send_document()
   send_file() doesn't exist on BasePlatformAdapter. Background task
   media files were silently never delivered (AttributeError swallowed
   by except Exception: pass).

2. tools/vision_tools.py: configurable image download timeout via
   HERMES_VISION_DOWNLOAD_TIMEOUT env var (default 30s), plus guard
   against raise None when max_retries=0.

The third fix in #2269 (opencode-go auth config) was already resolved
on main.

Co-authored-by: amethystani <amethystani@users.noreply.github.com>

* docs: expand terminal backends section + fix feishu MDX build error

---------

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
Co-authored-by: amethystani <amethystani@users.noreply.github.com>
2026-03-30 12:59:58 -07:00
Teknium ea342f2382 Fix banner alignment in installer script (#4011)
Co-authored-by: Ahmed Khaled <wakeupwithme000@gmail.com>
2026-03-30 11:24:10 -07:00
Teknium 60ecde8ac7 fix: fit all 100 commands in Telegram menu with 40-char descriptions (#4010)
* fix: truncate skill descriptions to 100 chars in Telegram menu

* fix: 40-char desc cap + 100 command limit for Telegram menu

setMyCommands has an undocumented total payload size limit.
50 commands with 256-char descriptions failed, 50 with 100-char
descriptions worked, and 100 with 40-char descriptions also worked
(~5300 total chars). Truncate skill descriptions to 40 chars in the menu picker
and set cap back to 100. Full descriptions available via /commands.
2026-03-30 11:21:13 -07:00
Teknium f3069c649c fix(cli): add missing subprocess.run() timeouts in doctor and status (#4009)
Add timeout parameters to 4 subprocess.run() calls that could hang
indefinitely if the child process blocks (e.g., unresponsive docker
daemon, systemctl waiting for D-Bus):

- doctor.py: docker info (timeout=10), ssh check (timeout=15)
- status.py: systemctl is-active (timeout=5), launchctl list (timeout=5)

Each call site now catches subprocess.TimeoutExpired and treats it as
a failure, consistent with how non-zero return codes are already handled.

Add AST-based regression test that verifies every subprocess.run() call
in CLI modules specifies a timeout keyword argument.
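Both parts can be sketched in a few lines: a call site that treats a timeout as failure, and an AST walk that flags `subprocess.run()` calls lacking a `timeout` keyword. The helper names are hypothetical, and the AST check only recognizes the literal `subprocess.run(...)` spelling.

```python
import ast
import subprocess
import sys


def command_succeeds(cmd, timeout) -> bool:
    """Run cmd; a timeout counts as failure, like a non-zero exit code."""
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def calls_missing_timeout(source: str):
    """Line numbers of subprocess.run() calls without a timeout keyword."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_run = (
            isinstance(func, ast.Attribute)
            and func.attr == "run"
            and isinstance(func.value, ast.Name)
            and func.value.id == "subprocess"
        )
        if is_run and not any(kw.arg == "timeout" for kw in node.keywords):
            missing.append(node.lineno)
    return missing
```

A regression test built on `calls_missing_timeout` would read each CLI module's source and assert that the returned list is empty.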

Co-authored-by: dieutx <dangtc94@gmail.com>
2026-03-30 11:17:15 -07:00
Teknium 0976bf6cd0 feat: add /yolo slash command to toggle dangerous command approvals (#3990)
Adds a /yolo command that toggles HERMES_YOLO_MODE at runtime, skipping
all dangerous command approval prompts for the current session. Works in
both CLI and gateway (Telegram, Discord, etc.).

- /yolo -> ON: all commands auto-approved, no confirmation prompts
- /yolo -> OFF: normal approval flow restored

The --yolo CLI flag already existed for launch-time opt-in. This adds
the ability to toggle mid-session without restarting.

Session-scoped — resets when the process ends. Uses the existing
HERMES_YOLO_MODE env var that check_all_command_guards() already
respects.
2026-03-30 11:17:09 -07:00
Teknium da3e22bcfa fix: cap Telegram menu at 50 commands — API rejects above ~60 (#4006)
* fix: use SKILLS_DIR not repo path for Telegram menu skill filter

Skills are synced to ~/.hermes/skills/ (SKILLS_DIR), not the repo's
skills/ directory. The previous filter compared against the repo path
so no skills matched. Now checks SKILLS_DIR and excludes .hub/
subdirectory (user-installed hub skills).

* fix: cap Telegram menu at 50 commands — API rejects above ~60

Telegram's setMyCommands returns BOT_COMMANDS_TOO_MUCH when
registering close to 100 commands despite docs claiming 100 is the
limit. Metadata overhead causes rejection above ~60. Cap at 50 for
reliability — remaining commands accessible via /commands.
2026-03-30 11:05:20 -07:00
Teknium 9fd78c7a8e fix: use SKILLS_DIR not repo path for Telegram menu skill filter (#4005)
Skills are synced to ~/.hermes/skills/ (SKILLS_DIR), not the repo's
skills/ directory. The previous filter compared against the repo path
so no skills matched. Now checks SKILLS_DIR and excludes .hub/
subdirectory (user-installed hub skills).
2026-03-30 11:01:13 -07:00
Teknium 5ceed021dc feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap (#3934)
* feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap

Map active skills to Telegram's slash command menu so users can
discover and invoke skills directly. Three changes:

1. Telegram menu now includes active skill commands alongside built-in
   commands, capped at 100 entries (Telegram Bot API limit). Overflow
   commands remain callable but hidden from the picker. Logged at
   startup when cap is hit.

2. New /commands [page] gateway command for paginated browsing of all
   commands + skills. /help now shows first 10 skill commands and
   points to /commands for the full list.

3. When a user types a slash command that matches a disabled or
   uninstalled skill, they get actionable guidance:
   - Disabled: 'Enable it with: hermes skills config'
   - Optional (not installed): 'Install with: hermes skills install official/<path>'

Built on ideas from PR #3921 by @kshitijk4poor.

* chore: move 21 niche skills to optional-skills

Move specialized/niche skills from built-in (skills/) to optional
(optional-skills/) to reduce the default skill count. Users can
install them with: hermes skills install official/<category>/<name>

Moved skills (21):
- mlops: accelerate, chroma, faiss, flash-attention,
  hermes-atropos-environments, huggingface-tokenizers, instructor,
  lambda-labs, llava, nemo-curator, pinecone, pytorch-lightning,
  qdrant, saelens, simpo, slime, tensorrt-llm, torchtitan
- research: domain-intel, duckduckgo-search
- devops: inference-sh cli

Built-in skills: 96 → 75
Optional skills: 22 → 43

* fix: only include repo built-in skills in Telegram menu, not user-installed

User-installed skills (from hub or manually added) stay accessible via
/skills and by typing the command directly, but don't get registered
in the Telegram slash command picker. Only skills whose SKILL.md is
under the repo's skills/ directory are included in the menu.

This keeps the Telegram menu focused on the curated built-in set while
user-installed skills remain discoverable through /skills and /commands.
2026-03-30 10:57:30 -07:00
Teknium 97d6813f51 fix(cache): use deterministic call_id fallbacks instead of random UUIDs (#3991)
When the API doesn't provide a call_id for tool calls, the fallback
generated a random uuid4 hex. This made every API call's input unique
when replayed, preventing OpenAI's prompt cache from matching the
prefix across turns.

Replaced all four uuid4 fallback sites with a deterministic hash of
(function_name, arguments, position_index). The same tool call now
always produces the same fallback call_id, preserving cache-friendly
input stability.

Affected code paths:
- _chat_messages_to_responses_input() — Codex input reconstruction
- _normalize_codex_response() — function_call and custom_tool_call
- _build_assistant_message() — assistant message construction
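A deterministic fallback of this kind could look like the sketch below. The hash algorithm, the `call_` prefix, and the truncation length are assumptions; only the three hashed inputs come from the commit message.

```python
import hashlib
import json


def fallback_call_id(function_name, arguments, position_index) -> str:
    """Derive a stable id from the tool call itself instead of uuid4()."""
    if not isinstance(arguments, str):
        # Canonical JSON so equal dicts hash identically.
        arguments = json.dumps(arguments, sort_keys=True)
    payload = f"{function_name}:{arguments}:{position_index}"
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"call_{digest[:24]}"
```

Because the id depends only on the call's content and position, replaying the same conversation yields byte-identical input, which is what prefix caching needs.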
2026-03-30 09:43:56 -07:00
Teknium 37825189dd fix(skills): validate hub bundle paths before install (#3986)
Co-authored-by: Gutslabs <gutslabsxyz@gmail.com>
2026-03-30 08:37:19 -07:00
Teknium e08778fa1e chore: release v0.6.0 (2026.3.30) (#3985)
2026-03-30 08:29:38 -07:00
Teknium fb634068df fix(security): extend secret redaction to ElevenLabs, Tavily and Exa API keys (#3920)
ElevenLabs (sk_), Tavily (tvly-), and Exa (exa_) keys were not covered
by _PREFIX_PATTERNS, leaking in plain text via printenv or log output.
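Prefix-based redaction can be sketched as follows. The three prefixes come from the commit message; the character class, the minimum length of 10, and the `[REDACTED]` mask are assumptions, and the real `_PREFIX_PATTERNS` list is likely longer.

```python
import re

# Illustrative patterns only; lengths and character sets are assumed.
PREFIX_PATTERNS = [
    r"sk_[A-Za-z0-9_-]{10,}",    # ElevenLabs
    r"tvly-[A-Za-z0-9_-]{10,}",  # Tavily
    r"exa_[A-Za-z0-9_-]{10,}",   # Exa
]
_REDACT_RE = re.compile(r"\b(?:" + "|".join(PREFIX_PATTERNS) + r")")


def redact(text: str) -> str:
    """Replace anything that looks like a known-prefix key with a mask."""
    return _REDACT_RE.sub("[REDACTED]", text)
```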

Salvaged from PR #3790 by @memosr. Tests rewritten with correct
assertions (original tests had vacuously true checks).

Co-authored-by: memosr <memosr@users.noreply.github.com>
2026-03-30 08:13:01 -07:00
Teknium 74181fe726 fix: add TTY guard to interactive CLI commands to prevent CPU spin (#3933)
When interactive TUI commands are invoked non-interactively (e.g. via
the agent's terminal() tool through a subprocess pipe), curses loops
spin at 100% CPU and input() calls hang indefinitely.

Defense in depth — two layers:

1. Source-level guard in curses_checklist() (curses_ui.py + checklist.py):
   Returns cancel_returns immediately when stdin is not a TTY. This
   catches ALL callers automatically, including future code.

2. Command-level guards with clear error messages:
   - hermes tools (interactive checklist, not list/disable/enable)
   - hermes setup (interactive wizard)
   - hermes model (provider/model picker)
   - hermes whatsapp (pairing setup)
   - hermes skills config (skill toggle)
   - hermes mcp configure (tool selection)
   - hermes uninstall (confirmation prompt)

Non-interactive subcommands (hermes tools list, hermes tools enable,
hermes mcp add/remove/list/test, hermes skills search/install/browse)
remain unaffected.
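The two layers can be sketched like this. The signatures, the `stdin` parameter, and the error wording are assumptions; the interactive UI itself is left out.

```python
import sys


def curses_checklist(items, cancel_returns=None, stdin=None):
    """Source-level guard: bail out before any curses loop can start."""
    stdin = sys.stdin if stdin is None else stdin
    if not stdin.isatty():
        return cancel_returns
    raise NotImplementedError("interactive UI omitted from this sketch")


def require_tty(command_name: str, stdin=None) -> None:
    """Command-level guard with a clear error for non-interactive callers."""
    stdin = sys.stdin if stdin is None else stdin
    if not stdin.isatty():
        raise SystemExit(
            f"'{command_name}' needs an interactive terminal (stdin is not a TTY)"
        )
```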
2026-03-30 08:10:23 -07:00
Teknium 1e896b0251 fix: resolve 7 failing CI tests (#3936)
1. matrix voice: _on_room_message_media unconditionally overwrote
   media_urls with the image cache path (always None for non-images),
   wiping the locally-cached voice path. Now only overrides when
   cached_path is truthy.

2. cli_tools_command: /tools disable no longer prompts for confirmation
   (input() removed in earlier commit to fix TUI hang), but tests still
   expected the old Y/N prompt flow. Updated tests to match current
   behavior (direct apply + session reset).

3. slack app_mention: connect() was refactored for multi-workspace
   (creates AsyncWebClient per token), but test only mocked the old
   self._app.client path. Added AsyncWebClient and acquire_scoped_lock
   mocks.

4. website_policy: module-level _cached_policy from earlier tests caused
   fast-path return of None. Added invalidate_cache() before assertion.

5. codex 401 refresh: already passing on current main (fixed by
   intervening commit).
2026-03-30 08:10:14 -07:00
0xbyt4 0b0c1b326c fix: openclaw migration overwrites model config dict with string (#3924)
migrate_model_config() was writing `config["model"] = model_str` which
replaces the entire model dict (default, provider, base_url) with a
bare string. This causes 'str' object has no attribute 'get' errors
throughout Hermes when any code does model_cfg.get("default").

Now preserves the existing model dict and only updates the "default"
key, keeping provider/base_url intact.
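The corrected update can be sketched as follows. The two-argument signature is an assumption, as is the choice to start from an empty dict when the existing value is missing or not a dict.

```python
def migrate_model_config(config: dict, model_str: str) -> dict:
    """Update only model['default'], leaving provider and base_url alone."""
    model_cfg = config.get("model")
    if not isinstance(model_cfg, dict):
        # Missing or legacy string value: start from an empty dict.
        model_cfg = {}
    model_cfg["default"] = model_str
    config["model"] = model_cfg
    return config
```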
2026-03-30 03:02:28 -07:00
Teknium b4496b33b5 fix: background task media delivery + vision download timeout (#3919)
* feat(telegram): add webhook mode as alternative to polling

When TELEGRAM_WEBHOOK_URL is set, the adapter starts an HTTP webhook
server (via python-telegram-bot's start_webhook()) instead of long
polling. This enables cloud platforms like Fly.io and Railway to
auto-wake suspended machines on inbound HTTP traffic.

Polling remains the default — no behavior change unless the env var
is set.

Env vars:
  TELEGRAM_WEBHOOK_URL    Public HTTPS URL for Telegram to push to
  TELEGRAM_WEBHOOK_PORT   Local listen port (default 8443)
  TELEGRAM_WEBHOOK_SECRET Secret token for update verification

Cherry-picked and adapted from PR #2022 by SHL0MS. Preserved all
current main enhancements (network error recovery, polling conflict
detection, DM topics setup).

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>

* fix: send_document call in background task delivery + vision download timeout

Two fixes salvaged from PR #2269 by amethystani:

1. gateway/run.py: adapter.send_file() → adapter.send_document()
   send_file() doesn't exist on BasePlatformAdapter. Background task
   media files were silently never delivered (AttributeError swallowed
   by except Exception: pass).

2. tools/vision_tools.py: configurable image download timeout via
   HERMES_VISION_DOWNLOAD_TIMEOUT env var (default 30s), plus guard
   against raise None when max_retries=0.

The third fix in #2269 (opencode-go auth config) was already resolved
on main.

Co-authored-by: amethystani <amethystani@users.noreply.github.com>

---------

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
Co-authored-by: amethystani <amethystani@users.noreply.github.com>
2026-03-30 02:59:39 -07:00
Teknium d028a94b83 fix(whatsapp): skip reply prefix in bot mode — only needed for self-chat (#3931)
The WhatsApp bridge prepends '⚕ *Hermes Agent*\n────────────\n' to
every outgoing message. In self-chat mode this is necessary to
distinguish the bot's responses from the user's own messages. In bot
mode the messages already come from a different number, making the
prefix redundant clutter.

Now only prepends the prefix when WHATSAPP_MODE is 'self-chat' (the
default). Bot mode messages are sent clean.
2026-03-30 02:55:33 -07:00
Teknium 0e592aa5b4 fix(cli): remove input() from /tools disable that freezes the terminal (#3918)
input() hangs inside prompt_toolkit's TUI event loop — this is a known
pitfall (AGENTS.md). The /tools disable and /tools enable commands used
input() for a Y/N confirmation prompt, causing the terminal to freeze
with no way to type a response.

Fix: remove the confirmation prompt. The user typing '/tools disable web'
is implicit consent. The change is applied directly with a status message.
2026-03-30 02:53:21 -07:00
Wing Lian efae525dc5 feat(plugins): add inject_message interface for remote message injection (#3778) 2026-03-30 02:48:06 -07:00
Teknium 5148682b43 feat: mount skills directory into all remote backends with live sync (#3890)
Skills with scripts/, templates/, and references/ subdirectories need
those files available inside sandboxed execution environments. Previously
the skills directory was missing entirely from remote backends.

Live sync — files stay current as credentials refresh and skills update:
- Docker/Singularity: bind mounts are inherently live (host changes
  visible immediately)
- Modal: _sync_files() runs before each command with mtime+size caching,
  pushing only changed credential and skill files (~13μs no-op overhead)
- SSH: rsync --safe-links before each command (naturally incremental)
- Daytona: _upload_if_changed() with mtime+size caching before each command
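
The mtime+size caching mentioned for Modal and Daytona could look something like the sketch below. The function name `changed_files` and the cache layout are assumptions for illustration; the real _sync_files() and _upload_if_changed() are not shown in this log.

```python
import os


def changed_files(paths, cache):
    """Return the paths whose (mtime, size) signature differs from the cache.

    `cache` maps path -> (mtime_ns, size) and is updated in place, so a
    second call with unchanged files returns an empty list.
    """
    changed = []
    for path in paths:
        stat = os.stat(path)
        signature = (stat.st_mtime_ns, stat.st_size)
        if cache.get(path) != signature:
            cache[path] = signature
            changed.append(path)
    return changed
```

Only the files returned here would need to be pushed before a command runs, which is what keeps the no-op case cheap.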

Security — symlink filtering:
- Docker/Singularity: sanitized temp copy when symlinks detected
- Modal/Daytona: iter_skills_files() skips symlinks
- SSH: rsync --safe-links skips symlinks pointing outside source tree
- Temp dir cleanup via atexit + reuse across calls

Non-root user support:
- SSH: detects remote home via echo $HOME, syncs to $HOME/.hermes/
- Daytona: detects sandbox home before sync, uploads to $HOME/.hermes/
- Docker/Modal/Singularity: run as root, /root/.hermes/ is correct

Also:
- credential_files.py: fix name/path key fallback in required_credential_files
- Singularity, SSH, Daytona: gained credential file support
- 14 tests covering symlink filtering, name/path fallback, iter_skills_files
2026-03-30 02:45:41 -07:00
Teknium 791f4e94b2 feat(slack): multi-workspace support via OAuth token file (#3903)
Salvaged from PR #2033 by yoannes. Adds multi-workspace Slack support
so a single Hermes instance can serve multiple Slack workspaces after
OAuth installs.

Changes:
- Support comma-separated bot tokens in SLACK_BOT_TOKEN env var
- Load additional OAuth-persisted tokens from HERMES_HOME/slack_tokens.json
- Route all Slack API calls through workspace-aware _get_client(chat_id)
  instead of always using the primary app client
- Track channel → workspace mapping from incoming events
- Per-workspace bot_user_id for correct mention detection
- Workspace-aware file downloads (correct auth token per workspace)

Backward compatible: single-token setups work identically.

Token file format (slack_tokens.json):
  {"T12345": {"token": "xoxb-...", "team_name": "My Workspace"}}
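
As a rough illustration of how the two token sources might be merged, assuming the file format shown above. The function name `load_slack_tokens` is hypothetical, and the real adapter also tracks team IDs and bot user IDs.

```python
import json


def load_slack_tokens(env_value: str, token_file_text: str = "") -> list[str]:
    """Combine comma-separated env tokens with tokens from slack_tokens.json."""
    # Tokens from SLACK_BOT_TOKEN come first, so the first entry stays primary.
    tokens = [t.strip() for t in env_value.split(",") if t.strip()]
    if token_file_text:
        # File format: {"<team_id>": {"token": "...", "team_name": "..."}}
        for entry in json.loads(token_file_text).values():
            token = entry.get("token")
            if token and token not in tokens:
                tokens.append(token)
    return tokens
```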

Fixed from original PR:
- Uses get_hermes_home() instead of hardcoded ~/.hermes/ path

Co-authored-by: yoannes <yoannes@users.noreply.github.com>
2026-03-30 01:51:48 -07:00
Teknium a4b064763d fix(cron): tighten [SILENT] instruction to prevent report-with-silent-prefix (#3901)
The model was interpreting [SILENT] as a metadata prefix and writing
full reports with [SILENT] slapped at the front. The old instruction
said 'optionally followed by a brief internal note' which gave too
much room. New instruction explicitly says: [SILENT] means nothing
else, do NOT combine it with a report.
2026-03-30 00:11:00 -07:00
Teknium 138ea3fbe8 fix(docs): escape angle-bracket URLs in feishu.md breaking MDX build (#3902) 2026-03-30 00:09:30 -07:00
Teknium ee61485cac feat(matrix): support native voice messages via MSC3245 (#3877)
* feat(matrix): support native voice messages

* fix: skip matrix voice tests when matrix-nio not installed

---------

Co-authored-by: Carlos Alberto Pereira Gomes <carlosapgomes@users.noreply.github.com>
2026-03-30 00:02:51 -07:00
Teknium 947faed3bc feat(approvals): make dangerous command approval timeout configurable (#3886)
* feat(approvals): make dangerous command approval timeout configurable

Read `approvals.timeout` from config.yaml (default 60s) instead of
hardcoding 60 seconds in both the fallback CLI prompt and the TUI
prompt_toolkit callback.

Follows the same pattern as `clarify.timeout` which is already
configurable via CLI_CONFIG.

Closes #3765

* fix: add timeout default to approvals section in DEFAULT_CONFIG

---------

Co-authored-by: acsezen <asezen@icloud.com>
2026-03-30 00:02:02 -07:00
kshitij c288bbfb57 fix(cli): prevent status bar wrapping into duplicate rows (#3883)
- measure status bar display width using prompt_toolkit cell widths
- trim rendered status text when fragments would overflow
- add a final single-fragment fallback to prevent wrapping
- update width assertions to validate display cells instead of len()
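
The difference between len() and display cells can be sketched as below. The commit uses prompt_toolkit's cell widths; this example swaps in the standard library's unicodedata as a stand-in, so the widths are an approximation and the helper names are hypothetical.

```python
import unicodedata


def cell_width(text: str) -> int:
    """Approximate terminal cell width: wide/fullwidth chars take two cells."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks occupy no cell of their own
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def trim_to_width(text: str, max_cells: int) -> str:
    """Cut text so that its display width never exceeds max_cells."""
    kept = []
    used = 0
    for ch in text:
        w = cell_width(ch)
        if used + w > max_cells:
            break
        kept.append(ch)
        used += w
    return "".join(kept)
```

A string of two CJK characters has len() 2 but occupies four cells, which is the kind of mismatch that lets a status bar wrap.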
2026-03-29 23:59:07 -07:00
Teknium a347921314 docs: comprehensive OpenClaw migration guide (#3900)
New standalone guide at guides/migrate-from-openclaw.md with:
- Complete config key mapping tables for every category
- Agent behavior mappings (thinkingDefault → reasoning_effort, etc.)
- Session reset policy mapping (session.reset vs resetTriggers)
- TTS dual-source explanation (messages.tts.providers + talk config)
- MCP server field-by-field mapping
- Messaging platform table with exact config paths and env vars
- API key resolution: 3 sources, priority order, supported targets
- SecretRef handling: plain strings, env templates, SecretRef objects
- Post-migration checklist (6 steps)
- Troubleshooting section
- Complete archived items table with recreation guidance

CLI commands reference condensed to summary + link to full guide.
Added to sidebar under Guides & Tutorials.
2026-03-29 23:58:12 -07:00
Teknium 09def65eff fix(migration): expand OpenClaw migration to cover full data footprint (#3869)
Cross-referenced the OpenClaw Zod schema and TypeScript source against
our migration script. Found and fixed:

Expanded data sources:
- Legacy config fallback: clawdbot.json, moldbot.json
- Legacy dir fallback: ~/.clawdbot/, ~/.moldbot/
- API keys from ~/.openclaw/.env and auth-profiles.json
- Personal skills from ~/.agents/skills/
- Project skills from workspace/.agents/skills/
- BOOTSTRAP.md archived (was silently skipped)
- Expanded env key allowlist: DEEPSEEK, GEMINI, ZAI, MINIMAX

Fixed wrong config paths (verified against Zod schema):
- humanDelay.enabled → humanDelay.mode (field doesn't exist as .enabled)
- agents.defaults.exec.timeout → tools.exec.timeoutSec (wrong path + name)
- messages.tts.elevenlabs.voiceId → messages.tts.providers.elevenlabs.voiceId
- session.resetTriggers (string[]) → session.reset (structured object)
- approvals.mode → approvals.exec.mode (no top-level mode)
- browser.inactivityTimeoutMs → doesn't exist; map cdpUrl+headless instead
- tools.webSearch.braveApiKey → tools.web.search.brave.apiKey
- tools.exec.timeout → tools.exec.timeoutSec

Added SecretRef resolution:
- All token/apiKey fields in OpenClaw can be strings, env templates
  (${VAR}), or SecretRef objects ({source:'env',id:'VAR'}). Added
  resolve_secret_input() to handle all three forms.
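
A minimal sketch of the three input forms, assuming only the env source is resolved. The name resolve_secret_input() comes from the commit; its signature and the explicit `env` mapping here are assumptions.

```python
import re

_ENV_TEMPLATE = re.compile(r"^\$\{([A-Za-z_][A-Za-z0-9_]*)\}$")


def resolve_secret_input(value, env):
    """Resolve a plain string, a ${VAR} template, or a SecretRef object."""
    if isinstance(value, dict):
        # SecretRef form: {"source": "env", "id": "VAR"}
        if value.get("source") == "env":
            return env.get(value.get("id", ""))
        return None  # other sources are not handled in this sketch
    if isinstance(value, str):
        match = _ENV_TEMPLATE.match(value.strip())
        if match:
            return env.get(match.group(1))  # env template form
        return value or None  # plain string form
    return None
```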

Fixed auth-profiles.json:
- Canonical field is 'key', not 'apiKey' (though the alias is accepted)
- File wraps entries in a 'profiles' key — now handled

Fixed TTS config:
- Provider settings at messages.tts.providers.{name} (not flat)
- Also checks top-level 'talk' config as fallback source

Docs updated with new sources and key list.
2026-03-29 22:49:34 -07:00
Teknium 649d149438 feat(telegram): add webhook mode as alternative to polling (#3880)
When TELEGRAM_WEBHOOK_URL is set, the adapter starts an HTTP webhook
server (via python-telegram-bot's start_webhook()) instead of long
polling. This enables cloud platforms like Fly.io and Railway to
auto-wake suspended machines on inbound HTTP traffic.

Polling remains the default — no behavior change unless the env var
is set.

Env vars:
  TELEGRAM_WEBHOOK_URL    Public HTTPS URL for Telegram to push to
  TELEGRAM_WEBHOOK_PORT   Local listen port (default 8443)
  TELEGRAM_WEBHOOK_SECRET Secret token for update verification
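
The env-driven switch between polling and webhook mode might be sketched like this. The function name and the returned dict shape are hypothetical; only the three env vars and the 8443 default come from the commit message.

```python
import os


def telegram_transport_config(env=None) -> dict:
    """Pick polling or webhook mode from the environment."""
    env = os.environ if env is None else env
    url = env.get("TELEGRAM_WEBHOOK_URL", "").strip()
    if not url:
        # No URL set: polling stays the default.
        return {"mode": "polling"}
    return {
        "mode": "webhook",
        "webhook_url": url,
        "port": int(env.get("TELEGRAM_WEBHOOK_PORT", "8443")),
        "secret_token": env.get("TELEGRAM_WEBHOOK_SECRET") or None,
    }
```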

Cherry-picked and adapted from PR #2022 by SHL0MS. Preserved all
current main enhancements (network error recovery, polling conflict
detection, DM topics setup).

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
2026-03-29 22:36:07 -07:00
Teknium 5602458794 security: harden dangerous command detection and add file tool path guards (#3872)
Closes gaps that allowed an agent to expose Docker's Remote API to the
internet by writing to /etc/docker/daemon.json.

Terminal tool (approval.py):
- chmod: now catches 666 and symbolic modes (o+w, a+w), not just 777
- cp/mv/install: detected when targeting /etc/
- sed -i/--in-place: detected when targeting /etc/

File tools (file_tools.py):
- write_file and patch now refuse to write to sensitive system paths
  (/etc/, /boot/, /usr/lib/systemd/, docker.sock)
- Directs users to the terminal tool (which has approval prompts) for
  system file modifications
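
A path guard along these lines could be sketched as follows. The helper name `is_sensitive_path` is hypothetical and the check is deliberately simplified: it normalizes ".." segments but does not resolve symlinks or relative paths, which a real guard would likely need to consider.

```python
import posixpath

# Prefixes taken from the commit message; the tuple name is an assumption.
SENSITIVE_PREFIXES = ("/etc/", "/boot/", "/usr/lib/systemd/")


def is_sensitive_path(path: str) -> bool:
    """Return True if a write to `path` should be refused by the file tools."""
    normalized = posixpath.normpath(path)
    if posixpath.basename(normalized) == "docker.sock":
        return True
    # Append "/" so "/etc" itself matches while "/etcetera" does not.
    return any((normalized + "/").startswith(p) for p in SENSITIVE_PREFIXES)
```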
2026-03-29 22:33:47 -07:00
Teknium 1c900c45e3 fix(agent): support full context length resolution for direct Gemini API endpoints (#3876)
* add .aac audio file format support to transcription tool

* fix(agent): support full context length resolution for direct Gemini API endpoints

Add generativelanguage.googleapis.com to _URL_TO_PROVIDER so direct
Gemini API users get correct 1M+ context length instead of the 128K
unknown-proxy fallback.

Co-authored-by: bb873 <bb873@users.noreply.github.com>

---------

Co-authored-by: Adrian Scott <adrian@adrianscott.com>
Co-authored-by: bb873 <bb873@users.noreply.github.com>
2026-03-29 21:56:07 -07:00
Teknium 227601c200 feat(discord): add message processing reactions (salvage #1980) (#3871)
Adds lifecycle hooks to the base platform adapter so Discord (and future
platforms) can react to message processing events:

  👀  when processing starts
    on successful completion (delivery confirmed)
    on failure, error, or cancellation

Implementation:
- base.py: on_processing_start/on_processing_complete hooks with
  _run_processing_hook error isolation wrapper; delivery tracking
  via _record_delivery closure for accurate success detection
- discord.py: _add_reaction/_remove_reaction helpers + hook overrides
- Tests for base hook lifecycle and Discord-specific reactions

Co-authored-by: alanwilhelm <alanwilhelm@users.noreply.github.com>
2026-03-29 21:55:23 -07:00
Teknium fd29933a6d fix: use argparse entrypoint in top-level launcher (#3874)
The ./hermes convenience script still used the legacy Fire-based
cli.main wrapper, which doesn't support subcommands (gateway, cron,
doctor, etc.). The installed 'hermes' command already uses
hermes_cli.main:main (argparse) — this aligns the launcher.

Salvaged from PR #2009 by gito369.
2026-03-29 21:54:36 -07:00
Teknium 839f798b74 feat(telegram): add group mention gating and regex triggers (#3870)
Adds Discord-style mention gating for Telegram groups:
- telegram.require_mention: gate group messages (default: false)
- telegram.mention_patterns: regex wake-word triggers
- telegram.free_response_chats: bypass gating for specific chats

When require_mention is enabled, group messages are accepted only for:
- slash commands
- replies to the bot
- @botusername mentions
- regex wake-word pattern matches

DMs remain unrestricted. @mention text is stripped before passing to
the agent. Invalid regex patterns are ignored with a warning.
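
The gating rules above can be sketched roughly as below. The function names, parameters, and the case-insensitive matching are assumptions for illustration; the real adapter works on Telegram message objects rather than plain strings.

```python
import logging
import re

logger = logging.getLogger(__name__)


def compile_patterns(patterns):
    """Compile wake-word regexes, skipping invalid ones with a warning."""
    compiled = []
    for raw in patterns:
        try:
            compiled.append(re.compile(raw, re.IGNORECASE))
        except re.error:
            logger.warning("Ignoring invalid mention pattern: %r", raw)
    return compiled


def should_accept(text, *, is_group, is_reply_to_bot=False,
                  bot_username="", patterns=(), require_mention=True):
    """Decide whether a message should reach the agent."""
    if not is_group or not require_mention:
        return True  # DMs and ungated groups are unrestricted
    if text.startswith("/") or is_reply_to_bot:
        return True  # slash commands and replies to the bot
    if bot_username and f"@{bot_username}".lower() in text.lower():
        return True  # explicit @mention
    return any(p.search(text) for p in patterns)  # wake-word match
```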

Config bridges follow the existing Discord pattern (yaml → env vars).

Cherry-picked and adapted from PR #1977 by mcleay. Fixed ChatType
comparison to work without python-telegram-bot installed (uses string
matching instead of enum, consistent with other entity_type checks).

Co-authored-by: mcleay <mcleay@users.noreply.github.com>
2026-03-29 21:53:59 -07:00
Teknium 366bfc3c76 fix(setup): auto-install matrix-nio during hermes setup (#3873)
Setup previously only printed a manual install hint for matrix-nio,
causing the gateway to crash with 'matrix-nio not installed' after
configuring Matrix. Now auto-installs matrix-nio (or matrix-nio[e2e]
when E2EE is enabled) using the same uv-first/pip-fallback pattern
as Daytona and Modal backends.

Also adds hermes-agent[matrix] to the [all] extra in pyproject.toml
and a regression test to keep it there.

Co-authored-by: Gutslabs <Gutslabs@users.noreply.github.com>
Co-authored-by: cutepawss <cutepawss@users.noreply.github.com>
2026-03-29 21:53:28 -07:00
Teknium b4ceb541a7 fix(terminal): preserve partial output when command times out (#3868)
When a command timed out, all captured output was discarded — the agent
only saw 'Command timed out after Xs' with zero context. Now returns
the buffered output followed by a timeout marker, matching the existing
interrupt path behavior.
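
One way to keep partial output on timeout, sketched with the standard subprocess module. This is an illustration of the idea only: the function name and the marker text are assumptions, and the actual terminal tool may buffer output differently.

```python
import subprocess


def run_with_timeout(cmd, timeout):
    """Run cmd and return its output, keeping what was captured on timeout."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    try:
        output, _ = proc.communicate(timeout=timeout)
        return output
    except subprocess.TimeoutExpired:
        proc.kill()
        # A second communicate() after kill() returns whatever the process
        # had written before it was stopped.
        output, _ = proc.communicate()
        return f"{output}\n[Command timed out after {timeout}s]"
```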

Salvaged from PR #3286 by @binhnt92.

Co-authored-by: nguyen binh <binhnt92@users.noreply.github.com>
2026-03-29 21:51:44 -07:00
Teknium ccf7bb1102 fix(nous): use curated model list instead of full API dump for Nous Portal (#3867)
All three Nous Portal model selection paths (hermes model, first-time
login, setup wizard) were hitting the live /models endpoint and showing
every model available — potentially hundreds. Now uses the curated
_PROVIDER_MODELS['nous'] list (25 agentic models matching OpenRouter
defaults) with 'Enter custom model name' for anything else.

Fixed in:
- hermes_cli/main.py: _model_flow_nous()
- hermes_cli/auth.py: _login_nous() model selection
- hermes_cli/setup.py: post-login model selection
2026-03-29 21:38:10 -07:00
Teknium ce2841f3c9 feat(gateway): add WeCom (Enterprise WeChat) platform support (#3847)
Adds WeCom as a gateway platform adapter using the AI Bot WebSocket
gateway for real-time bidirectional communication. No public endpoint
or new pip dependencies needed (uses existing aiohttp + httpx).

Features:
- WebSocket persistent connection with auto-reconnect (exponential backoff)
- DM and group messaging with configurable access policies
- Media upload/download with AES decryption for encrypted attachments
- Markdown rendering, quote context preservation
- Proactive + passive reply message modes
- Chunked media upload pipeline (512KB chunks)

Cherry-picked from PR #1898 by EvilRan with:
- Moved to current main (PR was 300 commits behind)
- Skipped base.py regressions (reply_to additions are good but belong
  in a separate PR since they affect all platforms)
- Fixed test assertions to match current base class send() signature
  (reply_to=None kwarg now explicit)
- All 16 integration points added surgically to current main
- No new pip dependencies (aiohttp + httpx already installed)

Fixes #1898

Co-authored-by: EvilRan <EvilRan@users.noreply.github.com>
2026-03-29 21:29:13 -07:00
Teknium e296efbf24 fix: add INFO-level logging for auxiliary provider resolution (#3866)
The auxiliary client's auto-detection chain was a black box — when
compression, summarization, or memory flush failed, the only clue was
a generic 'Request timed out' with no indication of which provider was
tried or why it was skipped.

Now logs at INFO level:
- 'Auxiliary auto-detect: using local/custom (qwen3.5-9b) — skipped:
  openrouter, nous' when auto-detection picks a provider
- 'Auxiliary compression: using auto (qwen3.5-9b) at http://localhost:11434/v1'
  before each auxiliary call
- 'Auxiliary compression: provider custom unavailable, falling back to
  openrouter' on fallback
- Clear warning with actionable guidance when NO provider is available:
  'Set OPENROUTER_API_KEY or configure a local model in config.yaml'
2026-03-29 21:29:00 -07:00
Teknium 2ff2cd3a59 add .aac audio file format support to transcription tool (#3865)
Co-authored-by: Adrian Scott <adrian@adrianscott.com>
2026-03-29 21:27:03 -07:00
Teknium f39ca81bab docs: comprehensive hermes claw migrate reference (#3864)
The existing docs were two lines. The migration script handles 35
categories of data across persona, memory, skills, messaging platforms,
model providers, MCP servers, agent config, and more.

New docs cover:
- All CLI options (--dry-run, --preset, --overwrite, --migrate-secrets,
  --source, --workspace-target, --skill-conflict, --yes)
- 27 directly-imported categories with source → destination mapping
- 7 archived categories with manual recreation guidance
- Security notes on API key allowlisting
- Usage examples for common migration scenarios
2026-03-29 21:25:13 -07:00
Teknium 3fad1e7cc1 fix(cron): resolve human-friendly delivery labels via channel directory (#3860)
Cron jobs configured with deliver labels from send_message(action='list')
like 'whatsapp:Alice (dm)' passed the label as a literal chat_id.
WhatsApp bridge failed with jidDecode error since 'Alice (dm)' isn't
a valid JID.

Now _resolve_delivery_target() strips display suffixes like ' (dm)' and
resolves human-friendly names via the channel directory before using
them. Raw IDs pass through unchanged when the directory has no match.
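
A simplified sketch of the label resolution. The directory shape (platform -> name -> chat ID) and the suffix list are assumptions; only the ' (dm)' suffix and the pass-through behavior come from the commit message.

```python
import re

# Display suffixes appended to labels; only "(dm)" is named in the commit,
# the other two are assumed for illustration.
_DISPLAY_SUFFIX = re.compile(r"\s+\((?:dm|group|channel)\)$")


def resolve_delivery_target(target: str, directory: dict) -> str:
    """Map 'platform:Label (dm)' to 'platform:<chat_id>' when a match exists."""
    platform, sep, label = target.partition(":")
    if not sep:
        return target
    name = _DISPLAY_SUFFIX.sub("", label)
    chat_id = directory.get(platform, {}).get(name)
    # Raw IDs (no directory match) pass through unchanged.
    return f"{platform}:{chat_id}" if chat_id else target
```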

Fixes #1945.
2026-03-29 21:24:17 -07:00
Teknium 86ac23c8da fix(auth): stop silently falling back to OpenRouter when no provider is configured (#3862)
Previously, when no API keys or provider credentials were found, Hermes
silently defaulted to OpenRouter + Claude Opus. This caused confusion
when users configured local servers (LM Studio, Ollama, etc.) with a
typo or unrecognized provider name — the system would silently route to
OpenRouter instead of telling them something was wrong.

Changes:
- resolve_provider() now raises AuthError when no credentials are found
  instead of returning 'openrouter' as a silent fallback
- Added local server aliases: lmstudio, ollama, vllm, llamacpp → custom
- Removed hardcoded 'anthropic/claude-opus-4.6' fallback from gateway
  and cron scheduler (they read from config.yaml instead)
- Updated cli-config.yaml.example with complete provider documentation
  including all supported providers, aliases, and local server setup
2026-03-29 21:06:35 -07:00
Teknium 3cc50532d1 fix: auxiliary client uses placeholder key for local servers without auth (#3842)
Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR #2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
2026-03-29 21:05:36 -07:00
Teknium 2d607d36f6 fix(security): catch sensitive path writes in approval checks (#3859)
Co-authored-by: Gutslabs <gutslabsxyz@gmail.com>
2026-03-29 20:57:57 -07:00
Teknium aa389924ad fix: prefer curated model list when live probe returns fewer models (#3856)
The model picker for API-key providers (MiniMax, z.ai, etc.) probes
the live /models endpoint when the curated list has fewer than 8
models. When the live endpoint returns fewer models than the curated
list (e.g. MiniMax's Anthropic-compatible endpoint doesn't list M2.7),
the incomplete live list was used instead.

Now falls back to the curated list when live returns fewer models,
ensuring new models like MiniMax-M2.7 always appear in the picker.
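
The selection rule reduces to a small comparison, sketched here with a hypothetical helper name and placeholder model names:

```python
def choose_model_list(curated: list, live: list) -> list:
    """Prefer the curated list whenever the live probe returned fewer models."""
    if not live or len(live) < len(curated):
        return curated
    return live
```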
2026-03-29 20:55:15 -07:00
Teknium 5e67fc8c40 fix(vision): reject non-image files and enforce website policy (salvage #1940) (#3845)
Three safety gaps in vision_analyze_tool:

1. Local files accepted without checking if they're actually images —
   a renamed text file would get base64-encoded and sent to the model.
   Now validates magic bytes (PNG, JPEG, GIF, BMP, WebP, SVG).

2. No website policy enforcement on image URLs — blocked domains could
   be fetched via the vision tool. Now checks before download.

3. No redirect check — if an allowed URL redirected to a blocked domain,
   the download would proceed. Now re-checks the final URL.
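
The magic-byte check from the first point might look roughly like this. The function name is hypothetical, and SVG is left out of the sketch because it is a text format with no fixed binary signature.

```python
def looks_like_image(header: bytes) -> bool:
    """Check the leading bytes of a file against common image signatures."""
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return True  # PNG
    if header.startswith(b"\xff\xd8\xff"):
        return True  # JPEG
    if header.startswith((b"GIF87a", b"GIF89a")):
        return True  # GIF
    if header.startswith(b"BM"):
        return True  # BMP
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return True  # WebP (RIFF container with a WEBP tag)
    return False
```

A renamed text file fails every branch, so it would be rejected before any base64 encoding happens.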

Fixed one test that needed _validate_image_url mocked to bypass DNS
resolution on the fake blocked.test domain (is_safe_url does DNS
checks that were added after the original PR).

Co-authored-by: GutSlabs <GutSlabs@users.noreply.github.com>
2026-03-29 20:55:04 -07:00
Teknium b60cfd6ce6 fix(telegram): gracefully handle deleted reply targets (#3858)
* fix: add gpt-5.4-mini to Codex fallback catalog

* fix(telegram): gracefully handle deleted reply targets

When a user deletes their message while Hermes is processing, Telegram
returns BadRequest 'Message to be replied not found'. Previously this
was an unhandled permanent error causing silent delivery failure.

Now clears reply_to_id and retries so the response is still delivered,
matching the existing 'thread not found' recovery pattern.

Inspired by PR #3231 by @heathley. Fixes #3229.

---------

Co-authored-by: Clippy <clippy@grads.flow>
Co-authored-by: Nigel Gibbs <heathley@users.noreply.github.com>
2026-03-29 20:47:07 -07:00
Teknium 981e14001c fix: clear api_mode on provider switch instead of hardcoding chat_completions (#3857)
PR #3726 fixed stale codex_responses persisting when switching providers
by hardcoding api_mode=chat_completions in 5 model flows. This broke
MiniMax, MiniMax-CN, and Alibaba which use /anthropic endpoints that
need anthropic_messages — the hardcoded value overrides the URL-based
auto-detection in runtime_provider.py.

Fix: pop api_mode from config in the 3 URL-dependent flows (custom
endpoint, Kimi, api_key_provider) instead of hardcoding. The runtime
resolver already correctly auto-detects api_mode from the base_url
suffix (/anthropic -> anthropic_messages, else chat_completions).
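
The suffix rule described above, as a minimal sketch. The function name is hypothetical, the URL in the usage note is a placeholder, and the real resolver in runtime_provider.py may consider more than the URL.

```python
def detect_api_mode(base_url: str) -> str:
    """Infer api_mode from the base_url suffix."""
    if base_url.rstrip("/").endswith("/anthropic"):
        return "anthropic_messages"
    return "chat_completions"
```

With api_mode popped from the config, a call such as `detect_api_mode("https://example.invalid/anthropic")` yields anthropic_messages instead of a stale or hardcoded value.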

OpenRouter and Copilot ACP flows keep the explicit value since their
api_mode is always known.

Reported by stefan171.
2026-03-29 20:44:39 -07:00
Teknium 9d28f4aba3 fix: add gpt-5.4-mini to Codex fallback catalog (#3855)
Co-authored-by: Clippy <clippy@grads.flow>
2026-03-29 20:10:00 -07:00
Teknium 3e203de125 fix(skills): block category path traversal in skill manager (#3844)
Validate category names in _create_skill() before using them as
filesystem path segments. Previously, categories like '../escape' or
'/tmp/pwned' could write skill files outside ~/.hermes/skills/.

Adds _validate_category() that rejects slashes, backslashes, absolute
paths, and non-alphanumeric characters (reuses existing VALID_NAME_RE).
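
A sketch of what such a validator could look like. The pattern assigned to VALID_NAME_RE here is an assumption (the real one is not shown in this log), and returning an error string is an illustrative choice.

```python
import re

# Assumed pattern: lowercase alphanumerics plus "-" and "_", no separators.
VALID_NAME_RE = re.compile(r"[a-z0-9][a-z0-9_-]*")


def validate_category(category: str):
    """Return an error message for an unsafe category, or None if it is valid."""
    if "/" in category or "\\" in category:
        return "category must not contain path separators"
    if not VALID_NAME_RE.fullmatch(category):
        return "category contains invalid characters"
    return None
```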

Tests: 5 new tests for traversal, absolute paths, and valid categories.

Salvaged from PR #1939 by Gutslabs.
2026-03-29 20:08:22 -07:00
Teknium 2d264a4562 fix(tests): resolve 10 CI failures across hooks, tiktoken, plugins (#3848)
test_hooks.py (7 failures): Built-in boot-md hook was always loaded
by _register_builtin_hooks(), adding +1 to every expected hook count.
Mock out built-in registration in TestDiscoverAndLoad so tests isolate
user-hook discovery logic.

test_tool_token_estimation.py (2 failures): tiktoken is not in
core/[all] dependencies. The estimation function gracefully returns {}
when tiktoken is missing, but tests expected non-empty results. Added
skipif markers for tests that need tiktoken.

test_plugins_cmd.py (1 failure): bare 'hermes plugins' now dispatches
to cmd_toggle() (interactive curses UI) instead of cmd_list(). Updated
test to match the new behavior.
2026-03-29 20:05:59 -07:00
Teknium 3e2c8c529b fix(whatsapp): resolve LID↔phone aliases in allowlist matching (#3830)
WhatsApp DMs can arrive with LID sender IDs even when
WHATSAPP_ALLOWED_USERS is configured with phone numbers. The allowlist
check now reads bridge session mapping files (lid-mapping-*.json) to
resolve phone↔LID aliases, matching users regardless of which
identifier format the message uses.

Both the Python gateway (_is_user_authorized) and the Node bridge
(allowlist.js) now share the same mapping-file-based resolution logic.

Co-authored-by: Frederico Ribeiro <fr@tecompanytea.com>
2026-03-29 18:21:50 -07:00
Teknium e4d575e563 fix: report subagent status as completed when summary exists (#3829)
When a subagent hit max_iterations, status was always 'failed' even
if it produced a usable summary via _handle_max_iterations(). This
happened because the status check required both completed=True AND
a summary, but completed is False whenever max_iterations is reached
(run_agent.py line 7969).

Now gates status on whether a summary was produced — if the subagent
returned a final_response, the parent has usable output regardless of
iteration budget. The exit_reason field already distinguishes
'completed' vs 'max_iterations' for anything that needs to know how
the task ended.

Closes #1899.
2026-03-29 18:21:36 -07:00
Teknium 2a0e8b001f fix(cli): handle closed stdout ValueError in safe print paths (#3843)
When stdout is closed (piped to a dead process, broken terminal),
Python raises ValueError('I/O operation on closed file'), not OSError.
_safe_print and the API error printer only caught OSError, letting the
ValueError propagate and crash the agent.
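
A minimal sketch of the widened exception handling. The name `safe_print` mirrors _safe_print from the commit, but the body is an assumption.

```python
def safe_print(*args, **kwargs) -> None:
    """print() that never crashes the caller when the stream is unusable."""
    try:
        print(*args, **kwargs)
    except (OSError, ValueError):
        # OSError covers broken pipes; ValueError covers
        # "I/O operation on closed file".
        pass
```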

Salvaged from PR #3760 by @apexscaleai. Fixes #3534.

Co-authored-by: apexscaleai <apexscaleai@users.noreply.github.com>
2026-03-29 18:21:27 -07:00
Teknium ca4907dfbc feat(gateway): add Feishu/Lark platform support (#3817)
Adds Feishu (ByteDance's enterprise messaging platform) as a gateway
platform adapter with full feature parity: WebSocket + webhook transports,
message batching, dedup, rate limiting, rich post/card content parsing,
media handling (images/audio/files/video), group @mention gating,
reaction routing, and interactive card button support.

Cherry-picked from PR #1793 by penwyp with:
- Moved to current main (PR was 458 commits behind)
- Fixed _send_with_retry shadowing BasePlatformAdapter method (renamed to
  _feishu_send_with_retry to avoid signature mismatch crash)
- Fixed import structure: aiohttp/websockets imported independently of
  lark_oapi so they remain available when SDK is missing
- Fixed get_hermes_home import (hermes_constants, not hermes_cli.config)
- Added skip decorators for tests requiring lark_oapi SDK
- All 16 integration points added surgically to current main

New dependency: lark-oapi>=1.5.3,<2 (optional, pip install hermes-agent[feishu])

Fixes #1788

Co-authored-by: penwyp <penwyp@users.noreply.github.com>
2026-03-29 18:17:42 -07:00
Teknium e314833c9d feat(display): configurable tool preview length -- show full paths by default (#3841)
Tool call previews (paths, commands, queries) were hardcoded to truncate
at 35-40 chars across CLI spinners, completion lines, and gateway progress
messages. Users could not see full file paths in tool output.

New config option: display.tool_preview_length (default 0 = no limit).
Set a positive number to truncate at that length.
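
The "0 = no limit" convention can be sketched as below. The helper name and the "..." marker are assumptions; the real display code may mark truncation differently.

```python
def truncate_preview(text: str, max_len: int = 0) -> str:
    """Truncate a tool preview; max_len of 0 (or less) means no limit."""
    if max_len <= 0 or len(text) <= max_len:
        return text
    if max_len <= 3:
        return text[:max_len]  # too short to fit an ellipsis
    return text[: max_len - 3] + "..."
```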

Changes:
- display.py: module-level _tool_preview_max_len with getter/setter;
  build_tool_preview() and get_cute_tool_message() _trunc/_path respect it
- cli.py: reads config at startup, spinner widget respects config
- gateway/run.py: reads config per-message, progress callback respects config
- run_agent.py: removed redundant 30-char quiet-mode spinner truncation
- config.py: added display.tool_preview_length to DEFAULT_CONFIG

Reported by kriskaminski
2026-03-29 18:02:42 -07:00
Teknium 59f2b228f7 fix(paths): respect HERMES_HOME for protected .env write-deny path (#3840)
The write-deny list in file_operations.py hardcoded ~/.hermes/.env,
which misses the actual .env in custom HERMES_HOME or profile setups.
Use get_hermes_home() for profile-safe path resolution.
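
A sketch of the profile-safe resolution. The get_hermes_home() below is a simplified stand-in that reads only the HERMES_HOME env var; the real helper may do more, and `protected_env_path` is a hypothetical name.

```python
import os
from pathlib import Path


def get_hermes_home() -> Path:
    """Stand-in: HERMES_HOME if set, otherwise ~/.hermes."""
    return Path(os.environ.get("HERMES_HOME") or Path.home() / ".hermes")


def protected_env_path() -> Path:
    """Path of the .env file that the write-deny list should protect."""
    return get_hermes_home() / ".env"
```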

Salvaged from PR #3232 by @erhnysr.

Co-authored-by: Erhnysr <erhnysr@users.noreply.github.com>
2026-03-29 18:02:11 -07:00
Teknium d6b7836210 fix: update session_log_file during context compression (#3835)
When compression creates a child session with a new session_id,
session_log_file was still pointing to the old session's JSON file.
This caused _save_session_log() to write new data to the wrong file.

Closes #3731.

Co-authored-by: kelsia14 <kelsia14@users.noreply.github.com>
2026-03-29 17:49:58 -07:00
Teknium 17b6000e90 feat(skills): add songwriting-and-ai-music creative skill (salvage #1901) (#3834)
Adds a songwriting craft and AI music prompt engineering skill covering
song structure, rhyme/meter, emotional arcs, Suno metatag reference,
phonetic tricks for AI singers, parody adaptation, and production workflow.

Complements existing music skills (heartmula, audiocraft, songsee) which
cover model setup/usage — this one covers the creative process itself.

Also removes the empty skills/music-creation/ category (only had a
DESCRIPTION.md, no actual skills).

Co-authored-by: 123mikeyd <123mikeyd@users.noreply.github.com>
2026-03-29 17:49:19 -07:00
Teknium 45c8d3da96 fix(banner): show lazy-initialized tools in yellow instead of red (salvage #1854) (#3822)
Tools from check_fn-gated toolsets (honcho, homeassistant) showed as
red (disabled) in the startup banner even when properly configured.
This happened because check_fn runs lazily after session context is
set, but the banner renders before agent init.

Now distinguishes three states:
  - red:    truly unavailable (missing env var, no API key)
  - yellow: lazy-initialized (check_fn pending, will activate on use)
  - normal: available and ready

Only the banner fix was salvaged from the original PR; unrelated
bundled changes (context_compressor, STT config, auth default_model,
SessionResetPolicy) were discarded.

Co-authored-by: Jah-yee <Jah-yee@users.noreply.github.com>
2026-03-29 16:53:29 -07:00
Teknium 5ca6d681f0 feat(skills): add memento-flashcards optional skill (#3827)
* feat(skills): add memento-flashcards skill

* docs(skills): clarify memento-flashcards interaction model

* fix: use HERMES_HOME env var for profile-safe data path

---------

Co-authored-by: Magnus Ahmad <magnus.ahmad@gmail.com>
2026-03-29 16:52:52 -07:00
Teknium df806bdbaf feat(cron): add cron.wrap_response config to disable delivery wrapping (#3807)
Adds a config option to suppress the header/footer text that wraps
cron job responses when delivered to messaging platforms.

Set cron.wrap_response: false in config.yaml for clean output without
the 'Cronjob Response: <name>' header and 'The agent cannot see this
message' footer.  Default is true (preserves current behavior).
2026-03-29 16:31:01 -07:00
Teknium 0ef80c5f32 fix(whatsapp): reuse persistent aiohttp session across requests (#3818)
Replace per-request aiohttp.ClientSession() in every WhatsApp adapter
method with a single persistent self._http_session, matching the pattern
used by Mattermost, HomeAssistant, and SMS adapters.

Changes:
- Create self._http_session in connect(), close in disconnect()
- All bridge HTTP calls (send, edit, send-media, typing, get_chat_info,
  poll_messages) now use the shared session
- Explicitly cancel _poll_task on disconnect() instead of relying
  solely on self._running = False
- Health-check sessions in connect() remain ephemeral (persistent
  session not yet created at that point)
- Remove per-method ImportError guards for aiohttp (always available
  when gateway runs via [messaging] extras)

Salvaged from PR #1851 by Himess. The _poll_task storage was already
on main from PR #3267; this adds the disconnect cancellation and the
persistent session.

Tests: 4 new tests for session close, already-closed skip, poll task
cancellation, and done-task skip.
2026-03-29 16:25:20 -07:00
Teknium c4cf20f564 fix: clear __pycache__ during update to prevent stale bytecode ImportError (#3819)
Third report of gateway crashing with:
  ImportError: cannot import name 'get_hermes_home' from 'hermes_constants'

Root cause: stale .pyc bytecode files survive code updates. When Python
loads a cached .pyc that references names from the old source, the import
fails and the gateway won't start.

Two bugs fixed:
1. Git update path: no cache clearing at all after git pull
2. ZIP update path: __pycache__ was explicitly in the preserve set

Added _clear_bytecode_cache() helper that removes all __pycache__ dirs
under PROJECT_ROOT (skipping venv/node_modules/.git/.worktrees). Called
in both git and ZIP update paths, before pip install.
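
A minimal sketch of what such a cache-clearing helper could look like. The function name, the return value, and the exact skip logic are assumptions for illustration; only the skipped directory names come from the text above.

```python
import shutil
from pathlib import Path

# Directory names to leave untouched (taken from the commit text).
SKIP_DIRS = {"venv", "node_modules", ".git", ".worktrees"}


def clear_bytecode_cache(root: Path) -> int:
    """Remove every __pycache__ directory under root; return how many were removed."""
    removed = 0
    # Materialize the list first so deleting directories cannot disturb the walk.
    for cache_dir in list(root.rglob("__pycache__")):
        rel_parts = cache_dir.relative_to(root).parts
        if any(part in SKIP_DIRS for part in rel_parts):
            continue  # inside a skipped tree
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir, ignore_errors=True)
            removed += 1
    return removed
```

Running this before pip install means the next interpreter start recompiles every module from the freshly updated source.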
2026-03-29 16:23:36 -07:00
Teknium 68d5472810 fix: omit tools param entirely when empty instead of sending None (#3820)
Some providers (Fireworks AI) reject tools=null, and others (Anthropic)
reject tools=[]. The safest approach is to not include the key at all
when there are no tools — the OpenAI SDK treats a missing parameter as
NOT_GIVEN and omits it from the request entirely.
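
The idea can be sketched as follows. build_chat_kwargs is a hypothetical helper name, not the function used in the codebase; the point is only that the key is absent rather than None or an empty list.

```python
from typing import Any, Optional


def build_chat_kwargs(model: str, messages: list, tools: Optional[list] = None) -> dict[str, Any]:
    """Build request kwargs, leaving out 'tools' when there is nothing to send."""
    kwargs: dict[str, Any] = {"model": model, "messages": messages}
    if tools:  # skips both None and []
        kwargs["tools"] = tools
    return kwargs
```

The resulting dict can be splatted into the client call, so providers never see tools=null or tools=[].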

Inspired by PR #3736 (@kelsia14).
2026-03-29 16:12:47 -07:00
Teknium 252fbea005 feat(providers): add ordered fallback provider chain (salvage #1761) (#3813)
Extends the single fallback_model mechanism into an ordered chain.
When the primary model fails, Hermes tries each fallback provider in
sequence until one succeeds or the chain is exhausted.

Config format (new):
  fallback_providers:
    - provider: openrouter
      model: anthropic/claude-sonnet-4
    - provider: openai
      model: gpt-4o

Legacy single-dict fallback_model format still works unchanged.

Key fix vs original PR: the call sites in the retry loop now use
_fallback_index < len(_fallback_chain) instead of the old one-shot
_fallback_activated guard, so the chain actually advances through
all configured providers.

Changes:
- run_agent.py: _fallback_chain list + _fallback_index replaces
  one-shot _fallback_model; _try_activate_fallback() advances
  through chain; failed provider resolution skips to next entry;
  call sites updated to allow chain advancement
- cli.py: reads fallback_providers with legacy fallback_model compat
- gateway/run.py: same
- hermes_cli/config.py: fallback_providers: [] in DEFAULT_CONFIG
- tests: 12 new chain tests + 6 existing test fixtures updated
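
A rough sketch of the chain-advancement logic described above, under stated assumptions: normalize_fallbacks and call_with_fallbacks are hypothetical names, the precedence of fallback_providers over the legacy key is a guess, and the real retry loop in run_agent.py is certainly more involved.

```python
from typing import Callable


def normalize_fallbacks(config: dict) -> list[dict]:
    """Return an ordered fallback list from either the new or the legacy format."""
    chain = config.get("fallback_providers") or []
    if chain:
        return list(chain)
    legacy = config.get("fallback_model")  # legacy single-dict format
    return [legacy] if isinstance(legacy, dict) and legacy else []


def call_with_fallbacks(primary: dict, chain: list[dict], call: Callable[[dict], str]) -> str:
    """Try the primary, then each fallback in order; re-raise once the chain is exhausted."""
    fallback_index = 0
    current = primary
    while True:
        try:
            return call(current)
        except Exception:
            # An index check (not a one-shot flag) lets the chain keep advancing.
            if fallback_index >= len(chain):
                raise
            current = chain[fallback_index]
            fallback_index += 1
```

With a one-shot boolean guard instead of the index comparison, the loop would stop after the first fallback, which is the behavior the key fix above replaces.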

Co-authored-by: uzaylisak <uzaylisak@users.noreply.github.com>
2026-03-29 16:04:53 -07:00
Teknium c774833667 fix(banner): show honcho tools as available when configured (#3810)
The honcho check_fn only checked runtime session state, which isn't
set until the agent initializes. At banner time, honcho tools showed
as red/disabled even when properly configured.

Now checks configuration (enabled + api_key/base_url) as a fallback
when the session context isn't active yet. Fast path (session active)
unchanged; slow path (config check) only runs at banner time.

Adds 4 tests covering: session active, configured but no session,
not configured, and import failure graceful fallback.

Closes #1843.
2026-03-29 15:55:05 -07:00
Teknium d5d22fe7ba feat(mcp): dynamic tool discovery via notifications/tools/list_changed (#3812)
When a connected MCP server sends a ToolListChangedNotification (per the
MCP spec), Hermes now automatically re-fetches the tool list, deregisters
removed tools, and registers new ones — without requiring a restart.

This enables MCP servers with dynamic toolsets (e.g. GitHub MCP with
GITHUB_DYNAMIC_TOOLSETS=1) to add/remove tools at runtime.

Changes:
- registry.py: add ToolRegistry.deregister() for nuke-and-repave refresh
- mcp_tool.py: extract _register_server_tools() from
  _discover_and_register_server() as a shared helper for both initial
  discovery and dynamic refresh
- mcp_tool.py: add _make_message_handler() and _refresh_tools() on
  MCPServerTask, wired into all 3 ClientSession sites (stdio, new HTTP,
  deprecated HTTP)
- Graceful degradation: silently falls back to static discovery when the
  MCP SDK lacks notification types or message_handler support
- 8 new tests covering registration, refresh, handler dispatch, and
  deregister
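
The nuke-and-repave refresh can be illustrated with a toy registry. ToyRegistry and refresh_server_tools are stand-ins invented for this sketch; the real ToolRegistry and MCPServerTask have richer interfaces.

```python
class ToyRegistry:
    """Stand-in for a tool registry with register/deregister."""

    def __init__(self):
        self._tools = {}

    def register(self, name, handler):
        self._tools[name] = handler

    def deregister(self, name):
        self._tools.pop(name, None)  # unknown names are ignored

    def names(self):
        return set(self._tools)


def refresh_server_tools(registry, previous: set, fresh: dict) -> set:
    """Drop everything this server registered before, then register the fresh list."""
    for name in previous:
        registry.deregister(name)
    for name, handler in fresh.items():
        registry.register(name, handler)
    return set(fresh)  # remembered for the next refresh
```

Removing all of a server's old tools before re-registering avoids diffing the two lists and guarantees that tools the server dropped do not linger.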

Salvaged from PR #1794 by shivvor2.
2026-03-29 15:52:54 -07:00
Teknium bf84cdfa5e fix: ensure tool schema always includes name field in get_definitions (#3811)
When a tool plugin registers a schema without an explicit 'name' key,
get_definitions() crashes with KeyError:

    available_tool_names = {t["function"]["name"] for t in filtered_tools}

Fix: always merge entry.name into schema so 'name' is never missing.
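
As an illustration of the merge (to_definition is a hypothetical helper; the real get_definitions() does more than this):

```python
def to_definition(entry_name: str, schema: dict) -> dict:
    """Wrap a tool schema in function format, guaranteeing a 'name' key."""
    # The registry entry's name is merged last, so it is always present.
    return {"type": "function", "function": {**schema, "name": entry_name}}
```

A schema registered without 'name' then survives the set comprehension shown above instead of raising KeyError.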

Refs: #3729

Co-authored-by: ekkoitac <ekko.itac@gmail.com>
2026-03-29 15:49:21 -07:00
Teknium 38d694f559 fix(gateway): apply home channel env overrides consistently (#3808)
Home channel env vars (SLACK_HOME_CHANNEL, SIGNAL_HOME_CHANNEL, etc.)
for Slack, Signal, Mattermost, Matrix, Email, and SMS were nested
inside the credential-env blocks, so they were ignored when the
platform was already configured via config.yaml.

Moved the home channel handling outside the credential blocks with a
Platform.X in config.platforms guard, matching the existing pattern
for Telegram and Discord.

Co-authored-by: cutepawss <cutepawss@users.noreply.github.com>
2026-03-29 15:48:51 -07:00
Teknium ed6427e0a7 fix(agent): user-friendly 429 rate limit messages with Retry-After support (#3809)
When hitting rate limits (429), the agent now:
- Extracts the Retry-After header from the provider response and uses it
  as the wait time instead of blind exponential backoff (capped at 120s)
- Shows rate-limit-specific messaging: 'Rate limit reached. Waiting Xs
  before retry (attempt N/M)...'
- Shows a distinct exhaustion message: 'Rate limit persisted after N
  retries. Please try again later.'

Non-429 errors keep the existing exponential backoff and generic messaging.
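
A sketch of the wait-time selection, assuming the 120s cap applies to the header value as well as to the backoff, and that only the delta-seconds form of Retry-After is parsed. retry_wait and its parameters are hypothetical names.

```python
from typing import Mapping

MAX_WAIT_SECONDS = 120.0  # cap mentioned in the commit text


def retry_wait(headers: Mapping[str, str], attempt: int, base: float = 2.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based)."""
    raw = headers.get("Retry-After") or headers.get("retry-after")
    if raw is not None:
        try:
            return min(max(float(raw), 0.0), MAX_WAIT_SECONDS)
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch
    # No usable header: fall back to exponential backoff.
    return min(base ** attempt, MAX_WAIT_SECONDS)
```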

Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
2026-03-29 15:48:06 -07:00
Teknium 0fd3b59ba1 feat(cli): add Ctrl+Z process suspend support (#3802)
Adds a Ctrl+Z key binding to suspend the hermes CLI to the background
using standard Unix job control. Uses prompt_toolkit's run_in_terminal()
to properly save/restore terminal state, then sends SIGTSTP to the
process group. Prints a branded message with resume instructions.
Shows a not-supported notice on Windows.

Co-authored-by: CharlieKerfoot <CharlieKerfoot@users.noreply.github.com>
2026-03-29 15:47:55 -07:00
Teknium 6716e66e89 feat: add MCP server mode — hermes mcp serve (#3795)
hermes mcp serve starts a stdio MCP server that lets any MCP client
(Claude Code, Cursor, Codex, etc.) interact with Hermes conversations.

Matches OpenClaw's 9-tool channel bridge surface:

Tools exposed:
- conversations_list: list active sessions across all platforms
- conversation_get: details on one conversation
- messages_read: read message history
- attachments_fetch: extract non-text content from messages
- events_poll: poll for new events since a cursor
- events_wait: long-poll / block until next event (near-real-time)
- messages_send: send to any platform via send_message_tool
- channels_list: browse available messaging targets
- permissions_list_open: list pending approval requests
- permissions_respond: allow/deny approvals

Architecture:
- EventBridge: background thread polls SessionDB for new messages,
  maintains in-memory event queue with waiter support
- Reads sessions.json + SessionDB directly (no gateway dep for reads)
- Reuses send_message_tool for sending (same platform adapters)
- FastMCP server with stdio transport
- Zero new dependencies (uses existing mcp>=1.2.0 optional dep)
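
The in-memory event queue with waiter support might look roughly like this. EventQueue and its method names are invented for the sketch, and the SessionDB polling thread is left out.

```python
import threading


class EventQueue:
    """Append-only event list with cursor-based polling and blocking waits."""

    def __init__(self):
        self._events = []
        self._cond = threading.Condition()

    def publish(self, payload):
        with self._cond:
            self._events.append(payload)
            self._cond.notify_all()  # wake any long-pollers
            return len(self._events)  # cursor just past this event

    def poll(self, cursor=0):
        """Return (events after cursor, new cursor) without blocking."""
        with self._cond:
            return self._events[cursor:], len(self._events)

    def wait(self, cursor=0, timeout=1.0):
        """Block until an event past cursor exists or the timeout expires."""
        with self._cond:
            self._cond.wait_for(lambda: len(self._events) > cursor, timeout=timeout)
            return self._events[cursor:], len(self._events)
```

poll corresponds to the events_poll style of access and wait to events_wait: a client passes back the cursor it received last time and gets only newer events.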

Files:
- mcp_serve.py: MCP server + EventBridge (~600 lines)
- hermes_cli/main.py: added serve sub-parser to hermes mcp
- hermes_cli/mcp_config.py: route serve action to run_mcp_server
- tests/test_mcp_serve.py: 53 tests
- docs: updated MCP page + CLI commands reference
2026-03-29 15:47:19 -07:00
Teknium d02561af85 feat: add Gemini 3.1 preview models to OpenRouter and Nous catalogs (#3803)
* Add new Gemini 3.1 model entries to models.py

* fix: also add Gemini 3.1 models to nous provider list

---------

Co-authored-by: Andrei Ignat <andrei@ignat.se>
2026-03-29 15:44:07 -07:00
Teknium 8eb70a6885 fix(email): close SMTP and IMAP connections on failure (#3804)
SMTP connections in _send_email() and _send_email_with_attachment() leak
when login() or send_message() raises before quit() is reached. Both now
wrapped in try/finally with a close() fallback if quit() also fails.

IMAP connection in _fetch_new_messages() leaks when UID processing raises,
since logout() sits after the loop. Restructured with try/finally so
logout() runs unconditionally.
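
The SMTP half of the pattern, sketched with a hypothetical helper name (send_and_close) rather than the adapter's actual methods:

```python
import smtplib
from email.message import EmailMessage


def send_and_close(smtp: smtplib.SMTP, user: str, password: str, msg: EmailMessage) -> None:
    """Log in and send, always releasing the connection afterwards."""
    try:
        smtp.login(user, password)
        smtp.send_message(msg)
    finally:
        try:
            smtp.quit()  # polite shutdown
        except Exception:
            smtp.close()  # fallback: just drop the socket
```

Because quit() and close() run in the finally block, an exception from login() or send_message() still propagates to the caller, but the socket is no longer leaked.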

Co-authored-by: Himess <Himess@users.noreply.github.com>
2026-03-29 15:38:32 -07:00
Teknium ee3d2941cc feat: show estimated tool token context in hermes tools checklist (#3805)
* feat: show estimated tool token context in hermes tools checklist

Adds a live token estimate indicator to the bottom of the interactive
tool configuration checklist (hermes tools / hermes setup). As users
toggle toolsets on/off, the total estimated context cost updates in
real time.

Implementation:
- tools/registry.py: Add get_schema() for check_fn-free schema access
- hermes_cli/curses_ui.py: Add optional status_fn callback to
  curses_checklist — renders at bottom-right of terminal, stays fixed
  while items scroll
- hermes_cli/tools_config.py: Add _estimate_tool_tokens() using
  tiktoken (cl100k_base, already installed) to count tokens in the
  JSON-serialised OpenAI-format tool schemas. Results are cached
  per-process. The status function deduplicates overlapping tools
  (e.g. browser includes web_search) for accurate totals.
- 12 new tests covering estimation, caching, graceful degradation
  when tiktoken is unavailable, status_fn wiring, deduplication,
  and the numbered fallback display

* fix: use effective toolsets (includes plugins) for token estimation index mapping

The status_fn closure built ts_keys from CONFIGURABLE_TOOLSETS but the
checklist uses _get_effective_configurable_toolsets() which appends plugin
toolsets. With plugins present, the indices would mismatch, causing
IndexError when selecting a plugin toolset.
2026-03-29 15:36:56 -07:00
Teknium 475205e30b fix: restore terminalbench2_env.py from patch-tool redaction corruption (#3801)
Commit ed27b826 introduced patch-tool redaction corruption that:
- Replaced max_token_length=16000 with max_token_length=***
- Truncated api_key=os.getenv(...) to api_key=os.get...EY
- Truncated tokenizer_name to NousRe...1-8B
- Deleted 409 lines including _run_tests(), _eval_with_timeout(),
  evaluate(), wandb_log(), and the __main__ entry point

Restores the file from pre-corruption state (ed27b826^) and re-applies
the three legitimate changes from ed27b826 and a later commit:
- eval_concurrency config field (from ed27b826)
- docker_image registration in register_task_env_overrides (from ed27b826)
- ManagedServer branching for vLLM/SGLang backends (from 13f54596)

Closes #1737, #1740.
2026-03-29 15:33:52 -07:00
Teknium 612321631f fix(gateway): use atomic writes for config.yaml to prevent data loss (#3800)
Replace all 5 plain open(config_path, 'w') calls in gateway command
handlers with atomic_yaml_write() from utils.py. This uses the
established tempfile + fsync + os.replace pattern to ensure config.yaml
is never left half-written if the process is killed mid-write.
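
For reference, the tempfile + fsync + os.replace pattern can be sketched as below. atomic_write_text is a hypothetical stand-in that writes plain text; the real atomic_yaml_write() in utils.py may differ in signature and details.

```python
import os
import tempfile


def atomic_write_text(path: str, text: str) -> None:
    """Write text to path so readers see either the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".yaml")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())  # push the data to disk before the rename
        os.replace(tmp_path, path)  # atomic swap
    except BaseException:
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```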

Affected handlers: /personality (clear + set), /sethome, /reasoning
(_save_config_key helper), /verbose (tool_progress cycling).

Also fixes missing encoding='utf-8' on the /personality clear write.

Salvaged from PR #1211 by albatrosjj.
2026-03-29 15:32:46 -07:00
Teknium 83cbf7b5bb fix(gateway): use atomic writes for config.yaml to prevent data loss (#3800)
Replace all 5 plain open(config_path, 'w') calls in gateway command
handlers with atomic_yaml_write() from utils.py. This uses the
established tempfile + fsync + os.replace pattern to ensure config.yaml
is never left half-written if the process is killed mid-write.

Affected handlers: /personality (clear + set), /sethome, /reasoning
(_save_config_key helper), /verbose (tool_progress cycling).

Also fixes missing encoding='utf-8' on the /personality clear write.

Salvaged from PR #1211 by albatrosjj.
2026-03-29 15:31:21 -07:00
Teknium 563101e2a9 feat: add Canvas LMS skill for fetching courses and assignments (#3799)
Adds a Canvas LMS integration skill under optional-skills/productivity/canvas/
with a Python CLI wrapper (canvas_api.py) for listing courses and assignments
via personal access token auth.

Cherry-picked from PR #1250 by Alicorn-Max-S with:
- Moved from skills/ to optional-skills/ (niche educational integration)
- Fixed hardcoded ~/.hermes/ path to use $HERMES_HOME
- Removed Canvas env vars from .env.example (optional skill)
- Cleaned stale 'mini-swe-agent backend' reference from .env.example header

Co-authored-by: Alicorn-Max-S <Alicorn-Max-S@users.noreply.github.com>
2026-03-29 15:28:32 -07:00
Teknium fe6a916284 feat(skills): add one-three-one-rule communication skill (#3797)
Adds a structured 1-3-1 decision-making framework as an optional skill.
Produces: one problem statement, three options with trade-offs, one
recommendation with definition of done and implementation plan.

Moved to optional-skills/ (niche communication framework, not broadly
needed by default). Improved description with clearer trigger conditions
and replaced implementation-specific example with a generic one.

Based on PR #1262 by Willardgmoore.

Co-authored-by: Willard Moore <willardgmoore@users.noreply.github.com>
2026-03-29 15:25:12 -07:00
Teknium 57481c8ac5 fix(tools): implement send_message routing for Matrix, Mattermost, HomeAssistant, DingTalk (#3796)
* fix(tools): implement send_message routing for Matrix, Mattermost, HomeAssistant, DingTalk

Matrix, Mattermost, HomeAssistant, and DingTalk were present in
platform_map but fell through to the "not yet implemented" else branch,
causing send_message tool calls to silently fail on these platforms.

Add four async sender functions:
- _send_mattermost: POST /api/v4/posts via Mattermost REST API
- _send_matrix: PUT /_matrix/client/v3/rooms/.../send via Matrix CS API
- _send_homeassistant: POST /api/services/notify/notify via HA REST API
- _send_dingtalk: POST to session webhook URL

Add routing in _send_to_platform() and 17 unit tests covering success,
HTTP errors, missing config, env var fallback, and Matrix txn_id uniqueness.

* fix: pass platform tokens explicitly to Mattermost/Matrix/HA senders

The original PR passed pconfig.extra to sender functions, but tokens
live at pconfig.token (not in extra). This caused the senders to always
fall through to env var lookup instead of using the gateway-resolved
token.

Changes:
- Mattermost/Matrix/HA: accept token as first arg, matching the
  Telegram/Discord/Slack sender pattern
- DingTalk: add DINGTALK_WEBHOOK_URL env var fallback + docstring
  explaining the session-webhook vs robot-webhook difference
- Tests updated for new signatures + new DingTalk env var test

---------

Co-authored-by: sprmn24 <oncuevtv@gmail.com>
2026-03-29 15:17:46 -07:00
Teknium c62cadb73a fix: make display_hermes_home imports lazy to prevent ImportError during hermes update (#3776)
When a user runs 'hermes update', the Python process caches old modules
in sys.modules.  After git pull updates files on disk, lazy imports of
newly-updated modules fail because they try to import display_hermes_home
from the cached (old) hermes_constants which doesn't have the function.

This specifically broke the gateway auto-restart in cmd_update — importing
hermes_cli/gateway.py triggered the top-level 'from hermes_constants
import display_hermes_home' against the cached old module.  The ImportError
was silently caught, so the gateway was never restarted after update.

Users with a running gateway then hit the ImportError on their next
Telegram/Discord message when the stale gateway process lazily loaded
run_agent.py (new version) which also had the top-level import.

Fixes:
- hermes_cli/gateway.py: lazy import at call site (line 940)
- run_agent.py: lazy import at call site (line 6927)
- tools/terminal_tool.py: lazy imports at 3 call sites
- tools/tts_tool.py: static schema string (no module-level call)
- hermes_cli/auth.py: lazy import at call site (line 2024)
- hermes_cli/main.py: reload hermes_constants after git pull in cmd_update
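
The failure mode and the reload fix can be reproduced in isolation. The sketch below uses a throwaway module (demo_constants, a hypothetical name) in a temp directory to stand in for hermes_constants; it is a demonstration of the mechanism, not the code in cmd_update.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_stale_import() -> tuple[bool, str]:
    """Return (import_failed_before_reload, value_after_reload)."""
    old_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # keep .pyc caching out of the demo
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "demo_constants.py"
        source.write_text("VERSION = 1\n")
        sys.path.insert(0, tmp)
        try:
            import demo_constants  # now cached in sys.modules

            # Simulate `git pull`: the file on disk gains a new function.
            source.write_text("VERSION = 2\n\ndef display_home():\n    return '~/.demo'\n")
            try:
                from demo_constants import display_home
                failed = False
            except ImportError:
                failed = True  # the cached module object has no such name

            importlib.reload(demo_constants)  # re-execute the updated source
            from demo_constants import display_home
            return failed, display_home()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_constants", None)
            sys.dont_write_bytecode = old_flag
```

The first from-import fails because Python reuses the module object already in sys.modules; only after the reload does the new name exist.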

Also fixes 4 pre-existing test failures in test_parse_env_var caused by
NameError on display_hermes_home in terminal_tool.py.
2026-03-29 15:15:17 -07:00
Teknium 442888a05b fix: store token lock identity at acquire time for Slack and Discord
Community review (devoruncommented) correctly identified that the Slack
adapter re-read SLACK_APP_TOKEN from os.getenv() during disconnect,
which could differ from the value used during connect if the environment
changed. Discord had the same pattern with self.config.token (less risky
but still not bulletproof).

Both now follow the Telegram pattern: store the token identity on self
at acquire time, use the stored value for release, clear after release.

Also fixes docs: alias naming was hermes-<name> in the docs, but the
actual implementation creates <name> directly (e.g. ~/.local/bin/coder
not ~/.local/bin/hermes-coder).
2026-03-29 11:09:17 -07:00
Teknium b151d5f7a7 docs: fix profile alias naming and improve quick start
The docs incorrectly showed aliases as 'hermes-work' when the actual
implementation creates 'work' (profile name directly, no prefix).

Rewrote the user guide to lead with the alias pattern:
  hermes profile create coder → coder chat, coder setup, etc.

Also clarified that the banner shows 'Profile: coder' and the prompt
shows 'coder ❯' when a non-default profile is active.

Fixed alias paths in command reference (hermes-work → work).
2026-03-29 10:51:51 -07:00
Teknium f6db1b27ba feat: add profiles — run multiple isolated Hermes instances (#3681)
Each profile is a fully independent HERMES_HOME with its own config,
API keys, memory, sessions, skills, gateway, cron, and state.db.

Core module: hermes_cli/profiles.py (~900 lines)
  - Profile CRUD: create, delete, list, show, rename
  - Three clone levels: blank, --clone (config), --clone-all (everything)
  - Export/import: tar.gz archive for backup and migration
  - Wrapper alias scripts (~/.local/bin/<name>)
  - Collision detection for alias names
  - Sticky default via ~/.hermes/active_profile
  - Skill seeding via subprocess (handles module-level caching)
  - Auto-stop gateway on delete with disable-before-stop for services
  - Tab completion generation for bash and zsh

CLI integration (hermes_cli/main.py):
  - _apply_profile_override(): pre-import -p/--profile flag + sticky default
  - Full 'hermes profile' subcommand: list, use, create, delete, show,
    alias, rename, export, import
  - 'hermes completion bash/zsh' command
  - Multi-profile skill sync in hermes update

Display (cli.py, banner.py, gateway/run.py):
  - CLI prompt: 'coder ❯' when using a non-default profile
  - Banner shows profile name
  - Gateway startup log includes profile name

Gateway safety:
  - Token locks: Discord, Slack, WhatsApp, Signal (extends Telegram pattern)
  - Port conflict detection: API server, webhook adapter

Diagnostics (hermes_cli/doctor.py):
  - Profile health section: lists profiles, checks config, .env, aliases
  - Orphan alias detection: warns when wrapper points to deleted profile

Tests (tests/hermes_cli/test_profiles.py):
  - 71 automated tests covering: validation, CRUD, clone levels, rename,
    export/import, active profile, isolation, alias collision, completion
  - Full suite: 6760 passed, 0 new failures

Documentation:
  - website/docs/user-guide/profiles.md: full user guide (12 sections)
  - website/docs/reference/profile-commands.md: command reference (12 commands)
  - website/docs/reference/faq.md: 6 profile FAQ entries
  - website/sidebars.ts: navigation updated
2026-03-29 10:41:20 -07:00
Teknium 0df4d1278e feat(plugins): add enable/disable commands + interactive toggle UI (#3747)
Adds plugin management with three interfaces:

  hermes plugins          # interactive curses checklist (like hermes tools)
  hermes plugins enable   # non-interactive enable
  hermes plugins disable  # non-interactive disable
  hermes plugins list     # table with status column

Disabled plugins are stored in config.yaml under plugins.disabled and
skipped during discovery. Uses the same curses_checklist component as
hermes tools for the interactive UI.

Changes:
- hermes_cli/plugins.py: _get_disabled_plugins() + skip disabled during
  discover_and_load()
- hermes_cli/plugins_cmd.py: cmd_toggle() interactive UI, cmd_enable(),
  cmd_disable(), updated cmd_list() with status column
- hermes_cli/main.py: enable/disable subparser entries
- website/docs/reference/cli-commands.md: updated plugins section
- website/docs/user-guide/features/plugins.md: updated managing section
2026-03-29 10:39:57 -07:00
Teknium 95f99ea4b9 feat: built-in boot-md hook — run BOOT.md on gateway startup (#3733)
The gateway now ships with a built-in boot-md hook that checks for
~/.hermes/BOOT.md on every startup. If the file exists, the agent
executes its instructions in a background thread. No installation
or configuration needed — just create the file.

No BOOT.md = zero overhead (the hook silently returns).

Implementation:
- gateway/builtin_hooks/boot_md.py: handler with boot prompt,
  background thread, [SILENT] suppression, error handling
- gateway/hooks.py: _register_builtin_hooks() called at the start
  of discover_and_load() to wire in built-in hooks
- Docs updated: hooks page documents BOOT.md as a built-in feature
2026-03-29 10:19:54 -07:00
Teknium 811adca277 feat(skills): add SiYuan Note and Scrapling as optional skills (#3742)
Add two new optional skills:

- siyuan (optional-skills/productivity/): SiYuan Note knowledge base
  API skill — search, read, create, and manage blocks/documents in a
  self-hosted SiYuan instance via curl. Requires SIYUAN_TOKEN.

- scrapling (optional-skills/research/): Intelligent web scraping skill
  using the Scrapling library — anti-bot fetching, Cloudflare bypass,
  CSS/XPath selectors, spider framework for multi-page crawling.

Placed in optional-skills/ (not bundled) since both are niche tools
that require external dependencies.

Co-authored-by: FEUAZUR <FEUAZUR@users.noreply.github.com>
2026-03-29 09:34:56 -07:00
Teknium aafe37012a docs: update skills catalog — add red-teaming and optional skills (#3745)
* fix(discord): clean up deferred "thinking..." after slash commands complete

After a slash command is deferred (interaction.response.defer), the
"thinking..." indicator persisted indefinitely because the code used
followup.send() which creates a separate message instead of replacing
or removing the deferred response.

Fix: use edit_original_response() to replace "thinking..." with the
confirmation text when provided, or delete_original_response() to
remove it when there is no confirmation. Also consolidated /reasoning
and /voice handlers to use _run_simple_slash instead of duplicating
the defer+dispatch pattern.

Fixes #3595.

* docs: update skills catalog — add red-teaming category and all 16 optional skills

The skills catalog was missing:
- red-teaming category with the godmode jailbreaking skill
- The entire optional skills section (16 skills across 10 categories)

Added both with descriptions sourced from each SKILL.md frontmatter.
Verified against the actual skills/ and optional-skills/ directories.
2026-03-29 09:34:35 -07:00
Teknium 909de72426 fix: set api_mode when switching providers via hermes model (#3726)
When switching providers via 'hermes model', the previous provider's
api_mode persisted in config.yaml. Switching from Copilot
(codex_responses) to a chat_completions provider like Z.AI would send
requests to the wrong endpoint (404).

Set api_mode = chat_completions in the 4 provider flows that were
missing it: OpenRouter, custom endpoint, Kimi, and api_key_provider.

Co-authored-by: Nour Eddine Hamaidi <HenkDz@users.noreply.github.com>
2026-03-29 08:07:11 -07:00
Teknium ba1b600bce fix(tests): align skill/setup and platform mocks with current behavior (#3721)
- Skill invocation: no secret capture callback so SSH remote setup note is emitted
- Patch agent.skill_utils.sys for platform checks (skill_matches_platform)
- Skip CLAUDE.md priority test on Darwin (case-insensitive FS)

Made-with: Cursor

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-29 07:51:43 -07:00
Teknium fcd1645223 feat(skills): support external skill directories via config (#3678)
Add skills.external_dirs config option — a list of additional directories
to scan for skills alongside ~/.hermes/skills/. External dirs are read-only:
skill creation/editing always writes to the local dir. Local skills take
precedence when names collide.

This lets users share skills across tools/agents without copying them into
Hermes's own directory (e.g. ~/.agents/skills, /shared/team-skills).
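
The precedence rule can be sketched like this. discover_skills is a hypothetical name, and the flat layout (one directory per skill, each holding a SKILL.md) is an assumption; the real scanner may handle nested categories.

```python
from pathlib import Path


def discover_skills(local_dir: Path, external_dirs: list[Path]) -> dict[str, Path]:
    """Map skill name to its directory, scanning local first so it wins collisions."""
    found: dict[str, Path] = {}
    for base in [local_dir, *external_dirs]:
        if not base.is_dir():
            continue  # missing external dirs are skipped
        for skill_md in sorted(base.glob("*/SKILL.md")):
            # setdefault keeps the first hit, i.e. the earlier directory.
            found.setdefault(skill_md.parent.name, skill_md.parent)
    return found
```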

Changes:
- agent/skill_utils.py: add get_external_skills_dirs() and get_all_skills_dirs()
- agent/prompt_builder.py: scan external dirs in build_skills_system_prompt()
- tools/skills_tool.py: _find_all_skills() and skill_view() search external dirs;
  security check recognizes configured external dirs as trusted
- agent/skill_commands.py: /skill slash commands discover external skills
- hermes_cli/config.py: add skills.external_dirs to DEFAULT_CONFIG
- cli-config.yaml.example: document the option
- tests/agent/test_external_skills.py: 11 tests covering discovery, precedence,
  deduplication, and skill_view for external skills

Requested by community member primco.
2026-03-29 00:33:30 -07:00
Teknium 253a9adc72 docs(skills): clarify DuckDuckGo runtime requirements (#3680)
Co-authored-by: kshitij <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-29 00:17:57 -07:00
Teknium 300964178f docs: document credential file passthrough and env var forwarding for remote backends (#3677)
Three docs pages updated:

- security.md: New 'Credential File Passthrough' section, updated
  sandbox filter table to include Docker/Modal rows, added info box
  about Docker env_passthrough merge
- creating-skills.md: New 'Credential File Requirements' section
  with frontmatter examples and guidance on when to use env vars
  vs credential files
- environment-variables.md: Updated TERMINAL_DOCKER_FORWARD_ENV
  description to note auto-passthrough from skills
2026-03-29 00:16:34 -07:00
Teknium 7a3682ac3f feat: mount skill credential files + fix env passthrough for remote backends (#3671)
Two related fixes for remote terminal backends (Modal/Docker):

1. NEW: Credential file mounting system
   Skills declare required_credential_files in frontmatter. Files are
   mounted into Docker (read-only bind mounts) and Modal (mounts at
   creation + sync via exec on each command for mid-session changes).
   Google Workspace skill updated with the new field.

2. FIX: Docker backend now includes env_passthrough vars
   Skills that declare required_environment_variables (e.g. Notion with
   NOTION_API_KEY) register vars in the env_passthrough system. The
   local backend checked this, but Docker's forward_env was a separate
   disconnected list. Now Docker exec merges both sources, so
   skill-declared env vars are forwarded into containers automatically.

   This fixes the reported issue where NOTION_API_KEY in ~/.hermes/.env
   wasn't reaching the Docker container despite being registered via
   the Notion skill's prerequisites.

Closes #3665
2026-03-28 23:53:40 -07:00
Teknium 9f01244137 fix: replace user-facing hardcoded ~/.hermes paths with display_hermes_home()
Prep for profiles: user-facing messages now use display_hermes_home() so
diagnostic output shows the correct path for each profile.

New helper: display_hermes_home() in hermes_constants.py
12 files swept, ~30 user-facing string replacements.
Includes dynamic TTS schema description.
2026-03-28 23:47:21 -07:00
Teknium 0a80dd9c7a fix(discord): clean up deferred "thinking..." after slash commands complete (#3674)
After a slash command is deferred (interaction.response.defer), the
"thinking..." indicator persisted indefinitely because the code used
followup.send() which creates a separate message instead of replacing
or removing the deferred response.

Fix: use edit_original_response() to replace "thinking..." with the
confirmation text when provided, or delete_original_response() to
remove it when there is no confirmation. Also consolidated /reasoning
and /voice handlers to use _run_simple_slash instead of duplicating
the defer+dispatch pattern.

Fixes #3595.
2026-03-28 23:46:43 -07:00
Teknium 4764e06fde fix(acp): complete session management surface for editor clients (salvage #3501) (#3675)
* fix acp adapter session methods

* test: stub local command in transcription provider cases

---------

Co-authored-by: David Zhang <david.d.zhang@gmail.com>
2026-03-28 23:45:53 -07:00
kshitij 4c532c153b fix: URL-encode Signal phone numbers and correct attachment RPC parameter (#3670)
Fixes two Signal bugs:

1. SSE connection: URL-encode phone numbers so + isn't interpreted as space (400 Bad Request)
2. Attachment fetch: use 'id' parameter instead of 'attachmentId' (NullPointerException in signal-cli)
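
Why the first fix matters can be seen with the standard library alone. The query parameter name "account" and the helper account_query are assumptions made for this illustration.

```python
from urllib.parse import parse_qs, quote


def account_query(phone: str) -> str:
    """Build a query string with the phone number percent-encoded."""
    return "account=" + quote(phone, safe="")


# Unencoded, the leading "+" is decoded as a space on the receiving side.
broken = parse_qs("account=+15551234567")["account"][0]
fixed = parse_qs(account_query("+15551234567"))["account"][0]
```

Here broken starts with a space while fixed round-trips to the original number, because "+" is sent as %2B.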

Also refactors Signal tests with shared helpers.
2026-03-28 23:45:28 -07:00
kshitij a99c0478d0 fix(skills): move parallel-cli to optional-skills (#3673)
parallel-cli is a paid third-party vendor skill that requires
PARALLEL_API_KEY, but it was shipped in the default skills/ directory
with no env-var gate. This caused it to appear in every user's system
prompt even when they have no Parallel account or API key.

Move it to optional-skills/ so it is only visible through the Skills
Hub and must be explicitly installed. Also remove it from the default
skills catalog docs.
2026-03-28 23:45:05 -07:00
Teknium c6e3084baf fix(gateway): replace print() with logger calls in BasePlatformAdapter (#3669)
Salvage of PR #3616 (memosr). Replaces 6 print() calls with proper logger calls in BasePlatformAdapter and removes a redundant traceback.print_exc().

Co-Authored-By: memosr <memosr@users.noreply.github.com>
2026-03-28 22:25:35 -07:00
Teknium dcbdfdbb2b feat(docker): add Docker container for the agent (salvage #1841) (#3668)
Adds complete Docker packaging for Hermes Agent:
- Dockerfile based on debian:13.4 with all deps
- Entrypoint that bootstraps .env, config.yaml, SOUL.md on first run
- CI workflow to build, test, and push to DockerHub
- Documentation for interactive, gateway, and upgrade workflows

Closes #850, #913.

Changes vs original PR:
- Removed pre-created legacy cache/platform dirs from entrypoint
  (image_cache, audio_cache, pairing, whatsapp/session) — these are
  now created on demand by the application using the consolidated
  layout from get_hermes_dir()
- Moved docs from docs/docker.md to website/docs/user-guide/docker.md
  and added to Docusaurus sidebar

Co-authored-by: benbarclay <benbarclay@users.noreply.github.com>
2026-03-28 22:21:48 -07:00
Teknium 91b881f931 feat(mattermost): configurable mention behavior — respond without @mention (#3664)
Adds MATTERMOST_REQUIRE_MENTION and MATTERMOST_FREE_RESPONSE_CHANNELS
env vars, matching Discord's existing mention gating pattern.

- MATTERMOST_REQUIRE_MENTION=false: respond to all channel messages
- MATTERMOST_FREE_RESPONSE_CHANNELS=id1,id2: specific channels where
  bot responds without @mention even when require_mention is true
- DMs always respond regardless of mention settings
- @mention is now stripped from message text (clean agent input)

7 new tests for mention gating, free-response channels, DM bypass,
and mention stripping. Updated existing test for mention stripping.

Docs: updated mattermost.md with Mention Behavior section,
environment-variables.md with new vars, config.py with metadata.
2026-03-28 22:17:43 -07:00
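The mention-gating rules listed in 91b881f931 can be sketched roughly as follows. This is only an illustration: the function name `should_respond`, its parameters, and the way the mention is stripped are hypothetical and are not taken from the Mattermost adapter itself.

```python
def should_respond(text, channel_id, is_dm, bot_mention,
                   require_mention=True, free_channels=frozenset()):
    """Return (respond, cleaned_text) for an incoming message.

    Hypothetical helper: the real adapter reads the two env vars
    MATTERMOST_REQUIRE_MENTION and MATTERMOST_FREE_RESPONSE_CHANNELS.
    """
    mentioned = bot_mention in text
    # Strip the @mention so the agent sees clean input.
    cleaned = text.replace(bot_mention, "").strip() if mentioned else text
    # DMs, a disabled gate, or a free-response channel all bypass the gate.
    if is_dm or not require_mention or channel_id in free_channels:
        return True, cleaned
    return mentioned, cleaned
```

With the default gate, only the first of `should_respond("@hermes hi", "c1", False, "@hermes")` and `should_respond("hi", "c1", False, "@hermes")` produces a reply.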
Teknium 3e1157080a fix(tools): use non-deprecated streamable_http_client for MCP HTTP transport (#3646)
Switch MCP HTTP transport from the deprecated streamablehttp_client()
(mcp < 1.24.0) to the new streamable_http_client() API that accepts a
pre-built httpx.AsyncClient.

Changes vs the original PR #3391:
- Separate try/except imports so mcp < 1.24.0 doesn't break (graceful
  fallback to deprecated API instead of losing HTTP MCP entirely)
- Wrap httpx.AsyncClient in async-with for proper lifecycle management
  (the new SDK API explicitly skips closing caller-provided clients)
- Match SDK's own create_mcp_http_client defaults: follow_redirects=True,
  Timeout(connect_timeout, read=300.0)
- Keep deprecated code path as fallback for older SDK versions

Co-authored-by: HenkDz <HenkDz@users.noreply.github.com>
2026-03-28 18:20:49 -07:00
Teknium 1a032ccf79 fix(skills): stop marking persisted env vars missing on remote backends (#3650)
Salvage of PR #3452 (kentimsit). Fixes skill readiness checks on remote backends — persisted env vars are no longer incorrectly marked as missing.

Co-Authored-By: kentimsit <kentimsit@users.noreply.github.com>
2026-03-28 17:52:32 -07:00
Teknium 0bd7e95dfc fix(honcho): allow self-hosted local instances without API key (#3644)
Self-hosted Honcho on localhost doesn't require authentication, but
both the activation gates and the SDK client required an API key.

Combined fix from three contributor PRs:
- Relax all 8 activation gates to accept (api_key OR base_url) as
  valid credentials (#3482 by @cameronbergh)
- Use 'local' placeholder for the SDK client when base_url points to
  localhost/127.0.0.1/::1 (#3570 by @ygd58)

Files changed: run_agent.py (2 gates), cli.py (1 gate),
gateway/run.py (1 gate), honcho_integration/cli.py (2 gates),
hermes_cli/doctor.py (2 gates), honcho_integration/client.py (SDK).

Co-authored-by: cameronbergh <cameronbergh@users.noreply.github.com>
Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
Co-authored-by: devorun <devorun@users.noreply.github.com>
2026-03-28 17:49:56 -07:00
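A minimal sketch of the two rules combined in 0bd7e95dfc, assuming the gate and the placeholder logic live in small helpers. The names `honcho_enabled` and `sdk_api_key` are hypothetical; only the `(api_key OR base_url)` condition, the `'local'` placeholder, and the three local hosts come from the commit message.

```python
from urllib.parse import urlparse

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}

def honcho_enabled(api_key, base_url):
    # Either credential is enough to pass an activation gate.
    return bool(api_key or base_url)

def sdk_api_key(api_key, base_url):
    # A real key always wins; otherwise use the 'local' placeholder,
    # but only when base_url points at a local host.
    if api_key:
        return api_key
    host = urlparse(base_url or "").hostname
    return "local" if host in LOCAL_HOSTS else None
```

`urlparse(...).hostname` strips the brackets from an IPv6 literal, so `http://[::1]:8000` is recognized as local.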
Teknium d35567c6e0 feat(web): add Exa as a web search and extract backend (#3648)
Adds Exa (https://exa.ai) as a fourth web backend alongside Parallel,
Firecrawl, and Tavily. Follows the exact same integration pattern:

- Backend selection: config web.backend=exa or auto-detect from EXA_API_KEY
- Search: _exa_search() with highlights for result descriptions
- Extract: _exa_extract() with full text content extraction
- Lazy singleton client with x-exa-integration header
- Wired into web_search_tool and web_extract_tool dispatchers
- check_web_api_key() and requires_env updated
- CLI: hermes setup summary, hermes tools config, hermes config show
- config.py: EXA_API_KEY in OPTIONAL_ENV_VARS with metadata
- pyproject.toml: exa-py>=2.9.0,<3 in dependencies


Salvaged from PR #1850.

Co-authored-by: louiswalsh <louiswalsh@users.noreply.github.com>
2026-03-28 17:35:53 -07:00
Teknium bea49e02a3 fix: route /bg spinner through TUI widget to prevent status bar collision (#3643)
Background agent's KawaiiSpinner wrote \r-based animation and stop()
messages through StdoutProxy, colliding with prompt_toolkit's status bar.

Two fixes:
- display.py: use isinstance(out, StdoutProxy) instead of fragile
  hasattr+name check for detecting prompt_toolkit's stdout wrapper
- cli.py: silence bg agent's raw spinner (_print_fn=no-op) and route
  thinking updates through the TUI widget only when no foreground
  agent is active; clear spinner text in finally block with same guard

Closes #2718

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-28 17:29:37 -07:00
nguyen binh c6e2e486bf fix: add download retry to cache_audio_from_url matching cache_image_from_url (#3401)
PR #3323 added retry with exponential backoff to cache_image_from_url
but missed the sibling function cache_audio_from_url 18 lines below in
the same file. A single transient 429/5xx/timeout loses voice messages
while image downloads now survive them.

Apply the same retry pattern: 3 attempts with 1.5s exponential backoff,
immediate raise on non-retryable 4xx.
2026-03-28 17:28:38 -07:00
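The retry pattern described in c6e2e486bf might look like the sketch below. The exception class, the `fetch` callable, and the exact delay schedule (1.5s, then 3.0s) are assumptions made for illustration; the commit only states three attempts, a 1.5s exponential backoff, and an immediate raise on non-retryable 4xx.

```python
import time

class HTTPStatusError(Exception):
    """Hypothetical error type carrying an HTTP status code."""
    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status

def download_with_retry(fetch, attempts=3, base_delay=1.5, sleep=time.sleep):
    for attempt in range(attempts):
        try:
            return fetch()
        except HTTPStatusError as exc:
            retryable = exc.status == 429 or exc.status >= 500
            # Non-retryable 4xx fails at once; so does the final attempt.
            if not retryable or attempt == attempts - 1:
                raise
        except TimeoutError:
            if attempt == attempts - 1:
                raise
        # Exponential backoff: base_delay, then 2 * base_delay, ...
        sleep(base_delay * (2 ** attempt))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.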
Teknium 973deb4f76 fix(browser): guard LLM response content against None in snapshot and vision (#3642)
Salvage of PR #3532 (binhnt92). Guards browser_tool.py against None content from reasoning-only models (DeepSeek-R1, QwQ). Follow-up to #3449.

Co-Authored-By: binhnt92 <binhnt92@users.noreply.github.com>
2026-03-28 17:25:04 -07:00
Teknium dc74998718 fix(sessions): support stdout (-) in session and snapshot export (salvage #3617) (#3641)
* fix(sessions): support stdout when output path is '-' in session export

* fix: style cleanup + extend stdout support to snapshot export

Follow-up for salvaged PR #3617:
- Fix import sys; on one line (style consistency)
- Update help text to mention - for stdout
- Apply same stdout support to hermes skills snapshot export

---------

Co-authored-by: ygd58 <buraysandro9@gmail.com>
2026-03-28 17:24:32 -07:00
Teknium 17617e4399 feat(discord): DISCORD_IGNORE_NO_MENTION — skip messages that @mention others but not the bot (#3640)
Salvage of PR #3310 (luojiesi). When DISCORD_IGNORE_NO_MENTION=true (default), messages that @mention other users but not the bot are silently skipped in server channels. DMs excluded — mentions there are just references.

Co-Authored-By: luojiesi <luojiesi@users.noreply.github.com>
2026-03-28 17:19:41 -07:00
Siddharth Balyan ffdfeb91d8 fix(nix): unify directory and file permissions across all three layers (#3619)
Activation script, tmpfiles, and container entrypoint now agree on
0750 for all directories. Tighten config.yaml and workspace documents
from 0644 to 0640 (group-readable, no world access). Add explicit
chmod for .managed marker and container $TARGET_HOME to eliminate
umask dependence. Secrets (auth.json, .env) remain 0600.
2026-03-29 05:29:24 +05:30
Teknium 857a5d7b47 fix: sanitize surrogate characters from clipboard paste to prevent UnicodeEncodeError (#3624)
Pasting text from rich-text editors (Google Docs, Word, etc.) can inject
lone surrogate characters (U+D800..U+DFFF) that are invalid UTF-8.
The OpenAI SDK serializes messages with ensure_ascii=False, then encodes
to UTF-8 for the HTTP body — surrogates crash this with:
  UnicodeEncodeError: 'utf-8' codec can't encode character '\udce2'

Three-layer fix:
1. Primary: sanitize user_message at the top of run_conversation()
2. CLI: sanitize in chat() before appending to conversation_history
3. Safety net: catch UnicodeEncodeError in the API error handler,
   sanitize the entire messages list in-place, and retry once.
   Also exclude UnicodeEncodeError from is_local_validation_error
   so it doesn't get classified as non-retryable.

Includes 14 new tests covering the sanitization helpers and the
integration with run_conversation().
2026-03-28 16:53:14 -07:00
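As an illustration of the sanitization step in 857a5d7b47, a helper along these lines replaces lone surrogates before the text is encoded. The function name and the choice of U+FFFD as the replacement are assumptions; the actual helpers may strip or replace differently.

```python
import re

# Lone surrogate code points cannot be encoded as UTF-8.
_SURROGATES = re.compile(r"[\ud800-\udfff]")

def sanitize_surrogates(text, replacement="\ufffd"):
    """Replace every lone surrogate so that text.encode('utf-8') succeeds."""
    return _SURROGATES.sub(replacement, text)
```

After sanitizing, `sanitize_surrogates("a\udce2b").encode("utf-8")` no longer raises `UnicodeEncodeError`.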
Teknium b029742092 fix(cli): strengthen paste collapse fallback for terminals without bracketed paste (#3625)
The _on_text_changed fallback only detected pastes when all characters
arrived in a single event (chars_added > 1).  Some terminals (notably
VSCode integrated terminal in certain configs) may deliver paste data
differently, causing the fallback to miss.

Add a second heuristic: if the newline count jumps by 4+ in a single
text-change event, treat it as a paste.  Alt+Enter only adds 1 newline
per event, so this never false-positives on manual multi-line input.

Also fixes: the fallback path did not set the _paste_just_collapsed flag
before replacing buffer text, which could cause a re-trigger loop.
2026-03-28 15:40:49 -07:00
Teknium 02fb7c4aaf docs: comprehensive docs audit — fix 12 stale/missing items across 10 pages (#3618)
Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from #3551/#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from #3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  #3572/#3573/#3576/#3580/#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (#3583).
- skills.md: List default GitHub taps including garrytan/gstack (#3605).
2026-03-28 15:26:35 -07:00
Teknium 1e924e99b9 refactor: consolidate ~/.hermes directory layout with backward compat (#3610)
New installs get a cleaner structure:
  cache/images/      (was image_cache/)
  cache/audio/       (was audio_cache/)
  cache/documents/   (was document_cache/)
  cache/screenshots/ (was browser_screenshots/)
  platforms/whatsapp/session/ (was whatsapp/session/)
  platforms/matrix/store/    (was matrix/store/)
  platforms/pairing/         (was pairing/)

Existing installs are unaffected -- get_hermes_dir() checks for the
old path first and uses it if present. No migration needed.

Adds get_hermes_dir(new_subpath, old_name) helper to hermes_constants.py
for reuse by any future subsystem.
2026-03-28 15:22:19 -07:00
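The backward-compatible lookup described in 1e924e99b9 can be sketched as below. This is not the real `get_hermes_dir()`: the sketch takes the home directory as an explicit argument and uses a hypothetical name so it can run standalone.

```python
from pathlib import Path

def resolve_hermes_dir(home: Path, new_subpath: str, old_name: str) -> Path:
    """Prefer the legacy directory if it already exists, else the new layout."""
    old_path = home / old_name
    if old_path.exists():
        return old_path           # existing install: keep using the old path
    return home / new_subpath     # fresh install: consolidated layout
```

For example, `resolve_hermes_dir(home, "cache/images", "image_cache")` returns `home / "image_cache"` only when that directory is already present.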
Teknium 614e43d3d9 feat(skills): add garrytan/gstack as default Skills Hub tap (#3605)
Add the gstack community skills repo to the default tap list and fix
skill_identifier construction for repos with an empty path prefix.

Co-authored-by: Tugrul Guner <tugrulguner@users.noreply.github.com>
2026-03-28 14:55:49 -07:00
Teknium e4480ff426 fix(config): accept 'model' key as alias for 'default' in model config (#3603)
Users intuitively write model: { model: my-model } instead of
model: { default: my-model } and it silently falls back to the
hardcoded default. Now both spellings work across all three config
consumers: runtime_provider, CLI, and gateway.

Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
2026-03-28 14:55:27 -07:00
Teknium 9a364f2805 fix: cap percentage displays at 100% in stats, gateway, and memory tool (#3599)
Salvage of PR #3533 (binhnt92). Follow-up to #3480 — applies min(100, ...) to 5 remaining unclamped percentage display sites in context_compressor, cli /stats, gateway /stats, and memory tool. These clamps are defensive, since the root cause (estimation heuristic) was already removed in #3480.

Co-Authored-By: binhnt92 <binhnt92@users.noreply.github.com>
2026-03-28 14:55:18 -07:00
Teknium 1b2d4f21f3 feat(cli): show resume-by-title command in exit summary (#3607)
When exiting a session that has a title (auto-generated or manual),
the exit summary now also shows:
  hermes -c "Session Title"
alongside the existing hermes --resume <id> command.

Also adds the title to the session info block.
2026-03-28 14:54:53 -07:00
Teknium 9009169eeb fix: recover updater when venv pip is missing (#3608)
Some environments lose pip inside the venv. Before invoking pip install,
check pip --version and bootstrap with ensurepip if missing. Applied to
both update code paths (_update_via_zip and cmd_update).


Salvaged from PR #3359.

Co-authored-by: Git-on-my-level <Git-on-my-level@users.noreply.github.com>
2026-03-28 14:54:49 -07:00
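A rough sketch of the recovery step in 9009169eeb, assuming the check is done by running `python -m pip --version` and the bootstrap by `python -m ensurepip`. The function name and the injectable `run` parameter are hypothetical conveniences, not the updater's actual structure.

```python
import subprocess
import sys

def ensure_pip(python=sys.executable, run=subprocess.run):
    """Bootstrap pip with ensurepip if the interpreter has lost it.

    Returns True when a bootstrap was needed, False otherwise.
    """
    probe = run([python, "-m", "pip", "--version"], capture_output=True)
    if probe.returncode == 0:
        return False
    # ensurepip ships with CPython and installs the bundled pip.
    run([python, "-m", "ensurepip", "--upgrade"], check=True)
    return True
```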
Teknium 0f042f3930 fix(email): filter automated/noreply senders to prevent reply loops (salvage #3461) (#3606)
* fix(gateway): filter automated/noreply senders in email adapter

Fixes #3453

Adds noreply/automated sender filtering to the email adapter. Drops emails from noreply, mailer-daemon, and postmaster addresses, and emails carrying bulk mail headers (Auto-Submitted, Precedence, List-Unsubscribe), before dispatching. Prevents pairing codes and AI responses from being sent to automated senders.

* fix: remove redundant seen_uids add + trailing whitespace cleanup

---------

Co-authored-by: devorun <130918800+devorun@users.noreply.github.com>
2026-03-28 14:50:50 -07:00
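The filter described in 0f042f3930 could be approximated like this. The function name, the regular expression, and the specific header values treated as automated are assumptions; the commit names only the address prefixes and the three headers.

```python
import re

_AUTOMATED_LOCALPART = re.compile(
    r"^(no-?reply|mailer-daemon|postmaster)\b", re.IGNORECASE
)

def is_automated_sender(address, headers):
    """Return True if a message looks machine-generated (sketch only)."""
    local = address.split("@", 1)[0]
    if _AUTOMATED_LOCALPART.match(local):
        return True
    h = {k.lower(): v.strip().lower() for k, v in headers.items()}
    # Assumption: any Auto-Submitted value other than "no" means automated.
    if h.get("auto-submitted", "no") != "no":
        return True
    # Assumption: these Precedence values mark bulk mail.
    if h.get("precedence") in {"bulk", "junk", "list"}:
        return True
    return "list-unsubscribe" in h
```

Dropping such messages before dispatch is what breaks the reply loop: the bot never answers a sender that would answer back automatically.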
Siddharth Balyan 7a9e45e560 fix: regenerate uv.lock to match v0.5.0 in pyproject.toml (#3594)
The lockfile was still pinned to hermes-agent 0.4.0 after the v0.5.0
release, causing downstream consumers (e.g. the Nix package built via
uv2nix) to report the wrong version.  Also drops stale transitive deps
(bashlex, boto3, swe-rex) that were carried over from the removed
swe-rex integration.
2026-03-29 03:19:47 +05:30
Teknium a641f20cac fix(gateway): self-heal missing launchd plist on start (#3601)
When the plist is deleted (manual cleanup, failed upgrade),
hermes gateway start now regenerates it automatically instead of
failing. Also simplifies the returncode==3 error path since the
plist is guaranteed to exist at that point.

Co-authored-by: Bartok9 <Bartok9@users.noreply.github.com>
2026-03-28 14:48:55 -07:00
Teknium ee066b7be6 fix: use placeholder api_key for custom providers without credentials (#3604)
Local/custom OpenAI-compatible providers (Ollama, LM Studio, vLLM) that
don't require auth were hitting empty api_key rejections from the OpenAI
SDK, especially when used as smart model routing targets.

Uses the same 'no-key-required' placeholder already used in
_resolve_openrouter_runtime() for the identical scenario.


Salvaged from PR #3543.

Co-authored-by: scottlowry <scottlowry@users.noreply.github.com>
2026-03-28 14:47:41 -07:00
Mibay a6bc13ce13 fix(github-auth): check ~/.hermes/.env before ~/.git-credentials for token extraction (#3466)
* fix(github-auth): check ~/.hermes/.env before ~/.git-credentials for token extraction

Users who configured their token via `hermes setup` have it stored in
~/.hermes/.env (GITHUB_TOKEN=...), not in ~/.git-credentials. On macOS
with osxkeychain as the default git credential helper, ~/.git-credentials
may not exist at all, causing silent 401 failures in all GitHub skills.

Add ~/.hermes/.env as the first fallback in the auth detection block and
the inline "Extracting the Token from Git Credentials" example.

Priority order: env var → ~/.hermes/.env → ~/.git-credentials → none

Part of fix for NousResearch/hermes-agent#3464

* fix(github-auth): check ~/.hermes/.env before ~/.git-credentials

Fixes #3464
2026-03-28 14:46:49 -07:00
Teknium f803f66339 fix(terminal): avoid merging heredoc EOF with fence wrapper (#3598)
One-shot local execution built `printf FENCE; <cmd>; __hermes_rc=...`, so a
command ending in a heredoc produced a closing line like `EOF; __hermes_rc=...`,
which is not a valid delimiter. Bash then treated the rest of the wrapper as
heredoc body, leaking it into tool output (e.g. gh issue/PR flows).

Use newline-separated wrapper lines so the delimiter stays alone and the
trailer runs after the heredoc completes.

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-28 14:43:41 -07:00
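To illustrate the fix in f803f66339, the sketch below builds the wrapper with newline separators instead of `;`. The fence string, the trailer lines, and the function name are hypothetical; only the idea of keeping the heredoc delimiter alone on its line comes from the commit.

```python
def wrap_command(cmd, fence="__HERMES_FENCE__"):
    """Wrap a shell command so its exit code is captured after it runs.

    Joining with newlines keeps a trailing heredoc delimiter such as EOF
    alone on its line, so bash still recognizes it as the terminator.
    """
    return "\n".join([
        f"printf '%s\\n' {fence}",
        cmd,
        "__hermes_rc=$?",
        f"printf '%s\\n' {fence}",
        "exit $__hermes_rc",
    ])
```

With a `;`-joined wrapper the closing line would read `EOF; __hermes_rc=$?`, which bash does not treat as the delimiter.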
Teknium 839d9d7471 feat(agent): configurable timeouts for auxiliary LLM calls via config.yaml (#3597)
Add per-task timeout settings under auxiliary.{task}.timeout in config.yaml
instead of hardcoded values. Users with slow local models (Ollama, llama.cpp)
can now increase timeouts for compression, vision, session search, etc.

Defaults:
  - auxiliary.compression.timeout: 120s (was hardcoded 45s)
  - auxiliary.vision.timeout: 30s (unchanged)
  - all other aux tasks: 30s (was hardcoded 30s)
  - title_generator: 30s (was hardcoded 15s)

call_llm/async_call_llm now auto-resolve timeout from config when not
explicitly passed. Callers can still override with an explicit timeout arg.

Based on PR #3406 by alanfwilliams. Converted from env vars to config.yaml
per project conventions.

Co-authored-by: alanfwilliams <alanfwilliams@users.noreply.github.com>
2026-03-28 14:35:28 -07:00
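The resolution order described in 839d9d7471 (explicit argument, then `auxiliary.{task}.timeout`, then the default) can be sketched as follows. The helper name and the plain-dict config are assumptions; the default values mirror the list in the commit message.

```python
DEFAULT_AUX_TIMEOUT = 30.0
TASK_DEFAULTS = {"compression": 120.0}  # every other task falls back to 30s

def resolve_aux_timeout(config, task, explicit=None):
    """Pick the timeout for an auxiliary LLM call (illustrative only)."""
    if explicit is not None:
        return explicit  # callers can still override
    task_cfg = config.get("auxiliary", {}).get(task, {})
    default = TASK_DEFAULTS.get(task, DEFAULT_AUX_TIMEOUT)
    return float(task_cfg.get("timeout", default))
```

A user with a slow local model would set, for example, `auxiliary.compression.timeout: 300` in config.yaml and get `300.0` back from the helper.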
Teknium 404a0b823e fix: add self-termination guard for pkill/killall targeting hermes/gateway (#3593)
Prevent the agent from accidentally killing its own process with
pkill -f gateway, killall hermes, etc. Adds a dangerous command
pattern that triggers the approval flow.

Co-authored-by: arasovic <arasovic@users.noreply.github.com>
2026-03-28 14:33:48 -07:00
Teknium dabe3c34cc feat(webhook): hermes webhook CLI + skill for event-driven subscriptions (#3578)
Adds 'hermes webhook' CLI subcommand and a skill — zero new model tools.

CLI commands (require webhook platform to be enabled):
  hermes webhook subscribe <name> [--events, --prompt, --deliver, ...]
  hermes webhook list
  hermes webhook remove <name>
  hermes webhook test <name>

All commands gate on webhook platform being enabled in config. If not
configured, prints setup instructions (gateway setup wizard, manual
config.yaml, or env vars).

The agent uses these via terminal tool, guided by the webhook-subscriptions
skill which documents setup, common patterns (GitHub, Stripe, CI/CD,
monitoring), prompt template syntax, security, and troubleshooting.

Adapter enhancement: webhook.py hot-reloads dynamic subscriptions from
~/.hermes/webhook_subscriptions.json on each incoming request (mtime-gated).
Static config.yaml routes always take precedence.

Docs: updated webhooks.md with Dynamic Subscriptions section, added
hermes webhook to cli-commands.md reference.

No new model tools. No toolset changes.

24 new tests for CLI CRUD, persistence, enabled-gate, and adapter
dynamic route loading.
2026-03-28 14:33:35 -07:00
Teknium 82d6c28bd5 fix(skills): cache-aware /skills install and uninstall in TUI (#3586)
Two fixes for /skills install and /skills uninstall slash commands:

1. input() hangs indefinitely inside prompt_toolkit's TUI event loop,
   soft-locking the CLI. The user typing the slash command is already
   implicit consent, so confirmation is now always skipped.

2. Cache invalidation was unconditional — installing or uninstalling a
   skill mid-session silently broke the prompt cache, increasing costs.
   The slash handler now defers cache invalidation by default (skill
   takes effect next session). Pass --now to invalidate immediately,
   with a message explaining the cost tradeoff. The CLI argparse path
   (hermes skills install) is unaffected and still invalidates.

Fixes #3474
Salvaged from PR #3496 by dlkakbs.
2026-03-28 14:32:23 -07:00
Islandman93 dc7d504aca Remove incorrect docker alternative for signal-cli (#3545)
Removed docker alternative for signal-cli-rest-api from the documentation. It does not support the raw signal-cli http daemon. See https://github.com/bbernhard/signal-cli-rest-api/issues/720
2026-03-28 14:28:57 -07:00
Teknium 9e411f7d70 fix(update): skip config migration prompts in non-interactive sessions (#3584)
hermes update hangs on input() when run from cron, scripts, or piped
contexts. Check both stdin and stdout isatty(), catch EOFError as a
fallback, and print guidance to run 'hermes config migrate' later.

Co-authored-by: phippsbot-byte <phippsbot-byte@users.noreply.github.com>
2026-03-28 14:26:32 -07:00
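A minimal sketch of the guard described in 9e411f7d70, assuming the prompt is wrapped in a small helper. The names `can_prompt` and `ask_or_skip` are hypothetical; the two `isatty()` checks and the `EOFError` fallback are what the commit describes.

```python
import sys

def can_prompt(stdin=None, stdout=None):
    """True only when both stdin and stdout are attached to a terminal."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    return stdin.isatty() and stdout.isatty()

def ask_or_skip(question, default, stdin=None, stdout=None, ask=input):
    """Ask interactively, or return the default in cron/piped contexts."""
    if not can_prompt(stdin, stdout):
        return default
    try:
        return ask(question)
    except EOFError:
        # stdin closed unexpectedly: behave as if non-interactive.
        return default
```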
Teknium 708f187549 fix(gateway): exit with failure when all platforms fail with retryable errors (#3592)
When all messaging platforms exhaust retries and get queued for background
reconnection, exit with code 1 so systemd Restart=on-failure can restart
the process. Previously the gateway stayed alive as a zombie with no
connected platforms and exit code 0.

Salvaged from PR #3567 by kelsia14. Test updates added.

Co-authored-by: kelsia14 <kelsia14@users.noreply.github.com>
2026-03-28 14:25:12 -07:00
Teknium d7c41f3cef fix(telegram): honor proxy env vars in fallback transport (salvage #3411) (#3591)
* fix: keep gateway running through telegram proxy failures

- continue gateway startup in degraded mode when Telegram cannot connect yet
- ensure Telegram fallback transport also honors proxy env vars
- support reconnect retries without taking down the whole gateway

* test(telegram): cover proxy env handling in fallback transport

---------

Co-authored-by: kufufu9 <pi@local>
2026-03-28 14:23:27 -07:00
Teknium 6893c3befc fix(gateway): inject PATH + VIRTUAL_ENV into launchd plist for macOS service (#3585)
Salvage of PR #2173 (hanai) and PR #3432 (timknip).

Injects PATH, VIRTUAL_ENV, and HERMES_HOME into the macOS launchd plist so gateway subprocesses find user-installed tools (node, ffmpeg, etc.). Matches systemd unit parity with venv/bin, node_modules/.bin, and resolved node dir in PATH. Includes 7 new tests and docs updates across 4 pages.

Co-Authored-By: Han <ihanai1991@gmail.com>
Co-Authored-By: timknip <timknip@users.noreply.github.com>
2026-03-28 14:23:26 -07:00
Teknium 5cdc24c2e2 docs(slack): add missing Messages Tab setup step (#3590)
Without enabling the Messages Tab in App Home settings, users see
"Sending messages to this app has been turned off" when trying to DM
the bot — even with all correct scopes and event subscriptions.

Add Step 5 (Enable the Messages Tab) between Event Subscriptions and
Install App, with a danger admonition. Also add troubleshooting entry
for this specific error message. Renumber subsequent steps (6→7→8→9).

Co-authored-by: Alberto Leal <mail4alberto@gmail.com>
2026-03-28 14:23:19 -07:00
Teknium 2dd286c162 fix: write models.dev disk cache atomically (#3588)
Use atomic_json_write() from utils.py instead of plain open()/json.dump()
for the models.dev disk cache. Prevents corrupted cache if the process is
killed mid-write — _load_disk_cache() silently returns {} on corrupt JSON,
losing all model metadata until the next successful API fetch.

Co-authored-by: memosr <memosr@users.noreply.github.com>
2026-03-28 14:20:30 -07:00
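The general technique behind 2dd286c162 is write-to-temp-then-rename. The sketch below is not the project's `atomic_json_write()` from utils.py, whose implementation is not shown here; it is a generic version under a deliberately different name.

```python
import json
import os
import tempfile

def atomic_json_write_sketch(path, data):
    """Write JSON so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be on the same filesystem for os.replace to be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic swap into place
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

If the process dies before `os.replace`, the previous cache file is left untouched rather than half-written.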
Teknium 924857c3e3 fix: prevent tool name/arg concatenation for Ollama-compatible endpoints (#3582)
Ollama reuses index 0 for every tool call in a parallel batch,
distinguishing them only by id.  The streaming accumulator now
detects a new non-empty id at an already-active index and redirects
it to a fresh slot, preventing names and arguments from being
concatenated into a single tool call.

No-op for normal providers that use incrementing indices.

Co-authored-by: dmater01 <dmater01@users.noreply.github.com>
2026-03-28 14:08:26 -07:00
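The accumulator change in 924857c3e3 can be illustrated with a simplified delta shape (flat dicts with `index`, `id`, `name`, `arguments`). That shape and the function name are assumptions made for the sketch; real streaming chunks are nested differently.

```python
def accumulate_tool_calls(deltas):
    """Merge streamed tool-call fragments into complete calls."""
    slots = {}   # slot number -> accumulated call
    active = {}  # provider index -> slot currently receiving fragments
    for d in deltas:
        idx = d["index"]
        call_id = d.get("id") or ""
        slot = active.get(idx)
        # A new non-empty id at an already-active index starts a fresh slot.
        reused = (slot is not None and call_id
                  and slots[slot]["id"] and call_id != slots[slot]["id"])
        if slot is None or reused:
            slot = len(slots)
            active[idx] = slot
            slots[slot] = {"id": call_id, "name": "", "arguments": ""}
        call = slots[slot]
        if call_id and not call["id"]:
            call["id"] = call_id
        call["name"] += d.get("name") or ""
        call["arguments"] += d.get("arguments") or ""
    return [slots[i] for i in range(len(slots))]
```

For providers that use incrementing indices the `reused` branch never fires, so behavior is unchanged.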
Teknium ba3bbf5b53 fix: add missing mattermost/matrix/dingtalk toolsets + platform consistency tests (salvage #3512) (#3583)
* Fixing mattermost configuration parsing bugs

* fix: add homeassistant to skills_config + platform consistency tests

Follow-up for cherry-picked #3512:
- Add homeassistant to skills_config.py PLATFORMS (was in tools_config
  but missing from skills_config)
- Add 3 consistency tests that verify all platforms in tools_config have
  matching toolset definitions, gateway includes, and skills_config entries
  — prevents this class of bug from recurring

---------

Co-authored-by: DaneelV3 <dannel@v3rtical.tech>
2026-03-28 14:05:02 -07:00
Teknium d6b4fa2e9f fix: strip @botname from commands so /new@TigerNanoBot resolves correctly (#3581)
Commands sent directly to the bot in groups include @botname suffix
(e.g. /compress@TigerNanoBot). get_command() now strips the @anything
part before lookup, matching how Telegram bot menu generates commands.
Fixes all slash commands silently doing nothing when sent with @mention.

Co-authored-by: MacroAnarchy <MacroAnarchy@users.noreply.github.com>
2026-03-28 14:01:01 -07:00
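The suffix stripping described in d6b4fa2e9f amounts to something like the following. This is a standalone sketch under a hypothetical name, not the project's `get_command()`; lowercasing the result is also an assumption.

```python
def parse_command(text):
    """Return the bare command name from a slash command, or None."""
    if not text.startswith("/"):
        return None
    first = text.split(maxsplit=1)[0]   # e.g. "/compress@TigerNanoBot"
    name = first[1:].split("@", 1)[0]   # drop the leading "/" and "@botname"
    return name.lower() or None
```

Both `/new` and `/new@TigerNanoBot` resolve to `new` with this helper.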
Teknium df1bf0a209 feat(api-server): add basic security headers (#3576)
Add X-Content-Type-Options: nosniff and Referrer-Policy: no-referrer
to all API server responses via a new security_headers_middleware.

Co-authored-by: Oktay Aydin <aydnOktay@users.noreply.github.com>
2026-03-28 14:00:52 -07:00
Teknium 49a49983e4 feat(api-server): add Access-Control-Max-Age to CORS preflight responses (#3580)
Adds Access-Control-Max-Age: 600 to CORS preflight responses, telling
browsers to cache the preflight for 10 minutes. Reduces redundant OPTIONS
requests and improves perceived latency for browser-based API clients.

Salvaged from PR #3514 by aydnOktay.

Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
2026-03-28 14:00:03 -07:00
Teknium e97c0cb578 fix: replace hardcoded ~/.hermes paths with get_hermes_home() for profile support
* feat: GPT tool-use steering + strip budget warnings from history

Two changes to improve tool reliability, especially for OpenAI GPT models:

1. GPT tool-use enforcement prompt: Adds GPT_TOOL_USE_GUIDANCE to the
   system prompt when the model name contains 'gpt' and tools are loaded.
   This addresses a known behavioral pattern where GPT models describe
   intended actions ('I will run the tests') instead of actually making
   tool calls. Inspired by similar steering in OpenCode (beast.txt) and
   Cline (GPT-5.1 variant).

2. Budget warning history stripping: Budget pressure warnings injected by
   _get_budget_warning() into tool results are now stripped when
   conversation history is replayed via run_conversation(). Previously,
   these turn-scoped signals persisted across turns, causing models to
   avoid tool calls in all subsequent messages after any turn that hit
   the 70-90% iteration threshold.

* fix: replace hardcoded ~/.hermes paths with get_hermes_home() for profile support

Prep for the upcoming profiles feature — each profile is a separate
HERMES_HOME directory, so all paths must respect the env var.

Fixes:
- gateway/platforms/matrix.py: Matrix E2EE store was hardcoded to
  ~/.hermes/matrix/store, ignoring HERMES_HOME. Now uses
  get_hermes_home() so each profile gets its own Matrix state.

- gateway/platforms/telegram.py: Two locations reading config.yaml via
  Path.home()/.hermes instead of get_hermes_home(). DM topic thread_id
  persistence and hot-reload would read the wrong config in a profile.

- tools/file_tools.py: Security path for hub index blocking was
  hardcoded to ~/.hermes, would miss the actual profile's hub cache.

- hermes_cli/gateway.py: Service naming now uses the profile name
  (hermes-gateway-coder) instead of a cryptic hash suffix. Extracted
  _profile_suffix() helper shared by systemd and launchd.

- hermes_cli/gateway.py: Launchd plist path and Label now scoped per
  profile (ai.hermes.gateway-coder.plist). Previously all profiles
  would collide on the same plist file on macOS.

- hermes_cli/gateway.py: Launchd plist now includes HERMES_HOME in
  EnvironmentVariables — was missing entirely, making custom
  HERMES_HOME broken on macOS launchd (pre-existing bug).

- All launchctl commands in gateway.py, main.py, status.py updated
  to use get_launchd_label() instead of hardcoded string.

Test fixes: DM topic tests now set HERMES_HOME env var alongside
Path.home() mock. Launchd test uses get_launchd_label() for expected
commands.
2026-03-28 13:51:08 -07:00
Teknium c0aa06f300 fix(test): update streaming test to match PR #3566 behavior change (#3574)
PR #3566 intentionally routes suppressed content to stream_delta_callback
when tool calls are present, so reasoning tag extraction can fire during
streaming. The test was still asserting the old behavior where content
after tool calls was fully suppressed from the callback.

Updated the assertion to match: content IS delivered to the callback
(for tag extraction), with display-level suppression handled by the
CLI's _stream_delta.
2026-03-28 13:41:23 -07:00
Teknium 3273732891 fix(api-server): add CORS headers to streaming SSE responses (#3573)
StreamResponse headers are flushed on prepare() before the CORS
middleware can inject them. Resolve CORS headers up front using
_cors_headers_for_origin() so the full set (including
Access-Control-Allow-Origin) is present on SSE streams.

Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
2026-03-28 13:38:30 -07:00
Teknium 09ebf8b252 feat(api-server): add /v1/health alias for OpenAI compatibility (#3572)
Add GET /v1/health as an alias to the existing /health endpoint so
OpenAI-compatible health checks work out of the box.

Co-authored-by: Oktay Aydin <aydnOktay@users.noreply.github.com>
2026-03-28 13:32:39 -07:00
Teknium 33c89e52ec fix(whatsapp): add **kwargs to media sending methods to accept metadata (#3571)
The base orchestrator passes metadata=_thread_metadata to
send_image_file, send_video, and send_document. WhatsApp was the
only platform adapter missing the parameter, causing TypeError
crashes when sending media.

Extended to all three methods (original PR only fixed send_image_file).


Salvaged from PR #3144.

Co-authored-by: afifai <afifai@users.noreply.github.com>
2026-03-28 13:28:04 -07:00
Teknium 558cc14ad9 chore: release v0.5.0 (v2026.3.28) (#3568)
The hardening release — Nous Portal 400+ models, Hugging Face provider,
Telegram Private Chat Topics, native Modal SDK, plugin lifecycle hooks,
improved OpenAI model reliability, Nix flake, supply chain hardening,
Anthropic output limits fix, and 50+ security/reliability fixes.

165 merged PRs, 65 closed issues across a 5-day window.
2026-03-28 13:11:39 -07:00
Teknium 1d0a119368 fix(display): show reasoning before response when tool calls suppress content (#3566)
* fix(provider): remove MiniMax /v1→/anthropic auto-correction to allow user override

The minimax-specific auto-correction in runtime_provider.py was
preventing users from overriding to the OpenAI-compatible endpoint
via MINIMAX_BASE_URL. Users in certain regions get nginx 404 on
api.minimax.io/anthropic and need to switch to api.minimax.chat/v1.

The generic URL-suffix detection already handles /anthropic →
anthropic_messages, so the minimax-specific code was redundant for
the default path and harmful for the override path.

Now: default /anthropic URL works via generic detection, user
override to /v1 gets chat_completions mode naturally.

Closes #3546 (different approach — respects user overrides instead
of changing the default endpoint).

* fix(display): show reasoning during streaming even when tool calls suppress content

When a model generates content (containing <REASONING_SCRATCHPAD> tags)
alongside tool calls in the same API response, content deltas were
suppressed from streaming once any tool call chunk arrived. This
prevented the CLI's tag extraction from running, so reasoning was
never shown during streaming. The post-response fallback then
displayed reasoning AFTER the already-visible streamed response,
creating a confusing reversed order.

Fix: route suppressed content to stream_delta_callback even when tool
calls are present. The CLI's _stream_delta handles tag extraction —
reasoning tags are routed to the reasoning display box, while
non-reasoning text is handled by the existing stream display logic.
This ensures reasoning appears before tool execution and the final
response, matching the expected visual order.
2026-03-28 12:34:32 -07:00
Teknium 901494d728 feat: make tool-use enforcement configurable via agent.tool_use_enforcement (#3551)
The TOOL_USE_ENFORCEMENT_GUIDANCE injection (added in #3528) was
hardcoded to only match gpt/codex model names. This makes it a
config option so users can turn it on for any model family.

New config key: agent.tool_use_enforcement
  - "auto" (default): matches gpt/codex (existing behavior)
  - true: inject for all models
  - false: never inject
  - list of strings: custom model-name substrings to match
    e.g. ["gpt", "codex", "deepseek", "qwen"]

No version bump needed — deep merge provides the default
automatically for existing installs.
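
The four config modes listed above can be sketched as a single predicate. This is a minimal illustration, not the project's code: `should_enforce` and `DEFAULT_MODELS` are hypothetical names, and case-insensitive substring matching is an assumption.

```python
# Hypothetical stand-in for the default model families ("gpt", "codex").
DEFAULT_MODELS = ("gpt", "codex")


def should_enforce(setting, model_name: str) -> bool:
    """Decide whether to inject the tool-use guidance for a model."""
    name = model_name.lower()
    if setting is True:       # true: inject for all models
        return True
    if setting is False:      # false: never inject
        return False
    if isinstance(setting, (list, tuple)):
        # list of strings: custom model-name substrings to match
        return any(str(part).lower() in name for part in setting)
    # "auto" (and, in this sketch, anything unrecognised): default families
    return any(part in name for part in DEFAULT_MODELS)
```

Treating unrecognised values like "auto" is a choice made for this sketch only; the real implementation may validate the setting differently.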

12 new tests covering all config modes.
2026-03-28 12:31:22 -07:00
Osman Mehmood d26ee20659 docs(discord): fix Public Bot setting for Discord-provided invite link (#3519)
The documentation incorrectly instructed users to set Public Bot to OFF,
but this prevents using the Discord-provided invite link (recommended method),
causing the error: 'Private application cannot have a default authorization link'.

Changes:
- Changed Step 2: Public Bot now set to ON (required for Installation tab method)
- Added info callout explaining the Private Bot alternative (use Manual URL)
- Added note in Step 5 Option A clarifying the Public Bot requirement

Fixes Discord bot setup flow for new users following the recommended path.

Co-authored-by: Docs Fix <docs-fix@example.com>
2026-03-28 12:24:43 -07:00
Teknium 393929831e fix(gateway): preserve transcript on /compress and hygiene compression (salvage #3516) (#3556)
* fix(gateway): preserve full transcript on /compress instead of overwriting

The /compress command calls _compress_context() which correctly ends the
old session (preserving its full transcript in SQLite) and creates a new
session_id for the continuation. However, it then immediately called
rewrite_transcript() on the OLD session_id, overwriting the preserved
transcript with the compressed version — destroying searchable history.

Auto-compression (triggered by context pressure) does not have this bug
because the gateway already handles the session_id swap via the
agent.session_id != session_id check after _run_agent_sync.

Fix: after _compress_context creates the new session, write the compressed
messages into the NEW session_id and update the session store pointer.
The old session's full transcript stays intact and searchable via
session_search.

Before: /compress destroys original messages, session_search can't find
details from compressed portions.

After: /compress behaves like /new for history — full transcript preserved,
compressed context for the live session.

* fix(gateway): preserve transcript on /compress and hygiene compression

Apply session_id swap after _compress_context in both /compress handler
and hygiene pre-compression. _compress_context creates a new session
(ending the old one), but both paths were calling rewrite_transcript on
the OLD session_id — overwriting the preserved transcript and destroying
searchable history.

Now follows the same pattern as the auto-compression handler (lines
5415-5423): detect the new session_id, update the session store entry,
and write compressed messages to the new session.

Also fix FakeCompressAgent test mock to include session_id attribute
and simulate the session_id change that real _compress_context performs.

Co-authored-by: MacroAnarchy <MacroAnarchy@users.noreply.github.com>

---------

Co-authored-by: MacroAnarchy <MacroAnarchy@users.noreply.github.com>
2026-03-28 12:23:43 -07:00
Teknium be322efdf2 fix(matrix): harden e2ee access-token handling (#3562)
* fix(matrix): harden e2ee access-token handling

* fix: patch nio mock in e2ee maintenance sync loop test

The sync_loop now imports nio for SyncError checking (from PR #3280),
so the test needs to inject a fake nio module via sys.modules.

---------

Co-authored-by: Cortana <andrew+cortana@chalkley.org>
2026-03-28 12:13:35 -07:00
Teknium be39292633 fix(cli): guard .strip() against None values from YAML config (#3552)
dict.get(key, default) only returns default when key is ABSENT.
When YAML has 'key:' with no value, it parses as None — .get()
returns None, then .strip() crashes with AttributeError.

Use (x or '') pattern to handle both missing and null cases.
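
A minimal sketch of the pattern, assuming a plain dict already parsed from YAML. The helper name `clean` and the sample keys are illustrative only.

```python
def clean(config: dict, key: str) -> str:
    """Return a stripped string, treating missing and null values alike."""
    # config.get(key, "") would still return None for a present-but-null key,
    # so coerce with `or ""` before calling .strip().
    return (config.get(key) or "").strip()


# A YAML line "model:" with no value parses to None; modelled here directly.
config = {"model": None, "provider": "  openrouter  "}
```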

Salvaged from PR #3217.

Co-authored-by: erosika <erosika@users.noreply.github.com>
2026-03-28 11:39:01 -07:00
Teknium df6ce848e9 fix(provider): remove MiniMax /v1→/anthropic auto-correction to allow user override (#3553)
The minimax-specific auto-correction in runtime_provider.py was
preventing users from overriding to the OpenAI-compatible endpoint
via MINIMAX_BASE_URL. Users in certain regions get nginx 404 on
api.minimax.io/anthropic and need to switch to api.minimax.chat/v1.

The generic URL-suffix detection already handles /anthropic →
anthropic_messages, so the minimax-specific code was redundant for
the default path and harmful for the override path.

Now: default /anthropic URL works via generic detection, user
override to /v1 gets chat_completions mode naturally.

Closes #3546 (different approach — respects user overrides instead
of changing the default endpoint).
2026-03-28 11:36:59 -07:00
Teknium 735ca9dfb2 refactor: replace swe-rex with native Modal SDK for Modal backend (#3538)
Drop the swe-rex dependency for Modal terminal backend and use the
Modal SDK directly (Sandbox.create + Sandbox.exec). This fixes:

- AsyncUsageWarning from synchronous App.lookup() in async context
- DeprecationError from unencrypted_ports / .url on unencrypted tunnels
  (deprecated 2026-03-05)

The new implementation:
- Uses modal.App.lookup.aio() for async-safe app creation
- Uses Sandbox.create.aio() with 'sleep infinity' entrypoint
- Uses Sandbox.exec.aio() for direct command execution (no HTTP server
  or tunnel needed)
- Keeps all existing features: persistent filesystem snapshots,
  configurable resources (CPU/memory/disk), sudo support, interrupt
  handling, _AsyncWorker for event loop safety

Consistent with the Docker backend precedent (PR #2804) where we
removed mini-swe-agent in favor of direct docker run.

Files changed:
- tools/environments/modal.py - core rewrite
- tools/terminal_tool.py - health check: modal instead of swerex
- hermes_cli/setup.py - install modal instead of swe-rex[modal]
- pyproject.toml - modal extra: modal>=1.0.0 instead of swe-rex[modal]
- scripts/kill_modal.sh - grep for hermes-agent instead of swe-rex
- tests/ - updated for new implementation
- environments/README.md - updated patches section
- website/docs - updated install command
2026-03-28 11:21:44 -07:00
Teknium 455bf2e853 feat: activate plugin lifecycle hooks (pre/post_llm_call, session start/end) (#3542)
The plugin system defined six lifecycle hooks but only pre_tool_call and
post_tool_call were invoked.  This activates the remaining four so that
external plugins (e.g. memory systems) can hook into the conversation
loop without touching core code.

Hook semantics:
- on_session_start: fires once when a new session is created
- pre_llm_call: fires once per turn before the tool-calling loop;
  plugins can return {"context": "..."} to inject into the ephemeral
  system prompt (not cached, not persisted)
- post_llm_call: fires once per turn after the loop completes, with
  user_message and assistant_response for sync/storage
- on_session_end: fires at the end of every run_conversation call

invoke_hook() now returns a list of non-None callback return values,
enabling pre_llm_call context injection while remaining backward
compatible (existing hooks that return None are unaffected).
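
The return-value contract described above might look roughly like this. `HookRegistry` and `register` are hypothetical names for illustration; only the behaviour of collecting non-None return values comes from the commit message.

```python
from collections import defaultdict


class HookRegistry:
    """Toy registry illustrating the 'list of non-None results' contract."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, name, callback):
        self._hooks[name].append(callback)

    def invoke_hook(self, name, **kwargs):
        results = []
        for callback in self._hooks[name]:
            value = callback(**kwargs)
            if value is not None:  # hooks that return None are simply skipped
                results.append(value)
        return results


registry = HookRegistry()
# A legacy-style hook that returns nothing, and one that injects context.
registry.register("pre_llm_call", lambda **kw: None)
registry.register("pre_llm_call", lambda **kw: {"context": "remembered fact"})
injected = registry.invoke_hook("pre_llm_call", user_message="hi")
```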

Salvaged from PR #2823.

Co-authored-by: Nicolò Boschi <boschi1997@gmail.com>
2026-03-28 11:14:54 -07:00
Teknium 411e3c1539 fix(api-server): allow Idempotency-Key in CORS headers (#3530)
Browser clients using the Idempotency-Key header for request
deduplication were blocked by CORS preflight because the header
was not listed in Access-Control-Allow-Headers.

Add Idempotency-Key to _CORS_HEADERS and add tests for both the
new header allowance and the existing Vary: Origin behavior.

Co-authored-by: aydnOktay <aydnOktay@users.noreply.github.com>
Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-03-28 08:16:41 -07:00
Teknium d313a3b7d7 fix: auto-repair jobs.json with invalid control characters (#3537)
load_jobs() uses strict json.load() which rejects bare control characters
(e.g. literal newlines) in JSON string values. When a cron job prompt
contains such characters, the parser throws JSONDecodeError and the
function silently returns an empty list — causing ALL scheduled jobs
to stop firing with no error logged.

Fix: on JSONDecodeError, retry with json.loads(strict=False). If jobs
are recovered, auto-rewrite the file with proper escaping via save_jobs()
and log a warning. Only fall back to empty list if the JSON is truly
unrecoverable.
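
A sketch of the strict-then-lenient parse, using only the standard `json` module. The function name `load_jobs_text` and its `(jobs, repaired)` return shape are assumptions made for this example; the real `load_jobs()` reads from disk and calls `save_jobs()`.

```python
import json


def load_jobs_text(raw: str) -> tuple[list, bool]:
    """Parse jobs JSON and return (jobs, repaired)."""
    try:
        return json.loads(raw), False
    except json.JSONDecodeError:
        pass
    try:
        # strict=False allows bare control characters inside JSON strings.
        jobs = json.loads(raw, strict=False)
    except json.JSONDecodeError:
        return [], False  # truly unrecoverable
    return jobs, True


# The Python "\n" below is a literal newline inside the JSON string value,
# which the strict parser rejects.
raw = '[{"prompt": "line one\nline two"}]'
jobs, repaired = load_jobs_text(raw)
rewritten = json.dumps(jobs)  # re-serialising escapes the newline properly
```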

Co-authored-by: Sebastian Bochna <sbochna@SB-MBP-M2-2.local>
2026-03-28 08:15:31 -07:00
Teknium 80a899a8e2 fix: enable fine-grained tool streaming for Claude/OpenRouter + retry SSE errors (#3497)
Root cause: Anthropic buffers entire tool call arguments and goes silent
for minutes while thinking (verified: 167s gap with zero SSE events on
direct API).  OpenRouter's upstream proxy times out after ~125s of
inactivity and drops the connection with 'Network connection lost'.

Fix: Send the x-anthropic-beta: fine-grained-tool-streaming-2025-05-14
header for Claude models on OpenRouter.  This makes Anthropic stream
tool call arguments token-by-token instead of buffering them, keeping
the connection alive through OpenRouter's proxy.

Live-tested: the exact prompt that consistently failed at ~128s now
completes successfully — 2,972 lines written, 49K tokens, 8 minutes.

Additional improvements:

1. Send explicit max_tokens for Claude through OpenRouter.  Without it,
   OpenRouter defaults to 65,536 (confirmed via echo_upstream_body) —
   only half of Opus 4.6's 128K limit.

2. Classify SSE 'Network connection lost' as retryable in the streaming
   inner retry loop.  The OpenAI SDK raises APIError from SSE error
   events, which was bypassing our transient error retry logic.

3. Actionable diagnostic guidance when stream-drop retries exhaust.
2026-03-28 08:01:37 -07:00
Teknium e295a2215a fix(gateway): include user-local bin paths in systemd unit PATH (#3527)
Add ~/.local/bin, ~/.cargo/bin, ~/go/bin, ~/.npm-global/bin to the
systemd unit PATH so tools installed via uv/pipx/cargo/go are
discoverable by MCP servers and terminal commands.

Uses a _build_user_local_paths() helper that checks exists() before
adding, and correctly resolves home dir for both user and system
service types.

Co-authored-by: Kal Sze <ksze@users.noreply.github.com>
2026-03-28 07:47:40 -07:00
Teknium 831e8ba0e5 feat: tool-use enforcement + strip budget warnings from history (#3528)
Cherry-pick of feat/gpt-tool-steering with modifications:

1. Tool-use enforcement prompt (refactored from GPT-specific):
   - Renamed GPT_TOOL_USE_GUIDANCE -> TOOL_USE_ENFORCEMENT_GUIDANCE
   - Added TOOL_USE_ENFORCEMENT_MODELS tuple: ('gpt', 'codex')
   - Injection logic now checks against the tuple instead of hardcoding
     'gpt' — adding new model families is a one-line change
   - Addresses models describing actions instead of making tool calls

2. Budget warning history stripping:
   - _strip_budget_warnings_from_history() strips _budget_warning JSON
     keys and [BUDGET WARNING: ...] text from tool results at the start
     of run_conversation()
   - Prevents old budget warnings from poisoning subsequent turns

Based on PR #3479 by teknium1.
2026-03-28 07:38:36 -07:00
Teknium 9d4b3e5470 fix: harden hermes update against diverged history, non-main branches, and gateway edge cases (salvage #3489) (#3492)
* fix: harden `hermes update` against diverged history, non-main branches, and gateway edge cases

The self-update command (`hermes update` / gateway `/update`) could fail
or silently corrupt state in several scenarios:

1. **Diverged history** — `git pull --ff-only` aborts with a cryptic
   subprocess error when upstream has force-pushed or rebased. Now falls
   back to `git reset --hard origin/main` since local changes are already
   stashed.

2. **User on a feature branch / detached HEAD** — the old code would
   either clobber the feature branch HEAD to point at origin/main, or
   silently pull against a non-existent remote branch. Now checks out
   main automatically before pulling, with a clear warning.

3. **Fetch failures** — network or auth errors produced raw subprocess
   tracebacks. Now shows user-friendly messages ("Network error",
   "Authentication failed") with actionable hints.

4. **reset --hard failure** — if the fallback reset itself fails (disk
   full, permissions), the old code would still attempt stash restore on
   a broken working tree. Now skips restore and tells the user their
   changes are safe in stash.

5. **Gateway /update stash conflicts** — non-interactive mode (Telegram
   `/update`) called sys.exit(1) when stash restore had conflicts, making
   the entire update report as failed even though the code update itself
   succeeded. Now treats stash conflicts as non-fatal in non-interactive
   mode (returns False instead of exiting).

* fix: restore stash and branch on 'already up to date' early return

The PR moved stash creation before the commit-count check (needed for
the branch-switching feature), but the 'already up to date' early return
didn't restore the stash or switch back to the original branch — leaving
the user stranded on main with changes trapped in a stash.

Now the early-return path restores the stash and checks out the original
branch when applicable.

---------

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-27 23:12:43 -07:00
Teknium 6ed9740444 fix: prevent unbounded growth of _seen_uids in EmailAdapter (#3490)
EmailAdapter._seen_uids accumulates every IMAP UID ever seen but
never removes any. A long-running gateway processing a high-volume
inbox would leak memory indefinitely — thousands of integers per day.

IMAP UIDs are monotonically increasing integers, so old UIDs are safe
to drop: new messages always have higher UIDs, and the IMAP UNSEEN
flag already prevents re-delivery regardless of our local tracking.

Fix adds _trim_seen_uids() which keeps only the most recent 1000 UIDs
(half of the 2000-entry cap) when the set grows too large. Called
automatically during connect() and after each fetch cycle.
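
The trimming rule can be sketched as follows, assuming UIDs are stored as integers in a set. The function is written as a pure helper for illustration, whereas the commit describes a method on `EmailAdapter`; `MAX_SEEN_UIDS` is a hypothetical constant name.

```python
MAX_SEEN_UIDS = 2000  # cap described in the commit message


def trim_seen_uids(seen: set[int], cap: int = MAX_SEEN_UIDS) -> set[int]:
    """Keep only the newest cap // 2 UIDs once the set exceeds the cap."""
    if len(seen) <= cap:
        return seen
    keep = cap // 2
    # UIDs increase monotonically, so the largest values are the most recent.
    return set(sorted(seen)[-keep:])
```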

Co-authored-by: memosr.eth <96793918+memosr@users.noreply.github.com>
2026-03-27 23:08:42 -07:00
Teknium 290c71a707 fix(gateway): scope progress thread fallback to Slack only (salvage #3414) (#3488)
* test(gateway): map fixture adapter by platform in progress threading tests

* fix(gateway): scope progress thread fallback to Slack only

---------

Co-authored-by: EmpireOperating <258363005+EmpireOperating@users.noreply.github.com>
2026-03-27 22:37:53 -07:00
Teknium 09796b183b fix: alibaba provider default endpoint and model list (#3484)
- Change default inference_base_url from dashscope-intl Anthropic-compat
  endpoint to coding-intl OpenAI-compat /v1 endpoint. The old Anthropic
  endpoint 404'd when used with the OpenAI SDK (which appends
  /chat/completions to a /apps/anthropic base URL).

- Update curated model list: remove models unavailable on coding-intl
  (qwen3-max, qwen-plus-latest, qwen3.5-flash, qwen-vl-max), add
  third-party models available on the platform (glm-5, glm-4.7,
  kimi-k2.5, MiniMax-M2.5).

- URL-based api_mode auto-detection still works: overriding
  DASHSCOPE_BASE_URL to an /apps/anthropic endpoint automatically
  switches to anthropic_messages mode.

- Update provider description and env var descriptions to reflect the
  coding-intl multi-provider platform.

- Update tests to match new default URL and test the anthropic override
  path instead.
2026-03-27 22:10:10 -07:00
Teknium 15cfd20820 fix: cap context pressure percentage at 100% in display (#3480)
* fix: cap context pressure percentage at 100% in display

The forward-looking token estimate can overshoot the compaction threshold
(e.g. a large tool result pushes it from 70% to 109% in one step). The
progress bar was already capped via min(), but pct_int was not — causing
the user to see '109% to compaction' which is confusing.

Cap pct_int at 100 in both CLI and gateway display functions.
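
The cap amounts to a single `min()` call. In this sketch `pressure_percent` is a hypothetical helper, not one of the actual display functions.

```python
def pressure_percent(used_tokens: int, threshold_tokens: int) -> int:
    """Integer percentage toward the compaction threshold, capped at 100."""
    if threshold_tokens <= 0:
        return 100
    return min(100, used_tokens * 100 // threshold_tokens)
```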

Reported by @JoshExile82.

* refactor: use real API token counts for compression decisions

Replace the rough chars/3 estimation with actual prompt_tokens +
completion_tokens from the API response. The estimation was needed to
predict whether tool results would push context past the threshold, but
the default 50% threshold leaves ample headroom — if tool results push
past it, the next API call reports real usage and triggers compression
then.

This removes all estimation from the compression and context pressure
paths, making both 100% data-driven from provider-reported token counts.

Also removes the dead _msg_count_before_tools variable.
2026-03-27 21:42:09 -07:00
Teknium 03f24c1edd fix: session_search fallback preview on summarization failure (salvage #3413) (#3478)
* Fix #3409: Add fallback to session_search to prevent false negatives on summarization failure

Fixes #3409. When the auxiliary summarizer fails or returns None, the tool now returns a raw fallback preview of the matched session instead of silently dropping it and returning an empty list.

* fix: clean up fallback logic — separate exception handling from preview

Restructure the loop: handle exceptions first (log + nullify), build
entry dict once, then branch on result truthiness. Removes duplicated
field assignments and makes the control flow linear.

---------

Co-authored-by: devorun <130918800+devorun@users.noreply.github.com>
2026-03-27 21:27:51 -07:00
Teknium 388fa5293d fix(matrix): add missing matrix entry in PLATFORMS dict (#3473)
Matrix platform was missing from the PLATFORMS config, causing a
KeyError in _get_platform_tools() when handling Matrix messages.
Every other platform (telegram, discord, slack, etc.) was present
but matrix was overlooked.

Co-authored-by: williamtwomey <williamtwomey@users.noreply.github.com>
2026-03-27 18:36:23 -07:00
Teknium 83043e9aa8 fix: add timeout to subprocess calls in context_references (#3469)
_expand_git_reference() and _rg_files() called subprocess.run()
without a timeout. On a large repository, @diff, @staged, or
@git:N references could hang the agent indefinitely while git
or ripgrep produced output slowly.

- Add timeout=30 to git subprocess in _expand_git_reference()
  with a user-friendly error message on TimeoutExpired
- Add timeout=10 to rg subprocess in _rg_files() returning
  None on timeout (falls back to os.walk folder listing)
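
Both changes follow the same shape: pass `timeout=` to `subprocess.run()` and handle `subprocess.TimeoutExpired`. A minimal sketch, where `run_with_timeout` is a hypothetical helper rather than either of the functions named above:

```python
import subprocess
import sys
from typing import Optional


def run_with_timeout(cmd: list[str], timeout: float) -> Optional[str]:
    """Run cmd and return its stdout, or None if it exceeds the timeout."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising, so nothing lingers.
        return None
    return result.stdout
```

Returning None lets the caller choose a fallback, much as the `_rg_files()` change falls back to an `os.walk` listing.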

Co-authored-by: memosr.eth <96793918+memosr@users.noreply.github.com>
2026-03-27 17:51:14 -07:00
Teknium b6b87dedd4 fix: discover plugins before reading plugin toolsets in tools_config (#3457)
hermes tools and _get_platform_tools() call get_plugin_toolsets() /
_get_plugin_toolset_keys() without first ensuring plugins have been
discovered. discover_plugins() only runs as a side effect of importing
model_tools.py, which hermes tools never does. This means:

- hermes tools TUI never shows plugin toolsets (invisible to users)
- _get_platform_tools() in standalone processes misses plugin toolsets

Fix: call discover_plugins() (idempotent) in both
_get_plugin_toolset_keys() and _get_effective_configurable_toolsets()
before accessing plugin state. In the gateway/CLI where model_tools.py
is already imported, the call is a no-op (discover_and_load checks
_discovered flag).
2026-03-27 15:31:17 -07:00
Teknium 8fdfc4b00c fix(agent): detect thinking-budget exhaustion on truncation, skip useless retries (#3444)
When finish_reason='length' and the response contains only reasoning
(think blocks or empty content), the model exhausted its output token
budget on thinking with nothing left for the actual response.

Previously, this fell into either:
- chat_completions: 3 useless continuation retries (model hits same limit)
- anthropic/codex: generic 'Response truncated' error with rollback

Now: detect the think-only + length condition early and return immediately
with a targeted error message: 'Model used all output tokens on reasoning
with none left for the response. Try lowering reasoning effort or
increasing max_tokens.'

This saves 2 wasted API calls on the chat_completions path and gives
users actionable guidance instead of a cryptic error.

The existing think-only retry logic (finish_reason='stop') is unchanged —
that's a genuine model glitch where retrying can help.
2026-03-27 15:29:30 -07:00
Teknium 658692799d fix: guard aux LLM calls against None content + reasoning fallback + retry (salvage #3389) (#3449)
Salvage of #3389 by @binhnt92 with reasoning fallback and retry logic added on top.

All 7 auxiliary LLM call sites now use extract_content_or_reasoning() which mirrors the main agent loop's behavior: extract content, strip think blocks, fall back to structured reasoning fields, retry on empty.

Closes #3389.
2026-03-27 15:28:19 -07:00
Teknium ab09f6b568 feat: curate HF model picker with OpenRouter analogues (#3440)
Show only agentic models that map to OpenRouter defaults:

  Qwen/Qwen3.5-397B-A17B          ↔ qwen/qwen3.5-plus
  Qwen/Qwen3.5-35B-A3B            ↔ qwen/qwen3.5-35b-a3b
  deepseek-ai/DeepSeek-V3.2       ↔ deepseek/deepseek-chat
  moonshotai/Kimi-K2.5             ↔ moonshotai/kimi-k2.5
  MiniMaxAI/MiniMax-M2.5           ↔ minimax/minimax-m2.5
  zai-org/GLM-5                    ↔ z-ai/glm-5
  XiaomiMiMo/MiMo-V2-Flash         ↔ xiaomi/mimo-v2-pro
  moonshotai/Kimi-K2-Thinking      ↔ moonshotai/kimi-k2-thinking

Users can still pick any HF model via Enter custom model name.
2026-03-27 13:54:46 -07:00
Teknium e4e04c2005 fix: make tirith block verdicts approvable instead of hard-blocking (#3428)
Previously, tirith exit code 1 (block) immediately rejected the command
with no approval prompt — users saw 'BLOCKED: Command blocked by
security scan' and the agent moved on.  This prevented gateway/CLI users
from approving pipe-to-shell installs like 'curl ... | sh' even when
they understood the risk.

Changes:
- Tirith 'block' and 'warn' now both go through the approval flow.
  Users see the full tirith findings (severity, title, description,
  safer alternatives) and can choose to approve or deny.
- New _format_tirith_description() builds rich descriptions from tirith
  findings JSON so the approval prompt is informative.
- CLI startup now warns when tirith is enabled but not available, so
  users know command scanning is degraded to pattern matching only.

The default approval choice is still deny, so the security posture is
unchanged for unattended/timeout scenarios.

Reported via Discord by pistrie — 'curl -fsSL https://mandex.dev/install.sh | sh'
was hard-blocked with no way to approve.
2026-03-27 13:22:01 -07:00
Teknium 6f11ff53ad fix(anthropic): use model-native output limits instead of hardcoded 16K (#3426)
The Anthropic adapter defaulted to max_tokens=16384 when no explicit value
was configured.  This severely limits thinking-enabled models where thinking
tokens count toward max_tokens:

- Claude Opus 4.6 supports 128K output but was capped at 16K
- Claude Sonnet 4.6 supports 64K output but was capped at 16K

With extended thinking (adaptive or budget-based), the model could exhaust
the entire 16K on reasoning, leaving zero tokens for the actual response.
This caused two user-visible errors:
- 'Response truncated (finish_reason=length)' — thinking consumed most tokens
- 'Response only contains think block with no content' — thinking consumed all

Fix: add _ANTHROPIC_OUTPUT_LIMITS lookup table (sourced from Anthropic docs
and Cline's model catalog) and use the model's actual output limit as the
default.  Unknown future models default to 128K (the current maximum).

Also adds context_length clamping: if the user configured a smaller context
window (e.g. custom endpoint), max_tokens is clamped to context_length - 1
to avoid exceeding the window.

Closes #2706
2026-03-27 13:02:52 -07:00
Teknium fb46a90098 fix: increase API timeout default from 900s to 1800s for slow-thinking models (#3431)
Models like GLM-5/5.1 can think for 15+ minutes. The previous 900s
(15 min) default for HERMES_API_TIMEOUT killed legitimate requests.

Raised to 1800s (30 min) in both places that read the env var:
- _build_api_kwargs() timeout (non-streaming total timeout)
- _call_chat_completions() write timeout (streaming connection)

The streaming per-chunk read timeout (60s) and stale stream detector
(180-300s) are unchanged — those are appropriate for inter-chunk timing.
2026-03-27 13:02:23 -07:00
Teknium fd8c465e42 feat: add Hugging Face as a first-class inference provider (#3419)
Salvage of PR #1747 (original PR #1171 by @davanstrien) onto current main.

Registers Hugging Face Inference Providers (router.huggingface.co/v1) as a named provider:
- hermes chat --provider huggingface (or --provider hf)
- 18 curated open models via hermes model picker
- HF_TOKEN in ~/.hermes/.env
- OpenAI-compatible endpoint with automatic failover (Groq, Together, SambaNova, etc.)

Files: auth.py, models.py, main.py, setup.py, config.py, model_metadata.py, .env.example, 5 docs pages, 17 new tests.

Co-authored-by: Daniel van Strien <davanstrien@gmail.com>
2026-03-27 12:41:59 -07:00
Teknium f57ebf52e9 fix(api-server): cancel orphaned agent + true interrupt on SSE disconnect (salvage #3399) (#3427)
Salvage of #3399 by @binhnt92 with true agent interruption added on top.

When a streaming /v1/chat/completions client disconnects mid-stream, the agent is now interrupted via agent.interrupt() so it stops making LLM API calls, and the asyncio task wrapper is cancelled.

Closes #3399.
2026-03-27 11:33:19 -07:00
Teknium 5127567d5d perf(ttft): cache skills prompt with shared skill_utils module (salvage #3366) (#3421)
Two-layer caching for build_skills_system_prompt():
  1. In-process LRU (OrderedDict, max 8) — same-process: 546ms → <1ms
  2. Disk snapshot (.skills_prompt_snapshot.json) — cold start: 297ms → 103ms

Key improvements over original PR #3366:
- Extract shared logic into agent/skill_utils.py (parse_frontmatter,
  skill_matches_platform, get_disabled_skill_names, extract_skill_conditions,
  extract_skill_description, iter_skill_index_files)
- tools/skills_tool.py delegates to shared module — zero code duplication
- Proper LRU eviction via OrderedDict.move_to_end + popitem(last=False)
- Cache invalidation on all skill mutation paths:
  - skill_manage tool (in-conversation writes)
  - hermes skills install (CLI hub)
  - hermes skills uninstall (CLI hub)
  - Automatic via mtime/size manifest on cold start
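
The in-process layer could be sketched like this. `PromptCache` is a hypothetical class; only the use of `OrderedDict.move_to_end` and `popitem(last=False)` with a maximum of 8 entries is taken from the notes above.

```python
from collections import OrderedDict


class PromptCache:
    """Small LRU cache: most recently used entries sit at the end."""

    def __init__(self, max_entries: int = 8):
        self._entries = OrderedDict()
        self._max = max_entries

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used

    def clear(self):
        """Invalidate everything, e.g. after a skill install or uninstall."""
        self._entries.clear()
```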

prompt_builder.py no longer imports tools.skills_tool (avoids pulling
in the entire tool registry chain at prompt build time).

6301 tests pass, 0 failures.

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-27 10:54:02 -07:00
Siddharth Balyan cc4514076b feat(nix): add suffix PATHs during nix build for more agent-friendliness (#3274)
* refactor: suffix runtimeDeps PATH so apt-installed tools take priority

Changes makeWrapper from --prefix to --suffix. In container mode,
tools installed via apt in /usr/bin now win over read-only nix store
copies. Nix store versions become dead-letter fallbacks. Native NixOS
mode unaffected — tools in /run/current-system/sw/bin already precede
the suffix.

* feat(container): first-boot apt provisioning for agent tools

Installs nodejs, npm, curl via apt and uv via curl on first container
boot. Uses sentinel file so subsequent boots skip. Container recreation
triggers fresh install. Combined with --suffix PATH change, agents get
mutable tools that support npm i -g and uv without hitting read-only
nix store paths.

* docs: update nixosModules header for tool provisioning

* feat(container): consolidate first-boot provisioning + Python 3.11 venv

Merge sudo and tool apt installs into a single apt-get update call.
Move uv install outside the sentinel so transient failures retry on
next boot. Bootstrap a Python 3.11 venv via uv (--seed for pip) and
prepend ~/.venv/bin to PATH so agents get writable python/pip/node
out of the box.

---------

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-03-27 23:00:56 +05:30
Teknium 8ecd7aed2c fix: prevent reasoning box from rendering 3x during tool-calling loops (#3405)
Two independent bugs caused the reasoning box to appear three times when
the model produced reasoning + tool_calls:

Bug A: _build_assistant_message() re-fired reasoning_callback with the full
reasoning text even when streaming had already displayed it. The original
guard only checked structured reasoning_content deltas, but reasoning also
arrives via content tag extraction (<REASONING_SCRATCHPAD>/<think> tags
in delta.content), which went through _fire_stream_delta not
_fire_reasoning_delta. Fix: skip the callback entirely when streaming is
active — both paths display reasoning during the stream. Any reasoning not
shown during streaming is caught by the CLI post-response fallback.

Bug B: The post-response reasoning display checked _reasoning_stream_started,
but that flag was reset by _reset_stream_state() during intermediate turn
boundaries (when stream_delta_callback(None) fires between tool calls).
Introduced _reasoning_shown_this_turn flag that persists across the tool
loop and is only reset at the start of each user turn.

Live-tested in PTY: reasoning now shows exactly once per API call, no
duplicates across tool-calling loops.
2026-03-27 09:57:50 -07:00
Teknium e0dbbdb2c9 fix: eliminate 'Event loop is closed' / 'Press ENTER to continue' during idle (#3398)
The OpenAI SDK's AsyncHttpxClientWrapper.__del__ schedules aclose() via
asyncio.get_running_loop().create_task().  When an AsyncOpenAI client is
garbage-collected while prompt_toolkit's event loop is running (the common
CLI idle state), the aclose() task runs on prompt_toolkit's loop but the
underlying TCP transport is bound to a different (dead) worker loop.
The transport's self._loop.call_soon() then raises RuntimeError('Event
loop is closed'), which prompt_toolkit surfaces as the disruptive
'Unhandled exception in event loop ... Press ENTER to continue...' error.

Three-layer fix:

1. neuter_async_httpx_del(): Monkey-patches __del__ to a no-op at CLI
   startup before any AsyncOpenAI clients are created.  Safe because
   cached clients are explicitly cleaned via _force_close_async_httpx,
   and uncached clients' TCP connections are cleaned by the OS on exit.

2. Custom asyncio exception handler: Installed on prompt_toolkit's event
   loop to silently suppress 'Event loop is closed' RuntimeError.
   Defense-in-depth for SDK upgrades that might change the class name.

3. cleanup_stale_async_clients(): Called after each agent turn (when the
   agent thread joins) to proactively evict cache entries whose event
   loop is closed, preventing stale clients from accumulating.
2026-03-27 09:45:25 -07:00
Teknium eb2127c1dc fix(cron): prevent recurring job re-fire on gateway crash/restart loop (#3396)
When a gateway crashes mid-job execution (before mark_job_run can persist
the updated next_run_at), the job would fire again on every restart attempt
within the grace window. For a daily 6:15 AM job with a 2-hour grace,
rapidly restarting the gateway could trigger dozens of duplicate runs.

Fix: call advance_next_run() BEFORE run_job() in tick(). For recurring
jobs (cron/interval), this preemptively advances next_run_at to the next
future occurrence and persists it to disk. If the process then crashes
during execution, the job won't be considered due on restart.

One-shot jobs are left unchanged — they still retry on restart since
there's no future occurrence to advance to.

This changes the scheduler from at-least-once to at-most-once semantics
for recurring jobs, which is the correct tradeoff: missing one daily
message is far better than sending it dozens of times.
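
A toy model of the ordering change, assuming jobs are dicts with numeric timestamps and a fixed interval. The `tick` signature and the `persist` callback here are illustrative and do not reflect the scheduler's real API.

```python
def tick(job: dict, now: int, run_job, persist) -> bool:
    """Run a due job, advancing and persisting next_run_at before execution."""
    if job["next_run_at"] > now:
        return False  # not due
    interval = job.get("interval")
    if interval:
        # Recurring job: move to the next future occurrence and save it
        # *before* running, so a crash mid-run cannot cause a re-fire.
        next_run = job["next_run_at"]
        while next_run <= now:
            next_run += interval
        job["next_run_at"] = next_run
        persist(job)
    # One-shot jobs (no interval) are left untouched and retry on restart.
    run_job(job)
    return True


job = {"next_run_at": 100, "interval": 60}
saved = []


def crashing_run(_job):
    raise RuntimeError("simulated crash during execution")


try:
    tick(job, now=130, run_job=crashing_run, persist=lambda j: saved.append(dict(j)))
except RuntimeError:
    pass  # the "gateway" crashed, but next_run_at was already persisted
```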
2026-03-27 08:02:58 -07:00
Teknium 5a1e2a307a perf(ttft): salvage easy-win startup optimizations from #3346 (#3395)
* perf(ttft): dedupe shared tool availability checks

* perf(ttft): short-circuit vision auto-resolution

* perf(ttft): make Claude Code version detection lazy

* perf(ttft): reuse loaded toolsets for skills prompt

---------

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-27 07:49:44 -07:00
Teknium 41d9d08078 fix(telegram): fall back to no thread_id on 'Message thread not found' (#3390)
python-telegram-bot's BadRequest inherits from NetworkError, so the
send() retry loop was catching 'Message thread not found' as a transient
network error and retrying 3 times before silently failing. This killed
all tool progress messages, streaming responses, and typing indicators
when the incoming message carried an invalid message_thread_id.

Now detect BadRequest inside the NetworkError handler:
- 'thread not found' + thread_id set → clear thread_id and retry once
  (message still reaches the chat, just without topic threading)
- Other BadRequest errors → raise immediately (permanent, don't retry)
- True NetworkError → retry as before (transient)

252 silent failures in gateway.log traced to this on 2026-03-26.

5 new tests for thread fallback, non-thread BadRequest, no-thread sends,
network retry, and multi-chunk fallback.
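
A self-contained sketch of the three cases listed above. The two exception classes are local stand-ins that only mirror the inheritance relationship described in this commit (BadRequest subclassing NetworkError); `send_with_thread_fallback` and its parameters are hypothetical names, not the adapter's API.

```python
class NetworkError(Exception):
    """Stand-in for a transient transport error."""


class BadRequest(NetworkError):
    """Stand-in for a permanent request error that subclasses NetworkError."""


def send_with_thread_fallback(send, text, thread_id=None, max_retries=3):
    """Call send(text, thread_id), dropping thread_id if the thread is gone."""
    attempts = 0
    while True:
        try:
            return send(text, thread_id)
        except BadRequest as exc:
            # Must be checked before NetworkError because it is a subclass.
            if "thread not found" in str(exc).lower() and thread_id is not None:
                thread_id = None  # retry once without topic threading
                continue
            raise  # permanent error: do not retry
        except NetworkError:
            attempts += 1
            if attempts >= max_retries:
                raise
```

Because `thread_id` is cleared before the retry, a second "thread not found" cannot loop: it falls through to the immediate `raise`.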
2026-03-27 06:07:28 -07:00
Teknium b7bcae49c6 fix: SQLite WAL write-lock contention causing 15-20s TUI freeze (#3385)
Multiple hermes processes (gateway + CLI sessions + worktree agents) sharing
one state.db caused WAL write-lock convoy effects. SQLite's built-in busy
handler uses deterministic sleep intervals (up to 100ms) that synchronize
competing writers, creating 15-20 second freezes during agent init.

Root cause: timeout=30.0 with 7+ concurrent connections meant:
- WAL never checkpointed (294MB, readers always blocked it)
- Bloated WAL slowed all reads and writes
- Deterministic backoff caused convoy effects under contention

Fix:
- Replace 30s SQLite timeout with 1s + app-level retry (15 attempts,
  random 20-150ms jitter between retries to break convoys)
- Use BEGIN IMMEDIATE for explicit write-lock acquisition (fail fast)
- Set isolation_level=None for manual transaction control
- PASSIVE WAL checkpoint on close() and every 50 writes
- All 12 write methods converted to _execute_write() helper

Before: 15-20s frozen at create_session during agent init
After:  <1s to API call, WAL stays at ~4MB

Tested: 4355 tests pass, 3 concurrent live sessions with simultaneous
writes showed zero contention on every py-spy sample.
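
The retry scheme in the fix list can be sketched with the standard `sqlite3` module. This is an approximation under assumptions: `open_state_db` and `execute_write` are hypothetical names (the commit's helper is `_execute_write()`, whose real signature is not shown here), and the periodic PASSIVE checkpoint is omitted for brevity.

```python
import random
import sqlite3
import time


def open_state_db(path):
    # Short SQLite-level timeout; isolation_level=None hands transaction
    # control to the caller instead of the sqlite3 module.
    return sqlite3.connect(path, timeout=1.0, isolation_level=None)


def execute_write(conn, sql, params=(), max_attempts=15):
    """Run one write inside BEGIN IMMEDIATE, retrying on lock contention."""
    for attempt in range(max_attempts):
        try:
            conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
            try:
                conn.execute(sql, params)
                conn.execute("COMMIT")
                return
            except BaseException:
                if conn.in_transaction:
                    conn.execute("ROLLBACK")
                raise
        except sqlite3.OperationalError as exc:
            message = str(exc).lower()
            contended = "locked" in message or "busy" in message
            if not contended or attempt == max_attempts - 1:
                raise
            # Random 20-150 ms jitter so competing writers fall out of step.
            time.sleep(random.uniform(0.020, 0.150))
```

The random sleep is the part that breaks the convoy: writers that collided once are unlikely to wake at the same instant again.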
2026-03-27 05:22:57 -07:00
Teknium 915df02bbf fix(streaming): stale stream detector race causing spurious RemoteProtocolError
The stale stream detector (90s timeout) was killing healthy connections
during the model's thinking phase, producing self-inflicted
RemoteProtocolError ("peer closed connection without sending complete
message body"). Three issues:

1. last_chunk_time was never reset between inner stream retries, so
   subsequent attempts inherited the previous attempt's stale budget
2. The non-streaming fallback path didn't reset the timer either
3. 90s base timeout was too aggressive for large-context Opus sessions
   where thinking time before first token routinely exceeds 90s

Fix: reset last_chunk_time at the start of each streaming attempt and
before the non-streaming fallback. Increase base timeout to 180s and
scale to 300s for >100K token contexts.
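
A small sketch of the timeout rule, assuming the scaling is a simple step at 100K tokens (the commit message does not say whether the real code interpolates). Both function names are hypothetical.

```python
def stale_timeout_seconds(context_tokens):
    """180s base; 300s once the context exceeds 100K tokens (step assumed)."""
    return 300.0 if context_tokens > 100_000 else 180.0


def is_stale(last_chunk_time, now, context_tokens):
    """True when no chunk has arrived within the allowed window."""
    return (now - last_chunk_time) > stale_timeout_seconds(context_tokens)
```

For the detector to be fair, `last_chunk_time` has to be set to the current time at the start of every streaming attempt and before the non-streaming fallback, so that a retry never inherits the previous attempt's elapsed time.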

Made-with: Cursor
2026-03-27 04:05:51 -07:00
Teknium 75fcbc44ce feat(telegram): auto-discover fallback IPs via DoH when api.telegram.org is unreachable (#3376)
* feat(telegram): auto-discover fallback IPs via DoH when api.telegram.org is unreachable

On some networks (university, corporate), api.telegram.org resolves to a
valid Telegram IP that is unreachable due to routing/firewall rules. A
different IP in the same Telegram-owned 149.154.160.0/20 block works fine.

This adds automatic fallback IP discovery at connect time:
1. Query Google and Cloudflare DNS-over-HTTPS for api.telegram.org A records
2. Exclude the system-DNS IP (the unreachable one), use the rest as fallbacks
3. If DoH is also blocked, fall back to a seed list (149.154.167.220)
4. TelegramFallbackTransport tries primary first, sticks to whichever works

No configuration needed — works automatically. TELEGRAM_FALLBACK_IPS env var
still available as manual override. Zero impact on healthy networks (primary
path succeeds on first attempt, fallback never exercised).

No new dependencies (uses httpx already in deps + stdlib socket).
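
The selection logic in steps 2 and 3 can be sketched as a pure function, leaving out the DoH requests themselves. `choose_fallback_ips` is a hypothetical name, and treating "DoH returned nothing usable" the same as "DoH is blocked" is an assumption of this sketch.

```python
SEED_FALLBACK_IPS = ["149.154.167.220"]  # seed value quoted in this commit


def choose_fallback_ips(doh_answers, system_ip):
    """Return candidate fallback IPs, excluding the one system DNS returned."""
    candidates = []
    for ip in doh_answers:
        # Skip the unreachable system-DNS address and any duplicates
        # returned by more than one resolver.
        if ip != system_ip and ip not in candidates:
            candidates.append(ip)
    # No usable DoH answers: fall back to the seed list.
    return candidates or list(SEED_FALLBACK_IPS)
```

For example, if both resolvers return the system IP plus one other address, only that other address is kept as a fallback.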

* fix: share transport instance and downgrade seed fallback log to info

- Use single TelegramFallbackTransport shared between request and
  get_updates_request so sticky IP is shared across polling and API calls
- Keep separate HTTPXRequest instances (different timeout settings)
- Downgrade "using seed fallback IPs" from warning to info to avoid
  noisy logs on healthy networks

* fix: add telegram.request mock and discovery fixture to remaining test files

The original PR missed test_dm_topics.py and
test_telegram_network_reconnect.py — both need the telegram.request
mock module. The reconnect test also needs _no_auto_discovery since
_handle_polling_network_error calls connect() which now invokes
discover_fallback_ips().

---------

Co-authored-by: Mohan Qiao <Gavin-Qiao@users.noreply.github.com>
2026-03-27 04:03:13 -07:00
Teknium be416cdfa9 fix: guard config.get() against YAML null values to prevent AttributeError (#3377)
dict.get(key, default) returns None — not the default — when the key IS
present but explicitly set to null/~ in YAML.  Calling .lower() on that
raises AttributeError.

Use (config.get(key) or fallback) so both missing keys and explicit nulls
coalesce to the intended default.
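
A short demonstration of the pitfall and the fix. The key name `provider` and the default `"edge"` are illustrative values chosen for this sketch, not necessarily the ones used in the files listed below.

```python
# What a YAML loader yields for a line such as `provider: ~`.
config = {"provider": None}

# dict.get returns the stored None; the default applies only to missing keys.
assert config.get("provider", "edge") is None

# Coalescing with `or` covers both a missing key and an explicit null.
provider = (config.get("provider") or "edge").lower()
```

Note that `or` also replaces other falsy values such as an empty string, which is usually the desired behavior for a provider name.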

Files fixed:
- tools/tts_tool.py — _get_provider()
- tools/web_tools.py — _get_backend()
- tools/mcp_tool.py — MCPServerTask auth config
- trajectory_compressor.py — _detect_provider() and config loading

Co-authored-by: dieutx <dangtc94@gmail.com>
2026-03-27 04:03:00 -07:00
Teknium b8b1f24fd7 fix: handle addition-only hunks in V4A patch parser (#3325)
V4A patches with only + lines (no context or - lines) were silently
dropped because search_lines was empty and the 'if search_lines:' block
was the only code path. Addition-only hunks are common when the model
generates patches for new functions or blocks.

Adds an else branch that inserts at the context_hint position when
available, or appends at end of file.

Includes 2 regression tests for addition-only hunks with and without
context hints.
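
A rough sketch of the new else branch, operating on a list of lines. The function name and the choice to insert directly after the first line containing the hint are assumptions of this sketch; the real parser may resolve the hint position differently.

```python
def apply_addition_only_hunk(lines, added, context_hint=None):
    """Insert `added` after the hinted line, or append at end of file."""
    if context_hint is not None:
        for index, line in enumerate(lines):
            if context_hint in line:
                return lines[: index + 1] + added + lines[index + 1 :]
    # No hint, or the hint was not found: append at end of file.
    return lines + added
```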

Salvaged from PR #3092 by thakoreh.

Co-authored-by: Hiren <hiren.thakore58@gmail.com>
2026-03-26 19:38:04 -07:00
Teknium a2847ea7f0 fix(gateway): add media download retry to Mattermost, Slack, and base cache (#3323)
* fix(gateway): add media download retry to Mattermost, Slack, and base cache

Media downloads on Mattermost and Slack fail permanently on transient
errors (timeouts, 429 rate limits, 5xx server errors). Telegram and
WhatsApp already have retry logic, but these platforms had single-attempt
downloads with hardcoded 30s timeouts.

Changes:
- base.py cache_image_from_url: add retry with exponential backoff
  (covers Signal and any platform using the shared cache helper)
- mattermost.py _send_media_url: retry on 429/5xx/timeout (3 attempts)
- slack.py _download_slack_file: retry on timeout/5xx (3 attempts)
- slack.py _download_slack_file_bytes: same retry pattern

* test: add tests for media download retry

---------

Co-authored-by: dieutx <dangtc94@gmail.com>
2026-03-26 19:33:18 -07:00
Teknium 58ca875e19 feat(gateway): surface session config on /new, /reset, and auto-reset (#3321)
When a new session starts in the gateway (via /new, /reset, or
auto-reset), send the user a summary of the detected configuration:

   Session reset! Starting fresh.

  ◆ Model: qwen3.5:27b-q4_K_M
  ◆ Provider: custom
  ◆ Context: 8K tokens (config)
  ◆ Endpoint: http://localhost:11434/v1

This makes misconfigured context length immediately visible — a user
running a local 8K model that falls to the 128K default will see:

  ◆ Context: 128K tokens (default — set model.context_length in config to override)

instead of silently getting no compression and degraded responses.

- _format_session_info() resolves model, provider, context length,
  and endpoint from config + runtime, matching the hygiene code's
  resolution chain
- Local/custom endpoints shown; cloud endpoints hidden (not useful)
- Context source annotated: config, detected, or default with hint
- Appended to /new and /reset responses, and auto-reset notifications
- 9 tests covering all formatting paths and failure resilience

Addresses the user-facing side of #2708 — instead of trying to fix
every edge case in context detection, surface the values so users
can immediately see when something is wrong.
2026-03-26 19:27:58 -07:00
Teknium 3f95e741a7 fix: validate empty user messages to prevent Anthropic API 400 errors (#3322)
When user messages have empty content (e.g., Discord @mention-only
messages, unrecognized attachments), the Anthropic API rejects the
request with 'user messages must have non-empty content'.

Changes:
- anthropic_adapter.py: Add empty content validation for user messages
  (string and list formats), matching the existing pattern for assistant
  and tool messages. Empty content gets '(empty message)' placeholder.

- discord.py: Defense-in-depth check at gateway layer to catch empty
  messages before they enter session history.

- Add 4 regression tests covering empty string, whitespace-only,
  empty list, and empty text block scenarios.

Fixes #3143

Co-authored-by: Bartok9 <bartok9@users.noreply.github.com>
2026-03-26 19:24:03 -07:00
Teknium 03396627a6 fix(ci): pin acp <0.9 and update retry-exhaust test (#3320)
Two remaining CI failures:

1. agent-client-protocol 0.9.0 removed AuthMethod (replaced with
   AuthMethodAgent/EnvVar/Terminal). Pin to <0.9 until the new API
   is evaluated — our usage doesn't map 1:1 to the new types.

2. test_429_exhausts_all_retries_before_raising expected pytest.raises
   but the agent now catches 429s after max retries, tries fallback,
   then returns a result dict. Updated to check final_response.
2026-03-26 19:21:34 -07:00
Teknium 22cfad157b fix: gateway token double-counting — use absolute set instead of increment (#3317)
The gateway's update_session() used += for token counts, but the cached
agent's session_prompt_tokens / session_completion_tokens are cumulative
totals that grow across messages. Each update_session call re-added the
running total, inflating usage stats with every message (1.7x after 3
messages, worse over longer conversations).

Fix: change += to = for in-memory entry fields, add set_token_counts()
to SessionDB that uses direct assignment instead of SQL increment, and
switch the gateway to call it.

CLI mode continues using update_token_counts() (increment) since it
tracks per-API-call deltas — that path is unchanged.
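
The difference between the two update styles is easy to see with made-up numbers. The totals below are illustrative only and are not taken from a real session.

```python
# Cumulative totals reported by a cached agent after each of three messages.
cumulative_totals = [100, 250, 450]

incremented = 0  # old gateway behavior: += on an already-cumulative value
assigned = 0     # fixed behavior: store the latest total directly

for total in cumulative_totals:
    incremented += total
    assigned = total
```

Here `assigned` ends at the true total of 450, while `incremented` ends at 800 because every earlier message is counted again on each update.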

Based on analysis from PR #3222 by @zaycruz (closed).

Co-authored-by: zaycruz <zay@users.noreply.github.com>
2026-03-26 19:13:07 -07:00
Teknium 867eefdd9f fix(signal): track SSE keepalive comments as connection activity (#3316)
signal-cli sends SSE comment lines (':') as keepalives every ~15s. The
SSE listener only counted 'data:' lines as activity, so the health
monitor reported false idle warnings every 2 minutes during quiet
periods. Recognize ':' lines as valid activity per the SSE spec.
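
A minimal sketch of the activity check. `SSEActivityTracker` is a hypothetical class for illustration; the real listener's structure is not shown in this commit.

```python
class SSEActivityTracker:
    """Records when the SSE stream last showed any sign of life."""

    def __init__(self):
        self.last_activity = None

    def feed(self, line, now):
        # In the SSE format a line starting with ':' is a comment; servers
        # commonly send such lines as keepalives.
        if line.startswith(":") or line.startswith("data:"):
            self.last_activity = now
```

With this check, a stream that carries only keepalive comments every ~15s keeps `last_activity` fresh and no longer trips the idle warning.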

Salvaged from PR #2938 by ticketclosed-wontfix.
2026-03-26 19:10:25 -07:00
Teknium a8df7f9964 fix: gateway token double-counting with cached agents (#3306)
The cached agent accumulates session_input_tokens across messages, so
run_conversation() returns cumulative totals. But update_session() used
+= (increment), double-counting on every message after the first.

- session.py: change in-memory entry updates from += to = (direct
  assignment for cumulative values)
- hermes_state.py: add absolute=True flag to update_token_counts()
  that uses SET column = ? instead of SET column = column + ?
- session.py: pass absolute=True to the DB call

CLI path is unchanged — it passes per-API-call deltas directly to
update_token_counts() with the default absolute=False (increment).

Reported by @zaycruz in #3222. Closes #3222.
2026-03-26 19:04:53 -07:00
Teknium 1519c4d477 fix(session): add /resume CLI handler, session log truncation guard, reopen_session API (#3315)
Three improvements salvaged from PR #3225 by Mibayy:

1. Add /resume slash command handler in CLI process_command(). The
   command was registered in the commands registry but had no handler,
   so typing /resume produced 'Unknown command'. The handler resolves
   by title or session ID, ends the current session cleanly, loads
   conversation history from SQLite, re-opens the target session, and
   syncs the AIAgent instance. Follows the same pattern as new_session().

2. Add truncation guard in _save_session_log(). When resuming a session
   whose messages weren't fully written to SQLite, the agent starts with
   partial history and the first save would overwrite the full JSON log
   on disk. The guard reads the existing file and skips the write if it
   already has more messages than the current batch.

3. Add reopen_session() method to SessionDB. Proper API for clearing
   ended_at/end_reason instead of reaching into _conn directly.
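
The truncation guard from item 2 can be sketched as below. It assumes the session log is a JSON array of messages, which this commit message does not confirm; `save_session_log` is a hypothetical stand-in for `_save_session_log()`.

```python
import json
from pathlib import Path


def save_session_log(path, messages):
    """Write messages as JSON unless the file on disk already holds more."""
    path = Path(path)
    if path.exists():
        try:
            existing = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            existing = None  # unreadable log: nothing worth protecting
        if isinstance(existing, list) and len(existing) > len(messages):
            return False  # skip the write to avoid truncating history
    path.write_text(json.dumps(messages), encoding="utf-8")
    return True
```

The return value is only there to make the skip observable; the guard itself is the length comparison.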

Note: Bug 1 from the original PR (INSERT OR IGNORE + _session_db = None)
is already fixed on main — skipped as redundant.

Closes #3123.
2026-03-26 19:04:28 -07:00
Teknium 005786c55d fix(gateway): include per-platform ALLOW_ALL and SIGNAL_GROUP in startup allowlist check (#3313)
The startup warning 'No user allowlists configured' only checked
GATEWAY_ALLOW_ALL_USERS and per-platform _ALLOWED_USERS vars. It
missed SIGNAL_GROUP_ALLOWED_USERS and per-platform _ALLOW_ALL_USERS
vars (e.g. TELEGRAM_ALLOW_ALL_USERS), causing a false warning even
when users had these configured. The actual auth check in
_is_user_authorized already recognized these vars.

Cherry-picked from PR #3202 by binhnt92.

Co-authored-by: binhnt92 <binhnt.ht.92@gmail.com>
2026-03-26 18:23:49 -07:00
Teknium ad764d3513 fix(auxiliary): catch ImportError from build_anthropic_client in vision auto-detection (#3312)
_try_anthropic() caught ImportError on the module import (line 667-669)
but not on the build_anthropic_client() call (line 696). When the
anthropic_adapter module imports fine but the anthropic SDK is missing,
build_anthropic_client() raises ImportError at call time. This escaped
_try_anthropic() entirely, killing get_available_vision_backends() and
cascading to 7 test failures:

- 4 setup wizard tests hit unexpected 'Configure vision:' prompt
- 3 codex-auth-as-vision tests failed check_vision_requirements()

The fix wraps the build_anthropic_client call in try/except ImportError,
returning (None, None) when the SDK is unavailable — consistent with the
existing guard at the top of the function.
2026-03-26 18:21:59 -07:00
Teknium f008ee1019 fix(session): preserve reasoning fields in rewrite_transcript (#3311)
rewrite_transcript (used by /retry, /undo, /compress) was calling
append_message without reasoning, reasoning_details, or
codex_reasoning_items — permanently dropping them from SQLite.

Co-authored-by: alireza78a <alireza78.crypto@gmail.com>
2026-03-26 18:18:00 -07:00
Teknium 60fdb58ce4 fix(agent): update context compressor limits after fallback activation (#3305)
When _try_activate_fallback() switches to the fallback model, it
updates the agent's model/provider/client but never touches
self.context_compressor. The compressor keeps the primary model's
context_length and threshold_tokens, so compression decisions use
wrong limits — a 200K primary → 32K fallback still uses 200K-based
thresholds, causing oversized sessions to overflow the fallback.

Update the compressor's model, credentials, context_length, and
threshold_tokens after fallback activation using get_model_context_length()
for the new model.

Cherry-picked from PR #3202 by binhnt92.

Co-authored-by: binhnt92 <binhnt.ht.92@gmail.com>
2026-03-26 18:10:50 -07:00
Teknium 18d28c63a7 fix: add explicit hermes-api-server toolset for API server platform (#3304)
The API server adapter was creating agents without specifying
enabled_toolsets, causing ALL tools to load — including clarify,
send_message, and text_to_speech which don't work without interactive
callbacks or gateway dispatch.

Changes:
- toolsets.py: Add hermes-api-server toolset (core tools minus clarify,
  send_message, text_to_speech)
- api_server.py: Resolve toolsets from config.yaml platform_toolsets
  via _get_platform_tools() — same path as all other gateway platforms.
  Falls back to hermes-api-server default when no override configured.
- tools_config.py: Add api_server to PLATFORMS dict so users can
  customize via 'hermes tools' or platform_toolsets.api_server in
  config.yaml
- 12 tests covering toolset definition, config resolution, and
  user override

Reported by thatwolfieguy on Discord.
2026-03-26 18:02:26 -07:00
Teknium 3c57eaf744 fix: YAML boolean handling for tool_progress config (#3300)
YAML 1.1 parses bare `off` as boolean False, which is falsy in
Python's `or` chain and silently falls through to the 'all' default.
Users setting `display.tool_progress: off` in config.yaml saw no
effect — tool progress stayed on.

Normalise False → 'off' before the or chain in both affected paths:
- gateway/run.py _run_agent() tool progress reader
- cli.py HermesCLI.__init__() tool_progress_mode
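
A minimal sketch of the normalisation step. `normalize_tool_progress` is a hypothetical helper name; the actual code applies the same idea inline in the two paths listed above.

```python
def normalize_tool_progress(raw):
    """Map YAML's boolean False back to the string 'off' before defaulting."""
    if raw is False:
        raw = "off"
    # A missing or null value still falls through to the 'all' default.
    return raw or "all"
```

Without the first check, `False or "all"` evaluates to `"all"`, which is exactly the silent fall-through the commit describes.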

Reported by @gibbsoft in #2859. Closes #2859.
2026-03-26 17:58:50 -07:00
Teknium 2d232c9991 feat(cli): configurable busy input mode + fix /queue always working (#3298)
Two changes:

1. Fix /queue command: remove the _agent_running guard that rejected
   /queue after the agent finished. The prompt was deferred in
   _pending_input until the agent completed, then the handler checked
   _agent_running (now False) and rejected it. /queue now always queues
   regardless of timing.

2. Add display.busy_input_mode config (CLI-only):
   - 'interrupt' (default): Enter while busy interrupts the current run
     (preserves existing behavior)
   - 'queue': Enter while busy queues the message for the next turn,
     with a 'Queued for the next turn: ...' confirmation
   Ctrl+C always interrupts regardless of this setting.

Salvaged from PR #3037 by StefanoChiodino. Key differences:
- Default is 'interrupt' (preserves existing behavior) not 'queue'
- No config version bump (unnecessary for new key in existing section)
- Simpler normalization (no alias map)
- /queue fix is simpler: just remove the guard instead of intercepting
  commands during busy state
2026-03-26 17:58:40 -07:00
Teknium 0375b2a0d7 fix(gateway): silence background agent terminal output (#3297)
* fix(gateway): silence flush agent terminal output

quiet_mode=True only suppresses AIAgent init messages.
Tool call output still leaks to the terminal through
_safe_print → _print_fn during session reset/expiry.

Since #2670 injected live memory state into the flush prompt,
the flush agent now reliably calls memory tools — making the
output leak noticeable for the first time.

Set _print_fn to a no-op so the background flush is fully silent.

* test(gateway): add test for flush agent terminal silence + fix dotenv mock

- Add TestFlushAgentSilenced: verifies _print_fn is set to a no-op on
  the flush agent so tool output never leaks to the terminal
- Fix pre-existing test failures: replace patch('run_agent.AIAgent')
  with sys.modules mock to avoid importing run_agent (requires openai)
- Add autouse _mock_dotenv fixture so all tests in this file run
  without the dotenv package installed

* fix(display): route KawaiiSpinner output through print_fn to fully silence flush agent

The previous fix set tmp_agent._print_fn = no-op on the flush agent but
spinner output and quiet-mode cute messages bypassed _print_fn entirely:
- KawaiiSpinner captured sys.stdout at __init__ and wrote directly to it
- quiet-mode tool results used builtin print() instead of _safe_print()

Add optional print_fn parameter to KawaiiSpinner.__init__; _write routes
through it when set. Pass self._print_fn to all spinner construction sites
in run_agent.py and change the quiet-mode cute message print to _safe_print.
The existing gateway fix (tmp_agent._print_fn = lambda) now propagates
correctly through both paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(gateway): silence hygiene and compression background agents

Two more background AIAgent instances in the gateway were created with
quiet_mode=True but without _print_fn = no-op, causing tool output to
leak to the terminal:
- _hyg_agent (in-turn hygiene memory agent)
- tmp_agent (_compress_context path)

Apply the same _print_fn no-op pattern used for the flush agent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(display): remove unused _last_flush_time from KawaiiSpinner

Attribute was set but never read; upstream already removed it.
Leftover from conflict resolution during rebase onto upstream/main.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 17:40:31 -07:00
Teknium 08fa326bb0 feat(gateway): deliver background review notifications to user chat (#3293)
The background memory/skill review (_spawn_background_review) runs
after the agent response when turn/iteration counters exceed their
thresholds. It saves memories and skills, then prints a summary like
'💾 Memory updated · User profile updated'. In CLI mode this goes to
the terminal via _safe_print. In gateway mode, _safe_print routes to
print() which goes to stdout — invisible to the user.

Add a background_review_callback attribute to AIAgent. When set, the
background review thread calls it with the summary string after saves
complete. The gateway wires this to adapter.send() via the same
run_coroutine_threadsafe bridge used by status_callback, delivering
the notification to the user's chat.
2026-03-26 17:38:24 -07:00
Teknium bde45f5a2a fix(gateway): retry transient send failures and notify user on exhaustion (#3288)
When send() fails due to a network error (ConnectError, ReadTimeout, etc.),
the failure was silently logged and the user received no feedback — appearing
as a hang. In one reported case, a user waited 1+ hour for a response that
had already been generated but failed to deliver (#2910).

Adds _send_with_retry() to BasePlatformAdapter:
- Transient errors: retry up to 2x with exponential backoff + jitter
- On exhaustion: send delivery-failure notice so user knows to retry
- Permanent errors: fall back to plain-text version (preserves existing behavior)
- SendResult.retryable flag for platform-specific transient errors

All adapters benefit automatically via BasePlatformAdapter inheritance.
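
A simplified sketch of the transient-error path only. `TransientSendError`, the `notify_failure` callback, and the delay values are assumptions of this sketch; the plain-text fallback for permanent errors and the `SendResult.retryable` flag are left out.

```python
import random
import time


class TransientSendError(Exception):
    """Stand-in for ConnectError, ReadTimeout, and similar failures."""


def send_with_retry(send, text, notify_failure, max_retries=2,
                    base_delay=0.5, sleep=time.sleep):
    """Try send(text); retry transient failures, then report exhaustion."""
    for attempt in range(max_retries + 1):
        try:
            return send(text)
        except TransientSendError:
            if attempt < max_retries:
                # Exponential backoff plus random jitter between attempts.
                delay = base_delay * (2 ** attempt)
                sleep(delay + random.uniform(0, base_delay))
    # All attempts failed: tell the user instead of failing silently.
    notify_failure("Message delivery failed; please retry.")
    return None
```

Passing `sleep` as a parameter keeps the function testable without real delays.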

Cherry-picked from PR #3108 by Mibayy.

Co-authored-by: Mibayy <mibayy@users.noreply.github.com>
2026-03-26 17:37:10 -07:00
Teknium 716e616d28 fix(tui): status bar duplicates and degrades during long sessions (#3291)
shutil.get_terminal_size() can return stale/fallback values on SSH that
differ from prompt_toolkit's actual terminal width. Fragments built for
the wrong width overflow and wrap onto a second line (wrap_lines=True
default), appearing as progressively degrading duplicates.

- Read width from get_app().output.get_size().columns when inside a
  prompt_toolkit TUI, falling back to shutil outside TUI context
- Add wrap_lines=False on the status bar Window as belt-and-suspenders
  guard against any future width mismatch

Closes #3130

Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>
2026-03-26 17:33:11 -07:00
Teknium bdccdd67a1 fix: OpenClaw migration overwrites defaults and setup wizard skips imported sections (#3282)
Two bugs caused the OpenClaw migration during first-time setup to be
ineffective, forcing users to reconfigure everything manually:

1. The setup wizard created config.yaml with all defaults BEFORE running
   the migration, then the migrator ran with overwrite=False. Every config
   setting was reported as a 'conflict' against the defaults and skipped.
   Fix: use overwrite=True during setup-time migration (safe because only
   defaults exist at that point). The hermes claw migrate CLI command
   still defaults to overwrite=False for post-setup use.

2. After migration, the full setup wizard ran all 5 sections unconditionally,
   forcing the user through model/terminal/agent/messaging/tools configuration
   even when those settings were just imported.
   Fix: add _get_section_config_summary() and _skip_configured_section()
   helpers. After migration, each section checks if it's already configured
   (API keys present, non-default values, platform tokens) and offers
   'Reconfigure? [y/N]' with default No. Unconfigured sections still run
   normally.

Reported by Dev Bredda on social media.
2026-03-26 16:29:38 -07:00
Teknium 148f46620f fix(matrix): add backoff for SyncError in sync loop (#3280)
When the homeserver returns an error response, matrix-nio parses it
as a SyncError return value rather than raising an exception. The sync
loop only had backoff in the except handler, so SyncError caused a
tight retry loop (~489 req/s) flooding logs and hammering the
homeserver. Check the return value and sleep 5s before retry.
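
A sketch of the corrected loop shape. The `SyncError` class here is a local stand-in for the error response object, and `sync_loop` with its parameters is hypothetical; only the idea of checking the return value as well as catching exceptions comes from this commit.

```python
import asyncio


class SyncError:
    """Stand-in for an error response returned (not raised) by sync."""


async def sync_loop(client_sync, should_stop, backoff=5.0, sleep=asyncio.sleep):
    """Call client_sync repeatedly, backing off on raised AND returned errors."""
    while not should_stop():
        try:
            response = await client_sync()
        except Exception:
            await sleep(backoff)  # existing path: exception raised
            continue
        if isinstance(response, SyncError):
            await sleep(backoff)  # new path: error returned as a value
```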

Cherry-picked from PR #2937 by ticketclosed-wontfix.

Co-authored-by: ticketclosed-wontfix <ticketclosed-wontfix@users.noreply.github.com>
2026-03-26 16:19:58 -07:00
382 changed files with 41075 additions and 2539 deletions
+13
@@ -0,0 +1,13 @@
# Git
.git
.gitignore
.gitmodules
# Dependencies
node_modules
# CI/CD
.github
# Environment files
.env
+29 -1
@@ -59,12 +59,25 @@ OPENCODE_ZEN_API_KEY=
# OpenCode Go provides access to open models (GLM-5, Kimi K2.5, MiniMax M2.5)
# $10/month subscription. Get your key at: https://opencode.ai/auth
OPENCODE_GO_API_KEY=
# =============================================================================
# LLM PROVIDER (Hugging Face Inference Providers)
# =============================================================================
# Hugging Face routes to 20+ open models via unified OpenAI-compatible endpoint.
# Free tier included ($0.10/month), no markup on provider rates.
# Get your token at: https://huggingface.co/settings/tokens
# Required permission: "Make calls to Inference Providers"
HF_TOKEN=
# OPENCODE_GO_BASE_URL=https://opencode.ai/zen/go/v1 # Override default base URL
# =============================================================================
# TOOL API KEYS
# =============================================================================
# Exa API Key - AI-native web search and contents
# Get at: https://exa.ai
EXA_API_KEY=
# Parallel API Key - AI-native web search and extract
# Get at: https://parallel.ai
PARALLEL_API_KEY=
@@ -85,7 +98,7 @@ FAL_KEY=
HONCHO_API_KEY=
# =============================================================================
# TERMINAL TOOL CONFIGURATION (mini-swe-agent backend)
# TERMINAL TOOL CONFIGURATION
# =============================================================================
# Backend type: "local", "singularity", "docker", "modal", or "ssh"
# Terminal backend is configured in ~/.hermes/config.yaml (terminal.backend).
@@ -218,6 +231,21 @@ VOICE_TOOLS_OPENAI_KEY=
# Slack allowed users (comma-separated Slack user IDs)
# SLACK_ALLOWED_USERS=
# =============================================================================
# TELEGRAM INTEGRATION
# =============================================================================
# Telegram Bot Token - From @BotFather (https://t.me/BotFather)
# TELEGRAM_BOT_TOKEN=
# TELEGRAM_ALLOWED_USERS= # Comma-separated user IDs
# TELEGRAM_HOME_CHANNEL= # Default chat for cron delivery
# TELEGRAM_HOME_CHANNEL_NAME= # Display name for home channel
# Webhook mode (optional — for cloud deployments like Fly.io/Railway)
# Default is long polling. Setting TELEGRAM_WEBHOOK_URL switches to webhook mode.
# TELEGRAM_WEBHOOK_URL=https://my-app.fly.dev/telegram
# TELEGRAM_WEBHOOK_PORT=8443
# TELEGRAM_WEBHOOK_SECRET= # Recommended for production
# WhatsApp (built-in Baileys bridge — run `hermes whatsapp` to pair)
# WHATSAPP_ENABLED=false
# WHATSAPP_ALLOWED_USERS=15551234567
+2
@@ -19,6 +19,8 @@ concurrency:
jobs:
  build-and-deploy:
    # Only run on the upstream repository, not on forks
    if: github.repository == 'NousResearch/hermes-agent'
    runs-on: ubuntu-latest
    environment:
      name: github-pages
+63
@@ -0,0 +1,63 @@
name: Docker Build and Publish
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
concurrency:
  group: docker-${{ github.ref }}
  cancel-in-progress: true
jobs:
  build-and-push:
    # Only run on the upstream repository, not on forks
    if: github.repository == 'NousResearch/hermes-agent'
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          submodules: recursive
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build image
        uses: docker/build-push-action@v6
        with:
          context: .
          file: Dockerfile
          load: true
          tags: nousresearch/hermes-agent:test
          cache-from: type=gha
          cache-to: type=gha,mode=max
      - name: Test image starts
        run: |
          docker run --rm \
            -v /tmp/hermes-test:/opt/data \
            --entrypoint /opt/hermes/docker/entrypoint.sh \
            nousresearch/hermes-agent:test --help
      - name: Log in to Docker Hub
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - name: Push image
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        uses: docker/build-push-action@v6
        with:
          context: .
          file: Dockerfile
          push: true
          tags: |
            nousresearch/hermes-agent:latest
            nousresearch/hermes-agent:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
+78
@@ -210,6 +210,10 @@ registry.register(
The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.
**Path references in tool schemas**: If the schema description mentions file paths (e.g. default output directories), use `display_hermes_home()` to make them profile-aware. The schema is generated at import time, which is after `_apply_profile_override()` sets `HERMES_HOME`.
**State files**: If a tool stores persistent state (caches, logs, checkpoints), use `get_hermes_home()` for the base directory — never `Path.home() / ".hermes"`. This ensures each profile gets its own state.
**Agent-level tools** (todo, memory): intercepted by `run_agent.py` before `handle_function_call()`. See `todo_tool.py` for the pattern.
---
@@ -358,8 +362,69 @@ in config.yaml (or `HERMES_BACKGROUND_NOTIFICATIONS` env var):
---
## Profiles: Multi-Instance Support
Hermes supports **profiles** — multiple fully isolated instances, each with its own
`HERMES_HOME` directory (config, API keys, memory, sessions, skills, gateway, etc.).
The core mechanism: `_apply_profile_override()` in `hermes_cli/main.py` sets
`HERMES_HOME` before any module imports. All 119+ references to `get_hermes_home()`
automatically scope to the active profile.
### Rules for profile-safe code
1. **Use `get_hermes_home()` for all HERMES_HOME paths.** Import from `hermes_constants`.
NEVER hardcode `~/.hermes` or `Path.home() / ".hermes"` in code that reads/writes state.
```python
# GOOD
from hermes_constants import get_hermes_home
config_path = get_hermes_home() / "config.yaml"
# BAD — breaks profiles
config_path = Path.home() / ".hermes" / "config.yaml"
```
2. **Use `display_hermes_home()` for user-facing messages.** Import from `hermes_constants`.
This returns `~/.hermes` for default or `~/.hermes/profiles/<name>` for profiles.
```python
# GOOD
from hermes_constants import display_hermes_home
print(f"Config saved to {display_hermes_home()}/config.yaml")
# BAD — shows wrong path for profiles
print("Config saved to ~/.hermes/config.yaml")
```
3. **Module-level constants are fine** — they cache `get_hermes_home()` at import time,
which is AFTER `_apply_profile_override()` sets the env var. Just use `get_hermes_home()`,
not `Path.home() / ".hermes"`.
4. **Tests that mock `Path.home()` must also set `HERMES_HOME`** — since code now uses
`get_hermes_home()` (reads env var), not `Path.home() / ".hermes"`:
```python
with patch.object(Path, "home", return_value=tmp_path), \
     patch.dict(os.environ, {"HERMES_HOME": str(tmp_path / ".hermes")}):
    ...
```
5. **Gateway platform adapters should use token locks** — if the adapter connects with
a unique credential (bot token, API key), call `acquire_scoped_lock()` from
`gateway.status` in the `connect()`/`start()` method and `release_scoped_lock()` in
`disconnect()`/`stop()`. This prevents two profiles from using the same credential.
See `gateway/platforms/telegram.py` for the canonical pattern.
6. **Profile operations are HOME-anchored, not HERMES_HOME-anchored** — `_get_profiles_root()`
returns `Path.home() / ".hermes" / "profiles"`, NOT `get_hermes_home() / "profiles"`.
This is intentional — it lets `hermes -p coder profile list` see all profiles regardless
of which one is active.
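The difference between rule 1 and rule 6 can be sketched as follows. Both functions are simplified stand-ins written for this example (the real ones live in `hermes_constants` and the profile code), and the `environ` parameter is an assumption added so the sketch does not touch the real environment.

```python
import os
from pathlib import Path


def get_hermes_home(environ=os.environ) -> Path:
    """Stand-in: the active profile's directory, taken from HERMES_HOME."""
    return Path(environ.get("HERMES_HOME", str(Path.home() / ".hermes")))


def get_profiles_root() -> Path:
    """Stand-in for _get_profiles_root(): anchored to HOME, ignores HERMES_HOME."""
    return Path.home() / ".hermes" / "profiles"


def resolve_for_profile(name: str) -> dict:
    """Show which paths each anchoring strategy yields while `name` is active."""
    # Simulates what `hermes -p <name>` is described as doing: point HERMES_HOME
    # at the profile's own directory.
    env = {"HERMES_HOME": str(get_profiles_root() / name)}
    return {
        "state": get_hermes_home(env) / "config.yaml",        # per-profile state
        "profiles_root": get_profiles_root(),                 # sees every profile
        "wrong_profiles_root": get_hermes_home(env) / "profiles",  # nested, hides siblings
    }
```

With `coder` active, `wrong_profiles_root` ends up at `~/.hermes/profiles/coder/profiles`, which is why profile listing has to be HOME-anchored.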
## Known Pitfalls
### DO NOT hardcode `~/.hermes` paths
Use `get_hermes_home()` from `hermes_constants` for code paths. Use `display_hermes_home()`
for user-facing print/log messages. Hardcoding `~/.hermes` breaks profiles — each profile
has its own `HERMES_HOME` directory. This was the source of 5 bugs fixed in PR #3575.
### DO NOT use `simple_term_menu` for interactive menus
Rendering bugs in tmux/iTerm2 — ghosting on scroll. Use `curses` (stdlib) instead. See `hermes_cli/tools_config.py` for the pattern.
@@ -375,6 +440,19 @@ Tool schema descriptions must not mention tools from other toolsets by name (e.g
### Tests must not write to `~/.hermes/`
The `_isolate_hermes_home` autouse fixture in `tests/conftest.py` redirects `HERMES_HOME` to a temp dir. Never hardcode `~/.hermes/` paths in tests.
**Profile tests**: When testing profile features, also mock `Path.home()` so that
`_get_profiles_root()` and `_get_default_hermes_home()` resolve within the temp dir.
Use the pattern from `tests/hermes_cli/test_profiles.py`:
```python
@pytest.fixture
def profile_env(tmp_path, monkeypatch):
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setattr(Path, "home", lambda: tmp_path)
    monkeypatch.setenv("HERMES_HOME", str(home))
    return home
```
---
## Testing
+20
@@ -0,0 +1,20 @@
FROM debian:13.4
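# Run update and install in one layer so a cached package index is never reused with a changed install line
RUN apt-get update && apt-get install -y nodejs npm python3 python3-pip ripgrep ffmpeg gcc python3-dev libffi-dev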
COPY . /opt/hermes
WORKDIR /opt/hermes
RUN pip install -e ".[all]" --break-system-packages
RUN npm install
RUN npx playwright install --with-deps chromium
WORKDIR /opt/hermes/scripts/whatsapp-bridge
RUN npm install
WORKDIR /opt/hermes
RUN chmod +x /opt/hermes/docker/entrypoint.sh
ENV HERMES_HOME=/opt/data
VOLUME [ "/opt/data" ]
ENTRYPOINT [ "/opt/hermes/docker/entrypoint.sh" ]
+4
@@ -0,0 +1,4 @@
graft skills
graft optional-skills
global-exclude __pycache__
global-exclude *.py[cod]
+348
@@ -0,0 +1,348 @@
# Hermes Agent v0.5.0 (v2026.3.28)
**Release Date:** March 28, 2026
> The hardening release — Hugging Face provider, /model command overhaul, Telegram Private Chat Topics, native Modal SDK, plugin lifecycle hooks, tool-use enforcement for GPT models, Nix flake, 50+ security and reliability fixes, and a comprehensive supply chain audit.
---
## ✨ Highlights
- **Nous Portal now supports 400+ models** — The Nous Research inference portal has expanded dramatically, giving Hermes Agent users access to over 400 models through a single provider endpoint
- **Hugging Face as a first-class inference provider** — Full integration with HF Inference API including curated agentic model picker that maps to OpenRouter analogues, live `/models` endpoint probe, and setup wizard flow ([#3419](https://github.com/NousResearch/hermes-agent/pull/3419), [#3440](https://github.com/NousResearch/hermes-agent/pull/3440))
- **Telegram Private Chat Topics** — Project-based conversations with functional skill binding per topic, enabling isolated workflows within a single Telegram chat ([#3163](https://github.com/NousResearch/hermes-agent/pull/3163))
- **Native Modal SDK backend** — Replaced swe-rex dependency with native Modal SDK (`Sandbox.create.aio` + `exec.aio`), eliminating tunnels and simplifying the Modal terminal backend ([#3538](https://github.com/NousResearch/hermes-agent/pull/3538))
- **Plugin lifecycle hooks activated** — `pre_llm_call`, `post_llm_call`, `on_session_start`, and `on_session_end` hooks now fire in the agent loop and CLI/gateway, completing the plugin hook system ([#3542](https://github.com/NousResearch/hermes-agent/pull/3542))
- **Improved OpenAI Model Reliability** — Added `GPT_TOOL_USE_GUIDANCE` to prevent GPT models from describing intended actions instead of making tool calls, plus automatic stripping of stale budget warnings from conversation history that caused models to avoid tools across turns ([#3528](https://github.com/NousResearch/hermes-agent/pull/3528))
- **Nix flake** — Full uv2nix build, NixOS module with persistent container mode, auto-generated config keys from Python source, and suffix PATHs for agent-friendliness ([#20](https://github.com/NousResearch/hermes-agent/pull/20), [#3274](https://github.com/NousResearch/hermes-agent/pull/3274), [#3061](https://github.com/NousResearch/hermes-agent/pull/3061)) by @alt-glitch
- **Supply chain hardening** — Removed compromised `litellm` dependency, pinned all dependency version ranges, regenerated `uv.lock` with hashes, added CI workflow scanning PRs for supply chain attack patterns, and bumped deps to fix CVEs ([#2796](https://github.com/NousResearch/hermes-agent/pull/2796), [#2810](https://github.com/NousResearch/hermes-agent/pull/2810), [#2812](https://github.com/NousResearch/hermes-agent/pull/2812), [#2816](https://github.com/NousResearch/hermes-agent/pull/2816), [#3073](https://github.com/NousResearch/hermes-agent/pull/3073))
- **Anthropic output limits fix** — Replaced hardcoded 16K `max_tokens` with per-model native output limits (128K for Opus 4.6, 64K for Sonnet 4.6), fixing "Response truncated" and thinking-budget exhaustion on direct Anthropic API ([#3426](https://github.com/NousResearch/hermes-agent/pull/3426), [#3444](https://github.com/NousResearch/hermes-agent/pull/3444))
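The per-model output limit idea can be sketched as a simple lookup. This is not the project's implementation: the dictionary keys are hypothetical substrings, and reading "128K", "64K" and "16K" as 128,000, 64,000 and 16,000 tokens is an assumption.

```python
# Hypothetical model keys; the real identifiers and lookup live in the Hermes source.
NATIVE_OUTPUT_LIMITS = {
    "opus-4.6": 128_000,
    "sonnet-4.6": 64_000,
}

# The previously hardcoded value, kept here only as a fallback for unknown models.
LEGACY_DEFAULT = 16_000


def max_output_tokens(model: str) -> int:
    """Return the output cap for the first known key found in the model name."""
    name = model.lower()
    for key, limit in NATIVE_OUTPUT_LIMITS.items():
        if key in name:
            return limit
    return LEGACY_DEFAULT
```

A request builder would then pass `max_output_tokens(model)` as `max_tokens` instead of a fixed constant.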
---
## 🏗️ Core Agent & Architecture
### New Provider: Hugging Face
- First-class Hugging Face Inference API integration with auth, setup wizard, and model picker ([#3419](https://github.com/NousResearch/hermes-agent/pull/3419))
- Curated model list mapping OpenRouter agentic defaults to HF equivalents — providers with 8+ curated models skip live `/models` probe for speed ([#3440](https://github.com/NousResearch/hermes-agent/pull/3440))
- Added glm-5-turbo to Z.AI provider model list ([#3095](https://github.com/NousResearch/hermes-agent/pull/3095))
### Provider & Model Improvements
- `/model` command overhaul — extracted shared `switch_model()` pipeline for CLI and gateway, custom endpoint support, provider-aware routing ([#2795](https://github.com/NousResearch/hermes-agent/pull/2795), [#2799](https://github.com/NousResearch/hermes-agent/pull/2799))
- Removed `/model` slash command from CLI and gateway in favor of `hermes model` subcommand ([#3080](https://github.com/NousResearch/hermes-agent/pull/3080))
- Preserve `custom` provider instead of silently remapping to `openrouter` ([#2792](https://github.com/NousResearch/hermes-agent/pull/2792))
- Read root-level `provider` and `base_url` from config.yaml into model config ([#3112](https://github.com/NousResearch/hermes-agent/pull/3112))
- Align Nous Portal model slugs with OpenRouter naming ([#3253](https://github.com/NousResearch/hermes-agent/pull/3253))
- Fix Alibaba provider default endpoint and model list ([#3484](https://github.com/NousResearch/hermes-agent/pull/3484))
- Allow MiniMax users to override `/v1` → `/anthropic` auto-correction ([#3553](https://github.com/NousResearch/hermes-agent/pull/3553))
- Migrate OAuth token refresh to `platform.claude.com` with fallback ([#3246](https://github.com/NousResearch/hermes-agent/pull/3246))
### Agent Loop & Conversation
- **Improved OpenAI model reliability** — `GPT_TOOL_USE_GUIDANCE` prevents GPT models from describing actions instead of calling tools + automatic budget warning stripping from history ([#3528](https://github.com/NousResearch/hermes-agent/pull/3528))
- **Surface lifecycle events** — All retry, fallback, and compression events now surface to the user as formatted messages ([#3153](https://github.com/NousResearch/hermes-agent/pull/3153))
- **Anthropic output limits** — Per-model native output limits instead of hardcoded 16K `max_tokens` ([#3426](https://github.com/NousResearch/hermes-agent/pull/3426))
- **Thinking-budget exhaustion detection** — Skip useless continuation retries when model uses all output tokens on reasoning ([#3444](https://github.com/NousResearch/hermes-agent/pull/3444))
- Always prefer streaming for API calls to prevent hung subagents ([#3120](https://github.com/NousResearch/hermes-agent/pull/3120))
- Restore safe non-streaming fallback after stream failures ([#3020](https://github.com/NousResearch/hermes-agent/pull/3020))
- Give subagents independent iteration budgets ([#3004](https://github.com/NousResearch/hermes-agent/pull/3004))
- Update `api_key` in `_try_activate_fallback` for subagent auth ([#3103](https://github.com/NousResearch/hermes-agent/pull/3103))
- Graceful return on max retries instead of crashing thread ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Count compression restarts toward retry limit ([#3070](https://github.com/NousResearch/hermes-agent/pull/3070))
- Include tool tokens in preflight estimate, guard context probe persistence ([#3164](https://github.com/NousResearch/hermes-agent/pull/3164))
- Update context compressor limits after fallback activation ([#3305](https://github.com/NousResearch/hermes-agent/pull/3305))
- Validate empty user messages to prevent Anthropic API 400 errors ([#3322](https://github.com/NousResearch/hermes-agent/pull/3322))
- GLM reasoning-only and max-length handling ([#3010](https://github.com/NousResearch/hermes-agent/pull/3010))
- Increase API timeout default from 900s to 1800s for slow-thinking models ([#3431](https://github.com/NousResearch/hermes-agent/pull/3431))
- Send `max_tokens` for Claude/OpenRouter + retry SSE connection errors ([#3497](https://github.com/NousResearch/hermes-agent/pull/3497))
- Prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode ([#2701](https://github.com/NousResearch/hermes-agent/pull/2701)) by @ctlst
### Streaming & Reasoning
- **Persist reasoning across gateway session turns** with new schema v6 columns (`reasoning`, `reasoning_details`, `codex_reasoning_items`) ([#2974](https://github.com/NousResearch/hermes-agent/pull/2974))
- Detect and kill stale SSE connections ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Fix stale stream detector race causing spurious `RemoteProtocolError` ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Skip duplicate callback for `<think>`-extracted reasoning during streaming ([#3116](https://github.com/NousResearch/hermes-agent/pull/3116))
- Preserve reasoning fields in `rewrite_transcript` ([#3311](https://github.com/NousResearch/hermes-agent/pull/3311))
- Preserve Gemini thought signatures in streamed tool calls ([#2997](https://github.com/NousResearch/hermes-agent/pull/2997))
- Ensure first delta is fired during reasoning updates ([untagged commit](https://github.com/NousResearch/hermes-agent))
### Session & Memory
- **Session search recent sessions mode** — Omit query to browse recent sessions with titles, previews, and timestamps ([#2533](https://github.com/NousResearch/hermes-agent/pull/2533))
- **Session config surfacing** on `/new`, `/reset`, and auto-reset ([#3321](https://github.com/NousResearch/hermes-agent/pull/3321))
- **Third-party session isolation** — `--source` flag for isolating sessions by origin ([#3255](https://github.com/NousResearch/hermes-agent/pull/3255))
- Add `/resume` CLI handler, session log truncation guard, `reopen_session` API ([#3315](https://github.com/NousResearch/hermes-agent/pull/3315))
- Clear compressor summary and turn counter on `/clear` and `/new` ([#3102](https://github.com/NousResearch/hermes-agent/pull/3102))
- Surface silent SessionDB failures that cause session data loss ([#2999](https://github.com/NousResearch/hermes-agent/pull/2999))
- Session search fallback preview on summarization failure ([#3478](https://github.com/NousResearch/hermes-agent/pull/3478))
- Prevent stale memory overwrites by flush agent ([#2687](https://github.com/NousResearch/hermes-agent/pull/2687))
### Context Compression
- Replace dead `summary_target_tokens` with ratio-based scaling ([#2554](https://github.com/NousResearch/hermes-agent/pull/2554))
- Expose `compression.target_ratio`, `protect_last_n`, and `threshold` in `DEFAULT_CONFIG` ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Restore sane defaults and cap summary at 12K tokens ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Preserve transcript on `/compress` and hygiene compression ([#3556](https://github.com/NousResearch/hermes-agent/pull/3556))
- Update context pressure warnings and token estimates after compaction ([untagged commit](https://github.com/NousResearch/hermes-agent))
### Architecture & Dependencies
- **Remove mini-swe-agent dependency** — Inline Docker and Modal backends directly ([#2804](https://github.com/NousResearch/hermes-agent/pull/2804))
- **Replace swe-rex with native Modal SDK** for Modal backend ([#3538](https://github.com/NousResearch/hermes-agent/pull/3538))
- **Plugin lifecycle hooks** — `pre_llm_call`, `post_llm_call`, `on_session_start`, `on_session_end` now fire in the agent loop ([#3542](https://github.com/NousResearch/hermes-agent/pull/3542))
- Fix plugin toolsets invisible in `hermes tools` and standalone processes ([#3457](https://github.com/NousResearch/hermes-agent/pull/3457))
- Consolidate `get_hermes_home()` and `parse_reasoning_effort()` ([#3062](https://github.com/NousResearch/hermes-agent/pull/3062))
- Remove unused Hermes-native PKCE OAuth flow ([#3107](https://github.com/NousResearch/hermes-agent/pull/3107))
- Remove ~100 unused imports across 55 files ([#3016](https://github.com/NousResearch/hermes-agent/pull/3016))
- Fix 154 f-strings, simplify getattr/URL patterns, remove dead code ([#3119](https://github.com/NousResearch/hermes-agent/pull/3119))
---
## 📱 Messaging Platforms (Gateway)
### Telegram
- **Private Chat Topics** — Project-based conversations with functional skill binding per topic, enabling isolated workflows within a single Telegram chat ([#3163](https://github.com/NousResearch/hermes-agent/pull/3163))
- **Auto-discover fallback IPs via DNS-over-HTTPS** when `api.telegram.org` is unreachable ([#3376](https://github.com/NousResearch/hermes-agent/pull/3376))
- **Configurable reply threading mode** ([#2907](https://github.com/NousResearch/hermes-agent/pull/2907))
- Fall back to no `thread_id` on "Message thread not found" BadRequest ([#3390](https://github.com/NousResearch/hermes-agent/pull/3390))
- Self-reschedule reconnect when `start_polling` fails after 502 ([#3268](https://github.com/NousResearch/hermes-agent/pull/3268))
### Discord
- Stop phantom typing indicator after agent turn completes ([#3003](https://github.com/NousResearch/hermes-agent/pull/3003))
### Slack
- Send tool call progress messages to correct Slack thread ([#3063](https://github.com/NousResearch/hermes-agent/pull/3063))
- Scope progress thread fallback to Slack only ([#3488](https://github.com/NousResearch/hermes-agent/pull/3488))
### WhatsApp
- Download documents, audio, and video media from messages ([#2978](https://github.com/NousResearch/hermes-agent/pull/2978))
### Matrix
- Add missing Matrix entry in `PLATFORMS` dict ([#3473](https://github.com/NousResearch/hermes-agent/pull/3473))
- Harden e2ee access-token handling ([#3562](https://github.com/NousResearch/hermes-agent/pull/3562))
- Add backoff for `SyncError` in sync loop ([#3280](https://github.com/NousResearch/hermes-agent/pull/3280))
### Signal
- Track SSE keepalive comments as connection activity ([#3316](https://github.com/NousResearch/hermes-agent/pull/3316))
### Email
- Prevent unbounded growth of `_seen_uids` in EmailAdapter ([#3490](https://github.com/NousResearch/hermes-agent/pull/3490))
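One common way to keep a "seen IDs" set from growing forever is to cap it and evict the oldest entries. The sketch below shows that general technique; `BoundedSeenSet` and its default size are hypothetical and are not taken from the EmailAdapter code.

```python
from collections import OrderedDict


class BoundedSeenSet:
    """Remembers at most `max_size` IDs, evicting the oldest first."""

    def __init__(self, max_size: int = 1000):
        self._max_size = max_size
        self._items = OrderedDict()

    def add(self, uid: str) -> None:
        self._items[uid] = None
        # Re-adding an existing ID refreshes its position.
        self._items.move_to_end(uid)
        while len(self._items) > self._max_size:
            self._items.popitem(last=False)  # drop the oldest entry

    def __contains__(self, uid: str) -> bool:
        return uid in self._items

    def __len__(self) -> int:
        return len(self._items)
```

The trade-off is that a very old message could be treated as new again once its ID has been evicted, so the cap should comfortably exceed the number of messages seen between polls.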
### Gateway Core
- **Config-gated `/verbose` command** for messaging platforms — toggle tool output verbosity from chat ([#3262](https://github.com/NousResearch/hermes-agent/pull/3262))
- **Background review notifications** delivered to user chat ([#3293](https://github.com/NousResearch/hermes-agent/pull/3293))
- **Retry transient send failures** and notify user on exhaustion ([#3288](https://github.com/NousResearch/hermes-agent/pull/3288))
- Recover from hung agents — `/stop` hard-kills session lock ([#3104](https://github.com/NousResearch/hermes-agent/pull/3104))
- Thread-safe `SessionStore` — protect `_entries` with `threading.Lock` ([#3052](https://github.com/NousResearch/hermes-agent/pull/3052))
- Fix gateway token double-counting with cached agents — use absolute set instead of increment ([#3306](https://github.com/NousResearch/hermes-agent/pull/3306), [#3317](https://github.com/NousResearch/hermes-agent/pull/3317))
- Fingerprint full auth token in agent cache signature ([#3247](https://github.com/NousResearch/hermes-agent/pull/3247))
- Silence background agent terminal output ([#3297](https://github.com/NousResearch/hermes-agent/pull/3297))
- Include per-platform `ALLOW_ALL` and `SIGNAL_GROUP` in startup allowlist check ([#3313](https://github.com/NousResearch/hermes-agent/pull/3313))
- Include user-local bin paths in systemd unit PATH ([#3527](https://github.com/NousResearch/hermes-agent/pull/3527))
- Track background task references in `GatewayRunner` ([#3254](https://github.com/NousResearch/hermes-agent/pull/3254))
- Add request timeouts to HA, Email, Mattermost, SMS adapters ([#3258](https://github.com/NousResearch/hermes-agent/pull/3258))
- Add media download retry to Mattermost, Slack, and base cache ([#3323](https://github.com/NousResearch/hermes-agent/pull/3323))
- Detect virtualenv path instead of hardcoding `venv/` ([#2797](https://github.com/NousResearch/hermes-agent/pull/2797))
- Use `TERMINAL_CWD` for context file discovery, not process cwd ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Stop loading hermes repo AGENTS.md into gateway sessions (~10k wasted tokens) ([#2891](https://github.com/NousResearch/hermes-agent/pull/2891))
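The "absolute set instead of increment" wording in the token double-counting fix above suggests a pattern like the following. This is a guess at the shape of the bug, assuming a cached agent reports a cumulative token total each turn; `SessionUsage` and its methods are hypothetical.

```python
class SessionUsage:
    """Tracks a session's token total as reported by a long-lived (cached) agent."""

    def __init__(self) -> None:
        self.total_tokens = 0

    def record_incremental(self, agent_cumulative: int) -> None:
        # Double-counts: the agent's figure already includes earlier turns.
        self.total_tokens += agent_cumulative

    def record_absolute(self, agent_cumulative: int) -> None:
        # Correct when the reported figure is cumulative: just overwrite.
        self.total_tokens = agent_cumulative


def simulate(cumulative_reports: list) -> tuple:
    """Return (incremented_total, absolute_total) for a series of cumulative reports."""
    buggy, fixed = SessionUsage(), SessionUsage()
    for report in cumulative_reports:
        buggy.record_incremental(report)
        fixed.record_absolute(report)
    return buggy.total_tokens, fixed.total_tokens
```

For cumulative reports of 100, 250 and 400 tokens, incrementing yields 750 while the absolute set yields the true total of 400.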
---
## 🖥️ CLI & User Experience
### Interactive CLI
- **Configurable busy input mode** + fix `/queue` always working ([#3298](https://github.com/NousResearch/hermes-agent/pull/3298))
- **Preserve user input on multiline paste** ([#3065](https://github.com/NousResearch/hermes-agent/pull/3065))
- **Tool generation callback** — streaming "preparing terminal…" updates during tool argument generation ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Show tool progress for substantive tools, not just "preparing" ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Buffer reasoning preview chunks and fix duplicate display ([#3013](https://github.com/NousResearch/hermes-agent/pull/3013))
- Prevent reasoning box from rendering 3x during tool-calling loops ([#3405](https://github.com/NousResearch/hermes-agent/pull/3405))
- Eliminate "Event loop is closed" / "Press ENTER to continue" during idle — three-layer fix with `neuter_async_httpx_del()`, custom exception handler, and stale client cleanup ([#3398](https://github.com/NousResearch/hermes-agent/pull/3398))
- Fix status bar showing 26K instead of 260K for token counts with trailing zeros ([#3024](https://github.com/NousResearch/hermes-agent/pull/3024))
- Fix status bar duplicates and degrades during long sessions ([#3291](https://github.com/NousResearch/hermes-agent/pull/3291))
- Refresh TUI before background task output to prevent status bar overlap ([#3048](https://github.com/NousResearch/hermes-agent/pull/3048))
- Suppress KawaiiSpinner animation under `patch_stdout` ([#2994](https://github.com/NousResearch/hermes-agent/pull/2994))
- Skip KawaiiSpinner when TUI handles tool progress ([#2973](https://github.com/NousResearch/hermes-agent/pull/2973))
- Guard `isatty()` against closed streams via `_is_tty` property ([#3056](https://github.com/NousResearch/hermes-agent/pull/3056))
- Ensure single closure of streaming boxes during tool generation ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Cap context pressure percentage at 100% in display ([#3480](https://github.com/NousResearch/hermes-agent/pull/3480))
- Clean up HTML error messages in CLI display ([#3069](https://github.com/NousResearch/hermes-agent/pull/3069))
- Show HTTP status code and 400 body in API error output ([#3096](https://github.com/NousResearch/hermes-agent/pull/3096))
- Extract useful info from HTML error pages, dump debug on max retries ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Prevent TypeError on startup when `base_url` is None ([#3068](https://github.com/NousResearch/hermes-agent/pull/3068))
- Prevent update crash in non-TTY environments ([#3094](https://github.com/NousResearch/hermes-agent/pull/3094))
- Handle EOFError in sessions delete/prune confirmation prompts ([#3101](https://github.com/NousResearch/hermes-agent/pull/3101))
- Catch KeyboardInterrupt during `flush_memories` on exit and in exit cleanup handlers ([#3025](https://github.com/NousResearch/hermes-agent/pull/3025), [#3257](https://github.com/NousResearch/hermes-agent/pull/3257))
- Guard `.strip()` against None values from YAML config ([#3552](https://github.com/NousResearch/hermes-agent/pull/3552))
- Guard `config.get()` against YAML null values to prevent AttributeError ([#3377](https://github.com/NousResearch/hermes-agent/pull/3377))
- Store asyncio task references to prevent GC mid-execution ([#3267](https://github.com/NousResearch/hermes-agent/pull/3267))
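The two YAML-null guards in the list above address a general Python pitfall: `dict.get(key, default)` only falls back to the default when the key is missing, not when it is present with a null value. A minimal sketch, with hypothetical helper names and plain dicts standing in for parsed YAML:

```python
def get_str(config: dict, key: str, default: str = "") -> str:
    """Return a stripped string, treating a YAML null (None) like a missing key."""
    value = config.get(key, default)
    if value is None:
        # `key:` with no value in YAML parses to None, so .get()'s default is skipped.
        return default
    return str(value).strip()


def get_section(config: dict, key: str) -> dict:
    """Return a nested mapping, treating a YAML null section as empty."""
    return config.get(key) or {}


# What a YAML loader would produce for:
#   base_url:
#   display:
parsed = {"base_url": None, "display": None}
```

Without the guards, `parsed.get("base_url", "").strip()` and `parsed.get("display", {}).get("theme")` both raise `AttributeError` on `None`.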
### Setup & Configuration
- Use explicit key mapping for returning-user menu dispatch instead of positional index ([#3083](https://github.com/NousResearch/hermes-agent/pull/3083))
- Use `sys.executable` for pip in update commands to fix PEP 668 ([#3099](https://github.com/NousResearch/hermes-agent/pull/3099))
- Harden `hermes update` against diverged history, non-main branches, and gateway edge cases ([#3492](https://github.com/NousResearch/hermes-agent/pull/3492))
- OpenClaw migration overwrites defaults and setup wizard skips imported sections — fixed ([#3282](https://github.com/NousResearch/hermes-agent/pull/3282))
- Stop recursive AGENTS.md walk, load top-level only ([#3110](https://github.com/NousResearch/hermes-agent/pull/3110))
- Add macOS Homebrew paths to browser and terminal PATH resolution ([#2713](https://github.com/NousResearch/hermes-agent/pull/2713))
- YAML boolean handling for `tool_progress` config ([#3300](https://github.com/NousResearch/hermes-agent/pull/3300))
- Reset default SOUL.md to baseline identity text ([#3159](https://github.com/NousResearch/hermes-agent/pull/3159))
- Reject relative cwd paths for container terminal backends ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Add explicit `hermes-api-server` toolset for API server platform ([#3304](https://github.com/NousResearch/hermes-agent/pull/3304))
- Reorder setup wizard providers — OpenRouter first ([untagged commit](https://github.com/NousResearch/hermes-agent))
---
## 🔧 Tool System
### API Server
- **Idempotency-Key support**, body size limit, and OpenAI error envelope ([#2903](https://github.com/NousResearch/hermes-agent/pull/2903))
- Allow Idempotency-Key in CORS headers ([#3530](https://github.com/NousResearch/hermes-agent/pull/3530))
- Cancel orphaned agent + true interrupt on SSE disconnect ([#3427](https://github.com/NousResearch/hermes-agent/pull/3427))
- Fix streaming breaks when agent makes tool calls ([#2985](https://github.com/NousResearch/hermes-agent/pull/2985))
### Terminal & File Operations
- Handle addition-only hunks in V4A patch parser ([#3325](https://github.com/NousResearch/hermes-agent/pull/3325))
- Exponential backoff for persistent shell polling ([#2996](https://github.com/NousResearch/hermes-agent/pull/2996))
- Add timeout to subprocess calls in `context_references` ([#3469](https://github.com/NousResearch/hermes-agent/pull/3469))
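Exponential backoff for polling, as mentioned for the persistent shell, generally looks like the sketch below. The function name, parameters and default values are illustrative assumptions and are not taken from the Hermes terminal backend.

```python
import time


def poll_with_backoff(check, timeout=10.0, initial=0.01, factor=2.0,
                      max_delay=0.5, sleep=time.sleep, clock=time.monotonic):
    """Call `check()` until it returns a non-None value or `timeout` seconds pass.

    The wait between attempts starts at `initial` seconds and is multiplied by
    `factor` after each miss, capped at `max_delay`.
    """
    deadline = clock() + timeout
    delay = initial
    while True:
        result = check()
        if result is not None:
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            return None  # timed out
        sleep(min(delay, remaining))
        delay = min(delay * factor, max_delay)
```

Short initial delays keep fast commands responsive, while the cap bounds how long a finished slow command can go unnoticed.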
### Browser & Vision
- Handle 402 insufficient credits error in vision tool ([#2802](https://github.com/NousResearch/hermes-agent/pull/2802))
- Fix `browser_vision` ignoring `auxiliary.vision.timeout` config ([#2901](https://github.com/NousResearch/hermes-agent/pull/2901))
- Make browser command timeout configurable via config.yaml ([#2801](https://github.com/NousResearch/hermes-agent/pull/2801))
### MCP
- MCP toolset resolution for runtime and config ([#3252](https://github.com/NousResearch/hermes-agent/pull/3252))
- Add MCP tool name collision protection ([#3077](https://github.com/NousResearch/hermes-agent/pull/3077))
### Auxiliary LLM
- Guard aux LLM calls against None content + reasoning fallback + retry ([#3449](https://github.com/NousResearch/hermes-agent/pull/3449))
- Catch ImportError from `build_anthropic_client` in vision auto-detection ([#3312](https://github.com/NousResearch/hermes-agent/pull/3312))
### Other Tools
- Add request timeouts to `send_message_tool` HTTP calls ([#3162](https://github.com/NousResearch/hermes-agent/pull/3162)) by @memosr
- Auto-repair `jobs.json` with invalid control characters ([#3537](https://github.com/NousResearch/hermes-agent/pull/3537))
- Enable fine-grained tool streaming for Claude/OpenRouter ([#3497](https://github.com/NousResearch/hermes-agent/pull/3497))
---
## 🧩 Skills Ecosystem
### Skills System
- **Env var passthrough** for skills and user config — skills can declare environment variables to pass through ([#2807](https://github.com/NousResearch/hermes-agent/pull/2807))
- Cache skills prompt with shared `skill_utils` module for faster TTFT ([#3421](https://github.com/NousResearch/hermes-agent/pull/3421))
- Avoid redundant file re-read for skill conditions ([#2992](https://github.com/NousResearch/hermes-agent/pull/2992))
- Use Git Trees API to prevent silent subdirectory loss during install ([#2995](https://github.com/NousResearch/hermes-agent/pull/2995))
- Fix skills-sh install for deeply nested repo structures ([#2980](https://github.com/NousResearch/hermes-agent/pull/2980))
- Handle null metadata in skill frontmatter ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Preserve trust for skills-sh identifiers + reduce resolution churn ([#3251](https://github.com/NousResearch/hermes-agent/pull/3251))
- Agent-created skills were incorrectly treated as untrusted community content — fixed ([untagged commit](https://github.com/NousResearch/hermes-agent))
### New Skills
- **G0DM0D3 godmode jailbreaking skill** + docs ([#3157](https://github.com/NousResearch/hermes-agent/pull/3157))
- **Docker management skill** added to optional-skills ([#3060](https://github.com/NousResearch/hermes-agent/pull/3060))
- **OpenClaw migration v2** — 17 new modules, terminal recap for migrating from OpenClaw to Hermes ([#2906](https://github.com/NousResearch/hermes-agent/pull/2906))
---
## 🔒 Security & Reliability
### Security Hardening
- **SSRF protection** added to `browser_navigate` ([#3058](https://github.com/NousResearch/hermes-agent/pull/3058))
- **SSRF protection** added to `vision_tools` and `web_tools` (hardened) ([#2679](https://github.com/NousResearch/hermes-agent/pull/2679))
- **Restrict subagent toolsets** to parent's enabled set ([#3269](https://github.com/NousResearch/hermes-agent/pull/3269))
- **Prevent zip-slip path traversal** in self-update ([#3250](https://github.com/NousResearch/hermes-agent/pull/3250))
- **Prevent shell injection** in `_expand_path` via `~user` path suffix ([#2685](https://github.com/NousResearch/hermes-agent/pull/2685))
- **Normalize input** before dangerous command detection ([#3260](https://github.com/NousResearch/hermes-agent/pull/3260))
- Make tirith block verdicts approvable instead of hard-blocking ([#3428](https://github.com/NousResearch/hermes-agent/pull/3428))
- Remove compromised `litellm`/`typer`/`platformdirs` from deps ([#2796](https://github.com/NousResearch/hermes-agent/pull/2796))
- Pin all dependency version ranges ([#2810](https://github.com/NousResearch/hermes-agent/pull/2810))
- Regenerate `uv.lock` with hashes, use lockfile in setup ([#2812](https://github.com/NousResearch/hermes-agent/pull/2812))
- Bump dependencies to fix CVEs + regenerate `uv.lock` ([#3073](https://github.com/NousResearch/hermes-agent/pull/3073))
- Supply chain audit CI workflow for PR scanning ([#2816](https://github.com/NousResearch/hermes-agent/pull/2816))
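A zip-slip check of the kind listed above typically resolves each archive member against the destination and rejects anything that escapes it. The sketch below shows that general technique with the standard library (Python 3.9+ for `Path.is_relative_to`); `safe_extract` is a hypothetical name, not the function used in the self-update code.

```python
import zipfile
from pathlib import Path


def safe_extract(archive: zipfile.ZipFile, dest: Path) -> None:
    """Extract `archive` into `dest`, refusing members that would land outside it."""
    dest_root = dest.resolve()
    # Validate every member before writing anything to disk.
    for member in archive.infolist():
        target = (dest_root / member.filename).resolve()
        if not target.is_relative_to(dest_root):
            raise ValueError(f"unsafe path in archive: {member.filename!r}")
    archive.extractall(dest_root)
```

Checking all members up front means a malicious archive is rejected as a whole rather than partially extracted.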
### Reliability
- **SQLite WAL write-lock contention** causing 15-20s TUI freeze — fixed ([#3385](https://github.com/NousResearch/hermes-agent/pull/3385))
- **SQLite concurrency hardening** + session transcript integrity ([#3249](https://github.com/NousResearch/hermes-agent/pull/3249))
- Prevent recurring cron job re-fire on gateway crash/restart loop ([#3396](https://github.com/NousResearch/hermes-agent/pull/3396))
- Mark cron session as ended after job completes ([#2998](https://github.com/NousResearch/hermes-agent/pull/2998))
---
## ⚡ Performance
- **TTFT startup optimizations** — salvaged easy-win startup improvements ([#3395](https://github.com/NousResearch/hermes-agent/pull/3395))
- Cache skills prompt with shared `skill_utils` module ([#3421](https://github.com/NousResearch/hermes-agent/pull/3421))
- Avoid redundant file re-read for skill conditions in prompt builder ([#2992](https://github.com/NousResearch/hermes-agent/pull/2992))
---
## 🐛 Notable Bug Fixes
- Fix gateway token double-counting with cached agents ([#3306](https://github.com/NousResearch/hermes-agent/pull/3306), [#3317](https://github.com/NousResearch/hermes-agent/pull/3317))
- Fix "Event loop is closed" / "Press ENTER to continue" during idle sessions ([#3398](https://github.com/NousResearch/hermes-agent/pull/3398))
- Fix reasoning box rendering 3x during tool-calling loops ([#3405](https://github.com/NousResearch/hermes-agent/pull/3405))
- Fix status bar showing 26K instead of 260K for token counts ([#3024](https://github.com/NousResearch/hermes-agent/pull/3024))
- Fix `/queue` always working regardless of config ([#3298](https://github.com/NousResearch/hermes-agent/pull/3298))
- Fix phantom Discord typing indicator after agent turn ([#3003](https://github.com/NousResearch/hermes-agent/pull/3003))
- Fix Slack progress messages appearing in wrong thread ([#3063](https://github.com/NousResearch/hermes-agent/pull/3063))
- Fix WhatsApp media downloads (documents, audio, video) ([#2978](https://github.com/NousResearch/hermes-agent/pull/2978))
- Fix Telegram "Message thread not found" killing progress messages ([#3390](https://github.com/NousResearch/hermes-agent/pull/3390))
- Fix OpenClaw migration overwriting defaults ([#3282](https://github.com/NousResearch/hermes-agent/pull/3282))
- Fix returning-user setup menu dispatching wrong section ([#3083](https://github.com/NousResearch/hermes-agent/pull/3083))
- Fix `hermes update` PEP 668 "externally-managed-environment" error ([#3099](https://github.com/NousResearch/hermes-agent/pull/3099))
- Fix subagents hitting `max_iterations` prematurely via shared budget ([#3004](https://github.com/NousResearch/hermes-agent/pull/3004))
- Fix YAML boolean handling for `tool_progress` config ([#3300](https://github.com/NousResearch/hermes-agent/pull/3300))
- Fix `config.get()` crashes on YAML null values ([#3377](https://github.com/NousResearch/hermes-agent/pull/3377))
- Fix `.strip()` crash on None values from YAML config ([#3552](https://github.com/NousResearch/hermes-agent/pull/3552))
- Fix hung agents on gateway — `/stop` now hard-kills session lock ([#3104](https://github.com/NousResearch/hermes-agent/pull/3104))
- Fix `_custom` provider silently remapped to `openrouter` ([#2792](https://github.com/NousResearch/hermes-agent/pull/2792))
- Fix Matrix missing from `PLATFORMS` dict ([#3473](https://github.com/NousResearch/hermes-agent/pull/3473))
- Fix Email adapter unbounded `_seen_uids` growth ([#3490](https://github.com/NousResearch/hermes-agent/pull/3490))
---
## 🧪 Testing
- Pin `agent-client-protocol` < 0.9 to handle breaking upstream release ([#3320](https://github.com/NousResearch/hermes-agent/pull/3320))
- Catch anthropic ImportError in vision auto-detection tests ([#3312](https://github.com/NousResearch/hermes-agent/pull/3312))
- Update retry-exhaust test for new graceful return behavior ([#3320](https://github.com/NousResearch/hermes-agent/pull/3320))
- Add regression tests for null metadata frontmatter ([untagged commit](https://github.com/NousResearch/hermes-agent))
---
## 📚 Documentation
- Update all docs for `/model` command overhaul and custom provider support ([#2800](https://github.com/NousResearch/hermes-agent/pull/2800))
- Fix stale and incorrect documentation across 18 files ([#2805](https://github.com/NousResearch/hermes-agent/pull/2805))
- Document 9 previously undocumented features ([#2814](https://github.com/NousResearch/hermes-agent/pull/2814))
- Add missing skills, CLI commands, and messaging env vars to docs ([#2809](https://github.com/NousResearch/hermes-agent/pull/2809))
- Fix api-server response storage documentation — SQLite, not in-memory ([#2819](https://github.com/NousResearch/hermes-agent/pull/2819))
- Quote pip install extras to fix zsh glob errors ([#2815](https://github.com/NousResearch/hermes-agent/pull/2815))
- Unify hooks documentation — add plugin hooks to hooks page, add `session:end` event ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Clarify two-mode behavior in `session_search` schema description ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Fix Discord Public Bot setting for Discord-provided invite link ([#3519](https://github.com/NousResearch/hermes-agent/pull/3519)) by @mehmoodosman
- Revise v0.4.0 changelog — fix feature attribution, reorder sections ([untagged commit](https://github.com/NousResearch/hermes-agent))
---
## 👥 Contributors
### Core
- **@teknium1** — 157 PRs covering the full scope of this release
### Community Contributors
- **@alt-glitch** (Siddharth Balyan) — 2 PRs: Nix flake with uv2nix build, NixOS module, and persistent container mode ([#20](https://github.com/NousResearch/hermes-agent/pull/20)); auto-generated config keys and suffix PATHs for Nix builds ([#3061](https://github.com/NousResearch/hermes-agent/pull/3061), [#3274](https://github.com/NousResearch/hermes-agent/pull/3274))
- **@ctlst** — 1 PR: Prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode ([#2701](https://github.com/NousResearch/hermes-agent/pull/2701))
- **@memosr** (memosr.eth) — 1 PR: Add request timeouts to `send_message_tool` HTTP calls ([#3162](https://github.com/NousResearch/hermes-agent/pull/3162))
- **@mehmoodosman** (Osman Mehmood) — 1 PR: Fix Discord docs for Public Bot setting ([#3519](https://github.com/NousResearch/hermes-agent/pull/3519))
### All Contributors
@alt-glitch, @ctlst, @mehmoodosman, @memosr, @teknium1
---
**Full Changelog**: [v2026.3.23...v2026.3.28](https://github.com/NousResearch/hermes-agent/compare/v2026.3.23...v2026.3.28)
+249
@@ -0,0 +1,249 @@
# Hermes Agent v0.6.0 (v2026.3.30)
**Release Date:** March 30, 2026
> The multi-instance release — Profiles for running isolated agent instances, MCP server mode, Docker container, fallback provider chains, two new messaging platforms (Feishu/Lark and WeCom), Telegram webhook mode, Slack multi-workspace OAuth, 95 PRs and 16 resolved issues in 2 days.
---
## ✨ Highlights
- **Profiles — Multi-Instance Hermes** — Run multiple isolated Hermes instances from the same installation. Each profile gets its own config, memory, sessions, skills, and gateway service. Create with `hermes profile create`, switch with `hermes -p <name>`, export/import for sharing. Full token-lock isolation prevents two profiles from using the same bot credential. ([#3681](https://github.com/NousResearch/hermes-agent/pull/3681))
- **MCP Server Mode** — Expose Hermes conversations and sessions to any MCP-compatible client (Claude Desktop, Cursor, VS Code, etc.) via `hermes mcp serve`. Browse conversations, read messages, search across sessions, and manage attachments — all through the Model Context Protocol. Supports both stdio and Streamable HTTP transports. ([#3795](https://github.com/NousResearch/hermes-agent/pull/3795))
- **Docker Container** — Official Dockerfile for running Hermes Agent in a container. Supports both CLI and gateway modes with volume-mounted config. ([#3668](https://github.com/NousResearch/hermes-agent/pull/3668), closes [#850](https://github.com/NousResearch/hermes-agent/issues/850))
- **Ordered Fallback Provider Chain** — Configure multiple inference providers with automatic failover. When your primary provider returns errors or is unreachable, Hermes automatically tries the next provider in the chain. Configure via `fallback_providers` in config.yaml. ([#3813](https://github.com/NousResearch/hermes-agent/pull/3813), closes [#1734](https://github.com/NousResearch/hermes-agent/issues/1734))
- **Feishu/Lark Platform Support** — Full gateway adapter for Feishu (飞书) and Lark with event subscriptions, message cards, group chat, image/file attachments, and interactive card callbacks. ([#3799](https://github.com/NousResearch/hermes-agent/pull/3799), [#3817](https://github.com/NousResearch/hermes-agent/pull/3817), closes [#1788](https://github.com/NousResearch/hermes-agent/issues/1788))
- **WeCom (Enterprise WeChat) Platform Support** — New gateway adapter for WeCom (企业微信) with text/image/voice messages, group chats, and callback verification. ([#3847](https://github.com/NousResearch/hermes-agent/pull/3847))
- **Slack Multi-Workspace OAuth** — Connect a single Hermes gateway to multiple Slack workspaces via OAuth token file. Each workspace gets its own bot token, resolved dynamically per incoming event. ([#3903](https://github.com/NousResearch/hermes-agent/pull/3903))
- **Telegram Webhook Mode & Group Controls** — Run the Telegram adapter in webhook mode as an alternative to polling — faster response times and better for production deployments behind a reverse proxy. New group mention gating controls when the bot responds: always, only when @mentioned, or via regex triggers. ([#3880](https://github.com/NousResearch/hermes-agent/pull/3880), [#3870](https://github.com/NousResearch/hermes-agent/pull/3870))
- **Exa Search Backend** — Add Exa as an alternative web search and content extraction backend alongside Firecrawl and DuckDuckGo. Set `EXA_API_KEY` and configure as preferred backend. ([#3648](https://github.com/NousResearch/hermes-agent/pull/3648))
- **Skills & Credentials on Remote Backends** — Mount skill directories and credential files into Modal and Docker containers, so remote terminal sessions have access to the same skills and secrets as local execution. ([#3890](https://github.com/NousResearch/hermes-agent/pull/3890), [#3671](https://github.com/NousResearch/hermes-agent/pull/3671), closes [#3665](https://github.com/NousResearch/hermes-agent/issues/3665), [#3433](https://github.com/NousResearch/hermes-agent/issues/3433))
---
## 🏗️ Core Agent & Architecture
### Provider & Model Support
- **Ordered fallback provider chain** — automatic failover across multiple configured providers ([#3813](https://github.com/NousResearch/hermes-agent/pull/3813))
- **Fix api_mode on provider switch** — switching providers via `hermes model` now correctly clears stale `api_mode` instead of hardcoding `chat_completions`, fixing 404s for providers with Anthropic-compatible endpoints ([#3726](https://github.com/NousResearch/hermes-agent/pull/3726), [#3857](https://github.com/NousResearch/hermes-agent/pull/3857), closes [#3685](https://github.com/NousResearch/hermes-agent/issues/3685))
- **Stop silent OpenRouter fallback** — when no provider is configured, Hermes now raises a clear error instead of silently routing to OpenRouter ([#3807](https://github.com/NousResearch/hermes-agent/pull/3807), [#3862](https://github.com/NousResearch/hermes-agent/pull/3862))
- **Gemini 3.1 preview models** — added to OpenRouter and Nous Portal catalogs ([#3803](https://github.com/NousResearch/hermes-agent/pull/3803), closes [#3753](https://github.com/NousResearch/hermes-agent/issues/3753))
- **Gemini direct API context length** — full context length resolution for direct Google AI endpoints ([#3876](https://github.com/NousResearch/hermes-agent/pull/3876))
- **gpt-5.4-mini** added to Codex fallback catalog ([#3855](https://github.com/NousResearch/hermes-agent/pull/3855))
- **Curated model lists preferred** over live API probe when the probe returns fewer models ([#3856](https://github.com/NousResearch/hermes-agent/pull/3856), [#3867](https://github.com/NousResearch/hermes-agent/pull/3867))
- **User-friendly 429 rate limit messages** with Retry-After countdown ([#3809](https://github.com/NousResearch/hermes-agent/pull/3809))
- **Auxiliary client placeholder key** for local servers without auth requirements ([#3842](https://github.com/NousResearch/hermes-agent/pull/3842))
- **INFO-level logging** for auxiliary provider resolution ([#3866](https://github.com/NousResearch/hermes-agent/pull/3866))
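
The ordered fallback chain described above can be pictured as a loop that tries each provider in turn and stops at the first success. This is a minimal sketch under stated assumptions, not Hermes's implementation: `call_with_fallback`, `AllProvidersFailed`, and the `(name, callable)` pair shape are hypothetical names chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed (hypothetical name)."""


def call_with_fallback(
    providers: Sequence[Tuple[str, Callable[[str], str]]],
    prompt: str,
) -> Tuple[str, str]:
    """Try each (name, call) pair in order and return (name, reply) from the first success."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # assumption: a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    # Every provider failed, so surface all collected errors at once.
    raise AllProvidersFailed("; ".join(errors))
```

Keeping the provider order explicit in a sequence means the primary provider is always tried first, and the combined error message shows why each link in the chain was skipped.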
### Agent Loop & Conversation
- **Subagent status reporting** — reports `completed` status when summary exists instead of generic failure ([#3829](https://github.com/NousResearch/hermes-agent/pull/3829))
- **Session log file updated during compression** — prevents stale file references after context compression ([#3835](https://github.com/NousResearch/hermes-agent/pull/3835))
- **Omit empty tools param** — sends no `tools` parameter when empty instead of `None`, fixing compatibility with strict providers ([#3820](https://github.com/NousResearch/hermes-agent/pull/3820))
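
The "omit empty tools param" fix above amounts to leaving the key out of the request entirely rather than sending an empty or null value. A minimal sketch, assuming a plain dict of request arguments; `build_request_kwargs` is a hypothetical helper, not a function from the Hermes codebase.

```python
from typing import Any, Dict, List, Optional


def build_request_kwargs(
    model: str,
    messages: List[Dict[str, Any]],
    tools: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Build chat-completion arguments, adding `tools` only when it is non-empty."""
    kwargs: Dict[str, Any] = {"model": model, "messages": messages}
    if tools:  # skips both None and an empty list
        kwargs["tools"] = tools
    return kwargs
```

Strict providers may reject `tools=None` or `tools=[]`, so omitting the key is the safer option.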
### Profiles & Multi-Instance
- **Profiles system** — `hermes profile create/list/switch/delete/export/import/rename`. Each profile gets isolated HERMES_HOME, gateway service, CLI wrapper. Token locks prevent credential collisions. Tab completion for profile names. ([#3681](https://github.com/NousResearch/hermes-agent/pull/3681))
- **Profile-aware display paths** — all user-facing `~/.hermes` paths replaced with `display_hermes_home()` to show the correct profile directory ([#3623](https://github.com/NousResearch/hermes-agent/pull/3623))
- **Lazy display_hermes_home imports** — prevents `ImportError` during `hermes update` when modules cache stale bytecode ([#3776](https://github.com/NousResearch/hermes-agent/pull/3776))
- **HERMES_HOME for protected paths** — `.env` write-deny path now respects HERMES_HOME instead of hardcoded `~/.hermes` ([#3840](https://github.com/NousResearch/hermes-agent/pull/3840))
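
The profile-aware path handling above comes down to resolving every state path from `HERMES_HOME` instead of a hardcoded `~/.hermes`. A minimal sketch of that idea: `hermes_home` and `protected_env_path` are hypothetical helper names, and the fallback to `~/.hermes` when the variable is unset is an assumption based on the bullets above.

```python
import os
from pathlib import Path


def hermes_home() -> Path:
    """Return the active profile directory: $HERMES_HOME if set, else ~/.hermes (assumed default)."""
    override = os.environ.get("HERMES_HOME", "").strip()
    return Path(override).expanduser() if override else Path.home() / ".hermes"


def protected_env_path() -> Path:
    """Path of the .env file that should be write-protected for the active profile."""
    return hermes_home() / ".env"
```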
---
## 📱 Messaging Platforms (Gateway)
### New Platforms
- **Feishu/Lark** — Full adapter with event subscriptions, message cards, group chat, image/file attachments, interactive card callbacks ([#3799](https://github.com/NousResearch/hermes-agent/pull/3799), [#3817](https://github.com/NousResearch/hermes-agent/pull/3817))
- **WeCom (Enterprise WeChat)** — Text/image/voice messages, group chats, callback verification ([#3847](https://github.com/NousResearch/hermes-agent/pull/3847))
### Telegram
- **Webhook mode** — run as webhook endpoint instead of polling for production deployments ([#3880](https://github.com/NousResearch/hermes-agent/pull/3880))
- **Group mention gating & regex triggers** — configurable bot response behavior in groups: always, @mention-only, or regex-matched ([#3870](https://github.com/NousResearch/hermes-agent/pull/3870))
- **Gracefully handle deleted reply targets** — no more crashes when the message being replied to was deleted ([#3858](https://github.com/NousResearch/hermes-agent/pull/3858), closes [#3229](https://github.com/NousResearch/hermes-agent/issues/3229))
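
The three group-response behaviors listed above (always, @mention-only, regex-matched) can be sketched as a single predicate. The mode strings, the function name `should_respond`, and its parameters are assumptions made for illustration; the actual configuration keys are not given in these notes.

```python
import re
from typing import Sequence


def should_respond(
    text: str,
    mode: str,
    bot_username: str,
    patterns: Sequence[str] = (),
) -> bool:
    """Decide whether the bot should answer a group message (hypothetical mode names)."""
    if mode == "always":
        return True
    if mode == "mention":
        # Simple case-insensitive substring check; a real adapter would likely
        # inspect the message entities supplied by the platform instead.
        return f"@{bot_username}".lower() in text.lower()
    if mode == "regex":
        return any(re.search(p, text, re.IGNORECASE) for p in patterns)
    raise ValueError(f"unknown mode: {mode!r}")
```

Raising on an unknown mode, rather than defaulting to silence, makes a misconfigured gateway fail loudly.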
### Discord
- **Message processing reactions** — adds a reaction emoji while processing and removes it when done, giving visual feedback in channels ([#3871](https://github.com/NousResearch/hermes-agent/pull/3871))
- **DISCORD_IGNORE_NO_MENTION** — skip messages that @mention other users/bots but not Hermes ([#3640](https://github.com/NousResearch/hermes-agent/pull/3640))
- **Clean up deferred "thinking..."** — properly removes the "thinking..." indicator after slash commands complete ([#3674](https://github.com/NousResearch/hermes-agent/pull/3674), closes [#3595](https://github.com/NousResearch/hermes-agent/issues/3595))
### Slack
- **Multi-workspace OAuth** — connect to multiple Slack workspaces from a single gateway via OAuth token file ([#3903](https://github.com/NousResearch/hermes-agent/pull/3903))
### WhatsApp
- **Persistent aiohttp session** — reuse HTTP sessions across requests instead of creating new ones per message ([#3818](https://github.com/NousResearch/hermes-agent/pull/3818))
- **LID↔phone alias resolution** — correctly match Linked ID and phone number formats in allowlists ([#3830](https://github.com/NousResearch/hermes-agent/pull/3830))
- **Skip reply prefix in bot mode** — cleaner message formatting when running as a WhatsApp bot ([#3931](https://github.com/NousResearch/hermes-agent/pull/3931))
### Matrix
- **Native voice messages via MSC3245** — send voice messages as proper Matrix voice events instead of file attachments ([#3877](https://github.com/NousResearch/hermes-agent/pull/3877))
### Mattermost
- **Configurable mention behavior** — respond to messages without requiring @mention ([#3664](https://github.com/NousResearch/hermes-agent/pull/3664))
### Signal
- **URL-encode phone numbers** and correct attachment RPC parameter — fixes delivery failures with certain phone number formats ([#3670](https://github.com/NousResearch/hermes-agent/pull/3670)) — @kshitijk4poor
### Email
- **Close SMTP/IMAP connections on failure** — prevents connection leaks during error scenarios ([#3804](https://github.com/NousResearch/hermes-agent/pull/3804))
### Gateway Core
- **Atomic config writes** — use atomic file writes for config.yaml to prevent data loss during crashes ([#3800](https://github.com/NousResearch/hermes-agent/pull/3800))
- **Home channel env overrides** — apply environment variable overrides for home channels consistently ([#3796](https://github.com/NousResearch/hermes-agent/pull/3796), [#3808](https://github.com/NousResearch/hermes-agent/pull/3808))
- **Replace print() with logger** — BasePlatformAdapter now uses proper logging instead of print statements ([#3669](https://github.com/NousResearch/hermes-agent/pull/3669))
- **Cron delivery labels** — resolve human-friendly delivery labels via channel directory ([#3860](https://github.com/NousResearch/hermes-agent/pull/3860), closes [#1945](https://github.com/NousResearch/hermes-agent/issues/1945))
- **Cron [SILENT] tightening** — prevent agents from prefixing reports with [SILENT] to suppress delivery ([#3901](https://github.com/NousResearch/hermes-agent/pull/3901))
- **Background task media delivery** and vision download timeout fixes ([#3919](https://github.com/NousResearch/hermes-agent/pull/3919))
- **Boot-md hook** — example built-in hook to run a BOOT.md file on gateway startup ([#3733](https://github.com/NousResearch/hermes-agent/pull/3733))
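
Atomic config writes, as mentioned above, are commonly done by writing to a temporary file in the same directory and then renaming it over the target. The following is a sketch of that general technique, not the code from the referenced PR; `atomic_write_text` is a hypothetical name.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, content: str) -> None:
    """Write content so readers see either the old file or the new one, never a partial file."""
    path = Path(path)
    # The temp file must live in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before the rename
        os.replace(tmp_name, path)  # atomic replacement of the target
    except BaseException:
        # Clean up the temp file if anything failed before the rename completed.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```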
---
## 🖥️ CLI & User Experience
### Interactive CLI
- **Configurable tool preview length** — show full file paths by default instead of truncating at 40 chars ([#3841](https://github.com/NousResearch/hermes-agent/pull/3841))
- **Tool token context display** — `hermes tools` checklist now shows estimated token cost per toolset ([#3805](https://github.com/NousResearch/hermes-agent/pull/3805))
- **/bg spinner TUI fix** — route background task spinner through the TUI widget to prevent status bar collision ([#3643](https://github.com/NousResearch/hermes-agent/pull/3643))
- **Prevent status bar wrapping** into duplicate rows ([#3883](https://github.com/NousResearch/hermes-agent/pull/3883)) — @kshitijk4poor
- **Handle closed stdout ValueError** in safe print paths — fixes crashes when stdout is closed during gateway thread shutdown ([#3843](https://github.com/NousResearch/hermes-agent/pull/3843), closes [#3534](https://github.com/NousResearch/hermes-agent/issues/3534))
- **Remove input() from /tools disable** — eliminates freeze in terminal when disabling tools ([#3918](https://github.com/NousResearch/hermes-agent/pull/3918))
- **TTY guard for interactive CLI commands** — prevent CPU spin when launched without a terminal ([#3933](https://github.com/NousResearch/hermes-agent/pull/3933))
- **Argparse entrypoint** — use argparse in the top-level launcher for cleaner error handling ([#3874](https://github.com/NousResearch/hermes-agent/pull/3874))
- **Lazy-initialized tools show yellow** in banner instead of red, reducing false alarms about "missing" tools ([#3822](https://github.com/NousResearch/hermes-agent/pull/3822))
- **Honcho tools shown in banner** when configured ([#3810](https://github.com/NousResearch/hermes-agent/pull/3810))
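
The closed-stdout fix above guards print paths against the `ValueError` that Python raises when writing to a closed stream. A minimal sketch of such a guard: `safe_print` here is an illustrative stand-in, and the decision to also swallow `OSError` is an assumption.

```python
def safe_print(*args, **kwargs) -> bool:
    """Print like print(), but return False instead of raising if the stream is closed."""
    try:
        print(*args, **kwargs)
        return True
    except (ValueError, OSError):
        # ValueError: "I/O operation on closed file"; OSError covers broken pipes (assumed).
        return False
```

Returning a boolean lets callers decide whether to fall back to a logger when the terminal has gone away.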
### Setup & Configuration
- **Auto-install matrix-nio** during `hermes setup` when Matrix is selected ([#3802](https://github.com/NousResearch/hermes-agent/pull/3802), [#3873](https://github.com/NousResearch/hermes-agent/pull/3873))
- **Session export stdout support** — export sessions to stdout with `-` for piping ([#3641](https://github.com/NousResearch/hermes-agent/pull/3641), closes [#3609](https://github.com/NousResearch/hermes-agent/issues/3609))
- **Configurable approval timeouts** — set how long dangerous command approval prompts wait before auto-denying ([#3886](https://github.com/NousResearch/hermes-agent/pull/3886), closes [#3765](https://github.com/NousResearch/hermes-agent/issues/3765))
- **Clear __pycache__ during update** — prevents stale bytecode ImportError after `hermes update` ([#3819](https://github.com/NousResearch/hermes-agent/pull/3819))
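
Clearing `__pycache__` during an update, as described above, can be sketched as a recursive directory sweep. `clear_pycache` is a hypothetical helper; the real update command may scope the cleanup differently.

```python
import shutil
from pathlib import Path


def clear_pycache(root: Path) -> int:
    """Delete every __pycache__ directory under root and return how many were removed."""
    removed = 0
    # Materialize the list first so we are not deleting while the generator is walking.
    for cache_dir in list(Path(root).rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir)
            removed += 1
    return removed
```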
---
## 🔧 Tool System
### MCP
- **MCP Server Mode** — `hermes mcp serve` exposes conversations, sessions, and attachments to MCP clients via stdio or Streamable HTTP ([#3795](https://github.com/NousResearch/hermes-agent/pull/3795))
- **Dynamic tool discovery** — respond to `notifications/tools/list_changed` events to pick up new tools from MCP servers without reconnecting ([#3812](https://github.com/NousResearch/hermes-agent/pull/3812))
- **Non-deprecated HTTP transport** — switched from `sse_client` to `streamable_http_client` ([#3646](https://github.com/NousResearch/hermes-agent/pull/3646))
### Web Tools
- **Exa search backend** — alternative to Firecrawl and DuckDuckGo for web search and extraction ([#3648](https://github.com/NousResearch/hermes-agent/pull/3648))
### Browser
- **Guard against None LLM responses** in browser snapshot and vision tools ([#3642](https://github.com/NousResearch/hermes-agent/pull/3642))
### Terminal & Remote Backends
- **Mount skill directories** into Modal and Docker containers ([#3890](https://github.com/NousResearch/hermes-agent/pull/3890))
- **Mount credential files** into remote backends with mtime+size caching ([#3671](https://github.com/NousResearch/hermes-agent/pull/3671))
- **Preserve partial output** when commands time out instead of losing everything ([#3868](https://github.com/NousResearch/hermes-agent/pull/3868))
- **Stop marking persisted env vars as missing** on remote backends ([#3650](https://github.com/NousResearch/hermes-agent/pull/3650))
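
The mtime+size caching mentioned for credential mounts can be illustrated with a small fingerprint cache that reports whether a file changed since it was last seen. This is a sketch of the general technique only; `FileChangeCache` and `needs_sync` are hypothetical names.

```python
import os
from typing import Dict, Tuple


class FileChangeCache:
    """Remember (mtime_ns, size) per path to skip re-uploading unchanged files."""

    def __init__(self) -> None:
        self._seen: Dict[str, Tuple[int, int]] = {}

    def needs_sync(self, path) -> bool:
        """Return True the first time a path is seen or when its fingerprint changed."""
        st = os.stat(path)
        fingerprint = (st.st_mtime_ns, st.st_size)
        key = os.fspath(path)
        if self._seen.get(key) == fingerprint:
            return False
        self._seen[key] = fingerprint
        return True
```

Comparing mtime and size is cheaper than hashing file contents, at the cost of missing edits that preserve both values.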
### Audio
- **.aac format support** in transcription tool ([#3865](https://github.com/NousResearch/hermes-agent/pull/3865), closes [#1963](https://github.com/NousResearch/hermes-agent/issues/1963))
- **Audio download retry** — retry logic for `cache_audio_from_url` matching the existing image download pattern ([#3401](https://github.com/NousResearch/hermes-agent/pull/3401)) — @binhnt92
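
The download retry above follows a familiar pattern: attempt the fetch a bounded number of times and back off between attempts. The sketch below is a generic version of that pattern, not the code in `cache_audio_from_url`; the name `retry_fetch`, the exponential delay, and the choice to retry only on `OSError` are assumptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_fetch(
    fetch: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() up to `attempts` times, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_exc: Optional[BaseException] = None
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError as exc:  # network errors such as ConnectionError subclass OSError
            last_exc = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    assert last_exc is not None
    raise last_exc
```

Injecting `sleep` keeps the helper testable without real waiting.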
### Vision
- **Reject non-image files** and enforce website-only policy for vision analysis ([#3845](https://github.com/NousResearch/hermes-agent/pull/3845))
### Tool Schema
- **Ensure name field** always present in tool definitions, fixing `KeyError: 'name'` crashes ([#3811](https://github.com/NousResearch/hermes-agent/pull/3811), closes [#3729](https://github.com/NousResearch/hermes-agent/issues/3729))
### ACP (Editor Integration)
- **Complete session management surface** for VS Code/Zed/JetBrains clients — proper task lifecycle, cancel support, session persistence ([#3675](https://github.com/NousResearch/hermes-agent/pull/3675))
---
## 🧩 Skills & Plugins
### Skills System
- **External skill directories** — configure additional skill directories via `skills.external_dirs` in config.yaml ([#3678](https://github.com/NousResearch/hermes-agent/pull/3678))
- **Category path traversal blocked** — prevents `../` attacks in skill category names ([#3844](https://github.com/NousResearch/hermes-agent/pull/3844))
- **parallel-cli moved to optional-skills** — reduces default skill footprint ([#3673](https://github.com/NousResearch/hermes-agent/pull/3673)) — @kshitijk4poor
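
Blocking `../` in skill category names, as noted above, is typically done by resolving the candidate path and checking that it still sits under the skills root. A minimal sketch of that check: `resolve_category_dir` is a hypothetical helper and the actual validation in Hermes may differ.

```python
from pathlib import Path


def resolve_category_dir(skills_root: Path, category: str) -> Path:
    """Return the category directory, rejecting names that escape skills_root."""
    root = Path(skills_root).resolve()
    candidate = (root / category).resolve()
    # After resolution, a safe candidate must have root as one of its ancestors.
    if root not in candidate.parents:
        raise ValueError(f"category escapes skills directory: {category!r}")
    return candidate
```

Resolving before comparing also catches absolute paths and mixed forms such as `a/../../outside`, which a plain string check for `..` at the start would miss.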
### New Skills
- **memento-flashcards** — spaced repetition flashcard system ([#3827](https://github.com/NousResearch/hermes-agent/pull/3827))
- **songwriting-and-ai-music** — songwriting craft and AI music generation prompts ([#3834](https://github.com/NousResearch/hermes-agent/pull/3834))
- **SiYuan Note** — integration with SiYuan note-taking app ([#3742](https://github.com/NousResearch/hermes-agent/pull/3742))
- **Scrapling** — web scraping skill using Scrapling library ([#3742](https://github.com/NousResearch/hermes-agent/pull/3742))
- **one-three-one-rule** — communication framework skill ([#3797](https://github.com/NousResearch/hermes-agent/pull/3797))
### Plugin System
- **Plugin enable/disable commands** — `hermes plugins enable/disable <name>` for managing plugin state without removing them ([#3747](https://github.com/NousResearch/hermes-agent/pull/3747))
- **Plugin message injection** — plugins can now inject messages into the conversation stream on behalf of the user via `ctx.inject_message()` ([#3778](https://github.com/NousResearch/hermes-agent/pull/3778)) — @winglian
- **Honcho self-hosted support** — allow local Honcho instances without requiring an API key ([#3644](https://github.com/NousResearch/hermes-agent/pull/3644))
---
## 🔒 Security & Reliability
### Security Hardening
- **Hardened dangerous command detection** — expanded pattern matching for risky shell commands and added file tool path guards for sensitive locations (`/etc/`, `/boot/`, docker.sock) ([#3872](https://github.com/NousResearch/hermes-agent/pull/3872))
- **Sensitive path write checks** in approval system — catch writes to system config files through file tools, not just terminal ([#3859](https://github.com/NousResearch/hermes-agent/pull/3859))
- **Secret redaction expansion** — now covers ElevenLabs, Tavily, and Exa API keys ([#3920](https://github.com/NousResearch/hermes-agent/pull/3920))
- **Vision file rejection** — reject non-image files passed to vision analysis to prevent information disclosure ([#3845](https://github.com/NousResearch/hermes-agent/pull/3845))
- **Category path traversal blocking** — prevent directory traversal in skill category names ([#3844](https://github.com/NousResearch/hermes-agent/pull/3844))
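
One simple way to think about secret redaction is value-based: replace any occurrence of a configured key's value before text is logged or shown. This sketch is a different, simpler technique than the pattern matching a real redactor would likely use; `redact_secrets` is a hypothetical helper, and apart from `EXA_API_KEY` the environment variable names in the test data are assumptions.

```python
from typing import Iterable, Mapping


def redact_secrets(text: str, env: Mapping[str, str], names: Iterable[str]) -> str:
    """Replace the values of the named environment variables wherever they appear in text."""
    for name in names:
        value = env.get(name, "")
        if len(value) >= 8:  # skip empty or very short values that would over-match
            text = text.replace(value, f"[REDACTED:{name}]")
    return text
```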
### Reliability
- **Atomic config.yaml writes** — prevent data loss during gateway crashes ([#3800](https://github.com/NousResearch/hermes-agent/pull/3800))
- **Clear __pycache__ on update** — prevent stale bytecode from causing ImportError after updates ([#3819](https://github.com/NousResearch/hermes-agent/pull/3819))
- **Lazy imports for update safety** — prevent ImportError chains during `hermes update` when modules reference new functions ([#3776](https://github.com/NousResearch/hermes-agent/pull/3776))
- **Restore terminalbench2 from patch corruption** — recovered file damaged by patch tool's secret redaction ([#3801](https://github.com/NousResearch/hermes-agent/pull/3801))
- **Terminal timeout preserves partial output** — no more lost command output on timeout ([#3868](https://github.com/NousResearch/hermes-agent/pull/3868))
---
## 🐛 Notable Bug Fixes
- **OpenClaw migration model config overwrite** — migration no longer overwrites model config dict with a string ([#3924](https://github.com/NousResearch/hermes-agent/pull/3924)) — @0xbyt4
- **OpenClaw migration expanded** — covers full data footprint including sessions, cron, memory ([#3869](https://github.com/NousResearch/hermes-agent/pull/3869))
- **Telegram deleted reply targets** — gracefully handle replies to deleted messages instead of crashing ([#3858](https://github.com/NousResearch/hermes-agent/pull/3858))
- **Discord "thinking..." persistence** — properly cleans up deferred response indicators ([#3674](https://github.com/NousResearch/hermes-agent/pull/3674))
- **WhatsApp LID↔phone aliases** — fixes allowlist matching failures with Linked ID format ([#3830](https://github.com/NousResearch/hermes-agent/pull/3830))
- **Signal URL-encoded phone numbers** — fixes delivery failures with certain formats ([#3670](https://github.com/NousResearch/hermes-agent/pull/3670))
- **Email connection leaks** — properly close SMTP/IMAP connections on error ([#3804](https://github.com/NousResearch/hermes-agent/pull/3804))
- **_safe_print ValueError** — no more gateway thread crashes on closed stdout ([#3843](https://github.com/NousResearch/hermes-agent/pull/3843))
- **Tool schema KeyError 'name'** — ensure name field always present in tool definitions ([#3811](https://github.com/NousResearch/hermes-agent/pull/3811))
- **api_mode stale on provider switch** — correctly clear when switching providers via `hermes model` ([#3857](https://github.com/NousResearch/hermes-agent/pull/3857))
---
## 🧪 Testing
- Resolved 10+ CI failures across hooks, tiktoken, plugins, and skill tests ([#3848](https://github.com/NousResearch/hermes-agent/pull/3848), [#3721](https://github.com/NousResearch/hermes-agent/pull/3721), [#3936](https://github.com/NousResearch/hermes-agent/pull/3936))
---
## 📚 Documentation
- **Comprehensive OpenClaw migration guide** — step-by-step guide for migrating from OpenClaw/Claw3D to Hermes Agent ([#3864](https://github.com/NousResearch/hermes-agent/pull/3864), [#3900](https://github.com/NousResearch/hermes-agent/pull/3900))
- **Credential file passthrough docs** — document how to forward credential files and env vars to remote backends ([#3677](https://github.com/NousResearch/hermes-agent/pull/3677))
- **DuckDuckGo requirements clarified** — note runtime dependency on duckduckgo-search package ([#3680](https://github.com/NousResearch/hermes-agent/pull/3680))
- **Skills catalog updated** — added red-teaming category and optional skills listing ([#3745](https://github.com/NousResearch/hermes-agent/pull/3745))
- **Feishu docs MDX fix** — escape angle-bracket URLs that break Docusaurus build ([#3902](https://github.com/NousResearch/hermes-agent/pull/3902))
---
## 👥 Contributors
### Core
- **@teknium1** — 90 PRs across all subsystems
### Community Contributors
- **@kshitijk4poor** — 3 PRs: Signal phone number fix ([#3670](https://github.com/NousResearch/hermes-agent/pull/3670)), parallel-cli to optional-skills ([#3673](https://github.com/NousResearch/hermes-agent/pull/3673)), status bar wrapping fix ([#3883](https://github.com/NousResearch/hermes-agent/pull/3883))
- **@winglian** — 1 PR: Plugin message injection interface ([#3778](https://github.com/NousResearch/hermes-agent/pull/3778))
- **@binhnt92** — 1 PR: Audio download retry logic ([#3401](https://github.com/NousResearch/hermes-agent/pull/3401))
- **@0xbyt4** — 1 PR: OpenClaw migration model config fix ([#3924](https://github.com/NousResearch/hermes-agent/pull/3924))
### Issues Resolved from Community
@Material-Scientist ([#850](https://github.com/NousResearch/hermes-agent/issues/850)), @hanxu98121 ([#1734](https://github.com/NousResearch/hermes-agent/issues/1734)), @penwyp ([#1788](https://github.com/NousResearch/hermes-agent/issues/1788)), @dan-and ([#1945](https://github.com/NousResearch/hermes-agent/issues/1945)), @AdrianScott ([#1963](https://github.com/NousResearch/hermes-agent/issues/1963)), @clawdbot47 ([#3229](https://github.com/NousResearch/hermes-agent/issues/3229)), @alanfwilliams ([#3404](https://github.com/NousResearch/hermes-agent/issues/3404)), @kentimsit ([#3433](https://github.com/NousResearch/hermes-agent/issues/3433)), @hayka-pacha ([#3534](https://github.com/NousResearch/hermes-agent/issues/3534)), @primmer ([#3595](https://github.com/NousResearch/hermes-agent/issues/3595)), @dagelf ([#3609](https://github.com/NousResearch/hermes-agent/issues/3609)), @HenkDz ([#3685](https://github.com/NousResearch/hermes-agent/issues/3685)), @tmdgusya ([#3729](https://github.com/NousResearch/hermes-agent/issues/3729)), @TypQxQ ([#3753](https://github.com/NousResearch/hermes-agent/issues/3753)), @acsezen ([#3765](https://github.com/NousResearch/hermes-agent/issues/3765))
---
**Full Changelog**: [v2026.3.28...v2026.3.30](https://github.com/NousResearch/hermes-agent/compare/v2026.3.28...v2026.3.30)
+1 -1
@@ -74,7 +74,7 @@ def main() -> None:
agent = HermesACPAgent()
try:
-        asyncio.run(acp.run_agent(agent))
+        asyncio.run(acp.run_agent(agent, use_unstable_protocol=True))
except KeyboardInterrupt:
logger.info("Shutting down (KeyboardInterrupt)")
except Exception:
+46 -3
@@ -25,6 +25,9 @@ from acp.schema import (
NewSessionResponse,
PromptResponse,
ResumeSessionResponse,
+    SetSessionConfigOptionResponse,
+    SetSessionModelResponse,
+    SetSessionModeResponse,
ResourceContentBlock,
SessionCapabilities,
SessionForkCapabilities,
@@ -94,11 +97,14 @@ class HermesACPAgent(acp.Agent):
async def initialize(
self,
-        protocol_version: int,
+        protocol_version: int | None = None,
client_capabilities: ClientCapabilities | None = None,
client_info: Implementation | None = None,
**kwargs: Any,
) -> InitializeResponse:
+        resolved_protocol_version = (
+            protocol_version if isinstance(protocol_version, int) else acp.PROTOCOL_VERSION
+        )
provider = detect_provider()
auth_methods = None
if provider:
@@ -111,7 +117,11 @@ class HermesACPAgent(acp.Agent):
]
client_name = client_info.name if client_info else "unknown"
-        logger.info("Initialize from %s (protocol v%s)", client_name, protocol_version)
+        logger.info(
+            "Initialize from %s (protocol v%s)",
+            client_name,
+            resolved_protocol_version,
+        )
return InitializeResponse(
protocol_version=acp.PROTOCOL_VERSION,
@@ -471,7 +481,7 @@ class HermesACPAgent(acp.Agent):
async def set_session_model(
self, model_id: str, session_id: str, **kwargs: Any
-    ):
+    ) -> SetSessionModelResponse | None:
"""Switch the model for a session (called by ACP protocol)."""
state = self.session_manager.get_session(session_id)
if state:
@@ -489,4 +499,37 @@ class HermesACPAgent(acp.Agent):
)
self.session_manager.save_session(session_id)
logger.info("Session %s: model switched to %s", session_id, model_id)
+            return SetSessionModelResponse()
+        logger.warning("Session %s: model switch requested for missing session", session_id)
+        return None
+
+    async def set_session_mode(
+        self, mode_id: str, session_id: str, **kwargs: Any
+    ) -> SetSessionModeResponse | None:
+        """Persist the editor-requested mode so ACP clients do not fail on mode switches."""
+        state = self.session_manager.get_session(session_id)
+        if state is None:
+            logger.warning("Session %s: mode switch requested for missing session", session_id)
+            return None
+        setattr(state, "mode", mode_id)
+        self.session_manager.save_session(session_id)
+        logger.info("Session %s: mode switched to %s", session_id, mode_id)
+        return SetSessionModeResponse()
+
+    async def set_config_option(
+        self, config_id: str, session_id: str, value: str, **kwargs: Any
+    ) -> SetSessionConfigOptionResponse | None:
+        """Accept ACP config option updates even when Hermes has no typed ACP config surface yet."""
+        state = self.session_manager.get_session(session_id)
+        if state is None:
+            logger.warning("Session %s: config update requested for missing session", session_id)
+            return None
+        options = getattr(state, "config_options", None)
+        if not isinstance(options, dict):
+            options = {}
+        options[str(config_id)] = value
+        setattr(state, "config_options", options)
+        self.session_manager.save_session(session_id)
+        logger.info("Session %s: config option %s updated", session_id, config_id)
+        return SetSessionConfigOptionResponse(config_options=[])
+114 -11
@@ -35,6 +35,54 @@ ADAPTIVE_EFFORT_MAP = {
"minimal": "low",
}
# ── Max output token limits per Anthropic model ───────────────────────
# Source: Anthropic docs + Cline model catalog. Anthropic's API requires
# max_tokens as a mandatory field. Previously we hardcoded 16384, which
# starves thinking-enabled models (thinking tokens count toward the limit).
_ANTHROPIC_OUTPUT_LIMITS = {
# Claude 4.6
"claude-opus-4-6": 128_000,
"claude-sonnet-4-6": 64_000,
# Claude 4.5
"claude-opus-4-5": 64_000,
"claude-sonnet-4-5": 64_000,
"claude-haiku-4-5": 64_000,
# Claude 4
"claude-opus-4": 32_000,
"claude-sonnet-4": 64_000,
# Claude 3.7
"claude-3-7-sonnet": 128_000,
# Claude 3.5
"claude-3-5-sonnet": 8_192,
"claude-3-5-haiku": 8_192,
# Claude 3
"claude-3-opus": 4_096,
"claude-3-sonnet": 4_096,
"claude-3-haiku": 4_096,
}
# For any model not in the table, assume the highest current limit.
# Future Anthropic models are unlikely to have *less* output capacity.
_ANTHROPIC_DEFAULT_OUTPUT_LIMIT = 128_000
def _get_anthropic_max_output(model: str) -> int:
"""Look up the max output token limit for an Anthropic model.
Uses substring matching against _ANTHROPIC_OUTPUT_LIMITS so date-stamped
model IDs (claude-sonnet-4-5-20250929) and variant suffixes (:1m, :fast)
resolve correctly. The longest matching key wins, so that e.g.
"claude-opus-4" does not shadow "claude-opus-4-6".
"""
m = model.lower()
best_key = ""
best_val = _ANTHROPIC_DEFAULT_OUTPUT_LIMIT
for key, val in _ANTHROPIC_OUTPUT_LIMITS.items():
if key in m and len(key) > len(best_key):
best_key = key
best_val = val
return best_val
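A minimal, standalone sketch of the longest-match lookup in `_get_anthropic_max_output`, using a trimmed copy of the table. The names `OUTPUT_LIMITS`, `DEFAULT_LIMIT` and `max_output` are illustrative, and the date-stamped IDs in the usage lines are example strings only.

```python
# Trimmed copy of the limits table; values taken from the diff above.
OUTPUT_LIMITS = {
    "claude-opus-4-6": 128_000,
    "claude-opus-4-5": 64_000,
    "claude-opus-4": 32_000,
    "claude-3-5-sonnet": 8_192,
}
DEFAULT_LIMIT = 128_000  # used when no key matches


def max_output(model: str) -> int:
    """Return the limit of the longest table key contained in the model ID."""
    m = model.lower()
    best_key, best_val = "", DEFAULT_LIMIT
    for key, val in OUTPUT_LIMITS.items():
        # A longer key is more specific, so it replaces a shorter match.
        if key in m and len(key) > len(best_key):
            best_key, best_val = key, val
    return best_val


print(max_output("claude-opus-4-5-20251101"))        # matches both opus-4 and opus-4-5
print(max_output("anthropic/claude-3-5-sonnet:fast"))
print(max_output("some-future-model"))               # falls back to the default
```

Because the match is by substring rather than prefix, provider-qualified IDs such as `anthropic/...` resolve as well.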
def _supports_adaptive_thinking(model: str) -> bool:
"""Return True for Claude 4.6 models that support adaptive thinking."""
@@ -59,6 +107,7 @@ _OAUTH_ONLY_BETAS = [
# The version must stay reasonably current — Anthropic rejects OAuth requests
# when the spoofed user-agent version is too far behind the actual release.
_CLAUDE_CODE_VERSION_FALLBACK = "2.1.74"
_claude_code_version_cache: Optional[str] = None
def _detect_claude_code_version() -> str:
@@ -86,26 +135,52 @@ def _detect_claude_code_version() -> str:
return _CLAUDE_CODE_VERSION_FALLBACK
_CLAUDE_CODE_VERSION = _detect_claude_code_version()
_CLAUDE_CODE_SYSTEM_PREFIX = "You are Claude Code, Anthropic's official CLI for Claude."
_MCP_TOOL_PREFIX = "mcp_"
def _get_claude_code_version() -> str:
"""Lazily detect the installed Claude Code version when OAuth headers need it."""
global _claude_code_version_cache
if _claude_code_version_cache is None:
_claude_code_version_cache = _detect_claude_code_version()
return _claude_code_version_cache
def _is_oauth_token(key: str) -> bool:
"""Check if the key is an OAuth/setup token (not a regular Console API key).
Regular API keys start with 'sk-ant-api' and use x-api-key. Other
'sk-ant-' keys (e.g. setup-tokens starting with 'sk-ant-oat') need Bearer auth.
Keys without the 'sk-ant-' prefix (e.g. Azure AI Foundry) should use x-api-key, not Bearer.
"""
if not key:
return False
# Regular Console API keys use x-api-key header
if key.startswith("sk-ant-api"):
return False
# Everything else (setup-tokens, managed keys, JWTs) uses Bearer auth
# Azure AI Foundry keys don't start with sk-ant- at all — treat as regular API key
if not key.startswith("sk-ant-"):
return False
# Everything else (setup-tokens sk-ant-oat, managed keys, JWTs) uses Bearer auth
return True
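The resulting key classification can be sketched as a standalone function. The key strings in the usage lines are placeholders, not real credentials, and `is_oauth_token` is an illustrative copy rather than the module's function.

```python
def is_oauth_token(key: str) -> bool:
    """Classify a key by prefix, following the three checks in the diff."""
    if not key:
        return False
    if key.startswith("sk-ant-api"):
        return False  # regular Console API key -> x-api-key
    if not key.startswith("sk-ant-"):
        return False  # third-party provider key -> x-api-key
    return True       # remaining sk-ant-* keys (e.g. sk-ant-oat...) -> Bearer


for sample in ("sk-ant-api03-placeholder", "sk-ant-oat01-placeholder", "azure-placeholder", ""):
    print(repr(sample), is_oauth_token(sample))
```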
def _requires_bearer_auth(base_url: str | None) -> bool:
"""Return True for Anthropic-compatible providers that require Bearer auth.
Some third-party /anthropic endpoints implement Anthropic's Messages API but
require Authorization: Bearer instead of Anthropic's native x-api-key header.
MiniMax's global and China Anthropic-compatible endpoints follow this pattern.
"""
if not base_url:
return False
normalized = base_url.rstrip("/").lower()
return normalized.startswith("https://api.minimax.io/anthropic") or normalized.startswith(
"https://api.minimaxi.com/anthropic"
)
def build_anthropic_client(api_key: str, base_url: str = None):
"""Create an Anthropic client, auto-detecting setup-tokens vs API keys.
@@ -124,7 +199,17 @@ def build_anthropic_client(api_key: str, base_url: str = None):
if base_url:
kwargs["base_url"] = base_url
if _is_oauth_token(api_key):
if _requires_bearer_auth(base_url):
# Some Anthropic-compatible providers (e.g. MiniMax) expect the API key in
# Authorization: Bearer even for regular API keys. Route those endpoints
# through auth_token so the SDK sends Bearer auth instead of x-api-key.
# Check this before OAuth token shape detection because MiniMax secrets do
# not use Anthropic's sk-ant-api prefix and would otherwise be misread as
# Anthropic OAuth/setup tokens.
kwargs["auth_token"] = api_key
if _COMMON_BETAS:
kwargs["default_headers"] = {"anthropic-beta": ",".join(_COMMON_BETAS)}
elif _is_oauth_token(api_key):
# OAuth access token / setup-token → Bearer auth + Claude Code identity.
# Anthropic routes OAuth requests based on user-agent and headers;
# without Claude Code's fingerprint, requests get intermittent 500s.
@@ -132,7 +217,7 @@ def build_anthropic_client(api_key: str, base_url: str = None):
kwargs["auth_token"] = api_key
kwargs["default_headers"] = {
"anthropic-beta": ",".join(all_betas),
"user-agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
"user-agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
"x-app": "cli",
}
else:
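The order of the checks in `build_anthropic_client` matters: the endpoint check runs before the key-shape check. The sketch below shows that precedence in isolation. `pick_auth_mode` and its return labels are hypothetical names for illustration, and the real function builds SDK kwargs rather than returning a string.

```python
# Endpoint prefixes taken from _requires_bearer_auth in the diff.
BEARER_ONLY_PREFIXES = (
    "https://api.minimax.io/anthropic",
    "https://api.minimaxi.com/anthropic",
)


def requires_bearer_auth(base_url):
    """True when the normalized base URL starts with a Bearer-only endpoint."""
    if not base_url:
        return False
    return base_url.rstrip("/").lower().startswith(BEARER_ONLY_PREFIXES)


def pick_auth_mode(api_key, base_url=None):
    """Return a label for the auth style, checking the endpoint first."""
    if requires_bearer_auth(base_url):
        return "bearer"            # plain Bearer, no Claude Code identity headers
    if api_key.startswith("sk-ant-") and not api_key.startswith("sk-ant-api"):
        return "oauth-bearer"      # Bearer plus Claude Code identity headers
    return "x-api-key"


print(pick_auth_mode("placeholder", "https://api.minimax.io/anthropic/"))
print(pick_auth_mode("sk-ant-oat01-placeholder"))
print(pick_auth_mode("sk-ant-api03-placeholder"))
```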
@@ -241,7 +326,7 @@ def _refresh_oauth_token(creds: Dict[str, Any]) -> Optional[str]:
headers = {
"Content-Type": "application/json",
"User-Agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
"User-Agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
}
for endpoint in token_endpoints:
@@ -706,14 +791,21 @@ def convert_messages_to_anthropic(
result.append({"role": "user", "content": [tool_result]})
continue
# Regular user message
# Regular user message — validate non-empty content (Anthropic rejects empty)
if isinstance(content, list):
converted_blocks = _convert_content_to_anthropic(content)
result.append({
"role": "user",
"content": converted_blocks or [{"type": "text", "text": ""}],
})
# Check if all text blocks are empty
if not converted_blocks or all(
b.get("text", "").strip() == ""
for b in converted_blocks
if isinstance(b, dict) and b.get("type") == "text"
):
converted_blocks = [{"type": "text", "text": "(empty message)"}]
result.append({"role": "user", "content": converted_blocks})
else:
# Validate string content is non-empty
if not content or (isinstance(content, str) and not content.strip()):
content = "(empty message)"
result.append({"role": "user", "content": content})
# Strip orphaned tool_use blocks (no matching tool_result follows)
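A hedged sketch of the empty-content guard as a standalone helper (`normalize_user_content` is an illustrative name). One deliberate difference from the hunk above: `all()` over an empty iterable is True, so as written a list holding only non-text blocks (for example a single image) also appears to be replaced by the placeholder. The sketch leaves such lists unchanged, which is an assumption about the intended behavior.

```python
PLACEHOLDER = "(empty message)"


def normalize_user_content(content):
    """Replace empty user content with a placeholder; keep everything else."""
    if isinstance(content, list):
        text_blocks = [
            b for b in content if isinstance(b, dict) and b.get("type") == "text"
        ]
        has_non_text = len(text_blocks) < len(content)
        all_text_blank = all(not (b.get("text") or "").strip() for b in text_blocks)
        # Only substitute when there is nothing but blank text (or nothing at all).
        if not content or (all_text_blank and not has_non_text):
            return [{"type": "text", "text": PLACEHOLDER}]
        return content
    if not content or (isinstance(content, str) and not content.strip()):
        return PLACEHOLDER
    return content


print(normalize_user_content("   "))
print(normalize_user_content([{"type": "text", "text": ""}]))
print(normalize_user_content([{"type": "image", "source": {}}]))
```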
@@ -803,9 +895,15 @@ def build_anthropic_kwargs(
tool_choice: Optional[str] = None,
is_oauth: bool = False,
preserve_dots: bool = False,
context_length: Optional[int] = None,
) -> Dict[str, Any]:
"""Build kwargs for anthropic.messages.create().
When *max_tokens* is None, the model's native output limit is used
(e.g. 128K for Opus 4.6, 64K for Sonnet 4.6). If *context_length*
is provided, the effective limit is clamped so it doesn't exceed
the context window.
When *is_oauth* is True, applies Claude Code compatibility transforms:
system prompt prefix, tool name prefixing, and prompt sanitization.
@@ -816,7 +914,12 @@ def build_anthropic_kwargs(
anthropic_tools = convert_tools_to_anthropic(tools) if tools else []
model = normalize_model_name(model, preserve_dots=preserve_dots)
effective_max_tokens = max_tokens or 16384
effective_max_tokens = max_tokens or _get_anthropic_max_output(model)
# Clamp to context window if the user set a lower context_length
# (e.g. custom endpoint with limited capacity).
if context_length and effective_max_tokens > context_length:
effective_max_tokens = max(context_length - 1, 1)
# ── OAuth: Claude Code identity ──────────────────────────────────
if is_oauth:
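The `max_tokens` resolution and clamp can be checked in isolation. This is a sketch with the model limit passed in as a plain argument; `effective_max_tokens` as a function is illustrative and not part of the adapter.

```python
def effective_max_tokens(max_tokens, model_limit, context_length=None):
    """Pick the caller's value or the model limit, then clamp to the context window."""
    effective = max_tokens or model_limit
    if context_length and effective > context_length:
        # Leave at least one token of room for input, and never go below 1.
        effective = max(context_length - 1, 1)
    return effective


print(effective_max_tokens(None, 128_000))           # model limit used as-is
print(effective_max_tokens(None, 128_000, 32_768))   # clamped below the window
print(effective_max_tokens(4_096, 128_000, 32_768))  # explicit value already fits
```

Note that `max_tokens or model_limit` also treats an explicit `0` as "unset", which matches the expression in the diff.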
+195 -14
@@ -627,8 +627,6 @@ def _resolve_custom_runtime() -> Tuple[Optional[str], Optional[str]]:
custom_key = runtime.get("api_key")
if not isinstance(custom_base, str) or not custom_base.strip():
return None, None
if not isinstance(custom_key, str) or not custom_key.strip():
return None, None
custom_base = custom_base.strip().rstrip("/")
if "openrouter.ai" in custom_base.lower():
@@ -636,6 +634,13 @@ def _resolve_custom_runtime() -> Tuple[Optional[str], Optional[str]]:
# configured. Treat that as "no custom endpoint" for auxiliary routing.
return None, None
# Local servers (Ollama, llama.cpp, vLLM, LM Studio) don't require auth.
# Use a placeholder key — the OpenAI SDK requires a non-empty string but
# local servers ignore the Authorization header. Same fix as cli.py
# _ensure_runtime_credentials() (PR #2556).
if not isinstance(custom_key, str) or not custom_key.strip():
custom_key = "no-key-required"
return custom_base, custom_key.strip()
@@ -693,7 +698,13 @@ def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
is_oauth = _is_oauth_token(token)
model = _API_KEY_PROVIDER_AUX_MODELS.get("anthropic", "claude-haiku-4-5-20251001")
logger.debug("Auxiliary client: Anthropic native (%s) at %s (oauth=%s)", model, base_url, is_oauth)
real_client = build_anthropic_client(token, base_url)
try:
real_client = build_anthropic_client(token, base_url)
except ImportError:
# The anthropic_adapter module imports fine but the SDK itself is
# missing — build_anthropic_client raises ImportError at call time
# when _anthropic_sdk is None. Treat as unavailable.
return None, None
return AnthropicAuxiliaryClient(real_client, model, token, base_url, is_oauth=is_oauth), model
@@ -731,16 +742,37 @@ def _resolve_forced_provider(forced: str) -> Tuple[Optional[OpenAI], Optional[st
return None, None
_AUTO_PROVIDER_LABELS = {
"_try_openrouter": "openrouter",
"_try_nous": "nous",
"_try_custom_endpoint": "local/custom",
"_try_codex": "openai-codex",
"_resolve_api_key_provider": "api-key",
}
def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
"""Full auto-detection chain: OpenRouter → Nous → custom → Codex → API-key → None."""
global auxiliary_is_nous
auxiliary_is_nous = False # Reset — _try_nous() will set True if it wins
tried = []
for try_fn in (_try_openrouter, _try_nous, _try_custom_endpoint,
_try_codex, _resolve_api_key_provider):
fn_name = getattr(try_fn, "__name__", "unknown")
label = _AUTO_PROVIDER_LABELS.get(fn_name, fn_name)
client, model = try_fn()
if client is not None:
if tried:
logger.info("Auxiliary auto-detect: using %s (%s) — skipped: %s",
label, model or "default", ", ".join(tried))
else:
logger.info("Auxiliary auto-detect: using %s (%s)", label, model or "default")
return client, model
logger.debug("Auxiliary client: none available")
tried.append(label)
logger.warning("Auxiliary auto-detect: no provider available (tried: %s). "
"Compression, summarization, and memory flush will not work. "
"Set OPENROUTER_API_KEY or configure a local model in config.yaml.",
", ".join(tried))
return None, None
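The try-in-order pattern used by `_resolve_auto`, including the list of skipped providers that feeds the log message, can be sketched with plain callables. The provider functions below are stand-ins and `resolve_first` is a hypothetical name.

```python
def resolve_first(providers):
    """Return (client, model, skipped) for the first provider that yields a client.

    `providers` is a sequence of (label, callable) pairs; each callable returns
    a (client, model) tuple, with client None when the provider is unavailable.
    """
    skipped = []
    for label, try_fn in providers:
        client, model = try_fn()
        if client is not None:
            return client, model, skipped
        skipped.append(label)
    return None, None, skipped


chain = [
    ("openrouter", lambda: (None, None)),           # stand-in: not configured
    ("nous", lambda: (None, None)),                 # stand-in: not configured
    ("local/custom", lambda: ("client", "llama")),  # stand-in: available
    ("openai-codex", lambda: ("unused", "unused")),
]
client, model, skipped = resolve_first(chain)
print(client, model, skipped)
```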
@@ -891,11 +923,12 @@ def resolve_provider_client(
custom_key = (
(explicit_api_key or "").strip()
or os.getenv("OPENAI_API_KEY", "").strip()
or "no-key-required" # local servers don't need auth
)
if not custom_base or not custom_key:
if not custom_base:
logger.warning(
"resolve_provider_client: explicit custom endpoint requested "
"but no API key was found (set explicit_api_key or OPENAI_API_KEY)"
"but base_url is empty"
)
return None, None
final_model = model or _read_main_model() or "gpt-4o-mini"
@@ -1131,7 +1164,13 @@ def resolve_vision_provider_client(
return "custom", client, final_model
if requested == "auto":
for candidate in get_available_vision_backends():
ordered = list(_VISION_AUTO_PROVIDER_ORDER)
preferred = _preferred_main_vision_provider()
if preferred in ordered:
ordered.remove(preferred)
ordered.insert(0, preferred)
for candidate in ordered:
sync_client, default_model = _resolve_strict_vision_backend(candidate)
if sync_client is not None:
return _finalize(candidate, sync_client, default_model)
@@ -1204,6 +1243,39 @@ _client_cache: Dict[tuple, tuple] = {}
_client_cache_lock = threading.Lock()
def neuter_async_httpx_del() -> None:
"""Monkey-patch ``AsyncHttpxClientWrapper.__del__`` to be a no-op.
The OpenAI SDK's ``AsyncHttpxClientWrapper.__del__`` schedules
``self.aclose()`` via ``asyncio.get_running_loop().create_task()``.
When an ``AsyncOpenAI`` client is garbage-collected while
prompt_toolkit's event loop is running (the common CLI idle state),
the ``aclose()`` task runs on prompt_toolkit's loop but the
underlying TCP transport is bound to a *different* loop (the worker
thread's loop that the client was originally created on). If that
loop is closed or its thread is dead, the transport's
``self._loop.call_soon()`` raises ``RuntimeError("Event loop is
closed")``, which prompt_toolkit surfaces as "Unhandled exception
in event loop ... Press ENTER to continue...".
Neutering ``__del__`` is safe because:
- Cached clients are explicitly cleaned via ``_force_close_async_httpx``
on stale-loop detection and ``shutdown_cached_clients`` on exit.
- Uncached clients' TCP connections are cleaned up by the OS when the
process exits.
- The OpenAI SDK itself marks this as a TODO (``# TODO(someday):
support non asyncio runtimes here``).
Call this once at CLI startup, before any ``AsyncOpenAI`` clients are
created.
"""
try:
from openai._base_client import AsyncHttpxClientWrapper
AsyncHttpxClientWrapper.__del__ = lambda self: None # type: ignore[assignment]
except (ImportError, AttributeError):
pass # Graceful degradation if the SDK changes its internals
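The effect of replacing `__del__` on a class can be seen with a small stand-in. `NoisyWrapper` is hypothetical and has nothing to do with the OpenAI SDK. The sketch only shows that, once the class attribute is replaced, garbage collection no longer runs the original finalizer.

```python
import gc


class NoisyWrapper:
    """Hypothetical stand-in for a wrapper whose finalizer has side effects."""

    finalized = 0

    def __del__(self):
        NoisyWrapper.finalized += 1


def neuter_del(cls) -> None:
    """Replace the class finalizer with a no-op, as the diff does for the SDK wrapper."""
    cls.__del__ = lambda self: None


obj = NoisyWrapper()
del obj
gc.collect()
before = NoisyWrapper.finalized  # original finalizer ran once

neuter_del(NoisyWrapper)
obj = NoisyWrapper()
del obj
gc.collect()
after = NoisyWrapper.finalized   # count unchanged: the no-op ran instead

print(before, after)
```

The trade-off is the one the docstring names: with the finalizer disabled, cleanup has to happen explicitly or at process exit.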
def _force_close_async_httpx(client: Any) -> None:
"""Mark the httpx AsyncClient inside an AsyncOpenAI client as closed.
@@ -1251,6 +1323,25 @@ def shutdown_cached_clients() -> None:
_client_cache.clear()
def cleanup_stale_async_clients() -> None:
"""Force-close cached async clients whose event loop is closed.
Call this after each agent turn to proactively clean up stale clients
before GC can trigger ``AsyncHttpxClientWrapper.__del__`` on them.
This is defense-in-depth — the primary fix is ``neuter_async_httpx_del``
which disables ``__del__`` entirely.
"""
with _client_cache_lock:
stale_keys = []
for key, entry in _client_cache.items():
client, _default, cached_loop = entry
if cached_loop is not None and cached_loop.is_closed():
_force_close_async_httpx(client)
stale_keys.append(key)
for key in stale_keys:
del _client_cache[key]
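A reduced sketch of the stale-entry sweep, assuming the same `(client, default_model, loop)` cache entry shape as above. It only evicts entries and leaves out the force-close step, since `_force_close_async_httpx` is not shown in full here. `cleanup_stale` and the cache contents are illustrative.

```python
import asyncio
import threading


def cleanup_stale(cache: dict, lock) -> list:
    """Remove entries whose cached event loop is closed; return the removed keys."""
    removed = []
    with lock:
        # Collect first, delete afterwards, so the dict is not mutated mid-iteration.
        for key, (_client, _default_model, loop) in cache.items():
            if loop is not None and loop.is_closed():
                removed.append(key)
        for key in removed:
            del cache[key]
    return removed


live_loop = asyncio.new_event_loop()
dead_loop = asyncio.new_event_loop()
dead_loop.close()

cache = {
    "a": ("client-a", "model", live_loop),
    "b": ("client-b", "model", dead_loop),
    "c": ("client-c", "model", None),  # sync client: no loop recorded
}
removed = cleanup_stale(cache, threading.Lock())
live_loop.close()
print(removed, sorted(cache))
```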
def _get_cached_client(
provider: str,
model: str = None,
@@ -1394,6 +1485,29 @@ def _resolve_task_provider_model(
return "auto", resolved_model, None, None
_DEFAULT_AUX_TIMEOUT = 30.0
def _get_task_timeout(task: str, default: float = _DEFAULT_AUX_TIMEOUT) -> float:
"""Read timeout from auxiliary.{task}.timeout in config, falling back to *default*."""
if not task:
return default
try:
from hermes_cli.config import load_config
config = load_config()
except ImportError:
return default
aux = config.get("auxiliary", {}) if isinstance(config, dict) else {}
task_config = aux.get(task, {}) if isinstance(aux, dict) else {}
raw = task_config.get("timeout")
if raw is not None:
try:
return float(raw)
except (ValueError, TypeError):
pass
return default
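The timeout lookup is easy to exercise if the config is passed in as a dict instead of loaded from disk. This is a sketch: `task_timeout` is an illustrative name, and it adds an `isinstance` guard on the per-task entry that the function above does not have.

```python
def task_timeout(config, task, default=30.0):
    """Resolve auxiliary.<task>.timeout from a config dict, else return default."""
    if not task:
        return default
    aux = config.get("auxiliary", {}) if isinstance(config, dict) else {}
    task_config = aux.get(task, {}) if isinstance(aux, dict) else {}
    raw = task_config.get("timeout") if isinstance(task_config, dict) else None
    if raw is not None:
        try:
            return float(raw)
        except (ValueError, TypeError):
            pass  # unparsable value: fall through to the default
    return default


config = {"auxiliary": {"compression": {"timeout": "45"}, "vision": {"timeout": "soon"}}}
print(task_timeout(config, "compression"))  # string value parsed as float
print(task_timeout(config, "vision"))       # invalid value -> default
print(task_timeout(config, "summarize"))    # missing task -> default
```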
def _build_call_kwargs(
provider: str,
model: str,
@@ -1451,7 +1565,7 @@ def call_llm(
temperature: float = None,
max_tokens: int = None,
tools: list = None,
timeout: float = 30.0,
timeout: float = None,
extra_body: dict = None,
) -> Any:
"""Centralized synchronous LLM call.
@@ -1469,7 +1583,7 @@ def call_llm(
temperature: Sampling temperature (None = provider default).
max_tokens: Max output tokens (handles max_tokens vs max_completion_tokens).
tools: Tool definitions (for function calling).
timeout: Request timeout in seconds.
timeout: Request timeout in seconds (None = read from auxiliary.{task}.timeout config).
extra_body: Additional request body fields.
Returns:
@@ -1525,8 +1639,8 @@ def call_llm(
)
# For auto/custom, fall back to OpenRouter
if not resolved_base_url:
logger.warning("Provider %s unavailable, falling back to openrouter",
resolved_provider)
logger.info("Auxiliary %s: provider %s unavailable, falling back to openrouter",
task or "call", resolved_provider)
client, final_model = _get_cached_client(
"openrouter", resolved_model or _OPENROUTER_MODEL)
if client is None:
@@ -1534,10 +1648,19 @@ def call_llm(
f"No LLM provider configured for task={task} provider={resolved_provider}. "
f"Run: hermes setup")
effective_timeout = timeout if timeout is not None else _get_task_timeout(task)
# Log what we're about to do — makes auxiliary operations visible
_base_info = str(getattr(client, "base_url", resolved_base_url) or "")
if task:
logger.info("Auxiliary %s: using %s (%s)%s",
task, resolved_provider or "auto", final_model or "default",
f" at {_base_info}" if _base_info and "openrouter" not in _base_info else "")
kwargs = _build_call_kwargs(
resolved_provider, final_model, messages,
temperature=temperature, max_tokens=max_tokens,
tools=tools, timeout=timeout, extra_body=extra_body,
tools=tools, timeout=effective_timeout, extra_body=extra_body,
base_url=resolved_base_url)
# Handle max_tokens vs max_completion_tokens retry
@@ -1552,6 +1675,62 @@ def call_llm(
raise
def extract_content_or_reasoning(response) -> str:
"""Extract content from an LLM response, falling back to reasoning fields.
Mirrors the main agent loop's behavior when a reasoning model (DeepSeek-R1,
Qwen-QwQ, etc.) returns ``content=None`` with reasoning in structured fields.
Resolution order:
1. ``message.content`` — strip inline think/reasoning blocks, check for
remaining non-whitespace text.
2. ``message.reasoning`` / ``message.reasoning_content`` — direct
structured reasoning fields (DeepSeek, Moonshot, Novita, etc.).
3. ``message.reasoning_details`` — OpenRouter unified array format.
Returns the best available text, or ``""`` if nothing found.
"""
import re
msg = response.choices[0].message
content = (msg.content or "").strip()
if content:
# Strip inline think/reasoning blocks (mirrors _strip_think_blocks)
cleaned = re.sub(
r"<(?:think|thinking|reasoning|REASONING_SCRATCHPAD)>"
r".*?"
r"</(?:think|thinking|reasoning|REASONING_SCRATCHPAD)>",
"", content, flags=re.DOTALL | re.IGNORECASE,
).strip()
if cleaned:
return cleaned
# Content is empty or reasoning-only — try structured reasoning fields
reasoning_parts: list[str] = []
for field in ("reasoning", "reasoning_content"):
val = getattr(msg, field, None)
if val and isinstance(val, str) and val.strip() and val not in reasoning_parts:
reasoning_parts.append(val.strip())
details = getattr(msg, "reasoning_details", None)
if details and isinstance(details, list):
for detail in details:
if isinstance(detail, dict):
summary = (
detail.get("summary")
or detail.get("content")
or detail.get("text")
)
if summary and summary not in reasoning_parts:
reasoning_parts.append(summary.strip() if isinstance(summary, str) else str(summary))
if reasoning_parts:
return "\n\n".join(reasoning_parts)
return ""
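A shortened sketch of the first two resolution steps (content, then structured reasoning fields), using `SimpleNamespace` objects as stand-ins for SDK message objects. It omits `reasoning_details` and the `REASONING_SCRATCHPAD` tag, and it strips values before the duplicate check, which differs slightly from the function above. `content_or_reasoning` is an illustrative name.

```python
import re
from types import SimpleNamespace

# Inline think/reasoning blocks, matched non-greedily across newlines.
_THINK_RE = re.compile(
    r"<(?:think|thinking|reasoning)>.*?</(?:think|thinking|reasoning)>",
    flags=re.DOTALL | re.IGNORECASE,
)


def content_or_reasoning(msg) -> str:
    """Prefer visible content; fall back to structured reasoning fields."""
    content = (getattr(msg, "content", None) or "").strip()
    if content:
        cleaned = _THINK_RE.sub("", content).strip()
        if cleaned:
            return cleaned
    parts = []
    for field in ("reasoning", "reasoning_content"):
        val = getattr(msg, field, None)
        if isinstance(val, str) and val.strip() and val.strip() not in parts:
            parts.append(val.strip())
    return "\n\n".join(parts)


print(content_or_reasoning(SimpleNamespace(content="<think>plan</think>\nFinal answer")))
print(content_or_reasoning(SimpleNamespace(content="<THINK>only</THINK>", reasoning_content="fallback")))
```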
async def async_call_llm(
task: str = None,
*,
@@ -1563,7 +1742,7 @@ async def async_call_llm(
temperature: float = None,
max_tokens: int = None,
tools: list = None,
timeout: float = 30.0,
timeout: float = None,
extra_body: dict = None,
) -> Any:
"""Centralized asynchronous LLM call.
@@ -1624,10 +1803,12 @@ async def async_call_llm(
f"No LLM provider configured for task={task} provider={resolved_provider}. "
f"Run: hermes setup")
effective_timeout = timeout if timeout is not None else _get_task_timeout(task)
kwargs = _build_call_kwargs(
resolved_provider, final_model, messages,
temperature=temperature, max_tokens=max_tokens,
tools=tools, timeout=timeout, extra_body=extra_body,
tools=tools, timeout=effective_timeout, extra_body=extra_body,
base_url=resolved_base_url)
try:
+2 -2
@@ -141,7 +141,7 @@ class ContextCompressor:
"last_prompt_tokens": self.last_prompt_tokens,
"threshold_tokens": self.threshold_tokens,
"context_length": self.context_length,
"usage_percent": (self.last_prompt_tokens / self.context_length * 100) if self.context_length else 0,
"usage_percent": min(100, (self.last_prompt_tokens / self.context_length * 100)) if self.context_length else 0,
"compression_count": self.compression_count,
}
@@ -347,7 +347,7 @@ Write only the summary body. Do not include any preamble or prefix."""
"messages": [{"role": "user", "content": prompt}],
"temperature": 0.3,
"max_tokens": summary_budget * 2,
"timeout": 45.0,
# timeout resolved from auxiliary.compression.timeout config by call_llm
}
if self.summary_model:
call_kwargs["model"] = self.summary_model
+13 -6
@@ -286,12 +286,16 @@ def _expand_git_reference(
args: list[str],
label: str,
) -> tuple[str | None, str | None]:
result = subprocess.run(
["git", *args],
cwd=cwd,
capture_output=True,
text=True,
)
try:
result = subprocess.run(
["git", *args],
cwd=cwd,
capture_output=True,
text=True,
timeout=30,
)
except subprocess.TimeoutExpired:
return f"{ref.raw}: git command timed out (30s)", None
if result.returncode != 0:
stderr = (result.stderr or "").strip() or "git command failed"
return f"{ref.raw}: {stderr}", None
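The timeout handling follows the usual `subprocess.run` pattern: pass `timeout=` and catch `subprocess.TimeoutExpired`. A portable sketch is below. It runs the Python interpreter instead of git so it works without a repository, and `run_with_timeout` is an illustrative helper, not part of the module.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run a command and return (error, stdout); exactly one of the two is None."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running.
        return f"command timed out ({timeout}s)", None
    if result.returncode != 0:
        return (result.stderr or "").strip() or "command failed", None
    return None, result.stdout


ok_err, ok_out = run_with_timeout([sys.executable, "-c", "print('ok')"], 30)
slow_err, slow_out = run_with_timeout(
    [sys.executable, "-c", "import time; time.sleep(5)"], 0.5
)
print(ok_err, ok_out, slow_err, slow_out)
```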
@@ -449,9 +453,12 @@ def _rg_files(path: Path, cwd: Path, limit: int) -> list[Path] | None:
cwd=cwd,
capture_output=True,
text=True,
timeout=10,
)
except FileNotFoundError:
return None
except subprocess.TimeoutExpired:
return None
if result.returncode != 0:
return None
files = [Path(line.strip()) for line in result.stdout.splitlines() if line.strip()]
+53 -12
@@ -17,6 +17,23 @@ _RESET = "\033[0m"
logger = logging.getLogger(__name__)
# =========================================================================
# Configurable tool preview length (0 = no limit)
# Set once at startup by CLI or gateway from display.tool_preview_length config.
# =========================================================================
_tool_preview_max_len: int = 0 # 0 = unlimited
def set_tool_preview_max_len(n: int) -> None:
"""Set the global max length for tool call previews. 0 = no limit."""
global _tool_preview_max_len
_tool_preview_max_len = max(int(n), 0) if n else 0
def get_tool_preview_max_len() -> int:
"""Return the configured max preview length (0 = unlimited)."""
return _tool_preview_max_len
# =========================================================================
# Skin-aware helpers (lazy import to avoid circular deps)
@@ -94,8 +111,14 @@ def _oneline(text: str) -> str:
return " ".join(text.split())
def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str | None:
"""Build a short preview of a tool call's primary argument for display."""
def build_tool_preview(tool_name: str, args: dict, max_len: int | None = None) -> str | None:
"""Build a short preview of a tool call's primary argument for display.
*max_len* controls truncation. ``None`` (default) defers to the global
``_tool_preview_max_len`` set via config; ``0`` means unlimited.
"""
if max_len is None:
max_len = _tool_preview_max_len
if not args:
return None
primary_args = {
@@ -190,7 +213,7 @@ def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str | N
preview = _oneline(str(value))
if not preview:
return None
if len(preview) > max_len:
if max_len > 0 and len(preview) > max_len:
preview = preview[:max_len - 3] + "..."
return preview
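The truncation rule, with `0` meaning "no limit", reduces to a few lines. `truncate_preview` is an illustrative helper, not the module's function.

```python
def truncate_preview(preview: str, max_len: int) -> str:
    """Cut to max_len characters including a trailing ellipsis; 0 disables the cut."""
    if max_len > 0 and len(preview) > max_len:
        return preview[: max_len - 3] + "..."
    return preview


long_text = "a" * 50
print(len(truncate_preview(long_text, 40)))  # truncated result is exactly max_len long
print(len(truncate_preview(long_text, 0)))   # unlimited: unchanged
```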
@@ -231,7 +254,7 @@ class KawaiiSpinner:
"analyzing", "computing", "synthesizing", "formulating", "brainstorming",
]
def __init__(self, message: str = "", spinner_type: str = 'dots'):
def __init__(self, message: str = "", spinner_type: str = 'dots', print_fn=None):
self.message = message
self.spinner_frames = self.SPINNERS.get(spinner_type, self.SPINNERS['dots'])
self.running = False
@@ -239,12 +262,26 @@ class KawaiiSpinner:
self.frame_idx = 0
self.start_time = None
self.last_line_len = 0
# Optional callable to route all output through (e.g. a no-op for silent
# background agents). When set, bypasses self._out entirely so that
# agents with _print_fn overridden remain fully silent.
self._print_fn = print_fn
# Capture stdout NOW, before any redirect_stdout(devnull) from
# child agents can replace sys.stdout with a black hole.
self._out = sys.stdout
def _write(self, text: str, end: str = '\n', flush: bool = False):
"""Write to the stdout captured at spinner creation time."""
"""Write to the stdout captured at spinner creation time.
If a print_fn was supplied at construction, all output is routed through
it instead — allowing callers to silence the spinner with a no-op lambda.
"""
if self._print_fn is not None:
try:
self._print_fn(text)
except Exception:
pass
return
try:
self._out.write(text + end)
if flush:
@@ -270,11 +307,11 @@ class KawaiiSpinner:
The CLI already drives a TUI widget (_spinner_text) for spinner display,
so KawaiiSpinner's \\r-based animation is redundant under StdoutProxy.
"""
out = self._out
# StdoutProxy has a 'raw' attribute (bool) that plain file objects lack.
if hasattr(out, 'raw') and type(out).__name__ == 'StdoutProxy':
return True
return False
try:
from prompt_toolkit.patch_stdout import StdoutProxy
return isinstance(self._out, StdoutProxy)
except ImportError:
return False
def _animate(self):
# When stdout is not a real terminal (e.g. Docker, systemd, pipe),
@@ -470,10 +507,14 @@ def get_cute_tool_message(
def _trunc(s, n=40):
s = str(s)
if _tool_preview_max_len == 0:
return s # no limit
return (s[:n-3] + "...") if len(s) > n else s
def _path(p, n=35):
p = str(p)
if _tool_preview_max_len == 0:
return p # no limit
return ("..." + p[-(n-3):]) if len(p) > n else p
def _wrap(line: str) -> str:
@@ -685,7 +726,7 @@ def format_context_pressure(
threshold_percent: Compaction threshold as a fraction of context window.
compression_enabled: Whether auto-compression is active.
"""
pct_int = int(compaction_progress * 100)
pct_int = min(int(compaction_progress * 100), 100)
filled = min(int(compaction_progress * _BAR_WIDTH), _BAR_WIDTH)
bar = _BAR_FILLED * filled + _BAR_EMPTY * (_BAR_WIDTH - filled)
@@ -715,7 +756,7 @@ def format_context_pressure_gateway(
No ANSI — just Unicode and plain text suitable for Telegram/Discord/etc.
The percentage shows progress toward the compaction threshold.
"""
pct_int = int(compaction_progress * 100)
pct_int = min(int(compaction_progress * 100), 100)
filled = min(int(compaction_progress * _BAR_WIDTH), _BAR_WIDTH)
bar = _BAR_FILLED * filled + _BAR_EMPTY * (_BAR_WIDTH - filled)
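Both clamps can be seen together in a small renderer. The bar width and fill characters here are assumptions chosen for the sketch, and the module's `_BAR_WIDTH`, `_BAR_FILLED` and `_BAR_EMPTY` may differ.

```python
BAR_WIDTH = 20  # assumed width for the sketch


def render_pressure(progress: float) -> str:
    """Render a bar and percentage that both stop at 100% when progress exceeds 1.0."""
    pct = min(int(progress * 100), 100)
    filled = min(int(progress * BAR_WIDTH), BAR_WIDTH)
    return "#" * filled + "-" * (BAR_WIDTH - filled) + f" {pct}%"


print(render_pressure(0.5))
print(render_pressure(1.37))  # overshoot: previously the percentage could read above 100
```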
+10
@@ -113,6 +113,15 @@ DEFAULT_CONTEXT_LENGTHS = {
"glm": 202752,
# Kimi
"kimi": 262144,
# Hugging Face Inference Providers — model IDs use org/name format
"Qwen/Qwen3.5-397B-A17B": 131072,
"Qwen/Qwen3.5-35B-A3B": 131072,
"deepseek-ai/DeepSeek-V3.2": 65536,
"moonshotai/Kimi-K2.5": 262144,
"moonshotai/Kimi-K2-Thinking": 262144,
"MiniMaxAI/MiniMax-M2.5": 204800,
"XiaomiMiMo/MiMo-V2-Flash": 32768,
"zai-org/GLM-5": 202752,
}
_CONTEXT_LENGTH_KEYS = (
@@ -162,6 +171,7 @@ _URL_TO_PROVIDER: Dict[str, str] = {
"dashscope.aliyuncs.com": "alibaba",
"dashscope-intl.aliyuncs.com": "alibaba",
"openrouter.ai": "openrouter",
"generativelanguage.googleapis.com": "google",
"inference-api.nousresearch.com": "nous",
"api.deepseek.com": "deepseek",
"api.githubcopilot.com": "copilot",
+4 -4
@@ -15,6 +15,8 @@ import time
from pathlib import Path
from typing import Any, Dict, Optional
from utils import atomic_json_write
import requests
logger = logging.getLogger(__name__)
@@ -64,12 +66,10 @@ def _load_disk_cache() -> Dict[str, Any]:
def _save_disk_cache(data: Dict[str, Any]) -> None:
"""Save models.dev data to disk cache."""
"""Save models.dev data to disk cache atomically."""
try:
cache_path = _get_cache_path()
cache_path.parent.mkdir(parents=True, exist_ok=True)
with open(cache_path, "w", encoding="utf-8") as f:
json.dump(data, f, separators=(",", ":"))
atomic_json_write(cache_path, data, indent=None, separators=(",", ":"))
except Exception as e:
logger.debug("Failed to save models.dev disk cache: %s", e)
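The body of `utils.atomic_json_write` is not part of this diff. A common way to get an atomic JSON write is to write to a temporary file in the same directory and then `os.replace` it over the target. The sketch below shows that technique under that assumption. `atomic_json_write_sketch` is a hypothetical name and may not match the real helper's behavior or signature.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_json_write_sketch(path, data, **dump_kwargs) -> None:
    """Write JSON to a temp file next to `path`, then atomically swap it in."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, **dump_kwargs)
            f.flush()
            os.fsync(f.fileno())
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_name, path)
    except BaseException:
        try:
            os.unlink(tmp_name)
        except OSError:
            pass
        raise


with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "cache" / "models.json"
    atomic_json_write_sketch(target, {"a": 1}, separators=(",", ":"))
    written = target.read_text(encoding="utf-8")
    leftovers = [p.name for p in target.parent.iterdir() if p.name != "models.json"]
print(written, leftovers)
```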
+331 -108
@@ -4,14 +4,28 @@ All functions are stateless. AIAgent._build_system_prompt() calls these to
assemble pieces, then combines them with memory and ephemeral prompts.
"""
import json
import logging
import os
import re
import threading
from collections import OrderedDict
from pathlib import Path
from hermes_constants import get_hermes_home
from typing import Optional
from agent.skill_utils import (
extract_skill_conditions,
extract_skill_description,
get_all_skills_dirs,
get_disabled_skill_names,
iter_skill_index_files,
parse_frontmatter,
skill_matches_platform,
)
from utils import atomic_json_write
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
@@ -156,6 +170,25 @@ SKILLS_GUIDANCE = (
"Skills that aren't maintained become liabilities."
)
TOOL_USE_ENFORCEMENT_GUIDANCE = (
"# Tool-use enforcement\n"
"You MUST use your tools to take action — do not describe what you would do "
"or plan to do without actually doing it. When you say you will perform an "
"action (e.g. 'I will run the tests', 'Let me check the file', 'I will create "
"the project'), you MUST immediately make the corresponding tool call in the same "
"response. Never end your turn with a promise of future action — execute it now.\n"
"Keep working until the task is actually complete. Do not stop with a summary of "
"what you plan to do next time. If you have tools available that can accomplish "
"the task, use them instead of telling the user what you would do.\n"
"Every response should either (a) contain tool calls that make progress, or "
"(b) deliver a final result to the user. Responses that only describe intentions "
"without acting are not acceptable."
)
# Model name substrings that trigger tool-use enforcement guidance.
# Add new patterns here when a model family needs explicit steering.
TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex")
PLATFORM_HINTS = {
"whatsapp": (
"You are on a text messaging communication platform, WhatsApp. "
@@ -230,6 +263,111 @@ CONTEXT_TRUNCATE_HEAD_RATIO = 0.7
CONTEXT_TRUNCATE_TAIL_RATIO = 0.2
# =========================================================================
# Skills prompt cache
# =========================================================================
_SKILLS_PROMPT_CACHE_MAX = 8
_SKILLS_PROMPT_CACHE: OrderedDict[tuple, str] = OrderedDict()
_SKILLS_PROMPT_CACHE_LOCK = threading.Lock()
_SKILLS_SNAPSHOT_VERSION = 1
def _skills_prompt_snapshot_path() -> Path:
return get_hermes_home() / ".skills_prompt_snapshot.json"
def clear_skills_system_prompt_cache(*, clear_snapshot: bool = False) -> None:
"""Drop the in-process skills prompt cache (and optionally the disk snapshot)."""
with _SKILLS_PROMPT_CACHE_LOCK:
_SKILLS_PROMPT_CACHE.clear()
if clear_snapshot:
try:
_skills_prompt_snapshot_path().unlink(missing_ok=True)
except OSError as e:
logger.debug("Could not remove skills prompt snapshot: %s", e)
def _build_skills_manifest(skills_dir: Path) -> dict[str, list[int]]:
"""Build an mtime/size manifest of all SKILL.md and DESCRIPTION.md files."""
manifest: dict[str, list[int]] = {}
for filename in ("SKILL.md", "DESCRIPTION.md"):
for path in iter_skill_index_files(skills_dir, filename):
try:
st = path.stat()
except OSError:
continue
manifest[str(path.relative_to(skills_dir))] = [st.st_mtime_ns, st.st_size]
return manifest
def _load_skills_snapshot(skills_dir: Path) -> Optional[dict]:
"""Load the disk snapshot if it exists and its manifest still matches."""
snapshot_path = _skills_prompt_snapshot_path()
if not snapshot_path.exists():
return None
try:
snapshot = json.loads(snapshot_path.read_text(encoding="utf-8"))
except Exception:
return None
if not isinstance(snapshot, dict):
return None
if snapshot.get("version") != _SKILLS_SNAPSHOT_VERSION:
return None
if snapshot.get("manifest") != _build_skills_manifest(skills_dir):
return None
return snapshot
def _write_skills_snapshot(
skills_dir: Path,
manifest: dict[str, list[int]],
skill_entries: list[dict],
category_descriptions: dict[str, str],
) -> None:
"""Persist skill metadata to disk for fast cold-start reuse."""
payload = {
"version": _SKILLS_SNAPSHOT_VERSION,
"manifest": manifest,
"skills": skill_entries,
"category_descriptions": category_descriptions,
}
try:
atomic_json_write(_skills_prompt_snapshot_path(), payload)
except Exception as e:
logger.debug("Could not write skills prompt snapshot: %s", e)
def _build_snapshot_entry(
skill_file: Path,
skills_dir: Path,
frontmatter: dict,
description: str,
) -> dict:
"""Build a serialisable metadata dict for one skill."""
rel_path = skill_file.relative_to(skills_dir)
parts = rel_path.parts
if len(parts) >= 2:
skill_name = parts[-2]
category = "/".join(parts[:-2]) if len(parts) > 2 else parts[0]
else:
category = "general"
skill_name = skill_file.parent.name
platforms = frontmatter.get("platforms") or []
if isinstance(platforms, str):
platforms = [platforms]
return {
"skill_name": skill_name,
"category": category,
"frontmatter_name": str(frontmatter.get("name", skill_name)),
"description": description,
"platforms": [str(p).strip() for p in platforms if str(p).strip()],
"conditions": extract_skill_conditions(frontmatter),
}
# =========================================================================
# Skills index
# =========================================================================
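The disk snapshot above is trusted only while its stored manifest still equals a freshly built one. A minimal sketch of that check, assuming a plain `rglob` scan as a stand-in for `iter_skill_index_files` (the names `build_manifest` and `snapshot_is_fresh` are illustrative and do not appear in the diff):

```python
import json
import tempfile
from pathlib import Path


def build_manifest(root: Path, filename: str = "SKILL.md") -> dict:
    """Map each matching file's relative path to [mtime_ns, size]."""
    manifest = {}
    for path in sorted(root.rglob(filename)):
        try:
            st = path.stat()
        except OSError:
            continue  # file vanished mid-scan; leave it out of the manifest
        manifest[str(path.relative_to(root))] = [st.st_mtime_ns, st.st_size]
    return manifest


def snapshot_is_fresh(snapshot: dict, root: Path) -> bool:
    """A snapshot is reusable only if its manifest equals a fresh one."""
    return snapshot.get("manifest") == build_manifest(root)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    skill = root / "demo" / "SKILL.md"
    skill.parent.mkdir()
    skill.write_text("---\nname: demo\n---\n", encoding="utf-8")

    # Round-trip through JSON, as a snapshot stored on disk would be.
    snapshot = json.loads(json.dumps({"manifest": build_manifest(root)}))
    fresh_before_edit = snapshot_is_fresh(snapshot, root)

    # Changing the file changes its size, so the manifest no longer matches.
    skill.write_text("---\nname: demo\n---\nmore text\n", encoding="utf-8")
    fresh_after_edit = snapshot_is_fresh(snapshot, root)
```

Storing the pair as a list rather than a tuple keeps the comparison valid after a JSON round trip, which is presumably why the diff types the manifest as `dict[str, list[int]]`.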
@@ -241,22 +379,13 @@ def _parse_skill_file(skill_file: Path) -> tuple[bool, dict, str]:
(True, {}, "") to err on the side of showing the skill.
"""
try:
from tools.skills_tool import _parse_frontmatter, skill_matches_platform
raw = skill_file.read_text(encoding="utf-8")[:2000]
frontmatter, _ = _parse_frontmatter(raw)
frontmatter, _ = parse_frontmatter(raw)
if not skill_matches_platform(frontmatter):
return False, {}, ""
return False, frontmatter, ""
desc = ""
raw_desc = frontmatter.get("description", "")
if raw_desc:
desc = str(raw_desc).strip().strip("'\"")
if len(desc) > 60:
desc = desc[:57] + "..."
return True, frontmatter, desc
return True, frontmatter, extract_skill_description(frontmatter)
except Exception as e:
logger.debug("Failed to parse skill file %s: %s", skill_file, e)
return True, {}, ""
@@ -265,16 +394,9 @@ def _parse_skill_file(skill_file: Path) -> tuple[bool, dict, str]:
def _read_skill_conditions(skill_file: Path) -> dict:
"""Extract conditional activation fields from SKILL.md frontmatter."""
try:
from tools.skills_tool import _parse_frontmatter
raw = skill_file.read_text(encoding="utf-8")[:2000]
frontmatter, _ = _parse_frontmatter(raw)
hermes = frontmatter.get("metadata", {}).get("hermes", {})
return {
"fallback_for_toolsets": hermes.get("fallback_for_toolsets", []),
"requires_toolsets": hermes.get("requires_toolsets", []),
"fallback_for_tools": hermes.get("fallback_for_tools", []),
"requires_tools": hermes.get("requires_tools", []),
}
frontmatter, _ = parse_frontmatter(raw)
return extract_skill_conditions(frontmatter)
except Exception as e:
logger.debug("Failed to read skill conditions from %s: %s", skill_file, e)
return {}
@@ -317,109 +439,210 @@ def build_skills_system_prompt(
) -> str:
"""Build a compact skill index for the system prompt.
Scans ~/.hermes/skills/ for SKILL.md files grouped by category.
Includes per-skill descriptions from frontmatter so the model can
match skills by meaning, not just name.
Filters out skills incompatible with the current OS platform.
Two-layer cache:
1. In-process LRU dict keyed by (skills_dir, tools, toolsets)
2. Disk snapshot (``.skills_prompt_snapshot.json``) validated by
mtime/size manifest — survives process restarts
Falls back to a full filesystem scan when both layers miss.
External skill directories (``skills.external_dirs`` in config.yaml) are
scanned alongside the local ``~/.hermes/skills/`` directory. External dirs
are read-only — they appear in the index but new skills are always created
in the local dir. Local skills take precedence when names collide.
"""
hermes_home = get_hermes_home()
skills_dir = hermes_home / "skills"
external_dirs = get_all_skills_dirs()[1:] # skip local (index 0)
if not skills_dir.exists():
if not skills_dir.exists() and not external_dirs:
return ""
# Collect skills with descriptions, grouped by category.
# Each entry: (skill_name, description)
# Supports sub-categories: skills/mlops/training/axolotl/SKILL.md
# -> category "mlops/training", skill "axolotl"
# Load disabled skill names once for the entire scan
try:
from tools.skills_tool import _get_disabled_skill_names
disabled = _get_disabled_skill_names()
except Exception:
disabled = set()
# ── Layer 1: in-process LRU cache ─────────────────────────────────
cache_key = (
str(skills_dir.resolve()),
tuple(str(d) for d in external_dirs),
tuple(sorted(str(t) for t in (available_tools or set()))),
tuple(sorted(str(ts) for ts in (available_toolsets or set()))),
)
with _SKILLS_PROMPT_CACHE_LOCK:
cached = _SKILLS_PROMPT_CACHE.get(cache_key)
if cached is not None:
_SKILLS_PROMPT_CACHE.move_to_end(cache_key)
return cached
disabled = get_disabled_skill_names()
# ── Layer 2: disk snapshot ────────────────────────────────────────
snapshot = _load_skills_snapshot(skills_dir)
skills_by_category: dict[str, list[tuple[str, str]]] = {}
for skill_file in skills_dir.rglob("SKILL.md"):
is_compatible, frontmatter, desc = _parse_skill_file(skill_file)
if not is_compatible:
continue
rel_path = skill_file.relative_to(skills_dir)
parts = rel_path.parts
if len(parts) >= 2:
skill_name = parts[-2]
category = "/".join(parts[:-2]) if len(parts) > 2 else parts[0]
else:
category = "general"
skill_name = skill_file.parent.name
# Respect user's disabled skills config
fm_name = frontmatter.get("name", skill_name)
if fm_name in disabled or skill_name in disabled:
continue
# Extract conditions inline from already-parsed frontmatter
# (avoids redundant file re-read that _read_skill_conditions would do)
hermes_meta = (frontmatter.get("metadata") or {}).get("hermes") or {}
conditions = {
"fallback_for_toolsets": hermes_meta.get("fallback_for_toolsets", []),
"requires_toolsets": hermes_meta.get("requires_toolsets", []),
"fallback_for_tools": hermes_meta.get("fallback_for_tools", []),
"requires_tools": hermes_meta.get("requires_tools", []),
category_descriptions: dict[str, str] = {}
if snapshot is not None:
# Fast path: use pre-parsed metadata from disk
for entry in snapshot.get("skills", []):
if not isinstance(entry, dict):
continue
skill_name = entry.get("skill_name") or ""
category = entry.get("category") or "general"
frontmatter_name = entry.get("frontmatter_name") or skill_name
platforms = entry.get("platforms") or []
if not skill_matches_platform({"platforms": platforms}):
continue
if frontmatter_name in disabled or skill_name in disabled:
continue
if not _skill_should_show(
entry.get("conditions") or {},
available_tools,
available_toolsets,
):
continue
skills_by_category.setdefault(category, []).append(
(skill_name, entry.get("description", ""))
)
category_descriptions = {
str(k): str(v)
for k, v in (snapshot.get("category_descriptions") or {}).items()
}
if not _skill_should_show(conditions, available_tools, available_toolsets):
continue
skills_by_category.setdefault(category, []).append((skill_name, desc))
else:
# Cold path: full filesystem scan + write snapshot for next time
skill_entries: list[dict] = []
for skill_file in iter_skill_index_files(skills_dir, "SKILL.md"):
is_compatible, frontmatter, desc = _parse_skill_file(skill_file)
entry = _build_snapshot_entry(skill_file, skills_dir, frontmatter, desc)
skill_entries.append(entry)
if not is_compatible:
continue
skill_name = entry["skill_name"]
if entry["frontmatter_name"] in disabled or skill_name in disabled:
continue
if not _skill_should_show(
extract_skill_conditions(frontmatter),
available_tools,
available_toolsets,
):
continue
skills_by_category.setdefault(entry["category"], []).append(
(skill_name, entry["description"])
)
if not skills_by_category:
return ""
# Read category-level descriptions from DESCRIPTION.md
# Checks both the exact category path and parent directories
category_descriptions = {}
for category in skills_by_category:
cat_path = Path(category)
desc_file = skills_dir / cat_path / "DESCRIPTION.md"
if desc_file.exists():
# Read category-level DESCRIPTION.md files
for desc_file in iter_skill_index_files(skills_dir, "DESCRIPTION.md"):
try:
content = desc_file.read_text(encoding="utf-8")
match = re.search(r"^---\s*\n.*?description:\s*(.+?)\s*\n.*?^---", content, re.MULTILINE | re.DOTALL)
if match:
category_descriptions[category] = match.group(1).strip()
fm, _ = parse_frontmatter(content)
cat_desc = fm.get("description")
if not cat_desc:
continue
rel = desc_file.relative_to(skills_dir)
cat = "/".join(rel.parts[:-1]) if len(rel.parts) > 1 else "general"
category_descriptions[cat] = str(cat_desc).strip().strip("'\"")
except Exception as e:
logger.debug("Could not read skill description %s: %s", desc_file, e)
index_lines = []
for category in sorted(skills_by_category.keys()):
cat_desc = category_descriptions.get(category, "")
if cat_desc:
index_lines.append(f" {category}: {cat_desc}")
else:
index_lines.append(f" {category}:")
# Deduplicate and sort skills within each category
seen = set()
for name, desc in sorted(skills_by_category[category], key=lambda x: x[0]):
if name in seen:
continue
seen.add(name)
if desc:
index_lines.append(f" - {name}: {desc}")
else:
index_lines.append(f" - {name}")
_write_skills_snapshot(
skills_dir,
_build_skills_manifest(skills_dir),
skill_entries,
category_descriptions,
)
return (
"## Skills (mandatory)\n"
"Before replying, scan the skills below. If one clearly matches your task, "
"load it with skill_view(name) and follow its instructions. "
"If a skill has issues, fix it with skill_manage(action='patch').\n"
"After difficult/iterative tasks, offer to save as a skill. "
"If a skill you loaded was missing steps, had wrong commands, or needed "
"pitfalls you discovered, update it before finishing.\n"
"\n"
"<available_skills>\n"
+ "\n".join(index_lines) + "\n"
"</available_skills>\n"
"\n"
"If none match, proceed normally without loading a skill."
)
# ── External skill directories ─────────────────────────────────────
# Scan external dirs directly (no snapshot caching — they're read-only
# and typically small). Local skills already in skills_by_category take
# precedence: we track seen names and skip duplicates from external dirs.
seen_skill_names: set[str] = set()
for cat_skills in skills_by_category.values():
for name, _desc in cat_skills:
seen_skill_names.add(name)
for ext_dir in external_dirs:
if not ext_dir.exists():
continue
for skill_file in iter_skill_index_files(ext_dir, "SKILL.md"):
try:
is_compatible, frontmatter, desc = _parse_skill_file(skill_file)
if not is_compatible:
continue
entry = _build_snapshot_entry(skill_file, ext_dir, frontmatter, desc)
skill_name = entry["skill_name"]
if skill_name in seen_skill_names:
continue
if entry["frontmatter_name"] in disabled or skill_name in disabled:
continue
if not _skill_should_show(
extract_skill_conditions(frontmatter),
available_tools,
available_toolsets,
):
continue
seen_skill_names.add(skill_name)
skills_by_category.setdefault(entry["category"], []).append(
(skill_name, entry["description"])
)
except Exception as e:
logger.debug("Error reading external skill %s: %s", skill_file, e)
# External category descriptions
for desc_file in iter_skill_index_files(ext_dir, "DESCRIPTION.md"):
try:
content = desc_file.read_text(encoding="utf-8")
fm, _ = parse_frontmatter(content)
cat_desc = fm.get("description")
if not cat_desc:
continue
rel = desc_file.relative_to(ext_dir)
cat = "/".join(rel.parts[:-1]) if len(rel.parts) > 1 else "general"
category_descriptions.setdefault(cat, str(cat_desc).strip().strip("'\""))
except Exception as e:
logger.debug("Could not read external skill description %s: %s", desc_file, e)
if not skills_by_category:
result = ""
else:
index_lines = []
for category in sorted(skills_by_category.keys()):
cat_desc = category_descriptions.get(category, "")
if cat_desc:
index_lines.append(f" {category}: {cat_desc}")
else:
index_lines.append(f" {category}:")
# Deduplicate and sort skills within each category
seen = set()
for name, desc in sorted(skills_by_category[category], key=lambda x: x[0]):
if name in seen:
continue
seen.add(name)
if desc:
index_lines.append(f" - {name}: {desc}")
else:
index_lines.append(f" - {name}")
result = (
"## Skills (mandatory)\n"
"Before replying, scan the skills below. If one clearly matches your task, "
"load it with skill_view(name) and follow its instructions. "
"If a skill has issues, fix it with skill_manage(action='patch').\n"
"After difficult/iterative tasks, offer to save as a skill. "
"If a skill you loaded was missing steps, had wrong commands, or needed "
"pitfalls you discovered, update it before finishing.\n"
"\n"
"<available_skills>\n"
+ "\n".join(index_lines) + "\n"
"</available_skills>\n"
"\n"
"If none match, proceed normally without loading a skill."
)
# ── Store in LRU cache ────────────────────────────────────────────
with _SKILLS_PROMPT_CACHE_LOCK:
_SKILLS_PROMPT_CACHE[cache_key] = result
_SKILLS_PROMPT_CACHE.move_to_end(cache_key)
while len(_SKILLS_PROMPT_CACHE) > _SKILLS_PROMPT_CACHE_MAX:
_SKILLS_PROMPT_CACHE.popitem(last=False)
return result
# =========================================================================
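The in-process layer is a bounded LRU built on `OrderedDict` and guarded by a lock. A small sketch of the same pattern in isolation, with a deliberately tiny limit so eviction is visible (`cache_get`, `cache_put`, and the limit of 2 are assumptions for illustration; the diff uses a limit of 8 and inlines the logic):

```python
import threading
from collections import OrderedDict

_CACHE_MAX = 2  # small on purpose so eviction is easy to observe
_CACHE: OrderedDict = OrderedDict()
_CACHE_LOCK = threading.Lock()


def cache_get(key: tuple):
    """Return the cached value and mark the key as most recently used."""
    with _CACHE_LOCK:
        value = _CACHE.get(key)
        if value is not None:
            _CACHE.move_to_end(key)
        return value


def cache_put(key: tuple, value: str) -> None:
    """Store a value, then evict from the oldest end until within the limit."""
    with _CACHE_LOCK:
        _CACHE[key] = value
        _CACHE.move_to_end(key)
        while len(_CACHE) > _CACHE_MAX:
            _CACHE.popitem(last=False)


cache_put(("a",), "prompt-a")
cache_put(("b",), "prompt-b")
cache_get(("a",))              # touch "a" so "b" becomes the oldest entry
cache_put(("c",), "prompt-c")  # exceeds the limit and evicts "b"
remaining_keys = list(_CACHE)
```

Testing `is not None` rather than truthiness means an empty prompt string still counts as a cache hit, matching the `cached is not None` check in the diff.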
+3
@@ -37,6 +37,9 @@ _PREFIX_PATTERNS = [
r"dop_v1_[A-Za-z0-9]{10,}", # DigitalOcean PAT
r"doo_v1_[A-Za-z0-9]{10,}", # DigitalOcean OAuth
r"am_[A-Za-z0-9_-]{10,}", # AgentMail API key
r"sk_[A-Za-z0-9_]{10,}", # ElevenLabs TTS key (sk_ underscore, not sk- dash)
r"tvly-[A-Za-z0-9]{10,}", # Tavily search API key
r"exa_[A-Za-z0-9]{10,}", # Exa search API key
]
# ENV assignment patterns: KEY=value where KEY contains a secret-like name
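To see what the three new prefix patterns do and do not match, the sketch below copies them verbatim and applies them to a made-up string. How the real module combines and masks matches is not shown in this hunk, so `mask_secrets` and the `[REDACTED]` placeholder are assumptions:

```python
import re

# The three patterns added in this hunk, copied verbatim.
NEW_PREFIX_PATTERNS = [
    r"sk_[A-Za-z0-9_]{10,}",   # ElevenLabs TTS key
    r"tvly-[A-Za-z0-9]{10,}",  # Tavily search API key
    r"exa_[A-Za-z0-9]{10,}",   # Exa search API key
]
_COMBINED = re.compile("|".join(f"(?:{p})" for p in NEW_PREFIX_PATTERNS))


def mask_secrets(text: str) -> str:
    """Replace every match with a fixed placeholder (hypothetical helper)."""
    return _COMBINED.sub("[REDACTED]", text)


# Fake values for illustration only; none of these are real credentials.
sample = "tts=sk_abcDEF1234567 search=tvly-abcdef123456 other=sk-ant-api03-xyz"
masked = mask_secrets(sample)
```

The `sk_` pattern requires an underscore, so the dash-separated `sk-ant-` value passes through this particular set untouched.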
+45 -30
@@ -128,7 +128,11 @@ def _build_skill_message(
supporting.append(rel)
if supporting and skill_dir:
skill_view_target = str(skill_dir.relative_to(SKILLS_DIR))
try:
skill_view_target = str(skill_dir.relative_to(SKILLS_DIR))
except ValueError:
# Skill is from an external dir — use the skill name instead
skill_view_target = skill_dir.name
parts.append("")
parts.append("[This skill has supporting files you can load with the skill_view tool:]")
for sf in supporting:
@@ -158,38 +162,49 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
_skill_commands = {}
try:
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
if not SKILLS_DIR.exists():
return _skill_commands
from agent.skill_utils import get_external_skills_dirs
disabled = _get_disabled_skill_names()
for skill_md in SKILLS_DIR.rglob("SKILL.md"):
if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
continue
try:
content = skill_md.read_text(encoding='utf-8')
frontmatter, body = _parse_frontmatter(content)
# Skip skills incompatible with the current OS platform
if not skill_matches_platform(frontmatter):
seen_names: set = set()
# Scan local dir first, then external dirs
dirs_to_scan = []
if SKILLS_DIR.exists():
dirs_to_scan.append(SKILLS_DIR)
dirs_to_scan.extend(get_external_skills_dirs())
for scan_dir in dirs_to_scan:
for skill_md in scan_dir.rglob("SKILL.md"):
if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
continue
name = frontmatter.get('name', skill_md.parent.name)
# Respect user's disabled skills config
if name in disabled:
try:
content = skill_md.read_text(encoding='utf-8')
frontmatter, body = _parse_frontmatter(content)
# Skip skills incompatible with the current OS platform
if not skill_matches_platform(frontmatter):
continue
name = frontmatter.get('name', skill_md.parent.name)
if name in seen_names:
continue
# Respect user's disabled skills config
if name in disabled:
continue
description = frontmatter.get('description', '')
if not description:
for line in body.strip().split('\n'):
line = line.strip()
if line and not line.startswith('#'):
description = line[:80]
break
seen_names.add(name)
cmd_name = name.lower().replace(' ', '-').replace('_', '-')
_skill_commands[f"/{cmd_name}"] = {
"name": name,
"description": description or f"Invoke the {name} skill",
"skill_md_path": str(skill_md),
"skill_dir": str(skill_md.parent),
}
except Exception:
continue
description = frontmatter.get('description', '')
if not description:
for line in body.strip().split('\n'):
line = line.strip()
if line and not line.startswith('#'):
description = line[:80]
break
cmd_name = name.lower().replace(' ', '-').replace('_', '-')
_skill_commands[f"/{cmd_name}"] = {
"name": name,
"description": description or f"Invoke the {name} skill",
"skill_md_path": str(skill_md),
"skill_dir": str(skill_md.parent),
}
except Exception:
continue
except Exception:
pass
return _skill_commands
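Two details of the rewritten scan are easy to miss: command names are normalised to lowercase with hyphens, and the first occurrence of a skill name wins, which gives local skills precedence because the local directory is scanned first. A sketch under those assumptions (`skill_command_name` and `merge_skills` are illustrative names, and plain name lists stand in for the files on disk):

```python
def skill_command_name(name: str) -> str:
    """Same normalisation as the diff: lowercase, spaces and underscores to hyphens."""
    return "/" + name.lower().replace(" ", "-").replace("_", "-")


def merge_skills(local: list, external: list) -> dict:
    """Map command name to origin; the first directory to claim a name keeps it."""
    seen = set()
    commands = {}
    for origin, names in (("local", local), ("external", external)):
        for name in names:
            if name in seen:
                continue  # an earlier directory already provided this skill
            seen.add(name)
            commands[skill_command_name(name)] = origin
    return commands


commands = merge_skills(["Code Review", "deploy_app"], ["deploy_app", "Team Notes"])
```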
+270
@@ -0,0 +1,270 @@
"""Lightweight skill metadata utilities shared by prompt_builder and skills_tool.
This module intentionally avoids importing the tool registry, CLI config, or any
heavy dependency chain. It is safe to import at module level without triggering
tool registration or provider resolution.
"""
import logging
import os
import re
import sys
from pathlib import Path
from typing import Any, Dict, List, Optional, Set, Tuple
from hermes_constants import get_hermes_home
logger = logging.getLogger(__name__)
# ── Platform mapping ──────────────────────────────────────────────────────
PLATFORM_MAP = {
"macos": "darwin",
"linux": "linux",
"windows": "win32",
}
EXCLUDED_SKILL_DIRS = frozenset((".git", ".github", ".hub"))
# ── Lazy YAML loader ─────────────────────────────────────────────────────
_yaml_load_fn = None
def yaml_load(content: str):
"""Parse YAML with lazy import and CSafeLoader preference."""
global _yaml_load_fn
if _yaml_load_fn is None:
import yaml
loader = getattr(yaml, "CSafeLoader", None) or yaml.SafeLoader
def _load(value: str):
return yaml.load(value, Loader=loader)
_yaml_load_fn = _load
return _yaml_load_fn(content)
# ── Frontmatter parsing ──────────────────────────────────────────────────
def parse_frontmatter(content: str) -> Tuple[Dict[str, Any], str]:
"""Parse YAML frontmatter from a markdown string.
Uses yaml with CSafeLoader for full YAML support (nested metadata, lists)
with a fallback to simple key:value splitting for robustness.
Returns:
(frontmatter_dict, remaining_body)
"""
frontmatter: Dict[str, Any] = {}
body = content
if not content.startswith("---"):
return frontmatter, body
end_match = re.search(r"\n---\s*\n", content[3:])
if not end_match:
return frontmatter, body
yaml_content = content[3 : end_match.start() + 3]
body = content[end_match.end() + 3 :]
try:
parsed = yaml_load(yaml_content)
if isinstance(parsed, dict):
frontmatter = parsed
except Exception:
# Fallback: simple key:value parsing for malformed YAML
for line in yaml_content.strip().split("\n"):
if ":" not in line:
continue
key, value = line.split(":", 1)
frontmatter[key.strip()] = value.strip()
return frontmatter, body
# ── Platform matching ─────────────────────────────────────────────────────
def skill_matches_platform(frontmatter: Dict[str, Any]) -> bool:
"""Return True when the skill is compatible with the current OS.
Skills declare platform requirements via a top-level ``platforms`` list
in their YAML frontmatter::
platforms: [macos] # macOS only
platforms: [macos, linux] # macOS and Linux
If the field is absent or empty the skill is compatible with **all**
platforms (backward-compatible default).
"""
platforms = frontmatter.get("platforms")
if not platforms:
return True
if not isinstance(platforms, list):
platforms = [platforms]
current = sys.platform
for platform in platforms:
normalized = str(platform).lower().strip()
mapped = PLATFORM_MAP.get(normalized, normalized)
if current.startswith(mapped):
return True
return False
# ── Disabled skills ───────────────────────────────────────────────────────
def get_disabled_skill_names() -> Set[str]:
"""Read disabled skill names from config.yaml.
Resolves platform from ``HERMES_PLATFORM`` env var, falls back to
the global disabled list. Reads the config file directly (no CLI
config imports) to stay lightweight.
"""
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
return set()
try:
parsed = yaml_load(config_path.read_text(encoding="utf-8"))
except Exception as e:
logger.debug("Could not read skill config %s: %s", config_path, e)
return set()
if not isinstance(parsed, dict):
return set()
skills_cfg = parsed.get("skills")
if not isinstance(skills_cfg, dict):
return set()
resolved_platform = os.getenv("HERMES_PLATFORM")
if resolved_platform:
platform_disabled = (skills_cfg.get("platform_disabled") or {}).get(
resolved_platform
)
if platform_disabled is not None:
return _normalize_string_set(platform_disabled)
return _normalize_string_set(skills_cfg.get("disabled"))
def _normalize_string_set(values) -> Set[str]:
if values is None:
return set()
if isinstance(values, str):
values = [values]
return {str(v).strip() for v in values if str(v).strip()}
# ── External skills directories ──────────────────────────────────────────
def get_external_skills_dirs() -> List[Path]:
"""Read ``skills.external_dirs`` from config.yaml and return validated paths.
Each entry is expanded (``~`` and ``${VAR}``) and resolved to an absolute
path. Only directories that actually exist are returned. Duplicates and
paths that resolve to the local ``~/.hermes/skills/`` are silently skipped.
"""
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
return []
try:
parsed = yaml_load(config_path.read_text(encoding="utf-8"))
except Exception:
return []
if not isinstance(parsed, dict):
return []
skills_cfg = parsed.get("skills")
if not isinstance(skills_cfg, dict):
return []
raw_dirs = skills_cfg.get("external_dirs")
if not raw_dirs:
return []
if isinstance(raw_dirs, str):
raw_dirs = [raw_dirs]
if not isinstance(raw_dirs, list):
return []
local_skills = (get_hermes_home() / "skills").resolve()
seen: Set[Path] = set()
result: List[Path] = []
for entry in raw_dirs:
entry = str(entry).strip()
if not entry:
continue
# Expand ~ and environment variables
expanded = os.path.expanduser(os.path.expandvars(entry))
p = Path(expanded).resolve()
if p == local_skills:
continue
if p in seen:
continue
if p.is_dir():
seen.add(p)
result.append(p)
else:
logger.debug("External skills dir does not exist, skipping: %s", p)
return result
def get_all_skills_dirs() -> List[Path]:
"""Return all skill directories: local ``~/.hermes/skills/`` first, then external.
The local dir is always first (and always included even if it doesn't exist
yet — callers handle that). External dirs follow in config order.
"""
dirs = [get_hermes_home() / "skills"]
dirs.extend(get_external_skills_dirs())
return dirs
# ── Condition extraction ──────────────────────────────────────────────────
def extract_skill_conditions(frontmatter: Dict[str, Any]) -> Dict[str, List]:
"""Extract conditional activation fields from parsed frontmatter."""
hermes = (frontmatter.get("metadata") or {}).get("hermes") or {}
return {
"fallback_for_toolsets": hermes.get("fallback_for_toolsets", []),
"requires_toolsets": hermes.get("requires_toolsets", []),
"fallback_for_tools": hermes.get("fallback_for_tools", []),
"requires_tools": hermes.get("requires_tools", []),
}
# ── Description extraction ────────────────────────────────────────────────
def extract_skill_description(frontmatter: Dict[str, Any]) -> str:
"""Extract a truncated description from parsed frontmatter."""
raw_desc = frontmatter.get("description", "")
if not raw_desc:
return ""
desc = str(raw_desc).strip().strip("'\"")
if len(desc) > 60:
return desc[:57] + "..."
return desc
# ── File iteration ────────────────────────────────────────────────────────
def iter_skill_index_files(skills_dir: Path, filename: str):
"""Walk skills_dir yielding sorted paths matching *filename*.
Excludes ``.git``, ``.github``, ``.hub`` directories.
"""
matches = []
for root, dirs, files in os.walk(skills_dir):
dirs[:] = [d for d in dirs if d not in EXCLUDED_SKILL_DIRS]
if filename in files:
matches.append(Path(root) / filename)
for path in sorted(matches, key=lambda p: str(p.relative_to(skills_dir))):
yield path
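The offset arithmetic in `parse_frontmatter` and the 60-character cap in `extract_skill_description` can be checked without PyYAML. The sketch below keeps only the fallback `key: value` path, so nested metadata and lists are not supported here; `split_frontmatter` and `truncate_description` are illustrative names:

```python
import re


def split_frontmatter(content: str):
    """Split a markdown string into (frontmatter, body) using key:value lines only."""
    if not content.startswith("---"):
        return {}, content
    end_match = re.search(r"\n---\s*\n", content[3:])
    if not end_match:
        return {}, content
    # Match offsets are relative to content[3:], hence the +3 corrections.
    raw = content[3 : end_match.start() + 3]
    body = content[end_match.end() + 3 :]
    frontmatter = {}
    for line in raw.strip().split("\n"):
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        frontmatter[key.strip()] = value.strip()
    return frontmatter, body


def truncate_description(desc: str, limit: int = 60) -> str:
    """Cap a description at `limit` characters, ending in '...' when cut."""
    desc = desc.strip().strip("'\"")
    if len(desc) > limit:
        return desc[: limit - 3] + "..."
    return desc


doc = "---\nname: demo\ndescription: Short text\n---\n# Body\n"
frontmatter, body = split_frontmatter(doc)
long_desc = truncate_description("x" * 70)
```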
+1 -1
@@ -19,7 +19,7 @@ _TITLE_PROMPT = (
)
def generate_title(user_message: str, assistant_response: str, timeout: float = 15.0) -> Optional[str]:
def generate_title(user_message: str, assistant_response: str, timeout: float = 30.0) -> Optional[str]:
"""Generate a session title from the first exchange.
Uses the auxiliary LLM client (cheapest/fastest available model).
+42 -8
@@ -7,17 +7,33 @@
# =============================================================================
model:
# Default model to use (can be overridden with --model flag)
# Both "default" and "model" work as the key name here.
default: "anthropic/claude-opus-4.6"
# Inference provider selection:
# "auto" - Use Nous Portal if logged in, otherwise OpenRouter/env vars (default)
# "nous-api" - Use Nous Portal via API key (requires: NOUS_API_KEY)
# "openrouter" - Always use OpenRouter API key from OPENROUTER_API_KEY
# "nous" - Always use Nous Portal (requires: hermes login)
# "zai" - Use z.ai / ZhipuAI GLM models (requires: GLM_API_KEY)
# "kimi-coding"- Use Kimi / Moonshot AI models (requires: KIMI_API_KEY)
# "minimax" - Use MiniMax global endpoint (requires: MINIMAX_API_KEY)
# "minimax-cn" - Use MiniMax China endpoint (requires: MINIMAX_CN_API_KEY)
# "auto" - Auto-detect from credentials (default)
# "openrouter" - OpenRouter (requires: OPENROUTER_API_KEY or OPENAI_API_KEY)
# "nous" - Nous Portal OAuth (requires: hermes login)
# "nous-api" - Nous Portal API key (requires: NOUS_API_KEY)
# "anthropic" - Direct Anthropic API (requires: ANTHROPIC_API_KEY)
# "openai-codex" - OpenAI Codex (requires: hermes login --provider openai-codex)
# "copilot" - GitHub Copilot / GitHub Models (requires: GITHUB_TOKEN)
# "zai" - z.ai / ZhipuAI GLM (requires: GLM_API_KEY)
# "kimi-coding" - Kimi / Moonshot AI (requires: KIMI_API_KEY)
# "minimax" - MiniMax global (requires: MINIMAX_API_KEY)
# "minimax-cn" - MiniMax China (requires: MINIMAX_CN_API_KEY)
# "huggingface" - Hugging Face Inference (requires: HF_TOKEN)
# "kilocode" - KiloCode gateway (requires: KILOCODE_API_KEY)
# "ai-gateway" - Vercel AI Gateway (requires: AI_GATEWAY_API_KEY)
#
# Local servers (LM Studio, Ollama, vLLM, llama.cpp):
# "custom" - Any OpenAI-compatible endpoint. Set base_url below.
# Aliases: "lmstudio", "ollama", "vllm", "llamacpp" all map to "custom".
# Example for LM Studio:
# provider: "lmstudio"
# base_url: "http://localhost:1234/v1"
# No API key needed — local servers typically ignore auth.
#
# Can also be overridden with --provider flag or HERMES_INFERENCE_PROVIDER env var.
provider: "auto"
@@ -308,6 +324,9 @@ compression:
# vision:
# provider: "auto"
# model: "" # e.g. "google/gemini-2.5-flash", "openai/gpt-4o"
# timeout: 30 # LLM API call timeout (seconds)
# download_timeout: 30 # Image HTTP download timeout (seconds)
# # Increase for slow connections or self-hosted image servers
#
# # Web page scraping / summarization + browser page text extraction
# web_extract:
@@ -401,6 +420,15 @@ skills:
# Set to 0 to disable.
creation_nudge_interval: 15
# External skill directories — share skills across tools/agents without
# copying them into ~/.hermes/skills/. Each path is expanded (~ and ${VAR})
# and resolved to an absolute path. External dirs are read-only: skill
# creation always writes to ~/.hermes/skills/. Local skills take precedence
# when names collide.
# external_dirs:
# - ~/.agents/skills
# - /home/shared/team-skills
# =============================================================================
# Agent Behavior
# =============================================================================
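The `external_dirs` entries documented above are expanded, resolved, de-duplicated, and dropped when they are missing or point at the local skills directory. A sketch of that filtering, assuming the list has already been read from config (`resolve_external_dirs` and the `DEMO_SKILLS_ROOT` variable are hypothetical):

```python
import os
import tempfile
from pathlib import Path


def resolve_external_dirs(raw_dirs, local_skills: Path) -> list:
    """Expand and resolve entries; keep existing, unique, non-local directories."""
    if isinstance(raw_dirs, str):
        raw_dirs = [raw_dirs]
    local = local_skills.resolve()
    seen = set()
    result = []
    for entry in raw_dirs or []:
        entry = str(entry).strip()
        if not entry:
            continue
        # Expand ${VAR} / $VAR first, then a leading ~, then resolve symlinks.
        p = Path(os.path.expanduser(os.path.expandvars(entry))).resolve()
        if p == local or p in seen or not p.is_dir():
            continue
        seen.add(p)
        result.append(p)
    return result


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    (base / "local").mkdir()
    (base / "team-skills").mkdir()
    os.environ["DEMO_SKILLS_ROOT"] = str(base)
    resolved = resolve_external_dirs(
        [
            "${DEMO_SKILLS_ROOT}/team-skills",
            "$DEMO_SKILLS_ROOT/team-skills",  # duplicate after expansion
            "${DEMO_SKILLS_ROOT}/local",      # same as the local skills dir
            "${DEMO_SKILLS_ROOT}/missing",    # does not exist
        ],
        local_skills=base / "local",
    )
    expected = [base / "team-skills"]
```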
@@ -688,6 +716,12 @@ display:
# Toggle at runtime with /verbose in the CLI
tool_progress: all
# What Enter does when Hermes is already busy in the CLI.
# interrupt: Interrupt the current run and redirect Hermes (default)
# queue: Queue your message for the next turn
# Ctrl+C always interrupts regardless of this setting.
busy_input_mode: interrupt
# Background process notifications (gateway/messaging only).
# Controls how chatty the process watcher is when you use
# terminal(background=true, check_interval=...) from Telegram/Discord/etc.
+444 -106
@@ -70,7 +70,7 @@ _COMMAND_SPINNER_FRAMES = ("⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧
# Load .env from ~/.hermes/.env first, then project root as dev fallback.
# User-managed env files should override stale shell exports on restart.
from hermes_constants import get_hermes_home, OPENROUTER_BASE_URL
from hermes_constants import get_hermes_home, display_hermes_home, OPENROUTER_BASE_URL
from hermes_cli.env_loader import load_hermes_dotenv
_hermes_home = get_hermes_home()
@@ -205,6 +205,7 @@ def load_cli_config() -> Dict[str, Any]:
"resume_display": "full",
"show_reasoning": False,
"streaming": True,
"busy_input_mode": "interrupt",
"skin": "default",
},
@@ -448,6 +449,25 @@ try:
except Exception:
pass # Skin engine is optional — default skin used if unavailable
# Initialize tool preview length from config
try:
from agent.display import set_tool_preview_max_len
_tpl = CLI_CONFIG.get("display", {}).get("tool_preview_length", 0)
set_tool_preview_max_len(int(_tpl) if _tpl else 0)
except Exception:
pass
# Neuter AsyncHttpxClientWrapper.__del__ before any AsyncOpenAI clients are
# created. The SDK's __del__ schedules aclose() on asyncio.get_running_loop()
# which, during CLI idle time, finds prompt_toolkit's event loop and tries to
# close TCP transports bound to dead worker loops — producing
# "Event loop is closed" / "Press ENTER to continue..." errors.
try:
from agent.auxiliary_client import neuter_async_httpx_del
neuter_async_httpx_del()
except Exception:
pass
from rich import box as rich_box
from rich.console import Console
from rich.markup import escape as _escape
@@ -1035,13 +1055,18 @@ class HermesCLI:
self.config = CLI_CONFIG
self.compact = compact if compact is not None else CLI_CONFIG["display"].get("compact", False)
# tool_progress: "off", "new", "all", "verbose" (from config.yaml display section)
self.tool_progress_mode = CLI_CONFIG["display"].get("tool_progress", "all")
# YAML 1.1 parses bare `off` as boolean False — normalise to string.
_raw_tp = CLI_CONFIG["display"].get("tool_progress", "all")
self.tool_progress_mode = "off" if _raw_tp is False else str(_raw_tp)
# resume_display: "full" (show history) | "minimal" (one-liner only)
self.resume_display = CLI_CONFIG["display"].get("resume_display", "full")
# bell_on_complete: play terminal bell (\a) when agent finishes a response
self.bell_on_complete = CLI_CONFIG["display"].get("bell_on_complete", False)
# show_reasoning: display model thinking/reasoning before the response
self.show_reasoning = CLI_CONFIG["display"].get("show_reasoning", False)
# busy_input_mode: "interrupt" (Enter interrupts current run) or "queue" (Enter queues for next turn)
_bim = CLI_CONFIG["display"].get("busy_input_mode", "interrupt")
self.busy_input_mode = "queue" if str(_bim).strip().lower() == "queue" else "interrupt"
self.verbose = verbose if verbose is not None else (self.tool_progress_mode == "verbose")
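Both display settings in this hunk are normalised defensively before use. The sketch below isolates the two expressions as functions (the function names are illustrative; the expressions are taken from the diff):

```python
def normalize_tool_progress(raw) -> str:
    """YAML 1.1 loaders read a bare `off` as boolean False; map it back to "off"."""
    return "off" if raw is False else str(raw)


def normalize_busy_input_mode(raw) -> str:
    """Only an explicit "queue" (any case, surrounding spaces ok) selects queue mode."""
    return "queue" if str(raw).strip().lower() == "queue" else "interrupt"


tool_progress_from_bare_off = normalize_tool_progress(False)
tool_progress_from_string = normalize_tool_progress("verbose")
busy_from_padded = normalize_busy_input_mode(" Queue ")
busy_from_unknown = normalize_busy_input_mode("later")
busy_from_missing = normalize_busy_input_mode(None)
```

Any unrecognised `busy_input_mode` value falls back to `interrupt`, so a typo in config.yaml cannot leave the CLI in an undefined state.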
@@ -1061,12 +1086,12 @@ class HermesCLI:
# authoritative. This avoids conflicts in multi-agent setups where
# env vars would stomp each other.
_model_config = CLI_CONFIG.get("model", {})
_config_model = _model_config.get("default", "") if isinstance(_model_config, dict) else (_model_config or "")
_FALLBACK_MODEL = "anthropic/claude-opus-4.6"
self.model = model or _config_model or _FALLBACK_MODEL
# Auto-detect model from local server if still on fallback
if self.model == _FALLBACK_MODEL:
_base_url = _model_config.get("base_url", "") if isinstance(_model_config, dict) else ""
_config_model = (_model_config.get("default") or _model_config.get("model") or "") if isinstance(_model_config, dict) else (_model_config or "")
_DEFAULT_CONFIG_MODEL = "anthropic/claude-opus-4.6"
self.model = model or _config_model or _DEFAULT_CONFIG_MODEL
# Auto-detect model from local server if still on default
if self.model == _DEFAULT_CONFIG_MODEL:
_base_url = (_model_config.get("base_url") or "") if isinstance(_model_config, dict) else ""
if "localhost" in _base_url or "127.0.0.1" in _base_url:
from hermes_cli.runtime_provider import _auto_detect_local_model
_detected = _auto_detect_local_model(_base_url)
@@ -1079,7 +1104,7 @@ class HermesCLI:
# explicit choice — the user just never changed it. But a config model
# like "gpt-5.3-codex" IS explicit and must be preserved.
self._model_is_default = not model and (
not _config_model or _config_model == _FALLBACK_MODEL
not _config_model or _config_model == _DEFAULT_CONFIG_MODEL
)
self._explicit_api_key = api_key
@@ -1165,9 +1190,13 @@ class HermesCLI:
self._provider_require_params = pr.get("require_parameters", False)
self._provider_data_collection = pr.get("data_collection")
# Fallback model config — tried when primary provider fails after retries
fb = CLI_CONFIG.get("fallback_model") or {}
self._fallback_model = fb if fb.get("provider") and fb.get("model") else None
# Fallback provider chain — tried in order when primary fails after retries.
# Supports new list format (fallback_providers) and legacy single-dict (fallback_model).
fb = CLI_CONFIG.get("fallback_providers") or CLI_CONFIG.get("fallback_model") or []
# Normalize legacy single-dict to a one-element list
if isinstance(fb, dict):
fb = [fb] if fb.get("provider") and fb.get("model") else []
self._fallback_model = fb
# Optional cheap-vs-strong routing for simple turns
self._smart_model_routing = CLI_CONFIG.get("smart_model_routing", {}) or {}
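The legacy-to-list normalisation above can be exercised on its own. The sketch below mirrors the logic in this hunk; the function name `normalise_fallbacks` and the provider/model values in the usage lines are hypothetical.

```python
from typing import Any, Dict, List


def normalise_fallbacks(config: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the fallback chain as a list, whichever format the config uses.

    Prefers the list-style `fallback_providers` key and falls back to the
    legacy single-dict `fallback_model` key.
    """
    fb = config.get("fallback_providers") or config.get("fallback_model") or []
    if isinstance(fb, dict):
        # A legacy entry only counts if both provider and model are set.
        fb = [fb] if fb.get("provider") and fb.get("model") else []
    return fb


if __name__ == "__main__":
    legacy = {"fallback_model": {"provider": "example", "model": "example-model"}}
    print(normalise_fallbacks(legacy))
```

Downstream code can then iterate over the result without checking which config format was used.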
@@ -1326,20 +1355,69 @@ class HermesCLI:
return snapshot
    @staticmethod
    def _status_bar_display_width(text: str) -> int:
        """Return terminal cell width for status-bar text.

        len() is not enough for prompt_toolkit layout decisions because some
        glyphs can render wider than one Python codepoint. Keeping the status
        bar within the real display width prevents it from wrapping onto a
        second line and leaving behind duplicate rows.
        """
        try:
            from prompt_toolkit.utils import get_cwidth
            return get_cwidth(text or "")
        except Exception:
            return len(text or "")

    @classmethod
    def _trim_status_bar_text(cls, text: str, max_width: int) -> str:
        """Trim status-bar text to a single terminal row."""
        if max_width <= 0:
            return ""
        try:
            from prompt_toolkit.utils import get_cwidth
        except Exception:
            get_cwidth = None

        if cls._status_bar_display_width(text) <= max_width:
            return text

        ellipsis = "..."
        ellipsis_width = cls._status_bar_display_width(ellipsis)
        if max_width <= ellipsis_width:
            return ellipsis[:max_width]

        out = []
        width = 0
        for ch in text:
            ch_width = get_cwidth(ch) if get_cwidth else len(ch)
            if width + ch_width + ellipsis_width > max_width:
                break
            out.append(ch)
            width += ch_width
        return "".join(out).rstrip() + ellipsis
def _build_status_bar_text(self, width: Optional[int] = None) -> str:
try:
snapshot = self._get_status_bar_snapshot()
width = width or shutil.get_terminal_size((80, 24)).columns
if width is None:
try:
from prompt_toolkit.application import get_app
width = get_app().output.get_size().columns
except Exception:
width = shutil.get_terminal_size((80, 24)).columns
percent = snapshot["context_percent"]
percent_label = f"{percent}%" if percent is not None else "--"
duration_label = snapshot["duration"]
if width < 52:
return f"{snapshot['model_short']} · {duration_label}"
text = f"{snapshot['model_short']} · {duration_label}"
return self._trim_status_bar_text(text, width)
if width < 76:
parts = [f"{snapshot['model_short']}", percent_label]
parts.append(duration_label)
return " · ".join(parts)
return self._trim_status_bar_text(" · ".join(parts), width)
if snapshot["context_length"]:
ctx_total = _format_context_length(snapshot["context_length"])
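The width-aware trimming used by the status bar can be tried without prompt_toolkit. The sketch below is an approximation under stated assumptions: `cell_width` is a hypothetical stand-in for `get_cwidth`, built on `unicodedata.east_asian_width`, and it counts every non-wide character (including combining marks) as one cell.

```python
import unicodedata


def cell_width(text: str) -> int:
    """Approximate terminal cells: wide and fullwidth characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def trim_to_width(text: str, max_width: int, ellipsis: str = "...") -> str:
    """Trim text so that it occupies at most max_width terminal cells."""
    if max_width <= 0:
        return ""
    if cell_width(text) <= max_width:
        return text
    ellipsis_width = cell_width(ellipsis)
    if max_width <= ellipsis_width:
        return ellipsis[:max_width]
    out = []
    width = 0
    for ch in text:
        ch_width = cell_width(ch)
        # Stop before the character that would leave no room for the ellipsis.
        if width + ch_width + ellipsis_width > max_width:
            break
        out.append(ch)
        width += ch_width
    return "".join(out).rstrip() + ellipsis


if __name__ == "__main__":
    print(trim_to_width("hello world", 8))
```

Measuring in cells rather than codepoints is what keeps a bar containing CJK text or wide glyphs on a single row.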
@@ -1350,7 +1428,7 @@ class HermesCLI:
parts = [f"{snapshot['model_short']}", context_label, percent_label]
parts.append(duration_label)
return "".join(parts)
return self._trim_status_bar_text("".join(parts), width)
except Exception:
return f"{self.model if getattr(self, 'model', None) else 'Hermes'}"
@@ -1359,57 +1437,67 @@ class HermesCLI:
return []
try:
snapshot = self._get_status_bar_snapshot()
width = shutil.get_terminal_size((80, 24)).columns
# Use prompt_toolkit's own terminal width when running inside the
# TUI — shutil.get_terminal_size() can return stale or fallback
# values (especially on SSH) that differ from what prompt_toolkit
# actually renders, causing the fragments to overflow to a second
# line and produce duplicated status bar rows over long sessions.
try:
from prompt_toolkit.application import get_app
width = get_app().output.get_size().columns
except Exception:
width = shutil.get_terminal_size((80, 24)).columns
duration_label = snapshot["duration"]
if width < 52:
return [
("class:status-bar", ""),
("class:status-bar-strong", snapshot["model_short"]),
("class:status-bar-dim", " · "),
("class:status-bar-dim", duration_label),
("class:status-bar", " "),
]
percent = snapshot["context_percent"]
percent_label = f"{percent}%" if percent is not None else "--"
if width < 76:
frags = [
("class:status-bar", ""),
("class:status-bar-strong", snapshot["model_short"]),
("class:status-bar-dim", " · "),
(self._status_bar_context_style(percent), percent_label),
]
frags.extend([
("class:status-bar-dim", " · "),
("class:status-bar-dim", duration_label),
("class:status-bar", " "),
])
return frags
if snapshot["context_length"]:
ctx_total = _format_context_length(snapshot["context_length"])
ctx_used = format_token_count_compact(snapshot["context_tokens"])
context_label = f"{ctx_used}/{ctx_total}"
]
else:
context_label = "ctx --"
percent = snapshot["context_percent"]
percent_label = f"{percent}%" if percent is not None else "--"
if width < 76:
frags = [
("class:status-bar", ""),
("class:status-bar-strong", snapshot["model_short"]),
("class:status-bar-dim", " · "),
(self._status_bar_context_style(percent), percent_label),
("class:status-bar-dim", " · "),
("class:status-bar-dim", duration_label),
("class:status-bar", " "),
]
else:
if snapshot["context_length"]:
ctx_total = _format_context_length(snapshot["context_length"])
ctx_used = format_token_count_compact(snapshot["context_tokens"])
context_label = f"{ctx_used}/{ctx_total}"
else:
context_label = "ctx --"
bar_style = self._status_bar_context_style(percent)
frags = [
("class:status-bar", ""),
("class:status-bar-strong", snapshot["model_short"]),
("class:status-bar-dim", ""),
("class:status-bar-dim", context_label),
("class:status-bar-dim", ""),
(bar_style, self._build_context_bar(percent)),
("class:status-bar-dim", " "),
(bar_style, percent_label),
]
frags.extend([
("class:status-bar-dim", " "),
("class:status-bar-dim", duration_label),
("class:status-bar", " "),
])
bar_style = self._status_bar_context_style(percent)
frags = [
("class:status-bar", ""),
("class:status-bar-strong", snapshot["model_short"]),
("class:status-bar-dim", ""),
("class:status-bar-dim", context_label),
("class:status-bar-dim", ""),
(bar_style, self._build_context_bar(percent)),
("class:status-bar-dim", " "),
(bar_style, percent_label),
("class:status-bar-dim", ""),
("class:status-bar-dim", duration_label),
("class:status-bar", " "),
]
total_width = sum(self._status_bar_display_width(text) for _, text in frags)
if total_width > width:
plain_text = "".join(text for _, text in frags)
trimmed = self._trim_status_bar_text(plain_text, width)
return [("class:status-bar", trimmed)]
return frags
except Exception:
return [("class:status-bar", f" {self._build_status_bar_text()} ")]
@@ -1594,6 +1682,7 @@ class HermesCLI:
if not text:
return
self._reasoning_stream_started = True
self._reasoning_shown_this_turn = True
if getattr(self, "_stream_box_opened", False):
return
@@ -2700,22 +2789,12 @@ class HermesCLI:
print(f" MCP tool: /tools {subcommand} github:create_issue")
return
# Confirm session reset before applying
verb = "Disable" if subcommand == "disable" else "Enable"
# Apply the change directly — the user typing the command is implicit
# consent. Do NOT use input() here; it hangs inside prompt_toolkit's
# TUI event loop (known pitfall).
verb = "Disabling" if subcommand == "disable" else "Enabling"
label = ", ".join(names)
_cprint(f"{_GOLD}{verb} {label}?{_RST}")
_cprint(f"{_DIM}This will save to config and reset your session so the "
f"change takes effect cleanly.{_RST}")
try:
answer = input(" Continue? [y/N] ").strip().lower()
except (EOFError, KeyboardInterrupt):
print()
_cprint(f"{_DIM}Cancelled.{_RST}")
return
if answer not in ("y", "yes"):
_cprint(f"{_DIM}Cancelled.{_RST}")
return
_cprint(f"{_GOLD}{verb} {label}...{_RST}")
tools_disable_enable_command(
Namespace(tools_action=subcommand, names=names, platform="cli"))
@@ -2758,6 +2837,28 @@ class HermesCLI:
print(" Example: python cli.py --toolsets web,terminal")
print()
    def _handle_profile_command(self):
        """Display active profile name and home directory."""
        from hermes_constants import get_hermes_home, display_hermes_home

        home = get_hermes_home()
        display = display_hermes_home()

        profiles_parent = Path.home() / ".hermes" / "profiles"
        try:
            rel = home.relative_to(profiles_parent)
            profile_name = str(rel).split("/")[0]
        except ValueError:
            profile_name = None

        print()
        if profile_name:
            print(f" Profile: {profile_name}")
        else:
            print(" Profile: default")
        print(f" Home: {display}")
        print()
def show_config(self):
"""Display current configuration with kawaii ASCII art."""
# Get terminal config from environment (which was set from cli-config.yaml)
@@ -2929,6 +3030,82 @@ class HermesCLI:
if not silent:
print("(^_^)v New session started!")
def _handle_resume_command(self, cmd_original: str) -> None:
"""Handle /resume <session_id_or_title> — switch to a previous session mid-conversation."""
parts = cmd_original.split(None, 1)
target = parts[1].strip() if len(parts) > 1 else ""
if not target:
_cprint(" Usage: /resume <session_id_or_title>")
_cprint(" Tip: Use /history or `hermes sessions list` to find sessions.")
return
if not self._session_db:
_cprint(" Session database not available.")
return
# Resolve title or ID
from hermes_cli.main import _resolve_session_by_name_or_id
resolved = _resolve_session_by_name_or_id(target)
target_id = resolved or target
session_meta = self._session_db.get_session(target_id)
if not session_meta:
_cprint(f" Session not found: {target}")
_cprint(" Use /history or `hermes sessions list` to see available sessions.")
return
if target_id == self.session_id:
_cprint(" Already on that session.")
return
# End current session
try:
self._session_db.end_session(self.session_id, "resumed_other")
except Exception:
pass
# Switch to the target session
self.session_id = target_id
self._resumed = True
self._pending_title = None
# Load conversation history
restored = self._session_db.get_messages_as_conversation(target_id)
self.conversation_history = restored or []
# Re-open the target session so it's not marked as ended
try:
self._session_db.reopen_session(target_id)
except Exception:
pass
# Sync the agent if already initialised
if self.agent:
self.agent.session_id = target_id
self.agent.reset_session_state()
if hasattr(self.agent, "_last_flushed_db_idx"):
self.agent._last_flushed_db_idx = len(self.conversation_history)
if hasattr(self.agent, "_todo_store"):
try:
from tools.todo_tool import TodoStore
self.agent._todo_store = TodoStore()
except Exception:
pass
if hasattr(self.agent, "_invalidate_system_prompt"):
self.agent._invalidate_system_prompt()
title_part = f" \"{session_meta['title']}\"" if session_meta.get("title") else ""
msg_count = len([m for m in self.conversation_history if m.get("role") == "user"])
if self.conversation_history:
_cprint(
f" ↻ Resumed session {target_id}{title_part}"
f" ({msg_count} user message{'s' if msg_count != 1 else ''},"
f" {len(self.conversation_history)} total)"
)
else:
_cprint(f" ↻ Resumed session {target_id}{title_part} — no messages, starting fresh.")
def reset_conversation(self):
"""Reset the conversation by starting a new session."""
self.new_session()
@@ -3486,7 +3663,7 @@ class HermesCLI:
print(" To start the gateway:")
print(" python cli.py --gateway")
print()
print(" Configuration file: ~/.hermes/config.yaml")
print(f" Configuration file: {display_hermes_home()}/config.yaml")
print()
except Exception as e:
@@ -3496,7 +3673,7 @@ class HermesCLI:
print(" 1. Set environment variables:")
print(" TELEGRAM_BOT_TOKEN=your_token")
print(" DISCORD_BOT_TOKEN=your_token")
print(" 2. Or configure settings in ~/.hermes/config.yaml")
print(f" 2. Or configure settings in {display_hermes_home()}/config.yaml")
print()
def process_command(self, command: str) -> bool:
@@ -3524,6 +3701,8 @@ class HermesCLI:
return False
elif canonical == "help":
self.show_help()
elif canonical == "profile":
self._handle_profile_command()
elif canonical == "tools":
self._handle_tools_command(cmd_original)
elif canonical == "toolsets":
@@ -3647,6 +3826,8 @@ class HermesCLI:
_cprint(" Session database not available.")
elif canonical == "new":
self.new_session()
elif canonical == "resume":
self._handle_resume_command(cmd_original)
elif canonical == "provider":
self._show_model_and_providers()
elif canonical == "prompt":
@@ -3679,6 +3860,8 @@ class HermesCLI:
self.console.print(f" Status bar {state}")
elif canonical == "verbose":
self._toggle_verbose()
elif canonical == "yolo":
self._toggle_yolo()
elif canonical == "reasoning":
self._handle_reasoning_command(cmd_original)
elif canonical == "compress":
@@ -3701,7 +3884,7 @@ class HermesCLI:
plugins = mgr.list_plugins()
if not plugins:
print("No plugins installed.")
print("Drop plugin directories into ~/.hermes/plugins/ to get started.")
print(f"Drop plugin directories into {display_hermes_home()}/plugins/ to get started.")
else:
print(f"Plugins ({len(plugins)}):")
for p in plugins:
@@ -3722,17 +3905,17 @@ class HermesCLI:
elif canonical == "background":
self._handle_background_command(cmd_original)
elif canonical == "queue":
if not self._agent_running:
_cprint(" /queue only works while Hermes is busy. Just type your message normally.")
# Extract prompt after "/queue " or "/q "
parts = cmd_original.split(None, 1)
payload = parts[1].strip() if len(parts) > 1 else ""
if not payload:
_cprint(" Usage: /queue <prompt>")
else:
# Extract prompt after "/queue " or "/q "
parts = cmd_original.split(None, 1)
payload = parts[1].strip() if len(parts) > 1 else ""
if not payload:
_cprint(" Usage: /queue <prompt>")
else:
self._pending_input.put(payload)
self._pending_input.put(payload)
if self._agent_running:
_cprint(f" Queued for the next turn: {payload[:80]}{'...' if len(payload) > 80 else ''}")
else:
_cprint(f" Queued: {payload[:80]}{'...' if len(payload) > 80 else ''}")
elif canonical == "skin":
self._handle_skin_command(cmd_original)
elif canonical == "voice":
@@ -3924,6 +4107,17 @@ class HermesCLI:
provider_data_collection=self._provider_data_collection,
fallback_model=self._fallback_model,
)
# Silence raw spinner; route thinking through TUI widget when no foreground agent is active.
bg_agent._print_fn = lambda *_a, **_kw: None
def _bg_thinking(text: str) -> None:
# Concurrent bg tasks may race on _spinner_text; acceptable for best-effort UI.
if not self._agent_running:
self._spinner_text = text
if self._app:
self._app.invalidate()
bg_agent.thinking_callback = _bg_thinking
result = bg_agent.run_conversation(
user_message=prompt,
@@ -3986,6 +4180,9 @@ class HermesCLI:
_cprint(f" ❌ Background task #{task_num} failed: {e}")
finally:
self._background_tasks.pop(task_id, None)
# Clear spinner only if no foreground agent owns it
if not self._agent_running:
self._spinner_text = ""
if self._app:
self._invalidate(min_interval=0)
@@ -4216,7 +4413,7 @@ class HermesCLI:
source = f" ({s['source']})" if s["source"] == "user" else ""
print(f" {marker} {s['name']}{source}{s['description']}")
print("\n Usage: /skin <name>")
print(" Custom skins: drop a YAML file in ~/.hermes/skins/\n")
print(f" Custom skins: drop a YAML file in {display_hermes_home()}/skins/\n")
return
new_skin = parts[1].strip().lower()
@@ -4263,6 +4460,17 @@ class HermesCLI:
}
_cprint(labels.get(self.tool_progress_mode, ""))
    def _toggle_yolo(self):
        """Toggle YOLO mode — skip all dangerous command approval prompts."""
        import os
        current = bool(os.environ.get("HERMES_YOLO_MODE"))
        if current:
            os.environ.pop("HERMES_YOLO_MODE", None)
            self.console.print(" ⚠ YOLO mode [bold red]OFF[/] — dangerous commands will require approval.")
        else:
            os.environ["HERMES_YOLO_MODE"] = "1"
            self.console.print(" ⚡ YOLO mode [bold green]ON[/] — all commands auto-approved. Use with caution.")
def _handle_reasoning_command(self, cmd: str):
"""Handle /reasoning — manage effort level and display toggle.
@@ -4396,7 +4604,7 @@ class HermesCLI:
compressor = agent.context_compressor
last_prompt = compressor.last_prompt_tokens
ctx_len = compressor.context_length
pct = (last_prompt / ctx_len * 100) if ctx_len else 0
pct = min(100, (last_prompt / ctx_len * 100)) if ctx_len else 0
compressions = compressor.compression_count
msg_count = len(self.conversation_history)
@@ -4654,8 +4862,10 @@ class HermesCLI:
from agent.display import get_tool_emoji
emoji = get_tool_emoji(function_name)
label = preview or function_name
if len(label) > 50:
label = label[:47] + "..."
from agent.display import get_tool_preview_max_len
_pl = get_tool_preview_max_len()
if _pl > 0 and len(label) > _pl:
label = label[:_pl - 3] + "..."
self._spinner_text = f"{emoji} {label}"
self._invalidate()
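The configurable preview truncation in this hunk reduces to a small helper. This is a sketch of the same slicing logic; the name `truncate_label` is hypothetical, and it assumes a positive limit is at least 4, since a smaller limit would make `max_len - 3` zero or negative.

```python
def truncate_label(label: str, max_len: int) -> str:
    """Shorten label to max_len characters, ending in "..." when it is cut.

    A max_len of 0 (or less) means "no limit", matching the `_pl > 0` guard
    in the diff. Assumes max_len >= 4 whenever it is positive.
    """
    if max_len > 0 and len(label) > max_len:
        return label[:max_len - 3] + "..."
    return label


if __name__ == "__main__":
    print(truncate_label("x" * 60, 50))
```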
@@ -5387,6 +5597,8 @@ class HermesCLI:
self.agent = None
# Initialize agent if needed
if self.agent is None:
_cprint(f"{_DIM}Initializing agent...{_RST}")
if not self._init_agent(
model_override=turn_route["model"],
runtime_override=turn_route["runtime"],
@@ -5424,6 +5636,13 @@ class HermesCLI:
except Exception as e:
logging.debug("@ context reference expansion failed: %s", e)
# Sanitize surrogate characters that can arrive via clipboard paste from
# rich-text editors (Google Docs, Word, etc.). Lone surrogates are invalid
# UTF-8 and crash JSON serialization in the OpenAI SDK.
if isinstance(message, str):
from run_agent import _sanitize_surrogates
message = _sanitize_surrogates(message)
# Add user message to history
self.conversation_history.append({"role": "user", "content": message})
@@ -5436,6 +5655,10 @@ class HermesCLI:
# Reset streaming display state for this turn
self._reset_stream_state()
# Separate from _reset_stream_state because this must persist
# across intermediate turn boundaries (tool-calling loops) — only
# reset at the start of each user turn.
self._reasoning_shown_this_turn = False
# --- Streaming TTS setup ---
# When ElevenLabs is the TTS provider and sounddevice is available,
@@ -5580,6 +5803,16 @@ class HermesCLI:
agent_thread.join() # Ensure agent thread completes
# Proactively clean up async clients whose event loop is dead.
# The agent thread may have created AsyncOpenAI clients bound
# to a per-thread event loop; if that loop is now closed, those
# clients' __del__ would crash prompt_toolkit's loop on GC.
try:
from agent.auxiliary_client import cleanup_stale_async_clients
cleanup_stale_async_clients()
except Exception:
pass
# Flush any remaining streamed text and close the box
self._flush_stream()
@@ -5640,8 +5873,13 @@ class HermesCLI:
response_previewed = result.get("response_previewed", False) if result else False
# Display reasoning (thinking) box if enabled and available.
# Skip when streaming already showed reasoning live.
if self.show_reasoning and result and not self._reasoning_stream_started:
# Skip when streaming already showed reasoning live. Use the
# turn-persistent flag (_reasoning_shown_this_turn) instead of
# _reasoning_stream_started — the latter gets reset during
# intermediate turn boundaries (tool-calling loops), which caused
# the reasoning box to re-render after the final response.
_reasoning_already_shown = getattr(self, '_reasoning_shown_this_turn', False)
if self.show_reasoning and result and not _reasoning_already_shown:
reasoning = result.get("last_reasoning")
if reasoning:
w = shutil.get_terminal_size().columns
@@ -5762,10 +6000,22 @@ class HermesCLI:
else:
duration_str = f"{seconds}s"
# Look up session title for resume-by-name hint
session_title = None
if self._session_db:
try:
session_title = self._session_db.get_session_title(self.session_id)
except Exception:
pass
print("Resume this session with:")
print(f" hermes --resume {self.session_id}")
if session_title:
print(f" hermes -c \"{session_title}\"")
print()
print(f"Session: {self.session_id}")
if session_title:
print(f"Title: {session_title}")
print(f"Duration: {duration_str}")
print(f"Messages: {msg_count} ({user_msgs} user, {tool_calls} tool calls)")
else:
@@ -5782,6 +6032,9 @@ class HermesCLI:
``normal_prompt`` is the full ``branding.prompt_symbol``.
``state_suffix`` is what special states (sudo/secret/approval/agent)
should render after their leading icon.
When a profile is active (not "default"), the profile name is
prepended to the prompt symbol: ``coder `` instead of ````.
"""
try:
from hermes_cli.skin_engine import get_active_prompt_symbol
@@ -5790,6 +6043,15 @@ class HermesCLI:
symbol = " "
symbol = (symbol or " ").rstrip() + " "
# Prepend profile name when not default
try:
from hermes_cli.profiles import get_active_profile_name
profile = get_active_profile_name()
if profile not in ("default", "custom"):
symbol = f"{profile} {symbol}"
except Exception:
pass
stripped = symbol.rstrip()
if not stripped:
return " ", " "
@@ -5941,7 +6203,7 @@ class HermesCLI:
from honcho_integration.client import HonchoClientConfig
from agent.display import honcho_session_line, write_tty
hcfg = HonchoClientConfig.from_global_config()
if hcfg.enabled and hcfg.api_key and hcfg.explicitly_configured:
if hcfg.enabled and (hcfg.api_key or hcfg.base_url) and hcfg.explicitly_configured:
sname = hcfg.resolve_session_name(session_id=self.session_id)
if sname:
write_tty(honcho_session_line(hcfg.workspace_id, sname) + "\n")
@@ -5977,6 +6239,11 @@ class HermesCLI:
self._interrupt_queue = queue.Queue() # For messages typed while agent is running
self._should_exit = False
self._last_ctrl_c_time = 0 # Track double Ctrl+C for force exit
# Give plugin manager a CLI reference so plugins can inject messages
from hermes_cli.plugins import get_plugin_manager
get_plugin_manager()._cli_ref = self
# Config file watcher — detect mcp_servers changes and auto-reload
from hermes_cli.config import get_config_path as _get_config_path
_cfg_path = _get_config_path()
@@ -6028,10 +6295,18 @@ class HermesCLI:
set_approval_callback(self._approval_callback)
set_secret_capture_callback(self._secret_capture_callback)
# Ensure tirith security scanner is available (downloads if needed)
# Ensure tirith security scanner is available (downloads if needed).
# Warn the user if tirith is enabled in config but not available,
# so they know command security scanning is degraded.
try:
from tools.tirith_security import ensure_installed
ensure_installed(log_failures=False)
tirith_path = ensure_installed(log_failures=False)
if tirith_path is None:
security_cfg = self.config.get("security", {}) or {}
tirith_enabled = security_cfg.get("tirith_enabled", True)
if tirith_enabled:
_cprint(f" {_DIM}⚠ tirith security scanner enabled but not available "
f"— command scanning will use pattern matching only{_RST}")
except Exception:
pass # Non-fatal — fail-open at scan time if unavailable
@@ -6112,16 +6387,22 @@ class HermesCLI:
# Bundle text + images as a tuple when images are present
payload = (text, images) if images else text
if self._agent_running and not (text and text.startswith("/")):
self._interrupt_queue.put(payload)
# Debug: log to file when message enters interrupt queue
try:
_dbg = _hermes_home / "interrupt_debug.log"
with open(_dbg, "a") as _f:
import time as _t
_f.write(f"{_t.strftime('%H:%M:%S')} ENTER: queued interrupt msg={str(payload)[:60]!r}, "
f"agent_running={self._agent_running}\n")
except Exception:
pass
if self.busy_input_mode == "queue":
# Queue for the next turn instead of interrupting
self._pending_input.put(payload)
preview = text if text else f"[{len(images)} image{'s' if len(images) != 1 else ''} attached]"
_cprint(f" Queued for the next turn: {preview[:80]}{'...' if len(preview) > 80 else ''}")
else:
self._interrupt_queue.put(payload)
# Debug: log to file when message enters interrupt queue
try:
_dbg = _hermes_home / "interrupt_debug.log"
with open(_dbg, "a") as _f:
import time as _t
_f.write(f"{_t.strftime('%H:%M:%S')} ENTER: queued interrupt msg={str(payload)[:60]!r}, "
f"agent_running={self._agent_running}\n")
except Exception:
pass
else:
self._pending_input.put(payload)
event.app.current_buffer.reset(append_to_history=True)
@@ -6312,6 +6593,24 @@ class HermesCLI:
self._should_exit = True
event.app.exit()
@kb.add('c-z')
def handle_ctrl_z(event):
"""Handle Ctrl+Z - suspend process to background (Unix only)."""
import sys
if sys.platform == 'win32':
_cprint(f"\n{_DIM}Suspend (Ctrl+Z) is not supported on Windows.{_RST}")
event.app.invalidate()
return
import os, signal as _sig
from prompt_toolkit.application import run_in_terminal
from hermes_cli.skin_engine import get_active_skin
agent_name = get_active_skin().get_branding("agent_name", "Hermes Agent")
msg = f"\n{agent_name} has been suspended. Run `fg` to bring {agent_name} back."
def _suspend():
os.write(1, msg.encode())
os.kill(0, _sig.SIGTSTP)
run_in_terminal(_suspend)
# Voice push-to-talk key: configurable via config.yaml (voice.record_key)
# Default: Ctrl+B (avoids conflict with Ctrl+R readline reverse-search)
# Config uses "ctrl+b" format; prompt_toolkit expects "c-b" format.
@@ -6501,6 +6800,7 @@ class HermesCLI:
# Paste collapsing: detect large pastes and save to temp file
_paste_counter = [0]
_prev_text_len = [0]
_prev_newline_count = [0]
_paste_just_collapsed = [False]
def _on_text_changed(buf):
@@ -6509,18 +6809,27 @@ class HermesCLI:
When bracketed paste is available, handle_paste collapses
large pastes directly. This handler is a fallback for
terminals without bracketed paste support.
Two heuristics (either triggers collapse):
1. Many characters added at once (chars_added > 1): works
when the terminal delivers the paste in one event-loop tick.
2. Newline count jumped by 4+ in a single text-change event:
catches terminals that feed characters individually but
still batch newlines. Alt+Enter only adds 1 newline per
event so it never triggers this.
"""
text = buf.text
chars_added = len(text) - _prev_text_len[0]
_prev_text_len[0] = len(text)
if _paste_just_collapsed[0]:
_paste_just_collapsed[0] = False
_prev_newline_count[0] = text.count('\n')
return
line_count = text.count('\n')
# Heuristic: a real paste adds many characters at once (not just a
# single newline from Alt+Enter) AND the result has 5+ lines.
# Fallback for terminals without bracketed paste support.
if line_count >= 5 and chars_added > 1 and not text.startswith('/'):
newlines_added = line_count - _prev_newline_count[0]
_prev_newline_count[0] = line_count
is_paste = chars_added > 1 or newlines_added >= 4
if line_count >= 5 and is_paste and not text.startswith('/'):
_paste_counter[0] += 1
# Save to temp file
paste_dir = _hermes_home / "pastes"
@@ -6528,6 +6837,7 @@ class HermesCLI:
paste_file = paste_dir / f"paste_{_paste_counter[0]}_{datetime.now().strftime('%H%M%S')}.txt"
paste_file.write_text(text, encoding="utf-8")
# Replace buffer with compact reference
_paste_just_collapsed[0] = True
buf.text = f"[Pasted text #{_paste_counter[0]}: {line_count + 1} lines \u2192 {paste_file}]"
buf.cursor_position = len(buf.text)
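The two paste heuristics can be checked outside prompt_toolkit with a small stateful detector. This is a simplified sketch: the class name `PasteDetector` is hypothetical, and it leaves out the `_paste_just_collapsed` guard and the temp-file write.

```python
class PasteDetector:
    """Track buffer changes and flag the ones that look like a paste."""

    def __init__(self) -> None:
        self.prev_len = 0
        self.prev_newlines = 0

    def update(self, text: str) -> bool:
        """Return True when this text-change event looks like a large paste."""
        chars_added = len(text) - self.prev_len
        self.prev_len = len(text)
        line_count = text.count("\n")
        newlines_added = line_count - self.prev_newlines
        self.prev_newlines = line_count
        # Either many characters arrived at once, or newlines jumped by 4+.
        is_paste = chars_added > 1 or newlines_added >= 4
        return line_count >= 5 and is_paste and not text.startswith("/")


if __name__ == "__main__":
    detector = PasteDetector()
    print(detector.update("l1\nl2\nl3\nl4\nl5\nl6"))
```

Typing the same text one character at a time never trips the detector, because each event adds a single character and at most one newline.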
@@ -6894,6 +7204,15 @@ class HermesCLI:
Window(
content=FormattedTextControl(lambda: cli_ref._get_status_bar_fragments()),
height=1,
# Prevent fragments that overflow the terminal width from
# wrapping onto a second line, which causes the status bar to
# appear duplicated (one full + one partial row) during long
# sessions, especially on SSH where shutil.get_terminal_size
# may return stale values. _get_status_bar_fragments now reads
# width from prompt_toolkit's own output object, so fragments
# will always fit; wrap_lines=False is the belt-and-suspenders
# guard against any future width mismatch.
wrap_lines=False,
),
filter=Condition(lambda: cli_ref._status_bar_visible),
)
@@ -7128,9 +7447,28 @@ class HermesCLI:
# Register atexit cleanup so resources are freed even on unexpected exit
atexit.register(_run_cleanup)
# Install a custom asyncio exception handler that suppresses the
# "Event loop is closed" RuntimeError from httpx transport cleanup.
# This is defense-in-depth — the primary fix is neuter_async_httpx_del
# which disables __del__ entirely, but older clients or SDK upgrades
# could bypass it.
def _suppress_closed_loop_errors(loop, context):
exc = context.get("exception")
if isinstance(exc, RuntimeError) and "Event loop is closed" in str(exc):
return # silently suppress
# Fall back to default handler for everything else
loop.default_exception_handler(context)
# Run the application with patch_stdout for proper output handling
try:
with patch_stdout():
# Set the custom handler on prompt_toolkit's event loop
try:
import asyncio as _aio
_loop = _aio.get_event_loop()
_loop.set_exception_handler(_suppress_closed_loop_errors)
except Exception:
pass
app.run()
except (EOFError, KeyboardInterrupt):
pass
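The exception-handler pattern above can be tried on a throwaway event loop. This is a minimal sketch using only standard `asyncio` calls (`set_exception_handler`, `call_exception_handler`, `default_exception_handler`); the demo at the bottom is illustrative and not part of the diff.

```python
import asyncio


def suppress_closed_loop_errors(loop, context):
    """Drop "Event loop is closed" RuntimeErrors, delegate everything else."""
    exc = context.get("exception")
    if isinstance(exc, RuntimeError) and "Event loop is closed" in str(exc):
        return  # silently suppress
    loop.default_exception_handler(context)


if __name__ == "__main__":
    loop = asyncio.new_event_loop()
    loop.set_exception_handler(suppress_closed_loop_errors)
    # Suppressed: nothing is logged for this context.
    loop.call_exception_handler(
        {"message": "demo", "exception": RuntimeError("Event loop is closed")}
    )
    loop.close()
```

Matching on the message text is coarse, so any other `RuntimeError` with the same wording would be swallowed too. That may be acceptable for a defense-in-depth guard.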
+42 -1
@@ -327,7 +327,20 @@ def load_jobs() -> List[Dict[str, Any]]:
with open(JOBS_FILE, 'r', encoding='utf-8') as f:
data = json.load(f)
return data.get("jobs", [])
except (json.JSONDecodeError, IOError):
except json.JSONDecodeError:
# Retry with strict=False to handle bare control chars in string values
try:
with open(JOBS_FILE, 'r', encoding='utf-8') as f:
data = json.loads(f.read(), strict=False)
jobs = data.get("jobs", [])
if jobs:
# Auto-repair: rewrite with proper escaping
save_jobs(jobs)
logger.warning("Auto-repaired jobs.json (had invalid control characters)")
return jobs
except Exception:
return []
except IOError:
return []
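The `strict=False` retry relies on documented behaviour of the standard `json` module: in strict mode, raw control characters inside a string value are rejected, and with `strict=False` they are accepted. A minimal sketch follows; the helper name `loads_lenient` is hypothetical.

```python
import json
from typing import Any, Tuple


def loads_lenient(raw: str) -> Tuple[Any, bool]:
    """Parse JSON, retrying with strict=False on failure.

    Returns (data, repaired), where repaired is True if the lenient retry
    was needed. Other syntax errors still raise json.JSONDecodeError.
    """
    try:
        return json.loads(raw), False
    except json.JSONDecodeError:
        return json.loads(raw, strict=False), True


if __name__ == "__main__":
    # The string value contains a literal newline, which strict mode rejects.
    broken = '{"jobs": [{"prompt": "line1\nline2"}]}'
    data, repaired = loads_lenient(broken)
    print(repaired, json.dumps(data))
```

Re-serialising with `json.dumps` escapes the control character, which is why rewriting the file after a lenient load repairs it.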
@@ -598,6 +611,34 @@ def mark_job_run(job_id: str, success: bool, error: Optional[str] = None):
save_jobs(jobs)
def advance_next_run(job_id: str) -> bool:
    """Preemptively advance next_run_at for a recurring job before execution.

    Call this BEFORE run_job() so that if the process crashes mid-execution,
    the job won't re-fire on the next gateway restart. This converts the
    scheduler from at-least-once to at-most-once for recurring jobs — missing
    one run is far better than firing dozens of times in a crash loop.

    One-shot jobs are left unchanged so they can still retry on restart.

    Returns True if next_run_at was advanced, False otherwise.
    """
    jobs = load_jobs()
    for job in jobs:
        if job["id"] == job_id:
            kind = job.get("schedule", {}).get("kind")
            if kind not in ("cron", "interval"):
                return False
            now = _hermes_now().isoformat()
            new_next = compute_next_run(job["schedule"], now)
            if new_next and new_next != job.get("next_run_at"):
                job["next_run_at"] = new_next
                save_jobs(jobs)
                return True
            return False
    return False
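The at-most-once ordering described in the docstring can be shown with an in-memory toy. Everything here is hypothetical (the `interval_seconds` field, `run_at_most_once`, `is_due`); it only illustrates why the schedule is advanced before the job runs.

```python
from datetime import datetime, timedelta
from typing import Any, Callable, Dict


def is_due(job: Dict[str, Any], now: datetime) -> bool:
    """A job is due when its next_run_at is not in the future."""
    return job["next_run_at"] <= now


def run_at_most_once(
    job: Dict[str, Any],
    now: datetime,
    run: Callable[[Dict[str, Any]], None],
) -> None:
    """Advance the schedule first, then execute.

    If run() crashes, next_run_at has already moved, so a restart does not
    fire the same slot again (at-most-once instead of at-least-once).
    """
    job["next_run_at"] = now + timedelta(seconds=job["interval_seconds"])
    run(job)


if __name__ == "__main__":
    start = datetime(2026, 1, 1, 12, 0, 0)
    job = {"id": "demo", "interval_seconds": 60, "next_run_at": start}
    run_at_most_once(job, start, lambda j: print("ran", j["id"]))
    print(is_due(job, start))
```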
def get_due_jobs() -> List[Dict[str, Any]]:
"""Get all jobs that are due to run now.
+55 -18
@@ -26,6 +26,7 @@ except ImportError:
msvcrt = None
from pathlib import Path
from hermes_constants import get_hermes_home
from hermes_cli.config import load_config
from typing import Optional
from hermes_time import now as _hermes_now
@@ -35,7 +36,7 @@ logger = logging.getLogger(__name__)
# Add parent directory to path for imports
sys.path.insert(0, str(Path(__file__).parent.parent))
from cron.jobs import get_due_jobs, mark_job_run, save_job_output
from cron.jobs import get_due_jobs, mark_job_run, save_job_output, advance_next_run
# Sentinel: when a cron agent has nothing new to report, it can start its
# response with this marker to suppress delivery. Output is still saved
@@ -86,6 +87,22 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
chat_id, thread_id = rest.split(":", 1)
else:
chat_id, thread_id = rest, None
# Resolve human-friendly labels like "Alice (dm)" to real IDs.
# send_message(action="list") shows labels with display suffixes
# that aren't valid platform IDs (e.g. WhatsApp JIDs).
try:
from gateway.channel_directory import resolve_channel_name
target = chat_id
# Strip display suffix like " (dm)" or " (group)"
if target.endswith(")") and " (" in target:
target = target.rsplit(" (", 1)[0].strip()
resolved = resolve_channel_name(platform_name.lower(), target)
if resolved:
chat_id = resolved
except Exception:
pass
return {
"platform": platform_name,
"chat_id": chat_id,
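The suffix-stripping step in the hunk above can be isolated into a small helper. `strip_display_suffix` is a hypothetical name used for illustration only; the lookup through `resolve_channel_name` is left out.

```python
def strip_display_suffix(label: str) -> str:
    """Drop a trailing ' (dm)' or ' (group)' style suffix from a channel label."""
    if label.endswith(")") and " (" in label:
        # rsplit with maxsplit=1 removes only the last parenthesized part.
        return label.rsplit(" (", 1)[0].strip()
    return label


examples = {
    "Alice (dm)": strip_display_suffix("Alice (dm)"),
    "Team (ops) (group)": strip_display_suffix("Team (ops) (group)"),
    "12345@s.whatsapp.net": strip_display_suffix("12345@s.whatsapp.net"),
}
```

Only the last parenthesized part is removed, so a name that itself contains parentheses keeps them.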
@@ -145,6 +162,8 @@ def _deliver_result(job: dict, content: str) -> None:
"mattermost": Platform.MATTERMOST,
"homeassistant": Platform.HOMEASSISTANT,
"dingtalk": Platform.DINGTALK,
"feishu": Platform.FEISHU,
"wecom": Platform.WECOM,
"email": Platform.EMAIL,
"sms": Platform.SMS,
}
@@ -164,18 +183,29 @@ def _deliver_result(job: dict, content: str) -> None:
logger.warning("Job '%s': platform '%s' not configured/enabled", job["id"], platform_name)
return
# Wrap the content so the user knows this is a cron delivery and that
# the interactive agent has no visibility into it.
task_name = job.get("name", job["id"])
wrapped = (
f"Cronjob Response: {task_name}\n"
f"-------------\n\n"
f"{content}\n\n"
f"Note: The agent cannot see this message, and therefore cannot respond to it."
)
# Optionally wrap the content with a header/footer so the user knows this
# is a cron delivery. Wrapping is on by default; set cron.wrap_response: false
# in config.yaml for clean output.
wrap_response = True
try:
user_cfg = load_config()
wrap_response = user_cfg.get("cron", {}).get("wrap_response", True)
except Exception:
pass
if wrap_response:
task_name = job.get("name", job["id"])
delivery_content = (
f"Cronjob Response: {task_name}\n"
f"-------------\n\n"
f"{content}\n\n"
f"Note: The agent cannot see this message, and therefore cannot respond to it."
)
else:
delivery_content = content
# Run the async send in a fresh event loop (safe from any thread)
coro = _send_to_platform(platform, pconfig, chat_id, wrapped, thread_id=thread_id)
coro = _send_to_platform(platform, pconfig, chat_id, delivery_content, thread_id=thread_id)
try:
result = asyncio.run(coro)
except RuntimeError:
@@ -186,7 +216,7 @@ def _deliver_result(job: dict, content: str) -> None:
coro.close()
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, wrapped, thread_id=thread_id))
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, delivery_content, thread_id=thread_id))
result = future.result(timeout=30)
except Exception as e:
logger.error("Job '%s': delivery to %s:%s failed: %s", job["id"], platform_name, chat_id, e)
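The optional wrapping introduced in this file can be summarized as one pure function. This is a sketch, not the project's code: `build_delivery_content` is a hypothetical name, and the config dict is passed in directly instead of being read with `load_config()`.

```python
def build_delivery_content(user_cfg: dict, job: dict, content: str) -> str:
    """Return the text to deliver, wrapped unless cron.wrap_response is false."""
    wrap_response = user_cfg.get("cron", {}).get("wrap_response", True)
    if not wrap_response:
        return content
    task_name = job.get("name", job["id"])
    return (
        f"Cronjob Response: {task_name}\n"
        "-------------\n\n"
        f"{content}\n\n"
        "Note: The agent cannot see this message, and therefore cannot respond to it."
    )


job = {"id": "nightly-report"}
wrapped = build_delivery_content({}, job, "All checks passed.")
clean = build_delivery_content({"cron": {"wrap_response": False}}, job, "All checks passed.")
```

With no `cron` section in the config, wrapping stays on, which matches the default stated in the comment above.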
@@ -206,11 +236,12 @@ def _build_job_prompt(job: dict) -> str:
# Always prepend [SILENT] guidance so the cron agent can suppress
# delivery when it has nothing new or noteworthy to report.
silent_hint = (
"[SYSTEM: If you have nothing new or noteworthy to report, respond "
"with exactly \"[SILENT]\" (optionally followed by a brief internal "
"note). This suppresses delivery to the user while still saving "
"output locally. Only use [SILENT] when there are genuinely no "
"changes worth reporting.]\n\n"
"[SYSTEM: If you have a meaningful status report or findings, "
"send them — that is the whole point of this job. Only respond "
"with exactly \"[SILENT]\" (nothing else) when there is genuinely "
"nothing new to report. [SILENT] suppresses delivery to the user. "
"Never combine [SILENT] with content — either report your "
"findings normally, or say [SILENT] and nothing more.]\n\n"
)
prompt = silent_hint + prompt
if skills is None:
@@ -308,7 +339,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
if delivery_target.get("thread_id") is not None:
os.environ["HERMES_CRON_AUTO_DELIVER_THREAD_ID"] = str(delivery_target["thread_id"])
model = job.get("model") or os.getenv("HERMES_MODEL") or "anthropic/claude-opus-4.6"
model = job.get("model") or os.getenv("HERMES_MODEL") or ""
# Load config.yaml for model, reasoning, prefill, toolsets, provider routing
_cfg = {}
@@ -524,6 +555,12 @@ def tick(verbose: bool = True) -> int:
executed = 0
for job in due_jobs:
try:
# For recurring jobs (cron/interval), advance next_run_at to the
# next future occurrence BEFORE execution. This way, if the
# process crashes mid-run, the job won't re-fire on restart.
# One-shot jobs are left alone so they can retry on restart.
advance_next_run(job["id"])
success, output, final_response, error = run_job(job)
output_file = save_job_output(job["id"], output)
+15
@@ -0,0 +1,15 @@
# Hermes Agent Persona
<!--
This file defines the agent's personality and tone.
The agent will embody whatever you write here.
Edit this to customize how Hermes communicates with you.
Examples:
- "You are a warm, playful assistant who uses kaomoji occasionally."
- "You are a concise technical expert. No fluff, just facts."
- "You speak like a friendly coworker who happens to know everything."
This file is loaded fresh each message -- no restart needed.
Delete the contents (or this file) to use the default personality.
-->
+34
@@ -0,0 +1,34 @@
#!/bin/bash
# Docker entrypoint: bootstrap config files into the mounted volume, then run hermes.
set -e
HERMES_HOME="/opt/data"
INSTALL_DIR="/opt/hermes"
# Create essential directory structure. Cache and platform directories
# (cache/images, cache/audio, platforms/whatsapp, etc.) are created on
# demand by the application — don't pre-create them here so new installs
# get the consolidated layout from get_hermes_dir().
mkdir -p "$HERMES_HOME"/{cron,sessions,logs,hooks,memories,skills}
# .env
if [ ! -f "$HERMES_HOME/.env" ]; then
cp "$INSTALL_DIR/.env.example" "$HERMES_HOME/.env"
fi
# config.yaml
if [ ! -f "$HERMES_HOME/config.yaml" ]; then
cp "$INSTALL_DIR/cli-config.yaml.example" "$HERMES_HOME/config.yaml"
fi
# SOUL.md
if [ ! -f "$HERMES_HOME/SOUL.md" ]; then
cp "$INSTALL_DIR/docker/SOUL.md" "$HERMES_HOME/SOUL.md"
fi
# Sync bundled skills (manifest-based so user edits are preserved)
if [ -d "$INSTALL_DIR/skills" ]; then
python3 "$INSTALL_DIR/tools/skills_sync.py"
fi
exec hermes "$@"
+3 -13
@@ -101,21 +101,11 @@ Available methods:
### Patches (`patches.py`)
**Problem**: Some hermes-agent tools use `asyncio.run()` internally (e.g., the Modal backend via SWE-ReX). This crashes when called from inside Atropos's event loop because `asyncio.run()` cannot be nested.
**Problem**: Some hermes-agent tools use `asyncio.run()` internally (e.g., the Modal backend). This crashes when called from inside Atropos's event loop because `asyncio.run()` cannot be nested.
**Solution**: `patches.py` monkey-patches `SwerexModalEnvironment` to use a dedicated background thread (`_AsyncWorker`) with its own event loop. The calling code sees the same sync interface, but internally the async work happens on a separate thread that doesn't conflict with Atropos's loop.
**Solution**: `ModalEnvironment` uses a dedicated `_AsyncWorker` background thread with its own event loop. The calling code sees a sync interface, but internally all async Modal SDK calls happen on the worker thread so they don't conflict with Atropos's loop. This is built directly into `tools/environments/modal.py` — no monkey-patching required.
What gets patched:
- `SwerexModalEnvironment.__init__` -- creates Modal deployment on a background thread
- `SwerexModalEnvironment.execute` -- runs commands on the same background thread
- `SwerexModalEnvironment.stop` -- stops deployment on the background thread
The patches are:
- **Idempotent** -- calling `apply_patches()` multiple times is safe
- **Transparent** -- same interface and behavior, only the internal async execution changes
- **Universal** -- works identically in normal CLI use (no running event loop)
Applied automatically at import time by `hermes_base_env.py`.
`patches.py` is now a no-op (kept for backward compatibility with imports).
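The background-thread pattern described above can be sketched as follows. This is a hypothetical illustration of the general technique, not the actual `_AsyncWorker` in `tools/environments/modal.py`; the class and method names here are assumptions.

```python
import asyncio
import threading
from typing import Any, Coroutine, Optional


class AsyncWorker:
    """Hypothetical sketch: a background thread that owns its own event loop."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro: Coroutine, timeout: Optional[float] = None) -> Any:
        """Schedule a coroutine on the worker loop and block until it finishes."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout=timeout)

    def stop(self) -> None:
        """Stop the worker loop and release its resources."""
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join(timeout=5)
        self._loop.close()


async def add(a: int, b: int) -> int:
    await asyncio.sleep(0)
    return a + b


async def caller(worker: AsyncWorker) -> int:
    # We are inside a running loop here, where asyncio.run() would raise
    # RuntimeError. The worker thread's separate loop avoids that.
    return worker.run(add(2, 3))


worker = AsyncWorker()
result = asyncio.run(caller(worker))
worker.stop()
```

Note that `worker.run()` is a blocking call, so the calling loop is paused while it waits. That matches the sync interface described above, but it is a design cost worth keeping in mind.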
### Tool Call Parsers (`tool_call_parsers/`)
@@ -209,7 +209,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
# Agent settings -- TB2 tasks are complex, need many turns
max_agent_turns=60,
max_token_length=***
max_token_length=16000,
agent_temperature=0.6,
system_prompt=None,
@@ -233,7 +233,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
steps_per_eval=1,
total_steps=1,
tokenizer_name="NousRe...1-8B",
tokenizer_name="NousResearch/Hermes-3-Llama-3.1-8B",
use_wandb=True,
wandb_name="terminal-bench-2",
ensure_scores_are_not_same=False, # Binary rewards may all be 0 or 1
@@ -245,7 +245,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
base_url="https://openrouter.ai/api/v1",
model_name="anthropic/claude-sonnet-4",
server_type="openai",
api_key=os.get...EY", ""),
api_key=os.getenv("OPENROUTER_API_KEY", ""),
health_check=False,
)
]
@@ -513,3 +513,446 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
reward = 0.0
else:
# Run tests in a thread so the blocking ctx.terminal() calls
# don't freeze the entire event loop (which would stall all
# other tasks, tqdm updates, and timeout timers).
ctx = ToolContext(task_id)
try:
loop = asyncio.get_event_loop()
reward = await loop.run_in_executor(
None, # default thread pool
self._run_tests, eval_item, ctx, task_name,
)
except Exception as e:
logger.error("Task %s: test verification failed: %s", task_name, e)
reward = 0.0
finally:
ctx.cleanup()
passed = reward == 1.0
status = "PASS" if passed else "FAIL"
elapsed = time.time() - task_start
tqdm.write(f" [{status}] {task_name} (turns={result.turns_used}, {elapsed:.0f}s)")
logger.info(
"Task %s: reward=%.1f, turns=%d, finished=%s",
task_name, reward, result.turns_used, result.finished_naturally,
)
out = {
"passed": passed,
"reward": reward,
"task_name": task_name,
"category": category,
"turns_used": result.turns_used,
"finished_naturally": result.finished_naturally,
"messages": result.messages,
}
self._save_result(out)
return out
except Exception as e:
elapsed = time.time() - task_start
logger.error("Task %s: rollout failed: %s", task_name, e, exc_info=True)
tqdm.write(f" [ERROR] {task_name}: {e} ({elapsed:.0f}s)")
out = {
"passed": False, "reward": 0.0,
"task_name": task_name, "category": category,
"error": str(e),
}
self._save_result(out)
return out
finally:
# --- Cleanup: clear overrides, sandbox, and temp files ---
clear_task_env_overrides(task_id)
try:
cleanup_vm(task_id)
except Exception as e:
logger.debug("VM cleanup for %s: %s", task_id[:8], e)
if task_dir and task_dir.exists():
shutil.rmtree(task_dir, ignore_errors=True)
def _run_tests(
self, item: Dict[str, Any], ctx: ToolContext, task_name: str
) -> float:
"""
Upload and execute the test suite in the agent's sandbox, then
download the verifier output locally to read the reward.
Follows Harbor's verification pattern:
1. Upload tests/ directory into the sandbox
2. Execute test.sh inside the sandbox
3. Download /logs/verifier/ directory to a local temp dir
4. Read reward.txt locally with native Python I/O
Downloading locally avoids issues with the file_read tool on
the Modal VM and matches how Harbor handles verification.
TB2 test scripts (test.sh) typically:
1. Install pytest via uv/pip
2. Run pytest against the test files in /tests/
3. Write results to /logs/verifier/reward.txt
Args:
item: The TB2 task dict (contains tests_tar, test_sh)
ctx: ToolContext scoped to this task's sandbox
task_name: For logging
Returns:
1.0 if tests pass, 0.0 otherwise
"""
tests_tar = item.get("tests_tar", "")
test_sh = item.get("test_sh", "")
if not test_sh:
logger.warning("Task %s: no test_sh content, reward=0", task_name)
return 0.0
# Create required directories in the sandbox
ctx.terminal("mkdir -p /tests /logs/verifier")
# Upload test files into the sandbox (binary-safe via base64)
if tests_tar:
tests_temp = Path(tempfile.mkdtemp(prefix=f"tb2-tests-{task_name}-"))
try:
_extract_base64_tar(tests_tar, tests_temp)
ctx.upload_dir(str(tests_temp), "/tests")
except Exception as e:
logger.warning("Task %s: failed to upload test files: %s", task_name, e)
finally:
shutil.rmtree(tests_temp, ignore_errors=True)
# Write the test runner script (test.sh)
ctx.write_file("/tests/test.sh", test_sh)
ctx.terminal("chmod +x /tests/test.sh")
# Execute the test suite
logger.info(
"Task %s: running test suite (timeout=%ds)",
task_name, self.config.test_timeout,
)
test_result = ctx.terminal(
"bash /tests/test.sh",
timeout=self.config.test_timeout,
)
exit_code = test_result.get("exit_code", -1)
output = test_result.get("output", "")
# Download the verifier output directory locally, then read reward.txt
# with native Python I/O. This avoids issues with file_read on the
# Modal VM and matches Harbor's verification pattern.
reward = 0.0
local_verifier_dir = Path(tempfile.mkdtemp(prefix=f"tb2-verifier-{task_name}-"))
try:
ctx.download_dir("/logs/verifier", str(local_verifier_dir))
reward_file = local_verifier_dir / "reward.txt"
if reward_file.exists() and reward_file.stat().st_size > 0:
content = reward_file.read_text().strip()
if content == "1":
reward = 1.0
elif content == "0":
reward = 0.0
else:
# Unexpected content -- try parsing as float
try:
reward = float(content)
except (ValueError, TypeError):
logger.warning(
"Task %s: reward.txt content unexpected (%r), "
"falling back to exit_code=%d",
task_name, content, exit_code,
)
reward = 1.0 if exit_code == 0 else 0.0
else:
# reward.txt not written -- fall back to exit code
logger.warning(
"Task %s: reward.txt not found after download, "
"falling back to exit_code=%d",
task_name, exit_code,
)
reward = 1.0 if exit_code == 0 else 0.0
except Exception as e:
logger.warning(
"Task %s: failed to download verifier dir: %s, "
"falling back to exit_code=%d",
task_name, e, exit_code,
)
reward = 1.0 if exit_code == 0 else 0.0
finally:
shutil.rmtree(local_verifier_dir, ignore_errors=True)
# Log test output for debugging failures
if reward == 0.0:
output_preview = output[-500:] if output else "(no output)"
logger.info(
"Task %s: FAIL (exit_code=%d)\n%s",
task_name, exit_code, output_preview,
)
return reward
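The reward-reading fallback in `_run_tests` reduces to a small function once the sandbox calls are set aside. In this sketch `read_reward` is a hypothetical helper, and a local temporary directory stands in for the downloaded `/logs/verifier` directory.

```python
import tempfile
from pathlib import Path


def read_reward(verifier_dir: Path, exit_code: int) -> float:
    """Read reward.txt if usable; otherwise fall back to the test exit code."""
    fallback = 1.0 if exit_code == 0 else 0.0
    reward_file = verifier_dir / "reward.txt"
    if not reward_file.exists() or reward_file.stat().st_size == 0:
        return fallback
    content = reward_file.read_text().strip()
    try:
        # "1" and "0" parse to 1.0 and 0.0; other numeric text is kept as-is.
        return float(content)
    except ValueError:
        return fallback


with tempfile.TemporaryDirectory() as tmp:
    verifier = Path(tmp)
    missing = read_reward(verifier, exit_code=0)   # no reward.txt: exit code decides
    (verifier / "reward.txt").write_text("1\n")
    explicit = read_reward(verifier, exit_code=1)  # reward.txt wins over exit code
    (verifier / "reward.txt").write_text("oops")
    garbled = read_reward(verifier, exit_code=1)   # unparseable: exit code decides
```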
# =========================================================================
# Evaluate -- main entry point for the eval subcommand
# =========================================================================
async def _eval_with_timeout(self, item: Dict[str, Any]) -> Dict:
"""
Wrap rollout_and_score_eval with a per-task wall-clock timeout.
If the task exceeds task_timeout seconds, it's automatically scored
as FAIL. This prevents any single task from hanging indefinitely.
"""
task_name = item.get("task_name", "unknown")
category = item.get("category", "unknown")
try:
return await asyncio.wait_for(
self.rollout_and_score_eval(item),
timeout=self.config.task_timeout,
)
except asyncio.TimeoutError:
from tqdm import tqdm
elapsed = self.config.task_timeout
tqdm.write(f" [TIMEOUT] {task_name} (exceeded {elapsed}s wall-clock limit)")
logger.error("Task %s: wall-clock timeout after %ds", task_name, elapsed)
out = {
"passed": False, "reward": 0.0,
"task_name": task_name, "category": category,
"error": f"timeout ({elapsed}s)",
}
self._save_result(out)
return out
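The wall-clock timeout wrapper relies on `asyncio.wait_for` cancelling the inner coroutine and raising `asyncio.TimeoutError`. A self-contained sketch under assumed names (`eval_with_timeout`, `slow_task`, and `fast_task` are illustrative, not part of the environment class):

```python
import asyncio


async def eval_with_timeout(task_fn, item: dict, timeout: float) -> dict:
    """Run task_fn(item); score it as a failure if it exceeds the timeout."""
    try:
        return await asyncio.wait_for(task_fn(item), timeout=timeout)
    except asyncio.TimeoutError:
        return {
            "passed": False,
            "reward": 0.0,
            "task_name": item.get("task_name", "unknown"),
            "error": f"timeout ({timeout}s)",
        }


async def slow_task(item: dict) -> dict:
    await asyncio.sleep(1)
    return {"passed": True, "reward": 1.0, "task_name": item["task_name"]}


async def fast_task(item: dict) -> dict:
    return {"passed": True, "reward": 1.0, "task_name": item["task_name"]}


timed_out = asyncio.run(eval_with_timeout(slow_task, {"task_name": "t1"}, 0.05))
finished = asyncio.run(eval_with_timeout(fast_task, {"task_name": "t2"}, 0.05))
```

Cancellation only interrupts the coroutine itself. Work already handed to a thread pool keeps running, which is why the evaluation code above also cleans up sandboxes explicitly.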
async def evaluate(self, *args, **kwargs) -> None:
"""
Run Terminal-Bench 2.0 evaluation over all tasks.
This is the main entry point when invoked via:
python environments/terminalbench2_env.py evaluate
Runs all tasks through rollout_and_score_eval() via asyncio.gather()
(same pattern as GPQA and other Atropos eval envs). Each task is
wrapped with a wall-clock timeout so hung tasks auto-fail.
Suppresses noisy Modal/terminal output (HERMES_QUIET) so the tqdm
bar stays visible.
"""
start_time = time.time()
# Route all logging through tqdm.write() so the progress bar stays
# pinned at the bottom while log lines scroll above it.
from tqdm import tqdm
class _TqdmHandler(logging.Handler):
def emit(self, record):
try:
tqdm.write(self.format(record))
except Exception:
self.handleError(record)
handler = _TqdmHandler()
handler.setFormatter(logging.Formatter(
"%(asctime)s [%(name)s] %(levelname)s: %(message)s",
datefmt="%H:%M:%S",
))
root = logging.getLogger()
root.handlers = [handler] # Replace any existing handlers
root.setLevel(logging.INFO)
# Silence noisy third-party loggers that flood the output
logging.getLogger("httpx").setLevel(logging.WARNING) # Every HTTP request
logging.getLogger("openai").setLevel(logging.WARNING) # OpenAI client retries
logging.getLogger("rex-deploy").setLevel(logging.WARNING) # Swerex deployment
logging.getLogger("rex_image_builder").setLevel(logging.WARNING) # Image builds
print(f"\n{'='*60}")
print("Starting Terminal-Bench 2.0 Evaluation")
print(f"{'='*60}")
print(f" Dataset: {self.config.dataset_name}")
print(f" Total tasks: {len(self.all_eval_items)}")
print(f" Max agent turns: {self.config.max_agent_turns}")
print(f" Task timeout: {self.config.task_timeout}s")
print(f" Terminal backend: {self.config.terminal_backend}")
print(f" Tool thread pool: {self.config.tool_pool_size}")
print(f" Terminal timeout: {self.config.terminal_timeout}s/cmd")
print(f" Terminal lifetime: {self.config.terminal_lifetime}s (auto: task_timeout + 120)")
print(f" Max concurrent tasks: {self.config.max_concurrent_tasks}")
print(f"{'='*60}\n")
# Semaphore to limit concurrent Modal sandbox creations.
# Without this, all 86 tasks fire simultaneously, each creating a Modal
# sandbox via asyncio.run() inside a thread pool worker. Modal's blocking
# calls (App.lookup, etc.) deadlock when too many are created at once.
semaphore = asyncio.Semaphore(self.config.max_concurrent_tasks)
async def _eval_with_semaphore(item):
async with semaphore:
return await self._eval_with_timeout(item)
# Fire all tasks with wall-clock timeout, track live accuracy on the bar
total_tasks = len(self.all_eval_items)
eval_tasks = [
asyncio.ensure_future(_eval_with_semaphore(item))
for item in self.all_eval_items
]
results = []
passed_count = 0
pbar = tqdm(total=total_tasks, desc="Evaluating TB2", dynamic_ncols=True)
try:
for coro in asyncio.as_completed(eval_tasks):
result = await coro
results.append(result)
if result and result.get("passed"):
passed_count += 1
done = len(results)
pct = (passed_count / done * 100) if done else 0
pbar.set_postfix_str(f"pass={passed_count}/{done} ({pct:.1f}%)")
pbar.update(1)
except (KeyboardInterrupt, asyncio.CancelledError):
pbar.close()
print(f"\n\nInterrupted! Cleaning up {len(eval_tasks)} tasks...")
# Cancel all pending tasks
for task in eval_tasks:
task.cancel()
# Let cancellations propagate (finally blocks run cleanup_vm)
await asyncio.gather(*eval_tasks, return_exceptions=True)
# Belt-and-suspenders: clean up any remaining sandboxes
from tools.terminal_tool import cleanup_all_environments
cleanup_all_environments()
print("All sandboxes cleaned up.")
return
finally:
pbar.close()
end_time = time.time()
# Filter out None results (shouldn't happen, but be safe)
valid_results = [r for r in results if r is not None]
if not valid_results:
print("Warning: No valid evaluation results obtained")
return
# ---- Compute metrics ----
total = len(valid_results)
passed = sum(1 for r in valid_results if r.get("passed"))
overall_pass_rate = passed / total if total > 0 else 0.0
# Per-category breakdown
cat_results: Dict[str, List[Dict]] = defaultdict(list)
for r in valid_results:
cat_results[r.get("category", "unknown")].append(r)
# Build metrics dict
eval_metrics = {
"eval/pass_rate": overall_pass_rate,
"eval/total_tasks": total,
"eval/passed_tasks": passed,
"eval/evaluation_time_seconds": end_time - start_time,
}
# Per-category metrics
for category, cat_items in sorted(cat_results.items()):
cat_passed = sum(1 for r in cat_items if r.get("passed"))
cat_total = len(cat_items)
cat_pass_rate = cat_passed / cat_total if cat_total > 0 else 0.0
cat_key = category.replace(" ", "_").replace("-", "_").lower()
eval_metrics[f"eval/pass_rate_{cat_key}"] = cat_pass_rate
# Store metrics for wandb_log
self.eval_metrics = [(k, v) for k, v in eval_metrics.items()]
# ---- Print summary ----
print(f"\n{'='*60}")
print("Terminal-Bench 2.0 Evaluation Results")
print(f"{'='*60}")
print(f"Overall Pass Rate: {overall_pass_rate:.4f} ({passed}/{total})")
print(f"Evaluation Time: {end_time - start_time:.1f} seconds")
print("\nCategory Breakdown:")
for category, cat_items in sorted(cat_results.items()):
cat_passed = sum(1 for r in cat_items if r.get("passed"))
cat_total = len(cat_items)
cat_rate = cat_passed / cat_total if cat_total > 0 else 0.0
print(f" {category}: {cat_rate:.1%} ({cat_passed}/{cat_total})")
# Print individual task results
print("\nTask Results:")
for r in sorted(valid_results, key=lambda x: x.get("task_name", "")):
status = "PASS" if r.get("passed") else "FAIL"
turns = r.get("turns_used", "?")
error = r.get("error", "")
extra = f" (error: {error})" if error else ""
print(f" [{status}] {r['task_name']} (turns={turns}){extra}")
print(f"{'='*60}\n")
# Build sample records for evaluate_log (includes full conversations)
samples = [
{
"task_name": r.get("task_name"),
"category": r.get("category"),
"passed": r.get("passed"),
"reward": r.get("reward"),
"turns_used": r.get("turns_used"),
"error": r.get("error"),
"messages": r.get("messages"),
}
for r in valid_results
]
# Log evaluation results
try:
await self.evaluate_log(
metrics=eval_metrics,
samples=samples,
start_time=start_time,
end_time=end_time,
generation_parameters={
"temperature": self.config.agent_temperature,
"max_tokens": self.config.max_token_length,
"max_agent_turns": self.config.max_agent_turns,
"terminal_backend": self.config.terminal_backend,
},
)
except Exception as e:
print(f"Error logging evaluation results: {e}")
# Close streaming file
if hasattr(self, "_streaming_file") and not self._streaming_file.closed:
self._streaming_file.close()
print(f" Live results saved to: {self._streaming_path}")
# Kill all remaining sandboxes. Timed-out tasks leave orphaned thread
# pool workers still executing commands -- cleanup_all stops them.
from tools.terminal_tool import cleanup_all_environments
print("\nCleaning up all sandboxes...")
cleanup_all_environments()
# Shut down the tool thread pool so orphaned workers from timed-out
# tasks are killed immediately instead of retrying against dead
# sandboxes and spamming the console with TimeoutError warnings.
from environments.agent_loop import _tool_executor
_tool_executor.shutdown(wait=False, cancel_futures=True)
print("Done.")
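The concurrency control in `evaluate()` combines a semaphore with `asyncio.as_completed`. The sketch below shows that combination in isolation; `run_bounded` and `fake_task` are hypothetical names, and the short sleep stands in for sandbox creation and the agent rollout.

```python
import asyncio

state = {"active": 0, "peak": 0}


async def fake_task(item: int) -> dict:
    """Stand-in for a rollout; records how many tasks run at the same time."""
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    await asyncio.sleep(0.01)
    state["active"] -= 1
    return {"task_name": item, "passed": item % 2 == 0}


async def run_bounded(items, worker, max_concurrent: int) -> list:
    """Start every task up front, but let only max_concurrent run at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    tasks = [asyncio.ensure_future(guarded(item)) for item in items]
    results = []
    # Results arrive in completion order, which suits a live progress bar.
    for next_done in asyncio.as_completed(tasks):
        results.append(await next_done)
    return results


results = asyncio.run(run_bounded(list(range(6)), fake_task, max_concurrent=2))
passed_count = sum(1 for r in results if r["passed"])
```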
# =========================================================================
# Wandb logging
# =========================================================================
async def wandb_log(self, wandb_metrics: Optional[Dict] = None):
"""Log TB2-specific metrics to wandb."""
if wandb_metrics is None:
wandb_metrics = {}
# Add stored eval metrics
for metric_name, metric_value in self.eval_metrics:
wandb_metrics[metric_name] = metric_value
self.eval_metrics = []
await super().wandb_log(wandb_metrics)
if __name__ == "__main__":
TerminalBench2EvalEnv.cli()
+1
@@ -0,0 +1 @@
"""Built-in gateway hooks that are always registered."""
+86
@@ -0,0 +1,86 @@
"""Built-in boot-md hook — run ~/.hermes/BOOT.md on gateway startup.
This hook is always registered. It silently skips if no BOOT.md exists.
To activate, create ``~/.hermes/BOOT.md`` with instructions for the
agent to execute on every gateway restart.
Example BOOT.md::
# Startup Checklist
1. Check if any cron jobs failed overnight
2. Send a status update to Discord #general
3. If there are errors in /opt/app/deploy.log, summarize them
The agent runs in a background thread so it doesn't block gateway
startup. If nothing needs attention, it replies with [SILENT] to
suppress delivery.
"""
import logging
import os
import threading
from pathlib import Path
logger = logging.getLogger("hooks.boot-md")
HERMES_HOME = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes"))
BOOT_FILE = HERMES_HOME / "BOOT.md"
def _build_boot_prompt(content: str) -> str:
"""Wrap BOOT.md content in a system-level instruction."""
return (
"You are running a startup boot checklist. Follow the BOOT.md "
"instructions below exactly.\n\n"
"---\n"
f"{content}\n"
"---\n\n"
"Execute each instruction. If you need to send a message to a "
"platform, use the send_message tool.\n"
"If nothing needs attention and there is nothing to report, "
"reply with ONLY: [SILENT]"
)
def _run_boot_agent(content: str) -> None:
"""Spawn a one-shot agent session to execute the boot instructions."""
try:
from run_agent import AIAgent
prompt = _build_boot_prompt(content)
agent = AIAgent(
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
max_iterations=20,
)
result = agent.run_conversation(prompt)
response = result.get("final_response", "")
if response and "[SILENT]" not in response:
logger.info("boot-md completed: %s", response[:200])
else:
logger.info("boot-md completed (nothing to report)")
except Exception as e:
logger.error("boot-md agent failed: %s", e)
async def handle(event_type: str, context: dict) -> None:
"""Gateway startup handler — run BOOT.md if it exists."""
if not BOOT_FILE.exists():
return
content = BOOT_FILE.read_text(encoding="utf-8").strip()
if not content:
return
logger.info("Running BOOT.md (%d chars)", len(content))
# Run in a background thread so we don't block gateway startup.
thread = threading.Thread(
target=_run_boot_agent,
args=(content,),
name="boot-md",
daemon=True,
)
thread.start()
+134 -45
@@ -27,9 +27,16 @@ def _coerce_bool(value: Any, default: bool = True) -> bool:
return default
if isinstance(value, bool):
return value
if isinstance(value, int):
return value != 0
if isinstance(value, str):
return value.strip().lower() in ("true", "1", "yes", "on")
return bool(value)
lowered = value.strip().lower()
if lowered in ("true", "1", "yes", "on"):
return True
if lowered in ("false", "0", "no", "off"):
return False
return default
return default
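Because the hunk above interleaves removed and added lines, the resulting behavior is easier to see in a consolidated form. This sketch assumes the unseen line just before the hunk is an `if value is None:` check; `coerce_bool` is a stand-in name for the patched `_coerce_bool`.

```python
from typing import Any


def coerce_bool(value: Any, default: bool = True) -> bool:
    """Interpret config values as booleans; unrecognized input yields default."""
    if value is None:  # assumed from context; this line is not visible in the hunk
        return default
    if isinstance(value, bool):
        return value
    if isinstance(value, int):
        return value != 0
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered in ("true", "1", "yes", "on"):
            return True
        if lowered in ("false", "0", "no", "off"):
            return False
        return default
    return default


checks = [
    coerce_bool("off"),
    coerce_bool(" YES "),
    coerce_bool("maybe"),
    coerce_bool("maybe", default=False),
    coerce_bool(0),
    coerce_bool([]),
]
```

The visible change is that an unrecognized string such as `"maybe"` now returns the default instead of `False`, and values of other types no longer go through `bool(value)`.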
def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> str:
@@ -57,6 +64,8 @@ class Platform(Enum):
DINGTALK = "dingtalk"
API_SERVER = "api_server"
WEBHOOK = "webhook"
FEISHU = "feishu"
WECOM = "wecom"
@dataclass
@@ -274,6 +283,12 @@ class GatewayConfig:
# Webhook uses enabled flag only (secrets are per-route)
elif platform == Platform.WEBHOOK:
connected.append(platform)
# Feishu uses extra dict for app credentials
elif platform == Platform.FEISHU and config.extra.get("app_id"):
connected.append(platform)
# WeCom uses extra dict for bot credentials
elif platform == Platform.WECOM and config.extra.get("bot_id"):
connected.append(platform)
return connected
def get_home_channel(self, platform: Platform) -> Optional[HomeChannel]:
@@ -507,6 +522,10 @@ def load_gateway_config() -> GatewayConfig:
)
if "reply_prefix" in platform_cfg:
bridged["reply_prefix"] = platform_cfg["reply_prefix"]
if "require_mention" in platform_cfg:
bridged["require_mention"] = platform_cfg["require_mention"]
if "mention_patterns" in platform_cfg:
bridged["mention_patterns"] = platform_cfg["mention_patterns"]
if not bridged:
continue
plat_data = platforms_data.setdefault(plat.value, {})
@@ -531,6 +550,20 @@ def load_gateway_config() -> GatewayConfig:
os.environ["DISCORD_FREE_RESPONSE_CHANNELS"] = str(frc)
if "auto_thread" in discord_cfg and not os.getenv("DISCORD_AUTO_THREAD"):
os.environ["DISCORD_AUTO_THREAD"] = str(discord_cfg["auto_thread"]).lower()
# Telegram settings → env vars (env vars take precedence)
telegram_cfg = yaml_cfg.get("telegram", {})
if isinstance(telegram_cfg, dict):
if "require_mention" in telegram_cfg and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
os.environ["TELEGRAM_REQUIRE_MENTION"] = str(telegram_cfg["require_mention"]).lower()
if "mention_patterns" in telegram_cfg and not os.getenv("TELEGRAM_MENTION_PATTERNS"):
import json as _json
os.environ["TELEGRAM_MENTION_PATTERNS"] = _json.dumps(telegram_cfg["mention_patterns"])
frc = telegram_cfg.get("free_response_chats")
if frc is not None and not os.getenv("TELEGRAM_FREE_RESPONSE_CHATS"):
if isinstance(frc, list):
frc = ",".join(str(v) for v in frc)
os.environ["TELEGRAM_FREE_RESPONSE_CHATS"] = str(frc)
except Exception as e:
logger.warning(
"Failed to process config.yaml — falling back to .env / gateway.json values. "
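The Telegram bridging rule above, where a config value is copied into the environment only when the variable is not already set, can be sketched against a plain dict. `bridge_telegram_settings` is a hypothetical helper, and `environ` stands in for `os.environ` so the example has no side effects.

```python
import json


def bridge_telegram_settings(telegram_cfg: dict, environ: dict) -> None:
    """Copy Telegram config values into environ unless already set there."""
    if "require_mention" in telegram_cfg and not environ.get("TELEGRAM_REQUIRE_MENTION"):
        environ["TELEGRAM_REQUIRE_MENTION"] = str(telegram_cfg["require_mention"]).lower()
    if "mention_patterns" in telegram_cfg and not environ.get("TELEGRAM_MENTION_PATTERNS"):
        # Lists are serialized as JSON so the consumer can parse them back.
        environ["TELEGRAM_MENTION_PATTERNS"] = json.dumps(telegram_cfg["mention_patterns"])
    frc = telegram_cfg.get("free_response_chats")
    if frc is not None and not environ.get("TELEGRAM_FREE_RESPONSE_CHATS"):
        if isinstance(frc, list):
            frc = ",".join(str(v) for v in frc)
        environ["TELEGRAM_FREE_RESPONSE_CHATS"] = str(frc)


# The pre-existing env value wins over the config file.
env = {"TELEGRAM_REQUIRE_MENTION": "false"}
cfg = {
    "require_mention": True,
    "mention_patterns": ["@hermes"],
    "free_response_chats": [1, 2],
}
bridge_telegram_settings(cfg, env)
```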
@@ -601,6 +634,14 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.TELEGRAM] = PlatformConfig()
config.platforms[Platform.TELEGRAM].reply_to_mode = telegram_reply_mode
telegram_fallback_ips = os.getenv("TELEGRAM_FALLBACK_IPS", "")
if telegram_fallback_ips:
if Platform.TELEGRAM not in config.platforms:
config.platforms[Platform.TELEGRAM] = PlatformConfig()
config.platforms[Platform.TELEGRAM].extra["fallback_ips"] = [
ip.strip() for ip in telegram_fallback_ips.split(",") if ip.strip()
]
telegram_home = os.getenv("TELEGRAM_HOME_CHANNEL")
if telegram_home and Platform.TELEGRAM in config.platforms:
config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
@@ -639,14 +680,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.SLACK] = PlatformConfig()
config.platforms[Platform.SLACK].enabled = True
config.platforms[Platform.SLACK].token = slack_token
# Home channel
slack_home = os.getenv("SLACK_HOME_CHANNEL")
if slack_home:
config.platforms[Platform.SLACK].home_channel = HomeChannel(
platform=Platform.SLACK,
chat_id=slack_home,
name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
)
slack_home = os.getenv("SLACK_HOME_CHANNEL")
if slack_home and Platform.SLACK in config.platforms:
config.platforms[Platform.SLACK].home_channel = HomeChannel(
platform=Platform.SLACK,
chat_id=slack_home,
name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
)
# Signal
signal_url = os.getenv("SIGNAL_HTTP_URL")
@@ -660,13 +700,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
"account": signal_account,
"ignore_stories": os.getenv("SIGNAL_IGNORE_STORIES", "true").lower() in ("true", "1", "yes"),
})
signal_home = os.getenv("SIGNAL_HOME_CHANNEL")
if signal_home:
config.platforms[Platform.SIGNAL].home_channel = HomeChannel(
platform=Platform.SIGNAL,
chat_id=signal_home,
name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
)
signal_home = os.getenv("SIGNAL_HOME_CHANNEL")
if signal_home and Platform.SIGNAL in config.platforms:
config.platforms[Platform.SIGNAL].home_channel = HomeChannel(
platform=Platform.SIGNAL,
chat_id=signal_home,
name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
)
# Mattermost
mattermost_token = os.getenv("MATTERMOST_TOKEN")
@@ -679,13 +719,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.MATTERMOST].enabled = True
config.platforms[Platform.MATTERMOST].token = mattermost_token
config.platforms[Platform.MATTERMOST].extra["url"] = mattermost_url
mattermost_home = os.getenv("MATTERMOST_HOME_CHANNEL")
if mattermost_home:
config.platforms[Platform.MATTERMOST].home_channel = HomeChannel(
platform=Platform.MATTERMOST,
chat_id=mattermost_home,
name=os.getenv("MATTERMOST_HOME_CHANNEL_NAME", "Home"),
)
mattermost_home = os.getenv("MATTERMOST_HOME_CHANNEL")
if mattermost_home and Platform.MATTERMOST in config.platforms:
config.platforms[Platform.MATTERMOST].home_channel = HomeChannel(
platform=Platform.MATTERMOST,
chat_id=mattermost_home,
name=os.getenv("MATTERMOST_HOME_CHANNEL_NAME", "Home"),
)
# Matrix
matrix_token = os.getenv("MATRIX_ACCESS_TOKEN")
@@ -707,13 +747,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.MATRIX].extra["password"] = matrix_password
matrix_e2ee = os.getenv("MATRIX_ENCRYPTION", "").lower() in ("true", "1", "yes")
config.platforms[Platform.MATRIX].extra["encryption"] = matrix_e2ee
matrix_home = os.getenv("MATRIX_HOME_ROOM")
if matrix_home:
config.platforms[Platform.MATRIX].home_channel = HomeChannel(
platform=Platform.MATRIX,
chat_id=matrix_home,
name=os.getenv("MATRIX_HOME_ROOM_NAME", "Home"),
)
matrix_home = os.getenv("MATRIX_HOME_ROOM")
if matrix_home and Platform.MATRIX in config.platforms:
config.platforms[Platform.MATRIX].home_channel = HomeChannel(
platform=Platform.MATRIX,
chat_id=matrix_home,
name=os.getenv("MATRIX_HOME_ROOM_NAME", "Home"),
)
# Home Assistant
hass_token = os.getenv("HASS_TOKEN")
@@ -740,13 +780,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
"imap_host": email_imap,
"smtp_host": email_smtp,
})
email_home = os.getenv("EMAIL_HOME_ADDRESS")
if email_home:
config.platforms[Platform.EMAIL].home_channel = HomeChannel(
platform=Platform.EMAIL,
chat_id=email_home,
name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
)
email_home = os.getenv("EMAIL_HOME_ADDRESS")
if email_home and Platform.EMAIL in config.platforms:
config.platforms[Platform.EMAIL].home_channel = HomeChannel(
platform=Platform.EMAIL,
chat_id=email_home,
name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
)
# SMS (Twilio)
twilio_sid = os.getenv("TWILIO_ACCOUNT_SID")
@@ -755,13 +795,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.SMS] = PlatformConfig()
config.platforms[Platform.SMS].enabled = True
config.platforms[Platform.SMS].api_key = os.getenv("TWILIO_AUTH_TOKEN", "")
sms_home = os.getenv("SMS_HOME_CHANNEL")
if sms_home:
config.platforms[Platform.SMS].home_channel = HomeChannel(
platform=Platform.SMS,
chat_id=sms_home,
name=os.getenv("SMS_HOME_CHANNEL_NAME", "Home"),
)
sms_home = os.getenv("SMS_HOME_CHANNEL")
if sms_home and Platform.SMS in config.platforms:
config.platforms[Platform.SMS].home_channel = HomeChannel(
platform=Platform.SMS,
chat_id=sms_home,
name=os.getenv("SMS_HOME_CHANNEL_NAME", "Home"),
)
# API Server
api_server_enabled = os.getenv("API_SERVER_ENABLED", "").lower() in ("true", "1", "yes")
@@ -803,6 +843,55 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
if webhook_secret:
config.platforms[Platform.WEBHOOK].extra["secret"] = webhook_secret
# Feishu / Lark
feishu_app_id = os.getenv("FEISHU_APP_ID")
feishu_app_secret = os.getenv("FEISHU_APP_SECRET")
if feishu_app_id and feishu_app_secret:
if Platform.FEISHU not in config.platforms:
config.platforms[Platform.FEISHU] = PlatformConfig()
config.platforms[Platform.FEISHU].enabled = True
config.platforms[Platform.FEISHU].extra.update({
"app_id": feishu_app_id,
"app_secret": feishu_app_secret,
"domain": os.getenv("FEISHU_DOMAIN", "feishu"),
"connection_mode": os.getenv("FEISHU_CONNECTION_MODE", "websocket"),
})
feishu_encrypt_key = os.getenv("FEISHU_ENCRYPT_KEY", "")
if feishu_encrypt_key:
config.platforms[Platform.FEISHU].extra["encrypt_key"] = feishu_encrypt_key
feishu_verification_token = os.getenv("FEISHU_VERIFICATION_TOKEN", "")
if feishu_verification_token:
config.platforms[Platform.FEISHU].extra["verification_token"] = feishu_verification_token
feishu_home = os.getenv("FEISHU_HOME_CHANNEL")
if feishu_home:
config.platforms[Platform.FEISHU].home_channel = HomeChannel(
platform=Platform.FEISHU,
chat_id=feishu_home,
name=os.getenv("FEISHU_HOME_CHANNEL_NAME", "Home"),
)
# WeCom (Enterprise WeChat)
wecom_bot_id = os.getenv("WECOM_BOT_ID")
wecom_secret = os.getenv("WECOM_SECRET")
if wecom_bot_id and wecom_secret:
if Platform.WECOM not in config.platforms:
config.platforms[Platform.WECOM] = PlatformConfig()
config.platforms[Platform.WECOM].enabled = True
config.platforms[Platform.WECOM].extra.update({
"bot_id": wecom_bot_id,
"secret": wecom_secret,
})
wecom_ws_url = os.getenv("WECOM_WEBSOCKET_URL", "")
if wecom_ws_url:
config.platforms[Platform.WECOM].extra["websocket_url"] = wecom_ws_url
wecom_home = os.getenv("WECOM_HOME_CHANNEL")
if wecom_home:
config.platforms[Platform.WECOM].home_channel = HomeChannel(
platform=Platform.WECOM,
chat_id=wecom_home,
name=os.getenv("WECOM_HOME_CHANNEL_NAME", "Home"),
)
# Session settings
idle_minutes = os.getenv("SESSION_IDLE_MINUTES")
if idle_minutes:
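Review note: the home-channel hunks above all add the same guard (`... and Platform.X in config.platforms`), presumably so that a home-channel variable set on its own no longer indexes a platform that was never configured. A minimal sketch of that pattern follows. The `apply_home_channel` helper and the stand-in `Platform`, `HomeChannel`, `PlatformConfig` and `GatewayConfig` types are hypothetical simplifications, not the project's real classes.

```python
import os
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Mapping, Optional


class Platform(Enum):  # stand-in for gateway.config.Platform (assumption)
    SIGNAL = "signal"
    SMS = "sms"


@dataclass
class HomeChannel:  # simplified stand-in
    platform: Platform
    chat_id: str
    name: str = "Home"


@dataclass
class PlatformConfig:  # simplified stand-in
    enabled: bool = False
    home_channel: Optional[HomeChannel] = None


@dataclass
class GatewayConfig:  # simplified stand-in
    platforms: Dict[Platform, PlatformConfig] = field(default_factory=dict)


def apply_home_channel(
    config: GatewayConfig,
    platform: Platform,
    env_var: str,
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Hypothetical helper: set the home channel only for configured platforms.

    Returns True when a home channel was applied, False when the variable is
    unset or the platform has no entry in config.platforms.
    """
    env = os.environ if env is None else env
    chat_id = env.get(env_var)
    # Same shape as the guard in the diff: value present AND platform configured.
    if not chat_id or platform not in config.platforms:
        return False
    config.platforms[platform].home_channel = HomeChannel(
        platform=platform,
        chat_id=chat_id,
        name=env.get(f"{env_var}_NAME", "Home"),
    )
    return True
```

With a plain dict for `platforms`, dropping the membership check would raise `KeyError` for the unconfigured platform, which is the failure the guard appears to prevent.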
+19
@@ -51,14 +51,33 @@ class HookRegistry:
"""Return metadata about all loaded hooks."""
return list(self._loaded_hooks)
def _register_builtin_hooks(self) -> None:
"""Register built-in hooks that are always active."""
try:
from gateway.builtin_hooks.boot_md import handle as boot_md_handle
self._handlers.setdefault("gateway:startup", []).append(boot_md_handle)
self._loaded_hooks.append({
"name": "boot-md",
"description": "Run ~/.hermes/BOOT.md on gateway startup",
"events": ["gateway:startup"],
"path": "(builtin)",
})
except Exception as e:
print(f"[hooks] Could not load built-in boot-md hook: {e}", flush=True)
def discover_and_load(self) -> None:
"""
Scan the hooks directory for hook directories and load their handlers.
Also registers built-in hooks that are always active.
Each hook directory must contain:
- HOOK.yaml with at least 'name' and 'events' keys
- handler.py with a top-level 'handle' function (sync or async)
"""
self._register_builtin_hooks()
if not HOOKS_DIR.exists():
return
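Review note: the built-in hook above is registered with `setdefault(...).append(...)`, and the docstring says handlers may be sync or async. A minimal sketch of a registry with that shape is below. `MiniHookRegistry`, its `register` and `emit` methods, and the `(event, context)` handler signature are assumptions for illustration; the real `HookRegistry` API is not shown in this diff.

```python
import asyncio
import inspect
from typing import Any, Callable, Dict, List


class MiniHookRegistry:
    """Hypothetical, simplified registry; not the project's HookRegistry."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[..., Any]]] = {}
        self._loaded_hooks: List[dict] = []

    def register(self, name: str, events: List[str], handler: Callable[..., Any],
                 path: str = "(builtin)") -> None:
        # One handler can subscribe to several events.
        for event in events:
            self._handlers.setdefault(event, []).append(handler)
        self._loaded_hooks.append({"name": name, "events": list(events), "path": path})

    async def emit(self, event: str, context: dict) -> List[Any]:
        """Call every handler for an event, awaiting the async ones."""
        results = []
        for handler in self._handlers.get(event, []):
            try:
                outcome = handler(event, context)
                if inspect.isawaitable(outcome):
                    outcome = await outcome
                results.append(outcome)
            except Exception as exc:
                # A failing hook should not stop the remaining hooks.
                print(f"[hooks] handler failed for {event}: {exc}", flush=True)
        return results
```

Checking the return value with `inspect.isawaitable` lets one dispatch loop serve both plain functions and coroutines.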
+2 -2
@@ -25,7 +25,7 @@ import time
from pathlib import Path
from typing import Optional
from hermes_cli.config import get_hermes_home
from hermes_constants import get_hermes_dir
# Unambiguous alphabet -- excludes 0/O, 1/I to prevent confusion
@@ -41,7 +41,7 @@ LOCKOUT_SECONDS = 3600 # Lockout duration after too many failures
MAX_PENDING_PER_PLATFORM = 3 # Max pending codes per platform
MAX_FAILED_ATTEMPTS = 5 # Failed approvals before lockout
PAIRING_DIR = get_hermes_home() / "pairing"
PAIRING_DIR = get_hermes_dir("platforms/pairing", "pairing")
def _secure_write(path: Path, data: str) -> None:
+147 -71
@@ -166,7 +166,7 @@ class ResponseStore:
_CORS_HEADERS = {
"Access-Control-Allow-Methods": "GET, POST, DELETE, OPTIONS",
"Access-Control-Allow-Headers": "Authorization, Content-Type",
"Access-Control-Allow-Headers": "Authorization, Content-Type, Idempotency-Key",
}
@@ -223,6 +223,23 @@ if AIOHTTP_AVAILABLE:
else:
body_limit_middleware = None # type: ignore[assignment]
_SECURITY_HEADERS = {
"X-Content-Type-Options": "nosniff",
"Referrer-Policy": "no-referrer",
}
if AIOHTTP_AVAILABLE:
@web.middleware
async def security_headers_middleware(request, handler):
"""Add security headers to all responses (including errors)."""
response = await handler(request)
for k, v in _SECURITY_HEADERS.items():
response.headers.setdefault(k, v)
return response
else:
security_headers_middleware = None # type: ignore[assignment]
class _IdempotencyCache:
"""In-memory idempotency cache with TTL and basic LRU semantics."""
@@ -307,6 +324,7 @@ class APIServerAdapter(BasePlatformAdapter):
if "*" in self._cors_origins:
headers = dict(_CORS_HEADERS)
headers["Access-Control-Allow-Origin"] = "*"
headers["Access-Control-Max-Age"] = "600"
return headers
if origin not in self._cors_origins:
@@ -315,6 +333,7 @@ class APIServerAdapter(BasePlatformAdapter):
headers = dict(_CORS_HEADERS)
headers["Access-Control-Allow-Origin"] = origin
headers["Vary"] = "Origin"
headers["Access-Control-Max-Age"] = "600"
return headers
def _origin_allowed(self, origin: str) -> bool:
@@ -366,14 +385,20 @@ class APIServerAdapter(BasePlatformAdapter):
Create an AIAgent instance using the gateway's runtime config.
Uses _resolve_runtime_agent_kwargs() to pick up model, api_key,
base_url, etc. from config.yaml / env vars.
base_url, etc. from config.yaml / env vars. Toolsets are resolved
from config.yaml platform_toolsets.api_server (same as all other
gateway platforms), falling back to the hermes-api-server default.
"""
from run_agent import AIAgent
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config
from hermes_cli.tools_config import _get_platform_tools
runtime_kwargs = _resolve_runtime_agent_kwargs()
model = _resolve_gateway_model()
user_config = _load_gateway_config()
enabled_toolsets = sorted(_get_platform_tools(user_config, "api_server"))
max_iterations = int(os.getenv("HERMES_MAX_ITERATIONS", "90"))
agent = AIAgent(
@@ -383,7 +408,7 @@ class APIServerAdapter(BasePlatformAdapter):
quiet_mode=True,
verbose_logging=False,
ephemeral_system_prompt=ephemeral_system_prompt or None,
enabled_toolsets=["hermes-api-server"],
enabled_toolsets=enabled_toolsets,
session_id=session_id,
platform="api_server",
stream_delta_callback=stream_delta_callback,
@@ -489,17 +514,21 @@ class APIServerAdapter(BasePlatformAdapter):
if delta is not None:
_stream_q.put(delta)
# Start agent in background
# Start agent in background. agent_ref is a mutable container
# so the SSE writer can interrupt the agent on client disconnect.
agent_ref = [None]
agent_task = asyncio.ensure_future(self._run_agent(
user_message=user_message,
conversation_history=history,
ephemeral_system_prompt=system_prompt,
session_id=session_id,
stream_delta_callback=_on_delta,
agent_ref=agent_ref,
))
return await self._write_sse_chat_completion(
request, completion_id, model_name, created, _stream_q, agent_task
request, completion_id, model_name, created, _stream_q,
agent_task, agent_ref,
)
# Non-streaming: run the agent (with optional Idempotency-Key)
@@ -562,80 +591,107 @@ class APIServerAdapter(BasePlatformAdapter):
async def _write_sse_chat_completion(
self, request: "web.Request", completion_id: str, model: str,
created: int, stream_q, agent_task,
created: int, stream_q, agent_task, agent_ref=None,
) -> "web.StreamResponse":
"""Write real streaming SSE from agent's stream_delta_callback queue."""
"""Write real streaming SSE from agent's stream_delta_callback queue.
If the client disconnects mid-stream (network drop, browser tab close),
the agent is interrupted via ``agent.interrupt()`` so it stops making
LLM API calls, and the asyncio task wrapper is cancelled.
"""
import queue as _q
response = web.StreamResponse(
status=200,
headers={"Content-Type": "text/event-stream", "Cache-Control": "no-cache"},
)
sse_headers = {"Content-Type": "text/event-stream", "Cache-Control": "no-cache"}
# CORS middleware can't inject headers into StreamResponse after
# prepare() flushes them, so resolve CORS headers up front.
origin = request.headers.get("Origin", "")
cors = self._cors_headers_for_origin(origin) if origin else None
if cors:
sse_headers.update(cors)
response = web.StreamResponse(status=200, headers=sse_headers)
await response.prepare(request)
# Role chunk
role_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(role_chunk)}\n\n".encode())
# Stream content chunks as they arrive from the agent
loop = asyncio.get_event_loop()
while True:
try:
delta = await loop.run_in_executor(None, lambda: stream_q.get(timeout=0.5))
except _q.Empty:
if agent_task.done():
# Drain any remaining items
while True:
try:
delta = stream_q.get_nowait()
if delta is None:
break
content_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
except _q.Empty:
break
break
continue
if delta is None: # End of stream sentinel
break
content_chunk = {
try:
# Role chunk
role_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
"choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
await response.write(f"data: {json.dumps(role_chunk)}\n\n".encode())
# Get usage from completed agent
usage = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
try:
result, agent_usage = await agent_task
usage = agent_usage or usage
except Exception:
pass
# Stream content chunks as they arrive from the agent
loop = asyncio.get_event_loop()
while True:
try:
delta = await loop.run_in_executor(None, lambda: stream_q.get(timeout=0.5))
except _q.Empty:
if agent_task.done():
# Drain any remaining items
while True:
try:
delta = stream_q.get_nowait()
if delta is None:
break
content_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
except _q.Empty:
break
break
continue
# Finish chunk
finish_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
"usage": {
"prompt_tokens": usage.get("input_tokens", 0),
"completion_tokens": usage.get("output_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
},
}
await response.write(f"data: {json.dumps(finish_chunk)}\n\n".encode())
await response.write(b"data: [DONE]\n\n")
if delta is None: # End of stream sentinel
break
content_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
# Get usage from completed agent
usage = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
try:
result, agent_usage = await agent_task
usage = agent_usage or usage
except Exception:
pass
# Finish chunk
finish_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
"usage": {
"prompt_tokens": usage.get("input_tokens", 0),
"completion_tokens": usage.get("output_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
},
}
await response.write(f"data: {json.dumps(finish_chunk)}\n\n".encode())
await response.write(b"data: [DONE]\n\n")
except (ConnectionResetError, ConnectionAbortedError, BrokenPipeError, OSError):
# Client disconnected mid-stream. Interrupt the agent so it
# stops making LLM API calls at the next loop iteration, then
# cancel the asyncio task wrapper.
agent = agent_ref[0] if agent_ref else None
if agent is not None:
try:
agent.interrupt("SSE client disconnected")
except Exception:
pass
if not agent_task.done():
agent_task.cancel()
try:
await agent_task
except (asyncio.CancelledError, Exception):
pass
logger.info("SSE client disconnected; interrupted agent task %s", completion_id)
return response
@@ -1138,12 +1194,18 @@ class APIServerAdapter(BasePlatformAdapter):
ephemeral_system_prompt: Optional[str] = None,
session_id: Optional[str] = None,
stream_delta_callback=None,
agent_ref: Optional[list] = None,
) -> tuple:
"""
Create an agent and run a conversation in a thread executor.
Returns ``(result_dict, usage_dict)`` where *usage_dict* contains
``input_tokens``, ``output_tokens`` and ``total_tokens``.
If *agent_ref* is a one-element list, the AIAgent instance is stored
at ``agent_ref[0]`` before ``run_conversation`` begins. This allows
callers (e.g. the SSE writer) to call ``agent.interrupt()`` from
another thread to stop in-progress LLM calls.
"""
loop = asyncio.get_event_loop()
@@ -1153,6 +1215,8 @@ class APIServerAdapter(BasePlatformAdapter):
session_id=session_id,
stream_delta_callback=stream_delta_callback,
)
if agent_ref is not None:
agent_ref[0] = agent
result = agent.run_conversation(
user_message=user_message,
conversation_history=conversation_history,
@@ -1177,10 +1241,11 @@ class APIServerAdapter(BasePlatformAdapter):
return False
try:
mws = [mw for mw in (cors_middleware, body_limit_middleware) if mw is not None]
mws = [mw for mw in (cors_middleware, body_limit_middleware, security_headers_middleware) if mw is not None]
self._app = web.Application(middlewares=mws)
self._app["api_server_adapter"] = self
self._app.router.add_get("/health", self._handle_health)
self._app.router.add_get("/v1/health", self._handle_health)
self._app.router.add_get("/v1/models", self._handle_models)
self._app.router.add_post("/v1/chat/completions", self._handle_chat_completions)
self._app.router.add_post("/v1/responses", self._handle_responses)
@@ -1196,6 +1261,17 @@ class APIServerAdapter(BasePlatformAdapter):
self._app.router.add_post("/api/jobs/{job_id}/resume", self._handle_resume_job)
self._app.router.add_post("/api/jobs/{job_id}/run", self._handle_run_job)
# Port conflict detection — fail fast if port is already in use
import socket as _socket
try:
with _socket.socket(_socket.AF_INET, _socket.SOCK_STREAM) as _s:
_s.settimeout(1)
_s.connect(('127.0.0.1', self._port))
logger.error('[%s] Port %d already in use. Set a different port in config.yaml: platforms.api_server.port', self.name, self._port)
return False
except (ConnectionRefusedError, OSError):
pass # port is free
self._runner = web.AppRunner(self._app)
await self._runner.setup()
self._site = web.TCPSite(self._runner, self._host, self._port)
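Review note: the `agent_ref` hand-off in this file relies on a one-element list that the executor thread fills in and the SSE writer reads. Cancelling an asyncio task generally does not stop a function already running in an executor thread, which is presumably why the diff calls `agent.interrupt()` as well. The sketch below shows the pattern in isolation. `FakeAgent` and `run_agent` are hypothetical stand-ins, not the real `AIAgent` or `_run_agent`.

```python
import asyncio
import threading
from typing import List, Optional


class FakeAgent:
    """Stand-in for AIAgent; models only a cooperative interrupt()."""

    def __init__(self) -> None:
        self._stop = threading.Event()
        self.interrupt_reason: Optional[str] = None

    def interrupt(self, reason: str) -> None:
        self.interrupt_reason = reason
        self._stop.set()

    def run_conversation(self) -> dict:
        # Simulated work loop that checks the interrupt flag between steps.
        for step in range(500):
            if self._stop.wait(timeout=0.01):
                return {"completed": False, "steps": step}
        return {"completed": True, "steps": 500}


async def run_agent(agent_ref: Optional[List[Optional[FakeAgent]]] = None) -> dict:
    loop = asyncio.get_running_loop()

    def _run() -> dict:
        agent = FakeAgent()
        if agent_ref is not None:
            agent_ref[0] = agent  # publish the instance before the work starts
        return agent.run_conversation()

    return await loop.run_in_executor(None, _run)


async def demo() -> tuple:
    agent_ref: List[Optional[FakeAgent]] = [None]
    task = asyncio.ensure_future(run_agent(agent_ref))
    # Wait until the worker thread has published the agent.
    while agent_ref[0] is None:
        await asyncio.sleep(0.01)
    # Simulate the "client disconnected" branch of the SSE writer.
    agent_ref[0].interrupt("SSE client disconnected")
    result = await task
    return result, agent_ref[0].interrupt_reason
```

A mutable container is used because the agent is created inside the worker thread, after the caller has already received the task object.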
+226 -48
@@ -8,6 +8,7 @@ and implement the required methods.
import asyncio
import logging
import os
import random
import re
import uuid
from abc import ABC, abstractmethod
@@ -26,6 +27,7 @@ sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
from gateway.session import SessionSource, build_session_key
from hermes_cli.config import get_hermes_home
from hermes_constants import get_hermes_dir
GATEWAY_SECRET_CAPTURE_UNSUPPORTED_MESSAGE = (
@@ -43,8 +45,8 @@ GATEWAY_SECRET_CAPTURE_UNSUPPORTED_MESSAGE = (
# (e.g. Telegram file URLs expire after ~1 hour).
# ---------------------------------------------------------------------------
# Default location: {HERMES_HOME}/image_cache/
IMAGE_CACHE_DIR = get_hermes_home() / "image_cache"
# Default location: {HERMES_HOME}/cache/images/ (legacy: image_cache/)
IMAGE_CACHE_DIR = get_hermes_dir("cache/images", "image_cache")
def get_image_cache_dir() -> Path:
@@ -71,31 +73,51 @@ def cache_image_from_bytes(data: bytes, ext: str = ".jpg") -> str:
return str(filepath)
async def cache_image_from_url(url: str, ext: str = ".jpg") -> str:
async def cache_image_from_url(url: str, ext: str = ".jpg", retries: int = 2) -> str:
"""
Download an image from a URL and save it to the local cache.
Uses httpx for async download with a reasonable timeout.
Retries on transient failures (timeouts, 429, 5xx) with linearly increasing
backoff so a single slow CDN response doesn't lose the media.
Args:
url: The HTTP/HTTPS URL to download from.
ext: File extension including the dot (e.g. ".jpg", ".png").
retries: Number of retry attempts on transient failures.
Returns:
Absolute path to the cached image file as a string.
"""
import asyncio
import httpx
import logging as _logging
_log = _logging.getLogger(__name__)
last_exc = None
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(
url,
headers={
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
"Accept": "image/*,*/*;q=0.8",
},
)
response.raise_for_status()
return cache_image_from_bytes(response.content, ext)
for attempt in range(retries + 1):
try:
response = await client.get(
url,
headers={
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
"Accept": "image/*,*/*;q=0.8",
},
)
response.raise_for_status()
return cache_image_from_bytes(response.content, ext)
except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
last_exc = exc
if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
raise
if attempt < retries:
wait = 1.5 * (attempt + 1)
_log.debug("Media cache retry %d/%d for %s (%.1fs): %s",
attempt + 1, retries, url[:80], wait, exc)
await asyncio.sleep(wait)
continue
raise
raise last_exc
def cleanup_image_cache(max_age_hours: int = 24) -> int:
@@ -126,7 +148,7 @@ def cleanup_image_cache(max_age_hours: int = 24) -> int:
# here so the STT tool (OpenAI Whisper) can transcribe them from local files.
# ---------------------------------------------------------------------------
AUDIO_CACHE_DIR = get_hermes_home() / "audio_cache"
AUDIO_CACHE_DIR = get_hermes_dir("cache/audio", "audio_cache")
def get_audio_cache_dir() -> Path:
@@ -153,29 +175,51 @@ def cache_audio_from_bytes(data: bytes, ext: str = ".ogg") -> str:
return str(filepath)
async def cache_audio_from_url(url: str, ext: str = ".ogg") -> str:
async def cache_audio_from_url(url: str, ext: str = ".ogg", retries: int = 2) -> str:
"""
Download an audio file from a URL and save it to the local cache.
Retries on transient failures (timeouts, 429, 5xx) with linearly increasing
backoff so a single slow CDN response doesn't lose the media.
Args:
url: The HTTP/HTTPS URL to download from.
ext: File extension including the dot (e.g. ".ogg", ".mp3").
retries: Number of retry attempts on transient failures.
Returns:
Absolute path to the cached audio file as a string.
"""
import asyncio
import httpx
import logging as _logging
_log = _logging.getLogger(__name__)
last_exc = None
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(
url,
headers={
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
"Accept": "audio/*,*/*;q=0.8",
},
)
response.raise_for_status()
return cache_audio_from_bytes(response.content, ext)
for attempt in range(retries + 1):
try:
response = await client.get(
url,
headers={
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
"Accept": "audio/*,*/*;q=0.8",
},
)
response.raise_for_status()
return cache_audio_from_bytes(response.content, ext)
except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
last_exc = exc
if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
raise
if attempt < retries:
wait = 1.5 * (attempt + 1)
_log.debug("Audio cache retry %d/%d for %s (%.1fs): %s",
attempt + 1, retries, url[:80], wait, exc)
await asyncio.sleep(wait)
continue
raise
raise last_exc
# ---------------------------------------------------------------------------
@@ -185,7 +229,7 @@ async def cache_audio_from_url(url: str, ext: str = ".ogg") -> str:
# here so the agent can reference them by local file path.
# ---------------------------------------------------------------------------
DOCUMENT_CACHE_DIR = get_hermes_home() / "document_cache"
DOCUMENT_CACHE_DIR = get_hermes_dir("cache/documents", "document_cache")
SUPPORTED_DOCUMENT_TYPES = {
".pdf": "application/pdf",
@@ -312,7 +356,10 @@ class MessageEvent:
return None
# Split on space and get first word, strip the /
parts = self.text.split(maxsplit=1)
return parts[0][1:].lower() if parts else None
raw = parts[0][1:].lower() if parts else None
if raw and "@" in raw:
raw = raw.split("@", 1)[0]
return raw
def get_command_args(self) -> str:
"""Get the arguments after a command."""
@@ -329,6 +376,24 @@ class SendResult:
message_id: Optional[str] = None
error: Optional[str] = None
raw_response: Any = None
retryable: bool = False # True for transient errors (network, timeout) — base will retry automatically
# Error substrings that indicate a transient network failure worth retrying
_RETRYABLE_ERROR_PATTERNS = (
"connecterror",
"connectionerror",
"connectionreset",
"connectionrefused",
"timeout",
"timed out",
"network",
"broken pipe",
"remotedisconnected",
"eoferror",
"readtimeout",
"writetimeout",
)
# Type for message handlers
@@ -833,6 +898,111 @@ class BasePlatformAdapter(ABC):
except Exception:
pass
# ── Processing lifecycle hooks ──────────────────────────────────────────
# Subclasses override these to react to message processing events
# (e.g. Discord adds 👀/✅/❌ reactions).
async def on_processing_start(self, event: MessageEvent) -> None:
"""Hook called when background processing begins."""
async def on_processing_complete(self, event: MessageEvent, success: bool) -> None:
"""Hook called when background processing completes."""
async def _run_processing_hook(self, hook_name: str, *args: Any, **kwargs: Any) -> None:
"""Run a lifecycle hook without letting failures break message flow."""
hook = getattr(self, hook_name, None)
if not callable(hook):
return
try:
await hook(*args, **kwargs)
except Exception as e:
logger.warning("[%s] %s hook failed: %s", self.name, hook_name, e)
@staticmethod
def _is_retryable_error(error: Optional[str]) -> bool:
"""Return True if the error string looks like a transient network failure."""
if not error:
return False
lowered = error.lower()
return any(pat in lowered for pat in _RETRYABLE_ERROR_PATTERNS)
async def _send_with_retry(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Any = None,
max_retries: int = 2,
base_delay: float = 2.0,
) -> "SendResult":
"""
Send a message with automatic retry for transient network errors.
On permanent failures (e.g. formatting / permission errors) falls back
to a plain-text version before giving up. If all attempts fail due to
network errors, sends the user a brief delivery-failure notice so they
know to retry rather than waiting indefinitely.
"""
result = await self.send(
chat_id=chat_id,
content=content,
reply_to=reply_to,
metadata=metadata,
)
if result.success:
return result
error_str = result.error or ""
is_network = result.retryable or self._is_retryable_error(error_str)
if is_network:
# Retry with exponential backoff for transient errors
for attempt in range(1, max_retries + 1):
delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 1)
logger.warning(
"[%s] Send failed (attempt %d/%d, retrying in %.1fs): %s",
self.name, attempt, max_retries, delay, error_str,
)
await asyncio.sleep(delay)
result = await self.send(
chat_id=chat_id,
content=content,
reply_to=reply_to,
metadata=metadata,
)
if result.success:
logger.info("[%s] Send succeeded on retry %d", self.name, attempt)
return result
error_str = result.error or ""
if not (result.retryable or self._is_retryable_error(error_str)):
break # error switched to non-transient — fall through to plain-text fallback
else:
# All retries exhausted (loop completed without break) — notify user
logger.error("[%s] Failed to deliver response after %d retries: %s", self.name, max_retries, error_str)
notice = (
"\u26a0\ufe0f Message delivery failed after multiple attempts. "
"Please try again \u2014 your request was processed but the response could not be sent."
)
try:
await self.send(chat_id=chat_id, content=notice, reply_to=reply_to, metadata=metadata)
except Exception as notify_err:
logger.debug("[%s] Could not send delivery-failure notice: %s", self.name, notify_err)
return result
# Non-network / post-retry formatting failure: try plain text as fallback
logger.warning("[%s] Send failed: %s — trying plain-text fallback", self.name, error_str)
fallback_result = await self.send(
chat_id=chat_id,
content=f"(Response formatting failed, plain text:)\n\n{content[:3500]}",
reply_to=reply_to,
metadata=metadata,
)
if not fallback_result.success:
logger.error("[%s] Fallback send also failed: %s", self.name, fallback_result.error)
return fallback_result
async def handle_message(self, event: MessageEvent) -> None:
"""
Process an incoming message.
@@ -855,7 +1025,7 @@ class BasePlatformAdapter(ABC):
# simultaneous messages. Queue them without interrupting the active run,
# then process them immediately after the current task finishes.
if event.message_type == MessageType.PHOTO:
print(f"[{self.name}] 🖼️ Queuing photo follow-up for session {session_key} without interrupt")
logger.debug("[%s] Queuing photo follow-up for session %s without interrupt", self.name, session_key)
existing = self._pending_messages.get(session_key)
if existing and existing.message_type == MessageType.PHOTO:
existing.media_urls.extend(event.media_urls)
@@ -870,7 +1040,7 @@ class BasePlatformAdapter(ABC):
return # Don't interrupt now - will run after current task completes
# Default behavior for non-photo follow-ups: interrupt the running agent
print(f"[{self.name}] New message while session {session_key} is active - triggering interrupt")
logger.debug("[%s] New message while session %s is active, triggering interrupt", self.name, session_key)
self._pending_messages[session_key] = event
# Signal the interrupt (the processing task checks this)
self._active_sessions[session_key].set()
@@ -910,6 +1080,18 @@ class BasePlatformAdapter(ABC):
async def _process_message_background(self, event: MessageEvent, session_key: str) -> None:
"""Background task that actually processes the message."""
# Track delivery outcomes for the processing-complete hook
delivery_attempted = False
delivery_succeeded = False
def _record_delivery(result):
nonlocal delivery_attempted, delivery_succeeded
if result is None:
return
delivery_attempted = True
if getattr(result, "success", False):
delivery_succeeded = True
# Create interrupt event for this session
interrupt_event = asyncio.Event()
self._active_sessions[session_key] = interrupt_event
@@ -919,6 +1101,8 @@ class BasePlatformAdapter(ABC):
typing_task = asyncio.create_task(self._keep_typing(event.source.chat_id, metadata=_thread_metadata))
try:
await self._run_processing_hook("on_processing_start", event)
# Call the handler (this can take a while with tool calls)
response = await self._message_handler(event)
@@ -982,25 +1166,13 @@ class BasePlatformAdapter(ABC):
# Send the text portion
if text_content:
logger.info("[%s] Sending response (%d chars) to %s", self.name, len(text_content), event.source.chat_id)
result = await self.send(
result = await self._send_with_retry(
chat_id=event.source.chat_id,
content=text_content,
reply_to=event.message_id,
metadata=_thread_metadata,
)
# Log send failures (don't raise - user already saw tool progress)
if not result.success:
print(f"[{self.name}] Failed to send response: {result.error}")
# Try sending without markdown as fallback
fallback_result = await self.send(
chat_id=event.source.chat_id,
content=f"(Response formatting failed, plain text:)\n\n{text_content[:3500]}",
reply_to=event.message_id,
metadata=_thread_metadata,
)
if not fallback_result.success:
print(f"[{self.name}] Fallback send also failed: {fallback_result.error}")
_record_delivery(result)
# Human-like pacing delay between text and media
human_delay = self._get_human_delay()
@@ -1069,9 +1241,9 @@ class BasePlatformAdapter(ABC):
)
if not media_result.success:
print(f"[{self.name}] Failed to send media ({ext}): {media_result.error}")
logger.warning("[%s] Failed to send media (%s): %s", self.name, ext, media_result.error)
except Exception as media_err:
print(f"[{self.name}] Error sending media: {media_err}")
logger.warning("[%s] Error sending media: %s", self.name, media_err)
# Send auto-detected local files as native attachments
for file_path in local_files:
@@ -1100,10 +1272,14 @@ class BasePlatformAdapter(ABC):
except Exception as file_err:
logger.error("[%s] Error sending local file %s: %s", self.name, file_path, file_err)
# Determine overall success for the processing hook
processing_ok = delivery_succeeded if delivery_attempted else not bool(response)
await self._run_processing_hook("on_processing_complete", event, processing_ok)
# Check if there's a pending message that was queued during our processing
if session_key in self._pending_messages:
pending_event = self._pending_messages.pop(session_key)
print(f"[{self.name}] 📨 Processing queued message from interrupt")
logger.debug("[%s] Processing queued message from interrupt", self.name)
# Clean up current session before processing pending
if session_key in self._active_sessions:
del self._active_sessions[session_key]
@@ -1116,10 +1292,12 @@ class BasePlatformAdapter(ABC):
await self._process_message_background(pending_event, session_key)
return # Already cleaned up
except asyncio.CancelledError:
await self._run_processing_hook("on_processing_complete", event, False)
raise
except Exception as e:
print(f"[{self.name}] Error handling message: {e}")
import traceback
traceback.print_exc()
await self._run_processing_hook("on_processing_complete", event, False)
logger.error("[%s] Error handling message: %s", self.name, e, exc_info=True)
# Send the error to the user so they aren't left with radio silence
try:
error_type = type(e).__name__
+93 -12
@@ -486,6 +486,17 @@ class DiscordAdapter(BasePlatformAdapter):
return False
try:
# Acquire scoped lock to prevent duplicate bot token usage
from gateway.status import acquire_scoped_lock
self._token_lock_identity = self.config.token
acquired, existing = acquire_scoped_lock('discord-bot-token', self._token_lock_identity, metadata={'platform': 'discord'})
if not acquired:
owner_pid = existing.get('pid') if isinstance(existing, dict) else None
message = f'Discord bot token already in use' + (f' (PID {owner_pid})' if owner_pid else '') + '. Stop the other gateway first.'
logger.error('[%s] %s', self.name, message)
self._set_fatal_error('discord_token_lock', message, retryable=False)
return False
# Set up intents -- members intent needed for username-to-ID resolution
intents = Intents.default()
intents.message_content = True
@@ -550,6 +561,22 @@ class DiscordAdapter(BasePlatformAdapter):
return
# "all" falls through to handle_message
# If the message @mentions other users but NOT the bot, the
# sender is talking to someone else — stay silent. Only
# applies in server channels; in DMs the user is always
# talking to the bot (mentions are just references).
# Controlled by DISCORD_IGNORE_NO_MENTION (default: true).
_ignore_no_mention = os.getenv(
"DISCORD_IGNORE_NO_MENTION", "true"
).lower() in ("true", "1", "yes")
if _ignore_no_mention and message.mentions and not isinstance(message.channel, discord.DMChannel):
_bot_mentioned = (
self._client.user is not None
and self._client.user in message.mentions
)
if not _bot_mentioned:
return # Talking to someone else, don't interrupt
await self._handle_message(message)
@self._client.event
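Reviewer note: the mention gate added in this hunk can be reduced to a pure predicate. The sketch below is an assumption-laden restatement, not the adapter's code: mentions are modelled as plain ID strings and `should_ignore` is a hypothetical name. It follows the same rules as the diff: DMs are never ignored, messages without mentions are never ignored, and a message that mentions someone is ignored unless the bot is among the mentioned users.

```python
from typing import List, Optional


def should_ignore(mentions: List[str], bot_id: Optional[str], is_dm: bool,
                  ignore_no_mention: bool = True) -> bool:
    """Return True when a server message mentions other users but not the bot."""
    if not ignore_no_mention or is_dm or not mentions:
        return False
    # As in the diff, an unknown bot identity counts as "not mentioned".
    bot_mentioned = bot_id is not None and bot_id in mentions
    return not bot_mentioned
```

With the feature switched off (the diff reads `DISCORD_IGNORE_NO_MENTION`), the predicate always returns False and every message falls through to the handler.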
@@ -622,7 +649,52 @@ class DiscordAdapter(BasePlatformAdapter):
self._running = False
self._client = None
self._ready_event.clear()
# Release the token lock
try:
from gateway.status import release_scoped_lock
if getattr(self, '_token_lock_identity', None):
release_scoped_lock('discord-bot-token', self._token_lock_identity)
self._token_lock_identity = None
except Exception:
pass
logger.info("[%s] Disconnected", self.name)
async def _add_reaction(self, message: Any, emoji: str) -> bool:
"""Add an emoji reaction to a Discord message."""
if not message or not hasattr(message, "add_reaction"):
return False
try:
await message.add_reaction(emoji)
return True
except Exception as e:
logger.debug("[%s] add_reaction failed (%s): %s", self.name, emoji, e)
return False
async def _remove_reaction(self, message: Any, emoji: str) -> bool:
"""Remove the bot's own emoji reaction from a Discord message."""
if not message or not hasattr(message, "remove_reaction") or not self._client or not self._client.user:
return False
try:
await message.remove_reaction(emoji, self._client.user)
return True
except Exception as e:
logger.debug("[%s] remove_reaction failed (%s): %s", self.name, emoji, e)
return False
async def on_processing_start(self, event: MessageEvent) -> None:
"""Add an in-progress reaction for normal Discord message events."""
message = event.raw_message
if hasattr(message, "add_reaction"):
await self._add_reaction(message, "👀")
async def on_processing_complete(self, event: MessageEvent, success: bool) -> None:
"""Swap the in-progress reaction for a final success/failure reaction."""
message = event.raw_message
if hasattr(message, "add_reaction"):
await self._remove_reaction(message, "👀")
await self._add_reaction(message, "✅" if success else "❌")
async def send(
self,
@@ -1413,15 +1485,23 @@ class DiscordAdapter(BasePlatformAdapter):
command_text: str,
followup_msg: str | None = None,
) -> None:
"""Common handler for simple slash commands that dispatch a command string."""
"""Common handler for simple slash commands that dispatch a command string.
Defers the interaction (shows "thinking..."), dispatches the command,
then cleans up the deferred response. If *followup_msg* is provided
the "thinking..." indicator is replaced with that text; otherwise it
is deleted so the channel isn't cluttered.
"""
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, command_text)
await self.handle_message(event)
if followup_msg:
try:
await interaction.followup.send(followup_msg, ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
try:
if followup_msg:
await interaction.edit_original_response(content=followup_msg)
else:
await interaction.delete_original_response()
except Exception as e:
logger.debug("Discord interaction cleanup failed: %s", e)
def _register_slash_commands(self) -> None:
"""Register Discord slash commands on the command tree."""
@@ -1446,9 +1526,7 @@ class DiscordAdapter(BasePlatformAdapter):
@tree.command(name="reasoning", description="Show or change reasoning effort")
@discord.app_commands.describe(effort="Reasoning effort: xhigh, high, medium, low, minimal, or none.")
async def slash_reasoning(interaction: discord.Interaction, effort: str = ""):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/reasoning {effort}".strip())
await self.handle_message(event)
await self._run_simple_slash(interaction, f"/reasoning {effort}".strip())
@tree.command(name="personality", description="Set a personality")
@discord.app_commands.describe(name="Personality name. Leave empty to list available.")
@@ -1521,9 +1599,7 @@ class DiscordAdapter(BasePlatformAdapter):
discord.app_commands.Choice(name="status — show current mode", value="status"),
])
async def slash_voice(interaction: discord.Interaction, mode: str = ""):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/voice {mode}".strip())
await self.handle_message(event)
await self._run_simple_slash(interaction, f"/voice {mode}".strip())
@tree.command(name="update", description="Update Hermes Agent to the latest version")
async def slash_update(interaction: discord.Interaction):
@@ -2096,6 +2172,11 @@ class DiscordAdapter(BasePlatformAdapter):
if pending_text_injection:
event_text = f"{pending_text_injection}\n\n{event_text}" if event_text else pending_text_injection
# Defense-in-depth: prevent empty user messages from entering session
# (can happen when user sends @mention-only with no other text)
if not event_text or not event_text.strip():
event_text = "(The user sent a message with no text content)"
event = MessageEvent(
text=event_text,
message_type=msg_type,
+121 -48
@@ -43,6 +43,20 @@ from gateway.platforms.base import (
from gateway.config import Platform, PlatformConfig
logger = logging.getLogger(__name__)
# Automated sender patterns — emails from these are silently ignored
_NOREPLY_PATTERNS = (
"noreply", "no-reply", "no_reply", "donotreply", "do-not-reply",
"mailer-daemon", "postmaster", "bounce", "notifications@",
"automated@", "auto-confirm", "auto-reply", "automailer",
)
# RFC headers that indicate bulk/automated mail
_AUTOMATED_HEADERS = {
"Auto-Submitted": lambda v: v.lower() != "no",
"Precedence": lambda v: v.lower() in ("bulk", "list", "junk"),
"X-Auto-Response-Suppress": lambda v: bool(v),
"List-Unsubscribe": lambda v: bool(v),
}
# Gmail-safe max length per email body
MAX_MESSAGE_LENGTH = 50_000
@@ -50,7 +64,17 @@ MAX_MESSAGE_LENGTH = 50_000
# Supported image extensions for inline detection
_IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}
def _is_automated_sender(address: str, headers: dict) -> bool:
"""Return True if this email is from an automated/noreply source."""
addr = address.lower()
if any(pattern in addr for pattern in _NOREPLY_PATTERNS):
return True
for header, check in _AUTOMATED_HEADERS.items():
value = headers.get(header, "")
if value and check(value):
return True
return False
def check_email_requirements() -> bool:
"""Check if email platform dependencies are available."""
addr = os.getenv("EMAIL_ADDRESS")
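Reviewer note: a small sketch of the automated-sender check, using a shortened subset of the patterns and header rules from the diff. One assumption worth flagging: this version reads headers through `email.message.Message.get`, which matches header names case-insensitively, whereas a plain `dict(msg.items())` keeps whatever casing the sender used. The names `is_automated`, `NOREPLY_PATTERNS`, and `AUTOMATED_HEADERS` are illustrative.

```python
import email

# Shortened subsets of the tables in the diff, for illustration only.
NOREPLY_PATTERNS = ("noreply", "no-reply", "mailer-daemon")
AUTOMATED_HEADERS = {
    "Auto-Submitted": lambda v: v.lower() != "no",
    "Precedence": lambda v: v.lower() in ("bulk", "list", "junk"),
    "List-Unsubscribe": lambda v: bool(v),
}


def is_automated(address, msg):
    """Return True if the address or headers suggest automated mail."""
    addr = address.lower()
    if any(pattern in addr for pattern in NOREPLY_PATTERNS):
        return True
    for header, check in AUTOMATED_HEADERS.items():
        value = msg.get(header, "")  # case-insensitive header lookup
        if value and check(value):
            return True
    return False


bulk = email.message_from_string("From: news@example.com\nprecedence: bulk\n\nhi")
human = email.message_from_string("From: alice@example.com\nSubject: hello\n\nhi")
```

Note that the dispatch-time call in this diff passes an empty header dict, so only the address patterns apply at that point.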
@@ -213,6 +237,7 @@ class EmailAdapter(BasePlatformAdapter):
# Track message IDs we've already processed to avoid duplicates
self._seen_uids: set = set()
self._seen_uids_max: int = 2000 # cap to prevent unbounded memory growth
self._poll_task: Optional[asyncio.Task] = None
# Map chat_id (sender email) -> last subject + message-id for threading
@@ -220,6 +245,26 @@ class EmailAdapter(BasePlatformAdapter):
logger.info("[Email] Adapter initialized for %s", self._address)
def _trim_seen_uids(self) -> None:
"""Keep only the most recent UIDs to prevent unbounded memory growth.
IMAP UIDs are monotonically increasing integers. When the set grows
beyond the cap, we keep only the highest half; old UIDs are safe to
drop because new messages always have higher UIDs and IMAP's UNSEEN
flag prevents re-delivery regardless.
"""
if len(self._seen_uids) <= self._seen_uids_max:
return
try:
# UIDs are bytes like b'1234' — sort numerically and keep top half
sorted_uids = sorted(self._seen_uids, key=lambda u: int(u))
keep = self._seen_uids_max // 2
self._seen_uids = set(sorted_uids[-keep:])
logger.debug("[Email] Trimmed seen UIDs to %d entries", len(self._seen_uids))
except (ValueError, TypeError):
# Fallback: just clear old entries if sort fails
self._seen_uids = set(list(self._seen_uids)[-self._seen_uids_max // 2:])
async def connect(self) -> bool:
"""Connect to the IMAP server and start polling for new messages."""
try:
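Reviewer note: the trimming rule in `_trim_seen_uids` can be sketched as a pure function. This is an illustrative restatement under the same assumption the docstring makes (UIDs are byte strings of decimal integers); `trim_seen_uids` is a hypothetical free-function name. The numeric sort key matters: a plain byte-wise sort would place `b"10"` before `b"2"` and keep the wrong half.

```python
def trim_seen_uids(seen: set, cap: int) -> set:
    """Keep the numerically highest cap // 2 UIDs once the set exceeds cap."""
    if len(seen) <= cap:
        return seen
    keep = cap // 2
    # int() accepts ASCII-digit bytes such as b"1234".
    return set(sorted(seen, key=int)[-keep:])


seen = {str(i).encode() for i in range(1, 12)}  # b"1" .. b"11"
trimmed = trim_seen_uids(seen, cap=10)
```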
@@ -232,6 +277,8 @@ class EmailAdapter(BasePlatformAdapter):
if status == "OK" and data and data[0]:
for uid in data[0].split():
self._seen_uids.add(uid)
# Keep only the most recent UIDs to prevent unbounded growth
self._trim_seen_uids()
imap.logout()
logger.info("[Email] IMAP connection test passed. %d existing messages skipped.", len(self._seen_uids))
except Exception as e:
@@ -290,52 +337,63 @@ class EmailAdapter(BasePlatformAdapter):
results = []
try:
imap = imaplib.IMAP4_SSL(self._imap_host, self._imap_port, timeout=30)
imap.login(self._address, self._password)
imap.select("INBOX")
try:
imap.login(self._address, self._password)
imap.select("INBOX")
status, data = imap.uid("search", None, "UNSEEN")
if status != "OK" or not data or not data[0]:
imap.logout()
return results
status, data = imap.uid("search", None, "UNSEEN")
if status != "OK" or not data or not data[0]:
return results
for uid in data[0].split():
if uid in self._seen_uids:
continue
self._seen_uids.add(uid)
for uid in data[0].split():
if uid in self._seen_uids:
continue
self._seen_uids.add(uid)
# Trim periodically to prevent unbounded memory growth
if len(self._seen_uids) > self._seen_uids_max:
self._trim_seen_uids()
status, msg_data = imap.uid("fetch", uid, "(RFC822)")
if status != "OK":
continue
status, msg_data = imap.uid("fetch", uid, "(RFC822)")
if status != "OK":
continue
raw_email = msg_data[0][1]
msg = email_lib.message_from_bytes(raw_email)
raw_email = msg_data[0][1]
msg = email_lib.message_from_bytes(raw_email)
sender_raw = msg.get("From", "")
sender_addr = _extract_email_address(sender_raw)
sender_name = _decode_header_value(sender_raw)
# Remove email from name if present
if "<" in sender_name:
sender_name = sender_name.split("<")[0].strip().strip('"')
sender_raw = msg.get("From", "")
sender_addr = _extract_email_address(sender_raw)
sender_name = _decode_header_value(sender_raw)
# Remove email from name if present
if "<" in sender_name:
sender_name = sender_name.split("<")[0].strip().strip('"')
subject = _decode_header_value(msg.get("Subject", "(no subject)"))
message_id = msg.get("Message-ID", "")
in_reply_to = msg.get("In-Reply-To", "")
body = _extract_text_body(msg)
attachments = _extract_attachments(msg, skip_attachments=self._skip_attachments)
subject = _decode_header_value(msg.get("Subject", "(no subject)"))
message_id = msg.get("Message-ID", "")
in_reply_to = msg.get("In-Reply-To", "")
# Skip automated/noreply senders before any processing
msg_headers = dict(msg.items())
if _is_automated_sender(sender_addr, msg_headers):
logger.debug("[Email] Skipping automated sender: %s", sender_addr)
continue
body = _extract_text_body(msg)
attachments = _extract_attachments(msg, skip_attachments=self._skip_attachments)
results.append({
"uid": uid,
"sender_addr": sender_addr,
"sender_name": sender_name,
"subject": subject,
"message_id": message_id,
"in_reply_to": in_reply_to,
"body": body,
"attachments": attachments,
"date": msg.get("Date", ""),
})
imap.logout()
results.append({
"uid": uid,
"sender_addr": sender_addr,
"sender_name": sender_name,
"subject": subject,
"message_id": message_id,
"in_reply_to": in_reply_to,
"body": body,
"attachments": attachments,
"date": msg.get("Date", ""),
})
finally:
try:
imap.logout()
except Exception:
pass
except Exception as e:
logger.error("[Email] IMAP fetch error: %s", e)
return results
@@ -348,6 +406,11 @@ class EmailAdapter(BasePlatformAdapter):
if sender_addr == self._address.lower():
return
# Never reply to automated senders
if _is_automated_sender(sender_addr, {}):
logger.debug("[Email] Dropping automated sender at dispatch: %s", sender_addr)
return
subject = msg_data["subject"]
body = msg_data["body"].strip()
attachments = msg_data["attachments"]
@@ -443,10 +506,15 @@ class EmailAdapter(BasePlatformAdapter):
msg.attach(MIMEText(body, "plain", "utf-8"))
smtp = smtplib.SMTP(self._smtp_host, self._smtp_port, timeout=30)
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
smtp.quit()
try:
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
finally:
try:
smtp.quit()
except Exception:
smtp.close()
logger.info("[Email] Sent reply to %s (subject: %s)", to_addr, subject)
return msg_id
@@ -530,10 +598,15 @@ class EmailAdapter(BasePlatformAdapter):
msg.attach(part)
smtp = smtplib.SMTP(self._smtp_host, self._smtp_port, timeout=30)
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
smtp.quit()
try:
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
finally:
try:
smtp.quit()
except Exception:
smtp.close()
return msg_id
File diff suppressed because it is too large
+356 -33
@@ -17,6 +17,8 @@ Environment variables:
from __future__ import annotations
import asyncio
import io
import json
import logging
import mimetypes
import os
@@ -40,11 +42,21 @@ logger = logging.getLogger(__name__)
MAX_MESSAGE_LENGTH = 4000
# Store directory for E2EE keys and sync state.
_STORE_DIR = Path.home() / ".hermes" / "matrix" / "store"
# Uses get_hermes_dir() so each profile gets its own Matrix store.
from hermes_constants import get_hermes_dir as _get_hermes_dir
_STORE_DIR = _get_hermes_dir("platforms/matrix/store", "matrix/store")
# Grace period: ignore messages older than this many seconds before startup.
_STARTUP_GRACE_SECONDS = 5
# E2EE key export file for persistence across restarts.
_KEY_EXPORT_FILE = _STORE_DIR / "exported_keys.txt"
_KEY_EXPORT_PASSPHRASE = "hermes-matrix-e2ee-keys"
# Pending undecrypted events: cap and TTL for retry buffer.
_MAX_PENDING_EVENTS = 100
_PENDING_EVENT_TTL = 300 # seconds — stop retrying after 5 min
def check_matrix_requirements() -> bool:
"""Return True if the Matrix adapter can be used."""
@@ -107,6 +119,10 @@ class MatrixAdapter(BasePlatformAdapter):
self._processed_events: deque = deque(maxlen=1000)
self._processed_events_set: set = set()
# Buffer for undecrypted events pending key receipt.
# Each entry: (room, event, timestamp)
self._pending_megolm: list = []
def _is_duplicate_event(self, event_id) -> bool:
"""Return True if this event was already processed. Tracks the ID otherwise."""
if not event_id:
@@ -161,22 +177,49 @@ class MatrixAdapter(BasePlatformAdapter):
# Authenticate.
if self._access_token:
client.access_token = self._access_token
# Resolve user_id if not set.
if not self._user_id:
resp = await client.whoami()
if isinstance(resp, nio.WhoamiResponse):
self._user_id = resp.user_id
client.user_id = resp.user_id
logger.info("Matrix: authenticated as %s", self._user_id)
else:
logger.error(
"Matrix: whoami failed — check MATRIX_ACCESS_TOKEN and MATRIX_HOMESERVER"
# With access-token auth, always resolve whoami so we validate the
# token and learn the device_id. The device_id matters for E2EE:
# without it, matrix-nio can send plain messages but may fail to
# decrypt inbound encrypted events or encrypt outbound room sends.
resp = await client.whoami()
if isinstance(resp, nio.WhoamiResponse):
resolved_user_id = getattr(resp, "user_id", "") or self._user_id
resolved_device_id = getattr(resp, "device_id", "")
if resolved_user_id:
self._user_id = resolved_user_id
# restore_login() is the matrix-nio path that binds the access
# token to a specific device and loads the crypto store.
if resolved_device_id and hasattr(client, "restore_login"):
client.restore_login(
self._user_id or resolved_user_id,
resolved_device_id,
self._access_token,
)
await client.close()
return False
else:
if self._user_id:
client.user_id = self._user_id
if resolved_device_id:
client.device_id = resolved_device_id
client.access_token = self._access_token
if self._encryption:
logger.warning(
"Matrix: access-token login did not restore E2EE state; "
"encrypted rooms may fail until a device_id is available"
)
logger.info(
"Matrix: using access token for %s%s",
self._user_id or "(unknown user)",
f" (device {resolved_device_id})" if resolved_device_id else "",
)
else:
client.user_id = self._user_id
logger.info("Matrix: using access token for %s", self._user_id)
logger.error(
"Matrix: whoami failed — check MATRIX_ACCESS_TOKEN and MATRIX_HOMESERVER"
)
await client.close()
return False
elif self._password and self._user_id:
resp = await client.login(
self._password,
@@ -194,7 +237,7 @@ class MatrixAdapter(BasePlatformAdapter):
return False
# If E2EE is enabled, load the crypto store.
if self._encryption and hasattr(client, "olm"):
if self._encryption and getattr(client, "olm", None):
try:
if client.should_upload_keys:
await client.keys_upload()
@@ -202,6 +245,21 @@ class MatrixAdapter(BasePlatformAdapter):
except Exception as exc:
logger.warning("Matrix: crypto init issue: %s", exc)
# Import previously exported Megolm keys (survives restarts).
if _KEY_EXPORT_FILE.exists():
try:
await client.import_keys(
str(_KEY_EXPORT_FILE), _KEY_EXPORT_PASSPHRASE,
)
logger.info("Matrix: imported Megolm keys from backup")
except Exception as exc:
logger.debug("Matrix: could not import keys: %s", exc)
elif self._encryption:
logger.warning(
"Matrix: E2EE requested but crypto store is not loaded; "
"encrypted rooms may fail"
)
# Register event callbacks.
client.add_event_callback(self._on_room_message, nio.RoomMessageText)
client.add_event_callback(self._on_room_message_media, nio.RoomMessageImage)
@@ -230,6 +288,7 @@ class MatrixAdapter(BasePlatformAdapter):
)
# Build DM room cache from m.direct account data.
await self._refresh_dm_cache()
await self._run_e2ee_maintenance()
else:
logger.warning("Matrix: initial sync returned %s", type(resp).__name__)
@@ -249,6 +308,18 @@ class MatrixAdapter(BasePlatformAdapter):
except (asyncio.CancelledError, Exception):
pass
# Export Megolm keys before closing so the next restart can decrypt
# events that used sessions from this run.
if self._client and self._encryption and getattr(self._client, "olm", None):
try:
_STORE_DIR.mkdir(parents=True, exist_ok=True)
await self._client.export_keys(
str(_KEY_EXPORT_FILE), _KEY_EXPORT_PASSPHRASE,
)
logger.info("Matrix: exported Megolm keys for next restart")
except Exception as exc:
logger.debug("Matrix: could not export keys on disconnect: %s", exc)
if self._client:
await self._client.close()
self._client = None
@@ -301,13 +372,48 @@ class MatrixAdapter(BasePlatformAdapter):
relates_to["m.in_reply_to"] = {"event_id": reply_to}
msg_content["m.relates_to"] = relates_to
resp = await self._client.room_send(
chat_id,
"m.room.message",
msg_content,
)
async def _room_send_once(*, ignore_unverified_devices: bool = False):
return await asyncio.wait_for(
self._client.room_send(
chat_id,
"m.room.message",
msg_content,
ignore_unverified_devices=ignore_unverified_devices,
),
timeout=45,
)
try:
resp = await _room_send_once(ignore_unverified_devices=False)
except Exception as exc:
retryable = isinstance(exc, asyncio.TimeoutError)
olm_unverified = getattr(nio, "OlmUnverifiedDeviceError", None)
send_retry = getattr(nio, "SendRetryError", None)
if isinstance(olm_unverified, type) and isinstance(exc, olm_unverified):
retryable = True
if isinstance(send_retry, type) and isinstance(exc, send_retry):
retryable = True
if not retryable:
logger.error("Matrix: failed to send to %s: %s", chat_id, exc)
return SendResult(success=False, error=str(exc))
logger.warning(
"Matrix: initial encrypted send to %s failed (%s); "
"retrying after E2EE maintenance with ignored unverified devices",
chat_id,
exc,
)
await self._run_e2ee_maintenance()
try:
resp = await _room_send_once(ignore_unverified_devices=True)
except Exception as retry_exc:
logger.error("Matrix: failed to send to %s after retry: %s", chat_id, retry_exc)
return SendResult(success=False, error=str(retry_exc))
if isinstance(resp, nio.RoomSendResponse):
last_event_id = resp.event_id
logger.info("Matrix: sent event %s to %s", last_event_id, chat_id)
else:
err = getattr(resp, "message", str(resp))
logger.error("Matrix: failed to send to %s: %s", chat_id, err)
@@ -442,8 +548,11 @@ class MatrixAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Upload an audio file as a voice message."""
return await self._send_local_file(chat_id, audio_path, "m.audio", caption, reply_to, metadata=metadata)
"""Upload an audio file as a voice message (MSC3245 native voice)."""
return await self._send_local_file(
chat_id, audio_path, "m.audio", caption, reply_to,
metadata=metadata, is_voice=True
)
async def send_video(
self,
@@ -476,13 +585,16 @@ class MatrixAdapter(BasePlatformAdapter):
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
is_voice: bool = False,
) -> SendResult:
"""Upload bytes to Matrix and send as a media message."""
import nio
# Upload to homeserver.
resp = await self._client.upload(
data,
# nio expects a DataProvider (callable) or file-like object, not raw bytes.
# nio.upload() returns a tuple (UploadResponse|UploadError, Optional[Dict])
resp, maybe_encryption_info = await self._client.upload(
io.BytesIO(data),
content_type=content_type,
filename=filename,
)
@@ -504,6 +616,10 @@ class MatrixAdapter(BasePlatformAdapter):
},
}
# Add MSC3245 voice flag for native voice messages.
if is_voice:
msg_content["org.matrix.msc3245.voice"] = {}
if reply_to:
msg_content["m.relates_to"] = {
"m.in_reply_to": {"event_id": reply_to}
@@ -531,6 +647,7 @@ class MatrixAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
file_name: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
is_voice: bool = False,
) -> SendResult:
"""Read a local file and upload it."""
p = Path(file_path)
@@ -543,7 +660,7 @@ class MatrixAdapter(BasePlatformAdapter):
ct = mimetypes.guess_type(fname)[0] or "application/octet-stream"
data = p.read_bytes()
return await self._upload_and_send(room_id, data, fname, ct, msgtype, caption, reply_to, metadata)
return await self._upload_and_send(room_id, data, fname, ct, msgtype, caption, reply_to, metadata, is_voice)
# ------------------------------------------------------------------
# Sync loop
@@ -551,9 +668,23 @@ class MatrixAdapter(BasePlatformAdapter):
async def _sync_loop(self) -> None:
"""Continuously sync with the homeserver."""
import nio
while not self._closing:
try:
await self._client.sync(timeout=30000)
resp = await self._client.sync(timeout=30000)
if isinstance(resp, nio.SyncError):
if self._closing:
return
logger.warning(
"Matrix: sync returned %s: %s — retrying in 5s",
type(resp).__name__,
getattr(resp, "message", resp),
)
await asyncio.sleep(5)
continue
await self._run_e2ee_maintenance()
except asyncio.CancelledError:
return
except Exception as exc:
@@ -562,6 +693,148 @@ class MatrixAdapter(BasePlatformAdapter):
logger.warning("Matrix: sync error: %s — retrying in 5s", exc)
await asyncio.sleep(5)
async def _run_e2ee_maintenance(self) -> None:
"""Run matrix-nio E2EE housekeeping between syncs.
Hermes uses a custom sync loop instead of matrix-nio's sync_forever(),
so we need to explicitly drive the key management work that sync_forever()
normally handles for encrypted rooms.
Also auto-trusts all devices (so senders share session keys with us)
and retries decryption for any buffered MegolmEvents.
"""
client = self._client
if not client or not self._encryption or not getattr(client, "olm", None):
return
did_query_keys = client.should_query_keys
tasks = [asyncio.create_task(client.send_to_device_messages())]
if client.should_upload_keys:
tasks.append(asyncio.create_task(client.keys_upload()))
if did_query_keys:
tasks.append(asyncio.create_task(client.keys_query()))
if client.should_claim_keys:
users = client.get_users_for_key_claiming()
if users:
tasks.append(asyncio.create_task(client.keys_claim(users)))
for task in asyncio.as_completed(tasks):
try:
await task
except asyncio.CancelledError:
raise
except Exception as exc:
logger.warning("Matrix: E2EE maintenance task failed: %s", exc)
# After key queries, auto-trust all devices so senders share keys with
# us. For a bot this is the right default — we want to decrypt
# everything, not enforce manual verification.
if did_query_keys:
self._auto_trust_devices()
# Retry any buffered undecrypted events now that new keys may have
# arrived (from key requests, key queries, or to-device forwarding).
if self._pending_megolm:
await self._retry_pending_decryptions()
def _auto_trust_devices(self) -> None:
"""Trust/verify all unverified devices we know about.
When other clients see our device as verified, they proactively share
Megolm session keys with us. Without this, many clients will refuse
to include an unverified device in key distributions.
"""
client = self._client
if not client:
return
device_store = getattr(client, "device_store", None)
if not device_store:
return
own_device = getattr(client, "device_id", None)
trusted_count = 0
try:
# DeviceStore.__iter__ yields OlmDevice objects directly.
for device in device_store:
if getattr(device, "device_id", None) == own_device:
continue
if not getattr(device, "verified", False):
client.verify_device(device)
trusted_count += 1
except Exception as exc:
logger.debug("Matrix: auto-trust error: %s", exc)
if trusted_count:
logger.info("Matrix: auto-trusted %d new device(s)", trusted_count)
async def _retry_pending_decryptions(self) -> None:
"""Retry decrypting buffered MegolmEvents after new keys arrive."""
import nio
client = self._client
if not client or not self._pending_megolm:
return
now = time.time()
still_pending: list = []
for room, event, ts in self._pending_megolm:
# Drop events that have aged past the TTL.
if now - ts > _PENDING_EVENT_TTL:
logger.debug(
"Matrix: dropping expired pending event %s (age %.0fs)",
getattr(event, "event_id", "?"), now - ts,
)
continue
try:
decrypted = client.decrypt_event(event)
except Exception:
# Still missing the key — keep in buffer.
still_pending.append((room, event, ts))
continue
if isinstance(decrypted, nio.MegolmEvent):
# decrypt_event returned the same undecryptable event.
still_pending.append((room, event, ts))
continue
logger.info(
"Matrix: decrypted buffered event %s (%s)",
getattr(event, "event_id", "?"),
type(decrypted).__name__,
)
# Route to the appropriate handler based on decrypted type.
try:
if isinstance(decrypted, nio.RoomMessageText):
await self._on_room_message(room, decrypted)
elif isinstance(
decrypted,
(nio.RoomMessageImage, nio.RoomMessageAudio,
nio.RoomMessageVideo, nio.RoomMessageFile),
):
await self._on_room_message_media(room, decrypted)
else:
logger.debug(
"Matrix: decrypted event %s has unhandled type %s",
getattr(event, "event_id", "?"),
type(decrypted).__name__,
)
except Exception as exc:
logger.warning(
"Matrix: error processing decrypted event %s: %s",
getattr(event, "event_id", "?"), exc,
)
self._pending_megolm = still_pending
# ------------------------------------------------------------------
# Event callbacks
# ------------------------------------------------------------------
@@ -583,13 +856,29 @@ class MatrixAdapter(BasePlatformAdapter):
if event_ts and event_ts < self._startup_ts - _STARTUP_GRACE_SECONDS:
return
# Handle decrypted MegolmEvents — extract the inner event.
# Handle undecryptable MegolmEvents: request the missing session key
# and buffer the event for retry once the key arrives.
if isinstance(event, nio.MegolmEvent):
# Failed to decrypt.
logger.warning(
"Matrix: could not decrypt event %s in %s",
"Matrix: could not decrypt event %s in %s — requesting key",
event.event_id, room.room_id,
)
# Ask other devices in the room to forward the session key.
try:
resp = await self._client.request_room_key(event)
if hasattr(resp, "event_id") or not isinstance(resp, Exception):
logger.debug(
"Matrix: room key request sent for session %s",
getattr(event, "session_id", "?"),
)
except Exception as exc:
logger.debug("Matrix: room key request failed: %s", exc)
# Buffer for retry on next maintenance cycle.
self._pending_megolm.append((room, event, time.time()))
if len(self._pending_megolm) > _MAX_PENDING_EVENTS:
self._pending_megolm = self._pending_megolm[-_MAX_PENDING_EVENTS:]
return
# Skip edits (m.replace relation).
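Reviewer note: the buffering scheme for undecryptable events (a cap on append, a TTL on retry) is sketched below with plain tuples instead of nio events. Everything here is illustrative: `buffer_event`, `retry`, and the `can_decrypt` callback are hypothetical stand-ins for the adapter's `_pending_megolm` handling and `client.decrypt_event`, and the cap is lowered to 3 to keep the example small.

```python
MAX_PENDING = 3      # the diff uses 100
TTL_SECONDS = 300.0  # same 5-minute window as the diff


def buffer_event(pending, event_id, now):
    """Append an event and keep only the newest MAX_PENDING entries."""
    pending.append((event_id, now))
    return pending[-MAX_PENDING:]


def retry(pending, can_decrypt, now):
    """Drop expired entries, deliver decryptable ones, keep the rest."""
    still_pending, delivered = [], []
    for event_id, ts in pending:
        if now - ts > TTL_SECONDS:
            continue  # aged out, stop retrying
        if can_decrypt(event_id):
            delivered.append(event_id)
        else:
            still_pending.append((event_id, ts))
    return still_pending, delivered


pending = []
for i in range(5):
    pending = buffer_event(pending, f"e{i}", now=float(i))
remaining, delivered = retry(pending, lambda e: e == "e3", now=302.5)
```

In this run `e0` and `e1` are evicted by the cap, `e2` expires, `e3` is delivered, and `e4` stays buffered for the next maintenance cycle.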
@@ -692,11 +981,19 @@ class MatrixAdapter(BasePlatformAdapter):
event_mimetype = (content_info.get("info") or {}).get("mimetype", "")
media_type = "application/octet-stream"
msg_type = MessageType.DOCUMENT
is_voice_message = False
if isinstance(event, nio.RoomMessageImage):
msg_type = MessageType.PHOTO
media_type = event_mimetype or "image/png"
elif isinstance(event, nio.RoomMessageAudio):
msg_type = MessageType.AUDIO
# Check for MSC3245 voice flag: org.matrix.msc3245.voice: {}
source_content = getattr(event, "source", {}).get("content", {})
if source_content.get("org.matrix.msc3245.voice") is not None:
is_voice_message = True
msg_type = MessageType.VOICE
else:
msg_type = MessageType.AUDIO
media_type = event_mimetype or "audio/ogg"
elif isinstance(event, nio.RoomMessageVideo):
msg_type = MessageType.VIDEO
@@ -734,6 +1031,31 @@ class MatrixAdapter(BasePlatformAdapter):
if relates_to.get("rel_type") == "m.thread":
thread_id = relates_to.get("event_id")
# For voice messages, cache audio locally for transcription tools.
# Use the authenticated nio client to download (Matrix requires auth for media).
media_urls = [http_url] if http_url else None
media_types = [media_type] if http_url else None
if is_voice_message and url and url.startswith("mxc://"):
try:
import nio
from gateway.platforms.base import cache_audio_from_bytes
resp = await self._client.download(mxc=url)
if isinstance(resp, nio.MemoryDownloadResponse):
# Extract extension from mimetype or default to .ogg
ext = ".ogg"
if media_type and "/" in media_type:
subtype = media_type.split("/")[1]
ext = f".{subtype}" if subtype else ".ogg"
local_path = cache_audio_from_bytes(resp.body, ext)
media_urls = [local_path]
logger.debug("Matrix: cached voice message to %s", local_path)
else:
logger.warning("Matrix: failed to download voice: %s", getattr(resp, "message", resp))
except Exception as e:
logger.warning("Matrix: failed to cache voice message, using HTTP URL: %s", e)
source = self.build_source(
chat_id=room.room_id,
chat_type=chat_type,
@@ -742,8 +1064,9 @@ class MatrixAdapter(BasePlatformAdapter):
thread_id=thread_id,
)
# Use cached local path for images, HTTP URL for other media types
media_urls = [cached_path] if cached_path else ([http_url] if http_url else None)
# Use cached local path for images (voice messages already handled above).
if cached_path:
media_urls = [cached_path]
media_types = [media_type] if media_urls else None
msg_event = MessageEvent(
+52 -14
@@ -407,18 +407,38 @@ class MattermostAdapter(BasePlatformAdapter):
kind: str = "file",
) -> SendResult:
"""Download a URL and upload it as a file attachment."""
import asyncio
import aiohttp
try:
async with self._session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
if resp.status >= 400:
# Fall back to sending the URL as text.
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
file_data = await resp.read()
ct = resp.content_type or "application/octet-stream"
# Derive filename from URL.
fname = url.rsplit("/", 1)[-1].split("?")[0] or f"{kind}.png"
except Exception as exc:
logger.warning("Mattermost: failed to download %s: %s", url, exc)
last_exc = None
file_data = None
ct = "application/octet-stream"
fname = url.rsplit("/", 1)[-1].split("?")[0] or f"{kind}.png"
for attempt in range(3):
try:
async with self._session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
if resp.status >= 500 or resp.status == 429:
if attempt < 2:
logger.debug("Mattermost download retry %d/2 for %s (status %d)",
attempt + 1, url[:80], resp.status)
await asyncio.sleep(1.5 * (attempt + 1))
continue
if resp.status >= 400:
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
file_data = await resp.read()
ct = resp.content_type or "application/octet-stream"
break
except (aiohttp.ClientError, asyncio.TimeoutError) as exc:
last_exc = exc
if attempt < 2:
await asyncio.sleep(1.5 * (attempt + 1))
continue
logger.warning("Mattermost: failed to download %s after %d attempts: %s", url, attempt + 1, exc)
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
if file_data is None:
logger.warning("Mattermost: download returned no data for %s", url)
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
file_id = await self._upload_file(chat_id, file_data, fname, ct)
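The retry loop above treats 5xx and 429 responses (and network errors) as transient, and any other 4xx as permanent. A minimal sketch of that policy with no network access follows; `classify_status` and `backoff_delay` are hypothetical helper names used only for illustration and are not part of the adapter.

```python
def classify_status(status: int) -> str:
    """Map an HTTP status to the action the download loop takes."""
    if status >= 500 or status == 429:
        return "retry"      # transient: server error or rate limit
    if status >= 400:
        return "fallback"   # permanent client error: send the URL as text
    return "ok"


def backoff_delay(attempt: int, base: float = 1.5) -> float:
    """Linear backoff between attempts; attempt is 0-based."""
    return base * (attempt + 1)


if __name__ == "__main__":
    for status in (200, 404, 429, 503):
        print(status, classify_status(status))
    # Delays slept after the first and second failed attempts.
    print([backoff_delay(a) for a in range(2)])
```

In the diff, a retryable status on the final attempt is not retried again; it falls through to the `>= 400` branch, so the URL is sent as plain text instead.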
@@ -583,9 +603,19 @@ class MattermostAdapter(BasePlatformAdapter):
# For DMs, user_id is sufficient. For channels, check for @mention.
message_text = post.get("message", "")
# Mention-only mode: skip channel messages that don't @mention the bot.
# DMs (type "D") are always processed.
# Mention-gating for non-DM channels.
# Config (env vars):
# MATTERMOST_REQUIRE_MENTION: Require @mention in channels (default: true)
# MATTERMOST_FREE_RESPONSE_CHANNELS: Channel IDs where bot responds without mention
if channel_type_raw != "D":
require_mention = os.getenv(
"MATTERMOST_REQUIRE_MENTION", "true"
).lower() not in ("false", "0", "no")
free_channels_raw = os.getenv("MATTERMOST_FREE_RESPONSE_CHANNELS", "")
free_channels = {ch.strip() for ch in free_channels_raw.split(",") if ch.strip()}
is_free_channel = channel_id in free_channels
mention_patterns = [
f"@{self._bot_username}",
f"@{self._bot_user_id}",
@@ -594,13 +624,21 @@ class MattermostAdapter(BasePlatformAdapter):
pattern.lower() in message_text.lower()
for pattern in mention_patterns
)
if not has_mention:
if require_mention and not is_free_channel and not has_mention:
logger.debug(
"Mattermost: skipping non-DM message without @mention (channel=%s)",
channel_id,
)
return
# Strip @mention from the message text so the agent sees clean input.
if has_mention:
for pattern in mention_patterns:
message_text = re.sub(
re.escape(pattern), "", message_text, flags=re.IGNORECASE
).strip()
# Resolve sender info.
sender_id = post.get("user_id", "")
sender_name = data.get("sender_name", "").lstrip("@") or sender_id
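The mention-gating hunk above can be read as a single decision: skip the message, or pass it on with the mention stripped. The sketch below restates that logic as a pure function; `should_process` and its parameters are hypothetical names, and the real adapter reads `require_mention` and the free-response channel set from the environment variables named in the diff.

```python
import re
from typing import Iterable, Optional


def should_process(
    message_text: str,
    channel_id: str,
    *,
    mention_patterns: Iterable[str],
    require_mention: bool = True,
    free_channels: frozenset = frozenset(),
) -> Optional[str]:
    """Return cleaned text if the message should be handled, else None."""
    patterns = list(mention_patterns)
    has_mention = any(p.lower() in message_text.lower() for p in patterns)
    # Skip only when gating is on, the channel is not exempt, and no mention.
    if require_mention and channel_id not in free_channels and not has_mention:
        return None
    if has_mention:
        # Remove each mention case-insensitively so the agent sees clean input.
        for p in patterns:
            message_text = re.sub(
                re.escape(p), "", message_text, flags=re.IGNORECASE
            ).strip()
    return message_text


if __name__ == "__main__":
    print(should_process("@hermes hello", "c1", mention_patterns=["@hermes"]))
    print(should_process("hello", "c1", mention_patterns=["@hermes"]))
```

This covers only the non-DM path; in the diff, direct messages (channel type `"D"`) bypass the check entirely.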
+42 -3
@@ -22,7 +22,7 @@ import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Optional, Any
from urllib.parse import unquote
from urllib.parse import quote, unquote
import httpx
@@ -184,6 +184,8 @@ class SignalAdapter(BasePlatformAdapter):
self._recent_sent_timestamps: set = set()
self._max_recent_timestamps = 50
self._phone_lock_identity: Optional[str] = None
logger.info("Signal adapter initialized: url=%s account=%s groups=%s",
self.http_url, _redact_phone(self.account),
"enabled" if self.group_allow_from else "disabled")
@@ -198,6 +200,29 @@ class SignalAdapter(BasePlatformAdapter):
logger.error("Signal: SIGNAL_HTTP_URL and SIGNAL_ACCOUNT are required")
return False
# Acquire scoped lock to prevent duplicate Signal listeners for the same phone
try:
from gateway.status import acquire_scoped_lock
self._phone_lock_identity = self.account
acquired, existing = acquire_scoped_lock(
"signal-phone",
self._phone_lock_identity,
metadata={"platform": self.platform.value},
)
if not acquired:
owner_pid = existing.get("pid") if isinstance(existing, dict) else None
message = (
"Another local Hermes gateway is already using this Signal account"
+ (f" (PID {owner_pid})." if owner_pid else ".")
+ " Stop the other gateway before starting a second Signal listener."
)
logger.error("Signal: %s", message)
self._set_fatal_error("signal_phone_lock", message, retryable=False)
return False
except Exception as e:
logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)
self.client = httpx.AsyncClient(timeout=30.0)
# Health check — verify signal-cli daemon is reachable
@@ -245,6 +270,14 @@ class SignalAdapter(BasePlatformAdapter):
await self.client.aclose()
self.client = None
if self._phone_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("signal-phone", self._phone_lock_identity)
except Exception as e:
logger.warning("Signal: Error releasing phone lock: %s", e, exc_info=True)
self._phone_lock_identity = None
logger.info("Signal: disconnected")
# ------------------------------------------------------------------
@@ -253,7 +286,7 @@ class SignalAdapter(BasePlatformAdapter):
async def _sse_listener(self) -> None:
"""Listen for SSE events from signal-cli daemon."""
url = f"{self.http_url}/api/v1/events?account={self.account}"
url = f"{self.http_url}/api/v1/events?account={quote(self.account, safe='')}"
backoff = SSE_RETRY_DELAY_INITIAL
while self._running:
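The `quote(..., safe='')` change matters because Signal account identifiers are phone numbers that begin with `+`, and an unencoded `+` in a query string is commonly decoded as a space. A small sketch of the encoding follows; the base URL and phone number are made-up values, and `events_url` is a hypothetical helper name.

```python
from urllib.parse import quote


def events_url(base_url: str, account: str) -> str:
    """Build the SSE events URL with the account percent-encoded."""
    return f"{base_url}/api/v1/events?account={quote(account, safe='')}"


if __name__ == "__main__":
    print(events_url("http://localhost:8080", "+15551234567"))
    # -> http://localhost:8080/api/v1/events?account=%2B15551234567
```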
@@ -279,6 +312,12 @@ class SignalAdapter(BasePlatformAdapter):
line = line.strip()
if not line:
continue
# SSE keepalive comments (":") prove the connection
# is alive — update activity so the health monitor
# doesn't report false idle warnings.
if line.startswith(":"):
self._last_sse_activity = time.time()
continue
# Parse SSE data lines
if line.startswith("data:"):
data_str = line[5:].strip()
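The keepalive handling above can be sketched as a small line classifier. This is an illustration only: `handle_sse_line` and the `state` dict are hypothetical stand-ins for the listener loop and `self._last_sse_activity`, and whether the real adapter also refreshes the timestamp on data lines is not visible in this hunk.

```python
import time
from typing import Optional


def handle_sse_line(line: str, state: dict) -> Optional[str]:
    """Return the payload of a data line, or None for anything else."""
    line = line.strip()
    if not line:
        return None
    if line.startswith(":"):
        # Comment lines are keepalives: record activity, carry no payload.
        state["last_activity"] = time.time()
        return None
    if line.startswith("data:"):
        return line[5:].strip()
    return None


if __name__ == "__main__":
    state: dict = {}
    print(handle_sse_line(": keepalive", state), "last_activity" in state)
    print(handle_sse_line('data: {"a": 1}', state))
```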
@@ -515,7 +554,7 @@ class SignalAdapter(BasePlatformAdapter):
"""Fetch an attachment via JSON-RPC and cache it. Returns (path, ext)."""
result = await self._rpc("getAttachment", {
"account": self.account,
"attachmentId": attachment_id,
"id": attachment_id,
})
if not result:
+167 -51
@@ -9,6 +9,7 @@ Uses slack-bolt (Python) with Socket Mode for:
"""
import asyncio
import json
import logging
import os
import re
@@ -73,6 +74,10 @@ class SlackAdapter(BasePlatformAdapter):
self._bot_user_id: Optional[str] = None
self._user_name_cache: Dict[str, str] = {} # user_id → display name
self._socket_mode_task: Optional[asyncio.Task] = None
# Multi-workspace support
self._team_clients: Dict[str, AsyncWebClient] = {} # team_id → WebClient
self._team_bot_user_ids: Dict[str, str] = {} # team_id → bot_user_id
self._channel_team: Dict[str, str] = {} # channel_id → team_id
async def connect(self) -> bool:
"""Connect to Slack via Socket Mode."""
@@ -82,23 +87,70 @@ class SlackAdapter(BasePlatformAdapter):
)
return False
bot_token = self.config.token
raw_token = self.config.token
app_token = os.getenv("SLACK_APP_TOKEN")
if not bot_token:
if not raw_token:
logger.error("[Slack] SLACK_BOT_TOKEN not set")
return False
if not app_token:
logger.error("[Slack] SLACK_APP_TOKEN not set")
return False
try:
self._app = AsyncApp(token=bot_token)
# Support comma-separated bot tokens for multi-workspace
bot_tokens = [t.strip() for t in raw_token.split(",") if t.strip()]
# Get our own bot user ID for mention detection
auth_response = await self._app.client.auth_test()
self._bot_user_id = auth_response.get("user_id")
bot_name = auth_response.get("user", "unknown")
# Also load tokens from OAuth token file
from hermes_constants import get_hermes_home
tokens_file = get_hermes_home() / "slack_tokens.json"
if tokens_file.exists():
try:
saved = json.loads(tokens_file.read_text(encoding="utf-8"))
for team_id, entry in saved.items():
tok = entry.get("token", "") if isinstance(entry, dict) else ""
if tok and tok not in bot_tokens:
bot_tokens.append(tok)
team_label = entry.get("team_name", team_id) if isinstance(entry, dict) else team_id
logger.info("[Slack] Loaded saved token for workspace %s", team_label)
except Exception as e:
logger.warning("[Slack] Failed to read %s: %s", tokens_file, e)
try:
# Acquire scoped lock to prevent duplicate app token usage
from gateway.status import acquire_scoped_lock
self._token_lock_identity = app_token
acquired, existing = acquire_scoped_lock('slack-app-token', app_token, metadata={'platform': 'slack'})
if not acquired:
owner_pid = existing.get('pid') if isinstance(existing, dict) else None
message = f'Slack app token already in use' + (f' (PID {owner_pid})' if owner_pid else '') + '. Stop the other gateway first.'
logger.error('[%s] %s', self.name, message)
self._set_fatal_error('slack_token_lock', message, retryable=False)
return False
# First token is the primary — used for AsyncApp / Socket Mode
primary_token = bot_tokens[0]
self._app = AsyncApp(token=primary_token)
# Register each bot token and map team_id → client
for token in bot_tokens:
client = AsyncWebClient(token=token)
auth_response = await client.auth_test()
team_id = auth_response.get("team_id", "")
bot_user_id = auth_response.get("user_id", "")
bot_name = auth_response.get("user", "unknown")
team_name = auth_response.get("team", "unknown")
self._team_clients[team_id] = client
self._team_bot_user_ids[team_id] = bot_user_id
# First token sets the primary bot_user_id (backward compat)
if self._bot_user_id is None:
self._bot_user_id = bot_user_id
logger.info(
"[Slack] Authenticated as @%s in workspace %s (team: %s)",
bot_name, team_name, team_id,
)
# Register message event handler
@self._app.event("message")
@@ -123,7 +175,10 @@ class SlackAdapter(BasePlatformAdapter):
self._socket_mode_task = asyncio.create_task(self._handler.start_async())
self._running = True
logger.info("[Slack] Connected as @%s (Socket Mode)", bot_name)
logger.info(
"[Slack] Socket Mode connected (%d workspace(s))",
len(self._team_clients),
)
return True
except Exception as e: # pragma: no cover - defensive logging
@@ -138,8 +193,25 @@ class SlackAdapter(BasePlatformAdapter):
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[Slack] Error while closing Socket Mode handler: %s", e, exc_info=True)
self._running = False
# Release the token lock (use stored identity, not re-read env)
try:
from gateway.status import release_scoped_lock
if getattr(self, '_token_lock_identity', None):
release_scoped_lock('slack-app-token', self._token_lock_identity)
self._token_lock_identity = None
except Exception:
pass
logger.info("[Slack] Disconnected")
def _get_client(self, chat_id: str) -> AsyncWebClient:
"""Return the workspace-specific WebClient for a channel."""
team_id = self._channel_team.get(chat_id)
if team_id and team_id in self._team_clients:
return self._team_clients[team_id]
return self._app.client # fallback to primary
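The routing introduced here relies on two maps: team ID to client, and channel ID to team ID, with the primary client as fallback. A minimal sketch of that lookup, using plain strings in place of `AsyncWebClient` objects; `WorkspaceRouter` is a hypothetical class that does not exist in the adapter.

```python
from typing import Any, Dict


class WorkspaceRouter:
    """Route a channel to the client of the workspace that owns it."""

    def __init__(self, primary: Any) -> None:
        self.primary = primary
        self.team_clients: Dict[str, Any] = {}
        self.channel_team: Dict[str, str] = {}

    def register(self, team_id: str, client: Any) -> None:
        self.team_clients[team_id] = client

    def observe(self, channel_id: str, team_id: str) -> None:
        # Called for each inbound event to learn channel ownership.
        if team_id and channel_id:
            self.channel_team[channel_id] = team_id

    def client_for(self, channel_id: str) -> Any:
        team_id = self.channel_team.get(channel_id)
        if team_id and team_id in self.team_clients:
            return self.team_clients[team_id]
        return self.primary  # unknown channel: fall back to primary


if __name__ == "__main__":
    router = WorkspaceRouter(primary="client-A")
    router.register("T1", "client-A")
    router.register("T2", "client-B")
    router.observe("C42", "T2")
    print(router.client_for("C42"), router.client_for("C99"))
```

One consequence of this design, as far as the diff shows: channel ownership is learned only from inbound events and slash commands, so an outbound call to a channel that has not yet been seen uses the primary workspace's client.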
async def send(
self,
chat_id: str,
@@ -176,7 +248,7 @@ class SlackAdapter(BasePlatformAdapter):
if broadcast and i == 0:
kwargs["reply_broadcast"] = True
last_result = await self._app.client.chat_postMessage(**kwargs)
last_result = await self._get_client(chat_id).chat_postMessage(**kwargs)
return SendResult(
success=True,
@@ -198,7 +270,7 @@ class SlackAdapter(BasePlatformAdapter):
if not self._app:
return SendResult(success=False, error="Not connected")
try:
await self._app.client.chat_update(
await self._get_client(chat_id).chat_update(
channel=chat_id,
ts=message_id,
text=content,
@@ -232,7 +304,7 @@ class SlackAdapter(BasePlatformAdapter):
return # Can only set status in a thread context
try:
await self._app.client.assistant_threads_setStatus(
await self._get_client(chat_id).assistant_threads_setStatus(
channel_id=chat_id,
thread_ts=thread_ts,
status="is thinking...",
@@ -274,7 +346,7 @@ class SlackAdapter(BasePlatformAdapter):
if not os.path.exists(file_path):
raise FileNotFoundError(f"File not found: {file_path}")
result = await self._app.client.files_upload_v2(
result = await self._get_client(chat_id).files_upload_v2(
channel=chat_id,
file=file_path,
filename=os.path.basename(file_path),
@@ -376,7 +448,7 @@ class SlackAdapter(BasePlatformAdapter):
if not self._app:
return False
try:
await self._app.client.reactions_add(
await self._get_client(channel).reactions_add(
channel=channel, timestamp=timestamp, name=emoji
)
return True
@@ -392,7 +464,7 @@ class SlackAdapter(BasePlatformAdapter):
if not self._app:
return False
try:
await self._app.client.reactions_remove(
await self._get_client(channel).reactions_remove(
channel=channel, timestamp=timestamp, name=emoji
)
return True
@@ -402,7 +474,7 @@ class SlackAdapter(BasePlatformAdapter):
# ----- User identity resolution -----
async def _resolve_user_name(self, user_id: str) -> str:
async def _resolve_user_name(self, user_id: str, chat_id: str = "") -> str:
"""Resolve a Slack user ID to a display name, with caching."""
if not user_id:
return ""
@@ -413,7 +485,8 @@ class SlackAdapter(BasePlatformAdapter):
return user_id
try:
result = await self._app.client.users_info(user=user_id)
client = self._get_client(chat_id) if chat_id else self._app.client
result = await client.users_info(user=user_id)
user = result.get("user", {})
# Prefer display_name → real_name → user_id
profile = user.get("profile", {})
@@ -477,7 +550,7 @@ class SlackAdapter(BasePlatformAdapter):
response = await client.get(image_url)
response.raise_for_status()
result = await self._app.client.files_upload_v2(
result = await self._get_client(chat_id).files_upload_v2(
channel=chat_id,
content=response.content,
filename="image.png",
@@ -537,7 +610,7 @@ class SlackAdapter(BasePlatformAdapter):
return SendResult(success=False, error=f"Video file not found: {video_path}")
try:
result = await self._app.client.files_upload_v2(
result = await self._get_client(chat_id).files_upload_v2(
channel=chat_id,
file=video_path,
filename=os.path.basename(video_path),
@@ -578,7 +651,7 @@ class SlackAdapter(BasePlatformAdapter):
display_name = file_name or os.path.basename(file_path)
try:
result = await self._app.client.files_upload_v2(
result = await self._get_client(chat_id).files_upload_v2(
channel=chat_id,
file=file_path,
filename=display_name,
@@ -606,7 +679,7 @@ class SlackAdapter(BasePlatformAdapter):
return {"name": chat_id, "type": "unknown"}
try:
result = await self._app.client.conversations_info(channel=chat_id)
result = await self._get_client(chat_id).conversations_info(channel=chat_id)
channel = result.get("channel", {})
is_dm = channel.get("is_im", False)
return {
@@ -639,6 +712,11 @@ class SlackAdapter(BasePlatformAdapter):
user_id = event.get("user", "")
channel_id = event.get("channel", "")
ts = event.get("ts", "")
team_id = event.get("team", "")
# Track which workspace owns this channel
if team_id and channel_id:
self._channel_team[channel_id] = team_id
# Determine if this is a DM or channel message
channel_type = event.get("channel_type", "")
@@ -655,11 +733,12 @@ class SlackAdapter(BasePlatformAdapter):
thread_ts = event.get("thread_ts") or ts # ts fallback for channels
# In channels, only respond if bot is mentioned
if not is_dm and self._bot_user_id:
if f"<@{self._bot_user_id}>" not in text:
bot_uid = self._team_bot_user_ids.get(team_id, self._bot_user_id)
if not is_dm and bot_uid:
if f"<@{bot_uid}>" not in text:
return
# Strip the bot mention from the text
text = text.replace(f"<@{self._bot_user_id}>", "").strip()
text = text.replace(f"<@{bot_uid}>", "").strip()
# Determine message type
msg_type = MessageType.TEXT
@@ -679,7 +758,7 @@ class SlackAdapter(BasePlatformAdapter):
if ext not in (".jpg", ".jpeg", ".png", ".gif", ".webp"):
ext = ".jpg"
# Slack private URLs require the bot token as auth header
cached = await self._download_slack_file(url, ext)
cached = await self._download_slack_file(url, ext, team_id=team_id)
media_urls.append(cached)
media_types.append(mimetype)
msg_type = MessageType.PHOTO
@@ -690,7 +769,7 @@ class SlackAdapter(BasePlatformAdapter):
ext = "." + mimetype.split("/")[-1].split(";")[0]
if ext not in (".ogg", ".mp3", ".wav", ".webm", ".m4a"):
ext = ".ogg"
cached = await self._download_slack_file(url, ext, audio=True)
cached = await self._download_slack_file(url, ext, audio=True, team_id=team_id)
media_urls.append(cached)
media_types.append(mimetype)
msg_type = MessageType.VOICE
@@ -721,7 +800,7 @@ class SlackAdapter(BasePlatformAdapter):
continue
# Download and cache
raw_bytes = await self._download_slack_file_bytes(url)
raw_bytes = await self._download_slack_file_bytes(url, team_id=team_id)
cached_path = cache_document_from_bytes(
raw_bytes, original_filename or f"document{ext}"
)
@@ -750,7 +829,7 @@ class SlackAdapter(BasePlatformAdapter):
logger.warning("[Slack] Failed to cache document from %s: %s", url, e, exc_info=True)
# Resolve user display name (cached after first lookup)
user_name = await self._resolve_user_name(user_id)
user_name = await self._resolve_user_name(user_id, chat_id=channel_id)
# Build source
source = self.build_source(
@@ -787,6 +866,11 @@ class SlackAdapter(BasePlatformAdapter):
text = command.get("text", "").strip()
user_id = command.get("user_id", "")
channel_id = command.get("channel_id", "")
team_id = command.get("team_id", "")
# Track which workspace owns this channel
if team_id and channel_id:
self._channel_team[channel_id] = team_id
# Map subcommands to gateway commands — derived from central registry.
# Also keep "compact" as a Slack-specific alias for /compress.
@@ -818,34 +902,66 @@ class SlackAdapter(BasePlatformAdapter):
await self.handle_message(event)
async def _download_slack_file(self, url: str, ext: str, audio: bool = False) -> str:
"""Download a Slack file using the bot token for auth."""
async def _download_slack_file(self, url: str, ext: str, audio: bool = False, team_id: str = "") -> str:
"""Download a Slack file using the bot token for auth, with retry."""
import asyncio
import httpx
bot_token = self.config.token
bot_token = self._team_clients[team_id].token if team_id and team_id in self._team_clients else self.config.token
last_exc = None
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(
url,
headers={"Authorization": f"Bearer {bot_token}"},
)
response.raise_for_status()
for attempt in range(3):
try:
response = await client.get(
url,
headers={"Authorization": f"Bearer {bot_token}"},
)
response.raise_for_status()
if audio:
from gateway.platforms.base import cache_audio_from_bytes
return cache_audio_from_bytes(response.content, ext)
else:
from gateway.platforms.base import cache_image_from_bytes
return cache_image_from_bytes(response.content, ext)
if audio:
from gateway.platforms.base import cache_audio_from_bytes
return cache_audio_from_bytes(response.content, ext)
else:
from gateway.platforms.base import cache_image_from_bytes
return cache_image_from_bytes(response.content, ext)
except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
last_exc = exc
if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
raise
if attempt < 2:
logger.debug("Slack file download retry %d/2 for %s: %s",
attempt + 1, url[:80], exc)
await asyncio.sleep(1.5 * (attempt + 1))
continue
raise
raise last_exc
async def _download_slack_file_bytes(self, url: str) -> bytes:
"""Download a Slack file and return raw bytes."""
async def _download_slack_file_bytes(self, url: str, team_id: str = "") -> bytes:
"""Download a Slack file and return raw bytes, with retry."""
import asyncio
import httpx
bot_token = self.config.token
bot_token = self._team_clients[team_id].token if team_id and team_id in self._team_clients else self.config.token
last_exc = None
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(
url,
headers={"Authorization": f"Bearer {bot_token}"},
)
response.raise_for_status()
return response.content
for attempt in range(3):
try:
response = await client.get(
url,
headers={"Authorization": f"Bearer {bot_token}"},
)
response.raise_for_status()
return response.content
except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
last_exc = exc
if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
raise
if attempt < 2:
logger.debug("Slack file download retry %d/2 for %s: %s",
attempt + 1, url[:80], exc)
await asyncio.sleep(1.5 * (attempt + 1))
continue
raise
raise last_exc
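The two Slack download helpers share one retry rule: timeouts and HTTP status errors at or above 429 are retried, while status errors below 429 are re-raised immediately. The sketch below isolates that decision without importing `httpx`; `should_retry` is a hypothetical helper, and `status_code=None` stands in for a timeout.

```python
from typing import Optional


def should_retry(status_code: Optional[int], attempt: int, max_attempts: int = 3) -> bool:
    """Decide whether the download loop sleeps and tries again.

    status_code is None for a timeout; attempt is 0-based.
    """
    if status_code is not None and status_code < 429:
        return False  # e.g. 403/404: re-raised immediately
    return attempt < max_attempts - 1


if __name__ == "__main__":
    for code, attempt in ((404, 0), (429, 0), (503, 1), (503, 2), (None, 0)):
        print(code, attempt, should_retry(code, attempt))
```

Because the threshold is a plain `< 429` comparison, less common 4xx codes above 429 (451, for example) are also retried under this rule.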
+305 -30
@@ -8,10 +8,11 @@ Uses python-telegram-bot library for:
"""
import asyncio
import json
import logging
import os
import re
from typing import Dict, Optional, Any
from typing import Dict, List, Optional, Any
logger = logging.getLogger(__name__)
@@ -25,6 +26,7 @@ try:
filters,
)
from telegram.constants import ParseMode, ChatType
from telegram.request import HTTPXRequest
TELEGRAM_AVAILABLE = True
except ImportError:
TELEGRAM_AVAILABLE = False
@@ -34,6 +36,7 @@ except ImportError:
Application = Any
CommandHandler = Any
TelegramMessageHandler = Any
HTTPXRequest = Any
filters = None
ParseMode = None
ChatType = None
@@ -59,6 +62,11 @@ from gateway.platforms.base import (
cache_document_from_bytes,
SUPPORTED_DOCUMENT_TYPES,
)
from gateway.platforms.telegram_network import (
TelegramFallbackTransport,
discover_fallback_ips,
parse_fallback_ip_env,
)
def check_telegram_requirements() -> bool:
@@ -115,6 +123,8 @@ class TelegramAdapter(BasePlatformAdapter):
super().__init__(config, Platform.TELEGRAM)
self._app: Optional[Application] = None
self._bot: Optional[Bot] = None
self._webhook_mode: bool = False
self._mention_patterns = self._compile_mention_patterns()
self._reply_to_mode: str = getattr(config, 'reply_to_mode', 'first') or 'first'
# Buffer rapid/album photo updates so Telegram image bursts are handled
# as a single MessageEvent instead of self-interrupting multiple turns.
@@ -138,6 +148,13 @@ class TelegramAdapter(BasePlatformAdapter):
# DM Topics config from extra.dm_topics
self._dm_topics_config: List[Dict[str, Any]] = self.config.extra.get("dm_topics", [])
def _fallback_ips(self) -> list[str]:
"""Return validated fallback IPs from config (populated by _apply_env_overrides)."""
configured = self.config.extra.get("fallback_ips", []) if getattr(self.config, "extra", None) else []
if isinstance(configured, str):
configured = configured.split(",")
return parse_fallback_ip_env(",".join(str(v) for v in configured) if configured else None)
@staticmethod
def _looks_like_polling_conflict(error: Exception) -> bool:
text = str(error).lower()
@@ -331,7 +348,8 @@ class TelegramAdapter(BasePlatformAdapter):
def _persist_dm_topic_thread_id(self, chat_id: int, topic_name: str, thread_id: int) -> None:
"""Save a newly created thread_id back into config.yaml so it persists across restarts."""
try:
config_path = _Path.home() / ".hermes" / "config.yaml"
from hermes_constants import get_hermes_home
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
logger.warning("[%s] Config file not found at %s, cannot persist thread_id", self.name, config_path)
return
@@ -441,7 +459,19 @@ class TelegramAdapter(BasePlatformAdapter):
self._persist_dm_topic_thread_id(int(chat_id), topic_name, thread_id)
async def connect(self) -> bool:
"""Connect to Telegram and start polling for updates."""
"""Connect to Telegram via polling or webhook.
By default, uses long polling (outbound connection to Telegram).
If ``TELEGRAM_WEBHOOK_URL`` is set, starts an HTTP webhook server
instead. Webhook mode is useful for cloud deployments (Fly.io,
Railway) where inbound HTTP can wake a suspended machine.
Env vars for webhook mode::
TELEGRAM_WEBHOOK_URL Public HTTPS URL (e.g. https://app.fly.dev/telegram)
TELEGRAM_WEBHOOK_PORT Local listen port (default 8443)
TELEGRAM_WEBHOOK_SECRET Secret token for update verification
"""
if not TELEGRAM_AVAILABLE:
logger.error(
"[%s] python-telegram-bot not installed. Run: pip install python-telegram-bot",
@@ -474,7 +504,26 @@ class TelegramAdapter(BasePlatformAdapter):
return False
# Build the application
self._app = Application.builder().token(self.config.token).build()
builder = Application.builder().token(self.config.token)
fallback_ips = self._fallback_ips()
if not fallback_ips:
fallback_ips = await discover_fallback_ips()
logger.info(
"[%s] Auto-discovered Telegram fallback IPs: %s",
self.name,
", ".join(fallback_ips),
)
if fallback_ips:
logger.warning(
"[%s] Telegram fallback IPs active: %s",
self.name,
", ".join(fallback_ips),
)
transport = TelegramFallbackTransport(fallback_ips)
request = HTTPXRequest(httpx_kwargs={"transport": transport})
get_updates_request = HTTPXRequest(httpx_kwargs={"transport": transport})
builder = builder.request(request).get_updates_request(get_updates_request)
self._app = builder.build()
self._bot = self._app.bot
# Register handlers
@@ -516,37 +565,76 @@ class TelegramAdapter(BasePlatformAdapter):
else:
raise
await self._app.start()
loop = asyncio.get_running_loop()
def _polling_error_callback(error: Exception) -> None:
if self._polling_error_task and not self._polling_error_task.done():
return
if self._looks_like_polling_conflict(error):
self._polling_error_task = loop.create_task(self._handle_polling_conflict(error))
elif self._looks_like_network_error(error):
logger.warning("[%s] Telegram network error, scheduling reconnect: %s", self.name, error)
self._polling_error_task = loop.create_task(self._handle_polling_network_error(error))
else:
logger.error("[%s] Telegram polling error: %s", self.name, error, exc_info=True)
# Decide between webhook and polling mode
webhook_url = os.getenv("TELEGRAM_WEBHOOK_URL", "").strip()
# Store reference for retry use in _handle_polling_conflict
self._polling_error_callback_ref = _polling_error_callback
if webhook_url:
# ── Webhook mode ─────────────────────────────────────
# Telegram pushes updates to our HTTP endpoint. This
# enables cloud platforms (Fly.io, Railway) to auto-wake
# suspended machines on inbound HTTP traffic.
webhook_port = int(os.getenv("TELEGRAM_WEBHOOK_PORT", "8443"))
webhook_secret = os.getenv("TELEGRAM_WEBHOOK_SECRET", "").strip() or None
from urllib.parse import urlparse
webhook_path = urlparse(webhook_url).path or "/telegram"
await self._app.updater.start_polling(
allowed_updates=Update.ALL_TYPES,
drop_pending_updates=True,
error_callback=_polling_error_callback,
)
await self._app.updater.start_webhook(
listen="0.0.0.0",
port=webhook_port,
url_path=webhook_path,
webhook_url=webhook_url,
secret_token=webhook_secret,
allowed_updates=Update.ALL_TYPES,
drop_pending_updates=True,
)
self._webhook_mode = True
logger.info(
"[%s] Webhook server listening on 0.0.0.0:%d%s",
self.name, webhook_port, webhook_path,
)
else:
# ── Polling mode (default) ───────────────────────────
loop = asyncio.get_running_loop()
def _polling_error_callback(error: Exception) -> None:
if self._polling_error_task and not self._polling_error_task.done():
return
if self._looks_like_polling_conflict(error):
self._polling_error_task = loop.create_task(self._handle_polling_conflict(error))
elif self._looks_like_network_error(error):
logger.warning("[%s] Telegram network error, scheduling reconnect: %s", self.name, error)
self._polling_error_task = loop.create_task(self._handle_polling_network_error(error))
else:
logger.error("[%s] Telegram polling error: %s", self.name, error, exc_info=True)
# Store reference for retry use in _handle_polling_conflict
self._polling_error_callback_ref = _polling_error_callback
await self._app.updater.start_polling(
allowed_updates=Update.ALL_TYPES,
drop_pending_updates=True,
error_callback=_polling_error_callback,
)
# Register bot commands so Telegram shows a hint menu when users type /
# List is derived from the central COMMAND_REGISTRY — adding a new
# gateway command there automatically adds it to the Telegram menu.
try:
from telegram import BotCommand
from hermes_cli.commands import telegram_bot_commands
from hermes_cli.commands import telegram_menu_commands
# Telegram allows up to 100 commands but has an undocumented
# payload size limit. Skill descriptions are truncated to 40
# chars in telegram_menu_commands() to fit 100 commands safely.
menu_commands, hidden_count = telegram_menu_commands(max_commands=100)
await self._bot.set_my_commands([
BotCommand(name, desc) for name, desc in telegram_bot_commands()
BotCommand(name, desc) for name, desc in menu_commands
])
if hidden_count:
logger.info(
"[%s] Telegram menu: %d commands registered, %d hidden (over 100 limit). Use /commands for full list.",
self.name, len(menu_commands), hidden_count,
)
except Exception as e:
logger.warning(
"[%s] Could not register Telegram command menu: %s",
@@ -556,7 +644,8 @@ class TelegramAdapter(BasePlatformAdapter):
)
self._mark_connected()
logger.info("[%s] Connected and polling for Telegram updates", self.name)
mode = "webhook" if self._webhook_mode else "polling"
logger.info("[%s] Connected to Telegram (%s mode)", self.name, mode)
# Set up DM topics (Bot API 9.4 — Private Chat Topics)
# Runs after connection is established so the bot can call createForumTopic.
@@ -584,7 +673,7 @@ class TelegramAdapter(BasePlatformAdapter):
return False
async def disconnect(self) -> None:
"""Stop polling, cancel pending album flushes, and disconnect."""
"""Stop polling/webhook, cancel pending album flushes, and disconnect."""
pending_media_group_tasks = list(self._media_group_tasks.values())
for task in pending_media_group_tasks:
task.cancel()
@@ -674,9 +763,15 @@ class TelegramAdapter(BasePlatformAdapter):
except ImportError:
_NetErr = OSError # type: ignore[misc,assignment]
try:
from telegram.error import BadRequest as _BadReq
except ImportError:
_BadReq = None # type: ignore[assignment,misc]
for i, chunk in enumerate(chunks):
should_thread = self._should_thread_reply(reply_to, i)
reply_to_id = int(reply_to) if should_thread else None
effective_thread_id = int(thread_id) if thread_id else None
msg = None
for _send_attempt in range(3):
@@ -688,7 +783,7 @@ class TelegramAdapter(BasePlatformAdapter):
text=chunk,
parse_mode=ParseMode.MARKDOWN_V2,
reply_to_message_id=reply_to_id,
message_thread_id=int(thread_id) if thread_id else None,
message_thread_id=effective_thread_id,
)
except Exception as md_error:
# Markdown parsing failed, try plain text
@@ -700,12 +795,40 @@ class TelegramAdapter(BasePlatformAdapter):
text=plain_chunk,
parse_mode=None,
reply_to_message_id=reply_to_id,
message_thread_id=int(thread_id) if thread_id else None,
message_thread_id=effective_thread_id,
)
else:
raise
break # success
except _NetErr as send_err:
# BadRequest is a subclass of NetworkError in
# python-telegram-bot but represents permanent errors
# (not transient network issues). Detect and handle
# specific cases instead of blindly retrying.
if _BadReq and isinstance(send_err, _BadReq):
err_lower = str(send_err).lower()
if "thread not found" in err_lower and effective_thread_id is not None:
# Thread doesn't exist — retry without
# message_thread_id so the message still
# reaches the chat.
logger.warning(
"[%s] Thread %s not found, retrying without message_thread_id",
self.name, effective_thread_id,
)
effective_thread_id = None
continue
if "message to be replied not found" in err_lower and reply_to_id is not None:
# Original message was deleted before we
# could reply — clear reply target and retry
# so the response is still delivered.
logger.warning(
"[%s] Reply target deleted, retrying without reply_to: %s",
self.name, send_err,
)
reply_to_id = None
continue
# Other BadRequest errors are permanent — don't retry
raise
if _send_attempt < 2:
wait = 2 ** _send_attempt
logger.warning("[%s] Network error on send (attempt %d/3), retrying in %ds: %s",
@@ -1257,6 +1380,148 @@ class TelegramAdapter(BasePlatformAdapter):
return text
# ── Group mention gating ──────────────────────────────────────────────
def _telegram_require_mention(self) -> bool:
"""Return whether group chats should require an explicit bot trigger."""
configured = self.config.extra.get("require_mention")
if configured is not None:
if isinstance(configured, str):
return configured.lower() in ("true", "1", "yes", "on")
return bool(configured)
return os.getenv("TELEGRAM_REQUIRE_MENTION", "false").lower() in ("true", "1", "yes", "on")
def _telegram_free_response_chats(self) -> set[str]:
raw = self.config.extra.get("free_response_chats")
if raw is None:
raw = os.getenv("TELEGRAM_FREE_RESPONSE_CHATS", "")
if isinstance(raw, list):
return {str(part).strip() for part in raw if str(part).strip()}
return {part.strip() for part in str(raw).split(",") if part.strip()}
def _compile_mention_patterns(self) -> List[re.Pattern]:
"""Compile optional regex wake-word patterns for group triggers."""
patterns = self.config.extra.get("mention_patterns")
if patterns is None:
raw = os.getenv("TELEGRAM_MENTION_PATTERNS", "").strip()
if raw:
try:
loaded = json.loads(raw)
except Exception:
loaded = [part.strip() for part in raw.splitlines() if part.strip()]
if not loaded:
loaded = [part.strip() for part in raw.split(",") if part.strip()]
patterns = loaded
if patterns is None:
return []
if isinstance(patterns, str):
patterns = [patterns]
if not isinstance(patterns, list):
logger.warning(
"[%s] telegram mention_patterns must be a list or string; got %s",
self.name,
type(patterns).__name__,
)
return []
compiled: List[re.Pattern] = []
for pattern in patterns:
if not isinstance(pattern, str) or not pattern.strip():
continue
try:
compiled.append(re.compile(pattern, re.IGNORECASE))
except re.error as exc:
logger.warning("[%s] Invalid Telegram mention pattern %r: %s", self.name, pattern, exc)
if compiled:
logger.info("[%s] Loaded %d Telegram mention pattern(s)", self.name, len(compiled))
return compiled
def _is_group_chat(self, message: Message) -> bool:
chat = getattr(message, "chat", None)
if not chat:
return False
chat_type = str(getattr(chat, "type", "")).split(".")[-1].lower()
return chat_type in ("group", "supergroup")
def _is_reply_to_bot(self, message: Message) -> bool:
if not self._bot or not getattr(message, "reply_to_message", None):
return False
reply_user = getattr(message.reply_to_message, "from_user", None)
return bool(reply_user and getattr(reply_user, "id", None) == getattr(self._bot, "id", None))
def _message_mentions_bot(self, message: Message) -> bool:
if not self._bot:
return False
bot_username = (getattr(self._bot, "username", None) or "").lstrip("@").lower()
bot_id = getattr(self._bot, "id", None)
def _iter_sources():
yield getattr(message, "text", None) or "", getattr(message, "entities", None) or []
yield getattr(message, "caption", None) or "", getattr(message, "caption_entities", None) or []
for source_text, entities in _iter_sources():
if bot_username and f"@{bot_username}" in source_text.lower():
return True
for entity in entities:
entity_type = str(getattr(entity, "type", "")).split(".")[-1].lower()
if entity_type == "mention" and bot_username:
offset = int(getattr(entity, "offset", -1))
length = int(getattr(entity, "length", 0))
if offset < 0 or length <= 0:
continue
if source_text[offset:offset + length].strip().lower() == f"@{bot_username}":
return True
elif entity_type == "text_mention":
user = getattr(entity, "user", None)
if user and getattr(user, "id", None) == bot_id:
return True
return False
def _message_matches_mention_patterns(self, message: Message) -> bool:
if not self._mention_patterns:
return False
for candidate in (getattr(message, "text", None), getattr(message, "caption", None)):
if not candidate:
continue
for pattern in self._mention_patterns:
if pattern.search(candidate):
return True
return False
def _clean_bot_trigger_text(self, text: Optional[str]) -> Optional[str]:
if not text or not self._bot or not getattr(self._bot, "username", None):
return text
username = re.escape(self._bot.username)
cleaned = re.sub(rf"(?i)@{username}\b[,:\-]*\s*", "", text).strip()
return cleaned or text
def _should_process_message(self, message: Message, *, is_command: bool = False) -> bool:
"""Apply Telegram group trigger rules.
DMs remain unrestricted. Group/supergroup messages are accepted when any of the following holds:
- the chat is explicitly allowlisted in ``free_response_chats``
- ``require_mention`` is disabled
- the message is a command
- the message replies to the bot
- the bot is @mentioned
- the text/caption matches a configured regex wake-word pattern
"""
if not self._is_group_chat(message):
return True
if str(getattr(getattr(message, "chat", None), "id", "")) in self._telegram_free_response_chats():
return True
if not self._telegram_require_mention():
return True
if is_command:
return True
if self._is_reply_to_bot(message):
return True
if self._message_mentions_bot(message):
return True
return self._message_matches_mention_patterns(message)
async def _handle_text_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle incoming text messages.
@@ -1266,14 +1531,19 @@ class TelegramAdapter(BasePlatformAdapter):
"""
if not update.message or not update.message.text:
return
if not self._should_process_message(update.message):
return
event = self._build_message_event(update.message, MessageType.TEXT)
event.text = self._clean_bot_trigger_text(event.text)
self._enqueue_text_event(event)
async def _handle_command(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle incoming command messages."""
if not update.message or not update.message.text:
return
if not self._should_process_message(update.message, is_command=True):
return
event = self._build_message_event(update.message, MessageType.COMMAND)
await self.handle_message(event)
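The group-trigger rules that gate these handlers can be sketched in isolation. This is a minimal, hypothetical model rather than the adapter's own code: `Msg`, `should_process`, and `clean_trigger` are illustrative names, and the message is a plain dataclass instead of a python-telegram-bot `Message`. Entity-based mention detection is left out for brevity.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Msg:
    """Hypothetical stand-in for a Telegram message (not the real library type)."""
    chat_id: str
    chat_type: str                      # "private", "group", or "supergroup"
    text: str = ""
    reply_to_user_id: Optional[int] = None


def should_process(
    msg: Msg,
    *,
    bot_id: int,
    bot_username: str,
    require_mention: bool,
    free_chats: Set[str],
    patterns: List[re.Pattern],
    is_command: bool = False,
) -> bool:
    """Return True when any one of the trigger rules accepts the message."""
    if msg.chat_type not in ("group", "supergroup"):
        return True                     # DMs are never gated
    if msg.chat_id in free_chats or not require_mention or is_command:
        return True
    if msg.reply_to_user_id == bot_id:
        return True                     # a reply to the bot counts as a trigger
    if f"@{bot_username.lower()}" in msg.text.lower():
        return True
    return any(p.search(msg.text) for p in patterns)


def clean_trigger(text: str, bot_username: str) -> str:
    """Strip "@bot" plus trailing punctuation; keep the original if nothing is left."""
    cleaned = re.sub(rf"(?i)@{re.escape(bot_username)}\b[,:\-]*\s*", "", text).strip()
    return cleaned or text
```

The rules are checked cheapest-first and short-circuit, so an allowlisted chat or a disabled `require_mention` never reaches the regex patterns.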
@@ -1282,6 +1552,8 @@ class TelegramAdapter(BasePlatformAdapter):
"""Handle incoming location/venue pin messages."""
if not update.message:
return
if not self._should_process_message(update.message):
return
msg = update.message
venue = getattr(msg, "venue", None)
@@ -1425,6 +1697,8 @@ class TelegramAdapter(BasePlatformAdapter):
"""Handle incoming media messages, downloading images to local cache."""
if not update.message:
return
if not self._should_process_message(update.message):
return
msg = update.message
@@ -1448,7 +1722,7 @@ class TelegramAdapter(BasePlatformAdapter):
# Add caption as text
if msg.caption:
event.text = msg.caption
event.text = self._clean_bot_trigger_text(msg.caption)
# Handle stickers: describe via vision tool with caching
if msg.sticker:
@@ -1700,7 +1974,8 @@ class TelegramAdapter(BasePlatformAdapter):
recognized without a gateway restart.
"""
try:
config_path = _Path.home() / ".hermes" / "config.yaml"
from hermes_constants import get_hermes_home
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
return
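The send path earlier in this adapter's diff retries after clearing whichever optional target Telegram rejected, and re-raises every other `BadRequest`. A minimal sketch of that pattern follows, assuming a hypothetical `BadRequest` exception and a caller-supplied `send` callable; neither is the python-telegram-bot API, and the network-error backoff branch is omitted.

```python
from typing import Callable, Optional


class BadRequest(Exception):
    """Hypothetical permanent-error type, standing in for the library's own."""


def send_with_degradation(
    send: Callable[[Optional[int], Optional[int]], str],
    thread_id: Optional[int],
    reply_to_id: Optional[int],
    max_attempts: int = 3,
) -> str:
    """Call send(thread_id, reply_to_id), dropping a rejected target and retrying."""
    for _ in range(max_attempts):
        try:
            return send(thread_id, reply_to_id)
        except BadRequest as err:
            text = str(err).lower()
            if "thread not found" in text and thread_id is not None:
                thread_id = None        # deliver to the chat without a topic
                continue
            if "message to be replied not found" in text and reply_to_id is not None:
                reply_to_id = None      # reply target was deleted; send unthreaded
                continue
            raise                       # any other BadRequest is permanent
    raise RuntimeError("send did not succeed within max_attempts")
```

Each degradation clears its target before retrying, so the same error cannot trigger the same branch twice and the loop cannot spin on one failure.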
+245
@@ -0,0 +1,245 @@
"""Telegram-specific network helpers.
Provides a hostname-preserving fallback transport for networks where
api.telegram.org resolves to an endpoint that is unreachable from the current
host. The transport keeps the logical request host and TLS SNI as
api.telegram.org while retrying the TCP connection against one or more fallback
IPv4 addresses.
"""
from __future__ import annotations
import asyncio
import ipaddress
import logging
import os
import socket
from typing import Iterable, Optional
import httpx
logger = logging.getLogger(__name__)
_TELEGRAM_API_HOST = "api.telegram.org"
# DNS-over-HTTPS providers used to discover Telegram API IPs that may differ
# from the (potentially unreachable) IP returned by the local system resolver.
_DOH_TIMEOUT = 4.0 # seconds — bounded so connect() isn't noticeably delayed
_DOH_PROVIDERS: list[dict] = [
{
"url": "https://dns.google/resolve",
"params": {"name": _TELEGRAM_API_HOST, "type": "A"},
"headers": {},
},
{
"url": "https://cloudflare-dns.com/dns-query",
"params": {"name": _TELEGRAM_API_HOST, "type": "A"},
"headers": {"Accept": "application/dns-json"},
},
]
# Last-resort IPs when DoH is also blocked. These are stable Telegram Bot API
# endpoints in the 149.154.160.0/20 block (same seed used by OpenClaw).
_SEED_FALLBACK_IPS: list[str] = ["149.154.167.220"]
def _resolve_proxy_url() -> str | None:
for key in ("HTTPS_PROXY", "HTTP_PROXY", "ALL_PROXY", "https_proxy", "http_proxy", "all_proxy"):
value = (os.environ.get(key) or "").strip()
if value:
return value
return None
class TelegramFallbackTransport(httpx.AsyncBaseTransport):
"""Retry Telegram Bot API requests via fallback IPs while preserving TLS/SNI.
Requests continue to target https://api.telegram.org/... logically, but on
connect failures the underlying TCP connection is retried against a known
reachable IP. This is effectively the programmatic equivalent of
``curl --resolve api.telegram.org:443:<ip>``.
"""
def __init__(self, fallback_ips: Iterable[str], **transport_kwargs):
self._fallback_ips = [ip for ip in dict.fromkeys(_normalize_fallback_ips(fallback_ips))]
proxy_url = _resolve_proxy_url()
if proxy_url and "proxy" not in transport_kwargs:
transport_kwargs["proxy"] = proxy_url
self._primary = httpx.AsyncHTTPTransport(**transport_kwargs)
self._fallbacks = {
ip: httpx.AsyncHTTPTransport(**transport_kwargs) for ip in self._fallback_ips
}
self._sticky_ip: Optional[str] = None
self._sticky_lock = asyncio.Lock()
async def handle_async_request(self, request: httpx.Request) -> httpx.Response:
if request.url.host != _TELEGRAM_API_HOST or not self._fallback_ips:
return await self._primary.handle_async_request(request)
sticky_ip = self._sticky_ip
attempt_order: list[Optional[str]] = [sticky_ip] if sticky_ip else [None]
for ip in self._fallback_ips:
if ip != sticky_ip:
attempt_order.append(ip)
last_error: Exception | None = None
for ip in attempt_order:
candidate = request if ip is None else _rewrite_request_for_ip(request, ip)
transport = self._primary if ip is None else self._fallbacks[ip]
try:
response = await transport.handle_async_request(candidate)
if ip is not None and self._sticky_ip != ip:
async with self._sticky_lock:
if self._sticky_ip != ip:
self._sticky_ip = ip
logger.warning(
"[Telegram] Primary api.telegram.org path unreachable; using sticky fallback IP %s",
ip,
)
return response
except Exception as exc:
last_error = exc
if not _is_retryable_connect_error(exc):
raise
if ip is None:
logger.warning(
"[Telegram] Primary api.telegram.org connection failed (%s); trying fallback IPs %s",
exc,
", ".join(self._fallback_ips),
)
continue
logger.warning("[Telegram] Fallback IP %s failed: %s", ip, exc)
continue
assert last_error is not None
raise last_error
async def aclose(self) -> None:
await self._primary.aclose()
for transport in self._fallbacks.values():
await transport.aclose()
def _normalize_fallback_ips(values: Iterable[str]) -> list[str]:
normalized: list[str] = []
for value in values:
raw = str(value).strip()
if not raw:
continue
try:
addr = ipaddress.ip_address(raw)
except ValueError:
logger.warning("Ignoring invalid Telegram fallback IP: %r", raw)
continue
if addr.version != 4:
logger.warning("Ignoring non-IPv4 Telegram fallback IP: %s", raw)
continue
normalized.append(str(addr))
return normalized
def parse_fallback_ip_env(value: str | None) -> list[str]:
if not value:
return []
parts = [part.strip() for part in value.split(",")]
return _normalize_fallback_ips(parts)
def _resolve_system_dns() -> set[str]:
"""Return the IPv4 addresses that the OS resolver gives for api.telegram.org."""
try:
results = socket.getaddrinfo(_TELEGRAM_API_HOST, 443, socket.AF_INET)
return {addr[4][0] for addr in results}
except Exception:
return set()
async def _query_doh_provider(
client: httpx.AsyncClient, provider: dict
) -> list[str]:
"""Query one DoH provider and return A-record IPs."""
try:
resp = await client.get(
provider["url"], params=provider["params"], headers=provider["headers"]
)
resp.raise_for_status()
data = resp.json()
ips: list[str] = []
for answer in data.get("Answer", []):
if answer.get("type") != 1: # A record
continue
raw = answer.get("data", "").strip()
try:
ipaddress.ip_address(raw)
ips.append(raw)
except ValueError:
continue
return ips
except Exception as exc:
logger.debug("DoH query to %s failed: %s", provider["url"], exc)
return []
async def discover_fallback_ips() -> list[str]:
"""Auto-discover Telegram API IPs via DNS-over-HTTPS.
Resolves api.telegram.org through Google and Cloudflare DoH, collects all
unique IPs, and excludes the system-DNS-resolved IP (which is presumably
unreachable on this network). Falls back to a hardcoded seed list when DoH
is also unavailable.
"""
async with httpx.AsyncClient(timeout=httpx.Timeout(_DOH_TIMEOUT)) as client:
doh_tasks = [_query_doh_provider(client, p) for p in _DOH_PROVIDERS]
system_dns_task = asyncio.to_thread(_resolve_system_dns)
results = await asyncio.gather(system_dns_task, *doh_tasks, return_exceptions=True)
# results[0] = system DNS IPs (set), results[1:] = DoH IP lists
system_ips: set[str] = results[0] if isinstance(results[0], set) else set()
doh_ips: list[str] = []
for r in results[1:]:
if isinstance(r, list):
doh_ips.extend(r)
# Deduplicate preserving order, exclude system-DNS IPs
seen: set[str] = set()
candidates: list[str] = []
for ip in doh_ips:
if ip not in seen and ip not in system_ips:
seen.add(ip)
candidates.append(ip)
# Validate through existing normalization
validated = _normalize_fallback_ips(candidates)
if validated:
logger.debug("Discovered Telegram fallback IPs via DoH: %s", ", ".join(validated))
return validated
logger.info(
"DoH discovery yielded no new IPs (system DNS: %s); using seed fallback IPs %s",
", ".join(system_ips) or "unknown",
", ".join(_SEED_FALLBACK_IPS),
)
return list(_SEED_FALLBACK_IPS)
def _rewrite_request_for_ip(request: httpx.Request, ip: str) -> httpx.Request:
original_host = request.url.host or _TELEGRAM_API_HOST
url = request.url.copy_with(host=ip)
headers = request.headers.copy()
headers["host"] = original_host
extensions = dict(request.extensions)
extensions["sni_hostname"] = original_host
return httpx.Request(
method=request.method,
url=url,
headers=headers,
stream=request.stream,
extensions=extensions,
)
def _is_retryable_connect_error(exc: Exception) -> bool:
return isinstance(exc, (httpx.ConnectTimeout, httpx.ConnectError))
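Two pieces of pure logic in this module are easy to check on their own: the order in which connection paths are tried, and the filtering of DoH answers against the system resolver's result. The sketch below restates both with the standard library only. The names `attempt_order` and `pick_candidates` are hypothetical, and folding the IPv4 check into the filter is a simplification of the module's separate normalization step.

```python
import ipaddress
from typing import Iterable, List, Optional, Set


def attempt_order(sticky_ip: Optional[str], fallback_ips: List[str]) -> List[Optional[str]]:
    """None means "use the normal DNS path"; a sticky IP replaces it as first try."""
    order: List[Optional[str]] = [sticky_ip] if sticky_ip else [None]
    order.extend(ip for ip in fallback_ips if ip != sticky_ip)
    return order


def pick_candidates(doh_ips: Iterable[str], system_ips: Set[str]) -> List[str]:
    """Keep unique IPv4 addresses from DoH that the system resolver did not return."""
    seen: Set[str] = set()
    out: List[str] = []
    for raw in doh_ips:
        try:
            addr = ipaddress.ip_address(raw.strip())
        except ValueError:
            continue                    # skip anything that is not an IP literal
        ip = str(addr)
        if addr.version == 4 and ip not in seen and ip not in system_ips:
            seen.add(ip)
            out.append(ip)
    return out
```

With this ordering, once a sticky IP has been recorded the primary DNS path is no longer part of the attempt list, which matches the behaviour of `handle_async_request` above.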
+58 -1
@@ -27,6 +27,7 @@ import hashlib
import hmac
import json
import logging
import os
import re
import subprocess
import time
@@ -53,6 +54,7 @@ logger = logging.getLogger(__name__)
DEFAULT_HOST = "0.0.0.0"
DEFAULT_PORT = 8644
_INSECURE_NO_AUTH = "INSECURE_NO_AUTH"
_DYNAMIC_ROUTES_FILENAME = "webhook_subscriptions.json"
def check_webhook_requirements() -> bool:
@@ -68,7 +70,10 @@ class WebhookAdapter(BasePlatformAdapter):
self._host: str = config.extra.get("host", DEFAULT_HOST)
self._port: int = int(config.extra.get("port", DEFAULT_PORT))
self._global_secret: str = config.extra.get("secret", "")
self._routes: Dict[str, dict] = config.extra.get("routes", {})
self._static_routes: Dict[str, dict] = config.extra.get("routes", {})
self._dynamic_routes: Dict[str, dict] = {}
self._dynamic_routes_mtime: float = 0.0
self._routes: Dict[str, dict] = dict(self._static_routes)
self._runner = None
# Delivery info keyed by session chat_id — consumed by send()
@@ -96,6 +101,9 @@ class WebhookAdapter(BasePlatformAdapter):
# ------------------------------------------------------------------
async def connect(self) -> bool:
# Load agent-created subscriptions before validating
self._reload_dynamic_routes()
# Validate routes at startup — secret is required per route
for name, route in self._routes.items():
secret = route.get("secret", self._global_secret)
@@ -110,6 +118,17 @@ class WebhookAdapter(BasePlatformAdapter):
app.router.add_get("/health", self._handle_health)
app.router.add_post("/webhooks/{route_name}", self._handle_webhook)
# Port conflict detection — fail fast if port is already in use
import socket as _socket
try:
with _socket.socket(_socket.AF_INET, _socket.SOCK_STREAM) as _s:
_s.settimeout(1)
_s.connect(('127.0.0.1', self._port))
logger.error('[webhook] Port %d already in use. Set a different port in config.yaml: platforms.webhook.port', self._port)
return False
except (ConnectionRefusedError, OSError):
pass # port is free
self._runner = web.AppRunner(app)
await self._runner.setup()
site = web.TCPSite(self._runner, self._host, self._port)
@@ -182,8 +201,46 @@ class WebhookAdapter(BasePlatformAdapter):
"""GET /health — simple health check."""
return web.json_response({"status": "ok", "platform": "webhook"})
def _reload_dynamic_routes(self) -> None:
"""Reload agent-created subscriptions from disk if the file changed."""
from pathlib import Path as _Path
hermes_home = _Path(
os.getenv("HERMES_HOME", str(_Path.home() / ".hermes"))
).expanduser()
subs_path = hermes_home / _DYNAMIC_ROUTES_FILENAME
if not subs_path.exists():
if self._dynamic_routes:
self._dynamic_routes = {}
self._routes = dict(self._static_routes)
logger.debug("[webhook] Dynamic subscriptions file removed, cleared dynamic routes")
return
try:
mtime = subs_path.stat().st_mtime
if mtime <= self._dynamic_routes_mtime:
return # No change
data = json.loads(subs_path.read_text(encoding="utf-8"))
if not isinstance(data, dict):
return
# Merge: static routes take precedence over dynamic ones
self._dynamic_routes = {
k: v for k, v in data.items()
if k not in self._static_routes
}
self._routes = {**self._dynamic_routes, **self._static_routes}
self._dynamic_routes_mtime = mtime
logger.info(
"[webhook] Reloaded %d dynamic route(s): %s",
len(self._dynamic_routes),
", ".join(self._dynamic_routes.keys()) or "(none)",
)
except Exception as e:
logger.warning("[webhook] Failed to reload dynamic routes: %s", e)
async def _handle_webhook(self, request: "web.Request") -> "web.Response":
"""POST /webhooks/{route_name} — receive and process a webhook event."""
# Hot-reload dynamic subscriptions on each request (mtime-gated, cheap)
self._reload_dynamic_routes()
route_name = request.match_info.get("route_name", "")
route_config = self._routes.get(route_name)
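The mtime-gated reload with static-route precedence can be sketched independently of the adapter. `RouteTable` is a hypothetical class used only for illustration; it assumes, as the diff does, that the subscriptions file holds a JSON object keyed by route name.

```python
import json
from pathlib import Path
from typing import Dict


class RouteTable:
    """Hypothetical holder for static routes plus a hot-reloaded JSON file."""

    def __init__(self, static: Dict[str, dict], path: Path) -> None:
        self.static = static
        self.path = path
        self.mtime = 0.0
        self.routes: Dict[str, dict] = dict(static)

    def reload(self) -> bool:
        """Re-read the file only if its mtime advanced; return True on a reload."""
        if not self.path.exists():
            self.routes = dict(self.static)   # file removed: static routes only
            return False
        mtime = self.path.stat().st_mtime
        if mtime <= self.mtime:
            return False                # unchanged since the last successful load
        data = json.loads(self.path.read_text(encoding="utf-8"))
        if not isinstance(data, dict):
            return False
        # Static entries win on a name clash because they are merged last.
        self.routes = {**data, **self.static}
        self.mtime = mtime
        return True
```

Calling `reload()` on every request costs one `stat()` when nothing changed, which is why the handler can afford to do it unconditionally.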
File diff suppressed because it is too large
+151 -104
@@ -26,6 +26,7 @@ from pathlib import Path
from typing import Dict, Optional, Any
from hermes_cli.config import get_hermes_home
from hermes_constants import get_hermes_dir
logger = logging.getLogger(__name__)
@@ -134,13 +135,15 @@ class WhatsAppAdapter(BasePlatformAdapter):
)
self._session_path: Path = Path(config.extra.get(
"session_path",
get_hermes_home() / "whatsapp" / "session"
get_hermes_dir("platforms/whatsapp/session", "whatsapp/session")
))
self._reply_prefix: Optional[str] = config.extra.get("reply_prefix")
self._message_queue: asyncio.Queue = asyncio.Queue()
self._bridge_log_fh = None
self._bridge_log: Optional[Path] = None
self._poll_task: Optional[asyncio.Task] = None
self._http_session: Optional["aiohttp.ClientSession"] = None
self._session_lock_identity: Optional[str] = None
async def connect(self) -> bool:
"""
@@ -159,6 +162,29 @@ class WhatsAppAdapter(BasePlatformAdapter):
logger.info("[%s] Bridge found at %s", self.name, bridge_path)
# Acquire scoped lock to prevent duplicate sessions
try:
from gateway.status import acquire_scoped_lock
self._session_lock_identity = str(self._session_path)
acquired, existing = acquire_scoped_lock(
"whatsapp-session",
self._session_lock_identity,
metadata={"platform": self.platform.value},
)
if not acquired:
owner_pid = existing.get("pid") if isinstance(existing, dict) else None
message = (
"Another local Hermes gateway is already using this WhatsApp session"
+ (f" (PID {owner_pid})." if owner_pid else ".")
+ " Stop the other gateway before starting a second WhatsApp bridge."
)
logger.error("[%s] %s", self.name, message)
self._set_fatal_error("whatsapp_session_lock", message, retryable=False)
return False
except Exception as e:
logger.warning("[%s] Could not acquire session lock (non-fatal): %s", self.name, e)
# Auto-install npm dependencies if node_modules doesn't exist
bridge_dir = bridge_path.parent
if not (bridge_dir / "node_modules").exists():
@@ -199,6 +225,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
print(f"[{self.name}] Using existing bridge (status: {bridge_status})")
self._mark_connected()
self._bridge_process = None # Not managed by us
self._http_session = aiohttp.ClientSession()
self._poll_task = asyncio.create_task(self._poll_messages())
return True
else:
@@ -304,6 +331,9 @@ class WhatsAppAdapter(BasePlatformAdapter):
print(f"[{self.name}] Bridge log: {self._bridge_log}")
print(f"[{self.name}] If session expired, re-pair: hermes whatsapp")
# Create a persistent HTTP session for all bridge communication
self._http_session = aiohttp.ClientSession()
# Start message polling task
self._poll_task = asyncio.create_task(self._poll_messages())
@@ -312,6 +342,12 @@ class WhatsAppAdapter(BasePlatformAdapter):
return True
except Exception as e:
if self._session_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("whatsapp-session", self._session_lock_identity)
except Exception:
pass
logger.error("[%s] Failed to start bridge: %s", self.name, e, exc_info=True)
self._close_bridge_log()
return False
@@ -369,10 +405,32 @@ class WhatsAppAdapter(BasePlatformAdapter):
else:
# Bridge was not started by us, don't kill it
print(f"[{self.name}] Disconnecting (external bridge left running)")
# Cancel the poll task explicitly
if self._poll_task and not self._poll_task.done():
self._poll_task.cancel()
try:
await self._poll_task
except (asyncio.CancelledError, Exception):
pass
self._poll_task = None
# Close the persistent HTTP session
if self._http_session and not self._http_session.closed:
await self._http_session.close()
self._http_session = None
if self._session_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("whatsapp-session", self._session_lock_identity)
except Exception as e:
logger.warning("[%s] Error releasing WhatsApp session lock: %s", self.name, e, exc_info=True)
self._mark_disconnected()
self._bridge_process = None
self._close_bridge_log()
self._session_lock_identity = None
print(f"[{self.name}] Disconnected")
async def send(
@@ -383,7 +441,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
metadata: Optional[Dict[str, Any]] = None
) -> SendResult:
"""Send a message via the WhatsApp bridge."""
if not self._running:
if not self._running or not self._http_session:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
@@ -391,36 +449,29 @@ class WhatsAppAdapter(BasePlatformAdapter):
try:
import aiohttp
payload = {
"chatId": chat_id,
"message": content,
}
if reply_to:
payload["replyTo"] = reply_to
async with aiohttp.ClientSession() as session:
payload = {
"chatId": chat_id,
"message": content,
}
if reply_to:
payload["replyTo"] = reply_to
async with session.post(
f"http://127.0.0.1:{self._bridge_port}/send",
json=payload,
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except ImportError:
return SendResult(
success=False,
error="aiohttp not installed. Run: pip install aiohttp"
)
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/send",
json=payload,
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
@@ -431,28 +482,27 @@ class WhatsAppAdapter(BasePlatformAdapter):
content: str,
) -> SendResult:
"""Edit a previously sent message via the WhatsApp bridge."""
if not self._running:
if not self._running or not self._http_session:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
return SendResult(success=False, error=bridge_exit)
try:
import aiohttp
async with aiohttp.ClientSession() as session:
async with session.post(
f"http://127.0.0.1:{self._bridge_port}/edit",
json={
"chatId": chat_id,
"messageId": message_id,
"message": content,
},
timeout=aiohttp.ClientTimeout(total=15)
) as resp:
if resp.status == 200:
return SendResult(success=True, message_id=message_id)
else:
error = await resp.text()
return SendResult(success=False, error=error)
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/edit",
json={
"chatId": chat_id,
"messageId": message_id,
"message": content,
},
timeout=aiohttp.ClientTimeout(total=15)
) as resp:
if resp.status == 200:
return SendResult(success=True, message_id=message_id)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
@@ -465,7 +515,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
file_name: Optional[str] = None,
) -> SendResult:
"""Send any media file via bridge /send-media endpoint."""
if not self._running:
if not self._running or not self._http_session:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
@@ -486,22 +536,21 @@ class WhatsAppAdapter(BasePlatformAdapter):
if file_name:
payload["fileName"] = file_name
async with aiohttp.ClientSession() as session:
async with session.post(
f"http://127.0.0.1:{self._bridge_port}/send-media",
json=payload,
timeout=aiohttp.ClientTimeout(total=120),
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data,
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/send-media",
json=payload,
timeout=aiohttp.ClientTimeout(total=120),
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data,
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
@@ -526,6 +575,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a local image file natively via bridge."""
return await self._send_media_to_bridge(chat_id, image_path, "image", caption)
@@ -536,6 +586,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
video_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a video natively via bridge — plays inline in WhatsApp."""
return await self._send_media_to_bridge(chat_id, video_path, "video", caption)
@@ -547,6 +598,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a document/file as a downloadable attachment via bridge."""
return await self._send_media_to_bridge(
@@ -556,45 +608,43 @@ class WhatsAppAdapter(BasePlatformAdapter):
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Send typing indicator via bridge."""
if not self._running:
if not self._running or not self._http_session:
return
if await self._check_managed_bridge_exit():
return
try:
import aiohttp
async with aiohttp.ClientSession() as session:
await session.post(
f"http://127.0.0.1:{self._bridge_port}/typing",
json={"chatId": chat_id},
timeout=aiohttp.ClientTimeout(total=5)
)
await self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/typing",
json={"chatId": chat_id},
timeout=aiohttp.ClientTimeout(total=5)
)
except Exception:
pass # Ignore typing indicator failures
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a WhatsApp chat."""
if not self._running:
if not self._running or not self._http_session:
return {"name": "Unknown", "type": "dm"}
if await self._check_managed_bridge_exit():
return {"name": chat_id, "type": "dm"}
try:
import aiohttp
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://127.0.0.1:{self._bridge_port}/chat/{chat_id}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
data = await resp.json()
return {
"name": data.get("name", chat_id),
"type": "group" if data.get("isGroup") else "dm",
"participants": data.get("participants", []),
}
async with self._http_session.get(
f"http://127.0.0.1:{self._bridge_port}/chat/{chat_id}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
data = await resp.json()
return {
"name": data.get("name", chat_id),
"type": "group" if data.get("isGroup") else "dm",
"participants": data.get("participants", []),
}
except Exception as e:
logger.debug("Could not get WhatsApp chat info for %s: %s", chat_id, e)
@@ -602,29 +652,26 @@ class WhatsAppAdapter(BasePlatformAdapter):
async def _poll_messages(self) -> None:
"""Poll the bridge for incoming messages."""
try:
import aiohttp
except ImportError:
print(f"[{self.name}] aiohttp not installed, message polling disabled")
return
import aiohttp
while self._running:
if not self._http_session:
break
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
print(f"[{self.name}] {bridge_exit}")
break
try:
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://127.0.0.1:{self._bridge_port}/messages",
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
messages = await resp.json()
for msg_data in messages:
event = await self._build_message_event(msg_data)
if event:
await self.handle_message(event)
async with self._http_session.get(
f"http://127.0.0.1:{self._bridge_port}/messages",
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
messages = await resp.json()
for msg_data in messages:
event = await self._build_message_event(msg_data)
if event:
await self.handle_message(event)
except asyncio.CancelledError:
break
except Exception as e:
+516 -52
@@ -77,6 +77,7 @@ sys.path.insert(0, str(Path(__file__).parent.parent))
# Resolve Hermes home directory (respects HERMES_HOME override)
from hermes_constants import get_hermes_home
from utils import atomic_yaml_write
_hermes_home = get_hermes_home()
# Load environment variables from ~/.hermes/.env first.
@@ -224,6 +225,49 @@ from gateway.session import (
from gateway.delivery import DeliveryRouter
from gateway.platforms.base import BasePlatformAdapter, MessageEvent, MessageType
def _normalize_whatsapp_identifier(value: str) -> str:
"""Strip WhatsApp JID/LID syntax down to its stable numeric identifier."""
return (
str(value or "")
.strip()
.replace("+", "", 1)
.split(":", 1)[0]
.split("@", 1)[0]
)
def _expand_whatsapp_auth_aliases(identifier: str) -> set:
"""Resolve WhatsApp phone/LID aliases using bridge session mapping files."""
normalized = _normalize_whatsapp_identifier(identifier)
if not normalized:
return set()
session_dir = _hermes_home / "whatsapp" / "session"
resolved = set()
queue = [normalized]
while queue:
current = queue.pop(0)
if not current or current in resolved:
continue
resolved.add(current)
for suffix in ("", "_reverse"):
mapping_path = session_dir / f"lid-mapping-{current}{suffix}.json"
if not mapping_path.exists():
continue
try:
mapped = _normalize_whatsapp_identifier(
json.loads(mapping_path.read_text(encoding="utf-8"))
)
except Exception:
continue
if mapped and mapped not in resolved:
queue.append(mapped)
return resolved
logger = logging.getLogger(__name__)
# Sentinel placed into _running_agents immediately when a session starts
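The alias expansion added in this hunk is a breadth-first walk over mapping files. The sketch below reproduces it with the session directory passed in as a parameter so it can be exercised against a temporary folder. The names `normalize_id` and `expand_aliases` are hypothetical, the sample number is fictional, and the assumption that each mapping file contains a single JSON string is inferred from the code above rather than from any bridge documentation.

```python
import json
from collections import deque
from pathlib import Path
from typing import Set


def normalize_id(value: str) -> str:
    """Reduce "+15551234567:12@s.whatsapp.net" to "15551234567"."""
    return str(value or "").strip().replace("+", "", 1).split(":", 1)[0].split("@", 1)[0]


def expand_aliases(identifier: str, session_dir: Path) -> Set[str]:
    """Follow lid-mapping-<id>[_reverse].json files breadth-first to collect aliases."""
    start = normalize_id(identifier)
    if not start:
        return set()
    resolved: Set[str] = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current in resolved:
            continue                    # already visited; avoids mapping cycles
        resolved.add(current)
        for suffix in ("", "_reverse"):
            path = session_dir / f"lid-mapping-{current}{suffix}.json"
            if not path.exists():
                continue
            try:
                mapped = normalize_id(json.loads(path.read_text(encoding="utf-8")))
            except (OSError, ValueError):
                continue                # unreadable or malformed mapping file
            if mapped and mapped not in resolved:
                queue.append(mapped)
    return resolved
```

Tracking visited identifiers in `resolved` is what keeps a forward mapping and its reverse from looping forever.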
@@ -257,6 +301,50 @@ def _resolve_runtime_agent_kwargs() -> dict:
}
def _check_unavailable_skill(command_name: str) -> str | None:
"""Check if a command matches a known-but-inactive skill.
Returns a helpful message if the skill exists but is disabled or only
available as an optional install. Returns None if no match found.
"""
# Normalize: command uses hyphens, skill names may use hyphens or underscores
normalized = command_name.lower().replace("_", "-")
try:
from tools.skills_tool import SKILLS_DIR, _get_disabled_skill_names
disabled = _get_disabled_skill_names()
# Check disabled built-in skills
for skill_md in SKILLS_DIR.rglob("SKILL.md"):
if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
continue
name = skill_md.parent.name.lower().replace("_", "-")
if name == normalized and name in disabled:
return (
f"The **{command_name}** skill is installed but disabled.\n"
f"Enable it with: `hermes skills config`"
)
# Check optional skills (shipped with repo but not installed)
from hermes_constants import get_hermes_home, get_optional_skills_dir
repo_root = Path(__file__).resolve().parent.parent
optional_dir = get_optional_skills_dir(repo_root / "optional-skills")
if optional_dir.exists():
for skill_md in optional_dir.rglob("SKILL.md"):
name = skill_md.parent.name.lower().replace("_", "-")
if name == normalized:
# Build install path: official/<category>/<name>
rel = skill_md.parent.relative_to(optional_dir)
parts = list(rel.parts)
install_path = f"official/{'/'.join(parts)}"
return (
f"The **{command_name}** skill is available but not installed.\n"
f"Install it with: `hermes skills install {install_path}`"
)
except Exception:
pass
return None
def _platform_config_key(platform: "Platform") -> str:
"""Map a Platform enum to its config.yaml key (LOCAL→"cli", rest→enum value)."""
return "cli" if platform == Platform.LOCAL else platform.value
@@ -279,16 +367,16 @@ def _resolve_gateway_model(config: dict | None = None) -> str:
"""Read model from env/config — mirrors the resolution in _run_agent_sync.
Without this, temporary AIAgent instances (memory flush, /compress) fall
back to the hardcoded default ("anthropic/claude-opus-4.6") which fails
when the active provider is openai-codex.
back to the hardcoded default which fails when the active provider is
openai-codex.
"""
model = os.getenv("HERMES_MODEL") or os.getenv("LLM_MODEL") or "anthropic/claude-opus-4.6"
model = os.getenv("HERMES_MODEL") or os.getenv("LLM_MODEL") or ""
cfg = config if config is not None else _load_gateway_config()
model_cfg = cfg.get("model", {})
if isinstance(model_cfg, str):
model = model_cfg
elif isinstance(model_cfg, dict):
model = model_cfg.get("default", model)
model = model_cfg.get("default") or model_cfg.get("model") or model
return model
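The switch from `.get("default", model)` to an `or` chain matters when the key exists but is empty: `dict.get` only uses its default for a missing key, so an empty `default:` would previously have overwritten the env value. A small sketch of the new resolution order, with the environment and config passed in as plain dicts (an assumption made for testability):

```python
def resolve_model(env: dict, cfg: dict) -> str:
    """Mirror the new resolution order: env vars, then model config keys."""
    model = env.get("HERMES_MODEL") or env.get("LLM_MODEL") or ""
    model_cfg = cfg.get("model", {})
    if isinstance(model_cfg, str):
        model = model_cfg
    elif isinstance(model_cfg, dict):
        # Empty or missing keys fall through instead of clobbering `model`.
        model = model_cfg.get("default") or model_cfg.get("model") or model
    return model
```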
@@ -388,6 +476,13 @@ class GatewayRunner:
self._honcho_managers: Dict[str, Any] = {}
self._honcho_configs: Dict[str, Any] = {}
# Rate-limit compression warning messages sent to users.
# Keyed by chat_id — value is the timestamp of the last warning sent.
# Prevents the warning from firing on every message when a session
# remains above the threshold after compression.
self._compression_warn_sent: Dict[str, float] = {}
self._compression_warn_cooldown: int = 3600 # seconds (1 hour)
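The cooldown bookkeeping introduced here can be sketched in isolation. The class name `WarnCooldown` and the injectable `now` argument are illustrative assumptions; the gateway stores the same data directly on `GatewayRunner`:

```python
import time


class WarnCooldown:
    """Allow at most one warning per chat within a cooldown window."""

    def __init__(self, cooldown_seconds: int = 3600):
        self.cooldown = cooldown_seconds
        self.last_sent = {}  # chat_id -> timestamp of the last warning

    def should_warn(self, chat_id: str, now: float = None) -> bool:
        now = time.time() if now is None else now
        if now - self.last_sent.get(chat_id, 0) >= self.cooldown:
            self.last_sent[chat_id] = now
            return True
        return False
```

Note that the map is keyed by chat and never pruned in this sketch, so a long-running process accumulates one entry per chat that has ever been warned.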
# Ensure tirith security scanner is available (downloads if needed)
try:
from tools.tirith_security import ensure_installed
@@ -432,7 +527,7 @@ class GatewayRunner:
from honcho_integration.session import HonchoSessionManager
hcfg = HonchoClientConfig.from_global_config()
if not hcfg.enabled or not hcfg.api_key:
if not hcfg.enabled or not (hcfg.api_key or hcfg.base_url):
return None, hcfg
client = get_honcho_client(hcfg)
@@ -573,6 +668,10 @@ class GatewayRunner:
session_id=old_session_id,
honcho_session_key=honcho_session_key,
)
# Fully silence the flush agent — quiet_mode only suppresses init
# messages; tool call output still leaks to the terminal through
# _safe_print → _print_fn. Set a no-op to prevent that.
tmp_agent._print_fn = lambda *a, **kw: None
# Build conversation history from transcript
msgs = [
@@ -741,10 +840,22 @@ class GatewayRunner:
logger.error("No connected messaging platforms remain. Shutting down gateway cleanly.")
await self.stop()
elif not self.adapters and self._failed_platforms:
logger.warning(
"No connected messaging platforms remain, but %d platform(s) queued for reconnection",
len(self._failed_platforms),
)
# All platforms are down and queued for background reconnection.
# If the error is retryable, exit with failure so systemd Restart=on-failure
# can restart the process. Otherwise stay alive and keep retrying in background.
if adapter.fatal_error_retryable:
self._exit_reason = adapter.fatal_error_message or "All messaging platforms failed with retryable errors"
self._exit_with_failure = True
logger.error(
"All messaging platforms failed with retryable errors. "
"Shutting down gateway for service restart (systemd will retry)."
)
await self.stop()
else:
logger.warning(
"No connected messaging platforms remain, but %d platform(s) queued for reconnection",
len(self._failed_platforms),
)
def _request_clean_exit(self, reason: str) -> None:
self._exit_cleanly = True
@@ -902,11 +1013,12 @@ class GatewayRunner:
return {}
@staticmethod
def _load_fallback_model() -> dict | None:
"""Load fallback model config from config.yaml.
def _load_fallback_model() -> list | dict | None:
"""Load fallback provider chain from config.yaml.
Returns a dict with 'provider' and 'model' keys, or None if
not configured / both fields empty.
Returns a list of provider dicts (``fallback_providers``), a single
dict (legacy ``fallback_model``), or None if not configured.
AIAgent.__init__ normalizes both formats into a chain.
"""
try:
import yaml as _y
@@ -914,8 +1026,8 @@ class GatewayRunner:
if cfg_path.exists():
with open(cfg_path, encoding="utf-8") as _f:
cfg = _y.safe_load(_f) or {}
fb = cfg.get("fallback_model", {}) or {}
if fb.get("provider") and fb.get("model"):
fb = cfg.get("fallback_providers") or cfg.get("fallback_model") or None
if fb:
return fb
except Exception:
pass
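The docstring says `AIAgent.__init__` normalizes both return shapes into a chain, but that code is not part of this diff. The following is only a guess at what such a normalizer might look like; `normalize_fallback_chain` is a hypothetical name, and the filtering on `provider` and `model` is borrowed from the check the old loader performed:

```python
def normalize_fallback_chain(fb) -> list:
    """Hypothetical: accept a list (fallback_providers), a dict (legacy
    fallback_model), or None, and return a list of usable provider dicts."""
    if not fb:
        return []
    candidates = fb if isinstance(fb, list) else [fb]
    return [
        c for c in candidates
        if isinstance(c, dict) and c.get("provider") and c.get("model")
    ]
```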
@@ -943,6 +1055,13 @@ class GatewayRunner:
"""
logger.info("Starting Hermes Gateway...")
logger.info("Session storage: %s", self.config.sessions_dir)
try:
from hermes_cli.profiles import get_active_profile_name
_profile = get_active_profile_name()
if _profile and _profile != "default":
logger.info("Active profile: %s", _profile)
except Exception:
pass
try:
from gateway.status import write_runtime_status
write_runtime_status(gateway_state="starting", exit_reason=None)
@@ -954,12 +1073,24 @@ class GatewayRunner:
os.getenv(v)
for v in ("TELEGRAM_ALLOWED_USERS", "DISCORD_ALLOWED_USERS",
"WHATSAPP_ALLOWED_USERS", "SLACK_ALLOWED_USERS",
"SIGNAL_ALLOWED_USERS", "EMAIL_ALLOWED_USERS",
"SIGNAL_ALLOWED_USERS", "SIGNAL_GROUP_ALLOWED_USERS",
"EMAIL_ALLOWED_USERS",
"SMS_ALLOWED_USERS", "MATTERMOST_ALLOWED_USERS",
"MATRIX_ALLOWED_USERS", "DINGTALK_ALLOWED_USERS",
"FEISHU_ALLOWED_USERS",
"WECOM_ALLOWED_USERS",
"GATEWAY_ALLOWED_USERS")
)
_allow_all = os.getenv("GATEWAY_ALLOW_ALL_USERS", "").lower() in ("true", "1", "yes")
_allow_all = os.getenv("GATEWAY_ALLOW_ALL_USERS", "").lower() in ("true", "1", "yes") or any(
os.getenv(v, "").lower() in ("true", "1", "yes")
for v in ("TELEGRAM_ALLOW_ALL_USERS", "DISCORD_ALLOW_ALL_USERS",
"WHATSAPP_ALLOW_ALL_USERS", "SLACK_ALLOW_ALL_USERS",
"SIGNAL_ALLOW_ALL_USERS", "EMAIL_ALLOW_ALL_USERS",
"SMS_ALLOW_ALL_USERS", "MATTERMOST_ALLOW_ALL_USERS",
"MATRIX_ALLOW_ALL_USERS", "DINGTALK_ALLOW_ALL_USERS",
"FEISHU_ALLOW_ALL_USERS",
"WECOM_ALLOW_ALL_USERS")
)
if not _any_allowlist and not _allow_all:
logger.warning(
"No user allowlists configured. All unauthorized users will be denied. "
@@ -1401,6 +1532,20 @@ class GatewayRunner:
return None
return DingTalkAdapter(config)
elif platform == Platform.FEISHU:
from gateway.platforms.feishu import FeishuAdapter, check_feishu_requirements
if not check_feishu_requirements():
logger.warning("Feishu: lark-oapi not installed or FEISHU_APP_ID/SECRET not set")
return None
return FeishuAdapter(config)
elif platform == Platform.WECOM:
from gateway.platforms.wecom import WeComAdapter, check_wecom_requirements
if not check_wecom_requirements():
logger.warning("WeCom: aiohttp not installed or WECOM_BOT_ID/SECRET not set")
return None
return WeComAdapter(config)
elif platform == Platform.MATTERMOST:
from gateway.platforms.mattermost import MattermostAdapter, check_mattermost_requirements
if not check_mattermost_requirements():
@@ -1467,6 +1612,8 @@ class GatewayRunner:
Platform.MATTERMOST: "MATTERMOST_ALLOWED_USERS",
Platform.MATRIX: "MATRIX_ALLOWED_USERS",
Platform.DINGTALK: "DINGTALK_ALLOWED_USERS",
Platform.FEISHU: "FEISHU_ALLOWED_USERS",
Platform.WECOM: "WECOM_ALLOWED_USERS",
}
platform_allow_all_map = {
Platform.TELEGRAM: "TELEGRAM_ALLOW_ALL_USERS",
@@ -1479,6 +1626,8 @@ class GatewayRunner:
Platform.MATTERMOST: "MATTERMOST_ALLOW_ALL_USERS",
Platform.MATRIX: "MATRIX_ALLOW_ALL_USERS",
Platform.DINGTALK: "DINGTALK_ALLOW_ALL_USERS",
Platform.FEISHU: "FEISHU_ALLOW_ALL_USERS",
Platform.WECOM: "WECOM_ALLOW_ALL_USERS",
}
# Per-platform allow-all flag (e.g., DISCORD_ALLOW_ALL_USERS=true)
@@ -1506,10 +1655,23 @@ class GatewayRunner:
if global_allowlist:
allowed_ids.update(uid.strip() for uid in global_allowlist.split(",") if uid.strip())
# WhatsApp JIDs have @s.whatsapp.net suffix — strip it for comparison
check_ids = {user_id}
if "@" in user_id:
check_ids.add(user_id.split("@")[0])
# WhatsApp: resolve phone↔LID aliases from bridge session mapping files
if source.platform == Platform.WHATSAPP:
normalized_allowed_ids = set()
for allowed_id in allowed_ids:
normalized_allowed_ids.update(_expand_whatsapp_auth_aliases(allowed_id))
if normalized_allowed_ids:
allowed_ids = normalized_allowed_ids
check_ids.update(_expand_whatsapp_auth_aliases(user_id))
normalized_user_id = _normalize_whatsapp_identifier(user_id)
if normalized_user_id:
check_ids.add(normalized_user_id)
return bool(check_ids & allowed_ids)
def _get_unauthorized_dm_behavior(self, platform: Optional[Platform]) -> str:
@@ -1540,6 +1702,11 @@ class GatewayRunner:
# In DMs: offer pairing code. In groups: silently ignore.
if source.chat_type == "dm" and self._get_unauthorized_dm_behavior(source.platform) == "pair":
platform_name = source.platform.value if source.platform else "unknown"
# Rate-limit ALL pairing responses (code or rejection) to
# prevent spamming the user with repeated messages when
# multiple DMs arrive in quick succession.
if self.pairing_store._is_rate_limited(platform_name, source.user_id):
return None
code = self.pairing_store.generate_code(
platform_name, source.user_id, source.user_name or ""
)
@@ -1561,6 +1728,8 @@ class GatewayRunner:
"Too many pairing requests right now~ "
"Please try again later!"
)
# Record rate limit so subsequent messages are silently ignored
self.pairing_store._record_rate_limit(platform_name, source.user_id)
return None
# PRIORITY handling when an agent is already running for this session.
@@ -1706,7 +1875,13 @@ class GatewayRunner:
if canonical == "help":
return await self._handle_help_command(event)
if canonical == "commands":
return await self._handle_commands_command(event)
if canonical == "profile":
return await self._handle_profile_command(event)
if canonical == "status":
return await self._handle_status_command(event)
@@ -1719,6 +1894,9 @@ class GatewayRunner:
if canonical == "verbose":
return await self._handle_verbose_command(event)
if canonical == "yolo":
return await self._handle_yolo_command(event)
if canonical == "provider":
return await self._handle_provider_command(event)
@@ -1863,6 +2041,12 @@ class GatewayRunner:
if msg:
event.text = msg
# Fall through to normal message processing with skill content
else:
# Not an active skill — check if it's a known-but-disabled or
# uninstalled skill and give actionable guidance.
_unavail_msg = _check_unavailable_skill(command)
if _unavail_msg:
return _unavail_msg
except Exception as e:
logger.debug("Skill command check failed (non-fatal): %s", e)
@@ -1970,6 +2154,12 @@ class GatewayRunner:
f"Use /resume to browse and restore a previous session.\n"
f"Adjust reset timing in config.yaml under session_reset."
)
try:
session_info = self._format_session_info()
if session_info:
notice = f"{notice}\n\n{session_info}"
except Exception:
pass
await adapter.send(
source.chat_id, notice,
metadata=getattr(event, 'metadata', None),
@@ -2063,7 +2253,7 @@ class GatewayRunner:
if isinstance(_model_cfg, str):
_hyg_model = _model_cfg
elif isinstance(_model_cfg, dict):
_hyg_model = _model_cfg.get("default", _hyg_model)
_hyg_model = _model_cfg.get("default") or _model_cfg.get("model") or _hyg_model
# Read explicit context_length override from model config
# (same as run_agent.py lines 995-1005)
_raw_ctx = _model_cfg.get("context_length")
@@ -2094,6 +2284,29 @@ class GatewayRunner:
_hyg_api_key = _hyg_runtime.get("api_key")
except Exception:
pass
# Check custom_providers per-model context_length
# (same fallback as run_agent.py lines 1171-1189).
# Must run after runtime resolution so _hyg_base_url is set.
if _hyg_config_context_length is None and _hyg_base_url:
try:
_hyg_custom_providers = _hyg_data.get("custom_providers")
if isinstance(_hyg_custom_providers, list):
for _cp in _hyg_custom_providers:
if not isinstance(_cp, dict):
continue
_cp_url = (_cp.get("base_url") or "").rstrip("/")
if _cp_url and _cp_url == _hyg_base_url.rstrip("/"):
_cp_models = _cp.get("models", {})
if isinstance(_cp_models, dict):
_cp_model_cfg = _cp_models.get(_hyg_model, {})
if isinstance(_cp_model_cfg, dict):
_cp_ctx = _cp_model_cfg.get("context_length")
if _cp_ctx is not None:
_hyg_config_context_length = int(_cp_ctx)
break
except (TypeError, ValueError):
pass
except Exception:
pass
@@ -2175,6 +2388,7 @@ class GatewayRunner:
enabled_toolsets=["memory"],
session_id=session_entry.session_id,
)
_hyg_agent._print_fn = lambda *a, **kw: None
loop = asyncio.get_event_loop()
_compressed, _ = await loop.run_in_executor(
@@ -2185,6 +2399,15 @@ class GatewayRunner:
),
)
# _compress_context ends the old session and creates
# a new session_id. Write compressed messages into
# the NEW session so the old transcript stays intact
# and searchable via session_search.
_hyg_new_sid = _hyg_agent.session_id
if _hyg_new_sid != session_entry.session_id:
session_entry.session_id = _hyg_new_sid
self.session_store._save()
self.session_store.rewrite_transcript(
session_entry.session_id, _compressed
)
@@ -2217,13 +2440,18 @@ class GatewayRunner:
pass
# Still too large after compression — warn user
# Rate-limited to once per cooldown period per
# chat to avoid spamming on every message.
if _new_tokens >= _warn_token_threshold:
logger.warning(
"Session hygiene: still ~%s tokens after "
"compression — suggesting /reset",
f"{_new_tokens:,}",
)
if _hyg_adapter:
_now = time.time()
_last_warn = self._compression_warn_sent.get(source.chat_id, 0)
if _hyg_adapter and _now - _last_warn >= self._compression_warn_cooldown:
self._compression_warn_sent[source.chat_id] = _now
try:
await _hyg_adapter.send(
source.chat_id,
@@ -2245,7 +2473,10 @@ class GatewayRunner:
if _approx_tokens >= _warn_token_threshold:
_hyg_adapter = self.adapters.get(source.platform)
_hyg_meta = {"thread_id": source.thread_id} if source.thread_id else None
if _hyg_adapter:
_now = time.time()
_last_warn = self._compression_warn_sent.get(source.chat_id, 0)
if _hyg_adapter and _now - _last_warn >= self._compression_warn_cooldown:
self._compression_warn_sent[source.chat_id] = _now
try:
await _hyg_adapter.send(
source.chat_id,
@@ -2736,6 +2967,85 @@ class GatewayRunner:
# Clear session env
self._clear_session_env()
def _format_session_info(self) -> str:
"""Resolve current model config and return a formatted info block.
Surfaces model, provider, context length, and endpoint so gateway
users can immediately see if context detection went wrong (e.g.
local models falling to the 128K default).
"""
from agent.model_metadata import get_model_context_length, DEFAULT_FALLBACK_CONTEXT
model = _resolve_gateway_model()
config_context_length = None
provider = None
base_url = None
api_key = None
try:
cfg_path = _hermes_home / "config.yaml"
if cfg_path.exists():
import yaml as _info_yaml
with open(cfg_path, encoding="utf-8") as f:
data = _info_yaml.safe_load(f) or {}
model_cfg = data.get("model", {})
if isinstance(model_cfg, dict):
raw_ctx = model_cfg.get("context_length")
if raw_ctx is not None:
try:
config_context_length = int(raw_ctx)
except (TypeError, ValueError):
pass
provider = model_cfg.get("provider") or None
base_url = model_cfg.get("base_url") or None
except Exception:
pass
# Resolve runtime credentials for probing
try:
runtime = _resolve_runtime_agent_kwargs()
provider = provider or runtime.get("provider")
base_url = base_url or runtime.get("base_url")
api_key = runtime.get("api_key")
except Exception:
pass
context_length = get_model_context_length(
model,
base_url=base_url or "",
api_key=api_key or "",
config_context_length=config_context_length,
provider=provider or "",
)
# Format context source hint
if config_context_length is not None:
ctx_source = "config"
elif context_length == DEFAULT_FALLBACK_CONTEXT:
ctx_source = "default — set model.context_length in config to override"
else:
ctx_source = "detected"
# Format context length for display
if context_length >= 1_000_000:
ctx_display = f"{context_length / 1_000_000:.1f}M"
elif context_length >= 1_000:
ctx_display = f"{context_length // 1_000}K"
else:
ctx_display = str(context_length)
lines = [
f"◆ Model: `{model}`",
f"◆ Provider: {provider or 'openrouter'}",
f"◆ Context: {ctx_display} tokens ({ctx_source})",
]
# Show endpoint for local/custom setups
if base_url and ("localhost" in base_url or "127.0.0.1" in base_url or "0.0.0.0" in base_url):
lines.append(f"◆ Endpoint: {base_url}")
return "\n".join(lines)
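The display formatting in `_format_session_info` can be checked on its own. This sketch lifts just that branch into a function (the name `format_context_length` is illustrative):

```python
def format_context_length(n: int) -> str:
    """Render a token count as e.g. '1.0M', '128K', or '512'."""
    if n >= 1_000_000:
        return f"{n / 1_000_000:.1f}M"
    if n >= 1_000:
        return f"{n // 1_000}K"  # integer division: 131072 renders as 131K
    return str(n)
```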
async def _handle_reset_command(self, event: MessageEvent) -> str:
"""Handle /new or /reset command."""
source = event.source
@@ -2776,13 +3086,53 @@ class GatewayRunner:
"session_key": session_key,
})
# Resolve session config info to surface to the user
try:
session_info = self._format_session_info()
except Exception:
session_info = ""
if new_entry:
return "✨ Session reset! I've started fresh with no memory of our previous conversation."
header = "✨ Session reset! Starting fresh."
else:
# No existing session, just create one
self.session_store.get_or_create_session(source, force_new=True)
return "✨ New session started!"
header = "✨ New session started!"
if session_info:
return f"{header}\n\n{session_info}"
return header
async def _handle_profile_command(self, event: MessageEvent) -> str:
"""Handle /profile — show active profile name and home directory."""
from hermes_constants import get_hermes_home, display_hermes_home
from pathlib import Path
home = get_hermes_home()
display = display_hermes_home()
# Detect profile name from HERMES_HOME path
# Profile paths look like: ~/.hermes/profiles/<name>
profiles_parent = Path.home() / ".hermes" / "profiles"
try:
rel = home.relative_to(profiles_parent)
profile_name = str(rel).split("/")[0]
except ValueError:
profile_name = None
if profile_name:
lines = [
f"👤 **Profile:** `{profile_name}`",
f"📂 **Home:** `{display}`",
]
else:
lines = [
"👤 **Profile:** default",
f"📂 **Home:** `{display}`",
]
return "\n".join(lines)
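The handler above derives the profile name with `str(rel).split("/")[0]`, which assumes POSIX separators. A possible variant, shown here as a sketch rather than as the project's code, uses `Path.parts` and also returns `None` when the home directory is the profiles directory itself:

```python
from pathlib import PurePosixPath


def profile_name_from_home(home, profiles_parent):
    """Return the first path component under profiles_parent, or None."""
    try:
        rel = home.relative_to(profiles_parent)
    except ValueError:
        return None  # home is not inside the profiles directory
    return rel.parts[0] if rel.parts else None
```

`PurePosixPath` is used only so the example behaves the same on every platform; with real paths, `Path` would be passed in.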
async def _handle_status_command(self, event: MessageEvent) -> str:
"""Handle /status command."""
source = event.source
@@ -2849,12 +3199,69 @@ class GatewayRunner:
from agent.skill_commands import get_skill_commands
skill_cmds = get_skill_commands()
if skill_cmds:
lines.append(f"\n⚡ **Skill Commands** ({len(skill_cmds)} installed):")
for cmd in sorted(skill_cmds):
lines.append(f"\n⚡ **Skill Commands** ({len(skill_cmds)} active):")
# Show first 10, then point to /commands for the rest
sorted_cmds = sorted(skill_cmds)
for cmd in sorted_cmds[:10]:
lines.append(f"`{cmd}` — {skill_cmds[cmd]['description']}")
if len(sorted_cmds) > 10:
lines.append(f"\n... and {len(sorted_cmds) - 10} more. Use `/commands` for the full paginated list.")
except Exception:
pass
return "\n".join(lines)
async def _handle_commands_command(self, event: MessageEvent) -> str:
"""Handle /commands [page] - paginated list of all commands and skills."""
from hermes_cli.commands import gateway_help_lines
raw_args = event.get_command_args().strip()
if raw_args:
try:
requested_page = int(raw_args)
except ValueError:
return "Usage: `/commands [page]`"
else:
requested_page = 1
# Build combined entry list: built-in commands + skill commands
entries = list(gateway_help_lines())
try:
from agent.skill_commands import get_skill_commands
skill_cmds = get_skill_commands()
if skill_cmds:
entries.append("")
entries.append("⚡ **Skill Commands**:")
for cmd in sorted(skill_cmds):
desc = skill_cmds[cmd].get("description", "").strip() or "Skill command"
entries.append(f"`{cmd}` — {desc}")
except Exception:
pass
if not entries:
return "No commands available."
from gateway.config import Platform
page_size = 15 if event.source.platform == Platform.TELEGRAM else 20
total_pages = max(1, (len(entries) + page_size - 1) // page_size)
page = max(1, min(requested_page, total_pages))
start = (page - 1) * page_size
page_entries = entries[start:start + page_size]
lines = [
f"📚 **Commands** ({len(entries)} total, page {page}/{total_pages})",
"",
*page_entries,
]
if total_pages > 1:
nav_parts = []
if page > 1:
nav_parts.append(f"`/commands {page - 1}` ← prev")
if page < total_pages:
nav_parts.append(f"next → `/commands {page + 1}`")
lines.extend(["", " | ".join(nav_parts)])
if page != requested_page:
lines.append(f"_(Requested page {requested_page} was out of range, showing page {page}.)_")
return "\n".join(lines)
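The paging arithmetic in `/commands` (ceiling division for the page count, then clamping the requested page) can be exercised separately. A minimal sketch, with `paginate` as an illustrative name:

```python
def paginate(entries: list, requested_page: int, page_size: int):
    """Return (page, total_pages, page_entries) with the page clamped in range."""
    total_pages = max(1, (len(entries) + page_size - 1) // page_size)
    page = max(1, min(requested_page, total_pages))
    start = (page - 1) * page_size
    return page, total_pages, entries[start:start + page_size]
```

With 45 entries and a page size of 20, a request for page 99 is clamped to page 3, which holds the last five entries.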
async def _handle_provider_command(self, event: MessageEvent) -> str:
"""Handle /provider command - show available providers."""
@@ -2959,8 +3366,7 @@ class GatewayRunner:
if "agent" not in config or not isinstance(config.get("agent"), dict):
config["agent"] = {}
config["agent"]["system_prompt"] = ""
with open(config_path, "w") as f:
yaml.dump(config, f, default_flow_style=False, sort_keys=False)
atomic_yaml_write(config_path, config)
except Exception as e:
return f"⚠️ Failed to save personality change: {e}"
self._ephemeral_system_prompt = ""
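The body of `utils.atomic_yaml_write` is not shown in this diff. A common way to implement an atomic file write, given here purely as an assumed sketch for plain text, is to write to a temporary file in the same directory and then rename it over the target:

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path, text: str) -> None:
    """Write text so readers see either the old file or the new one, never a partial."""
    path = Path(path)
    # The temp file lives in the target directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        try:
            os.unlink(tmp_name)
        except OSError:
            pass
        raise
```

Compared with the `open(config_path, "w")` calls being removed, this avoids leaving a truncated `config.yaml` behind if the process dies mid-write.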
@@ -2973,8 +3379,7 @@ class GatewayRunner:
if "agent" not in config or not isinstance(config.get("agent"), dict):
config["agent"] = {}
config["agent"]["system_prompt"] = new_prompt
with open(config_path, 'w', encoding="utf-8") as f:
yaml.dump(config, f, default_flow_style=False, sort_keys=False)
atomic_yaml_write(config_path, config)
except Exception as e:
return f"⚠️ Failed to save personality change: {e}"
@@ -3064,8 +3469,7 @@ class GatewayRunner:
with open(config_path, encoding="utf-8") as f:
user_config = yaml.safe_load(f) or {}
user_config[env_key] = chat_id
with open(config_path, 'w', encoding="utf-8") as f:
yaml.dump(user_config, f, default_flow_style=False)
atomic_yaml_write(config_path, user_config)
# Also set in the current environment so it takes effect immediately
os.environ[env_key] = str(chat_id)
except Exception as e:
@@ -3678,7 +4082,7 @@ class GatewayRunner:
# Send media files
for media_path in (media_files or []):
try:
await adapter.send_file(
await adapter.send_document(
chat_id=source.chat_id,
file_path=media_path,
)
@@ -3733,8 +4137,7 @@ class GatewayRunner:
current[k] = {}
current = current[k]
current[keys[-1]] = value
with open(config_path, "w", encoding="utf-8") as f:
yaml.dump(user_config, f, default_flow_style=False, sort_keys=False)
atomic_yaml_write(config_path, user_config)
return True
except Exception as e:
logger.error("Failed to save config key %s: %s", key_path, e)
@@ -3787,6 +4190,16 @@ class GatewayRunner:
else:
return f"🧠 ✓ Reasoning effort set to `{effort}` (this session only)"
async def _handle_yolo_command(self, event: MessageEvent) -> str:
"""Handle /yolo — toggle dangerous command approval bypass."""
current = bool(os.environ.get("HERMES_YOLO_MODE"))
if current:
os.environ.pop("HERMES_YOLO_MODE", None)
return "⚠️ YOLO mode **OFF** — dangerous commands will require approval."
else:
os.environ["HERMES_YOLO_MODE"] = "1"
return "⚡ YOLO mode **ON** — all commands auto-approved. Use with caution."
async def _handle_verbose_command(self, event: MessageEvent) -> str:
"""Handle /verbose command — cycle tool progress display mode.
@@ -3842,8 +4255,7 @@ class GatewayRunner:
if "display" not in user_config or not isinstance(user_config.get("display"), dict):
user_config["display"] = {}
user_config["display"]["tool_progress"] = new_mode
with open(config_path, "w", encoding="utf-8") as f:
yaml.dump(user_config, f, default_flow_style=False, sort_keys=False)
atomic_yaml_write(config_path, user_config)
return f"{descriptions[new_mode]}\n_(saved to config — takes effect on next message)_"
except Exception as e:
logger.warning("Failed to save tool_progress mode: %s", e)
@@ -3885,17 +4297,27 @@ class GatewayRunner:
enabled_toolsets=["memory"],
session_id=session_entry.session_id,
)
tmp_agent._print_fn = lambda *a, **kw: None
loop = asyncio.get_event_loop()
compressed, _ = await loop.run_in_executor(
None,
lambda: tmp_agent._compress_context(msgs, "", approx_tokens=approx_tokens),
lambda: tmp_agent._compress_context(msgs, "", approx_tokens=approx_tokens)
)
self.session_store.rewrite_transcript(session_entry.session_id, compressed)
# _compress_context already calls end_session() on the old session
# (preserving its full transcript in SQLite) and creates a new
# session_id for the continuation. Write the compressed messages
# into the NEW session so the original history stays searchable.
new_session_id = tmp_agent.session_id
if new_session_id != session_entry.session_id:
session_entry.session_id = new_session_id
self.session_store._save()
self.session_store.rewrite_transcript(new_session_id, compressed)
# Reset stored token count — transcript changed, old value is stale
self.session_store.update_session(
session_entry.session_key, last_prompt_tokens=0,
session_entry.session_key, last_prompt_tokens=0
)
new_count = len(compressed)
new_tokens = estimate_messages_tokens_rough(compressed)
@@ -4051,7 +4473,7 @@ class GatewayRunner:
]
ctx = agent.context_compressor
if ctx.last_prompt_tokens:
pct = ctx.last_prompt_tokens / ctx.context_length * 100 if ctx.context_length else 0
pct = min(100, ctx.last_prompt_tokens / ctx.context_length * 100) if ctx.context_length else 0
lines.append(f"Context: {ctx.last_prompt_tokens:,} / {ctx.context_length:,} ({pct:.0f}%)")
if ctx.compression_count:
lines.append(f"Compressions: {ctx.compression_count}")
@@ -4273,6 +4695,10 @@ class GatewayRunner:
import shutil
import subprocess
from datetime import datetime
from hermes_cli.config import is_managed, format_managed_message
if is_managed():
return f"{format_managed_message('update Hermes Agent')}"
project_root = Path(__file__).parent.parent.resolve()
git_dir = project_root / '.git'
@@ -4798,10 +5224,23 @@ class GatewayRunner:
from hermes_cli.tools_config import _get_platform_tools
enabled_toolsets = sorted(_get_platform_tools(user_config, platform_key))
# Apply tool preview length config (0 = no limit)
try:
from agent.display import set_tool_preview_max_len
_tpl = user_config.get("display", {}).get("tool_preview_length", 0)
set_tool_preview_max_len(int(_tpl) if _tpl else 0)
except Exception:
pass
# Tool progress mode from config.yaml: "all", "new", "verbose", "off"
# Falls back to env vars for backward compatibility
# Falls back to env vars for backward compatibility.
# YAML 1.1 parses bare `off` as boolean False — normalise before
# the `or` chain so it doesn't silently fall through to "all".
_raw_tp = user_config.get("display", {}).get("tool_progress")
if _raw_tp is False:
_raw_tp = "off"
progress_mode = (
user_config.get("display", {}).get("tool_progress")
_raw_tp
or os.getenv("HERMES_TOOL_PROGRESS_MODE")
or "all"
)
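The `off` normalization can be illustrated without a YAML parser by feeding in the value the comment says the loader produces for a bare `off`, namely `False`. The function name and dict arguments are illustrative:

```python
def resolve_progress_mode(display_cfg: dict, env: dict) -> str:
    """Config value first, then env var, then 'all'; False is treated as 'off'."""
    raw = display_cfg.get("tool_progress")
    if raw is False:
        raw = "off"  # without this, False would fall through the `or` chain
    return raw or env.get("HERMES_TOOL_PROGRESS_MODE") or "all"
```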
@@ -4838,9 +5277,11 @@ class GatewayRunner:
return
if preview:
# Truncate preview to keep messages clean
if len(preview) > 80:
preview = preview[:77] + "..."
# Truncate preview unless config says unlimited
from agent.display import get_tool_preview_max_len
_pl = get_tool_preview_max_len()
if _pl > 0 and len(preview) > _pl:
preview = preview[:_pl - 3] + "..."
msg = f"{emoji} {tool_name}: \"{preview}\""
else:
msg = f"{emoji} {tool_name}..."
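The configurable truncation above reduces to a small pure function. In this sketch `max_len` stands in for the value returned by `get_tool_preview_max_len()`, and `0` means no limit:

```python
def truncate_preview(preview: str, max_len: int) -> str:
    """Cut preview to max_len characters including a trailing ellipsis."""
    if max_len > 0 and len(preview) > max_len:
        return preview[:max_len - 3] + "..."
    return preview
```

One edge worth noting: for limits below 3 the slice index goes negative, so very small configured values would not shorten the text as intended.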
@@ -4860,12 +5301,17 @@ class GatewayRunner:
progress_queue.put(msg)
# Background task to send progress messages
# Accumulates tool lines into a single message that gets edited
# For DM top-level Slack messages, source.thread_id is None but the
# final reply will be threaded under the original message via reply_to.
# Use event_message_id as fallback so progress messages land in the
# same thread as the final response instead of going to the DM root.
_progress_thread_id = source.thread_id or event_message_id
# Accumulates tool lines into a single message that gets edited.
#
# Threading metadata is platform-specific:
# - Slack DM threading needs event_message_id fallback (reply thread)
# - Telegram uses message_thread_id only for forum topics; passing a
# normal DM/group message id as thread_id causes send failures
# - Other platforms should use explicit source.thread_id only
if source.platform == Platform.SLACK:
_progress_thread_id = source.thread_id or event_message_id
else:
_progress_thread_id = source.thread_id
_progress_metadata = {"thread_id": _progress_thread_id} if _progress_thread_id else None
async def send_progress_messages():
@@ -5128,7 +5574,25 @@ class GatewayRunner:
agent.stream_delta_callback = _stream_delta_cb
agent.status_callback = _status_callback_sync
agent.reasoning_config = reasoning_config
# Background review delivery — send "💾 Memory updated" etc. to user
def _bg_review_send(message: str) -> None:
if not _status_adapter:
return
try:
asyncio.run_coroutine_threadsafe(
_status_adapter.send(
_status_chat_id,
message,
metadata=_status_thread_metadata,
),
_loop_for_step,
)
except Exception as _e:
logger.debug("background_review_callback error: %s", _e)
agent.background_review_callback = _bg_review_send
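The callback above hands a coroutine to the gateway's event loop from the agent's worker thread via `asyncio.run_coroutine_threadsafe`. A self-contained sketch of that hand-off follows; unlike the diff, which fires and forgets, this version waits on the returned future so the demo is deterministic. All names here are illustrative:

```python
import asyncio


def deliver_from_worker(loop, send, chat_id, message):
    """Runs on a worker thread; schedules send() on the loop's own thread."""
    future = asyncio.run_coroutine_threadsafe(send(chat_id, message), loop)
    return future.result(timeout=5)


async def run_demo():
    sent = []

    async def send(chat_id, message):
        sent.append((chat_id, message))

    loop = asyncio.get_running_loop()
    # The executor thread plays the role of the synchronous agent thread.
    await loop.run_in_executor(
        None, deliver_from_worker, loop, send, "chat-1", "Memory updated"
    )
    return sent
```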
# Store agent reference for interrupt support
agent_holder[0] = agent
# Capture the full tool definitions for transcript logging
+14 -7
@@ -762,14 +762,16 @@ class SessionStore:
if session_key in self._entries:
entry = self._entries[session_key]
entry.updated_at = _now()
entry.input_tokens += input_tokens
entry.output_tokens += output_tokens
entry.cache_read_tokens += cache_read_tokens
entry.cache_write_tokens += cache_write_tokens
# Direct assignment — the gateway receives cumulative totals
# from the cached agent, not per-call deltas.
entry.input_tokens = input_tokens
entry.output_tokens = output_tokens
entry.cache_read_tokens = cache_read_tokens
entry.cache_write_tokens = cache_write_tokens
if last_prompt_tokens is not None:
entry.last_prompt_tokens = last_prompt_tokens
if estimated_cost_usd is not None:
entry.estimated_cost_usd += estimated_cost_usd
entry.estimated_cost_usd = estimated_cost_usd
if cost_status:
entry.cost_status = cost_status
entry.total_tokens = (
@@ -783,7 +785,7 @@ class SessionStore:
if self._db and db_session_id:
try:
self._db.update_token_counts(
self._db.set_token_counts(
db_session_id,
input_tokens=input_tokens,
output_tokens=output_tokens,
@@ -795,6 +797,7 @@ class SessionStore:
billing_provider=provider,
billing_base_url=base_url,
model=model,
absolute=True,
)
except Exception as e:
logger.debug("Session DB operation failed: %s", e)
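The change from `+=` to `=` in the token counters follows from the comment that the gateway receives cumulative totals. A toy comparison, using made-up report values, shows how adding cumulative totals inflates the count:

```python
def accumulate(reports: list) -> int:
    """Old behaviour: treat every report as a per-call delta."""
    total = 0
    for reported in reports:
        total += reported
    return total


def assign_latest(reports: list) -> int:
    """New behaviour: each report already is the running total."""
    total = 0
    for reported in reports:
        total = reported
    return total
```

For cumulative reports of 100, 250, and 400 tokens, the old approach stores 750 while the true total is 400.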
@@ -955,13 +958,17 @@ class SessionStore:
try:
self._db.clear_messages(session_id)
for msg in messages:
role = msg.get("role", "unknown")
self._db.append_message(
session_id=session_id,
role=msg.get("role", "unknown"),
role=role,
content=msg.get("content"),
tool_name=msg.get("tool_name"),
tool_calls=msg.get("tool_calls"),
tool_call_id=msg.get("tool_call_id"),
reasoning=msg.get("reasoning") if role == "assistant" else None,
reasoning_details=msg.get("reasoning_details") if role == "assistant" else None,
codex_reasoning_items=msg.get("codex_reasoning_items") if role == "assistant" else None,
)
except Exception as e:
logger.debug("Failed to rewrite transcript in DB: %s", e)
+5 -6
@@ -1,12 +1,11 @@
#!/usr/bin/env python3
"""
Hermes Agent CLI Launcher
Hermes Agent CLI launcher.
This is a convenience wrapper to launch the Hermes CLI.
Usage: ./hermes [options]
This wrapper should behave like the installed `hermes` command, including
subcommands such as `gateway`, `cron`, and `doctor`.
"""
if __name__ == "__main__":
from cli import main
import fire
fire.Fire(main)
from hermes_cli.main import main
main()
+2 -2
@@ -11,5 +11,5 @@ Provides subcommands for:
- hermes cron - Manage cron jobs
"""
__version__ = "0.4.0"
__release_date__ = "2026.3.23"
__version__ = "0.6.0"
__release_date__ = "2026.3.30"
+28 -10
@@ -160,7 +160,7 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
id="alibaba",
name="Alibaba Cloud (DashScope)",
auth_type="api_key",
inference_base_url="https://dashscope-intl.aliyuncs.com/apps/anthropic",
inference_base_url="https://coding-intl.dashscope.aliyuncs.com/v1",
api_key_env_vars=("DASHSCOPE_API_KEY",),
base_url_env_var="DASHSCOPE_BASE_URL",
),
@@ -212,6 +212,14 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
api_key_env_vars=("KILOCODE_API_KEY",),
base_url_env_var="KILOCODE_BASE_URL",
),
"huggingface": ProviderConfig(
id="huggingface",
name="Hugging Face",
auth_type="api_key",
inference_base_url="https://router.huggingface.co/v1",
api_key_env_vars=("HF_TOKEN",),
base_url_env_var="HF_BASE_URL",
),
}
@@ -685,8 +693,13 @@ def resolve_provider(
"github-copilot-acp": "copilot-acp", "copilot-acp-agent": "copilot-acp",
"aigateway": "ai-gateway", "vercel": "ai-gateway", "vercel-ai-gateway": "ai-gateway",
"opencode": "opencode-zen", "zen": "opencode-zen",
"hf": "huggingface", "hugging-face": "huggingface", "huggingface-hub": "huggingface",
"go": "opencode-go", "opencode-go-sub": "opencode-go",
"kilo": "kilocode", "kilo-code": "kilocode", "kilo-gateway": "kilocode",
# Local server aliases — route through the generic custom provider
"lmstudio": "custom", "lm-studio": "custom", "lm_studio": "custom",
"ollama": "custom", "vllm": "custom", "llamacpp": "custom",
"llama.cpp": "custom", "llama-cpp": "custom",
}
normalized = _PROVIDER_ALIASES.get(normalized, normalized)
@@ -733,7 +746,12 @@ def resolve_provider(
if has_usable_secret(os.getenv(env_var, "")):
return pid
return "openrouter"
raise AuthError(
"No inference provider configured. Run 'hermes model' to choose a "
"provider and model, or set an API key (OPENROUTER_API_KEY, "
"OPENAI_API_KEY, etc.) in ~/.hermes/.env.",
code="no_provider_configured",
)
# =============================================================================
@@ -2012,7 +2030,8 @@ def _login_openai_codex(args, pconfig: ProviderConfig) -> None:
config_path = _update_config_for_provider("openai-codex", creds.get("base_url", DEFAULT_CODEX_BASE_URL))
print()
print("Login successful!")
print(" Auth state: ~/.hermes/auth.json")
from hermes_constants import display_hermes_home as _dhh
print(f" Auth state: {_dhh()}/auth.json")
print(f" Config updated: {config_path} (model.provider=openai-codex)")
@@ -2291,21 +2310,20 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
raise AuthError("No runtime API key available to fetch models",
provider="nous", code="invalid_token")
model_ids = fetch_nous_models(
inference_base_url=runtime_base_url,
api_key=runtime_key,
timeout_seconds=timeout_seconds,
verify=verify,
)
# Use curated model list (same as OpenRouter defaults) instead
# of the full /models dump which returns hundreds of models.
from hermes_cli.models import _PROVIDER_MODELS
model_ids = _PROVIDER_MODELS.get("nous", [])
print()
if model_ids:
print(f"Showing {len(model_ids)} curated models — use \"Enter custom model name\" for others.")
selected_model = _prompt_model_selection(model_ids)
if selected_model:
_save_model_choice(selected_model)
print(f"Default model set to: {selected_model}")
else:
print("No models were returned by the inference API.")
print("No curated models available for Nous Portal.")
except Exception as exc:
message = format_auth_error(exc) if isinstance(exc, AuthError) else str(exc)
print()
+27 -3
@@ -258,7 +258,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
get_toolset_for_tool: Callable to map tool name -> toolset name.
context_length: Model's context window size in tokens.
"""
from model_tools import check_tool_availability
from model_tools import check_tool_availability, TOOLSET_REQUIREMENTS
if get_toolset_for_tool is None:
from model_tools import get_toolset_for_tool
@@ -267,8 +267,18 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
_, unavailable_toolsets = check_tool_availability(quiet=True)
disabled_tools = set()
# Tools whose toolset has a check_fn are lazy-initialized (e.g. honcho,
# homeassistant) — they show as unavailable at banner time because the
# check hasn't run yet, but they aren't misconfigured.
lazy_tools = set()
for item in unavailable_toolsets:
disabled_tools.update(item.get("tools", []))
toolset_name = item.get("name", "")
ts_req = TOOLSET_REQUIREMENTS.get(toolset_name, {})
tools_in_ts = item.get("tools", [])
if ts_req.get("check_fn"):
lazy_tools.update(tools_in_ts)
else:
disabled_tools.update(tools_in_ts)
layout_table = Table.grid(padding=(0, 2))
layout_table.add_column("left", justify="center")
@@ -328,6 +338,8 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
for name in sorted(tool_names):
if name in disabled_tools:
colored_names.append(f"[red]{name}[/]")
elif name in lazy_tools:
colored_names.append(f"[yellow]{name}[/]")
else:
colored_names.append(f"[{text}]{name}[/]")
@@ -347,6 +359,8 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
colored_names.append("[dim]...[/]")
elif name in disabled_tools:
colored_names.append(f"[red]{name}[/]")
elif name in lazy_tools:
colored_names.append(f"[yellow]{name}[/]")
else:
colored_names.append(f"[{text}]{name}[/]")
tools_str = ", ".join(colored_names)
@@ -403,16 +417,26 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
if mcp_connected:
summary_parts.append(f"{mcp_connected} MCP servers")
summary_parts.append("/help for commands")
# Show active profile name when not 'default'
try:
from hermes_cli.profiles import get_active_profile_name
_profile_name = get_active_profile_name()
if _profile_name and _profile_name != "default":
right_lines.append(f"[bold {accent}]Profile:[/] [{text}]{_profile_name}[/]")
except Exception:
pass # Never break the banner over a profiles.py bug
right_lines.append(f"[dim {dim}]{' · '.join(summary_parts)}[/]")
# Update check — use prefetched result if available
try:
behind = get_update_result(timeout=0.5)
if behind and behind > 0:
from hermes_cli.config import recommended_update_command
commits_word = "commit" if behind == 1 else "commits"
right_lines.append(
f"[bold yellow]⚠ {behind} {commits_word} behind[/]"
f"[dim yellow] — run [bold]hermes update[/bold] to update[/]"
f"[dim yellow] — run [bold]{recommended_update_command()}[/bold] to update[/]"
)
except Exception:
pass # Never break the banner over an update check
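The banner change separates tools that are genuinely unavailable from tools whose toolset is merely lazy-initialised. A small sketch of that classification, assuming the same dictionary shapes the diff reads (`name`, `tools`, `check_fn`); the sample toolset and tool names are invented:

```python
def classify_unavailable(unavailable_toolsets, requirements):
    """Split unavailable tools into (disabled, lazy) sets."""
    disabled, lazy = set(), set()
    for item in unavailable_toolsets:
        tools = item.get("tools", [])
        req = requirements.get(item.get("name", ""), {})
        if req.get("check_fn"):
            # A check function exists but may not have run yet.
            lazy.update(tools)
        else:
            disabled.update(tools)
    return disabled, lazy


def colorize(name, disabled, lazy, text="white"):
    """Return Rich-style markup: red = disabled, yellow = lazy."""
    if name in disabled:
        return f"[red]{name}[/]"
    if name in lazy:
        return f"[yellow]{name}[/]"
    return f"[{text}]{name}[/]"


# Invented sample data for illustration only.
requirements = {"homeassistant": {"check_fn": lambda: True}, "browser": {}}
unavailable = [
    {"name": "homeassistant", "tools": ["ha_call_service"]},
    {"name": "browser", "tools": ["browser_navigate"]},
]
disabled, lazy = classify_unavailable(unavailable, requirements)
print([colorize(n, disabled, lazy) for n in
       ("ha_call_service", "browser_navigate", "terminal")])
```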
+7 -3
@@ -12,6 +12,7 @@ import getpass
from hermes_cli.banner import cprint, _DIM, _RST
from hermes_cli.config import save_env_value_secure
from hermes_constants import display_hermes_home
def clarify_callback(cli, question, choices):
@@ -131,7 +132,8 @@ def prompt_for_secret(cli, var_name: str, prompt: str, metadata=None) -> dict:
}
stored = save_env_value_secure(var_name, value)
cprint(f"\n{_DIM} ✓ Stored secret in ~/.hermes/.env as {var_name}{_RST}")
_dhh = display_hermes_home()
cprint(f"\n{_DIM} ✓ Stored secret in {_dhh}/.env as {var_name}{_RST}")
return {
**stored,
"skipped": False,
@@ -183,7 +185,8 @@ def prompt_for_secret(cli, var_name: str, prompt: str, metadata=None) -> dict:
}
stored = save_env_value_secure(var_name, value)
cprint(f"\n{_DIM} ✓ Stored secret in ~/.hermes/.env as {var_name}{_RST}")
_dhh = display_hermes_home()
cprint(f"\n{_DIM} ✓ Stored secret in {_dhh}/.env as {var_name}{_RST}")
return {
**stored,
"skipped": False,
@@ -238,7 +241,8 @@ def approval_callback(cli, command: str, description: str) -> str:
lock = cli._approval_lock
with lock:
timeout = 60
from cli import CLI_CONFIG
timeout = CLI_CONFIG.get("approvals", {}).get("timeout", 60)
response_queue = queue.Queue()
choices = ["once", "session", "always", "deny"]
if len(command) > 70:
+5
@@ -5,6 +5,7 @@ toggleable list of items. Falls back to a numbered text UI when
curses is unavailable (Windows without curses, piped stdin, etc.).
"""
import sys
from typing import List, Set
from hermes_cli.colors import Colors, color
@@ -26,6 +27,10 @@ def curses_checklist(
The indices the user confirmed as checked. On cancel (ESC/q),
returns ``pre_selected`` unchanged.
"""
# Safety: return defaults when stdin is not a terminal.
if not sys.stdin.isatty():
return set(pre_selected)
try:
import curses
selected = set(pre_selected)
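The new guard returns the defaults whenever stdin is not a terminal, so piped or scripted runs never reach the curses UI. A minimal sketch of the same idea with an injectable stream; the `choose` function and its `stdin` parameter are hypothetical and exist only to make the behaviour easy to exercise:

```python
import io
import sys


def choose(items, pre_selected, stdin=None):
    """Return the pre-selected indices when input is not interactive."""
    stream = sys.stdin if stdin is None else stdin
    if not stream.isatty():
        # Piped or redirected input: never open an interactive UI.
        return set(pre_selected)
    raise NotImplementedError("interactive UI omitted from this sketch")


# io.StringIO reports isatty() == False, like a pipe would.
result = choose(["web", "terminal"], {1}, stdin=io.StringIO(""))
print(result)
```

Returning a fresh `set(...)` rather than `pre_selected` itself means callers cannot accidentally mutate the caller's default set.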
+265 -5
@@ -4,14 +4,19 @@ Usage:
hermes claw migrate # Interactive migration from ~/.openclaw
hermes claw migrate --dry-run # Preview what would be migrated
hermes claw migrate --preset full --overwrite # Full migration, overwrite conflicts
hermes claw cleanup # Archive leftover OpenClaw directories
hermes claw cleanup --dry-run # Preview what would be archived
"""
import importlib.util
import logging
import shutil
import sys
from datetime import datetime
from pathlib import Path
from hermes_cli.config import get_hermes_home, get_config_path, load_config, save_config
from hermes_constants import get_optional_skills_dir
from hermes_cli.setup import (
Colors,
color,
@@ -19,6 +24,7 @@ from hermes_cli.setup import (
print_info,
print_success,
print_error,
print_warning,
prompt_yes_no,
)
@@ -27,8 +33,7 @@ logger = logging.getLogger(__name__)
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
_OPENCLAW_SCRIPT = (
PROJECT_ROOT
/ "optional-skills"
get_optional_skills_dir(PROJECT_ROOT / "optional-skills")
/ "migration"
/ "openclaw-migration"
/ "scripts"
@@ -45,6 +50,18 @@ _OPENCLAW_SCRIPT_INSTALLED = (
/ "openclaw_to_hermes.py"
)
# Known OpenClaw directory names (current + legacy)
_OPENCLAW_DIR_NAMES = (".openclaw", ".clawdbot", ".moldbot")
# State files commonly found in OpenClaw workspace directories that cause
# confusion after migration (the agent discovers them and writes to them)
_WORKSPACE_STATE_GLOBS = (
"*/todo.json",
"*/sessions/*",
"*/memory/*.json",
"*/logs/*",
)
def _find_migration_script() -> Path | None:
"""Find the openclaw_to_hermes.py script in known locations."""
@@ -71,24 +88,105 @@ def _load_migration_module(script_path: Path):
return mod
def _find_openclaw_dirs() -> list[Path]:
"""Find all OpenClaw directories on disk."""
found = []
for name in _OPENCLAW_DIR_NAMES:
candidate = Path.home() / name
if candidate.is_dir():
found.append(candidate)
return found
def _scan_workspace_state(source_dir: Path) -> list[tuple[Path, str]]:
"""Scan an OpenClaw directory for workspace state files that cause confusion.
Returns a list of (path, description) tuples.
"""
findings: list[tuple[Path, str]] = []
# Direct state files in the root
for name in ("todo.json", "sessions", "logs"):
candidate = source_dir / name
if candidate.exists():
kind = "directory" if candidate.is_dir() else "file"
findings.append((candidate, f"Root {kind}: {name}"))
# State files inside workspace directories
for child in sorted(source_dir.iterdir()):
if not child.is_dir() or child.name.startswith("."):
continue
# Check for workspace-like subdirectories
for state_name in ("todo.json", "sessions", "logs", "memory"):
state_path = child / state_name
if state_path.exists():
kind = "directory" if state_path.is_dir() else "file"
rel = state_path.relative_to(source_dir)
findings.append((state_path, f"Workspace {kind}: {rel}"))
return findings
def _archive_directory(source_dir: Path, dry_run: bool = False) -> Path:
"""Rename an OpenClaw directory to .pre-migration.
Returns the archive path.
"""
timestamp = datetime.now().strftime("%Y%m%d")
archive_name = f"{source_dir.name}.pre-migration"
archive_path = source_dir.parent / archive_name
# If archive already exists, add timestamp
if archive_path.exists():
archive_name = f"{source_dir.name}.pre-migration-{timestamp}"
archive_path = source_dir.parent / archive_name
# If still exists (multiple runs same day), add counter
counter = 2
while archive_path.exists():
archive_name = f"{source_dir.name}.pre-migration-{timestamp}-{counter}"
archive_path = source_dir.parent / archive_name
counter += 1
if not dry_run:
source_dir.rename(archive_path)
return archive_path
def claw_command(args):
"""Route hermes claw subcommands."""
action = getattr(args, "claw_action", None)
if action == "migrate":
_cmd_migrate(args)
elif action in ("cleanup", "clean"):
_cmd_cleanup(args)
else:
print("Usage: hermes claw migrate [options]")
print("Usage: hermes claw <command> [options]")
print()
print("Commands:")
print(" migrate Migrate settings from OpenClaw to Hermes")
print(" cleanup Archive leftover OpenClaw directories after migration")
print()
print("Run 'hermes claw migrate --help' for migration options.")
print("Run 'hermes claw <command> --help' for options.")
def _cmd_migrate(args):
"""Run the OpenClaw → Hermes migration."""
source_dir = Path(getattr(args, "source", None) or Path.home() / ".openclaw")
# Check current and legacy OpenClaw directories
explicit_source = getattr(args, "source", None)
if explicit_source:
source_dir = Path(explicit_source)
else:
source_dir = Path.home() / ".openclaw"
if not source_dir.is_dir():
# Try legacy directory names
for legacy in (".clawdbot", ".moldbot"):
candidate = Path.home() / legacy
if candidate.is_dir():
source_dir = candidate
break
dry_run = getattr(args, "dry_run", False)
preset = getattr(args, "preset", "full")
overwrite = getattr(args, "overwrite", False)
@@ -198,6 +296,168 @@ def _cmd_migrate(args):
# Print results
_print_migration_report(report, dry_run)
# After successful non-dry-run migration, offer to archive the source directory
if not dry_run and report.get("summary", {}).get("migrated", 0) > 0:
_offer_source_archival(source_dir, getattr(args, "yes", False))
def _offer_source_archival(source_dir: Path, auto_yes: bool = False):
"""After migration, offer to rename the source directory to prevent state fragmentation.
OpenClaw workspace directories contain state files (todo.json, sessions, etc.)
that the agent may discover and write to, causing confusion. Renaming the
directory prevents this.
"""
if not source_dir.is_dir():
return
# Scan for state files that could cause problems
state_files = _scan_workspace_state(source_dir)
print()
print_header("Post-Migration Cleanup")
print_info("The OpenClaw directory still exists and contains workspace state files")
print_info("that can confuse the agent (todo lists, sessions, logs).")
if state_files:
print()
print(color(" Found state files:", Colors.YELLOW))
# Show up to 10 most relevant findings
for path, desc in state_files[:10]:
print(f" {desc}")
if len(state_files) > 10:
print(f" ... and {len(state_files) - 10} more")
print()
print_info(f"Recommend: rename {source_dir.name}/ to {source_dir.name}.pre-migration/")
print_info("This prevents the agent from discovering old workspace directories.")
print_info("You can always rename it back if needed.")
print()
if auto_yes or prompt_yes_no(f"Archive {source_dir} now?", default=True):
try:
archive_path = _archive_directory(source_dir)
print_success(f"Archived: {source_dir} → {archive_path}")
print_info("The original directory has been renamed, not deleted.")
print_info(f"To undo: mv {archive_path} {source_dir}")
except OSError as e:
print_error(f"Could not archive: {e}")
print_info(f"You can do it manually: mv {source_dir} {source_dir}.pre-migration")
else:
print_info("Skipped. You can archive later with: hermes claw cleanup")
def _cmd_cleanup(args):
"""Archive leftover OpenClaw directories after migration.
Scans for OpenClaw directories that still exist after migration and offers
to rename them to .pre-migration to prevent state fragmentation.
"""
dry_run = getattr(args, "dry_run", False)
auto_yes = getattr(args, "yes", False)
explicit_source = getattr(args, "source", None)
print()
print(
color(
"┌─────────────────────────────────────────────────────────┐",
Colors.MAGENTA,
)
)
print(
color(
"│ ⚕ Hermes — OpenClaw Cleanup │",
Colors.MAGENTA,
)
)
print(
color(
"└─────────────────────────────────────────────────────────┘",
Colors.MAGENTA,
)
)
# Find OpenClaw directories
if explicit_source:
dirs_to_check = [Path(explicit_source)]
else:
dirs_to_check = _find_openclaw_dirs()
if not dirs_to_check:
print()
print_success("No OpenClaw directories found. Nothing to clean up.")
return
total_archived = 0
for source_dir in dirs_to_check:
print()
print_header(f"Found: {source_dir}")
# Scan for state files
state_files = _scan_workspace_state(source_dir)
# Show directory stats
try:
workspace_dirs = [
d for d in source_dir.iterdir()
if d.is_dir() and not d.name.startswith(".")
and any((d / name).exists() for name in ("todo.json", "SOUL.md", "MEMORY.md", "USER.md"))
]
except OSError:
workspace_dirs = []
if workspace_dirs:
print_info(f"Workspace directories: {len(workspace_dirs)}")
for ws in workspace_dirs[:5]:
items = []
if (ws / "todo.json").exists():
items.append("todo.json")
if (ws / "sessions").is_dir():
items.append("sessions/")
if (ws / "SOUL.md").exists():
items.append("SOUL.md")
if (ws / "MEMORY.md").exists():
items.append("MEMORY.md")
detail = ", ".join(items) if items else "empty"
print(f" {ws.name}/ ({detail})")
if len(workspace_dirs) > 5:
print(f" ... and {len(workspace_dirs) - 5} more")
if state_files:
print()
print(color(f" {len(state_files)} state file(s) that could cause confusion:", Colors.YELLOW))
for path, desc in state_files[:8]:
print(f" {desc}")
if len(state_files) > 8:
print(f" ... and {len(state_files) - 8} more")
print()
if dry_run:
archive_path = _archive_directory(source_dir, dry_run=True)
print_info(f"Would archive: {source_dir} → {archive_path}")
else:
if auto_yes or prompt_yes_no(f"Archive {source_dir}?", default=True):
try:
archive_path = _archive_directory(source_dir)
print_success(f"Archived: {source_dir} → {archive_path}")
total_archived += 1
except OSError as e:
print_error(f"Could not archive: {e}")
print_info(f"Try manually: mv {source_dir} {source_dir}.pre-migration")
else:
print_info("Skipped.")
# Summary
print()
if dry_run:
print_info(f"Dry run complete. {len(dirs_to_check)} directory(ies) would be archived.")
print_info("Run without --dry-run to archive them.")
elif total_archived:
print_success(f"Cleaned up {total_archived} OpenClaw directory(ies).")
print_info("Directories were renamed, not deleted. You can undo by renaming them back.")
else:
print_info("No directories were archived.")
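The archive naming in `_archive_directory` falls back from a plain `.pre-migration` suffix to a dated one, then to a dated one with a counter. The sketch below replays that naming logic in a temporary directory; `next_archive_path` is a hypothetical helper, and the date is passed in as a fixed string so the result is deterministic:

```python
import tempfile
from pathlib import Path


def next_archive_path(source_dir: Path, today: str) -> Path:
    """Pick the first free archive name next to source_dir."""
    parent, base = source_dir.parent, source_dir.name
    candidate = parent / f"{base}.pre-migration"
    if not candidate.exists():
        return candidate
    # Plain name is taken: add the date, then a counter starting at 2.
    candidate = parent / f"{base}.pre-migration-{today}"
    counter = 2
    while candidate.exists():
        candidate = parent / f"{base}.pre-migration-{today}-{counter}"
        counter += 1
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    names = []
    for _ in range(3):
        src = home / ".openclaw"
        src.mkdir()  # simulate the directory reappearing
        target = next_archive_path(src, today="20260330")
        src.rename(target)  # rename, never delete
        names.append(target.name)

print(names)
```

Three archive runs on the same day produce three distinct names, so an earlier archive is never overwritten.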
def _print_migration_report(report: dict, dry_run: bool):
"""Print a formatted migration report."""
+4 -1
@@ -12,6 +12,8 @@ import os
logger = logging.getLogger(__name__)
DEFAULT_CODEX_MODELS: List[str] = [
"gpt-5.4-mini",
"gpt-5.4",
"gpt-5.3-codex",
"gpt-5.2-codex",
"gpt-5.1-codex-max",
@@ -19,8 +21,9 @@ DEFAULT_CODEX_MODELS: List[str] = [
]
_FORWARD_COMPAT_TEMPLATE_MODELS: List[tuple[str, tuple[str, ...]]] = [
("gpt-5.3-codex", ("gpt-5.2-codex",)),
("gpt-5.4-mini", ("gpt-5.3-codex", "gpt-5.2-codex")),
("gpt-5.4", ("gpt-5.3-codex", "gpt-5.2-codex")),
("gpt-5.3-codex", ("gpt-5.2-codex",)),
("gpt-5.3-codex-spark", ("gpt-5.3-codex", "gpt-5.2-codex")),
]
+18 -2
@@ -1,8 +1,24 @@
"""Shared ANSI color utilities for Hermes CLI modules."""
import os
import sys
def should_use_color() -> bool:
"""Return True when colored output is appropriate.
Respects the NO_COLOR environment variable (https://no-color.org/)
and TERM=dumb, in addition to the existing TTY check.
"""
if os.environ.get("NO_COLOR") is not None:
return False
if os.environ.get("TERM") == "dumb":
return False
if not sys.stdout.isatty():
return False
return True
class Colors:
RESET = "\033[0m"
BOLD = "\033[1m"
@@ -16,7 +32,7 @@ class Colors:
def color(text: str, *codes) -> str:
"""Apply color codes to text (only when output is a TTY)."""
if not sys.stdout.isatty():
"""Apply color codes to text (only when color output is appropriate)."""
if not should_use_color():
return text
return "".join(codes) + text + Colors.RESET
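The decision order in `should_use_color` (NO_COLOR, then TERM=dumb, then the TTY check) can be exercised without touching the real environment. The variant below takes the environment and stream as parameters; those parameters and the `_FakeTTY` class are assumptions made for testability and are not part of the module in the diff:

```python
import sys


class _FakeTTY:
    """Hypothetical stream that claims to be a terminal."""

    def isatty(self) -> bool:
        return True


def should_use_color(env, stream=None) -> bool:
    stream = sys.stdout if stream is None else stream
    if env.get("NO_COLOR") is not None:
        return False  # set to any value, including the empty string
    if env.get("TERM") == "dumb":
        return False
    return stream.isatty()


def color(text: str, code: str, env, stream=None) -> str:
    if not should_use_color(env, stream):
        return text
    return code + text + "\033[0m"


tty = _FakeTTY()
print(repr(color("ok", "\033[32m", {}, tty)))
print(repr(color("ok", "\033[32m", {"NO_COLOR": "1"}, tty)))
```

As written in the diff, an empty `NO_COLOR=` also disables color. The no-color.org text, as I understand it, only asks for this when the variable is non-empty, so the diff takes the stricter reading.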
+68
@@ -71,6 +71,7 @@ COMMAND_REGISTRY: list[CommandDef] = [
aliases=("q",), args_hint="<prompt>"),
CommandDef("status", "Show session info", "Session",
gateway_only=True),
CommandDef("profile", "Show active profile name and home directory", "Info"),
CommandDef("sethome", "Set this chat as the home channel", "Session",
gateway_only=True, aliases=("set-home",)),
CommandDef("resume", "Resume a previously-named session", "Session",
@@ -90,6 +91,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("verbose", "Cycle tool progress display: off -> new -> all -> verbose",
"Configuration", cli_only=True,
gateway_config_gate="display.tool_progress_command"),
CommandDef("yolo", "Toggle YOLO mode (skip all dangerous command approvals)",
"Configuration"),
CommandDef("reasoning", "Manage reasoning effort and display", "Configuration",
args_hint="[level|show|hide]",
subcommands=("none", "low", "minimal", "medium", "high", "xhigh", "show", "hide", "on", "off")),
@@ -118,6 +121,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
"Tools & Skills", cli_only=True),
# Info
CommandDef("commands", "Browse all commands and skills (paginated)", "Info",
gateway_only=True, args_hint="[page]"),
CommandDef("help", "Show available commands", "Info"),
CommandDef("usage", "Show token usage for the current session", "Info"),
CommandDef("insights", "Show usage insights and analytics", "Info",
@@ -361,6 +366,69 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
return result
def telegram_menu_commands(max_commands: int = 100) -> tuple[list[tuple[str, str]], int]:
"""Return Telegram menu commands capped to the Bot API limit.
Priority order (higher priority = never bumped by overflow):
1. Core CommandDef commands (always included)
2. Plugin slash commands (take precedence over skills)
3. Built-in skill commands (fill remaining slots, alphabetical)
Skills are the only tier that gets trimmed when the cap is hit.
User-installed hub skills are excluded; they remain accessible via /skills.
Returns:
(menu_commands, hidden_count) where hidden_count is the number of
skill commands omitted due to the cap.
"""
all_commands = list(telegram_bot_commands())
# Plugin slash commands get priority over skills
try:
from hermes_cli.plugins import get_plugin_manager
pm = get_plugin_manager()
plugin_cmds = getattr(pm, "_plugin_commands", {})
for cmd_name in sorted(plugin_cmds):
tg_name = cmd_name.replace("-", "_")
desc = "Plugin command"
if len(desc) > 40:
desc = desc[:37] + "..."
all_commands.append((tg_name, desc))
except Exception:
pass
# Remaining slots go to built-in skill commands (not hub-installed).
skill_entries: list[tuple[str, str]] = []
try:
from agent.skill_commands import get_skill_commands
from tools.skills_tool import SKILLS_DIR
_skills_dir = str(SKILLS_DIR.resolve())
_hub_dir = str((SKILLS_DIR / ".hub").resolve())
skill_cmds = get_skill_commands()
for cmd_key in sorted(skill_cmds):
info = skill_cmds[cmd_key]
skill_path = info.get("skill_md_path", "")
if not skill_path.startswith(_skills_dir):
continue
if skill_path.startswith(_hub_dir):
continue
name = cmd_key.lstrip("/").replace("-", "_")
desc = info.get("description", "")
# Keep descriptions short — setMyCommands has an undocumented
# total payload limit. 40 chars fits 100 commands safely.
if len(desc) > 40:
desc = desc[:37] + "..."
skill_entries.append((name, desc))
except Exception:
pass
# Skills fill remaining slots — they're the only tier that gets trimmed
remaining_slots = max(0, max_commands - len(all_commands))
hidden_count = max(0, len(skill_entries) - remaining_slots)
all_commands.extend(skill_entries[:remaining_slots])
return all_commands[:max_commands], hidden_count
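The capping rule in `telegram_menu_commands` boils down to: core and plugin commands are added first, and skills only fill whatever slots remain. A compact sketch of that arithmetic with invented command lists; `cap_menu` and `shorten` are hypothetical helpers, not functions from the module:

```python
def shorten(desc: str, limit: int = 40) -> str:
    """Truncate a description to `limit` characters with an ellipsis."""
    return desc if len(desc) <= limit else desc[: limit - 3] + "..."


def cap_menu(core, plugins, skills, max_commands=100):
    """Return (menu, hidden_skill_count) under the command cap."""
    commands = list(core) + list(plugins)
    remaining = max(0, max_commands - len(commands))
    hidden = max(0, len(skills) - remaining)
    commands.extend(skills[:remaining])
    # The final slice mirrors the diff's defensive truncation.
    return commands[:max_commands], hidden


core = [("help", "Show available commands"), ("usage", "Show token usage"),
        ("profile", "Show active profile")]
plugins = [("my_plugin", "Plugin command")]
skills = [(f"skill_{i}", shorten("x" * 50)) for i in range(5)]

menu, hidden = cap_menu(core, plugins, skills, max_commands=6)
print(len(menu), hidden)
```

Because the final slice applies to the combined list, core and plugin commands would also be cut if they alone exceeded the cap. `hidden_count` only tracks omitted skills.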
def slack_subcommand_map() -> dict[str, str]:
"""Return subcommand -> /command mapping for Slack /hermes handler.
+154 -15
@@ -34,6 +34,8 @@ _EXTRA_ENV_KEYS = frozenset({
"SIGNAL_ACCOUNT", "SIGNAL_HTTP_URL",
"SIGNAL_ALLOWED_USERS", "SIGNAL_GROUP_ALLOWED_USERS",
"DINGTALK_CLIENT_ID", "DINGTALK_CLIENT_SECRET",
"FEISHU_APP_ID", "FEISHU_APP_SECRET", "FEISHU_ENCRYPT_KEY", "FEISHU_VERIFICATION_TOKEN",
"WECOM_BOT_ID", "WECOM_SECRET",
"TERMINAL_ENV", "TERMINAL_SSH_KEY", "TERMINAL_SSH_PORT",
"WHATSAPP_MODE", "WHATSAPP_ENABLED",
"MATTERMOST_HOME_CHANNEL", "MATTERMOST_REPLY_MODE",
@@ -50,26 +52,86 @@ from hermes_cli.default_soul import DEFAULT_SOUL_MD
# Managed mode (NixOS declarative config)
# =============================================================================
_MANAGED_TRUE_VALUES = ("true", "1", "yes")
_MANAGED_SYSTEM_NAMES = {
"brew": "Homebrew",
"homebrew": "Homebrew",
"nix": "NixOS",
"nixos": "NixOS",
}
def get_managed_system() -> Optional[str]:
"""Return the package manager owning this install, if any."""
raw = os.getenv("HERMES_MANAGED", "").strip()
if raw:
normalized = raw.lower()
if normalized in _MANAGED_TRUE_VALUES:
return "NixOS"
return _MANAGED_SYSTEM_NAMES.get(normalized, raw)
managed_marker = get_hermes_home() / ".managed"
if managed_marker.exists():
return "NixOS"
return None
def is_managed() -> bool:
"""Check if hermes is running in Nix-managed mode.
"""Check if Hermes is running in package-manager-managed mode.
Two signals: the HERMES_MANAGED env var (set by the systemd service),
or a .managed marker file in HERMES_HOME (set by the NixOS activation
script, so interactive shells also see it).
"""
if os.getenv("HERMES_MANAGED", "").lower() in ("true", "1", "yes"):
return True
managed_marker = get_hermes_home() / ".managed"
return managed_marker.exists()
return get_managed_system() is not None
def get_managed_update_command() -> Optional[str]:
"""Return the preferred upgrade command for a managed install."""
managed_system = get_managed_system()
if managed_system == "Homebrew":
return "brew upgrade hermes-agent"
if managed_system == "NixOS":
return "sudo nixos-rebuild switch"
return None
def recommended_update_command() -> str:
"""Return the best update command for the current installation."""
return get_managed_update_command() or "hermes update"
def format_managed_message(action: str = "modify this Hermes installation") -> str:
"""Build a user-facing error for managed installs."""
managed_system = get_managed_system() or "a package manager"
raw = os.getenv("HERMES_MANAGED", "").strip().lower()
if managed_system == "NixOS":
env_hint = "true" if raw in _MANAGED_TRUE_VALUES else raw or "true"
return (
f"Cannot {action}: this Hermes installation is managed by NixOS "
f"(HERMES_MANAGED={env_hint}).\n"
"Edit services.hermes-agent.settings in your configuration.nix and run:\n"
" sudo nixos-rebuild switch"
)
if managed_system == "Homebrew":
env_hint = raw or "homebrew"
return (
f"Cannot {action}: this Hermes installation is managed by Homebrew "
f"(HERMES_MANAGED={env_hint}).\n"
"Use:\n"
" brew upgrade hermes-agent"
)
return (
f"Cannot {action}: this Hermes installation is managed by {managed_system}.\n"
"Use your package manager to upgrade or reinstall Hermes."
)
def managed_error(action: str = "modify configuration"):
"""Print user-friendly error for managed mode."""
print(
f"Cannot {action}: configuration is managed by NixOS (HERMES_MANAGED=true).\n"
"Edit services.hermes-agent.settings in your configuration.nix and run:\n"
" sudo nixos-rebuild switch",
file=sys.stderr,
)
print(format_managed_message(action), file=sys.stderr)
# =============================================================================
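The managed-mode helpers above map the `HERMES_MANAGED` value (or the `.managed` marker file) to a package manager name and then to an update command. The following sketch restates that mapping as pure functions; passing the environment value and marker state as arguments is an assumption made so the logic can be checked in isolation:

```python
from typing import Optional

TRUE_VALUES = ("true", "1", "yes")
SYSTEM_NAMES = {"brew": "Homebrew", "homebrew": "Homebrew",
                "nix": "NixOS", "nixos": "NixOS"}


def managed_system(raw_env: str, marker_exists: bool) -> Optional[str]:
    raw = raw_env.strip()
    if raw:
        normalized = raw.lower()
        if normalized in TRUE_VALUES:
            return "NixOS"  # legacy boolean values imply NixOS
        # Unknown names are passed through unchanged.
        return SYSTEM_NAMES.get(normalized, raw)
    return "NixOS" if marker_exists else None


def update_command(system: Optional[str]) -> str:
    commands = {"Homebrew": "brew upgrade hermes-agent",
                "NixOS": "sudo nixos-rebuild switch"}
    return commands.get(system, "hermes update")


for env_value, marker in (("true", False), ("brew", False),
                          ("", True), ("", False), ("apt", False)):
    system = managed_system(env_value, marker)
    print(repr(env_value), marker, system, "->", update_command(system))
```

An unrecognised value such as `apt` still counts as managed, so `is_managed()` would be true, yet the suggested command falls back to `hermes update`. Whether that combination is intended is not stated in the diff.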
@@ -135,9 +197,16 @@ def ensure_hermes_home():
DEFAULT_CONFIG = {
"model": "anthropic/claude-opus-4.6",
"fallback_providers": [],
"toolsets": ["hermes-cli"],
"agent": {
"max_turns": 90,
# Tool-use enforcement: injects system prompt guidance that tells the
# model to actually call tools instead of describing intended actions.
# Values: "auto" (default — applies to gpt/codex models), true/false
# (force on/off for all models), or a list of model-name substrings
# to match (e.g. ["gpt", "codex", "gemini", "qwen"]).
"tool_use_enforcement": "auto",
},
"terminal": {
@@ -214,49 +283,57 @@ DEFAULT_CONFIG = {
"model": "", # e.g. "google/gemini-2.5-flash", "gpt-4o"
"base_url": "", # direct OpenAI-compatible endpoint (takes precedence over provider)
"api_key": "", # API key for base_url (falls back to OPENAI_API_KEY)
"timeout": 30, # seconds — increase for slow local vision models
"timeout": 30, # seconds — LLM API call timeout; increase for slow local vision models
"download_timeout": 30, # seconds — image HTTP download timeout; increase for slow connections
},
"web_extract": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30, # seconds — increase for slow local models
},
"compression": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 120, # seconds — compression summarises large contexts; increase for local models
},
"session_search": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
"skills_hub": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
"approval": {
"provider": "auto",
"model": "", # fast/cheap model recommended (e.g. gemini-flash, haiku)
"base_url": "",
"api_key": "",
"timeout": 30,
},
"mcp": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
"flush_memories": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
},
@@ -264,12 +341,14 @@ DEFAULT_CONFIG = {
"compact": False,
"personality": "kawaii",
"resume_display": "full",
"busy_input_mode": "interrupt",
"bell_on_complete": False,
"show_reasoning": False,
"streaming": False,
"show_cost": False, # Show $ cost in the status bar (off by default)
"skin": "default",
"tool_progress_command": False, # Enable /verbose command in messaging gateway
"tool_preview_length": 0, # Max chars for tool call previews (0 = no limit, show full paths/commands)
},
# Privacy settings
@@ -352,6 +431,13 @@ DEFAULT_CONFIG = {
# Never saved to sessions, logs, or trajectories.
"prefill_messages_file": "",
# Skills — external skill directories for sharing skills across tools/agents.
# Each path is expanded (~, ${VAR}) and resolved. Read-only — skill creation
# always goes to ~/.hermes/skills/.
"skills": {
"external_dirs": [], # e.g. ["~/.agents/skills", "/shared/team-skills"]
},
# Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
# This section is only needed for hermes-specific overrides; everything else
# (apiKey, workspace, peerName, sessions, enabled) comes from the global config.
@@ -382,6 +468,7 @@ DEFAULT_CONFIG = {
# off — skip all approval prompts (equivalent to --yolo)
"approvals": {
"mode": "manual",
"timeout": 60,
},
# Permanently allowed dangerous command patterns (added via "always" approval)
@@ -407,6 +494,12 @@ DEFAULT_CONFIG = {
},
},
"cron": {
# Wrap delivered cron responses with a header (task name) and footer
# ("The agent cannot see this message"). Set to false for clean output.
"wrap_response": True,
},
# Config schema version - bump this when adding new required fields
"_config_version": 10,
}
@@ -546,14 +639,14 @@ OPTIONAL_ENV_VARS = {
"category": "provider",
},
"DASHSCOPE_API_KEY": {
"description": "Alibaba Cloud DashScope API key for Qwen models",
"description": "Alibaba Cloud DashScope API key (Qwen + multi-provider models)",
"prompt": "DashScope API Key",
"url": "https://modelstudio.console.alibabacloud.com/",
"password": True,
"category": "provider",
},
"DASHSCOPE_BASE_URL": {
"description": "Custom DashScope base URL (default: international endpoint)",
"description": "Custom DashScope base URL (default: coding-intl OpenAI-compat endpoint)",
"prompt": "DashScope Base URL",
"url": "",
"password": False,
@@ -592,8 +685,31 @@ OPTIONAL_ENV_VARS = {
"category": "provider",
"advanced": True,
},
"HF_TOKEN": {
"description": "Hugging Face token for Inference Providers (20+ open models via router.huggingface.co)",
"prompt": "Hugging Face Token",
"url": "https://huggingface.co/settings/tokens",
"password": True,
"category": "provider",
},
"HF_BASE_URL": {
"description": "Hugging Face Inference Providers base URL override",
"prompt": "HF base URL (leave empty for default)",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
# ── Tool API keys ──
"EXA_API_KEY": {
"description": "Exa API key for AI-native web search and contents",
"prompt": "Exa API key",
"url": "https://exa.ai/",
"tools": ["web_search", "web_extract"],
"password": True,
"category": "tool",
},
"PARALLEL_API_KEY": {
"description": "Parallel API key for AI-native web search and extract",
"prompt": "Parallel API key",
@@ -650,6 +766,14 @@ OPTIONAL_ENV_VARS = {
"password": True,
"category": "tool",
},
"CAMOFOX_URL": {
"description": "Camofox browser server URL for local anti-detection browsing (e.g. http://localhost:9377)",
"prompt": "Camofox server URL",
"url": "https://github.com/jo-inc/camofox-browser",
"tools": ["browser_navigate", "browser_click"],
"password": False,
"category": "tool",
},
"FAL_KEY": {
"description": "FAL API key for image generation",
"prompt": "FAL API key",
@@ -780,6 +904,20 @@ OPTIONAL_ENV_VARS = {
"password": False,
"category": "messaging",
},
"MATTERMOST_REQUIRE_MENTION": {
"description": "Require @mention in Mattermost channels (default: true). Set to false to respond to all messages.",
"prompt": "Require @mention in channels",
"url": None,
"password": False,
"category": "messaging",
},
"MATTERMOST_FREE_RESPONSE_CHANNELS": {
"description": "Comma-separated Mattermost channel IDs where bot responds without @mention",
"prompt": "Free-response channel IDs (comma-separated)",
"url": None,
"password": False,
"category": "messaging",
},
"MATRIX_HOMESERVER": {
"description": "Matrix homeserver URL (e.g. https://matrix.example.org)",
"prompt": "Matrix homeserver URL",
@@ -1650,6 +1788,7 @@ def show_config():
keys = [
("OPENROUTER_API_KEY", "OpenRouter"),
("VOICE_TOOLS_OPENAI_KEY", "OpenAI (STT/TTS)"),
("EXA_API_KEY", "Exa"),
("PARALLEL_API_KEY", "Parallel"),
("FIRECRAWL_API_KEY", "Firecrawl"),
("TAVILY_API_KEY", "Tavily"),
@@ -1809,7 +1948,7 @@ def set_config_value(key: str, value: str):
# Check if it's an API key (goes to .env)
api_keys = [
'OPENROUTER_API_KEY', 'OPENAI_API_KEY', 'ANTHROPIC_API_KEY', 'VOICE_TOOLS_OPENAI_KEY',
'PARALLEL_API_KEY', 'FIRECRAWL_API_KEY', 'FIRECRAWL_API_URL', 'TAVILY_API_KEY',
'EXA_API_KEY', 'PARALLEL_API_KEY', 'FIRECRAWL_API_KEY', 'FIRECRAWL_API_URL', 'TAVILY_API_KEY',
'BROWSERBASE_API_KEY', 'BROWSERBASE_PROJECT_ID', 'BROWSER_USE_API_KEY',
'FAL_KEY', 'TELEGRAM_BOT_TOKEN', 'DISCORD_BOT_TOKEN',
'TERMINAL_SSH_HOST', 'TERMINAL_SSH_USER', 'TERMINAL_SSH_KEY',
+36 -4
@@ -4,7 +4,8 @@ Used by `hermes tools` and `hermes skills` for interactive checklists.
Provides a curses multi-select with keyboard navigation, plus a
text-based numbered fallback for terminals without curses support.
"""
from typing import List, Set
import sys
from typing import Callable, List, Optional, Set
from hermes_cli.colors import Colors, color
@@ -15,6 +16,7 @@ def curses_checklist(
selected: Set[int],
*,
cancel_returns: Set[int] | None = None,
status_fn: Optional[Callable[[Set[int]], str]] = None,
) -> Set[int]:
"""Curses multi-select checklist. Returns set of selected indices.
@@ -23,10 +25,18 @@ def curses_checklist(
items: Display labels for each row.
selected: Indices that start checked (pre-selected).
cancel_returns: Returned on ESC/q. Defaults to the original *selected*.
status_fn: Optional callback ``f(chosen_indices) -> str`` whose return
value is rendered on the bottom row of the terminal. Use this for
live aggregate info (e.g. estimated token counts).
"""
if cancel_returns is None:
cancel_returns = set(selected)
# Safety: curses and input() both hang or spin when stdin is not a
# terminal (e.g. subprocess pipe). Return defaults immediately.
if not sys.stdin.isatty():
return cancel_returns
try:
import curses
chosen = set(selected)
@@ -47,6 +57,9 @@ def curses_checklist(
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
# Reserve bottom row for status bar when status_fn provided
footer_rows = 1 if status_fn else 0
# Header
try:
hattr = curses.A_BOLD
@@ -62,7 +75,7 @@ def curses_checklist(
pass
# Scrollable item list
visible_rows = max_y - 3
visible_rows = max_y - 3 - footer_rows
if cursor < scroll_offset:
scroll_offset = cursor
elif cursor >= scroll_offset + visible_rows:
@@ -72,7 +85,7 @@ def curses_checklist(
range(scroll_offset, min(len(items), scroll_offset + visible_rows))
):
y = draw_i + 3
if y >= max_y - 1:
if y >= max_y - 1 - footer_rows:
break
check = "" if i in chosen else " "
arrow = "" if i == cursor else " "
@@ -87,6 +100,20 @@ def curses_checklist(
except curses.error:
pass
# Status bar (bottom row, right-aligned)
if status_fn:
try:
status_text = status_fn(chosen)
if status_text:
# Right-align on the bottom row
sx = max(0, max_x - len(status_text) - 1)
sattr = curses.A_DIM
if curses.has_colors():
sattr |= curses.color_pair(3)
stdscr.addnstr(max_y - 1, sx, status_text, max_x - sx - 1, sattr)
except curses.error:
pass
stdscr.refresh()
key = stdscr.getch()
@@ -107,7 +134,7 @@ def curses_checklist(
return result_holder[0] if result_holder[0] is not None else cancel_returns
except Exception:
return _numbered_fallback(title, items, selected, cancel_returns)
return _numbered_fallback(title, items, selected, cancel_returns, status_fn)
def _numbered_fallback(
@@ -115,6 +142,7 @@ def _numbered_fallback(
items: List[str],
selected: Set[int],
cancel_returns: Set[int],
status_fn: Optional[Callable[[Set[int]], str]] = None,
) -> Set[int]:
"""Text-based toggle fallback for terminals without curses."""
chosen = set(selected)
@@ -125,6 +153,10 @@ def _numbered_fallback(
for i, label in enumerate(items):
marker = color("[✓]", Colors.GREEN) if i in chosen else "[ ]"
print(f" {marker} {i + 1:>2}. {label}")
if status_fn:
status_text = status_fn(chosen)
if status_text:
print(color(f"\n {status_text}", Colors.DIM))
print()
try:
val = input(color(" Toggle # (or Enter to confirm): ", Colors.DIM)).strip()
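The `status_fn` hook introduced in this file receives the set of currently checked indices and returns the text for the bottom row (or the line printed under the numbered fallback). A small sketch of such a callback, assuming a hypothetical per-row token estimate table; `TOKEN_ESTIMATES` and `token_status` are illustrative names, not part of the diff:

```python
from typing import Set

# Hypothetical token estimates, indexed in the same order as the checklist rows.
TOKEN_ESTIMATES = [1200, 450, 3100, 800]


def token_status(chosen: Set[int]) -> str:
    """Summarize the checked rows; an empty string means "draw nothing"."""
    if not chosen:
        return ""
    total = sum(TOKEN_ESTIMATES[i] for i in chosen if 0 <= i < len(TOKEN_ESTIMATES))
    return f"{len(chosen)} selected, ~{total:,} tokens"
```

Because both the curses path and the fallback skip rendering when the callback returns a falsy value, returning `""` is a cheap way to hide the status bar while nothing is selected.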
+91 -35
@@ -10,9 +10,11 @@ import subprocess
import shutil
from hermes_cli.config import get_project_root, get_hermes_home, get_env_path
from hermes_constants import display_hermes_home
PROJECT_ROOT = get_project_root()
HERMES_HOME = get_hermes_home()
_DHH = display_hermes_home() # user-facing display path (e.g. ~/.hermes or ~/.hermes/profiles/coder)
# Load environment variables from ~/.hermes/.env so API key checks work
from dotenv import load_dotenv
@@ -56,7 +58,7 @@ def _honcho_is_configured_for_doctor() -> bool:
from honcho_integration.client import HonchoClientConfig
cfg = HonchoClientConfig.from_global_config()
return bool(cfg.enabled and cfg.api_key)
return bool(cfg.enabled and (cfg.api_key or cfg.base_url))
except Exception:
return False
@@ -209,14 +211,14 @@ def run_doctor(args):
# Check ~/.hermes/.env (primary location for user config)
env_path = HERMES_HOME / '.env'
if env_path.exists():
check_ok("~/.hermes/.env file exists")
check_ok(f"{_DHH}/.env file exists")
# Check for common issues
content = env_path.read_text()
if _has_provider_env_config(content):
check_ok("API key or custom endpoint configured")
else:
check_warn("No API key found in ~/.hermes/.env")
check_warn(f"No API key found in {_DHH}/.env")
issues.append("Run 'hermes setup' to configure API keys")
else:
# Also check project root as fallback
@@ -224,11 +226,11 @@ def run_doctor(args):
if fallback_env.exists():
check_ok(".env file exists (in project directory)")
else:
check_fail("~/.hermes/.env file missing")
check_fail(f"{_DHH}/.env file missing")
if should_fix:
env_path.parent.mkdir(parents=True, exist_ok=True)
env_path.touch()
check_ok("Created empty ~/.hermes/.env")
check_ok(f"Created empty {_DHH}/.env")
check_info("Run 'hermes setup' to configure API keys")
fixed_count += 1
else:
@@ -238,7 +240,7 @@ def run_doctor(args):
# Check ~/.hermes/config.yaml (primary) or project cli-config.yaml (fallback)
config_path = HERMES_HOME / 'config.yaml'
if config_path.exists():
check_ok("~/.hermes/config.yaml exists")
check_ok(f"{_DHH}/config.yaml exists")
else:
fallback_config = PROJECT_ROOT / 'cli-config.yaml'
if fallback_config.exists():
@@ -248,11 +250,11 @@ def run_doctor(args):
if should_fix and example_config.exists():
config_path.parent.mkdir(parents=True, exist_ok=True)
shutil.copy2(str(example_config), str(config_path))
check_ok("Created ~/.hermes/config.yaml from cli-config.yaml.example")
check_ok(f"Created {_DHH}/config.yaml from cli-config.yaml.example")
fixed_count += 1
elif should_fix:
check_warn("config.yaml not found and no example to copy from")
manual_issues.append("Create ~/.hermes/config.yaml manually")
manual_issues.append(f"Create {_DHH}/config.yaml manually")
else:
check_warn("config.yaml not found", "(using defaults)")
@@ -294,28 +296,28 @@ def run_doctor(args):
hermes_home = HERMES_HOME
if hermes_home.exists():
check_ok("~/.hermes directory exists")
check_ok(f"{_DHH} directory exists")
else:
if should_fix:
hermes_home.mkdir(parents=True, exist_ok=True)
check_ok("Created ~/.hermes directory")
check_ok(f"Created {_DHH} directory")
fixed_count += 1
else:
check_warn("~/.hermes not found", "(will be created on first use)")
check_warn(f"{_DHH} not found", "(will be created on first use)")
# Check expected subdirectories
expected_subdirs = ["cron", "sessions", "logs", "skills", "memories"]
for subdir_name in expected_subdirs:
subdir_path = hermes_home / subdir_name
if subdir_path.exists():
check_ok(f"~/.hermes/{subdir_name}/ exists")
check_ok(f"{_DHH}/{subdir_name}/ exists")
else:
if should_fix:
subdir_path.mkdir(parents=True, exist_ok=True)
check_ok(f"Created ~/.hermes/{subdir_name}/")
check_ok(f"Created {_DHH}/{subdir_name}/")
fixed_count += 1
else:
check_warn(f"~/.hermes/{subdir_name}/ not found", "(will be created on first use)")
check_warn(f"{_DHH}/{subdir_name}/ not found", "(will be created on first use)")
# Check for SOUL.md persona file
soul_path = hermes_home / "SOUL.md"
@@ -324,11 +326,11 @@ def run_doctor(args):
# Check if it's just the template comments (no real content)
lines = [l for l in content.splitlines() if l.strip() and not l.strip().startswith(("<!--", "-->", "#"))]
if lines:
check_ok("~/.hermes/SOUL.md exists (persona configured)")
check_ok(f"{_DHH}/SOUL.md exists (persona configured)")
else:
check_info("~/.hermes/SOUL.md exists but is empty — edit it to customize personality")
check_info(f"{_DHH}/SOUL.md exists but is empty — edit it to customize personality")
else:
check_warn("~/.hermes/SOUL.md not found", "(create it to give Hermes a custom personality)")
check_warn(f"{_DHH}/SOUL.md not found", "(create it to give Hermes a custom personality)")
if should_fix:
soul_path.parent.mkdir(parents=True, exist_ok=True)
soul_path.write_text(
@@ -337,13 +339,13 @@ def run_doctor(args):
"You are Hermes, a helpful AI assistant.\n",
encoding="utf-8",
)
check_ok("Created ~/.hermes/SOUL.md with basic template")
check_ok(f"Created {_DHH}/SOUL.md with basic template")
fixed_count += 1
# Check memory directory
memories_dir = hermes_home / "memories"
if memories_dir.exists():
check_ok("~/.hermes/memories/ directory exists")
check_ok(f"{_DHH}/memories/ directory exists")
memory_file = memories_dir / "MEMORY.md"
user_file = memories_dir / "USER.md"
if memory_file.exists():
@@ -357,10 +359,10 @@ def run_doctor(args):
else:
check_info("USER.md not created yet (will be created when the agent first writes a memory)")
else:
check_warn("~/.hermes/memories/ not found", "(will be created on first use)")
check_warn(f"{_DHH}/memories/ not found", "(will be created on first use)")
if should_fix:
memories_dir.mkdir(parents=True, exist_ok=True)
check_ok("Created ~/.hermes/memories/")
check_ok(f"Created {_DHH}/memories/")
fixed_count += 1
# Check SQLite session store
@@ -372,11 +374,11 @@ def run_doctor(args):
cursor = conn.execute("SELECT COUNT(*) FROM sessions")
count = cursor.fetchone()[0]
conn.close()
check_ok(f"~/.hermes/state.db exists ({count} sessions)")
check_ok(f"{_DHH}/state.db exists ({count} sessions)")
except Exception as e:
check_warn(f"~/.hermes/state.db exists but has issues: {e}")
check_warn(f"{_DHH}/state.db exists but has issues: {e}")
else:
check_info("~/.hermes/state.db not created yet (will be created on first session)")
check_info(f"{_DHH}/state.db not created yet (will be created on first session)")
_check_gateway_service_linger(issues)
@@ -404,8 +406,11 @@ def run_doctor(args):
if terminal_env == "docker":
if shutil.which("docker"):
# Check if docker daemon is running
result = subprocess.run(["docker", "info"], capture_output=True)
if result.returncode == 0:
try:
result = subprocess.run(["docker", "info"], capture_output=True, timeout=10)
except subprocess.TimeoutExpired:
result = None
if result is not None and result.returncode == 0:
check_ok("docker", "(daemon running)")
else:
check_fail("docker daemon not running")
@@ -424,12 +429,16 @@ def run_doctor(args):
ssh_host = os.getenv("TERMINAL_SSH_HOST")
if ssh_host:
# Try to connect
result = subprocess.run(
["ssh", "-o", "ConnectTimeout=5", "-o", "BatchMode=yes", ssh_host, "echo ok"],
capture_output=True,
text=True
)
if result.returncode == 0:
try:
result = subprocess.run(
["ssh", "-o", "ConnectTimeout=5", "-o", "BatchMode=yes", ssh_host, "echo ok"],
capture_output=True,
text=True,
timeout=15
)
except subprocess.TimeoutExpired:
result = None
if result is not None and result.returncode == 0:
check_ok(f"SSH connection to {ssh_host}")
else:
check_fail(f"SSH connection to {ssh_host}")
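Both probes above (`docker info` and the SSH check) follow the same shape: run with a timeout, map `TimeoutExpired` to `None`, and treat `None` as a failure. The pattern could be factored out roughly as follows; `run_with_timeout` and `probe_ok` are hypothetical helper names used only for illustration:

```python
import subprocess
import sys
from typing import List, Optional


def run_with_timeout(cmd: List[str], timeout: float) -> Optional[subprocess.CompletedProcess]:
    """Run *cmd*; return None instead of raising if it exceeds *timeout* seconds."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None


def probe_ok(cmd: List[str], timeout: float) -> bool:
    """True only when the command finished in time with exit status 0."""
    result = run_with_timeout(cmd, timeout)
    return result is not None and result.returncode == 0


if __name__ == "__main__":
    # Uses the current interpreter as a stand-in for docker/ssh.
    print(probe_ok([sys.executable, "-c", "pass"], timeout=10))
```

Note that this sketch, like the diff, does not catch `FileNotFoundError`; the Docker branch avoids it by checking `shutil.which("docker")` first.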
@@ -691,7 +700,7 @@ def run_doctor(args):
if github_token:
check_ok("GitHub token configured (authenticated API access)")
else:
check_warn("No GITHUB_TOKEN", "(60 req/hr rate limit — set in ~/.hermes/.env for better rates)")
check_warn("No GITHUB_TOKEN", f"(60 req/hr rate limit — set in {_DHH}/.env for better rates)")
# =========================================================================
# Honcho memory
@@ -708,8 +717,8 @@ def run_doctor(args):
check_warn("Honcho config not found", "run: hermes honcho setup")
elif not hcfg.enabled:
check_info(f"Honcho disabled (set enabled: true in {_honcho_cfg_path} to activate)")
elif not hcfg.api_key:
check_fail("Honcho API key not set", "run: hermes honcho setup")
elif not (hcfg.api_key or hcfg.base_url):
check_fail("Honcho API key or base URL not set", "run: hermes honcho setup")
issues.append("No Honcho API key — run 'hermes honcho setup'")
else:
from honcho_integration.client import get_honcho_client, reset_honcho_client
@@ -728,6 +737,53 @@ def run_doctor(args):
except Exception as _e:
check_warn("Honcho check failed", str(_e))
# =========================================================================
# Profiles
# =========================================================================
try:
from hermes_cli.profiles import list_profiles, _get_wrapper_dir, profile_exists
import re as _re
named_profiles = [p for p in list_profiles() if not p.is_default]
if named_profiles:
print()
print(color("◆ Profiles", Colors.CYAN, Colors.BOLD))
check_ok(f"{len(named_profiles)} profile(s) found")
wrapper_dir = _get_wrapper_dir()
for p in named_profiles:
parts = []
if p.gateway_running:
parts.append("gateway running")
if p.model:
parts.append(p.model[:30])
if not (p.path / "config.yaml").exists():
parts.append("⚠ missing config")
if not (p.path / ".env").exists():
parts.append("no .env")
wrapper = wrapper_dir / p.name
if not wrapper.exists():
parts.append("no alias")
status = ", ".join(parts) if parts else "configured"
check_ok(f" {p.name}: {status}")
# Check for orphan wrappers
if wrapper_dir.is_dir():
for wrapper in wrapper_dir.iterdir():
if not wrapper.is_file():
continue
try:
content = wrapper.read_text()
if "hermes -p" in content:
_m = _re.search(r"hermes -p (\S+)", content)
if _m and not profile_exists(_m.group(1)):
check_warn(f"Orphan alias: {wrapper.name} → profile '{_m.group(1)}' no longer exists")
except Exception:
pass
except ImportError:
pass
except Exception as _e:
logger.debug("Profile health check failed: %s", _e)
# =========================================================================
# Summary
# =========================================================================
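The orphan-alias check above scans each wrapper script for `hermes -p <name>` and warns when that profile no longer exists. The extraction step in isolation looks roughly like this; `wrapper_profile` is a hypothetical name, and the sample wrapper text is an assumption about what such a script might contain:

```python
import re
from typing import Optional

_PROFILE_RE = re.compile(r"hermes -p (\S+)")


def wrapper_profile(content: str) -> Optional[str]:
    """Return the profile name a wrapper script targets, or None if it has none."""
    match = _PROFILE_RE.search(content)
    return match.group(1) if match else None
```

One caveat of `\S+`: if a wrapper quotes the profile name (`hermes -p "coder"`), the quotes become part of the captured group and the existence check would fail for a profile that is actually present.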
+176 -21
@@ -15,6 +15,8 @@ from pathlib import Path
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
from hermes_cli.config import get_env_value, get_hermes_home, save_env_value, is_managed, managed_error
# display_hermes_home is imported lazily at call sites to avoid ImportError
# when hermes_constants is cached from a pre-update version during `hermes update`.
from hermes_cli.setup import (
print_header, print_info, print_success, print_warning, print_error,
prompt, prompt_choice, prompt_yes_no,
@@ -125,20 +127,43 @@ _SERVICE_BASE = "hermes-gateway"
SERVICE_DESCRIPTION = "Hermes Agent Gateway - Messaging Platform Integration"
def _profile_suffix() -> str:
"""Derive a service-name suffix from the current HERMES_HOME.
Returns ``""`` for the default ``~/.hermes``, the profile name for
``~/.hermes/profiles/<name>``, or a short hash for any other custom
HERMES_HOME path.
"""
import hashlib
import re
from pathlib import Path as _Path
home = get_hermes_home().resolve()
default = (_Path.home() / ".hermes").resolve()
if home == default:
return ""
# Detect ~/.hermes/profiles/<name> pattern → use the profile name
profiles_root = (default / "profiles").resolve()
try:
rel = home.relative_to(profiles_root)
parts = rel.parts
if len(parts) == 1 and re.match(r"^[a-z0-9][a-z0-9_-]{0,63}$", parts[0]):
return parts[0]
except ValueError:
pass
# Fallback: short hash for arbitrary HERMES_HOME paths
return hashlib.sha256(str(home).encode()).hexdigest()[:8]
def get_service_name() -> str:
"""Derive a systemd service name scoped to this HERMES_HOME.
Default ``~/.hermes`` returns ``hermes-gateway`` (backward compatible).
Any other HERMES_HOME appends a short hash so multiple installations
can each have their own systemd service without conflicting.
Profile ``~/.hermes/profiles/coder`` returns ``hermes-gateway-coder``.
Any other HERMES_HOME appends a short hash for uniqueness.
"""
import hashlib
from pathlib import Path as _Path # local import to avoid monkeypatch interference
home = get_hermes_home().resolve()
default = (_Path.home() / ".hermes").resolve()
if home == default:
suffix = _profile_suffix()
if not suffix:
return _SERVICE_BASE
suffix = hashlib.sha256(str(home).encode()).hexdigest()[:8]
return f"{_SERVICE_BASE}-{suffix}"
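The three naming cases described in the docstrings (default home, named profile, arbitrary path) can be exercised without touching the filesystem. The sketch below mirrors the logic of `_profile_suffix()` and `get_service_name()` but takes both paths as parameters and uses pure paths, so it skips the `resolve()` calls the real code performs; `profile_suffix` and `service_name` are illustrative names:

```python
import hashlib
import re
from pathlib import PurePosixPath

_PROFILE_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")


def profile_suffix(home: PurePosixPath, default: PurePosixPath) -> str:
    """Return '', a profile name, or an 8-character hash for *home*."""
    if home == default:
        return ""
    try:
        parts = home.relative_to(default / "profiles").parts
    except ValueError:
        parts = ()  # not under <default>/profiles at all
    if len(parts) == 1 and _PROFILE_NAME_RE.match(parts[0]):
        return parts[0]
    # Anything else (nested dirs, odd names, unrelated paths) gets a short hash.
    return hashlib.sha256(str(home).encode()).hexdigest()[:8]


def service_name(home: PurePosixPath, default: PurePosixPath, base: str = "hermes-gateway") -> str:
    suffix = profile_suffix(home, default)
    return f"{base}-{suffix}" if suffix else base
```

A profile directory whose name fails the pattern (for example one containing uppercase letters) falls through to the hash branch rather than raising, which keeps service names valid for systemd and launchd.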
@@ -369,7 +394,14 @@ def print_systemd_linger_guidance() -> None:
print(" sudo loginctl enable-linger $USER")
def get_launchd_plist_path() -> Path:
return Path.home() / "Library" / "LaunchAgents" / "ai.hermes.gateway.plist"
"""Return the launchd plist path, scoped per profile.
Default ``~/.hermes`` → ``ai.hermes.gateway.plist`` (backward compatible).
Profile ``~/.hermes/profiles/coder`` → ``ai.hermes.gateway-coder.plist``.
"""
suffix = _profile_suffix()
name = f"ai.hermes.gateway-{suffix}" if suffix else "ai.hermes.gateway"
return Path.home() / "Library" / "LaunchAgents" / f"{name}.plist"
def _detect_venv_dir() -> Path | None:
"""Detect the active virtualenv directory.
@@ -420,6 +452,17 @@ def get_hermes_cli_path() -> str:
# Systemd (Linux)
# =============================================================================
def _build_user_local_paths(home: Path, path_entries: list[str]) -> list[str]:
"""Return user-local bin dirs that exist and aren't already in *path_entries*."""
candidates = [
str(home / ".local" / "bin"), # uv, uvx, pip-installed CLIs
str(home / ".cargo" / "bin"), # Rust/cargo tools
str(home / "go" / "bin"), # Go tools
str(home / ".npm-global" / "bin"), # npm global packages
]
return [p for p in candidates if p not in path_entries and Path(p).exists()]
def generate_systemd_unit(system: bool = False, run_as_user: str | None = None) -> str:
python_path = get_python_path()
working_dir = str(PROJECT_ROOT)
@@ -434,13 +477,16 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
resolved_node_dir = str(Path(resolved_node).resolve().parent)
if resolved_node_dir not in path_entries:
path_entries.append(resolved_node_dir)
path_entries.extend(["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"])
sane_path = ":".join(path_entries)
hermes_home = str(get_hermes_home().resolve())
common_bin_paths = ["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"]
if system:
username, group_name, home_dir = _system_service_identity(run_as_user)
path_entries.extend(_build_user_local_paths(Path(home_dir), path_entries))
path_entries.extend(common_bin_paths)
sane_path = ":".join(path_entries)
return f"""[Unit]
Description={SERVICE_DESCRIPTION}
After=network-online.target
@@ -472,6 +518,9 @@ StandardError=journal
WantedBy=multi-user.target
"""
path_entries.extend(_build_user_local_paths(Path.home(), path_entries))
path_entries.extend(common_bin_paths)
sane_path = ":".join(path_entries)
return f"""[Unit]
Description={SERVICE_DESCRIPTION}
After=network.target
@@ -752,18 +801,46 @@ def systemd_status(deep: bool = False, system: bool = False):
# Launchd (macOS)
# =============================================================================
def get_launchd_label() -> str:
"""Return the launchd service label, scoped per profile."""
suffix = _profile_suffix()
return f"ai.hermes.gateway-{suffix}" if suffix else "ai.hermes.gateway"
def generate_launchd_plist() -> str:
python_path = get_python_path()
working_dir = str(PROJECT_ROOT)
hermes_home = str(get_hermes_home().resolve())
log_dir = get_hermes_home() / "logs"
log_dir.mkdir(parents=True, exist_ok=True)
label = get_launchd_label()
# Build a sane PATH for the launchd plist. launchd provides only a
# minimal default (/usr/bin:/bin:/usr/sbin:/sbin) which misses Homebrew,
# nvm, cargo, etc. We prepend venv/bin and node_modules/.bin (matching
# the systemd unit), then capture the user's full shell PATH so every
# user-installed tool (node, ffmpeg, …) is reachable.
detected_venv = _detect_venv_dir()
venv_bin = str(detected_venv / "bin") if detected_venv else str(PROJECT_ROOT / "venv" / "bin")
venv_dir = str(detected_venv) if detected_venv else str(PROJECT_ROOT / "venv")
node_bin = str(PROJECT_ROOT / "node_modules" / ".bin")
# Resolve the directory containing the node binary (e.g. Homebrew, nvm)
# so it's explicitly in PATH even if the user's shell PATH changes later.
priority_dirs = [venv_bin, node_bin]
resolved_node = shutil.which("node")
if resolved_node:
resolved_node_dir = str(Path(resolved_node).resolve().parent)
if resolved_node_dir not in priority_dirs:
priority_dirs.append(resolved_node_dir)
sane_path = ":".join(
dict.fromkeys(priority_dirs + [p for p in os.environ.get("PATH", "").split(":") if p])
)
return f"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>ai.hermes.gateway</string>
<string>{label}</string>
<key>ProgramArguments</key>
<array>
@@ -778,6 +855,16 @@ def generate_launchd_plist() -> str:
<key>WorkingDirectory</key>
<string>{working_dir}</string>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>{sane_path}</string>
<key>VIRTUAL_ENV</key>
<string>{venv_dir}</string>
<key>HERMES_HOME</key>
<string>{hermes_home}</string>
</dict>
<key>RunAtLoad</key>
<true/>
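The `sane_path` expression in `generate_launchd_plist()` relies on `dict.fromkeys` to drop duplicate entries while keeping first-occurrence order, so the priority directories always win. The same idea as a standalone function, with the inherited `PATH` passed in explicitly; `build_path` is a hypothetical name:

```python
from typing import List


def build_path(priority_dirs: List[str], inherited_path: str) -> str:
    """Join priority dirs and the inherited PATH, de-duplicated, order preserved."""
    inherited = [p for p in inherited_path.split(":") if p]  # drop empty segments
    return ":".join(dict.fromkeys(priority_dirs + inherited))
```

Since dictionaries preserve insertion order, a directory that appears both in `priority_dirs` and later in the inherited `PATH` keeps its earlier, higher-priority position.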
@@ -850,7 +937,8 @@ def launchd_install(force: bool = False):
print()
print("Next steps:")
print(" hermes gateway status # Check status")
print(" tail -f ~/.hermes/logs/gateway.log # View logs")
from hermes_constants import display_hermes_home as _dhh
print(f" tail -f {_dhh()}/logs/gateway.log # View logs")
def launchd_uninstall():
plist_path = get_launchd_plist_path()
@@ -863,20 +951,33 @@ def launchd_uninstall():
print("✓ Service uninstalled")
def launchd_start():
refresh_launchd_plist_if_needed()
plist_path = get_launchd_plist_path()
label = get_launchd_label()
# Self-heal if the plist is missing entirely (e.g., manual cleanup, failed upgrade)
if not plist_path.exists():
print("↻ launchd plist missing; regenerating service definition")
plist_path.parent.mkdir(parents=True, exist_ok=True)
plist_path.write_text(generate_launchd_plist(), encoding="utf-8")
subprocess.run(["launchctl", "load", str(plist_path)], check=True)
subprocess.run(["launchctl", "start", label], check=True)
print("✓ Service started")
return
refresh_launchd_plist_if_needed()
try:
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
subprocess.run(["launchctl", "start", label], check=True)
except subprocess.CalledProcessError as e:
if e.returncode != 3 or not plist_path.exists():
if e.returncode != 3:
raise
print("↻ launchd job was unloaded; reloading service definition")
subprocess.run(["launchctl", "load", str(plist_path)], check=True)
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
subprocess.run(["launchctl", "start", label], check=True)
print("✓ Service started")
def launchd_stop():
subprocess.run(["launchctl", "stop", "ai.hermes.gateway"], check=True)
label = get_launchd_label()
subprocess.run(["launchctl", "stop", label], check=True)
print("✓ Service stopped")
def _wait_for_gateway_exit(timeout: float = 10.0, force_after: float = 5.0):
@@ -931,8 +1032,9 @@ def launchd_restart():
def launchd_status(deep: bool = False):
plist_path = get_launchd_plist_path()
label = get_launchd_label()
result = subprocess.run(
["launchctl", "list", "ai.hermes.gateway"],
["launchctl", "list", label],
capture_output=True,
text=True
)
@@ -1220,6 +1322,59 @@ _PLATFORMS = [
"help": "The AppSecret from your DingTalk application credentials."},
],
},
{
"key": "feishu",
"label": "Feishu / Lark",
"emoji": "🪽",
"token_var": "FEISHU_APP_ID",
"setup_instructions": [
"1. Go to https://open.feishu.cn/ (or https://open.larksuite.com/ for Lark)",
"2. Create an app and copy the App ID and App Secret",
"3. Enable the Bot capability for the app",
"4. Choose WebSocket (recommended) or Webhook connection mode",
"5. Add the bot to a group chat or message it directly",
"6. Restrict access with FEISHU_ALLOWED_USERS for production use",
],
"vars": [
{"name": "FEISHU_APP_ID", "prompt": "App ID", "password": False,
"help": "The App ID from your Feishu/Lark application."},
{"name": "FEISHU_APP_SECRET", "prompt": "App Secret", "password": True,
"help": "The App Secret from your Feishu/Lark application."},
{"name": "FEISHU_DOMAIN", "prompt": "Domain — feishu or lark (default: feishu)", "password": False,
"help": "Use 'feishu' for Feishu China, or 'lark' for Lark international."},
{"name": "FEISHU_CONNECTION_MODE", "prompt": "Connection mode — websocket or webhook (default: websocket)", "password": False,
"help": "websocket is recommended unless you specifically need webhook mode."},
{"name": "FEISHU_ALLOWED_USERS", "prompt": "Allowed user IDs (comma-separated, or empty)", "password": False,
"is_allowlist": True,
"help": "Restrict which Feishu/Lark users can interact with the bot."},
{"name": "FEISHU_HOME_CHANNEL", "prompt": "Home chat ID (optional, for cron/notifications)", "password": False,
"help": "Chat ID for scheduled results and notifications."},
],
},
{
"key": "wecom",
"label": "WeCom (Enterprise WeChat)",
"emoji": "💬",
"token_var": "WECOM_BOT_ID",
"setup_instructions": [
"1. Go to WeCom Admin Console → Applications → Create AI Bot",
"2. Copy the Bot ID and Secret from the bot's credentials page",
"3. The bot connects via WebSocket — no public endpoint needed",
"4. Add the bot to a group chat or message it directly in WeCom",
"5. Restrict access with WECOM_ALLOWED_USERS for production use",
],
"vars": [
{"name": "WECOM_BOT_ID", "prompt": "Bot ID", "password": False,
"help": "The Bot ID from your WeCom AI Bot."},
{"name": "WECOM_SECRET", "prompt": "Secret", "password": True,
"help": "The secret from your WeCom AI Bot."},
{"name": "WECOM_ALLOWED_USERS", "prompt": "Allowed user IDs (comma-separated, or empty)", "password": False,
"is_allowlist": True,
"help": "Restrict which WeCom users can interact with the bot."},
{"name": "WECOM_HOME_CHANNEL", "prompt": "Home chat ID (optional, for cron/notifications)", "password": False,
"help": "Chat ID for scheduled results and notifications."},
],
},
]
@@ -1437,7 +1592,7 @@ def _is_service_running() -> bool:
return False
elif is_macos() and get_launchd_plist_path().exists():
result = subprocess.run(
["launchctl", "list", "ai.hermes.gateway"],
["launchctl", "list", get_launchd_label()],
capture_output=True, text=True
)
return result.returncode == 0
+790 -83
File diff suppressed because it is too large
+13 -2
@@ -24,6 +24,7 @@ from hermes_cli.config import (
get_hermes_home, # noqa: F401 — used by test mocks
)
from hermes_cli.colors import Colors, color
from hermes_constants import display_hermes_home
logger = logging.getLogger(__name__)
@@ -244,7 +245,7 @@ def cmd_mcp_add(args):
api_key = _prompt("API key / Bearer token", password=True)
if api_key:
save_env_value(env_key, api_key)
_success(f"Saved to ~/.hermes/.env as {env_key}")
_success(f"Saved to {display_hermes_home()}/.env as {env_key}")
# Set header with env var interpolation
if api_key or existing_key:
@@ -332,7 +333,7 @@ def cmd_mcp_add(args):
_save_mcp_server(name, server_config)
print()
_success(f"Saved '{name}' to ~/.hermes/config.yaml ({tool_count}/{total} tools enabled)")
_success(f"Saved '{name}' to {display_hermes_home()}/config.yaml ({tool_count}/{total} tools enabled)")
_info("Start a new session to use these tools.")
@@ -510,6 +511,10 @@ def _interpolate_value(value: str) -> str:
def cmd_mcp_configure(args):
"""Reconfigure which tools are enabled for an existing MCP server."""
import sys as _sys
if not _sys.stdin.isatty():
print("Error: 'hermes mcp configure' requires an interactive terminal.", file=_sys.stderr)
_sys.exit(1)
name = args.name
servers = _get_mcp_servers()
@@ -607,6 +612,11 @@ def mcp_command(args):
"""Main dispatcher for ``hermes mcp`` subcommands."""
action = getattr(args, "mcp_action", None)
if action == "serve":
from mcp_serve import run_mcp_server
run_mcp_server(verbose=getattr(args, "verbose", False))
return
handlers = {
"add": cmd_mcp_add,
"remove": cmd_mcp_remove,
@@ -625,6 +635,7 @@ def mcp_command(args):
# No subcommand — show list
cmd_mcp_list()
print(color(" Commands:", Colors.CYAN))
_info("hermes mcp serve Run as MCP server")
_info("hermes mcp add <name> --url <endpoint> Add an MCP server")
_info("hermes mcp add <name> --command <cmd> Add a stdio server")
_info("hermes mcp remove <name> Remove a server")
+30 -5
@@ -35,6 +35,8 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("openai/gpt-5.3-codex", ""),
("google/gemini-3-pro-preview", ""),
("google/gemini-3-flash-preview", ""),
("google/gemini-3.1-pro-preview", ""),
("google/gemini-3.1-flash-lite-preview", ""),
("qwen/qwen3.5-plus-02-15", ""),
("qwen/qwen3.5-35b-a3b", ""),
("stepfun/step-3.5-flash", ""),
@@ -62,6 +64,8 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"openai/gpt-5.3-codex",
"google/gemini-3-pro-preview",
"google/gemini-3-flash-preview",
"google/gemini-3.1-pro-preview",
"google/gemini-3.1-flash-lite-preview",
"qwen/qwen3.5-plus-02-15",
"qwen/qwen3.5-35b-a3b",
"stepfun/step-3.5-flash",
@@ -208,14 +212,31 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"google/gemini-3-pro-preview",
"google/gemini-3-flash-preview",
],
# Alibaba DashScope Coding platform (coding-intl) — default endpoint.
# Supports Qwen models + third-party providers (GLM, Kimi, MiniMax).
# Users with classic DashScope keys should override DASHSCOPE_BASE_URL
# to https://dashscope-intl.aliyuncs.com/compatible-mode/v1 (OpenAI-compat)
# or https://dashscope-intl.aliyuncs.com/apps/anthropic (Anthropic-compat).
"alibaba": [
"qwen3.5-plus",
"qwen3-max",
"qwen3-coder-plus",
"qwen3-coder-next",
"qwen-plus-latest",
"qwen3.5-flash",
"qwen-vl-max",
# Third-party models available on coding-intl
"glm-5",
"glm-4.7",
"kimi-k2.5",
"MiniMax-M2.5",
],
# Curated HF model list — only agentic models that map to OpenRouter defaults.
"huggingface": [
"Qwen/Qwen3.5-397B-A17B",
"Qwen/Qwen3.5-35B-A3B",
"deepseek-ai/DeepSeek-V3.2",
"moonshotai/Kimi-K2.5",
"MiniMaxAI/MiniMax-M2.5",
"zai-org/GLM-5",
"XiaomiMiMo/MiMo-V2-Flash",
"moonshotai/Kimi-K2-Thinking",
],
}
@@ -236,6 +257,7 @@ _PROVIDER_LABELS = {
"ai-gateway": "AI Gateway",
"kilocode": "Kilo Code",
"alibaba": "Alibaba Cloud (DashScope)",
"huggingface": "Hugging Face",
"custom": "Custom endpoint",
}
@@ -271,6 +293,9 @@ _PROVIDER_ALIASES = {
"aliyun": "alibaba",
"qwen": "alibaba",
"alibaba-cloud": "alibaba",
"hf": "huggingface",
"hugging-face": "huggingface",
"huggingface-hub": "huggingface",
}
@@ -304,7 +329,7 @@ def list_available_providers() -> list[dict[str, str]]:
# Canonical providers in display order
_PROVIDER_ORDER = [
"openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
"zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic", "alibaba",
"huggingface", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic", "alibaba",
"opencode-zen", "opencode-go",
"ai-gateway", "deepseek", "custom",
]
+64 -6
@@ -68,6 +68,17 @@ def _env_enabled(name: str) -> bool:
return os.getenv(name, "").strip().lower() in {"1", "true", "yes", "on"}
def _get_disabled_plugins() -> set:
"""Read the disabled plugins list from config.yaml."""
try:
from hermes_cli.config import load_config
config = load_config()
disabled = config.get("plugins", {}).get("disabled", [])
return set(disabled) if isinstance(disabled, list) else set()
except Exception:
return set()
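A minimal sketch of the lookup above, assuming the config has already been loaded into a plain dict. The helper name `disabled_from_config` is hypothetical and only illustrates how a malformed `plugins.disabled` value falls back to an empty set.

```python
from typing import Any, Dict, Set


def disabled_from_config(config: Dict[str, Any]) -> Set[str]:
    """Return the names listed under plugins.disabled, or an empty set."""
    plugins = config.get("plugins") or {}
    if not isinstance(plugins, dict):
        return set()
    disabled = plugins.get("disabled", [])
    # Anything other than a list (e.g. a bare string) is treated as "nothing disabled".
    return set(disabled) if isinstance(disabled, list) else set()


print(disabled_from_config({"plugins": {"disabled": ["viewer", "bridge"]}}))
print(disabled_from_config({"plugins": {"disabled": "viewer"}}))
print(disabled_from_config({}))
```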
# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------
@@ -141,6 +152,34 @@ class PluginContext:
self._manager._plugin_tool_names.add(name)
logger.debug("Plugin %s registered tool: %s", self.manifest.name, name)
# -- message injection --------------------------------------------------
def inject_message(self, content: str, role: str = "user") -> bool:
"""Inject a message into the active conversation.
If the agent is idle (waiting for user input), this starts a new turn.
If the agent is running, this interrupts and injects the message.
This enables plugins (e.g. remote control viewers, messaging bridges)
to send messages into the conversation from external sources.
Returns True if the message was queued successfully.
"""
cli = self._manager._cli_ref
if cli is None:
logger.warning("inject_message: no CLI reference (not available in gateway mode)")
return False
msg = content if role == "user" else f"[{role}] {content}"
if getattr(cli, "_agent_running", False):
# Agent is mid-turn — interrupt with the message
cli._interrupt_queue.put(msg)
else:
# Agent is idle — queue as next input
cli._pending_input.put(msg)
return True
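The routing in `inject_message` can be sketched in isolation as follows. `FakeCLI` and `route_message` are hypothetical stand-ins; the real CLI object is not shown in this diff, so only the three attributes the method touches are modelled here.

```python
import queue


class FakeCLI:
    """Hypothetical stand-in exposing only the attributes inject_message uses."""

    def __init__(self, agent_running: bool) -> None:
        self._agent_running = agent_running
        self._interrupt_queue: "queue.Queue[str]" = queue.Queue()
        self._pending_input: "queue.Queue[str]" = queue.Queue()


def route_message(cli: FakeCLI, content: str, role: str = "user") -> str:
    """Queue a message and report which queue received it."""
    # Non-user roles are tagged inline, mirroring the "[role] content" format.
    msg = content if role == "user" else f"[{role}] {content}"
    if getattr(cli, "_agent_running", False):
        cli._interrupt_queue.put(msg)  # agent is mid-turn: interrupt it
        return "interrupt"
    cli._pending_input.put(msg)  # agent is idle: becomes the next input
    return "pending"


busy, idle = FakeCLI(agent_running=True), FakeCLI(agent_running=False)
print(route_message(busy, "stop and summarise"))
print(route_message(idle, "build finished", role="system"))
```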
# -- hook registration --------------------------------------------------
def register_hook(self, hook_name: str, callback: Callable) -> None:
@@ -173,6 +212,7 @@ class PluginManager:
self._hooks: Dict[str, List[Callable]] = {}
self._plugin_tool_names: Set[str] = set()
self._discovered: bool = False
self._cli_ref = None # Set by CLI after plugin discovery
# -----------------------------------------------------------------------
# Public
@@ -199,8 +239,15 @@ class PluginManager:
# 3. Pip / entry-point plugins
manifests.extend(self._scan_entry_points())
# Load each manifest
# Load each manifest (skip user-disabled plugins)
disabled = _get_disabled_plugins()
for manifest in manifests:
if manifest.name in disabled:
loaded = LoadedPlugin(manifest=manifest, enabled=False)
loaded.error = "disabled via config"
self._plugins[manifest.name] = loaded
logger.debug("Skipping disabled plugin '%s'", manifest.name)
continue
self._load_plugin(manifest)
if manifests:
@@ -385,16 +432,23 @@ class PluginManager:
# Hook invocation
# -----------------------------------------------------------------------
def invoke_hook(self, hook_name: str, **kwargs: Any) -> None:
def invoke_hook(self, hook_name: str, **kwargs: Any) -> List[Any]:
"""Call all registered callbacks for *hook_name*.
Each callback is wrapped in its own try/except so a misbehaving
plugin cannot break the core agent loop.
Returns a list of non-``None`` return values from callbacks.
This allows hooks like ``pre_llm_call`` to contribute context
that the agent core can collect and inject.
"""
callbacks = self._hooks.get(hook_name, [])
results: List[Any] = []
for cb in callbacks:
try:
cb(**kwargs)
ret = cb(**kwargs)
if ret is not None:
results.append(ret)
except Exception as exc:
logger.warning(
"Hook '%s' callback %s raised: %s",
@@ -402,6 +456,7 @@ class PluginManager:
getattr(cb, "__name__", repr(cb)),
exc,
)
return results
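A standalone sketch of the collect-and-return behaviour described in the docstring above. The function `invoke_all`, the three callbacks, and the `message` keyword are hypothetical; only the hook name `pre_llm_call` comes from the source.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger(__name__)


def invoke_all(
    hooks: Dict[str, List[Callable[..., Any]]], hook_name: str, **kwargs: Any
) -> List[Any]:
    """Call every callback for hook_name and collect non-None return values."""
    results: List[Any] = []
    for cb in hooks.get(hook_name, []):
        try:
            ret = cb(**kwargs)
        except Exception as exc:
            # One failing plugin is logged and skipped; the rest still run.
            logger.warning("Hook '%s' callback %s raised: %s", hook_name, cb.__name__, exc)
            continue
        if ret is not None:
            results.append(ret)
    return results


def add_context(**kwargs: Any) -> dict:
    return {"context": f"user said: {kwargs['message']}"}


def observe_only(**kwargs: Any) -> None:
    return None


def broken(**kwargs: Any) -> None:
    raise RuntimeError("boom")


hooks = {"pre_llm_call": [add_context, observe_only, broken]}
print(invoke_all(hooks, "pre_llm_call", message="hi"))
```

Callbacks that only observe can keep returning `None`, so hooks written against the earlier `-> None` signature are unaffected.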
# -----------------------------------------------------------------------
# Introspection
@@ -446,9 +501,12 @@ def discover_plugins() -> None:
get_plugin_manager().discover_and_load()
def invoke_hook(hook_name: str, **kwargs: Any) -> None:
"""Invoke a lifecycle hook on all loaded plugins."""
get_plugin_manager().invoke_hook(hook_name, **kwargs)
def invoke_hook(hook_name: str, **kwargs: Any) -> List[Any]:
"""Invoke a lifecycle hook on all loaded plugins.
Returns a list of non-``None`` return values from plugin callbacks.
"""
return get_plugin_manager().invoke_hook(hook_name, **kwargs)
def get_plugin_tool_names() -> Set[str]:
+155 -3
@@ -265,10 +265,11 @@ def cmd_install(identifier: str, force: bool = False) -> None:
)
sys.exit(1)
if mv_int > _SUPPORTED_MANIFEST_VERSION:
from hermes_cli.config import recommended_update_command
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' requires manifest_version "
f"{mv}, but this installer only supports up to {_SUPPORTED_MANIFEST_VERSION}.\n"
f"Run [bold]hermes update[/bold] to get a newer installer."
f"Run [bold]{recommended_update_command()}[/bold] to get a newer installer."
)
sys.exit(1)
@@ -374,6 +375,73 @@ def cmd_remove(name: str) -> None:
_display_removed(name, plugins_dir)
def _get_disabled_set() -> set:
"""Read the disabled plugins set from config.yaml."""
try:
from hermes_cli.config import load_config
config = load_config()
disabled = config.get("plugins", {}).get("disabled", [])
return set(disabled) if isinstance(disabled, list) else set()
except Exception:
return set()
def _save_disabled_set(disabled: set) -> None:
"""Write the disabled plugins list to config.yaml."""
from hermes_cli.config import load_config, save_config
config = load_config()
if "plugins" not in config:
config["plugins"] = {}
config["plugins"]["disabled"] = sorted(disabled)
save_config(config)
def cmd_enable(name: str) -> None:
"""Enable a previously disabled plugin."""
from rich.console import Console
console = Console()
plugins_dir = _plugins_dir()
# Verify the plugin exists
target = plugins_dir / name
if not target.is_dir():
console.print(f"[red]Plugin '{name}' is not installed.[/red]")
sys.exit(1)
disabled = _get_disabled_set()
if name not in disabled:
console.print(f"[dim]Plugin '{name}' is already enabled.[/dim]")
return
disabled.discard(name)
_save_disabled_set(disabled)
console.print(f"[green]✓[/green] Plugin [bold]{name}[/bold] enabled. Takes effect on next session.")
def cmd_disable(name: str) -> None:
"""Disable a plugin without removing it."""
from rich.console import Console
console = Console()
plugins_dir = _plugins_dir()
# Verify the plugin exists
target = plugins_dir / name
if not target.is_dir():
console.print(f"[red]Plugin '{name}' is not installed.[/red]")
sys.exit(1)
disabled = _get_disabled_set()
if name in disabled:
console.print(f"[dim]Plugin '{name}' is already disabled.[/dim]")
return
disabled.add(name)
_save_disabled_set(disabled)
console.print(f"[yellow]⊘[/yellow] Plugin [bold]{name}[/bold] disabled. Takes effect on next session.")
def cmd_list() -> None:
"""List installed plugins."""
from rich.console import Console
@@ -393,8 +461,11 @@ def cmd_list() -> None:
console.print("[dim]Install with:[/dim] hermes plugins install owner/repo")
return
disabled = _get_disabled_set()
table = Table(title="Installed Plugins", show_lines=False)
table.add_column("Name", style="bold")
table.add_column("Status")
table.add_column("Version", style="dim")
table.add_column("Description")
table.add_column("Source", style="dim")
@@ -420,11 +491,86 @@ def cmd_list() -> None:
if (d / ".git").exists():
source = "git"
table.add_row(name, str(version), description, source)
is_disabled = name in disabled or d.name in disabled
status = "[red]disabled[/red]" if is_disabled else "[green]enabled[/green]"
table.add_row(name, status, str(version), description, source)
console.print()
console.print(table)
console.print()
console.print("[dim]Interactive toggle:[/dim] hermes plugins")
console.print("[dim]Enable/disable:[/dim] hermes plugins enable/disable <name>")
def cmd_toggle() -> None:
"""Interactive curses checklist to enable/disable installed plugins."""
from rich.console import Console
try:
import yaml
except ImportError:
yaml = None
console = Console()
plugins_dir = _plugins_dir()
dirs = sorted(d for d in plugins_dir.iterdir() if d.is_dir())
if not dirs:
console.print("[dim]No plugins installed.[/dim]")
console.print("[dim]Install with:[/dim] hermes plugins install owner/repo")
return
disabled = _get_disabled_set()
# Build items list: "name — description" for display
names = []
labels = []
selected = set()
for i, d in enumerate(dirs):
manifest_file = d / "plugin.yaml"
name = d.name
description = ""
if manifest_file.exists() and yaml:
try:
with open(manifest_file) as f:
manifest = yaml.safe_load(f) or {}
name = manifest.get("name", d.name)
description = manifest.get("description", "")
except Exception:
pass
names.append(name)
label = f"{name} — {description}" if description else name
labels.append(label)
if name not in disabled and d.name not in disabled:
selected.add(i)
from hermes_cli.curses_ui import curses_checklist
result = curses_checklist(
title="Plugins — toggle enabled/disabled",
items=labels,
selected=selected,
)
# Compute new disabled set from deselected items
new_disabled = set()
for i, name in enumerate(names):
if i not in result:
new_disabled.add(name)
if new_disabled != disabled:
_save_disabled_set(new_disabled)
enabled_count = len(names) - len(new_disabled)
console.print(
f"\n[green]✓[/green] {enabled_count} enabled, {len(new_disabled)} disabled. "
f"Takes effect on next session."
)
else:
console.print("\n[dim]No changes.[/dim]")
def plugins_command(args) -> None:
@@ -437,8 +583,14 @@ def plugins_command(args) -> None:
cmd_update(args.name)
elif action in ("remove", "rm", "uninstall"):
cmd_remove(args.name)
elif action in ("list", "ls") or action is None:
elif action == "enable":
cmd_enable(args.name)
elif action == "disable":
cmd_disable(args.name)
elif action in ("list", "ls"):
cmd_list()
elif action is None:
cmd_toggle()
else:
from rich.console import Console
+906
@@ -0,0 +1,906 @@
"""
Profile management for multiple isolated Hermes instances.
Each profile is a fully independent HERMES_HOME directory with its own
config.yaml, .env, memory, sessions, skills, gateway, cron, and logs.
Profiles live under ``~/.hermes/profiles/<name>/`` by default.
The "default" profile is ``~/.hermes`` itself, so existing installs stay
backward compatible with zero migration needed.
Usage::
hermes profile create coder # fresh profile + bundled skills
hermes profile create coder --clone # also copy config, .env, SOUL.md
hermes profile create coder --clone-all # full copy of source profile
coder chat # use via wrapper alias
hermes -p coder chat # or via flag
hermes profile use coder # set as sticky default
hermes profile delete coder # remove profile + alias + service
"""
import json
import os
import re
import shutil
import stat
import subprocess
import sys
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional
_PROFILE_ID_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")
# Directories bootstrapped inside every new profile
_PROFILE_DIRS = [
"memories",
"sessions",
"skills",
"skins",
"logs",
"plans",
"workspace",
"cron",
]
# Files copied during --clone (if they exist in the source)
_CLONE_CONFIG_FILES = [
"config.yaml",
".env",
"SOUL.md",
]
# Runtime files stripped after --clone-all (shouldn't carry over)
_CLONE_ALL_STRIP = [
"gateway.pid",
"gateway_state.json",
"processes.json",
]
# Names that cannot be used as profile aliases
_RESERVED_NAMES = frozenset({
"hermes", "default", "test", "tmp", "root", "sudo",
})
# Hermes subcommands that cannot be used as profile names/aliases
_HERMES_SUBCOMMANDS = frozenset({
"chat", "model", "gateway", "setup", "whatsapp", "login", "logout",
"status", "cron", "doctor", "config", "pairing", "skills", "tools",
"mcp", "sessions", "insights", "version", "update", "uninstall",
"profile", "plugins", "honcho", "acp",
})
# ---------------------------------------------------------------------------
# Path helpers
# ---------------------------------------------------------------------------
def _get_profiles_root() -> Path:
"""Return the directory where named profiles are stored.
Always ``~/.hermes/profiles/``, anchored to the user's home,
NOT to the current HERMES_HOME (which may itself be a profile).
This ensures ``coder profile list`` can see all profiles.
"""
return Path.home() / ".hermes" / "profiles"
def _get_default_hermes_home() -> Path:
"""Return the default (pre-profile) HERMES_HOME path."""
return Path.home() / ".hermes"
def _get_active_profile_path() -> Path:
"""Return the path to the sticky active_profile file."""
return _get_default_hermes_home() / "active_profile"
def _get_wrapper_dir() -> Path:
"""Return the directory for wrapper scripts."""
return Path.home() / ".local" / "bin"
# ---------------------------------------------------------------------------
# Validation
# ---------------------------------------------------------------------------
def validate_profile_name(name: str) -> None:
"""Raise ``ValueError`` if *name* is not a valid profile identifier."""
if name == "default":
return # special alias for ~/.hermes
if not _PROFILE_ID_RE.match(name):
raise ValueError(
f"Invalid profile name {name!r}. Must match "
f"[a-z0-9][a-z0-9_-]{{0,63}}"
)
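To make the naming rule concrete, the sketch below applies the same pattern to a few candidate names. `is_valid_profile_name` is a hypothetical boolean wrapper; the source function raises `ValueError` instead of returning a flag.

```python
import re

# Same pattern as _PROFILE_ID_RE: one lowercase letter or digit, followed by
# up to 63 lowercase letters, digits, underscores, or hyphens.
PROFILE_ID_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")


def is_valid_profile_name(name: str) -> bool:
    """Return True for 'default' or any name matching the identifier pattern."""
    return name == "default" or bool(PROFILE_ID_RE.match(name))


for candidate in ["coder", "work_2", "Coder", "-dash", "a" * 64, "a" * 65, ""]:
    shown = candidate if len(candidate) < 20 else f"<{len(candidate)} chars>"
    print(f"{shown!r}: {is_valid_profile_name(candidate)}")
```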
def get_profile_dir(name: str) -> Path:
"""Resolve a profile name to its HERMES_HOME directory."""
if name == "default":
return _get_default_hermes_home()
return _get_profiles_root() / name
def profile_exists(name: str) -> bool:
"""Check whether a profile directory exists."""
if name == "default":
return True
return get_profile_dir(name).is_dir()
# ---------------------------------------------------------------------------
# Alias / wrapper script management
# ---------------------------------------------------------------------------
def check_alias_collision(name: str) -> Optional[str]:
"""Return a human-readable collision message, or None if the name is safe.
Checks: reserved names, hermes subcommands, existing binaries in PATH.
"""
if name in _RESERVED_NAMES:
return f"'{name}' is a reserved name"
if name in _HERMES_SUBCOMMANDS:
return f"'{name}' conflicts with a hermes subcommand"
# Check existing commands in PATH
wrapper_dir = _get_wrapper_dir()
try:
result = subprocess.run(
["which", name], capture_output=True, text=True, timeout=5,
)
if result.returncode == 0:
existing_path = result.stdout.strip()
# Allow overwriting our own wrappers
if existing_path == str(wrapper_dir / name):
try:
content = (wrapper_dir / name).read_text()
if "hermes -p" in content:
return None # it's our wrapper, safe to overwrite
except Exception:
pass
return f"'{name}' conflicts with an existing command ({existing_path})"
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
return None # safe
def _is_wrapper_dir_in_path() -> bool:
"""Check if ~/.local/bin is in PATH."""
wrapper_dir = str(_get_wrapper_dir())
return wrapper_dir in os.environ.get("PATH", "").split(os.pathsep)
def create_wrapper_script(name: str) -> Optional[Path]:
"""Create a shell wrapper script at ~/.local/bin/<name>.
Returns the path to the created wrapper, or None if creation failed.
"""
wrapper_dir = _get_wrapper_dir()
try:
wrapper_dir.mkdir(parents=True, exist_ok=True)
except OSError as e:
print(f"⚠ Could not create {wrapper_dir}: {e}")
return None
wrapper_path = wrapper_dir / name
try:
wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {name} "$@"\n')
wrapper_path.chmod(wrapper_path.stat().st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)
return wrapper_path
except OSError as e:
print(f"⚠ Could not create wrapper at {wrapper_path}: {e}")
return None
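The wrapper mechanism can be tried safely in a temporary directory. This sketch assumes a POSIX filesystem; `write_wrapper` is a hypothetical name, and it takes the target directory as a parameter instead of hard-coding `~/.local/bin`.

```python
import os
import stat
import tempfile
from pathlib import Path


def write_wrapper(wrapper_dir: Path, name: str) -> Path:
    """Write an executable shell script that forwards to `hermes -p <name>`."""
    wrapper_dir.mkdir(parents=True, exist_ok=True)
    path = wrapper_dir / name
    # "$@" forwards all arguments unchanged; exec replaces the shell process.
    path.write_text(f'#!/bin/sh\nexec hermes -p {name} "$@"\n')
    # Add execute bits for user, group, and others on top of the current mode.
    path.chmod(path.stat().st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path


with tempfile.TemporaryDirectory() as tmp:
    wrapper = write_wrapper(Path(tmp) / "bin", "coder")
    print(wrapper.read_text())
    print("executable:", os.access(wrapper, os.X_OK))
```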
def remove_wrapper_script(name: str) -> bool:
"""Remove the wrapper script for a profile. Returns True if removed."""
wrapper_path = _get_wrapper_dir() / name
if wrapper_path.exists():
try:
# Verify it's our wrapper before removing
content = wrapper_path.read_text()
if "hermes -p" in content:
wrapper_path.unlink()
return True
except Exception:
pass
return False
# ---------------------------------------------------------------------------
# ProfileInfo
# ---------------------------------------------------------------------------
@dataclass
class ProfileInfo:
"""Summary information about a profile."""
name: str
path: Path
is_default: bool
gateway_running: bool
model: Optional[str] = None
provider: Optional[str] = None
has_env: bool = False
skill_count: int = 0
alias_path: Optional[Path] = None
def _read_config_model(profile_dir: Path) -> tuple:
"""Read model/provider from a profile's config.yaml. Returns (model, provider)."""
config_path = profile_dir / "config.yaml"
if not config_path.exists():
return None, None
try:
import yaml
with open(config_path, "r") as f:
cfg = yaml.safe_load(f) or {}
model_cfg = cfg.get("model", {})
if isinstance(model_cfg, str):
return model_cfg, None
if isinstance(model_cfg, dict):
return model_cfg.get("model"), model_cfg.get("provider")
return None, None
except Exception:
return None, None
def _check_gateway_running(profile_dir: Path) -> bool:
"""Check if a gateway is running for a given profile directory."""
pid_file = profile_dir / "gateway.pid"
if not pid_file.exists():
return False
try:
raw = pid_file.read_text().strip()
if not raw:
return False
data = json.loads(raw) if raw.startswith("{") else {"pid": int(raw)}
pid = int(data["pid"])
os.kill(pid, 0) # existence check
return True
except (json.JSONDecodeError, KeyError, ValueError, TypeError,
ProcessLookupError, PermissionError, OSError):
return False
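The PID-file handling above accepts two formats, a bare integer or a JSON object with a `pid` key. A small sketch of that parsing plus the signal-0 liveness probe follows, assuming a POSIX system. `parse_pid` and `is_alive` are hypothetical names, and unlike the source this sketch reports a process as alive when the probe raises `PermissionError`, since that error means the process exists but belongs to another user.

```python
import json
import os


def parse_pid(raw: str) -> int:
    """Extract a PID from a bare integer or a JSON object with a "pid" key."""
    raw = raw.strip()
    data = json.loads(raw) if raw.startswith("{") else {"pid": int(raw)}
    return int(data["pid"])


def is_alive(pid: int) -> bool:
    """Probe a process with signal 0, which checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but owned by another user (differs from the source)
    return True


print(parse_pid("1234"))
print(parse_pid('{"pid": 42, "kind": "gateway"}'))
print(is_alive(os.getpid()))
```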
def _count_skills(profile_dir: Path) -> int:
"""Count installed skills in a profile."""
skills_dir = profile_dir / "skills"
if not skills_dir.is_dir():
return 0
count = 0
for md in skills_dir.rglob("SKILL.md"):
if "/.hub/" not in str(md) and "/.git/" not in str(md):
count += 1
return count
# ---------------------------------------------------------------------------
# CRUD operations
# ---------------------------------------------------------------------------
def list_profiles() -> List[ProfileInfo]:
"""Return info for all profiles, including the default."""
profiles = []
wrapper_dir = _get_wrapper_dir()
# Default profile
default_home = _get_default_hermes_home()
if default_home.is_dir():
model, provider = _read_config_model(default_home)
profiles.append(ProfileInfo(
name="default",
path=default_home,
is_default=True,
gateway_running=_check_gateway_running(default_home),
model=model,
provider=provider,
has_env=(default_home / ".env").exists(),
skill_count=_count_skills(default_home),
))
# Named profiles
profiles_root = _get_profiles_root()
if profiles_root.is_dir():
for entry in sorted(profiles_root.iterdir()):
if not entry.is_dir():
continue
name = entry.name
if not _PROFILE_ID_RE.match(name):
continue
model, provider = _read_config_model(entry)
alias_path = wrapper_dir / name
profiles.append(ProfileInfo(
name=name,
path=entry,
is_default=False,
gateway_running=_check_gateway_running(entry),
model=model,
provider=provider,
has_env=(entry / ".env").exists(),
skill_count=_count_skills(entry),
alias_path=alias_path if alias_path.exists() else None,
))
return profiles
def create_profile(
name: str,
clone_from: Optional[str] = None,
clone_all: bool = False,
clone_config: bool = False,
no_alias: bool = False,
) -> Path:
"""Create a new profile directory.
Parameters
----------
name:
Profile identifier (lowercase, alphanumeric, hyphens, underscores).
clone_from:
Source profile to clone from. If ``None`` and clone_config/clone_all
is True, defaults to the currently active profile.
clone_all:
If True, do a full copytree of the source (all state).
clone_config:
If True, copy only config files (config.yaml, .env, SOUL.md).
no_alias:
If True, skip wrapper script creation.
Returns
-------
Path
The newly created profile directory.
"""
validate_profile_name(name)
if name == "default":
raise ValueError(
"Cannot create a profile named 'default' — it is the built-in profile (~/.hermes)."
)
profile_dir = get_profile_dir(name)
if profile_dir.exists():
raise FileExistsError(f"Profile '{name}' already exists at {profile_dir}")
# Resolve clone source
source_dir = None
if clone_from is not None or clone_all or clone_config:
if clone_from is None:
# Default: clone from active profile
from hermes_constants import get_hermes_home
source_dir = get_hermes_home()
else:
validate_profile_name(clone_from)
source_dir = get_profile_dir(clone_from)
if not source_dir.is_dir():
raise FileNotFoundError(
f"Source profile '{clone_from or 'active'}' does not exist at {source_dir}"
)
if clone_all and source_dir:
# Full copy of source profile
shutil.copytree(source_dir, profile_dir)
# Strip runtime files
for stale in _CLONE_ALL_STRIP:
(profile_dir / stale).unlink(missing_ok=True)
else:
# Bootstrap directory structure
profile_dir.mkdir(parents=True, exist_ok=True)
for subdir in _PROFILE_DIRS:
(profile_dir / subdir).mkdir(parents=True, exist_ok=True)
# Clone config files from source
if source_dir is not None:
for filename in _CLONE_CONFIG_FILES:
src = source_dir / filename
if src.exists():
shutil.copy2(src, profile_dir / filename)
return profile_dir
def seed_profile_skills(profile_dir: Path, quiet: bool = False) -> Optional[dict]:
"""Seed bundled skills into a profile via subprocess.
Uses subprocess because sync_skills() caches HERMES_HOME at module level.
Returns the sync result dict, or None on failure.
"""
project_root = Path(__file__).parent.parent.resolve()
try:
result = subprocess.run(
[sys.executable, "-c",
"import json; from tools.skills_sync import sync_skills; "
"r = sync_skills(quiet=True); print(json.dumps(r))"],
env={**os.environ, "HERMES_HOME": str(profile_dir)},
cwd=str(project_root),
capture_output=True, text=True, timeout=60,
)
if result.returncode == 0 and result.stdout.strip():
return json.loads(result.stdout.strip())
if not quiet:
print(f"⚠ Skill seeding returned exit code {result.returncode}")
if result.stderr.strip():
print(f" {result.stderr.strip()[:200]}")
return None
except subprocess.TimeoutExpired:
if not quiet:
print("⚠ Skill seeding timed out (60s)")
return None
except Exception as e:
if not quiet:
print(f"⚠ Skill seeding failed: {e}")
return None
def delete_profile(name: str, yes: bool = False) -> Path:
"""Delete a profile, its wrapper script, and its gateway service.
Stops the gateway if running. Disables systemd/launchd service first
to prevent auto-restart.
Returns the path that was removed.
"""
validate_profile_name(name)
if name == "default":
raise ValueError(
"Cannot delete the default profile (~/.hermes).\n"
"To remove everything, use: hermes uninstall"
)
profile_dir = get_profile_dir(name)
if not profile_dir.is_dir():
raise FileNotFoundError(f"Profile '{name}' does not exist.")
# Show what will be deleted
model, provider = _read_config_model(profile_dir)
gw_running = _check_gateway_running(profile_dir)
skill_count = _count_skills(profile_dir)
print(f"\nProfile: {name}")
print(f"Path: {profile_dir}")
if model:
print(f"Model: {model}" + (f" ({provider})" if provider else ""))
if skill_count:
print(f"Skills: {skill_count}")
items = [
"All config, API keys, memories, sessions, skills, cron jobs",
]
# Check for service
from hermes_cli.gateway import _profile_suffix, get_service_name
wrapper_path = _get_wrapper_dir() / name
has_wrapper = wrapper_path.exists()
if has_wrapper:
items.append(f"Command alias ({wrapper_path})")
print(f"\nThis will permanently delete:")
for item in items:
print(f"{item}")
if gw_running:
print(f" ⚠ Gateway is running — it will be stopped.")
# Confirmation
if not yes:
print()
try:
confirm = input(f"Type '{name}' to confirm: ").strip()
except (KeyboardInterrupt, EOFError):
print("\nCancelled.")
return profile_dir
if confirm != name:
print("Cancelled.")
return profile_dir
# 1. Disable service (prevents auto-restart)
_cleanup_gateway_service(name, profile_dir)
# 2. Stop running gateway
if gw_running:
_stop_gateway_process(profile_dir)
# 3. Remove wrapper script
if has_wrapper:
if remove_wrapper_script(name):
print(f"✓ Removed {wrapper_path}")
# 4. Remove profile directory
try:
shutil.rmtree(profile_dir)
print(f"✓ Removed {profile_dir}")
except Exception as e:
print(f"⚠ Could not remove {profile_dir}: {e}")
# 5. Clear active_profile if it pointed to this profile
try:
active = get_active_profile()
if active == name:
set_active_profile("default")
print("✓ Active profile reset to default")
except Exception:
pass
print(f"\nProfile '{name}' deleted.")
return profile_dir
def _cleanup_gateway_service(name: str, profile_dir: Path) -> None:
"""Disable and remove systemd/launchd service for a profile."""
import platform as _platform
# Derive service name for this profile
# Temporarily set HERMES_HOME so _profile_suffix resolves correctly
old_home = os.environ.get("HERMES_HOME")
try:
os.environ["HERMES_HOME"] = str(profile_dir)
from hermes_cli.gateway import get_service_name, get_launchd_plist_path
if _platform.system() == "Linux":
svc_name = get_service_name()
svc_file = Path.home() / ".config" / "systemd" / "user" / f"{svc_name}.service"
if svc_file.exists():
subprocess.run(
["systemctl", "--user", "disable", svc_name],
capture_output=True, check=False, timeout=10,
)
subprocess.run(
["systemctl", "--user", "stop", svc_name],
capture_output=True, check=False, timeout=10,
)
svc_file.unlink(missing_ok=True)
subprocess.run(
["systemctl", "--user", "daemon-reload"],
capture_output=True, check=False, timeout=10,
)
print(f"✓ Service {svc_name} removed")
elif _platform.system() == "Darwin":
plist_path = get_launchd_plist_path()
if plist_path.exists():
subprocess.run(
["launchctl", "unload", str(plist_path)],
capture_output=True, check=False, timeout=10,
)
plist_path.unlink(missing_ok=True)
print(f"✓ Launchd service removed")
except Exception as e:
print(f"⚠ Service cleanup: {e}")
finally:
if old_home is not None:
os.environ["HERMES_HOME"] = old_home
elif "HERMES_HOME" in os.environ:
del os.environ["HERMES_HOME"]
def _stop_gateway_process(profile_dir: Path) -> None:
"""Stop a running gateway process via its PID file."""
import signal as _signal
import time as _time
pid_file = profile_dir / "gateway.pid"
if not pid_file.exists():
return
try:
raw = pid_file.read_text().strip()
data = json.loads(raw) if raw.startswith("{") else {"pid": int(raw)}
pid = int(data["pid"])
os.kill(pid, _signal.SIGTERM)
# Wait up to 10s for graceful shutdown
for _ in range(20):
_time.sleep(0.5)
try:
os.kill(pid, 0)
except ProcessLookupError:
print(f"✓ Gateway stopped (PID {pid})")
return
# Force kill
try:
os.kill(pid, _signal.SIGKILL)
except ProcessLookupError:
pass
print(f"✓ Gateway force-stopped (PID {pid})")
except (ProcessLookupError, PermissionError):
print("✓ Gateway already stopped")
except Exception as e:
print(f"⚠ Could not stop gateway: {e}")
# ---------------------------------------------------------------------------
# Active profile (sticky default)
# ---------------------------------------------------------------------------
def get_active_profile() -> str:
"""Read the sticky active profile name.
Returns ``"default"`` if no active_profile file exists or it's empty.
"""
path = _get_active_profile_path()
try:
name = path.read_text().strip()
if not name:
return "default"
return name
except (FileNotFoundError, UnicodeDecodeError, OSError):
return "default"
def set_active_profile(name: str) -> None:
"""Set the sticky active profile.
Writes to ``~/.hermes/active_profile``. Use ``"default"`` to clear.
"""
validate_profile_name(name)
if name != "default" and not profile_exists(name):
raise FileNotFoundError(
f"Profile '{name}' does not exist. "
f"Create it with: hermes profile create {name}"
)
path = _get_active_profile_path()
path.parent.mkdir(parents=True, exist_ok=True)
if name == "default":
# Remove the file to indicate default
path.unlink(missing_ok=True)
else:
# Atomic write
tmp = path.with_suffix(".tmp")
tmp.write_text(name + "\n")
tmp.replace(path)
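The write-then-rename pattern used for `active_profile` can be sketched on its own. `write_atomically` is a hypothetical helper name, and the demonstration writes into a temporary directory rather than `~/.hermes`.

```python
import tempfile
from pathlib import Path


def write_atomically(path: Path, text: str) -> None:
    """Write text to a sibling temp file, then rename it over the target."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(text)
    # Path.replace renames in one step, so readers see either the old file
    # or the complete new one, never a partially written file.
    tmp.replace(path)


with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "active_profile"
    write_atomically(target, "coder\n")
    print(target.read_text().strip())
```

The rename is only atomic when the temp file and the target live on the same filesystem, which holds here because both sit in the same directory.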
def get_active_profile_name() -> str:
"""Infer the current profile name from HERMES_HOME.
Returns ``"default"`` if HERMES_HOME is not set or points to ``~/.hermes``.
Returns the profile name if HERMES_HOME points into ``~/.hermes/profiles/<name>``.
Returns ``"custom"`` if HERMES_HOME is set to an unrecognized path.
"""
from hermes_constants import get_hermes_home
hermes_home = get_hermes_home()
resolved = hermes_home.resolve()
default_resolved = _get_default_hermes_home().resolve()
if resolved == default_resolved:
return "default"
profiles_root = _get_profiles_root().resolve()
try:
rel = resolved.relative_to(profiles_root)
parts = rel.parts
if len(parts) == 1 and _PROFILE_ID_RE.match(parts[0]):
return parts[0]
except ValueError:
pass
return "custom"
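The three outcomes documented above (`"default"`, a profile name, or `"custom"`) can be illustrated with pure paths, which avoids touching the filesystem. `infer_profile_name` and the example paths are hypothetical; the source additionally calls `resolve()` on real paths before comparing.

```python
import re
from pathlib import PurePosixPath

PROFILE_ID_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")


def infer_profile_name(hermes_home: PurePosixPath, default_home: PurePosixPath) -> str:
    """Map a HERMES_HOME path to 'default', a profile name, or 'custom'."""
    if hermes_home == default_home:
        return "default"
    try:
        parts = hermes_home.relative_to(default_home / "profiles").parts
    except ValueError:
        return "custom"  # not under <default_home>/profiles at all
    # Only a direct child of profiles/ with a valid identifier counts.
    if len(parts) == 1 and PROFILE_ID_RE.match(parts[0]):
        return parts[0]
    return "custom"


home = PurePosixPath("/home/u/.hermes")
for p in ["/home/u/.hermes", "/home/u/.hermes/profiles/coder",
          "/home/u/.hermes/profiles/coder/skills", "/opt/hermes"]:
    print(p, "->", infer_profile_name(PurePosixPath(p), home))
```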
# ---------------------------------------------------------------------------
# Export / Import
# ---------------------------------------------------------------------------
def export_profile(name: str, output_path: str) -> Path:
"""Export a profile to a tar.gz archive.
Returns the output file path.
"""
validate_profile_name(name)
profile_dir = get_profile_dir(name)
if not profile_dir.is_dir():
raise FileNotFoundError(f"Profile '{name}' does not exist.")
output = Path(output_path)
# shutil.make_archive wants the base name without extension
base = str(output).removesuffix(".tar.gz").removesuffix(".tgz")
result = shutil.make_archive(base, "gztar", str(profile_dir.parent), name)
return Path(result)
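A sketch of the same `shutil.make_archive` call run against a throwaway directory, to show what ends up inside the archive. `archive_profile` is a hypothetical name, and the profiles root is passed in explicitly instead of being derived from the home directory.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def archive_profile(profiles_root: Path, name: str, output_path: str) -> Path:
    """Pack <profiles_root>/<name> into a .tar.gz and return the archive path."""
    # make_archive appends the extension itself, so strip it from the base name.
    base = output_path.removesuffix(".tar.gz").removesuffix(".tgz")
    # root_dir is the parent, base_dir the profile folder, so every member
    # path in the archive starts with "<name>/".
    return Path(shutil.make_archive(base, "gztar", str(profiles_root), name))


with tempfile.TemporaryDirectory() as d:
    root = Path(d) / "profiles"
    (root / "coder").mkdir(parents=True)
    (root / "coder" / "config.yaml").write_text("model: example\n")
    out = archive_profile(root, "coder", str(Path(d) / "coder-backup.tar.gz"))
    with tarfile.open(out, "r:gz") as tf:
        print(sorted(tf.getnames()))
```

Because the top-level directory in the archive is the profile name, an importer can infer the name from the archive when none is given explicitly. Note that an output path ending in `.tgz` still produces a file ending in `.tar.gz`.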
def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
"""Import a profile from a tar.gz archive.
If *name* is not given, infers it from the archive's top-level directory.
Returns the imported profile directory.
"""
import tarfile
archive = Path(archive_path)
if not archive.exists():
raise FileNotFoundError(f"Archive not found: {archive}")
# Peek at the archive to find the top-level directory name
with tarfile.open(archive, "r:gz") as tf:
top_dirs = {m.name.split("/")[0] for m in tf.getmembers() if "/" in m.name}
if not top_dirs:
top_dirs = {m.name for m in tf.getmembers() if m.isdir()}
inferred_name = name or (top_dirs.pop() if len(top_dirs) == 1 else None)
if not inferred_name:
raise ValueError(
"Cannot determine profile name from archive. "
"Specify it explicitly: hermes profile import <archive> --name <name>"
)
validate_profile_name(inferred_name)
profile_dir = get_profile_dir(inferred_name)
if profile_dir.exists():
raise FileExistsError(f"Profile '{inferred_name}' already exists at {profile_dir}")
profiles_root = _get_profiles_root()
profiles_root.mkdir(parents=True, exist_ok=True)
shutil.unpack_archive(str(archive), str(profiles_root))
# If the archive extracted under a different name, rename
extracted = profiles_root / (top_dirs.pop() if top_dirs else inferred_name)
if extracted != profile_dir and extracted.exists():
extracted.rename(profile_dir)
return profile_dir
# ---------------------------------------------------------------------------
# Rename
# ---------------------------------------------------------------------------
def rename_profile(old_name: str, new_name: str) -> Path:
    """Rename a profile: directory, wrapper script, service, active_profile.
    Returns the new profile directory.
    """
    validate_profile_name(old_name)
    validate_profile_name(new_name)
    if old_name == "default":
        raise ValueError("Cannot rename the default profile.")
    if new_name == "default":
        raise ValueError("Cannot rename to 'default' — it is reserved.")
    old_dir = get_profile_dir(old_name)
    new_dir = get_profile_dir(new_name)
    if not old_dir.is_dir():
        raise FileNotFoundError(f"Profile '{old_name}' does not exist.")
    if new_dir.exists():
        raise FileExistsError(f"Profile '{new_name}' already exists.")
    # 1. Stop gateway if running
    if _check_gateway_running(old_dir):
        _cleanup_gateway_service(old_name, old_dir)
        _stop_gateway_process(old_dir)
    # 2. Rename directory
    old_dir.rename(new_dir)
    print(f"✓ Renamed {old_dir.name} → {new_dir.name}")
    # 3. Update wrapper script
    remove_wrapper_script(old_name)
    collision = check_alias_collision(new_name)
    if not collision:
        create_wrapper_script(new_name)
        print(f"✓ Alias updated: {new_name}")
    else:
        print(f"⚠ Cannot create alias '{new_name}': {collision}")
    # 4. Update active_profile if it pointed to old name
    try:
        if get_active_profile() == old_name:
            set_active_profile(new_name)
            print(f"✓ Active profile updated: {new_name}")
    except Exception:
        pass
    return new_dir
# ---------------------------------------------------------------------------
# Tab completion
# ---------------------------------------------------------------------------
def generate_bash_completion() -> str:
"""Generate a bash completion script for hermes profile names."""
return '''# Hermes Agent profile completion
# Add to ~/.bashrc: eval "$(hermes completion bash)"
_hermes_profiles() {
local profiles_dir="$HOME/.hermes/profiles"
local profiles="default"
if [ -d "$profiles_dir" ]; then
profiles="$profiles $(ls "$profiles_dir" 2>/dev/null)"
fi
echo "$profiles"
}
_hermes_completion() {
local cur prev
cur="${COMP_WORDS[COMP_CWORD]}"
prev="${COMP_WORDS[COMP_CWORD-1]}"
# Complete profile names after -p / --profile
if [[ "$prev" == "-p" || "$prev" == "--profile" ]]; then
COMPREPLY=($(compgen -W "$(_hermes_profiles)" -- "$cur"))
return
fi
# Complete profile subcommands
if [[ "${COMP_WORDS[1]}" == "profile" ]]; then
case "$prev" in
profile)
COMPREPLY=($(compgen -W "list use create delete show alias rename export import" -- "$cur"))
return
;;
use|delete|show|alias|rename|export)
COMPREPLY=($(compgen -W "$(_hermes_profiles)" -- "$cur"))
return
;;
esac
fi
# Top-level subcommands
if [[ "$COMP_CWORD" == 1 ]]; then
local commands="chat model gateway setup status cron doctor config skills tools mcp sessions profile update version"
COMPREPLY=($(compgen -W "$commands" -- "$cur"))
fi
}
complete -F _hermes_completion hermes
'''
def generate_zsh_completion() -> str:
"""Generate a zsh completion script for hermes profile names."""
return '''#compdef hermes
# Hermes Agent profile completion
# Add to ~/.zshrc: eval "$(hermes completion zsh)"
_hermes() {
local -a profiles
profiles=(default)
if [[ -d "$HOME/.hermes/profiles" ]]; then
profiles+=("${(@f)$(ls $HOME/.hermes/profiles 2>/dev/null)}")
fi
_arguments \\
'-p[Profile name]:profile:($profiles)' \\
'--profile[Profile name]:profile:($profiles)' \\
'1:command:(chat model gateway setup status cron doctor config skills tools mcp sessions profile update version)' \\
'*::arg:->args'
case $words[1] in
profile)
_arguments '1:action:(list use create delete show alias rename export import)' \\
'2:profile:($profiles)'
;;
esac
}
_hermes "$@"
'''
# ---------------------------------------------------------------------------
# Profile env resolution (called from _apply_profile_override)
# ---------------------------------------------------------------------------
def resolve_profile_env(profile_name: str) -> str:
    """Resolve a profile name to a HERMES_HOME path string.
    Called early in the CLI entry point, before any hermes modules
    are imported, to set the HERMES_HOME environment variable.
    """
    validate_profile_name(profile_name)
    profile_dir = get_profile_dir(profile_name)
    if profile_name != "default" and not profile_dir.is_dir():
        raise FileNotFoundError(
            f"Profile '{profile_name}' does not exist. "
            f"Create it with: hermes profile create {profile_name}"
        )
    return str(profile_dir)
+6 -9
@@ -63,8 +63,11 @@ def _get_model_config() -> Dict[str, Any]:
     model_cfg = config.get("model")
     if isinstance(model_cfg, dict):
         cfg = dict(model_cfg)
-        default = cfg.get("default", "").strip()
-        base_url = cfg.get("base_url", "").strip()
+        # Accept "model" as alias for "default" (users intuitively write model.model)
+        if not cfg.get("default") and cfg.get("model"):
+            cfg["default"] = cfg["model"]
+        default = (cfg.get("default") or "").strip()
+        base_url = (cfg.get("base_url") or "").strip()
         is_local = "localhost" in base_url or "127.0.0.1" in base_url
         is_fallback = not default or default == "anthropic/claude-opus-4.6"
         if is_local and is_fallback and base_url:
@@ -203,7 +206,7 @@ def _resolve_named_custom_runtime(
             or _detect_api_mode_for_url(base_url)
             or "chat_completions",
         "base_url": base_url,
-        "api_key": api_key,
+        "api_key": api_key or "no-key-required",
         "source": f"custom_provider:{custom_provider.get('name', requested_provider)}",
     }
@@ -407,12 +410,6 @@ def resolve_runtime_provider(
         # (e.g. https://api.minimax.io/anthropic, https://dashscope.../anthropic)
         elif base_url.rstrip("/").endswith("/anthropic"):
             api_mode = "anthropic_messages"
-        # MiniMax providers always use Anthropic Messages API.
-        # Auto-correct stale /v1 URLs (from old .env or config) to /anthropic.
-        elif provider in ("minimax", "minimax-cn"):
-            api_mode = "anthropic_messages"
-            if base_url.rstrip("/").endswith("/v1"):
-                base_url = base_url.rstrip("/")[:-3] + "/anthropic"
     return {
         "provider": provider,
         "api_mode": api_mode,
+193 -31
@@ -18,6 +18,8 @@ import sys
from pathlib import Path
from typing import Optional, Dict, Any
from hermes_constants import get_optional_skills_dir
logger = logging.getLogger(__name__)
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
@@ -80,6 +82,11 @@ _DEFAULT_PROVIDER_MODELS = {
"minimax-cn": ["MiniMax-M2.7", "MiniMax-M2.7-highspeed", "MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
"ai-gateway": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5", "google/gemini-3-flash"],
"kilocode": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5.4", "google/gemini-3-pro-preview", "google/gemini-3-flash-preview"],
"huggingface": [
"Qwen/Qwen3.5-397B-A17B", "Qwen/Qwen3-235B-A22B-Thinking-2507",
"Qwen/Qwen3-Coder-480B-A35B-Instruct", "deepseek-ai/DeepSeek-R1-0528",
"deepseek-ai/DeepSeek-V3.2", "moonshotai/Kimi-K2.5",
],
}
@@ -284,6 +291,7 @@ from hermes_cli.config import (
get_env_value,
ensure_hermes_home,
)
# display_hermes_home imported lazily at call sites (stale-module safety during hermes update)
from hermes_cli.colors import Colors, color
@@ -580,11 +588,11 @@ def _print_setup_summary(config: dict, hermes_home):
else:
tool_status.append(("Mixture of Agents", False, "OPENROUTER_API_KEY"))
# Web tools (Parallel, Firecrawl, or Tavily)
if get_env_value("PARALLEL_API_KEY") or get_env_value("FIRECRAWL_API_KEY") or get_env_value("FIRECRAWL_API_URL") or get_env_value("TAVILY_API_KEY"):
# Web tools (Exa, Parallel, Firecrawl, or Tavily)
if get_env_value("EXA_API_KEY") or get_env_value("PARALLEL_API_KEY") or get_env_value("FIRECRAWL_API_KEY") or get_env_value("FIRECRAWL_API_URL") or get_env_value("TAVILY_API_KEY"):
tool_status.append(("Web Search & Extract", True, None))
else:
tool_status.append(("Web Search & Extract", False, "PARALLEL_API_KEY, FIRECRAWL_API_KEY, or TAVILY_API_KEY"))
tool_status.append(("Web Search & Extract", False, "EXA_API_KEY, PARALLEL_API_KEY, FIRECRAWL_API_KEY, or TAVILY_API_KEY"))
# Browser tools (local Chromium or Browserbase cloud)
import shutil
@@ -595,13 +603,15 @@ def _print_setup_summary(config: dict, hermes_home):
Path(__file__).parent.parent / "node_modules" / ".bin" / "agent-browser"
).exists()
)
if get_env_value("BROWSERBASE_API_KEY"):
if get_env_value("CAMOFOX_URL"):
tool_status.append(("Browser Automation (Camofox)", True, None))
elif get_env_value("BROWSERBASE_API_KEY"):
tool_status.append(("Browser Automation (Browserbase)", True, None))
elif _ab_found:
tool_status.append(("Browser Automation (local)", True, None))
else:
tool_status.append(
("Browser Automation", False, "npm install -g agent-browser")
("Browser Automation", False, "npm install -g agent-browser or set CAMOFOX_URL")
)
# FAL (image generation)
@@ -678,7 +688,8 @@ def _print_setup_summary(config: dict, hermes_home):
print_warning(
"Some tools are disabled. Run 'hermes setup tools' to configure them,"
)
print_warning("or edit ~/.hermes/.env directly to add the missing API keys.")
from hermes_constants import display_hermes_home as _dhh
print_warning(f"or edit {_dhh()}/.env directly to add the missing API keys.")
print()
# Done banner
@@ -701,7 +712,8 @@ def _print_setup_summary(config: dict, hermes_home):
print()
# Show file locations prominently
print(color("📁 All your files are in ~/.hermes/:", Colors.CYAN, Colors.BOLD))
from hermes_constants import display_hermes_home as _dhh
print(color(f"📁 All your files are in {_dhh()}/:", Colors.CYAN, Colors.BOLD))
print()
print(f" {color('Settings:', Colors.YELLOW)} {get_config_path()}")
print(f" {color('API Keys:', Colors.YELLOW)} {get_env_path()}")
@@ -884,6 +896,7 @@ def setup_model_provider(config: dict):
"OpenCode Go (open models, $10/month subscription)",
"GitHub Copilot (uses GITHUB_TOKEN or gh auth token)",
"GitHub Copilot ACP (spawns `copilot --acp --stdio`)",
"Hugging Face Inference Providers (20+ open models)",
]
if keep_label:
provider_choices.append(keep_label)
@@ -993,10 +1006,9 @@ def setup_model_provider(config: dict):
min_key_ttl_seconds=5 * 60,
timeout_seconds=15.0,
)
nous_models = fetch_nous_models(
inference_base_url=creds.get("base_url", ""),
api_key=creds.get("api_key", ""),
)
# Use curated model list instead of full /models dump
from hermes_cli.models import _PROVIDER_MODELS
nous_models = _PROVIDER_MODELS.get("nous", [])
except Exception as e:
logger.debug("Could not fetch Nous models after login: %s", e)
@@ -1528,7 +1540,26 @@ def setup_model_provider(config: dict):
_set_model_provider(config, "copilot-acp", pconfig.inference_base_url)
selected_base_url = pconfig.inference_base_url
# else: provider_idx == 16 (Keep current) — only shown when a provider already exists
elif provider_idx == 16: # Hugging Face Inference Providers
selected_provider = "huggingface"
print()
print_header("Hugging Face API Token")
pconfig = PROVIDER_REGISTRY["huggingface"]
print_info(f"Provider: {pconfig.name}")
print_info("Get your token at: https://huggingface.co/settings/tokens")
print_info("Required permission: 'Make calls to Inference Providers'")
print()
api_key = prompt(" HF Token", password=True)
if api_key:
save_env_value("HF_TOKEN", api_key)
# Clear OpenRouter env vars to prevent routing confusion
save_env_value("OPENAI_BASE_URL", "")
save_env_value("OPENAI_API_KEY", "")
_set_model_provider(config, "huggingface", pconfig.inference_base_url)
selected_base_url = pconfig.inference_base_url
# else: provider_idx == 17 (Keep current) — only shown when a provider already exists
# Normalize "keep current" to an explicit provider so downstream logic
# doesn't fall back to the generic OpenRouter/static-model path.
if selected_provider is None:
@@ -2067,11 +2098,11 @@ def setup_terminal_backend(config: dict):
print_info("Serverless cloud sandboxes. Each session gets its own container.")
print_info("Requires a Modal account: https://modal.com")
# Check if swe-rex[modal] is installed
# Check if modal SDK is installed
try:
__import__("swe_rex")
__import__("modal")
except ImportError:
print_info("Installing swe-rex[modal]...")
print_info("Installing modal SDK...")
import subprocess
uv_bin = shutil.which("uv")
@@ -2083,22 +2114,22 @@ def setup_terminal_backend(config: dict):
"install",
"--python",
sys.executable,
"swe-rex[modal]",
"modal",
],
capture_output=True,
text=True,
)
else:
result = subprocess.run(
[sys.executable, "-m", "pip", "install", "swe-rex[modal]"],
[sys.executable, "-m", "pip", "install", "modal"],
capture_output=True,
text=True,
)
if result.returncode == 0:
print_success("swe-rex[modal] installed")
print_success("modal SDK installed")
else:
print_warning(
"Install failed — run manually: pip install 'swe-rex[modal]'"
"Install failed — run manually: pip install modal"
)
# Modal token
@@ -2682,10 +2713,38 @@ def setup_gateway(config: dict):
if token or get_env_value("MATRIX_PASSWORD"):
# E2EE
print()
if prompt_yes_no("Enable end-to-end encryption (E2EE)?", False):
want_e2ee = prompt_yes_no("Enable end-to-end encryption (E2EE)?", False)
if want_e2ee:
save_env_value("MATRIX_ENCRYPTION", "true")
print_success("E2EE enabled")
print_info(" Requires: pip install 'matrix-nio[e2e]'")
# Auto-install matrix-nio
matrix_pkg = "matrix-nio[e2e]" if want_e2ee else "matrix-nio"
try:
__import__("nio")
except ImportError:
print_info(f"Installing {matrix_pkg}...")
import subprocess
uv_bin = shutil.which("uv")
if uv_bin:
result = subprocess.run(
[uv_bin, "pip", "install", "--python", sys.executable, matrix_pkg],
capture_output=True,
text=True,
)
else:
result = subprocess.run(
[sys.executable, "-m", "pip", "install", matrix_pkg],
capture_output=True,
text=True,
)
if result.returncode == 0:
print_success(f"{matrix_pkg} installed")
else:
print_warning(f"Install failed — run manually: pip install '{matrix_pkg}'")
if result.stderr:
print_info(f" Error: {result.stderr.strip().splitlines()[-1]}")
# Allowed users
print()
@@ -2812,7 +2871,8 @@ def setup_gateway(config: dict):
save_env_value("WEBHOOK_ENABLED", "true")
print()
print_success("Webhooks enabled! Next steps:")
print_info(" 1. Define webhook routes in ~/.hermes/config.yaml")
from hermes_constants import display_hermes_home as _dhh
print_info(f" 1. Define webhook routes in {_dhh()}/config.yaml")
print_info(" 2. Point your service (GitHub, GitLab, etc.) at:")
print_info(" http://your-server:8644/webhooks/<route-name>")
print()
@@ -2968,14 +3028,102 @@ def setup_tools(config: dict, first_install: bool = False):
tools_command(first_install=first_install, config=config)
# =============================================================================
# Post-Migration Section Skip Logic
# =============================================================================
def _get_section_config_summary(config: dict, section_key: str) -> Optional[str]:
"""Return a short summary if a setup section is already configured, else None.
Used after OpenClaw migration to detect which sections can be skipped.
``get_env_value`` is the module-level import from hermes_cli.config
so that test patches on ``setup_mod.get_env_value`` take effect.
"""
if section_key == "model":
has_key = bool(
get_env_value("OPENROUTER_API_KEY")
or get_env_value("OPENAI_API_KEY")
or get_env_value("ANTHROPIC_API_KEY")
)
if not has_key:
# Check for OAuth providers
try:
from hermes_cli.auth import get_active_provider
if get_active_provider():
has_key = True
except Exception:
pass
if not has_key:
return None
model = config.get("model")
if isinstance(model, str) and model.strip():
return model.strip()
if isinstance(model, dict):
return str(model.get("default") or model.get("model") or "configured")
return "configured"
elif section_key == "terminal":
backend = config.get("terminal", {}).get("backend", "local")
return f"backend: {backend}"
elif section_key == "agent":
max_turns = config.get("agent", {}).get("max_turns", 90)
return f"max turns: {max_turns}"
elif section_key == "gateway":
platforms = []
if get_env_value("TELEGRAM_BOT_TOKEN"):
platforms.append("Telegram")
if get_env_value("DISCORD_BOT_TOKEN"):
platforms.append("Discord")
if get_env_value("SLACK_BOT_TOKEN"):
platforms.append("Slack")
if get_env_value("WHATSAPP_PHONE_NUMBER_ID"):
platforms.append("WhatsApp")
if get_env_value("SIGNAL_ACCOUNT"):
platforms.append("Signal")
if platforms:
return ", ".join(platforms)
return None # No platforms configured — section must run
elif section_key == "tools":
tools = []
if get_env_value("ELEVENLABS_API_KEY"):
tools.append("TTS/ElevenLabs")
if get_env_value("BROWSERBASE_API_KEY"):
tools.append("Browser")
if get_env_value("FIRECRAWL_API_KEY"):
tools.append("Firecrawl")
if tools:
return ", ".join(tools)
return None
return None
def _skip_configured_section(
config: dict, section_key: str, label: str
) -> bool:
"""Show an already-configured section summary and offer to skip.
Returns True if the user chose to skip, False if the section should run.
"""
summary = _get_section_config_summary(config, section_key)
if not summary:
return False
print()
print_success(f" {label}: {summary}")
return not prompt_yes_no(f" Reconfigure {label.lower()}?", default=False)
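The wizard later combines `migration_ran` with `_skip_configured_section` as `if not (migration_ran and _skip_configured_section(...))`. The resulting decision can be written as a small pure function, sketched below with the prompt injected as a callable so it can be exercised without a terminal. `should_run_section` and `ask_reconfigure` are hypothetical names introduced for this example only.

```python
from typing import Callable, Optional


def should_run_section(
    migration_ran: bool,
    summary: Optional[str],
    ask_reconfigure: Callable[[], bool],
) -> bool:
    """Decide whether a setup section should run.

    Hypothetical restatement of the wizard logic:
    - no migration, or nothing configured yet: always run the section;
    - otherwise run it only if the user asks to reconfigure.
    """
    if not migration_ran or not summary:
        return True  # the prompt is never shown in this case
    return ask_reconfigure()
```

For instance, `should_run_section(True, "backend: local", lambda: False)` models a user pressing Enter to keep the imported terminal settings, so the section is skipped.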
# =============================================================================
# OpenClaw Migration
# =============================================================================
_OPENCLAW_SCRIPT = (
PROJECT_ROOT
/ "optional-skills"
get_optional_skills_dir(PROJECT_ROOT / "optional-skills")
/ "migration"
/ "openclaw-migration"
/ "scripts"
@@ -3039,7 +3187,7 @@ def _offer_openclaw_migration(hermes_home: Path) -> bool:
target_root=hermes_home.resolve(),
execute=True,
workspace_target=None,
overwrite=False,
overwrite=True,
migrate_secrets=True,
output_dir=None,
selected_options=selected,
@@ -3195,6 +3343,8 @@ def run_setup_wizard(args):
)
)
migration_ran = False
if is_existing:
# ── Returning User Menu ──
print()
@@ -3264,7 +3414,8 @@ def run_setup_wizard(args):
return
# Offer OpenClaw migration before configuration begins
if _offer_openclaw_migration(hermes_home):
migration_ran = _offer_openclaw_migration(hermes_home)
if migration_ran:
# Reload config in case migration wrote to it
config = load_config()
@@ -3277,20 +3428,31 @@ def run_setup_wizard(args):
print()
print_info("You can edit these files directly or use 'hermes config edit'")
if migration_ran:
print()
print_info("Settings were imported from OpenClaw.")
print_info("Each section below will show what was imported — press Enter to keep,")
print_info("or choose to reconfigure if needed.")
# Section 1: Model & Provider
setup_model_provider(config)
if not (migration_ran and _skip_configured_section(config, "model", "Model & Provider")):
setup_model_provider(config)
# Section 2: Terminal Backend
setup_terminal_backend(config)
if not (migration_ran and _skip_configured_section(config, "terminal", "Terminal Backend")):
setup_terminal_backend(config)
# Section 3: Agent Settings
setup_agent_settings(config)
if not (migration_ran and _skip_configured_section(config, "agent", "Agent Settings")):
setup_agent_settings(config)
# Section 4: Messaging Platforms
setup_gateway(config)
if not (migration_ran and _skip_configured_section(config, "gateway", "Messaging Platforms")):
setup_gateway(config)
# Section 5: Tools
setup_tools(config, first_install=not is_existing)
if not (migration_ran and _skip_configured_section(config, "tools", "Tools")):
setup_tools(config, first_install=not is_existing)
# Save and show summary
save_config(config)
+6
@@ -24,6 +24,12 @@ PLATFORMS = {
"whatsapp": "📱 WhatsApp",
"signal": "📡 Signal",
"email": "📧 Email",
"homeassistant": "🏠 Home Assistant",
"mattermost": "💬 Mattermost",
"matrix": "💬 Matrix",
"dingtalk": "💬 DingTalk",
"feishu": "🪽 Feishu",
"wecom": "💬 WeCom",
}
# ─── Config Helpers ───────────────────────────────────────────────────────────
+69 -19
@@ -21,6 +21,7 @@ from rich.table import Table
# Lazy imports to avoid circular dependencies and slow startup.
# tools.skills_hub and tools.skills_guard are imported inside functions.
from hermes_constants import display_hermes_home
_console = Console()
@@ -304,7 +305,8 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all",
def do_install(identifier: str, category: str = "", force: bool = False,
console: Optional[Console] = None, skip_confirm: bool = False) -> None:
console: Optional[Console] = None, skip_confirm: bool = False,
invalidate_cache: bool = True) -> None:
"""Fetch, quarantine, scan, confirm, and install a skill."""
from tools.skills_hub import (
GitHubAuth, create_source_router, ensure_hub_dirs,
@@ -352,7 +354,14 @@ def do_install(identifier: str, category: str = "", force: bool = False,
extra_metadata.update(getattr(bundle, "metadata", {}) or {})
# Quarantine the bundle
q_path = quarantine_bundle(bundle)
try:
q_path = quarantine_bundle(bundle)
except ValueError as exc:
c.print(f"[bold red]Installation blocked:[/] {exc}\n")
from tools.skills_hub import append_audit_log
append_audit_log("BLOCKED", bundle.name, bundle.source,
bundle.trust_level, "invalid_path", str(exc))
return
c.print(f"[dim]Quarantined to {q_path.relative_to(q_path.parent.parent.parent)}[/]")
# Scan
@@ -387,7 +396,7 @@ def do_install(identifier: str, category: str = "", force: bool = False,
"[bold bright_cyan]This is an official optional skill maintained by Nous Research.[/]\n\n"
"It ships with hermes-agent but is not activated by default.\n"
"Installing will copy it to your skills directory where the agent can use it.\n\n"
f"Files will be at: [cyan]~/.hermes/skills/{category + '/' if category else ''}{bundle.name}/[/]",
f"Files will be at: [cyan]{display_hermes_home()}/skills/{category + '/' if category else ''}{bundle.name}/[/]",
title="Official Skill",
border_style="bright_cyan",
))
@@ -397,7 +406,7 @@ def do_install(identifier: str, category: str = "", force: bool = False,
"External skills can contain instructions that influence agent behavior,\n"
"shell commands, and scripts. Even after automated scanning, you should\n"
"review the installed files before use.\n\n"
f"Files will be at: [cyan]~/.hermes/skills/{category + '/' if category else ''}{bundle.name}/[/]",
f"Files will be at: [cyan]{display_hermes_home()}/skills/{category + '/' if category else ''}{bundle.name}/[/]",
title="Disclaimer",
border_style="yellow",
))
@@ -412,11 +421,30 @@ def do_install(identifier: str, category: str = "", force: bool = False,
return
# Install
install_dir = install_from_quarantine(q_path, bundle.name, category, bundle, result)
try:
install_dir = install_from_quarantine(q_path, bundle.name, category, bundle, result)
except ValueError as exc:
c.print(f"[bold red]Installation blocked:[/] {exc}\n")
shutil.rmtree(q_path, ignore_errors=True)
from tools.skills_hub import append_audit_log
append_audit_log("BLOCKED", bundle.name, bundle.source,
bundle.trust_level, "invalid_path", str(exc))
return
from tools.skills_hub import SKILLS_DIR
c.print(f"[bold green]Installed:[/] {install_dir.relative_to(SKILLS_DIR)}")
c.print(f"[dim]Files: {', '.join(bundle.files.keys())}[/]\n")
if invalidate_cache:
# Invalidate the skills prompt cache so the new skill appears immediately
try:
from agent.prompt_builder import clear_skills_system_prompt_cache
clear_skills_system_prompt_cache(clear_snapshot=True)
except Exception:
pass
else:
c.print("[dim]Skill will be available in your next session.[/]")
c.print("[dim]Use /reset to start a new session now, or --now to activate immediately (invalidates prompt cache).[/]\n")
def do_inspect(identifier: str, console: Optional[Console] = None) -> None:
"""Preview a skill's SKILL.md content without installing."""
@@ -603,7 +631,8 @@ def do_audit(name: Optional[str] = None, console: Optional[Console] = None) -> N
def do_uninstall(name: str, console: Optional[Console] = None,
skip_confirm: bool = False) -> None:
skip_confirm: bool = False,
invalidate_cache: bool = True) -> None:
"""Remove a hub-installed skill with confirmation."""
from tools.skills_hub import uninstall_skill
@@ -623,6 +652,15 @@ def do_uninstall(name: str, console: Optional[Console] = None,
success, msg = uninstall_skill(name)
if success:
c.print(f"[bold green]{msg}[/]\n")
if invalidate_cache:
try:
from agent.prompt_builder import clear_skills_system_prompt_cache
clear_skills_system_prompt_cache(clear_snapshot=True)
except Exception:
pass
else:
c.print("[dim]Change will take effect in your next session.[/]")
c.print("[dim]Use /reset to start a new session now, or --now to apply immediately (invalidates prompt cache).[/]\n")
else:
c.print(f"[bold red]Error:[/] {msg}\n")
@@ -722,7 +760,7 @@ def do_publish(skill_path: str, target: str = "github", repo: str = "",
auth = GitHubAuth()
if not auth.is_authenticated():
c.print("[bold red]Error:[/] GitHub authentication required.\n"
"Set GITHUB_TOKEN in ~/.hermes/.env or run 'gh auth login'.\n")
f"Set GITHUB_TOKEN in {display_hermes_home()}/.env or run 'gh auth login'.\n")
return
c.print(f"[bold]Publishing '{name}' to {repo}...[/]")
@@ -865,10 +903,15 @@ def do_snapshot_export(output_path: str, console: Optional[Console] = None) -> N
"taps": tap_list,
}
out = Path(output_path)
out.write_text(json.dumps(snapshot, indent=2, ensure_ascii=False) + "\n")
c.print(f"[bold green]Snapshot exported:[/] {out}")
c.print(f"[dim]{len(installed)} skill(s), {len(tap_list)} tap(s)[/]\n")
payload = json.dumps(snapshot, indent=2, ensure_ascii=False) + "\n"
if output_path == "-":
import sys
sys.stdout.write(payload)
else:
out = Path(output_path)
out.write_text(payload)
c.print(f"[bold green]Snapshot exported:[/] {out}")
c.print(f"[dim]{len(installed)} skill(s), {len(tap_list)} tap(s)[/]\n")
def do_snapshot_import(input_path: str, force: bool = False,
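The `do_snapshot_export` change above adopts the common convention that an output path of `-` means standard output, and it suppresses the status messages in that case so the JSON stays machine-readable. A minimal sketch of the same idea follows. `write_snapshot` and its `stream` parameter are hypothetical, added here so the stdout branch can be tested; the explicit `encoding="utf-8"` is also an assumption of this sketch, since the original call relies on the platform default.

```python
import json
import sys
from pathlib import Path
from typing import Any, Dict, Optional, TextIO


def write_snapshot(
    snapshot: Dict[str, Any],
    output_path: str,
    stream: Optional[TextIO] = None,
) -> None:
    """Write a snapshot as JSON to a file, or to a stream when the path is "-".

    Hypothetical helper for illustration; not part of the codebase.
    """
    payload = json.dumps(snapshot, indent=2, ensure_ascii=False) + "\n"
    if output_path == "-":
        # Only the payload goes to the stream, so it can be piped elsewhere.
        (stream or sys.stdout).write(payload)
    else:
        Path(output_path).write_text(payload, encoding="utf-8")
```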
@@ -1059,19 +1102,23 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
elif action == "install":
if not args:
c.print("[bold red]Usage:[/] /skills install <identifier> [--category <cat>] [--force|--yes]\n")
c.print("[bold red]Usage:[/] /skills install <identifier> [--category <cat>] [--force] [--now]\n")
return
identifier = args[0]
category = ""
# --yes / -y bypasses confirmation prompt (needed in TUI mode)
# --force handles reinstall override
skip_confirm = any(flag in args for flag in ("--yes", "-y"))
# Slash commands run inside prompt_toolkit where input() hangs.
# Always skip confirmation — the user typing the command is implicit consent.
skip_confirm = True
force = "--force" in args
# --now invalidates prompt cache immediately (costs more money).
# Default: defer to next session to preserve cache.
invalidate_cache = "--now" in args
for i, a in enumerate(args):
if a == "--category" and i + 1 < len(args):
category = args[i + 1]
do_install(identifier, category=category, force=force,
skip_confirm=skip_confirm, console=c)
skip_confirm=skip_confirm, invalidate_cache=invalidate_cache,
console=c)
elif action == "inspect":
if not args:
@@ -1101,10 +1148,13 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
elif action == "uninstall":
if not args:
c.print("[bold red]Usage:[/] /skills uninstall <name> [--yes]\n")
c.print("[bold red]Usage:[/] /skills uninstall <name> [--now]\n")
return
skip_confirm = any(flag in args for flag in ("--yes", "-y"))
do_uninstall(args[0], console=c, skip_confirm=skip_confirm)
# Slash commands run inside prompt_toolkit where input() hangs.
skip_confirm = True
invalidate_cache = "--now" in args
do_uninstall(args[0], console=c, skip_confirm=skip_confirm,
invalidate_cache=invalidate_cache)
elif action == "publish":
if not args:
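The `/skills install` handler in the hunks above now parses three things from the argument list: `--force`, `--now`, and an optional `--category <cat>` pair, while confirmation is always skipped. The sketch below collects that parsing in one place. `parse_install_flags` is a hypothetical name, and the returned dict layout is an assumption chosen for this example rather than anything in the codebase.

```python
from typing import Dict, List, Union


def parse_install_flags(args: List[str]) -> Dict[str, Union[str, bool]]:
    """Parse `/skills install` arguments the way the slash handler does.

    Hypothetical helper; assumes the caller has already checked that
    `args` is non-empty, as the original handler does.
    """
    category = ""
    for i, arg in enumerate(args):
        # --category takes the following token as its value, if there is one.
        if arg == "--category" and i + 1 < len(args):
            category = args[i + 1]
    return {
        "identifier": args[0],
        "category": category,
        "force": "--force" in args,
        # --now opts in to immediate prompt-cache invalidation.
        "invalidate_cache": "--now" in args,
        # Slash commands cannot prompt, so confirmation is always skipped.
        "skip_confirm": True,
    }
```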
+24 -12
@@ -254,6 +254,9 @@ def show_status(args):
         "Slack": ("SLACK_BOT_TOKEN", None),
         "Email": ("EMAIL_ADDRESS", "EMAIL_HOME_ADDRESS"),
         "SMS": ("TWILIO_ACCOUNT_SID", "SMS_HOME_CHANNEL"),
+        "DingTalk": ("DINGTALK_CLIENT_ID", None),
+        "Feishu": ("FEISHU_APP_ID", "FEISHU_HOME_CHANNEL"),
+        "WeCom": ("WECOM_BOT_ID", "WECOM_HOME_CHANNEL"),
     }
     for name, (token_var, home_var) in platforms.items():
@@ -282,22 +285,31 @@ def show_status(args):
             _gw_svc = get_service_name()
         except Exception:
             _gw_svc = "hermes-gateway"
-        result = subprocess.run(
-            ["systemctl", "--user", "is-active", _gw_svc],
-            capture_output=True,
-            text=True
-        )
-        is_active = result.stdout.strip() == "active"
+        try:
+            result = subprocess.run(
+                ["systemctl", "--user", "is-active", _gw_svc],
+                capture_output=True,
+                text=True,
+                timeout=5
+            )
+            is_active = result.stdout.strip() == "active"
+        except subprocess.TimeoutExpired:
+            is_active = False
         print(f" Status: {check_mark(is_active)} {'running' if is_active else 'stopped'}")
         print(" Manager: systemd (user)")
     elif sys.platform == 'darwin':
-        result = subprocess.run(
-            ["launchctl", "list", "ai.hermes.gateway"],
-            capture_output=True,
-            text=True
-        )
-        is_loaded = result.returncode == 0
+        from hermes_cli.gateway import get_launchd_label
+        try:
+            result = subprocess.run(
+                ["launchctl", "list", get_launchd_label()],
+                capture_output=True,
+                text=True,
+                timeout=5
+            )
+            is_loaded = result.returncode == 0
+        except subprocess.TimeoutExpired:
+            is_loaded = False
         print(f" Status: {check_mark(is_loaded)} {'loaded' if is_loaded else 'not loaded'}")
         print(" Manager: launchd")
     else:
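Both service probes in `show_status` now pass `timeout=5` and treat `subprocess.TimeoutExpired` as "not running", so a hung `systemctl` or `launchctl` can no longer block the status screen. The pattern can be sketched as a reusable helper. `probe_ok` is a hypothetical name; it checks only the return code (as the launchd branch does, whereas the systemd branch compares stdout), and catching `OSError` for a missing binary is an addition of this sketch that the diff does not make.

```python
import subprocess
import sys
from typing import List


def probe_ok(cmd: List[str], timeout: float = 5.0) -> bool:
    """Run a short status command and report whether it exited with 0.

    Hypothetical helper: a timeout or a missing executable counts as failure
    instead of raising.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Demo with the current interpreter so no external tool is required.
    print(probe_ok([sys.executable, "-c", "pass"]))
```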
+128 -6
@@ -9,6 +9,8 @@ Saves per-platform tool configuration to ~/.hermes/config.yaml under
the `platform_toolsets` key.
"""
import json as _json
import logging
import sys
from pathlib import Path
from typing import Dict, List, Optional, Set
@@ -19,6 +21,8 @@ from hermes_cli.config import (
)
from hermes_cli.colors import Colors, color
logger = logging.getLogger(__name__)
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
@@ -108,7 +112,8 @@ def _get_effective_configurable_toolsets():
"""
result = list(CONFIGURABLE_TOOLSETS)
try:
from hermes_cli.plugins import get_plugin_toolsets
from hermes_cli.plugins import discover_plugins, get_plugin_toolsets
discover_plugins() # idempotent — ensures plugins are loaded
result.extend(get_plugin_toolsets())
except Exception:
pass
@@ -118,7 +123,8 @@ def _get_effective_configurable_toolsets():
def _get_plugin_toolset_keys() -> set:
"""Return the set of toolset keys provided by plugins."""
try:
from hermes_cli.plugins import get_plugin_toolsets
from hermes_cli.plugins import discover_plugins, get_plugin_toolsets
discover_plugins() # idempotent — ensures plugins are loaded
return {ts_key for ts_key, _, _ in get_plugin_toolsets()}
except Exception:
return set()
@@ -133,8 +139,12 @@ PLATFORMS = {
"signal": {"label": "📡 Signal", "default_toolset": "hermes-signal"},
"homeassistant": {"label": "🏠 Home Assistant", "default_toolset": "hermes-homeassistant"},
"email": {"label": "📧 Email", "default_toolset": "hermes-email"},
"dingtalk": {"label": "💬 DingTalk", "default_toolset": "hermes-dingtalk"},
"matrix": {"label": "💬 Matrix", "default_toolset": "hermes-matrix"},
"dingtalk": {"label": "💬 DingTalk", "default_toolset": "hermes-dingtalk"},
"feishu": {"label": "🪽 Feishu", "default_toolset": "hermes-feishu"},
"wecom": {"label": "💬 WeCom", "default_toolset": "hermes-wecom"},
"api_server": {"label": "🌐 API Server", "default_toolset": "hermes-api-server"},
"mattermost": {"label": "💬 Mattermost", "default_toolset": "hermes-mattermost"},
}
@@ -186,6 +196,14 @@ TOOL_CATEGORIES = {
{"key": "FIRECRAWL_API_KEY", "prompt": "Firecrawl API key", "url": "https://firecrawl.dev"},
],
},
{
"name": "Exa",
"tag": "AI-native search and contents",
"web_backend": "exa",
"env_vars": [
{"key": "EXA_API_KEY", "prompt": "Exa API key", "url": "https://exa.ai"},
],
},
{
"name": "Parallel",
"tag": "AI-native search and extract",
@@ -255,6 +273,16 @@ TOOL_CATEGORIES = {
"browser_provider": "browser-use",
"post_setup": "browserbase",
},
{
"name": "Camofox",
"tag": "Local anti-detection browser (Firefox/Camoufox)",
"env_vars": [
{"key": "CAMOFOX_URL", "prompt": "Camofox server URL", "default": "http://localhost:9377",
"url": "https://github.com/jo-inc/camofox-browser"},
],
"browser_provider": "camofox",
"post_setup": "camofox",
},
],
},
"homeassistant": {
@@ -314,10 +342,33 @@ def _run_post_setup(post_setup_key: str):
if result.returncode == 0:
_print_success(" Node.js dependencies installed")
else:
_print_warning(" npm install failed - run manually: cd ~/.hermes/hermes-agent && npm install")
from hermes_constants import display_hermes_home
_print_warning(f" npm install failed - run manually: cd {display_hermes_home()}/hermes-agent && npm install")
elif not node_modules.exists():
_print_warning(" Node.js not found - browser tools require: npm install (in hermes-agent directory)")
elif post_setup_key == "camofox":
camofox_dir = PROJECT_ROOT / "node_modules" / "@askjo" / "camoufox-browser"
if not camofox_dir.exists() and shutil.which("npm"):
_print_info(" Installing Camofox browser server...")
import subprocess
result = subprocess.run(
["npm", "install", "--silent"],
capture_output=True, text=True, cwd=str(PROJECT_ROOT)
)
if result.returncode == 0:
_print_success(" Camofox installed")
else:
_print_warning(" npm install failed - run manually: npm install")
if camofox_dir.exists():
_print_info(" Start the Camofox server:")
_print_info(" npx @askjo/camoufox-browser")
_print_info(" First run downloads the Camoufox engine (~300MB)")
_print_info(" Or use Docker: docker run -p 9377:9377 jo-inc/camofox-browser")
elif not shutil.which("npm"):
_print_warning(" Node.js not found. Install Camofox via Docker:")
_print_info(" docker run -p 9377:9377 jo-inc/camofox-browser")
elif post_setup_key == "rl_training":
try:
__import__("tinker_atropos")
@@ -546,7 +597,9 @@ def _toolset_has_keys(ts_key: str) -> bool:
if cat:
for provider in cat.get("providers", []):
env_vars = provider.get("env_vars", [])
if env_vars and all(get_env_value(e["key"]) for e in env_vars):
if not env_vars:
return True # No-key provider (e.g. Local Browser, Edge TTS)
if all(get_env_value(e["key"]) for e in env_vars):
return True
return False
@@ -640,9 +693,61 @@ def _prompt_choice(question: str, choices: list, default: int = 0) -> int:
return default
# ─── Token Estimation ────────────────────────────────────────────────────────
# Module-level cache so discovery + tokenization runs at most once per process.
_tool_token_cache: Optional[Dict[str, int]] = None
def _estimate_tool_tokens() -> Dict[str, int]:
"""Return estimated token counts per individual tool name.
Uses tiktoken (cl100k_base) to count tokens in the JSON-serialised
OpenAI-format tool schema. Triggers tool discovery on first call,
then caches the result for the rest of the process.
Returns an empty dict when tiktoken or the registry is unavailable.
"""
global _tool_token_cache
if _tool_token_cache is not None:
return _tool_token_cache
try:
import tiktoken
enc = tiktoken.get_encoding("cl100k_base")
except Exception:
logger.debug("tiktoken unavailable; skipping tool token estimation")
_tool_token_cache = {}
return _tool_token_cache
try:
# Trigger full tool discovery (imports all tool modules).
import model_tools # noqa: F401
from tools.registry import registry
except Exception:
logger.debug("Tool registry unavailable; skipping token estimation")
_tool_token_cache = {}
return _tool_token_cache
counts: Dict[str, int] = {}
for name in registry.get_all_tool_names():
schema = registry.get_schema(name)
if schema:
# Mirror what gets sent to the API:
# {"type": "function", "function": <schema>}
text = _json.dumps({"type": "function", "function": schema})
counts[name] = len(enc.encode(text))
_tool_token_cache = counts
return _tool_token_cache
def _prompt_toolset_checklist(platform_label: str, enabled: Set[str]) -> Set[str]:
"""Multi-select checklist of toolsets. Returns set of selected toolset keys."""
from hermes_cli.curses_ui import curses_checklist
from toolsets import resolve_toolset
# Pre-compute per-tool token counts (cached after first call).
tool_tokens = _estimate_tool_tokens()
effective = _get_effective_configurable_toolsets()
@@ -658,11 +763,27 @@ def _prompt_toolset_checklist(platform_label: str, enabled: Set[str]) -> Set[str
if ts_key in enabled
}
# Build a live status function that shows deduplicated total token cost.
status_fn = None
if tool_tokens:
ts_keys = [ts_key for ts_key, _, _ in effective]
def status_fn(chosen: set) -> str:
# Collect unique tool names across all selected toolsets
all_tools: set = set()
for idx in chosen:
all_tools.update(resolve_toolset(ts_keys[idx]))
total = sum(tool_tokens.get(name, 0) for name in all_tools)
if total >= 1000:
return f"Est. tool context: ~{total / 1000:.1f}k tokens"
return f"Est. tool context: ~{total} tokens"
chosen = curses_checklist(
f"Tools for {platform_label}",
labels,
pre_selected,
cancel_returns=pre_selected,
status_fn=status_fn,
)
return {effective[i][0] for i in chosen}
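The status line above sums token estimates over the union of tools, so a tool shared by two toolsets is counted once. A minimal sketch of that idea follows; the toolset names, tool names, and token numbers are hypothetical, and `format_tool_context` is an illustrative stand-in, not the function in this diff.

```python
from typing import Dict, List, Set

# Hypothetical data: toolset key -> tool names, tool name -> estimated tokens.
TOOLSETS: Dict[str, List[str]] = {
    "web": ["web_search", "web_extract"],
    "research": ["web_search", "summarize"],
}
TOOL_TOKENS: Dict[str, int] = {
    "web_search": 400,
    "web_extract": 350,
    "summarize": 450,
}


def format_tool_context(selected: Set[str]) -> str:
    """Sum token estimates over the union of tools in the selected toolsets."""
    unique_tools: Set[str] = set()
    for ts_key in selected:
        unique_tools.update(TOOLSETS.get(ts_key, []))
    # Unknown tools contribute 0, mirroring tool_tokens.get(name, 0) in the diff.
    total = sum(TOOL_TOKENS.get(name, 0) for name in unique_tools)
    if total >= 1000:
        return f"Est. tool context: ~{total / 1000:.1f}k tokens"
    return f"Est. tool context: ~{total} tokens"


print(format_tool_context({"web"}))
print(format_tool_context({"web", "research"}))
```

Selecting both toolsets counts `web_search` once, which is why the total is smaller than the sum of the two per-toolset totals.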
@@ -1252,7 +1373,8 @@ def tools_command(args=None, first_install: bool = False, config: dict = None):
platform_choices[idx] = f"Configure {pinfo['label']} ({new_count}/{total} enabled)"
print()
print(color(" Tool configuration saved to ~/.hermes/config.yaml", Colors.DIM))
from hermes_constants import display_hermes_home
print(color(f" Tool configuration saved to {display_hermes_home()}/config.yaml", Colors.DIM))
print(color(" Changes take effect on next 'hermes' or gateway restart.", Colors.DIM))
print()
+260
@@ -0,0 +1,260 @@
"""hermes webhook — manage dynamic webhook subscriptions from the CLI.
Usage:
hermes webhook subscribe <name> [options]
hermes webhook list
hermes webhook remove <name>
hermes webhook test <name> [--payload '{"key": "value"}']
Subscriptions persist to ~/.hermes/webhook_subscriptions.json and are
hot-reloaded by the webhook adapter without a gateway restart.
"""
import json
import os
import re
import secrets
import time
from pathlib import Path
from typing import Dict, Optional
from hermes_constants import display_hermes_home
_SUBSCRIPTIONS_FILENAME = "webhook_subscriptions.json"
def _hermes_home() -> Path:
return Path(
os.getenv("HERMES_HOME", str(Path.home() / ".hermes"))
).expanduser()
def _subscriptions_path() -> Path:
return _hermes_home() / _SUBSCRIPTIONS_FILENAME
def _load_subscriptions() -> Dict[str, dict]:
path = _subscriptions_path()
if not path.exists():
return {}
try:
data = json.loads(path.read_text(encoding="utf-8"))
return data if isinstance(data, dict) else {}
except Exception:
return {}
def _save_subscriptions(subs: Dict[str, dict]) -> None:
path = _subscriptions_path()
path.parent.mkdir(parents=True, exist_ok=True)
tmp_path = path.with_suffix(".tmp")
tmp_path.write_text(
json.dumps(subs, indent=2, ensure_ascii=False),
encoding="utf-8",
)
os.replace(str(tmp_path), str(path))
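The save helper above writes to a sibling `.tmp` file and then swaps it into place, so a reader that reloads the file should not observe a half-written JSON document. A self-contained sketch of the same pattern, run against a temporary directory; `save_json_atomically` is a hypothetical name used only for illustration.

```python
import json
import os
import tempfile
from pathlib import Path


def save_json_atomically(path: Path, data: dict) -> None:
    """Write JSON to a sibling temp file, then rename it over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp_path = path.with_suffix(".tmp")
    tmp_path.write_text(
        json.dumps(data, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    # os.replace overwrites the destination; on POSIX the rename is atomic
    # when both paths live on the same filesystem.
    os.replace(tmp_path, path)


with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "webhook_subscriptions.json"
    save_json_atomically(target, {"demo": {"deliver": "log"}})
    loaded = json.loads(target.read_text(encoding="utf-8"))
    leftover_tmp = target.with_suffix(".tmp").exists()
```

Keeping the temp file in the same directory as the target matters: a rename across filesystems would not have the same guarantee.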
def _get_webhook_config() -> dict:
"""Load webhook platform config. Returns {} if not configured."""
try:
from hermes_cli.config import load_config
cfg = load_config()
return cfg.get("platforms", {}).get("webhook", {})
except Exception:
return {}
def _is_webhook_enabled() -> bool:
return bool(_get_webhook_config().get("enabled"))
def _get_webhook_base_url() -> str:
wh = _get_webhook_config().get("extra", {})
host = wh.get("host", "0.0.0.0")
port = wh.get("port", 8644)
display_host = "localhost" if host == "0.0.0.0" else host
return f"http://{display_host}:{port}"
def _setup_hint() -> str:
_dhh = display_hermes_home()
return f"""
Webhook platform is not enabled. To set it up:
1. Run the gateway setup wizard:
hermes gateway setup
2. Or manually add to {_dhh}/config.yaml:
platforms:
webhook:
enabled: true
extra:
host: "0.0.0.0"
port: 8644
secret: "your-global-hmac-secret"
3. Or set environment variables in {_dhh}/.env:
WEBHOOK_ENABLED=true
WEBHOOK_PORT=8644
WEBHOOK_SECRET=your-global-secret
Then start the gateway: hermes gateway run
"""
def _require_webhook_enabled() -> bool:
"""Check webhook is enabled. Print setup guide and return False if not."""
if _is_webhook_enabled():
return True
print(_setup_hint())
return False
def webhook_command(args):
"""Entry point for 'hermes webhook' subcommand."""
sub = getattr(args, "webhook_action", None)
if not sub:
print("Usage: hermes webhook {subscribe|list|remove|test}")
print("Run 'hermes webhook --help' for details.")
return
if not _require_webhook_enabled():
return
if sub in ("subscribe", "add"):
_cmd_subscribe(args)
elif sub in ("list", "ls"):
_cmd_list(args)
elif sub in ("remove", "rm"):
_cmd_remove(args)
elif sub == "test":
_cmd_test(args)
def _cmd_subscribe(args):
name = args.name.strip().lower().replace(" ", "-")
if not re.match(r'^[a-z0-9][a-z0-9_-]*$', name):
print(f"Error: Invalid name '{name}'. Use lowercase alphanumeric with hyphens/underscores.")
return
subs = _load_subscriptions()
is_update = name in subs
secret = args.secret or secrets.token_urlsafe(32)
events = [e.strip() for e in args.events.split(",")] if args.events else []
route = {
"description": args.description or f"Agent-created subscription: {name}",
"events": events,
"secret": secret,
"prompt": args.prompt or "",
"skills": [s.strip() for s in args.skills.split(",")] if args.skills else [],
"deliver": args.deliver or "log",
"created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
}
if args.deliver_chat_id:
route["deliver_extra"] = {"chat_id": args.deliver_chat_id}
subs[name] = route
_save_subscriptions(subs)
base_url = _get_webhook_base_url()
status = "Updated" if is_update else "Created"
print(f"\n {status} webhook subscription: {name}")
print(f" URL: {base_url}/webhooks/{name}")
print(f" Secret: {secret}")
if events:
print(f" Events: {', '.join(events)}")
else:
print(" Events: (all)")
print(f" Deliver: {route['deliver']}")
if route.get("prompt"):
prompt_preview = route["prompt"][:80] + ("..." if len(route["prompt"]) > 80 else "")
print(f" Prompt: {prompt_preview}")
print(f"\n Configure your service to POST to the URL above.")
print(f" Use the secret for HMAC-SHA256 signature validation.")
print(f" The gateway must be running to receive events (hermes gateway run).\n")
def _cmd_list(args):
subs = _load_subscriptions()
if not subs:
print(" No dynamic webhook subscriptions.")
print(" Create one with: hermes webhook subscribe <name>")
return
base_url = _get_webhook_base_url()
print(f"\n {len(subs)} webhook subscription(s):\n")
for name, route in subs.items():
events = ", ".join(route.get("events", [])) or "(all)"
deliver = route.get("deliver", "log")
desc = route.get("description", "")
print(f"{name}")
if desc:
print(f" {desc}")
print(f" URL: {base_url}/webhooks/{name}")
print(f" Events: {events}")
print(f" Deliver: {deliver}")
print()
def _cmd_remove(args):
name = args.name.strip().lower()
subs = _load_subscriptions()
if name not in subs:
print(f" No subscription named '{name}'.")
print(" Note: Static routes from config.yaml cannot be removed here.")
return
del subs[name]
_save_subscriptions(subs)
print(f" Removed webhook subscription: {name}")
def _cmd_test(args):
"""Send a test POST to a webhook route."""
name = args.name.strip().lower()
subs = _load_subscriptions()
if name not in subs:
print(f" No subscription named '{name}'.")
return
route = subs[name]
secret = route.get("secret", "")
base_url = _get_webhook_base_url()
url = f"{base_url}/webhooks/{name}"
payload = args.payload or '{"test": true, "event_type": "test", "message": "Hello from hermes webhook test"}'
import hmac
import hashlib
sig = "sha256=" + hmac.new(
secret.encode(), payload.encode(), hashlib.sha256
).hexdigest()
print(f" Sending test POST to {url}")
try:
import urllib.request
req = urllib.request.Request(
url,
data=payload.encode(),
headers={
"Content-Type": "application/json",
"X-Hub-Signature-256": sig,
"X-GitHub-Event": "test",
},
method="POST",
)
with urllib.request.urlopen(req, timeout=10) as resp:
body = resp.read().decode()
print(f" Response ({resp.status}): {body}")
except Exception as e:
print(f" Error: {e}")
print(" Is the gateway running? (hermes gateway run)")
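The test command signs the payload with HMAC-SHA256 and sends it as `sha256=<hex>` in `X-Hub-Signature-256`. The receiving side is not part of this file, so the following is only a sketch of how such a signature could be checked, assuming the receiver recomputes the digest over the raw request body; `sign_payload` and `verify_signature` are hypothetical helper names.

```python
import hashlib
import hmac


def sign_payload(secret: str, payload: bytes) -> str:
    """Build the header value the same way the test command does."""
    digest = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret: str, payload: bytes, header_value: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_payload(secret, payload)
    return hmac.compare_digest(expected.encode(), header_value.encode())


body = b'{"test": true, "event_type": "test"}'
signature = sign_payload("s3cret", body)
print(verify_signature("s3cret", body, signature))
print(verify_signature("wrong-secret", body, signature))
```

`hmac.compare_digest` is used instead of `==` so the comparison time does not depend on where the two strings first differ.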
+55
@@ -17,6 +17,61 @@ def get_hermes_home() -> Path:
return Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
def get_optional_skills_dir(default: Path | None = None) -> Path:
"""Return the optional-skills directory, honoring package-manager wrappers.
Packaged installs may ship ``optional-skills`` outside the Python package
tree and expose it via ``HERMES_OPTIONAL_SKILLS``.
"""
override = os.getenv("HERMES_OPTIONAL_SKILLS", "").strip()
if override:
return Path(override)
if default is not None:
return default
return get_hermes_home() / "optional-skills"
def get_hermes_dir(new_subpath: str, old_name: str) -> Path:
"""Resolve a Hermes subdirectory with backward compatibility.
New installs get the consolidated layout (e.g. ``cache/images``).
Existing installs that already have the old path (e.g. ``image_cache``)
keep using it; no migration is required.
Args:
new_subpath: Preferred path relative to HERMES_HOME (e.g. ``"cache/images"``).
old_name: Legacy path relative to HERMES_HOME (e.g. ``"image_cache"``).
Returns:
Absolute ``Path``: the old location if it exists on disk, otherwise the new one.
"""
home = get_hermes_home()
old_path = home / old_name
if old_path.exists():
return old_path
return home / new_subpath
def display_hermes_home() -> str:
"""Return a user-friendly display string for the current HERMES_HOME.
Uses ``~/`` shorthand for readability::
default: ``~/.hermes``
profile: ``~/.hermes/profiles/coder``
custom: ``/opt/hermes-custom``
Use this in **user-facing** print/log messages instead of hardcoding
``~/.hermes``. For code that needs a real ``Path``, use
:func:`get_hermes_home` instead.
"""
home = get_hermes_home()
try:
return "~/" + str(home.relative_to(Path.home()))
except ValueError:
return str(home)
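The three cases listed in the `display_hermes_home` docstring can be reproduced with a small standalone sketch. It takes both paths as arguments and uses `PurePosixPath` so the result does not depend on the machine it runs on; `display_home` and the `/home/ada` directory are hypothetical.

```python
from pathlib import PurePosixPath


def display_home(home: PurePosixPath, user_home: PurePosixPath) -> str:
    """Return home with a ~/ prefix when it sits under user_home."""
    try:
        return "~/" + str(home.relative_to(user_home))
    except ValueError:
        # relative_to raises ValueError when home is not under user_home.
        return str(home)


user_home = PurePosixPath("/home/ada")
print(display_home(PurePosixPath("/home/ada/.hermes"), user_home))
print(display_home(PurePosixPath("/home/ada/.hermes/profiles/coder"), user_home))
print(display_home(PurePosixPath("/opt/hermes-custom"), user_home))
```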
VALID_REASONING_EFFORTS = ("xhigh", "high", "medium", "low", "minimal")
+304 -84
@@ -15,15 +15,20 @@ Key design decisions:
"""
import json
import logging
import os
import random
import re
import sqlite3
import threading
import time
from pathlib import Path
from hermes_constants import get_hermes_home
from typing import Dict, Any, List, Optional
from typing import Any, Callable, Dict, List, Optional, TypeVar
logger = logging.getLogger(__name__)
T = TypeVar("T")
DEFAULT_DB_PATH = get_hermes_home() / "state.db"
@@ -116,18 +121,38 @@ class SessionDB:
single writer via WAL mode). Each method opens its own cursor.
"""
# ── Write-contention tuning ──
# With multiple hermes processes (gateway + CLI sessions + worktree agents)
# all sharing one state.db, WAL write-lock contention causes visible TUI
# freezes. SQLite's built-in busy handler uses a deterministic sleep
# schedule that causes convoy effects under high concurrency.
#
# Instead, we keep the SQLite timeout short (1s) and handle retries at the
# application level with random jitter, which naturally staggers competing
# writers and avoids the convoy.
_WRITE_MAX_RETRIES = 15
_WRITE_RETRY_MIN_S = 0.020 # 20ms
_WRITE_RETRY_MAX_S = 0.150 # 150ms
# Attempt a PASSIVE WAL checkpoint every N successful writes.
_CHECKPOINT_EVERY_N_WRITES = 50
def __init__(self, db_path: Path = None):
self.db_path = db_path or DEFAULT_DB_PATH
self.db_path.parent.mkdir(parents=True, exist_ok=True)
self._lock = threading.Lock()
self._write_count = 0
self._conn = sqlite3.connect(
str(self.db_path),
check_same_thread=False,
# 30s gives the WAL writer (CLI or gateway) time to finish a batch
# flush before the concurrent reader/writer gives up. 10s was too
# short when the CLI is doing frequent memory flushes.
timeout=30.0,
# Short timeout — application-level retry with random jitter
# handles contention instead of sitting in SQLite's internal
# busy handler for up to 30s.
timeout=1.0,
# Autocommit mode: Python's default isolation_level="" auto-starts
# transactions on DML, which conflicts with our explicit
# BEGIN IMMEDIATE. None = we manage transactions ourselves.
isolation_level=None,
)
self._conn.row_factory = sqlite3.Row
self._conn.execute("PRAGMA journal_mode=WAL")
@@ -135,6 +160,96 @@ class SessionDB:
self._init_schema()
# ── Core write helper ──
def _execute_write(self, fn: Callable[[sqlite3.Connection], T]) -> T:
"""Execute a write transaction with BEGIN IMMEDIATE and jitter retry.
*fn* receives the connection and should perform INSERT/UPDATE/DELETE
statements. The caller must NOT call ``commit()``; that's handled
here after *fn* returns.
BEGIN IMMEDIATE acquires the WAL write lock at transaction start
(not at commit time), so lock contention surfaces immediately.
On ``database is locked``, we release the Python lock, sleep a
random 20-150ms, and retry, breaking the convoy pattern that
SQLite's built-in deterministic backoff creates.
Returns whatever *fn* returns.
"""
last_err: Optional[Exception] = None
for attempt in range(self._WRITE_MAX_RETRIES):
try:
with self._lock:
self._conn.execute("BEGIN IMMEDIATE")
try:
result = fn(self._conn)
self._conn.commit()
except BaseException:
try:
self._conn.rollback()
except Exception:
pass
raise
# Success — periodic best-effort checkpoint.
self._write_count += 1
if self._write_count % self._CHECKPOINT_EVERY_N_WRITES == 0:
self._try_wal_checkpoint()
return result
except sqlite3.OperationalError as exc:
err_msg = str(exc).lower()
if "locked" in err_msg or "busy" in err_msg:
last_err = exc
if attempt < self._WRITE_MAX_RETRIES - 1:
jitter = random.uniform(
self._WRITE_RETRY_MIN_S,
self._WRITE_RETRY_MAX_S,
)
time.sleep(jitter)
continue
# Non-lock error or retries exhausted — propagate.
raise
# Retries exhausted (shouldn't normally reach here).
raise last_err or sqlite3.OperationalError(
"database is locked after max retries"
)
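The write helper above can be tried in isolation. The sketch below is a simplified standalone version, assuming a single connection and no thread lock or WAL checkpointing; `execute_write` and the one-column `sessions` table are illustrative only, and an in-memory database will not actually produce lock contention, so only the commit and rollback paths are exercised.

```python
import random
import sqlite3
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def execute_write(
    conn: sqlite3.Connection,
    fn: Callable[[sqlite3.Connection], T],
    max_retries: int = 15,
) -> T:
    """Run fn inside BEGIN IMMEDIATE, retrying lock errors with random jitter."""
    for attempt in range(max_retries):
        try:
            conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
            try:
                result = fn(conn)
                conn.commit()
            except BaseException:
                conn.rollback()
                raise
            return result
        except sqlite3.OperationalError as exc:
            msg = str(exc).lower()
            retryable = "locked" in msg or "busy" in msg
            if not retryable or attempt == max_retries - 1:
                raise
            # Random sleep staggers competing writers instead of a fixed schedule.
            time.sleep(random.uniform(0.020, 0.150))
    raise sqlite3.OperationalError("database is locked after max retries")


# isolation_level=None: we issue BEGIN ourselves, as in the diff.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY)")
row_id = execute_write(
    conn, lambda c: c.execute("INSERT INTO sessions VALUES ('s1')").lastrowid
)
```

Because the callback only issues statements and never commits, a failure anywhere inside it leaves the database unchanged.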
def _try_wal_checkpoint(self) -> None:
"""Best-effort PASSIVE WAL checkpoint. Never blocks, never raises.
Flushes committed WAL frames back into the main DB file for any
frames that no other connection currently needs. Keeps the WAL
from growing unbounded when many processes hold persistent
connections.
"""
try:
with self._lock:
result = self._conn.execute(
"PRAGMA wal_checkpoint(PASSIVE)"
).fetchone()
if result and result[1] > 0:
logger.debug(
"WAL checkpoint: %d/%d pages checkpointed",
result[2], result[1],
)
except Exception:
pass # Best effort — never fatal.
def close(self):
"""Close the database connection.
Attempts a PASSIVE WAL checkpoint first so that exiting processes
help keep the WAL file from growing unbounded.
"""
with self._lock:
if self._conn:
try:
self._conn.execute("PRAGMA wal_checkpoint(PASSIVE)")
except Exception:
pass
self._conn.close()
self._conn = None
def _init_schema(self):
"""Create tables and FTS if they don't exist, run migrations."""
cursor = self._conn.cursor()
@@ -256,8 +371,8 @@ class SessionDB:
parent_session_id: str = None,
) -> str:
"""Create a new session record. Returns the session_id."""
with self._lock:
self._conn.execute(
def _do(conn):
conn.execute(
"""INSERT OR IGNORE INTO sessions (id, source, user_id, model, model_config,
system_prompt, parent_session_id, started_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?)""",
@@ -272,26 +387,35 @@ class SessionDB:
time.time(),
),
)
self._conn.commit()
self._execute_write(_do)
return session_id
def end_session(self, session_id: str, end_reason: str) -> None:
"""Mark a session as ended."""
with self._lock:
self._conn.execute(
def _do(conn):
conn.execute(
"UPDATE sessions SET ended_at = ?, end_reason = ? WHERE id = ?",
(time.time(), end_reason, session_id),
)
self._conn.commit()
self._execute_write(_do)
def reopen_session(self, session_id: str) -> None:
"""Clear ended_at/end_reason so a session can be resumed."""
def _do(conn):
conn.execute(
"UPDATE sessions SET ended_at = NULL, end_reason = NULL WHERE id = ?",
(session_id,),
)
self._execute_write(_do)
def update_system_prompt(self, session_id: str, system_prompt: str) -> None:
"""Store the full assembled system prompt snapshot."""
with self._lock:
self._conn.execute(
def _do(conn):
conn.execute(
"UPDATE sessions SET system_prompt = ? WHERE id = ?",
(system_prompt, session_id),
)
self._conn.commit()
self._execute_write(_do)
def update_token_counts(
self,
@@ -310,11 +434,39 @@ class SessionDB:
billing_provider: Optional[str] = None,
billing_base_url: Optional[str] = None,
billing_mode: Optional[str] = None,
absolute: bool = False,
) -> None:
"""Increment token counters and backfill model if not already set."""
with self._lock:
self._conn.execute(
"""UPDATE sessions SET
"""Update token counters and backfill model if not already set.
When *absolute* is False (default), values are **incremented**; use
this for per-API-call deltas (CLI path).
When *absolute* is True, values are **set directly**; use this when
the caller already holds cumulative totals (gateway path, where the
cached agent accumulates across messages).
"""
if absolute:
sql = """UPDATE sessions SET
input_tokens = ?,
output_tokens = ?,
cache_read_tokens = ?,
cache_write_tokens = ?,
reasoning_tokens = ?,
estimated_cost_usd = COALESCE(?, 0),
actual_cost_usd = CASE
WHEN ? IS NULL THEN actual_cost_usd
ELSE ?
END,
cost_status = COALESCE(?, cost_status),
cost_source = COALESCE(?, cost_source),
pricing_version = COALESCE(?, pricing_version),
billing_provider = COALESCE(billing_provider, ?),
billing_base_url = COALESCE(billing_base_url, ?),
billing_mode = COALESCE(billing_mode, ?),
model = COALESCE(model, ?)
WHERE id = ?"""
else:
sql = """UPDATE sessions SET
input_tokens = input_tokens + ?,
output_tokens = output_tokens + ?,
cache_read_tokens = cache_read_tokens + ?,
@@ -332,6 +484,94 @@ class SessionDB:
billing_base_url = COALESCE(billing_base_url, ?),
billing_mode = COALESCE(billing_mode, ?),
model = COALESCE(model, ?)
WHERE id = ?"""
params = (
input_tokens,
output_tokens,
cache_read_tokens,
cache_write_tokens,
reasoning_tokens,
estimated_cost_usd,
actual_cost_usd,
actual_cost_usd,
cost_status,
cost_source,
pricing_version,
billing_provider,
billing_base_url,
billing_mode,
model,
session_id,
)
def _do(conn):
conn.execute(sql, params)
self._execute_write(_do)
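The difference between the incrementing and absolute modes is easiest to see on a single counter. The sketch below reduces the statement to one column on an in-memory table; `update_input_tokens` and the trimmed schema are assumptions made for illustration, not the real `sessions` schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id TEXT PRIMARY KEY, input_tokens INTEGER DEFAULT 0)"
)
conn.execute("INSERT INTO sessions (id) VALUES ('s1')")
conn.commit()


def update_input_tokens(session_id: str, tokens: int, absolute: bool = False) -> int:
    """Apply a delta (default) or overwrite with a cumulative total."""
    if absolute:
        sql = "UPDATE sessions SET input_tokens = ? WHERE id = ?"
    else:
        sql = "UPDATE sessions SET input_tokens = input_tokens + ? WHERE id = ?"
    conn.execute(sql, (tokens, session_id))
    conn.commit()
    row = conn.execute(
        "SELECT input_tokens FROM sessions WHERE id = ?", (session_id,)
    ).fetchone()
    return row[0]


after_delta_1 = update_input_tokens("s1", 100)                 # per-call delta
after_delta_2 = update_input_tokens("s1", 50)                  # another delta
after_absolute = update_input_tokens("s1", 500, absolute=True) # running total
```

Passing a cumulative total through the incrementing path would double count, which is the situation the `absolute` flag is meant to avoid.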
def ensure_session(
self,
session_id: str,
source: str = "unknown",
model: str = None,
) -> None:
"""Ensure a session row exists, creating it with minimal metadata if absent.
Used by _flush_messages_to_session_db to recover from a failed
create_session() call (e.g. transient SQLite lock at agent startup).
INSERT OR IGNORE is safe to call even when the row already exists.
"""
def _do(conn):
conn.execute(
"""INSERT OR IGNORE INTO sessions
(id, source, model, started_at)
VALUES (?, ?, ?, ?)""",
(session_id, source, model, time.time()),
)
self._execute_write(_do)
def set_token_counts(
self,
session_id: str,
input_tokens: int = 0,
output_tokens: int = 0,
model: str = None,
cache_read_tokens: int = 0,
cache_write_tokens: int = 0,
reasoning_tokens: int = 0,
estimated_cost_usd: Optional[float] = None,
actual_cost_usd: Optional[float] = None,
cost_status: Optional[str] = None,
cost_source: Optional[str] = None,
pricing_version: Optional[str] = None,
billing_provider: Optional[str] = None,
billing_base_url: Optional[str] = None,
billing_mode: Optional[str] = None,
) -> None:
"""Set token counters to absolute values (not increment).
Use this when the caller provides cumulative totals from a completed
conversation run (e.g. the gateway, where the cached agent's
session_prompt_tokens already reflects the running total).
"""
def _do(conn):
conn.execute(
"""UPDATE sessions SET
input_tokens = ?,
output_tokens = ?,
cache_read_tokens = ?,
cache_write_tokens = ?,
reasoning_tokens = ?,
estimated_cost_usd = ?,
actual_cost_usd = CASE
WHEN ? IS NULL THEN actual_cost_usd
ELSE ?
END,
cost_status = COALESCE(?, cost_status),
cost_source = COALESCE(?, cost_source),
pricing_version = COALESCE(?, pricing_version),
billing_provider = COALESCE(billing_provider, ?),
billing_base_url = COALESCE(billing_base_url, ?),
billing_mode = COALESCE(billing_mode, ?),
model = COALESCE(model, ?)
WHERE id = ?""",
(
input_tokens,
@@ -352,28 +592,7 @@ class SessionDB:
session_id,
),
)
self._conn.commit()
def ensure_session(
self,
session_id: str,
source: str = "unknown",
model: str = None,
) -> None:
"""Ensure a session row exists, creating it with minimal metadata if absent.
Used by _flush_messages_to_session_db to recover from a failed
create_session() call (e.g. transient SQLite lock at agent startup).
INSERT OR IGNORE is safe to call even when the row already exists.
"""
with self._lock:
self._conn.execute(
"""INSERT OR IGNORE INTO sessions
(id, source, model, started_at)
VALUES (?, ?, ?, ?)""",
(session_id, source, model, time.time()),
)
self._conn.commit()
self._execute_write(_do)
def get_session(self, session_id: str) -> Optional[Dict[str, Any]]:
"""Get a session by ID."""
@@ -467,10 +686,10 @@ class SessionDB:
Empty/whitespace-only strings are normalized to None (clearing the title).
"""
title = self.sanitize_title(title)
with self._lock:
def _do(conn):
if title:
# Check uniqueness (allow the same session to keep its own title)
cursor = self._conn.execute(
cursor = conn.execute(
"SELECT id FROM sessions WHERE title = ? AND id != ?",
(title, session_id),
)
@@ -479,12 +698,12 @@ class SessionDB:
raise ValueError(
f"Title '{title}' is already in use by session {conflict['id']}"
)
cursor = self._conn.execute(
cursor = conn.execute(
"UPDATE sessions SET title = ? WHERE id = ?",
(title, session_id),
)
self._conn.commit()
rowcount = cursor.rowcount
return cursor.rowcount
rowcount = self._execute_write(_do)
return rowcount > 0
def get_session_title(self, session_id: str) -> Optional[str]:
@@ -656,17 +875,24 @@ class SessionDB:
Also increments the session's message_count (and tool_call_count
if role is 'tool' or tool_calls is present).
"""
with self._lock:
# Serialize structured fields to JSON for storage
reasoning_details_json = (
json.dumps(reasoning_details)
if reasoning_details else None
)
codex_items_json = (
json.dumps(codex_reasoning_items)
if codex_reasoning_items else None
)
cursor = self._conn.execute(
# Serialize structured fields to JSON before entering the write txn
reasoning_details_json = (
json.dumps(reasoning_details)
if reasoning_details else None
)
codex_items_json = (
json.dumps(codex_reasoning_items)
if codex_reasoning_items else None
)
tool_calls_json = json.dumps(tool_calls) if tool_calls else None
# Pre-compute tool call count
num_tool_calls = 0
if tool_calls is not None:
num_tool_calls = len(tool_calls) if isinstance(tool_calls, list) else 1
def _do(conn):
cursor = conn.execute(
"""INSERT INTO messages (session_id, role, content, tool_call_id,
tool_calls, tool_name, timestamp, token_count, finish_reason,
reasoning, reasoning_details, codex_reasoning_items)
@@ -676,7 +902,7 @@ class SessionDB:
role,
content,
tool_call_id,
json.dumps(tool_calls) if tool_calls else None,
tool_calls_json,
tool_name,
time.time(),
token_count,
@@ -689,25 +915,20 @@ class SessionDB:
msg_id = cursor.lastrowid
# Update counters
# Count actual tool calls from the tool_calls list (not from tool responses).
# A single assistant message can contain multiple parallel tool calls.
num_tool_calls = 0
if tool_calls is not None:
num_tool_calls = len(tool_calls) if isinstance(tool_calls, list) else 1
if num_tool_calls > 0:
self._conn.execute(
conn.execute(
"""UPDATE sessions SET message_count = message_count + 1,
tool_call_count = tool_call_count + ? WHERE id = ?""",
(num_tool_calls, session_id),
)
else:
self._conn.execute(
conn.execute(
"UPDATE sessions SET message_count = message_count + 1 WHERE id = ?",
(session_id,),
)
return msg_id
self._conn.commit()
return msg_id
return self._execute_write(_do)
def get_messages(self, session_id: str) -> List[Dict[str, Any]]:
"""Load all messages for a session, ordered by timestamp."""
@@ -1001,54 +1222,53 @@ class SessionDB:
def clear_messages(self, session_id: str) -> None:
"""Delete all messages for a session and reset its counters."""
with self._lock:
self._conn.execute(
def _do(conn):
conn.execute(
"DELETE FROM messages WHERE session_id = ?", (session_id,)
)
self._conn.execute(
conn.execute(
"UPDATE sessions SET message_count = 0, tool_call_count = 0 WHERE id = ?",
(session_id,),
)
self._conn.commit()
self._execute_write(_do)
def delete_session(self, session_id: str) -> bool:
"""Delete a session and all its messages. Returns True if found."""
with self._lock:
cursor = self._conn.execute(
def _do(conn):
cursor = conn.execute(
"SELECT COUNT(*) FROM sessions WHERE id = ?", (session_id,)
)
if cursor.fetchone()[0] == 0:
return False
self._conn.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))
self._conn.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
self._conn.commit()
conn.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))
conn.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
return True
return self._execute_write(_do)
def prune_sessions(self, older_than_days: int = 90, source: str = None) -> int:
"""
Delete sessions older than N days. Returns count of deleted sessions.
Only prunes ended sessions (not active ones).
"""
import time as _time
cutoff = _time.time() - (older_than_days * 86400)
cutoff = time.time() - (older_than_days * 86400)
with self._lock:
def _do(conn):
if source:
cursor = self._conn.execute(
cursor = conn.execute(
"""SELECT id FROM sessions
WHERE started_at < ? AND ended_at IS NOT NULL AND source = ?""",
(cutoff, source),
)
else:
cursor = self._conn.execute(
cursor = conn.execute(
"SELECT id FROM sessions WHERE started_at < ? AND ended_at IS NOT NULL",
(cutoff,),
)
session_ids = [row["id"] for row in cursor.fetchall()]
for sid in session_ids:
self._conn.execute("DELETE FROM messages WHERE session_id = ?", (sid,))
self._conn.execute("DELETE FROM sessions WHERE id = ?", (sid,))
conn.execute("DELETE FROM messages WHERE session_id = ?", (sid,))
conn.execute("DELETE FROM sessions WHERE id = ?", (sid,))
return len(session_ids)
self._conn.commit()
return len(session_ids)
return self._execute_write(_do)
+24 -10
@@ -10,16 +10,27 @@ import os
import sys
from pathlib import Path
from hermes_constants import get_hermes_home
from honcho_integration.client import resolve_config_path, GLOBAL_CONFIG_PATH
HOST = "hermes"
def _config_path() -> Path:
"""Return the active Honcho config path (instance-local or global)."""
"""Return the active Honcho config path for reading (instance-local or global)."""
return resolve_config_path()
def _local_config_path() -> Path:
"""Return the instance-local Honcho config path for writing.
Always returns $HERMES_HOME/honcho.json so each profile/instance gets
its own config file. The global ~/.honcho/config.json is only used as
a read fallback (via resolve_config_path) for cross-app interop.
"""
return get_hermes_home() / "honcho.json"
def _read_config() -> dict:
path = _config_path()
if path.exists():
@@ -31,7 +42,7 @@ def _read_config() -> dict:
def _write_config(cfg: dict, path: Path | None = None) -> None:
path = path or _config_path()
path = path or _local_config_path()
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(
json.dumps(cfg, indent=2, ensure_ascii=False) + "\n",
@@ -95,13 +106,13 @@ def cmd_setup(args) -> None:
"""Interactive Honcho setup wizard."""
cfg = _read_config()
active_path = _config_path()
write_path = _local_config_path()
read_path = _config_path()
print("\nHoncho memory setup\n" + "─" * 40)
print(" Honcho gives Hermes persistent cross-session memory.")
if active_path != GLOBAL_CONFIG_PATH:
print(f" Instance config: {active_path}")
else:
print(" Config is shared with other hosts at ~/.honcho/config.json")
print(f" Config: {write_path}")
if read_path != write_path and read_path.exists():
print(f" (seeding from existing config at {read_path})")
print()
if not _ensure_sdk_installed():
@@ -189,7 +200,7 @@ def cmd_setup(args) -> None:
hermes_host.setdefault("saveMessages", True)
_write_config(cfg)
print(f"\n Config written to {active_path}")
print(f"\n Config written to {write_path}")
# Test connection
print(" Testing connection... ", end="", flush=True)
@@ -237,6 +248,7 @@ def cmd_status(args) -> None:
cfg = _read_config()
active_path = _config_path()
write_path = _local_config_path()
if not cfg:
print(f" No Honcho config found at {active_path}")
@@ -259,6 +271,8 @@ def cmd_status(args) -> None:
print(f" Workspace: {hcfg.workspace_id}")
print(f" Host: {hcfg.host}")
print(f" Config path: {active_path}")
if write_path != active_path:
print(f" Write path: {write_path} (instance-local)")
print(f" AI peer: {hcfg.ai_peer}")
print(f" User peer: {hcfg.peer_name or 'not set'}")
print(f" Session key: {hcfg.resolve_session_name()}")
@@ -270,7 +284,7 @@ def cmd_status(args) -> None:
print(f" {peer}: {mode}")
print(f" Write freq: {hcfg.write_frequency}")
if hcfg.enabled and hcfg.api_key:
if hcfg.enabled and (hcfg.api_key or hcfg.base_url):
print("\n Connection... ", end="", flush=True)
try:
get_honcho_client(hcfg)
@@ -278,7 +292,7 @@ def cmd_status(args) -> None:
except Exception as e:
print(f"FAILED ({e})\n")
else:
reason = "disabled" if not hcfg.enabled else "no API key"
reason = "disabled" if not hcfg.enabled else "no API key or base URL"
print(f"\n Not connected ({reason})\n")
+10 -1
@@ -417,9 +417,18 @@ def get_honcho_client(config: HonchoClientConfig | None = None) -> Honcho:
else:
logger.info("Initializing Honcho client (host: %s, workspace: %s)", config.host, config.workspace_id)
# Local Honcho instances don't require an API key, but the SDK
# expects a non-empty string. Use a placeholder for local URLs.
_is_local = resolved_base_url and (
"localhost" in resolved_base_url
or "127.0.0.1" in resolved_base_url
or "::1" in resolved_base_url
)
effective_api_key = config.api_key or ("local" if _is_local else None)
kwargs: dict = {
"workspace_id": config.workspace_id,
-"api_key": config.api_key,
+"api_key": effective_api_key,
"environment": config.environment,
}
if resolved_base_url:
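Review note: the substring test in this hunk would also treat a remote host such as `localhost.example.com` as local. Below is a minimal sketch of a stricter check, assuming the base URL always carries a scheme (`http://` or `https://`). The names `is_local_base_url` and `effective_api_key_for` are hypothetical and not part of the Hermes codebase.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlparse


def is_local_base_url(base_url: Optional[str]) -> bool:
    """Return True when base_url points at a loopback host (hypothetical helper)."""
    if not base_url:
        return False
    # .hostname strips the port and the brackets around IPv6 literals.
    host = urlparse(base_url).hostname
    if host is None:
        return False  # e.g. a URL given without a scheme
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # an ordinary DNS name, not an IP literal


def effective_api_key_for(api_key: Optional[str], base_url: Optional[str]) -> Optional[str]:
    """Mirror the hunk's fallback: a placeholder key only for local URLs."""
    return api_key or ("local" if is_local_base_url(base_url) else None)
```

Parsing the host rather than scanning the whole URL keeps paths and query strings that happen to contain `localhost` from triggering the placeholder key.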
+868
@@ -0,0 +1,868 @@
"""
Hermes MCP Server: expose messaging conversations as MCP tools.
Starts a stdio MCP server that lets any MCP client (Claude Code, Cursor, Codex,
etc.) list conversations, read message history, send messages, poll for live
events, and manage approval requests across all connected platforms.
Matches OpenClaw's 9-tool MCP channel bridge surface:
conversations_list, conversation_get, messages_read, attachments_fetch,
events_poll, events_wait, messages_send, permissions_list_open,
permissions_respond
Plus: channels_list (Hermes-specific extra)
Usage:
hermes mcp serve
hermes mcp serve --verbose
MCP client config (e.g. claude_desktop_config.json):
{
"mcpServers": {
"hermes": {
"command": "hermes",
"args": ["mcp", "serve"]
}
}
}
"""
from __future__ import annotations
import json
import logging
import os
import re
import sys
import threading
import time
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional
logger = logging.getLogger("hermes.mcp_serve")
# ---------------------------------------------------------------------------
# Lazy MCP SDK import
# ---------------------------------------------------------------------------
_MCP_SERVER_AVAILABLE = False
try:
from mcp.server.fastmcp import FastMCP
_MCP_SERVER_AVAILABLE = True
except ImportError:
FastMCP = None # type: ignore[assignment,misc]
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _get_sessions_dir() -> Path:
"""Return the sessions directory using HERMES_HOME."""
try:
from hermes_constants import get_hermes_home
return get_hermes_home() / "sessions"
except ImportError:
return Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes")) / "sessions"
def _get_session_db():
"""Get a SessionDB instance for reading message transcripts."""
try:
from hermes_state import SessionDB
return SessionDB()
except Exception as e:
logger.debug("SessionDB unavailable: %s", e)
return None
def _load_sessions_index() -> dict:
"""Load the gateway sessions.json index directly.
Returns a dict of session_key -> entry_dict with platform routing info.
This avoids importing the full SessionStore which needs GatewayConfig.
"""
sessions_file = _get_sessions_dir() / "sessions.json"
if not sessions_file.exists():
return {}
try:
with open(sessions_file, "r", encoding="utf-8") as f:
return json.load(f)
except Exception as e:
logger.debug("Failed to load sessions.json: %s", e)
return {}
def _load_channel_directory() -> dict:
"""Load the cached channel directory for available targets."""
try:
from hermes_constants import get_hermes_home
directory_file = get_hermes_home() / "channel_directory.json"
except ImportError:
directory_file = Path(
os.environ.get("HERMES_HOME", Path.home() / ".hermes")
) / "channel_directory.json"
if not directory_file.exists():
return {}
try:
with open(directory_file, "r", encoding="utf-8") as f:
return json.load(f)
except Exception as e:
logger.debug("Failed to load channel_directory.json: %s", e)
return {}
def _extract_message_content(msg: dict) -> str:
"""Extract text content from a message, handling multi-part content."""
content = msg.get("content", "")
if isinstance(content, list):
text_parts = [
p.get("text", "") for p in content
if isinstance(p, dict) and p.get("type") == "text"
]
return "\n".join(text_parts)
return str(content) if content else ""
def _extract_attachments(msg: dict) -> List[dict]:
"""Extract non-text attachments from a message.
Finds: multi-part image/file content blocks, MEDIA: tags in text,
image URLs, and file references.
"""
attachments = []
content = msg.get("content", "")
# Multi-part content blocks (image_url, file, etc.)
if isinstance(content, list):
for part in content:
if not isinstance(part, dict):
continue
ptype = part.get("type", "")
if ptype == "image_url":
url = part.get("image_url", {}).get("url", "") if isinstance(part.get("image_url"), dict) else ""
if url:
attachments.append({"type": "image", "url": url})
elif ptype == "image":
url = part.get("url", part.get("source", {}).get("url", ""))
if url:
attachments.append({"type": "image", "url": url})
elif ptype not in ("text",):
# Unknown non-text content type
attachments.append({"type": ptype, "data": part})
# MEDIA: tags in text content
text = _extract_message_content(msg)
if text:
media_pattern = re.compile(r'MEDIA:\s*(\S+)')
for match in media_pattern.finditer(text):
path = match.group(1)
attachments.append({"type": "media", "path": path})
return attachments
# ---------------------------------------------------------------------------
# Event Bridge — polls SessionDB for new messages, maintains event queue
# ---------------------------------------------------------------------------
QUEUE_LIMIT = 1000
POLL_INTERVAL = 0.2 # seconds between DB polls (200ms)
@dataclass
class QueueEvent:
"""An event in the bridge's in-memory queue."""
cursor: int
type: str # "message", "approval_requested", "approval_resolved"
session_key: str = ""
data: dict = field(default_factory=dict)
class EventBridge:
"""Background poller that watches SessionDB for new messages and
maintains an in-memory event queue with waiter support.
This is the Hermes equivalent of OpenClaw's WebSocket gateway bridge.
Instead of WebSocket events, we poll the SQLite database for changes.
"""
def __init__(self):
self._queue: List[QueueEvent] = []
self._cursor = 0
self._lock = threading.Lock()
self._new_event = threading.Event()
self._running = False
self._thread: Optional[threading.Thread] = None
self._last_poll_timestamps: Dict[str, float] = {} # session_key -> unix timestamp
# In-memory approval tracking (populated from events)
self._pending_approvals: Dict[str, dict] = {}
# mtime cache — skip expensive work when files haven't changed
self._sessions_json_mtime: float = 0.0
self._state_db_mtime: float = 0.0
self._cached_sessions_index: dict = {}
def start(self):
"""Start the background polling thread."""
if self._running:
return
self._running = True
self._thread = threading.Thread(target=self._poll_loop, daemon=True)
self._thread.start()
logger.debug("EventBridge started")
def stop(self):
"""Stop the background polling thread."""
self._running = False
self._new_event.set() # Wake any waiters
if self._thread:
self._thread.join(timeout=5)
logger.debug("EventBridge stopped")
def poll_events(
self,
after_cursor: int = 0,
session_key: Optional[str] = None,
limit: int = 20,
) -> dict:
"""Return events since after_cursor, optionally filtered by session_key."""
with self._lock:
events = [
e for e in self._queue
if e.cursor > after_cursor
and (not session_key or e.session_key == session_key)
][:limit]
next_cursor = events[-1].cursor if events else after_cursor
return {
"events": [
{"cursor": e.cursor, "type": e.type,
"session_key": e.session_key, **e.data}
for e in events
],
"next_cursor": next_cursor,
}
def wait_for_event(
self,
after_cursor: int = 0,
session_key: Optional[str] = None,
timeout_ms: int = 30000,
) -> Optional[dict]:
"""Block until a matching event arrives or timeout expires."""
deadline = time.monotonic() + (timeout_ms / 1000.0)
while time.monotonic() < deadline:
with self._lock:
for e in self._queue:
if e.cursor > after_cursor and (
not session_key or e.session_key == session_key
):
return {
"cursor": e.cursor, "type": e.type,
"session_key": e.session_key, **e.data,
}
remaining = deadline - time.monotonic()
if remaining <= 0:
break
self._new_event.clear()
self._new_event.wait(timeout=min(remaining, POLL_INTERVAL))
return None
def list_pending_approvals(self) -> List[dict]:
"""List approval requests observed during this bridge session."""
with self._lock:
return sorted(
self._pending_approvals.values(),
key=lambda a: a.get("created_at", ""),
)
def respond_to_approval(self, approval_id: str, decision: str) -> dict:
"""Resolve a pending approval (best-effort without gateway IPC)."""
with self._lock:
approval = self._pending_approvals.pop(approval_id, None)
if not approval:
return {"error": f"Approval not found: {approval_id}"}
self._enqueue(QueueEvent(
cursor=0, # Will be set by _enqueue
type="approval_resolved",
session_key=approval.get("session_key", ""),
data={"approval_id": approval_id, "decision": decision},
))
return {"resolved": True, "approval_id": approval_id, "decision": decision}
def _enqueue(self, event: QueueEvent) -> None:
"""Add an event to the queue and wake any waiters."""
with self._lock:
self._cursor += 1
event.cursor = self._cursor
self._queue.append(event)
# Trim queue to limit
while len(self._queue) > QUEUE_LIMIT:
self._queue.pop(0)
self._new_event.set()
def _poll_loop(self):
"""Background loop: poll SessionDB for new messages."""
db = _get_session_db()
if not db:
logger.warning("EventBridge: SessionDB unavailable, event polling disabled")
return
while self._running:
try:
self._poll_once(db)
except Exception as e:
logger.debug("EventBridge poll error: %s", e)
time.sleep(POLL_INTERVAL)
def _poll_once(self, db):
"""Check for new messages across all sessions.
Uses mtime checks on sessions.json and state.db to skip work
when nothing has changed, which makes 200ms polling essentially free.
"""
# Check if sessions.json has changed (mtime check is ~1μs)
sessions_file = _get_sessions_dir() / "sessions.json"
try:
sj_mtime = sessions_file.stat().st_mtime if sessions_file.exists() else 0.0
except OSError:
sj_mtime = 0.0
sessions_changed = sj_mtime != self._sessions_json_mtime
if sessions_changed:
self._sessions_json_mtime = sj_mtime
self._cached_sessions_index = _load_sessions_index()
# Check if state.db has changed
try:
from hermes_constants import get_hermes_home
db_file = get_hermes_home() / "state.db"
except ImportError:
db_file = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes")) / "state.db"
try:
db_mtime = db_file.stat().st_mtime if db_file.exists() else 0.0
except OSError:
db_mtime = 0.0
# Use the flag captured above: the cached sessions.json mtime has already
# been refreshed, so comparing against it again would always report "unchanged".
if db_mtime == self._state_db_mtime and not sessions_changed:
return # Nothing changed since last poll; skip entirely
self._state_db_mtime = db_mtime
entries = self._cached_sessions_index
for session_key, entry in entries.items():
session_id = entry.get("session_id", "")
if not session_id:
continue
last_seen = self._last_poll_timestamps.get(session_key, 0.0)
try:
messages = db.get_messages(session_id)
except Exception:
continue
if not messages:
continue
# Normalize timestamps to float for comparison
def _ts_float(ts) -> float:
if isinstance(ts, (int, float)):
return float(ts)
if isinstance(ts, str) and ts:
try:
return float(ts)
except ValueError:
# ISO string — parse to epoch
try:
from datetime import datetime
return datetime.fromisoformat(ts).timestamp()
except Exception:
return 0.0
return 0.0
# Find messages newer than our last seen timestamp
new_messages = []
for msg in messages:
ts = _ts_float(msg.get("timestamp", 0))
role = msg.get("role", "")
if role not in ("user", "assistant"):
continue
if ts > last_seen:
new_messages.append(msg)
for msg in new_messages:
content = _extract_message_content(msg)
if not content:
continue
self._enqueue(QueueEvent(
cursor=0,
type="message",
session_key=session_key,
data={
"role": msg.get("role", ""),
"content": content[:500],
"timestamp": str(msg.get("timestamp", "")),
"message_id": str(msg.get("id", "")),
},
))
# Update last seen to the most recent message timestamp
all_ts = [_ts_float(m.get("timestamp", 0)) for m in messages]
if all_ts:
latest = max(all_ts)
if latest > last_seen:
self._last_poll_timestamps[session_key] = latest
# ---------------------------------------------------------------------------
# MCP Server
# ---------------------------------------------------------------------------
def create_mcp_server(event_bridge: Optional[EventBridge] = None) -> "FastMCP":
"""Create and return the Hermes MCP server with all tools registered."""
if not _MCP_SERVER_AVAILABLE:
raise ImportError(
"MCP server requires the 'mcp' package. "
"Install with: pip install 'hermes-agent[mcp]'"
)
mcp = FastMCP(
"hermes",
instructions=(
"Hermes Agent messaging bridge. Use these tools to interact with "
"conversations across Telegram, Discord, Slack, WhatsApp, Signal, "
"Matrix, and other connected platforms."
),
)
bridge = event_bridge or EventBridge()
# -- conversations_list ------------------------------------------------
@mcp.tool()
def conversations_list(
platform: Optional[str] = None,
limit: int = 50,
search: Optional[str] = None,
) -> str:
"""List active messaging conversations across connected platforms.
Returns conversations with their session keys (needed for messages_read),
platform, chat type, display name, and last activity time.
Args:
platform: Filter by platform name (telegram, discord, slack, etc.)
limit: Maximum number of conversations to return (default 50)
search: Optional text to filter conversations by name
"""
entries = _load_sessions_index()
conversations = []
for key, entry in entries.items():
origin = entry.get("origin", {})
entry_platform = entry.get("platform") or origin.get("platform", "")
if platform and entry_platform.lower() != platform.lower():
continue
display_name = entry.get("display_name", "")
chat_name = origin.get("chat_name", "")
if search:
search_lower = search.lower()
if (search_lower not in display_name.lower()
and search_lower not in chat_name.lower()
and search_lower not in key.lower()):
continue
conversations.append({
"session_key": key,
"session_id": entry.get("session_id", ""),
"platform": entry_platform,
"chat_type": entry.get("chat_type", origin.get("chat_type", "")),
"display_name": display_name,
"chat_name": chat_name,
"user_name": origin.get("user_name", ""),
"updated_at": entry.get("updated_at", ""),
})
conversations.sort(key=lambda c: c.get("updated_at", ""), reverse=True)
conversations = conversations[:limit]
return json.dumps({
"count": len(conversations),
"conversations": conversations,
}, indent=2)
# -- conversation_get --------------------------------------------------
@mcp.tool()
def conversation_get(session_key: str) -> str:
"""Get detailed info about one conversation by its session key.
Args:
session_key: The session key from conversations_list
"""
entries = _load_sessions_index()
entry = entries.get(session_key)
if not entry:
return json.dumps({"error": f"Conversation not found: {session_key}"})
origin = entry.get("origin", {})
return json.dumps({
"session_key": session_key,
"session_id": entry.get("session_id", ""),
"platform": entry.get("platform") or origin.get("platform", ""),
"chat_type": entry.get("chat_type", origin.get("chat_type", "")),
"display_name": entry.get("display_name", ""),
"user_name": origin.get("user_name", ""),
"chat_name": origin.get("chat_name", ""),
"chat_id": origin.get("chat_id", ""),
"thread_id": origin.get("thread_id"),
"updated_at": entry.get("updated_at", ""),
"created_at": entry.get("created_at", ""),
"input_tokens": entry.get("input_tokens", 0),
"output_tokens": entry.get("output_tokens", 0),
"total_tokens": entry.get("total_tokens", 0),
}, indent=2)
# -- messages_read -----------------------------------------------------
@mcp.tool()
def messages_read(
session_key: str,
limit: int = 50,
) -> str:
"""Read recent messages from a conversation.
Returns the message history in chronological order with role, content,
and timestamp for each message.
Args:
session_key: The session key from conversations_list
limit: Maximum number of messages to return (default 50, most recent)
"""
entries = _load_sessions_index()
entry = entries.get(session_key)
if not entry:
return json.dumps({"error": f"Conversation not found: {session_key}"})
session_id = entry.get("session_id", "")
if not session_id:
return json.dumps({"error": "No session ID for this conversation"})
db = _get_session_db()
if not db:
return json.dumps({"error": "Session database unavailable"})
try:
all_messages = db.get_messages(session_id)
except Exception as e:
return json.dumps({"error": f"Failed to read messages: {e}"})
filtered = []
for msg in all_messages:
role = msg.get("role", "")
if role in ("user", "assistant"):
content = _extract_message_content(msg)
if content:
filtered.append({
"id": str(msg.get("id", "")),
"role": role,
"content": content[:2000],
"timestamp": msg.get("timestamp", ""),
})
messages = filtered[-limit:]
return json.dumps({
"session_key": session_key,
"count": len(messages),
"total_in_session": len(filtered),
"messages": messages,
}, indent=2)
# -- attachments_fetch -------------------------------------------------
@mcp.tool()
def attachments_fetch(
session_key: str,
message_id: str,
) -> str:
"""List non-text attachments for a message in a conversation.
Extracts images, media files, and other non-text content blocks
from the specified message.
Args:
session_key: The session key from conversations_list
message_id: The message ID from messages_read
"""
entries = _load_sessions_index()
entry = entries.get(session_key)
if not entry:
return json.dumps({"error": f"Conversation not found: {session_key}"})
session_id = entry.get("session_id", "")
if not session_id:
return json.dumps({"error": "No session ID for this conversation"})
db = _get_session_db()
if not db:
return json.dumps({"error": "Session database unavailable"})
try:
all_messages = db.get_messages(session_id)
except Exception as e:
return json.dumps({"error": f"Failed to read messages: {e}"})
# Find the target message
target_msg = None
for msg in all_messages:
if str(msg.get("id", "")) == message_id:
target_msg = msg
break
if not target_msg:
return json.dumps({"error": f"Message not found: {message_id}"})
attachments = _extract_attachments(target_msg)
return json.dumps({
"message_id": message_id,
"count": len(attachments),
"attachments": attachments,
}, indent=2)
# -- events_poll -------------------------------------------------------
@mcp.tool()
def events_poll(
after_cursor: int = 0,
session_key: Optional[str] = None,
limit: int = 20,
) -> str:
"""Poll for new conversation events since a cursor position.
Returns events that have occurred since the given cursor. Use the
returned next_cursor value for subsequent polls.
Event types: message, approval_requested, approval_resolved
Args:
after_cursor: Return events after this cursor (0 for all)
session_key: Optional filter to one conversation
limit: Maximum events to return (default 20)
"""
result = bridge.poll_events(
after_cursor=after_cursor,
session_key=session_key,
limit=limit,
)
return json.dumps(result, indent=2)
# -- events_wait -------------------------------------------------------
@mcp.tool()
def events_wait(
after_cursor: int = 0,
session_key: Optional[str] = None,
timeout_ms: int = 30000,
) -> str:
"""Wait for the next conversation event (long-poll).
Blocks until a matching event arrives or the timeout expires.
Use this for near-real-time event delivery without polling.
Args:
after_cursor: Wait for events after this cursor
session_key: Optional filter to one conversation
timeout_ms: Maximum wait time in milliseconds (default 30000)
"""
event = bridge.wait_for_event(
after_cursor=after_cursor,
session_key=session_key,
timeout_ms=min(timeout_ms, 300000), # Cap at 5 minutes
)
if event:
return json.dumps({"event": event}, indent=2)
return json.dumps({"event": None, "reason": "timeout"}, indent=2)
# -- messages_send -----------------------------------------------------
@mcp.tool()
def messages_send(
target: str,
message: str,
) -> str:
"""Send a message to a platform conversation.
The target format is "platform:chat_id", the same format used by the
channels_list tool. You can also use human-friendly channel names
that will be resolved automatically.
Examples:
target="telegram:6308981865"
target="discord:#general"
target="slack:#engineering"
Args:
target: Platform target in "platform:identifier" format
message: The message text to send
"""
if not target or not message:
return json.dumps({"error": "Both target and message are required"})
try:
from tools.send_message_tool import send_message_tool
result_str = send_message_tool(
{"action": "send", "target": target, "message": message}
)
return result_str
except ImportError:
return json.dumps({"error": "Send message tool not available"})
except Exception as e:
return json.dumps({"error": f"Send failed: {e}"})
# -- channels_list -----------------------------------------------------
@mcp.tool()
def channels_list(platform: Optional[str] = None) -> str:
"""List available messaging channels and targets across platforms.
Returns channels that you can send messages to. The target strings
returned here can be used directly with the messages_send tool.
Args:
platform: Filter by platform name (telegram, discord, slack, etc.)
"""
directory = _load_channel_directory()
if not directory:
entries = _load_sessions_index()
targets = []
seen = set()
for key, entry in entries.items():
origin = entry.get("origin", {})
p = entry.get("platform") or origin.get("platform", "")
chat_id = origin.get("chat_id", "")
if not p or not chat_id:
continue
if platform and p.lower() != platform.lower():
continue
target_str = f"{p}:{chat_id}"
if target_str in seen:
continue
seen.add(target_str)
targets.append({
"target": target_str,
"platform": p,
"name": entry.get("display_name") or origin.get("chat_name", ""),
"chat_type": entry.get("chat_type", origin.get("chat_type", "")),
})
return json.dumps({"count": len(targets), "channels": targets}, indent=2)
channels = []
for plat, entries_list in directory.items():
if platform and plat.lower() != platform.lower():
continue
if isinstance(entries_list, list):
for ch in entries_list:
if isinstance(ch, dict):
chat_id = ch.get("id", ch.get("chat_id", ""))
channels.append({
"target": f"{plat}:{chat_id}" if chat_id else plat,
"platform": plat,
"name": ch.get("name", ch.get("display_name", "")),
"chat_type": ch.get("type", ""),
})
return json.dumps({"count": len(channels), "channels": channels}, indent=2)
# -- permissions_list_open ---------------------------------------------
@mcp.tool()
def permissions_list_open() -> str:
"""List pending approval requests observed during this bridge session.
Returns exec and plugin approval requests that the bridge has seen
since it started. Approvals are live-session only; older approvals
from before the bridge connected are not included.
"""
approvals = bridge.list_pending_approvals()
return json.dumps({
"count": len(approvals),
"approvals": approvals,
}, indent=2)
# -- permissions_respond -----------------------------------------------
@mcp.tool()
def permissions_respond(
id: str,
decision: str,
) -> str:
"""Respond to a pending approval request.
Args:
id: The approval ID from permissions_list_open
decision: One of "allow-once", "allow-always", or "deny"
"""
if decision not in ("allow-once", "allow-always", "deny"):
return json.dumps({
"error": f"Invalid decision: {decision}. "
f"Must be allow-once, allow-always, or deny"
})
result = bridge.respond_to_approval(id, decision)
return json.dumps(result, indent=2)
return mcp
# ---------------------------------------------------------------------------
# Entry point
# ---------------------------------------------------------------------------
def run_mcp_server(verbose: bool = False) -> None:
"""Start the Hermes MCP server on stdio."""
if not _MCP_SERVER_AVAILABLE:
print(
"Error: MCP server requires the 'mcp' package.\n"
"Install with: pip install 'hermes-agent[mcp]'",
file=sys.stderr,
)
sys.exit(1)
if verbose:
logging.basicConfig(level=logging.DEBUG, stream=sys.stderr)
else:
logging.basicConfig(level=logging.WARNING, stream=sys.stderr)
bridge = EventBridge()
bridge.start()
server = create_mcp_server(event_bridge=bridge)
import asyncio
async def _run():
try:
await server.run_stdio_async()
finally:
bridge.stop()
try:
asyncio.run(_run())
except KeyboardInterrupt:
bridge.stop()
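Review note: the cursor contract behind `events_poll` (monotonic cursors, a bounded queue, and `next_cursor` echoing the caller's cursor when nothing matches) can be sketched in isolation. This is a single-threaded illustration under stated assumptions, not the bridge itself. `CursorQueue` and `Event` are hypothetical names, the lock and waiter logic are omitted, and `collections.deque(maxlen=...)` stands in for the list trimming used in the file.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class Event:
    cursor: int
    session_key: str
    data: dict = field(default_factory=dict)


class CursorQueue:
    """Bounded queue with monotonically increasing cursors (hypothetical sketch)."""

    def __init__(self, limit: int = 1000) -> None:
        # deque(maxlen=...) drops the oldest event once the limit is reached.
        self._events: Deque[Event] = deque(maxlen=limit)
        self._cursor = 0

    def push(self, session_key: str, data: dict) -> int:
        self._cursor += 1
        self._events.append(Event(self._cursor, session_key, data))
        return self._cursor

    def poll(
        self,
        after_cursor: int = 0,
        session_key: Optional[str] = None,
        limit: int = 20,
    ) -> dict:
        matches = [
            e for e in self._events
            if e.cursor > after_cursor
            and (not session_key or e.session_key == session_key)
        ][:limit]
        # With no matches, hand the caller's cursor back unchanged.
        next_cursor = matches[-1].cursor if matches else after_cursor
        return {"events": matches, "next_cursor": next_cursor}


queue = CursorQueue(limit=3)
for n in range(5):
    queue.push("telegram:1", {"n": n})

first = queue.poll(after_cursor=0, limit=2)        # oldest two surviving events
second = queue.poll(after_cursor=first["next_cursor"])
```

A client that always passes the previous `next_cursor` sees each surviving event once. Events trimmed from the queue before the client polls are lost, which is the same trade-off the bounded queue in the file makes.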
+46 -8
@@ -10,6 +10,12 @@
# container recreation. Environment variables are written to $HERMES_HOME/.env
# and read by hermes at startup — no container recreation needed for env changes.
#
# Tool resolution: the hermes wrapper uses --suffix PATH for nix store tools,
# so apt/uv-installed versions take priority. The container entrypoint provisions
# extensible tools on first boot: nodejs/npm via apt, uv via curl, and a Python
# 3.11 venv (bootstrapped entirely by uv) at ~/.venv with pip seeded. Agents get
# writable tool prefixes for npm i -g, pip install, uv tool install, etc.
#
# Usage:
# services.hermes-agent = {
# enable = true;
@@ -105,22 +111,52 @@
fi
mkdir -p "$TARGET_HOME"
chown "$HERMES_UID:$HERMES_GID" "$TARGET_HOME"
chmod 0750 "$TARGET_HOME"
# Ensure HERMES_HOME is owned by the target user
if [ -n "''${HERMES_HOME:-}" ] && [ -d "$HERMES_HOME" ]; then
chown -R "$HERMES_UID:$HERMES_GID" "$HERMES_HOME"
fi
# Install sudo on Debian/Ubuntu if missing (first boot only, cached in writable layer)
if command -v apt-get >/dev/null 2>&1 && ! command -v sudo >/dev/null 2>&1; then
apt-get update -qq >/dev/null 2>&1 && apt-get install -y -qq sudo >/dev/null 2>&1 || true
# ── Provision apt packages (first boot only, cached in writable layer) ──
# sudo: agent self-modification
# nodejs/npm: writable node so npm i -g works (nix store copies are read-only)
# curl: needed for uv installer
if [ ! -f /var/lib/hermes-tools-provisioned ] && command -v apt-get >/dev/null 2>&1; then
echo "First boot: provisioning agent tools..."
apt-get update -qq
apt-get install -y -qq sudo nodejs npm curl
touch /var/lib/hermes-tools-provisioned
fi
if command -v sudo >/dev/null 2>&1 && [ ! -f /etc/sudoers.d/hermes ]; then
mkdir -p /etc/sudoers.d
echo "$TARGET_USER ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/hermes
chmod 0440 /etc/sudoers.d/hermes
fi
# uv (Python manager) — not in Ubuntu repos, retry-safe outside the sentinel
if ! command -v uv >/dev/null 2>&1 && [ ! -x "$TARGET_HOME/.local/bin/uv" ] && command -v curl >/dev/null 2>&1; then
su -s /bin/sh "$TARGET_USER" -c 'curl -LsSf https://astral.sh/uv/install.sh | sh' || true
fi
# Python 3.11 venv — gives the agent a writable Python with pip.
# Uses uv to install Python 3.11 (Ubuntu 24.04 ships 3.12).
# --seed includes pip/setuptools so bare `pip install` works.
_UV_BIN="$TARGET_HOME/.local/bin/uv"
if [ ! -d "$TARGET_HOME/.venv" ] && [ -x "$_UV_BIN" ]; then
su -s /bin/sh "$TARGET_USER" -c "
export PATH=\"\$HOME/.local/bin:\$PATH\"
uv python install 3.11
uv venv --python 3.11 --seed \"\$HOME/.venv\"
" || true
fi
# Put the agent venv first on PATH so python/pip resolve to writable copies
if [ -d "$TARGET_HOME/.venv/bin" ]; then
export PATH="$TARGET_HOME/.venv/bin:$PATH"
fi
if command -v setpriv >/dev/null 2>&1; then
exec setpriv --reuid="$HERMES_UID" --regid="$HERMES_GID" --init-groups "$@"
elif command -v su >/dev/null 2>&1; then
@@ -516,8 +552,8 @@
# ── Directories ───────────────────────────────────────────────────
{
systemd.tmpfiles.rules = [
-"d ${cfg.stateDir} 0755 ${cfg.user} ${cfg.group} - -"
-"d ${cfg.stateDir}/.hermes 0755 ${cfg.user} ${cfg.group} - -"
+"d ${cfg.stateDir} 0750 ${cfg.user} ${cfg.group} - -"
+"d ${cfg.stateDir}/.hermes 0750 ${cfg.user} ${cfg.group} - -"
"d ${cfg.stateDir}/home 0750 ${cfg.user} ${cfg.group} - -"
"d ${cfg.workingDirectory} 0750 ${cfg.user} ${cfg.group} - -"
];
@@ -531,21 +567,23 @@
mkdir -p ${cfg.stateDir}/home
mkdir -p ${cfg.workingDirectory}
chown ${cfg.user}:${cfg.group} ${cfg.stateDir} ${cfg.stateDir}/.hermes ${cfg.stateDir}/home ${cfg.workingDirectory}
chmod 0750 ${cfg.stateDir} ${cfg.stateDir}/.hermes ${cfg.stateDir}/home ${cfg.workingDirectory}
# Merge Nix settings into existing config.yaml.
# Preserves user-added keys (skills, streaming, etc.); Nix keys win.
# If configFile is user-provided (not generated), overwrite instead of merge.
${if cfg.configFile != null then ''
-install -o ${cfg.user} -g ${cfg.group} -m 0644 -D ${configFile} ${cfg.stateDir}/.hermes/config.yaml
+install -o ${cfg.user} -g ${cfg.group} -m 0640 -D ${configFile} ${cfg.stateDir}/.hermes/config.yaml
'' else ''
${configMergeScript} ${generatedConfigFile} ${cfg.stateDir}/.hermes/config.yaml
chown ${cfg.user}:${cfg.group} ${cfg.stateDir}/.hermes/config.yaml
-chmod 0644 ${cfg.stateDir}/.hermes/config.yaml
+chmod 0640 ${cfg.stateDir}/.hermes/config.yaml
''}
# Managed mode marker (so interactive shells also detect NixOS management)
touch ${cfg.stateDir}/.hermes/.managed
chown ${cfg.user}:${cfg.group} ${cfg.stateDir}/.hermes/.managed
chmod 0644 ${cfg.stateDir}/.hermes/.managed
# Seed auth file if provided
${lib.optionalString (cfg.authFile != null) ''
@@ -577,7 +615,7 @@ HERMES_NIX_ENV_EOF
# Link documents into workspace
${lib.concatStringsSep "\n" (lib.mapAttrsToList (name: _value: ''
-install -o ${cfg.user} -g ${cfg.group} -m 0644 ${documentDerivation}/${name} ${cfg.workingDirectory}/${name}
+install -o ${cfg.user} -g ${cfg.group} -m 0640 ${documentDerivation}/${name} ${cfg.workingDirectory}/${name}
'') cfg.documents)}
'';
}
+1 -1
@@ -35,7 +35,7 @@
${pkgs.lib.concatMapStringsSep "\n" (name: ''
makeWrapper ${hermesVenv}/bin/${name} $out/bin/${name} \
---prefix PATH : "${runtimePath}" \
+--suffix PATH : "${runtimePath}" \
--set HERMES_BUNDLED_SKILLS $out/share/hermes-agent/skills
'') [ "hermes" "hermes-agent" "hermes-acp" ]}
@@ -0,0 +1 @@
Communication and decision-making frameworks — structured response formats for proposals, trade-off analysis, and stakeholder-ready recommendations.
@@ -0,0 +1,103 @@
---
name: one-three-one-rule
description: >
Structured decision-making framework for technical proposals and trade-off analysis.
When the user faces a choice between multiple approaches (architecture decisions,
tool selection, refactoring strategies, migration paths), this skill produces a
1-3-1 format: one clear problem statement, three distinct options with pros/cons,
and one concrete recommendation with definition of done and implementation plan.
Use when the user asks for a "1-3-1", says "give me options", or needs help
choosing between competing approaches.
version: 1.0.0
author: Willard Moore
license: MIT
category: communication
metadata:
hermes:
tags: [communication, decision-making, proposals, trade-offs]
---
# 1-3-1 Communication Rule
Structured decision-making format for when a task has multiple viable approaches and the user needs a clear recommendation. Produces a concise problem framing, three options with trade-offs, and an actionable plan for the recommended path.
## When to Use
- The user explicitly asks for a "1-3-1" response.
- The user says "give me options" or "what are my choices" for a technical decision.
- A task has multiple viable approaches with meaningful trade-offs (architecture, tooling, migration strategy).
- The user needs a proposal they can forward to a team or stakeholder.
Do NOT use for simple questions with one obvious answer, debugging sessions, or tasks where the user has already decided on an approach.
## Procedure
1. **Problem** (one sentence)
- State the core decision or desired outcome in a single concise sentence.
- Focus on the *what*, not the *how* — no implementation details, tool names, or specific technologies.
- Keep it tight. If you need "and", you're describing two problems.
2. **Options** (exactly three)
- Present three distinct, viable approaches labeled A, B, C.
- Each option gets a brief description, pros, and cons.
- Options should represent genuinely different strategies, not minor variations of the same approach.
3. **Recommendation** (one option)
- State which option you recommend and why, based on the user's context and priorities.
- Be direct — this is your professional judgment, not a hedge.
4. **Definition of Done**
- List the specific success criteria for the recommended option.
- These are concrete, verifiable outcomes — not vague aspirations.
- If the user picks a different option, revise this section to match.
5. **Implementation Plan**
- Concrete steps to execute the recommended option.
- Include specific commands, tools, or actions where applicable.
- If the user picks a different option, revise this section to match.
## Verification
- Response has exactly one Problem sentence.
- Response has exactly three Options (A, B, C) with pros and cons for each.
- Response has a single Recommendation that picks one option with reasoning.
- Definition of Done and Implementation Plan align with the recommended option.
- If the user selects a different option, Recommendation, DoD, and Implementation Plan update accordingly.
## Example
User: "Give me a 1-3-1 for adding retry logic to our API client."
Agent:
**Problem:** API calls to external services fail intermittently under load and need graceful retry handling.
**Options:**
- **Option A: Exponential backoff with jitter in a shared utility.**
Pros: Simple, well-understood pattern; single implementation shared across all callers.
Cons: No circuit-breaking; callers still block during retries.
- **Option B: Retry library (tenacity) as a decorator on each endpoint call.**
Pros: Configurable per-endpoint; built-in support for circuit breakers and rate limits.
Cons: Adds a dependency; decorator pattern can obscure control flow.
- **Option C: Queue-based async retry with dead letter handling.**
Pros: Non-blocking; failed calls are tracked and replayable; scales well.
Cons: Significant complexity increase; requires a queue backend; overkill for low-volume services.
**Recommendation:** Option A. The call volume doesn't justify queue infrastructure, and a shared utility with exponential backoff + jitter covers the failure mode cleanly without new dependencies.
**Definition of Done:**
- `api_retry()` utility handles transient HTTP errors (429, 502, 503, 504).
- Retries up to 3 times with exponential backoff and jitter.
- All external API calls route through the retry utility.
- Logs each retry attempt with endpoint and attempt number.
- Tests cover retry exhaustion, immediate success, and transient-then-success scenarios.
**Implementation Plan:**
1. Create `utils/api_retry.py` with configurable max retries, base delay, and retryable status codes.
2. Add jitter using `random.uniform(0, base_delay)` to prevent thundering herd.
3. Wrap existing API calls in `api_client.py` with the retry utility.
4. Add unit tests mocking HTTP responses for each retry scenario.
5. Verify under load with a simple stress test against a flaky endpoint mock.
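The implementation plan in the example above could be sketched roughly as follows. This is an illustration only, not code from the repository: the `api_retry` signature, the `TransientHTTPError` exception type, and the injectable `sleep` parameter are all assumptions made so the sketch runs without a real HTTP client.

```python
import random
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")

# Status codes listed in the example's Definition of Done.
RETRYABLE_STATUS = (429, 502, 503, 504)


class TransientHTTPError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def api_retry(
    call: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.5,
    retryable: Iterable[int] = RETRYABLE_STATUS,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying retryable statuses with exponential backoff plus jitter."""
    retryable = tuple(retryable)
    for attempt in range(max_retries + 1):
        try:
            return call()
        except TransientHTTPError as exc:
            # Give up on non-retryable statuses or once retries are exhausted.
            if exc.status not in retryable or attempt == max_retries:
                raise
            # The delay doubles each attempt; jitter spreads out simultaneous retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise RuntimeError("unreachable")
```

Passing `sleep` as a parameter is a design choice for testability: unit tests can record the computed delays instead of actually waiting.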
@@ -304,6 +304,29 @@ def ensure_parent(path: Path) -> None:
path.parent.mkdir(parents=True, exist_ok=True)
def resolve_secret_input(value: Any, env: Optional[Dict[str, str]] = None) -> Optional[str]:
"""Resolve an OpenClaw SecretInput value to a plain string.
SecretInput can be:
- A plain string: "sk-..."
- An env template: "${OPENROUTER_API_KEY}"
- A SecretRef object: {"source": "env", "id": "OPENROUTER_API_KEY"}
"""
if isinstance(value, str):
# Check for env template: "${VAR_NAME}"
m = re.match(r"^\$\{(\w+)\}$", value.strip())
if m and env:
return env.get(m.group(1), "").strip() or None
return value.strip() or None
if isinstance(value, dict):
source = value.get("source", "")
ref_id = value.get("id", "")
if source == "env" and ref_id and env:
return env.get(ref_id, "").strip() or None
# File/exec sources can't be resolved here — return None
return None
def load_yaml_file(path: Path) -> Dict[str, Any]:
if yaml is None or not path.exists():
return {}
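For readers following the flattened hunk above, here is `resolve_secret_input` re-indented and exercised against the three input shapes named in its docstring. The function body is taken from the diff; the sample key value and the `env` dictionary are made up for illustration.

```python
import re
from typing import Any, Dict, Optional


def resolve_secret_input(value: Any, env: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Resolve a plain string, "${VAR}" template, or SecretRef dict to a string."""
    if isinstance(value, str):
        # Env template: "${VAR_NAME}"
        m = re.match(r"^\$\{(\w+)\}$", value.strip())
        if m and env:
            return env.get(m.group(1), "").strip() or None
        return value.strip() or None
    if isinstance(value, dict):
        source = value.get("source", "")
        ref_id = value.get("id", "")
        if source == "env" and ref_id and env:
            return env.get(ref_id, "").strip() or None
    # File/exec sources can't be resolved here
    return None


env = {"OPENROUTER_API_KEY": " example-key "}  # illustrative value only

plain = resolve_secret_input("  plain-key ")
templated = resolve_secret_input("${OPENROUTER_API_KEY}", env)
ref = resolve_secret_input({"source": "env", "id": "OPENROUTER_API_KEY"}, env)
unsupported = resolve_secret_input({"source": "file", "id": "secrets.txt"}, env)
```

One behavior worth noting when reading the function as written: if `env` is empty or omitted, a template string does not match the `m and env` branch and is returned unchanged rather than as `None`.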
@@ -890,14 +913,20 @@ class Migrator:
self.record("command-allowlist", source, destination, "migrated", "Would merge patterns", added_patterns=added)
def load_openclaw_config(self) -> Dict[str, Any]:
config_path = self.source_root / "openclaw.json"
if not config_path.exists():
return {}
try:
data = json.loads(config_path.read_text(encoding="utf-8"))
return data if isinstance(data, dict) else {}
except json.JSONDecodeError:
return {}
# Check current name and legacy config filenames
for name in ("openclaw.json", "clawdbot.json", "moldbot.json"):
config_path = self.source_root / name
if config_path.exists():
try:
data = json.loads(config_path.read_text(encoding="utf-8"))
return data if isinstance(data, dict) else {}
except json.JSONDecodeError:
continue
return {}
def load_openclaw_env(self) -> Dict[str, str]:
"""Load the OpenClaw .env file for secrets that live there instead of config."""
return parse_env_file(self.source_root / ".env")
def merge_env_values(self, additions: Dict[str, str], kind: str, source: Path) -> None:
destination = self.target_root / ".env"
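The new `load_openclaw_config` behavior (try the current filename, then the legacy ones, skipping files that fail to parse) can be sketched as a standalone function. The name `load_first_valid_config` and the module-level `CONFIG_NAMES` tuple are hypothetical; the real method lives on `Migrator` and reads `self.source_root`.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict

# Current name first, then legacy filenames, in the order used by the diff.
CONFIG_NAMES = ("openclaw.json", "clawdbot.json", "moldbot.json")


def load_first_valid_config(source_root: Path) -> Dict[str, Any]:
    """Return the first config file that exists and parses as JSON."""
    for name in CONFIG_NAMES:
        config_path = source_root / name
        if not config_path.exists():
            continue
        try:
            data = json.loads(config_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            # A corrupt file falls through to the next candidate name.
            continue
        # A parseable non-object (e.g. a JSON list) yields {} without trying further names.
        return data if isinstance(data, dict) else {}
    return {}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "openclaw.json").write_text("{not json", encoding="utf-8")
    (root / "clawdbot.json").write_text('{"models": {}}', encoding="utf-8")
    loaded = load_first_valid_config(root)
```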
@@ -1024,6 +1053,10 @@ class Migrator:
supported_targets=sorted(SUPPORTED_SECRET_TARGETS),
)
def _resolve_channel_secret(self, value: Any) -> Optional[str]:
"""Resolve a channel config value that may be a SecretRef."""
return resolve_secret_input(value, self.load_openclaw_env())
def migrate_discord_settings(self, config: Optional[Dict[str, Any]] = None) -> None:
config = config or self.load_openclaw_config()
additions: Dict[str, str] = {}
@@ -1118,15 +1151,17 @@ class Migrator:
secret_additions: Dict[str, str] = {}
# Extract provider API keys from models.providers
# Note: apiKey values can be strings, env templates, or SecretRef objects
openclaw_env = self.load_openclaw_env()
providers = config.get("models", {}).get("providers", {})
if isinstance(providers, dict):
for provider_name, provider_cfg in providers.items():
if not isinstance(provider_cfg, dict):
continue
api_key = provider_cfg.get("apiKey")
if not isinstance(api_key, str) or not api_key.strip():
raw_key = provider_cfg.get("apiKey")
api_key = resolve_secret_input(raw_key, openclaw_env)
if not api_key:
continue
api_key = api_key.strip()
base_url = provider_cfg.get("baseUrl", "")
api_type = provider_cfg.get("api", "")
@@ -1170,6 +1205,50 @@ class Migrator:
if isinstance(oai_key, str) and oai_key.strip():
secret_additions["VOICE_TOOLS_OPENAI_KEY"] = oai_key.strip()
# Also check the OpenClaw .env file — many users store keys there
# instead of inline in openclaw.json
openclaw_env = self.load_openclaw_env()
env_key_mapping = {
"OPENROUTER_API_KEY": "OPENROUTER_API_KEY",
"OPENAI_API_KEY": "OPENAI_API_KEY",
"ANTHROPIC_API_KEY": "ANTHROPIC_API_KEY",
"ELEVENLABS_API_KEY": "ELEVENLABS_API_KEY",
"TELEGRAM_BOT_TOKEN": "TELEGRAM_BOT_TOKEN",
"DEEPSEEK_API_KEY": "DEEPSEEK_API_KEY",
"GEMINI_API_KEY": "GEMINI_API_KEY",
"ZAI_API_KEY": "ZAI_API_KEY",
"MINIMAX_API_KEY": "MINIMAX_API_KEY",
}
for oc_key, hermes_key in env_key_mapping.items():
val = openclaw_env.get(oc_key, "").strip()
if val and hermes_key not in secret_additions:
secret_additions[hermes_key] = val
# Check per-agent auth-profiles.json for additional credentials
auth_profiles_path = self.source_root / "agents" / "main" / "agent" / "auth-profiles.json"
if auth_profiles_path.exists():
try:
profiles = json.loads(auth_profiles_path.read_text(encoding="utf-8"))
if isinstance(profiles, dict):
# auth-profiles.json wraps profiles in a "profiles" key
profile_entries = profiles.get("profiles", profiles) if isinstance(profiles.get("profiles"), dict) else profiles
for profile_name, profile_data in profile_entries.items():
if not isinstance(profile_data, dict):
continue
# Canonical field is "key", "apiKey" is accepted as alias
api_key = profile_data.get("key", "") or profile_data.get("apiKey", "")
if not isinstance(api_key, str) or not api_key.strip():
continue
name_lower = profile_name.lower()
if "openrouter" in name_lower and "OPENROUTER_API_KEY" not in secret_additions:
secret_additions["OPENROUTER_API_KEY"] = api_key.strip()
elif "openai" in name_lower and "OPENAI_API_KEY" not in secret_additions:
secret_additions["OPENAI_API_KEY"] = api_key.strip()
elif "anthropic" in name_lower and "ANTHROPIC_API_KEY" not in secret_additions:
secret_additions["ANTHROPIC_API_KEY"] = api_key.strip()
except (json.JSONDecodeError, OSError):
pass
if secret_additions:
self.merge_env_values(secret_additions, "provider-keys", self.source_root / "openclaw.json")
else:
@@ -1218,7 +1297,11 @@ class Migrator:
if self.execute:
backup_path = self.maybe_backup(destination)
hermes_config["model"] = model_str
existing_model = hermes_config.get("model")
if isinstance(existing_model, dict):
existing_model["default"] = model_str
else:
hermes_config["model"] = {"default": model_str}
dump_yaml_file(destination, hermes_config)
self.record("model-config", source_path, destination, "migrated", backup=str(backup_path) if backup_path else "", model=model_str)
else:
@@ -1244,22 +1327,44 @@ class Migrator:
if isinstance(provider, str) and provider in ("elevenlabs", "openai", "edge"):
tts_data["provider"] = provider
elevenlabs = tts.get("elevenlabs", {})
# TTS provider settings live under messages.tts.providers.{provider}
# in OpenClaw (not messages.tts.elevenlabs directly)
providers = tts.get("providers") or {}
# Also check the top-level "talk" config which has provider settings too
talk_cfg = (config or self.load_openclaw_config()).get("talk") or {}
talk_providers = talk_cfg.get("providers") or {}
# Merge: messages.tts.providers takes priority, then talk.providers,
# then legacy flat keys (messages.tts.elevenlabs, etc.)
elevenlabs = (
(providers.get("elevenlabs") or {})
if isinstance(providers.get("elevenlabs"), dict) else
(talk_providers.get("elevenlabs") or {})
if isinstance(talk_providers.get("elevenlabs"), dict) else
(tts.get("elevenlabs") or {})
)
if isinstance(elevenlabs, dict):
el_settings: Dict[str, str] = {}
voice_id = elevenlabs.get("voiceId")
voice_id = elevenlabs.get("voiceId") or talk_cfg.get("voiceId")
if isinstance(voice_id, str) and voice_id.strip():
el_settings["voice_id"] = voice_id.strip()
model_id = elevenlabs.get("modelId")
model_id = elevenlabs.get("modelId") or talk_cfg.get("modelId")
if isinstance(model_id, str) and model_id.strip():
el_settings["model_id"] = model_id.strip()
if el_settings:
tts_data["elevenlabs"] = el_settings
openai_tts = tts.get("openai", {})
openai_tts = (
(providers.get("openai") or {})
if isinstance(providers.get("openai"), dict) else
(talk_providers.get("openai") or {})
if isinstance(talk_providers.get("openai"), dict) else
(tts.get("openai") or {})
)
if isinstance(openai_tts, dict):
oai_settings: Dict[str, str] = {}
oai_model = openai_tts.get("model")
oai_model = openai_tts.get("model") or openai_tts.get("modelId")
if isinstance(oai_model, str) and oai_model.strip():
oai_settings["model"] = oai_model.strip()
oai_voice = openai_tts.get("voice")
@@ -1268,7 +1373,11 @@ class Migrator:
if oai_settings:
tts_data["openai"] = oai_settings
edge_tts = tts.get("edge", {})
edge_tts = (
(providers.get("edge") or {})
if isinstance(providers.get("edge"), dict) else
(tts.get("edge") or {})
)
if isinstance(edge_tts, dict):
edge_voice = edge_tts.get("voice")
if isinstance(edge_voice, str) and edge_voice.strip():
@@ -1298,15 +1407,29 @@ class Migrator:
self.record("tts-config", source_path, destination, "migrated", "Would set TTS config", settings=list(tts_data.keys()))
def migrate_shared_skills(self) -> None:
source_root = self.source_root / "skills"
# Check all OpenClaw skill sources: managed, personal, project-level
skill_sources = [
(self.source_root / "skills", "shared-skills", "managed skills"),
(Path.home() / ".agents" / "skills", "personal-skills", "personal cross-project skills"),
(self.source_root / "workspace" / ".agents" / "skills", "project-skills", "project-level shared skills"),
(self.source_root / "workspace.default" / ".agents" / "skills", "project-skills", "project-level shared skills"),
]
found_any = False
for source_root, kind_label, desc in skill_sources:
if source_root.exists():
found_any = True
self._import_skill_directory(source_root, kind_label, desc)
if not found_any:
destination_root = self.target_root / "skills" / SKILL_CATEGORY_DIRNAME
self.record("shared-skills", None, destination_root, "skipped", "No shared OpenClaw skills directories found")
def _import_skill_directory(self, source_root: Path, kind_label: str, desc: str) -> None:
"""Import skills from a single source directory into openclaw-imports."""
destination_root = self.target_root / "skills" / SKILL_CATEGORY_DIRNAME
if not source_root.exists():
self.record("shared-skills", None, destination_root, "skipped", "No shared OpenClaw skills directory found")
return
skill_dirs = [p for p in sorted(source_root.iterdir()) if p.is_dir() and (p / "SKILL.md").exists()]
if not skill_dirs:
self.record("shared-skills", source_root, destination_root, "skipped", "No shared skills with SKILL.md found")
self.record(kind_label, source_root, destination_root, "skipped", f"No skills with SKILL.md found in {desc}")
return
for skill_dir in skill_dirs:
@@ -1314,7 +1437,7 @@ class Migrator:
final_destination = destination
if destination.exists():
if self.skill_conflict_mode == "skip":
self.record("shared-skill", skill_dir, destination, "conflict", "Destination skill already exists")
self.record(kind_label, skill_dir, destination, "conflict", "Destination skill already exists")
continue
if self.skill_conflict_mode == "rename":
final_destination = self.resolve_skill_destination(destination)
@@ -1329,19 +1452,19 @@ class Migrator:
details: Dict[str, Any] = {"backup": str(backup_path) if backup_path else ""}
if final_destination != destination:
details["renamed_from"] = str(destination)
self.record("shared-skill", skill_dir, final_destination, "migrated", **details)
self.record(kind_label, skill_dir, final_destination, "migrated", **details)
else:
if final_destination != destination:
self.record(
"shared-skill",
kind_label,
skill_dir,
final_destination,
"migrated",
"Would copy shared skill directory under a renamed folder",
f"Would copy {desc} directory under a renamed folder",
renamed_from=str(destination),
)
else:
self.record("shared-skill", skill_dir, final_destination, "migrated", "Would copy shared skill directory")
self.record(kind_label, skill_dir, final_destination, "migrated", f"Would copy {desc} directory")
desc_path = destination_root / "DESCRIPTION.md"
if self.execute:
@@ -1518,6 +1641,7 @@ class Migrator:
self.source_candidate("workspace/IDENTITY.md", "workspace.default/IDENTITY.md"),
self.source_candidate("workspace/TOOLS.md", "workspace.default/TOOLS.md"),
self.source_candidate("workspace/HEARTBEAT.md", "workspace.default/HEARTBEAT.md"),
self.source_candidate("workspace/BOOTSTRAP.md", "workspace.default/BOOTSTRAP.md"),
]
for candidate in candidates:
if candidate:
@@ -1789,8 +1913,9 @@ class Migrator:
human_delay = defaults.get("humanDelay") or {}
if human_delay:
hd = hermes_cfg.get("human_delay") or {}
if human_delay.get("enabled"):
hd["mode"] = "natural"
hd_mode = human_delay.get("mode") or ("natural" if human_delay.get("enabled") else None)
if hd_mode and hd_mode != "off":
hd["mode"] = hd_mode
if human_delay.get("minMs"):
hd["min_ms"] = human_delay["minMs"]
if human_delay.get("maxMs"):
@@ -1804,11 +1929,11 @@ class Migrator:
changes = True
# Map terminal/exec settings
exec_cfg = defaults.get("exec") or (config.get("tools") or {}).get("exec") or {}
exec_cfg = (config.get("tools") or {}).get("exec") or {}
if exec_cfg:
terminal_cfg = hermes_cfg.get("terminal") or {}
if exec_cfg.get("timeout"):
terminal_cfg["timeout"] = exec_cfg["timeout"]
if exec_cfg.get("timeoutSec") or exec_cfg.get("timeout"):
terminal_cfg["timeout"] = exec_cfg.get("timeoutSec") or exec_cfg.get("timeout")
changes = True
hermes_cfg["terminal"] = terminal_cfg
@@ -1883,24 +2008,34 @@ class Migrator:
sr = hermes_cfg.get("session_reset") or {}
changes = False
reset_triggers = session.get("resetTriggers") or session.get("reset_triggers") or {}
if reset_triggers:
daily = reset_triggers.get("daily") or {}
idle = reset_triggers.get("idle") or {}
# OpenClaw uses session.reset (structured) and session.resetTriggers (string array)
reset = session.get("reset") or {}
reset_triggers = session.get("resetTriggers") or session.get("reset_triggers") or []
if daily.get("enabled") and idle.get("enabled"):
sr["mode"] = "both"
elif daily.get("enabled"):
if reset:
# Structured reset config: has mode, atHour, idleMinutes
mode = reset.get("mode", "")
if mode == "daily":
sr["mode"] = "daily"
elif idle.get("enabled"):
elif mode == "idle":
sr["mode"] = "idle"
else:
sr["mode"] = "none"
if daily.get("hour") is not None:
sr["at_hour"] = daily["hour"]
if idle.get("minutes") or idle.get("timeoutMinutes"):
sr["idle_minutes"] = idle.get("minutes") or idle.get("timeoutMinutes")
sr["mode"] = mode or "none"
if reset.get("atHour") is not None:
sr["at_hour"] = reset["atHour"]
if reset.get("idleMinutes"):
sr["idle_minutes"] = reset["idleMinutes"]
changes = True
elif isinstance(reset_triggers, list) and reset_triggers:
# Simple string triggers: ["daily", "idle"]
has_daily = "daily" in reset_triggers
has_idle = "idle" in reset_triggers
if has_daily and has_idle:
sr["mode"] = "both"
elif has_daily:
sr["mode"] = "daily"
elif has_idle:
sr["mode"] = "idle"
changes = True
if changes:
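The session-reset mapping introduced in this hunk can be read as a pure function from the OpenClaw `session` block to Hermes `session_reset` keys. The sketch below is a simplification under stated assumptions: `map_session_reset` is a hypothetical name, it starts from an empty result instead of the existing `hermes_cfg` value, and the `daily`/`idle`/other branches are collapsed into `mode or "none"`, which yields the same values.

```python
from typing import Any, Dict


def map_session_reset(session: Dict[str, Any]) -> Dict[str, Any]:
    """Map OpenClaw session.reset / session.resetTriggers to Hermes keys."""
    sr: Dict[str, Any] = {}
    reset = session.get("reset") or {}
    reset_triggers = session.get("resetTriggers") or session.get("reset_triggers") or []

    if reset:
        # Structured form: mode, atHour, idleMinutes.
        sr["mode"] = reset.get("mode", "") or "none"
        if reset.get("atHour") is not None:
            sr["at_hour"] = reset["atHour"]
        if reset.get("idleMinutes"):
            sr["idle_minutes"] = reset["idleMinutes"]
    elif isinstance(reset_triggers, list) and reset_triggers:
        # Simple string triggers, e.g. ["daily", "idle"].
        has_daily = "daily" in reset_triggers
        has_idle = "idle" in reset_triggers
        if has_daily and has_idle:
            sr["mode"] = "both"
        elif has_daily:
            sr["mode"] = "daily"
        elif has_idle:
            sr["mode"] = "idle"
    return sr
```

The structured `reset` block takes precedence: `resetTriggers` is consulted only when `reset` is absent or empty.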
@@ -2092,11 +2227,12 @@ class Migrator:
browser_hermes = hermes_cfg.get("browser") or {}
changed = False
if browser.get("inactivityTimeoutMs"):
browser_hermes["inactivity_timeout"] = browser["inactivityTimeoutMs"] // 1000
# Map fields that have Hermes equivalents
if browser.get("cdpUrl"):
browser_hermes["cdp_url"] = browser["cdpUrl"]
changed = True
if browser.get("commandTimeoutMs"):
browser_hermes["command_timeout"] = browser["commandTimeoutMs"] // 1000
if browser.get("headless") is not None:
browser_hermes["headless"] = browser["headless"]
changed = True
if changed:
@@ -2107,9 +2243,9 @@ class Migrator:
self.record("browser-config", "openclaw.json browser.*", "config.yaml browser",
"migrated")
# Archive advanced browser settings
# Archive remaining browser settings
advanced = {k: v for k, v in browser.items()
if k not in ("inactivityTimeoutMs", "commandTimeoutMs") and v}
if k not in ("cdpUrl", "headless") and v}
if advanced and self.archive_dir:
if self.execute:
self.archive_dir.mkdir(parents=True, exist_ok=True)
@@ -2130,18 +2266,22 @@ class Migrator:
hermes_cfg = load_yaml_file(hermes_cfg_path)
changed = False
# Map exec timeout -> terminal timeout
# Map exec timeout -> terminal timeout (field is timeoutSec in OpenClaw)
exec_cfg = tools.get("exec") or {}
if exec_cfg.get("timeout"):
timeout_val = exec_cfg.get("timeoutSec") or exec_cfg.get("timeout")
if timeout_val:
terminal_cfg = hermes_cfg.get("terminal") or {}
terminal_cfg["timeout"] = exec_cfg["timeout"]
terminal_cfg["timeout"] = timeout_val
hermes_cfg["terminal"] = terminal_cfg
changed = True
# Map web search API key
web_cfg = tools.get("webSearch") or tools.get("web") or {}
if web_cfg.get("braveApiKey") and self.migrate_secrets:
self._set_env_var("BRAVE_API_KEY", web_cfg["braveApiKey"], "tools.webSearch.braveApiKey")
# Map web search API key (path: tools.web.search.brave.apiKey in OpenClaw)
web_cfg = tools.get("web") or tools.get("webSearch") or {}
search_cfg = web_cfg.get("search") or web_cfg
brave_cfg = search_cfg.get("brave") or {}
brave_key = brave_cfg.get("apiKey") or search_cfg.get("braveApiKey") or web_cfg.get("braveApiKey")
if brave_key and isinstance(brave_key, str) and self.migrate_secrets:
self._set_env_var("BRAVE_API_KEY", brave_key, "tools.web.search.brave.apiKey")
if changed and self.execute:
self.maybe_backup(hermes_cfg_path)
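The Brave key lookup in this hunk checks several locations in order. A minimal sketch of that fallback chain is shown below; `find_brave_key` is a hypothetical helper name, and the key strings in the usage lines are placeholders.

```python
from typing import Any, Dict, Optional


def find_brave_key(tools: Dict[str, Any]) -> Optional[str]:
    """Look up the Brave API key across the paths the migrator checks."""
    web_cfg = tools.get("web") or tools.get("webSearch") or {}
    # Use the nested "search" block when present, else treat web_cfg as the search block.
    search_cfg = web_cfg.get("search") or web_cfg
    brave_cfg = search_cfg.get("brave") or {}
    key = (
        brave_cfg.get("apiKey")
        or search_cfg.get("braveApiKey")
        or web_cfg.get("braveApiKey")
    )
    # Only plain strings are migrated; other shapes are ignored here.
    return key if isinstance(key, str) and key else None


nested = find_brave_key({"web": {"search": {"brave": {"apiKey": "placeholder-1"}}}})
legacy = find_brave_key({"webSearch": {"braveApiKey": "placeholder-2"}})
```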
@@ -2169,8 +2309,9 @@ class Migrator:
hermes_cfg_path = self.target_root / "config.yaml"
hermes_cfg = load_yaml_file(hermes_cfg_path)
# Map approval mode
mode = approvals.get("mode") or approvals.get("defaultMode")
# Map approval mode (nested under approvals.exec.mode in OpenClaw)
exec_approvals = approvals.get("exec") or {}
mode = (exec_approvals.get("mode") if isinstance(exec_approvals, dict) else None) or approvals.get("mode") or approvals.get("defaultMode")
if mode:
mode_map = {"auto": "off", "always": "manual", "smart": "smart", "manual": "manual"}
hermes_mode = mode_map.get(mode, "manual")
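The approval-mode lookup and mapping above can be isolated as a small function for illustration. `map_approval_mode` is a hypothetical name; the lookup order and the `mode_map` values are copied from the hunk.

```python
from typing import Any, Dict, Optional

MODE_MAP = {"auto": "off", "always": "manual", "smart": "smart", "manual": "manual"}


def map_approval_mode(approvals: Dict[str, Any]) -> Optional[str]:
    """Resolve the OpenClaw approval mode and translate it to a Hermes mode."""
    exec_approvals = approvals.get("exec") or {}
    mode = (
        (exec_approvals.get("mode") if isinstance(exec_approvals, dict) else None)
        or approvals.get("mode")
        or approvals.get("defaultMode")
    )
    if not mode:
        return None
    # Unknown modes fall back to the most conservative setting.
    return MODE_MAP.get(mode, "manual")
```

The nested `approvals.exec.mode` wins over the flat `approvals.mode` and `approvals.defaultMode` keys, and any unrecognized value maps to `manual`.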
@@ -2314,9 +2455,24 @@ class Migrator:
notes.append("")
notes.extend([
"## IMPORTANT: Archive the OpenClaw Directory",
"",
"After migration, your OpenClaw directory still exists on disk with workspace",
"state files (todo.json, sessions, logs). If the Hermes agent discovers these",
"directories, it may read/write to them instead of the Hermes state, causing",
"confusion (e.g., cron jobs reading a different todo list than interactive sessions).",
"",
"**Strongly recommended:** Run `hermes claw cleanup` to rename the OpenClaw",
"directory to `.openclaw.pre-migration`. This prevents the agent from finding it.",
"The directory is renamed, not deleted — you can undo this at any time.",
"",
"If you skip this step and notice the agent getting confused about workspaces",
"or todo lists, run `hermes claw cleanup` to fix it.",
"",
"## Hermes-Specific Setup",
"",
"After migration, you may want to:",
"- Run `hermes claw cleanup` to archive the OpenClaw directory (prevents state confusion)",
"- Run `hermes setup` to configure any remaining settings",
"- Run `hermes mcp list` to verify MCP servers were imported correctly",
"- Run `hermes cron` to recreate scheduled tasks (see archive/cron-config.json)",

Some files were not shown because too many files have changed in this diff.